
How-to guide

How to extract audio for transcription

Transcription tools such as Whisper, Otter, Rev, Descript, and AssemblyAI all work best from a clean audio source rather than a video file. VideoSplit gives you a 48 kHz WAV in one click, an uncompressed format that ASR pipelines generally accept without a secondary conversion step.

Whisper resamples all input to 16 kHz mono internally, whichever model size you use, so 48 kHz is more than it strictly needs. Giving it 48 kHz WAV input does no harm, though: it is better to downsample from a clean, uncompressed source than to start from a lossy or low-rate one.
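To make the downsampling step concrete, here is a minimal sketch in Python using only the standard library. It assumes a 48 kHz, 16-bit PCM WAV as input; the function name to_16k_mono is made up for this illustration. It averages every three frames (48000 / 16000 = 3) across all channels, which is a crude box filter and not a production-quality resampler. Whisper does its own resampling when it loads a file, so you do not need to run anything like this yourself.

```python
import array
import sys
import wave


def to_16k_mono(src_path: str, dst_path: str) -> None:
    """Mix a 48 kHz, 16-bit PCM WAV down to mono and reduce it to 16 kHz."""
    with wave.open(src_path, "rb") as src:
        if src.getframerate() != 48000 or src.getsampwidth() != 2:
            raise ValueError("this sketch expects 48 kHz, 16-bit PCM input")
        channels = src.getnchannels()
        samples = array.array("h")
        samples.frombytes(src.readframes(src.getnframes()))

    # WAV data is little-endian; array uses the machine's native byte order.
    if sys.byteorder == "big":
        samples.byteswap()

    # Each output sample is the mean of 3 input frames, i.e. 3 * channels
    # interleaved samples. This both mixes to mono and divides the rate by 3.
    group = 3 * channels
    usable = len(samples) - len(samples) % group
    out = array.array(
        "h",
        (sum(samples[i:i + group]) // group for i in range(0, usable, group)),
    )

    if sys.byteorder == "big":
        out.byteswap()
    with wave.open(dst_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(16000)
        dst.writeframes(out.tobytes())
```

Calling to_16k_mono("audio.wav", "audio_16k.wav") on a one-second stereo file would produce a one-second mono file with 16,000 frames. A real resampler would apply a proper low-pass filter before discarding samples.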

Step-by-step

  1. Open VideoSplit.io. Any browser.
  2. Drop the source video. MP4, MOV, MKV, WEBM — whatever you have. VideoSplit decodes it locally.
  3. Pick WAV. WAV is uncompressed, which is what you want for transcription. MP3 works too, but its lossy compression discards some detail, which can slightly affect recognition and speaker segmentation.
  4. Download the WAV. Saves at 48 kHz PCM.
  5. Feed it into your ASR tool of choice. Run whisper audio.wav on the command line, or upload the file to Otter, Rev, or Descript. No further preprocessing is needed.
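If you want to confirm what you downloaded before handing it to an ASR tool, the file's header can be read with Python's built-in wave module. The sketch below is only an illustration: the function name describe_wav and the file name audio.wav are assumptions, and the wave module reads plain PCM files, so it will raise an error on other WAV encodings.

```python
import wave


def describe_wav(path: str) -> dict:
    """Return basic header properties of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        frames = wav.getnframes()
        return {
            "sample_rate_hz": rate,
            "channels": wav.getnchannels(),
            "bit_depth": wav.getsampwidth() * 8,  # sample width is in bytes
            "duration_s": frames / rate,
        }
```

For a file saved as described in step 4, describe_wav("audio.wav") should report a sample_rate_hz of 48000. The channel count and bit depth depend on the source video and are not stated in this guide.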

Tips for better results

Free forever. No upload, no account.

Drop a video, get a WAV or MP3. Runs entirely in your browser — nothing uploads, nothing to install.
