
For a long time I’ve written technical documentation and used short screencasts to help explain workflows and concepts — ten-second demos, no audio, uploaded directly to WordPress. My go-to approach was a small bash script that wrapped ffmpeg and helped me remember the right flags without having to look them up every time.

It worked. But I’d never really interrogated it.

One of the things I’ve been doing lately is revisiting old tools and workflows with the help of an AI. Not to replace them, but to actually understand them better than I did when I first wrote them. When I walked through my screencast script in a recent session, I came away with a few things I genuinely didn’t know — including one flag that explains why some of my existing videos buffer slightly before playing in a browser.

What My Original Script Did

The original was two lines:

ffmpeg -i "${file}" -vcodec h264 -acodec aac "${slug}.mp4"

That’s a functional command, but it has a few issues worth addressing:

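  • -vcodec and -acodec are legacy aliases. The current ffmpeg documentation uses -c:v and -c:a, and naming the encoder explicitly (libx264) is clearer than the generic h264
  • There’s no explicit quality setting, so the encoder uses its default (CRF 23 for libx264), which is fine but arbitrary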
  • It doesn’t handle files without audio cleanly
  • It’s missing -movflags +faststart, which turns out to matter for web playback

The movflags +faststart Flag

This was the most interesting thing I learned. The ffmpeg formats documentation describes it as:

Run a second pass moving the index (moov atom) to the beginning of the file.

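In plain terms: an MP4 file contains a metadata block (the moov atom) that tells the player where the audio and video samples are and how to decode them. By default, ffmpeg writes that block at the end of the file, because it is only complete once encoding has finished. A browser then has to fetch the end of the file, either with extra range requests or by downloading everything, before it can start playing. With -movflags +faststart, that block is moved to the front, and playback can begin as soon as the first bytes arrive.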

For short screencasts this isn’t dramatic — but it’s the difference between a video that starts playing and one that sits there for a beat before it does.
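To see which layout an existing file has, one option is to walk its top-level boxes and report whether moov or mdat (the media data) shows up first. The following is a minimal sketch, not part of the original script: the function name moov_position is made up for illustration, and it assumes the standard MP4 box layout of a 4-byte big-endian size followed by a 4-byte type.

```shell
# Hypothetical helper: report whether the moov atom comes before the mdat
# atom among an MP4's top-level boxes. Assumes each box starts with a
# 4-byte big-endian size followed by a 4-byte ASCII type.
moov_position() {
  local file="$1" offset=0 total size type b
  total=$(( $(wc -c < "$file") ))

  while [ $(( offset + 8 )) -le "$total" ]; do
    # Box size: four bytes, most significant first (od prints decimals).
    size=0
    for b in $(od -An -tu1 -j "$offset" -N 4 "$file"); do
      size=$(( (size << 8) | b ))
    done
    # Box type: the next four bytes.
    type=$(dd if="$file" bs=1 skip=$(( offset + 4 )) count=4 2>/dev/null)

    case "$type" in
      moov) echo "moov first (faststart)"; return 0 ;;
      mdat) echo "mdat first (no faststart)"; return 0 ;;
    esac

    # A size of 1 means the real size is a 64-bit value after the type.
    if [ "$size" -eq 1 ]; then
      size=0
      for b in $(od -An -tu1 -j $(( offset + 8 )) -N 8 "$file"); do
        size=$(( (size << 8) | b ))
      done
    fi

    # A size of 0 ("runs to end of file") or a corrupt size ends the walk.
    [ "$size" -ge 8 ] || break
    offset=$(( offset + size ))
  done

  echo "unknown"
  return 1
}
```

Running moov_position on an older upload and on a freshly converted file should show the difference, assuming both are ordinary MP4 files. It only inspects the box order and does not validate the file in any other way.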

CRF for Screencasts

The ffmpeg H.264 encoding guide documents CRF (Constant Rate Factor) as the recommended quality control method for libx264. The range is 0–51, lower is better, default is 23.

Screencasts are mostly flat backgrounds and sharp text — a different compression profile than live video or film. CRF 18–20 produces noticeably crisper text at still-small file sizes for this kind of content. The CRF guide from Sebastian Ebner is a good reference if you want to understand the tradeoffs in more depth.
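If you want to check those numbers against your own recordings, a rough way is to encode the same clip at a few CRF values and compare the sizes. This is a sketch under assumptions: crf_is_valid and compare_crf are hypothetical names, the validator only accepts whole numbers in the 0–51 range described above, and the comparison drops audio (-an) because it is aimed at silent screencasts.

```shell
# Hypothetical helper: accept only whole numbers from 0 to 51.
crf_is_valid() {
  case "${1:-}" in
    ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
  esac
  [ "${#1}" -le 2 ] && [ "$1" -le 51 ]
}

# Hypothetical helper: encode one input at several CRF values and print
# "crf<TAB>size" for each result. Needs ffmpeg with libx264 on the PATH.
compare_crf() {
  local input="$1" crf out
  shift
  for crf in "$@"; do
    if ! crf_is_valid "$crf"; then
      echo "skipping invalid CRF: $crf" >&2
      continue
    fi
    out="${input%.*}-crf${crf}.mp4"
    ffmpeg -loglevel error -y -i "$input" \
      -c:v libx264 -crf "$crf" -an -movflags +faststart "$out"
    printf '%s\t%s\n' "$crf" "$(du -h "$out" | cut -f1)"
  done
}
```

For example, compare_crf demo.mov 18 20 23 would write demo-crf18.mp4, demo-crf20.mp4, and demo-crf23.mp4 next to the source. Size is only half the picture, so it is worth opening the results side by side and looking at small text.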

The Updated Script

#!/usr/bin/env bash
# Convert MOV (or any ffmpeg-compatible video) to MP4 optimized for web delivery.

set -euo pipefail

usage() {
  cat >&2 <<'EOF'
mov2mp4 — Convert video to web-ready MP4

Usage:
  mov2mp4 <input> [crf]

Arguments:
  input   Source video (MOV, MP4, MKV, etc.)
  crf     Quality level 0-51, lower = better (default: 20)
          18-20 is the sweet spot for screencasts
          23 is ffmpeg's general-purpose default

Examples:
  mov2mp4 demo.mov
  mov2mp4 demo.mov 18
EOF
}

INPUT="${1:-}"
if [[ -z "$INPUT" || "$INPUT" == "--help" || "$INPUT" == "-h" ]]; then
  usage
  [[ -z "$INPUT" ]] && exit 1 || exit 0
fi

if ! ffprobe -v error -i "$INPUT" > /dev/null 2>&1; then
  echo "Error: not a valid media file: $INPUT" >&2
  exit 1
fi

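CRF="${2:-20}"
SLUG="${INPUT%.*}"
OUTPUT="${SLUG}.mp4"

# An input already named .mp4 would map onto itself as the output.
# Stop early with a clear message instead of letting ffmpeg fail.
if [[ "$INPUT" == "$OUTPUT" ]]; then
  echo "Error: output would overwrite input: $INPUT" >&2
  exit 1
fi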

HAS_AUDIO=$(ffprobe -v error -select_streams a \
  -show_entries stream=codec_type \
  -of default=noprint_wrappers=1 "$INPUT" 2>/dev/null)

# Keep the audio options in an array so each flag stays a separate argument.
AUDIO_FLAGS=(-an)
[[ -n "$HAS_AUDIO" ]] && AUDIO_FLAGS=(-c:a aac -ar 44100)

echo "Converting: $INPUT → $OUTPUT (CRF=$CRF)"

ffmpeg -i "$INPUT" -c:v libx264 -crf "$CRF" "${AUDIO_FLAGS[@]}" -movflags +faststart "$OUTPUT"

ORIGINAL=$(du -sh "$INPUT" | cut -f1)
RESULT=$(du -sh "$OUTPUT" | cut -f1)
echo "Done: $ORIGINAL → $RESULT"

Save it as ~/.local/bin/mov2mp4, make it executable with chmod +x ~/.local/bin/mov2mp4, and make sure ~/.local/bin is on your $PATH.
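If you are not sure whether that directory is already on your PATH, a small check like the one below can tell you. It is only a sketch: path_contains and install_hint are hypothetical helper names, and the suggested export line assumes a shell profile that uses the usual colon-separated PATH.

```shell
# Hypothetical helper: succeed when the given directory is an entry in PATH.
# Wrapping both sides in colons avoids matching a mere prefix of a longer entry.
path_contains() {
  case ":${PATH}:" in
    *":$1:"*) return 0 ;;
    *)        return 1 ;;
  esac
}

# Hypothetical helper: print either a confirmation or the line to add.
install_hint() {
  local dir="$1"
  if path_contains "$dir"; then
    echo "$dir is already on PATH"
  else
    echo "add to your shell profile: export PATH=\"$dir:\$PATH\""
  fi
}
```

Calling install_hint "$HOME/.local/bin" prints one of the two messages. The match is on the literal directory string, so a PATH entry written with a trailing slash would not count.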

A few things worth noting about how this differs from the original:

  • Accepts any ffmpeg-compatible input, not just .mov — so uppercase .MOV from an iPhone works the same as anything from QuickTime or OBS
  • Detects whether the input has an audio track and handles it correctly either way: silent screencasts get -an (explicit and clean), files with audio get AAC at 44,100 Hz
  • Input validation via ffprobe before attempting conversion
  • Prints file size before and after, which is a small thing but useful for spot-checking output
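The extension handling relies on one parameter expansion, ${INPUT%.*}, which strips the shortest trailing ".something" whatever its case. The sketch below isolates that step so it can be tried without running ffmpeg. The function names output_name and collides_with_input are hypothetical. It also shows one edge case to be aware of: an input that already ends in .mp4 maps to its own name.

```shell
# Same expansion the script uses: drop the last ".ext", then append ".mp4".
output_name() {
  printf '%s.mp4\n' "${1%.*}"
}

# Hypothetical check: true when the derived output name equals the input,
# which happens for inputs that already end in ".mp4".
collides_with_input() {
  [ "$1" = "$(output_name "$1")" ]
}

output_name "demo.MOV"            # prints: demo.mp4
output_name "clips/My Demo.mov"   # prints: clips/My Demo.mp4
```

Because the expansion works on the whole path, a dotted directory name combined with an extensionless file (for example notes.v2/demo) would be cut at the directory's dot, so it is safest to pass files that have an extension.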