Live encoding with VP9 using FFmpeg
Encoding parameters
VP9 provides a range of parameters for tuning live encoding. The broad principles behind them are discussed in Bitrate Modes.
FFmpeg VP9 encoding example
The table below describes the parameters of an example ffmpeg call for VP9 encoding.
Parameter | Description |
---|---|
-quality realtime | realtime is essential for live streaming and for speeds above 5. |
-speed 6 | Speeds 5 to 8 should be used for live / real-time encoding. Lower numbers (5 or 6) give higher quality but require more CPU power. Higher numbers (7 or 8) give lower quality but are more manageable for low-latency use cases and for devices with less CPU power, such as mobile. |
-tile-columns 4 | Tiling splits the video into rectangular regions, which allows multi-threading for encoding and decoding. The number of tiles is always a power of two. 0 = 1 tile, 1 = 2, 2 = 4, 3 = 8, 4 = 16, 5 = 32. |
-frame-parallel 1 | Enable parallel decodability features. |
-threads 8 | Maximum number of threads to use. |
-static-thresh 0 | Motion detection threshold. |
-max-intra-rate 300 | Maximum I-frame bitrate (pct). |
-deadline realtime | Alternative (legacy) version of -quality realtime. |
-lag-in-frames 0 | Maximum number of frames to lag. |
-qmin 4 -qmax 48 | Minimum and maximum values for the quantizer. The values here are merely a suggestion; adjusting them raises or lowers video quality at the expense of compression efficiency. |
-row-mt 1 | Enable row-multithreading. Allows use of up to twice as many threads as tile columns. 0 = off, 1 = on. |
-error-resilient 1 | Enable error resiliency features. |
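The -tile-columns mapping in the table is a power of two, so it can be worked out rather than memorized. The following is a minimal sketch of that arithmetic; the function names tile_count and max_row_mt_threads are hypothetical helpers, and the thread bound simply restates the "twice as many threads as tile columns" rule given for -row-mt.

```shell
#!/bin/sh
# tile_count N: print the number of tiles selected by -tile-columns N.
# The option value is an exponent, so the count is 2^N.
tile_count() {
    echo $((1 << $1))
}

# max_row_mt_threads N: upper bound on threads that -row-mt 1 can use,
# following the table's rule of twice the number of tile columns.
# (Hypothetical helper, not an FFmpeg or libvpx function.)
max_row_mt_threads() {
    echo $((2 * $(tile_count "$1")))
}

# Print the mapping for the values listed in the table.
for n in 0 1 2 3 4 5; do
    echo "-tile-columns $n -> $(tile_count "$n") tiles, up to $(max_row_mt_threads "$n") threads with -row-mt 1"
done
```

For example, -tile-columns 2 gives 4 tiles, so with -row-mt 1 a -threads value above 8 is unlikely to help under this rule.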
Choosing encoding parameters
The information below uses constant bitrate (CBR) encoding for live adaptive bitrate streaming (ABR), where each target rate is explicitly set in the packager's manifest. This will result in cleaner "switching" between rates for clients. Variable bitrate (VBR) encoding and CQ mode are also options if the bitrate can be more flexible or the encoding is being chunked. Q mode will struggle with the realtime encoding required for live video. See Bitrate Modes for more information.
For further details on tuning VP9, see the accompanying article on VOD settings, bearing in mind that the focus here is on CBR.
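The example commands in this article set only -b:v. With FFmpeg's libvpx-vp9 encoder, a commonly documented way to request constant bitrate is to set -minrate, -maxrate and -b:v to the same value. The sketch below builds those options for a given rate; cbr_args is a hypothetical helper name, and whether strict CBR suits a given packager setup is an assumption to verify.

```shell
#!/bin/sh
# cbr_args RATE: print FFmpeg rate-control options that pin the target,
# minimum and maximum video bitrate to the same value (for example 4500k).
# (Hypothetical helper; the options themselves are standard FFmpeg options.)
cbr_args() {
    printf '%s\n' "-b:v $1 -minrate $1 -maxrate $1"
}

# Show the options for the 1080p rate used in this article.
cbr_args 4500k
```

The output can be spliced into an encode in place of a bare -b:v option, for example `ffmpeg -i input.webm -c:v vp9 -quality realtime -speed 6 $(cbr_args 4500k) -f webm pipe1`, where input.webm is a placeholder.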
Tips and tricks
Remember that, when live streaming, everything is constrained to a minimum realtime encoding speed of 1x (FFmpeg reports encoding speed as it progresses). If your encoding speed drops below 1x, the encoder cannot keep up with the incoming live video: users will experience buffering, and breaks in transmission will make the stream unusable during the live broadcast (although the archive will generally be usable).
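Because the 1x threshold matters so much, it can be useful to watch for it automatically. The following sketch checks a single FFmpeg status line for a speed= field below 1.0. The function name speed_below_realtime is hypothetical, and the parsing assumes the usual status-line format ending in something like speed=1.04x.

```shell
#!/bin/sh
# speed_below_realtime LINE: succeed (exit 0) only if LINE contains an
# FFmpeg "speed=N.NNx" field whose value is below 1.0. Lines without a
# speed field, or with speed >= 1.0, return non-zero.
# (Hypothetical helper; assumes FFmpeg's usual status-line layout.)
speed_below_realtime() {
    printf '%s\n' "$1" | awk '
        match($0, /speed= *[0-9.]+x/) {
            # Skip the 6 characters of "speed=" and drop the trailing "x".
            s = substr($0, RSTART + 6, RLENGTH - 7)
            if (s + 0 < 1.0) slow = 1
        }
        END { if (slow) exit 0; exit 1 }'
}

# Example use (commented out so the sketch has no side effects):
# FFmpeg writes status updates to stderr separated by carriage returns.
#   ffmpeg ... 2>&1 | tr '\r' '\n' | while read -r line; do
#       if speed_below_realtime "$line"; then
#           echo "WARNING: encoder is running below 1x" >&2
#       fi
#   done
```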
Examples of encoding parameters in action
The following table shows CPU utilization and encoding speed at 25 fps for various frame sizes on a quad-core i5 3.6 GHz desktop running Linux:
Target Resolution | FFmpeg VP9 Parameters | CPU / Speed (example) |
---|---|---|
3840x2160 (2160p) | -r 30 -g 90 -s 3840x2160 -quality realtime -speed 5 -threads 16 -row-mt 1 -tile-columns 3 -frame-parallel 1 -qmin 4 -qmax 48 -b:v 7800k | ~88% 0.39x |
2560x1440 (1440p) | -r 30 -g 90 -s 2560x1440 -quality realtime -speed 5 -threads 16 -row-mt 1 -tile-columns 3 -frame-parallel 1 -qmin 4 -qmax 48 -b:v 6000k | ~86% 0.68x |
1920x1080 (1080p) | -r 30 -g 90 -s 1920x1080 -quality realtime -speed 5 -threads 8 -row-mt 1 -tile-columns 2 -frame-parallel 1 -qmin 4 -qmax 48 -b:v 4500k | ~82% 1.04x |
1280x720 (720p) | -r 30 -g 90 -s 1280x720 -quality realtime -speed 5 -threads 8 -row-mt 1 -tile-columns 2 -frame-parallel 1 -qmin 4 -qmax 48 -b:v 3000k | ~78% 1.77x |
854x480 (480p) | -r 30 -g 90 -s 854x480 -quality realtime -speed 6 -threads 4 -row-mt 1 -tile-columns 1 -frame-parallel 1 -qmin 4 -qmax 48 -b:v 1800k | ~64% 3.51x |
640x360 (360p) | -r 30 -g 90 -s 640x360 -quality realtime -speed 7 -threads 4 -row-mt 1 -tile-columns 1 -frame-parallel 0 -qmin 4 -qmax 48 -b:v 730k | ~62% 5.27x |
426x240 (240p) | -r 30 -g 90 -s 426x240 -quality realtime -speed 8 -threads 2 -row-mt 1 -tile-columns 0 -frame-parallel 0 -qmin 4 -qmax 48 -b:v 365k | ~66% 8.27x |
An example FFmpeg command might look like this:
ffmpeg -stream_loop 100 -i /home/id3as/Videos/120s_tears_of_steel_1080p.webm \
-r 30 -g 90 -s 3840x2160 -quality realtime -speed 5 -threads 16 -row-mt 1 \
-tile-columns 3 -frame-parallel 1 -qmin 4 -qmax 48 -b:v 7800k -c:v vp9 \
-b:a 128k -c:a libopus -f webm pipe1
Tips and tricks
Note that here we are outputting to a FIFO pipe ("pipe1"), which must be created before running the FFmpeg command. To do this, give the command
mkfifo pipe1
in your working directory. Shaka Packager will listen to that pipe as its input source for the given stream. Other packaging models may require a different method.
To ensure the -row-mt option is recognized, use the latest stable release of FFmpeg (3.3.3 at the time of writing) from https://www.ffmpeg.org/download.html
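When several renditions are encoded at once, each one needs its own pipe. The sketch below creates pipe1 through pipeN in the current directory and skips any that already exist; make_pipes is a hypothetical helper name, and the pipeN naming simply follows the examples in this article.

```shell
#!/bin/sh
# make_pipes N: create FIFOs named pipe1..pipeN in the current directory.
# Existing FIFOs are left alone; any mkfifo failure aborts with status 1.
# (Hypothetical helper built on the standard mkfifo utility.)
make_pipes() {
    i=1
    while [ "$i" -le "$1" ]; do
        [ -p "pipe$i" ] || mkfifo "pipe$i" || return 1
        i=$((i + 1))
    done
}

# Example (commented out so the sketch has no side effects):
#   make_pipes 3    # creates pipe1, pipe2 and pipe3
```

Run it in the working directory shared by FFmpeg and the packager before starting either process.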
Example adaptive bitrate set
Depending on the power of the machine running the FFmpeg encode, it may or may not be possible to deliver all of the following encodings at the same time, so select a subset suited to your own available resources and target audiences.
FFmpeg full ABR set
In an ideal scenario we combine the encoding examples outlined in the preceding section to create a single command that produces them all at the same time:
ffmpeg -stream_loop 100 -i lakes1080p.mp4 \
-y -r 25 -g 75 -s 3840x2160 -quality realtime -speed 5 -threads 8 \
-tile-columns 2 -frame-parallel 1 \
-b:v 7800k -c:v vp9 -b:a 196k -c:a libopus -f webm pipe1 \
-y -r 25 -g 75 -s 2560x1440 -quality realtime -speed 5 -threads 8 \
-tile-columns 2 -frame-parallel 1 \
-b:v 6000k -c:v vp9 -b:a 196k -c:a libopus -f webm pipe2 \
-y -r 25 -g 75 -s 1920x1080 -quality realtime -speed 5 -threads 4 \
-tile-columns 2 -frame-parallel 1 \
-b:v 4500k -c:v vp9 -b:a 196k -c:a libopus -f webm pipe3 \
-y -r 25 -g 75 -s 1280x720 -quality realtime -speed 5 -threads 4 \
-tile-columns 2 -frame-parallel 1 \
-b:v 3000k -c:v vp9 -b:a 196k -c:a libopus -f webm pipe4 \
-y -r 25 -g 75 -s 854x480 -quality realtime -speed 6 -threads 4 \
-tile-columns 2 -frame-parallel 1 \
-b:v 2000k -c:v vp9 -b:a 196k -c:a libopus -f webm pipe5 \
-y -r 25 -g 75 -s 640x360 -quality realtime -speed 7 -threads 2 \
-tile-columns 1 -frame-parallel 0 \
-b:v 730k -c:v vp9 -b:a 128k -c:a libopus -f webm pipe6 \
-y -r 25 -g 75 -s 426x240 -quality realtime -speed 8 -threads 2 \
-tile-columns 1 -frame-parallel 0 \
-b:v 365k -c:v vp9 -b:a 64k -c:a libopus -f webm pipe7
The above full set will, however, require a very powerful CPU, or possibly hardware offload of the kind that some chipsets increasingly provide. Intel Kaby Lake (and later) has a full hardware encoding pipeline. (Note that the Kaby Lake GPU can do 8-bit VP9 encode, but not 10-bit.)
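Since most machines can only sustain a subset of the full set, it can help to keep the renditions in a small table and generate the per-output options from it. The following is a sketch under stated assumptions: the LADDER variable and build_outputs function are hypothetical, the three rows copy the 1080p, 720p and 360p settings from the full command above, and the pipes are renumbered from pipe1 in the order listed.

```shell
#!/bin/sh
# One rendition per line:
# size speed threads tile-columns frame-parallel video-rate audio-rate
# (Values copied from the full ABR command; the subset is an example.)
LADDER='1920x1080 5 4 2 1 4500k 196k
1280x720 5 4 2 1 3000k 196k
640x360 7 2 1 0 730k 128k'

# build_outputs LADDER_TEXT: print one line of FFmpeg output options per
# rendition, writing to pipe1, pipe2, ... in the order given.
build_outputs() {
    n=1
    printf '%s\n' "$1" | while read -r size speed threads tiles fp vrate arate; do
        printf '%s\n' "-y -r 25 -g 75 -s $size -quality realtime -speed $speed -threads $threads -tile-columns $tiles -frame-parallel $fp -b:v $vrate -c:v vp9 -b:a $arate -c:a libopus -f webm pipe$n"
        n=$((n + 1))
    done
}

# Print the generated options so they can be reviewed before use.
build_outputs "$LADDER"

# Example use (commented out; relies on word splitting of the output):
#   ffmpeg -stream_loop 100 -i lakes1080p.mp4 $(build_outputs "$LADDER")
```

Dropping or adding a line in LADDER changes the subset without editing the long command by hand.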
A practical desktop example using Shaka Packager
A more practical example for common desktop machines might use Shaka Packager. A simple way to set up Shaka Packager is to install it within a Docker container, using Google's DockerHub image. Instructions can be found here:
https://github.com/google/shaka-packager#using-docker-for-testing--development
For this example, we used a machine with the following configuration:
System | Host: obs Kernel: 4.4.0-91-lowlatency x86_64 (64 bit) |
Desktop | Xfce 4.12.3 Distro: Ubuntu Studio 16.10 (https://ubuntustudio.org/2016/10/ubuntu-studio-16-10-released/) |
CPU | Quad core Intel Core i5-6500 (-MCP-) cache: 6144 KB clock speeds: max: 3600 MHz 1: 800 MHz 2: 800 MHz 3: 800 MHz 4: 800 MHz |
Graphics Card | Intel Skylake integrated graphics |
Memory | 8GB RAM |
In practice this machine could optimally produce the following usable range of ABR encodes, with FFmpeg consistently reporting 1x encode speed:
ffmpeg -stream_loop 100 -i 120s_tears_of_steel_1080p.webm \
-y -r 30 -g 90 -s 1920x1080 -quality realtime -speed 7 -threads 8 \
-row-mt 1 -tile-columns 2 -frame-parallel 1 -qmin 4 -qmax 48 \
-b:v 4500k -c:v vp9 -b:a 128k -c:a libopus -f webm pipe1 \
-y -r 30 -g 90 -s 1280x720 -quality realtime -speed 8 -threads 6 \
-row-mt 1 -tile-columns 2 -frame-parallel 1 -qmin 4 -qmax 48 \
-b:v 3000k -c:v vp9 -b:a 128k -c:a libopus -f webm pipe2 \
-y -r 30 -g 90 -s 640x360 -quality realtime -speed 8 -threads 2 \
-row-mt 1 -tile-columns 1 -frame-parallel 1 -qmin 4 -qmax 48 \
-b:v 730k -c:v vp9 -b:a 128k -c:a libopus -f webm pipe3
Note that the -speed settings are quite high. These settings were established experimentally and will vary from machine to machine.
Shaka Packager overhead
Packaging is not a particularly CPU-intensive activity. Shaka Packager can be set to listen for all the outputs, even if only a subset are being delivered by FFmpeg. These are the packager settings tested on the machine outlined above:
packager \
in=pipe1,stream=audio,init_segment=livehd-audio-1080.webm,segment_template=livehd-audio-1080-\$Number\$.webm \
in=pipe1,stream=video,init_segment=livehd-video-1080.webm,segment_template=livehd-video-1080-\$Number\$.webm \
in=pipe2,stream=audio,init_segment=livehd-audio-720.webm,segment_template=livehd-audio-720-\$Number\$.webm \
in=pipe2,stream=video,init_segment=livehd-video-720.webm,segment_template=livehd-video-720-\$Number\$.webm \
in=pipe3,stream=audio,init_segment=livehd-audio-360.webm,segment_template=livehd-audio-360-\$Number\$.webm \
in=pipe3,stream=video,init_segment=livehd-video-360.webm,segment_template=livehd-video-360-\$Number\$.webm \
--mpd_output livehd.mpd --dump_stream_info --min_buffer_time=10 --time_shift_buffer_depth=300 \
--segment_duration=3 --io_block_size 65536
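The commands in this article pair -g 90 with -r 30 and -g 75 with -r 25. In both cases that places a keyframe every 3 seconds, the same length as --segment_duration=3 in the packager command. A small sketch of that arithmetic follows. The helper name gop_frames is hypothetical, and treating the match as deliberate is an assumption, since the article does not state it.

```shell
#!/bin/sh
# gop_frames FPS SEGMENT_SECONDS: print the keyframe interval (-g), in
# frames, that places one keyframe at the start of each segment.
# (Hypothetical helper; plain multiplication of frame rate by duration.)
gop_frames() {
    echo $(($1 * $2))
}

gop_frames 30 3   # prints 90, matching -r 30 -g 90
gop_frames 25 3   # prints 75, matching -r 25 -g 75
```

If the segment duration or frame rate changes, recomputing -g this way keeps the two settings in step.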