Decoding a video with WebCodecs and @remotion/media-parser
parseMedia() can extract tracks and samples from audio and video files in a format that is suitable for use with WebCodecs APIs.
Unstable API: This package is experimental. We might change the API at any time, until we remove this notice.
Minimal example
Reading video frames
tsx
    import {parseMedia, OnAudioTrack, OnVideoTrack} from '@remotion/media-parser';

    const onVideoTrack: OnVideoTrack = (track) => {
      const videoDecoder = new VideoDecoder({
        output: console.log,
        error: console.error,
      });
      videoDecoder.configure(track);

      return (sample) => {
        videoDecoder.decode(new EncodedVideoChunk(sample));
      };
    };

    const result = await parseMedia({
      src: 'https://commondatastorage.googleapis.com/gtv-videos-bucket/sample/BigBuckBunny.mp4',
      onVideoTrack,
    });
Why is this useful?
WebCodecs is the fastest way to decode videos in the browser.
WebAssembly solutions need to strip the CPU-specific optimizations and cannot benefit from hardware acceleration.
To decode a video using WebCodecs, the binary format of a video needs to be understood and parsed.
This is a meticulous task that requires a lot of domain knowledge.
Video files usually come in one of two container formats: ISO BMFF (.mp4, .mov) or Matroska (.webm, .mkv).
Libraries like mp4box.js do a good job of parsing these containers, but are scoped to the specific container format, meaning you need to mix multiple libraries.
parseMedia() lets you read an arbitrary video file (in the future: an arbitrary media file) and interface with it regardless of container, video codec and audio codec.
It uses modern Web APIs like fetch(), ReadableStream and resizable ArrayBuffers, and returns data structures that are designed to be used with WebCodecs APIs.
Will Remotion switch to WebCodecs?
Not in the foreseeable future - Remotion currently renders videos with FFmpeg and a headless browser.
FFmpeg is just as fast as WebCodecs (they share the same code) - therefore it is not necessary to switch to WebCodecs.
Remotion cannot export videos in the browser, because browsers don't have APIs for capturing the viewport.
An exception is the canvas element; however, Remotion supports all ways of drawing to the viewport: HTML, CSS, SVG, and Canvas.
We are interested in WebCodecs because it still has the potential to solve a lot of problems for developers, and @remotion/media-parser as a whole can solve a lot of problems for users.
See also: Can I render videos in the browser?
Practical considerations
If you use parseMedia() with codecs, keep the following considerations in mind for your implementation.
Check browser support for @remotion/media-parser
Remotion requires the fetch() and resizable ArrayBuffer APIs to be present.
Check if your runtime supports these APIs before you use parseMedia().
tsx
    const canUseMediaParser =
      typeof fetch === 'function' &&
      typeof new ArrayBuffer().resize === 'function';
Check if the browser has VideoDecoder and AudioDecoder
Chrome has both VideoDecoder and AudioDecoder.
Firefox supports VideoDecoder and AudioDecoder only if the dom.media.webcodecs.enabled flag is enabled.
Safari supports VideoDecoder, but not AudioDecoder. You can decode the video track but not the audio track.
Please help improve this page if this information is outdated.
You can choose to not receive samples if the corresponding decoder is not supported in the browser.
Rejecting samples
tsx
    import type {OnAudioTrack, OnVideoTrack} from '@remotion/media-parser';

    const onVideoTrack: OnVideoTrack = (track) => {
      if (typeof VideoDecoder === 'undefined') {
        return null;
      }

      const videoDecoder = new VideoDecoder({
        output: console.log,
        error: console.error,
      });
      // ...
    };

    const onAudioTrack: OnAudioTrack = (track) => {
      if (typeof AudioDecoder === 'undefined') {
        return null;
      }

      const audioDecoder = new AudioDecoder({
        output: console.log,
        error: console.error,
      });
      // ...
    };
Check if the browser supports the codec
Not all browsers support all codecs that parseMedia() emits.
The best way to check if the browser supports a codec is to use AudioDecoder.isConfigSupported() and VideoDecoder.isConfigSupported().
These are async APIs. Fortunately, onAudioTrack and onVideoTrack allow async code as well.
Checking if the browser supports the codec
tsx
    import type {OnAudioTrack, OnVideoTrack} from '@remotion/media-parser';

    const onVideoTrack: OnVideoTrack = async (track) => {
      const videoDecoder = new VideoDecoder({
        output: console.log,
        error: console.error,
      });

      const {supported} = await VideoDecoder.isConfigSupported(track);
      if (!supported) {
        return null;
      }
      // ...
    };

    const onAudioTrack: OnAudioTrack = async (track) => {
      const audioDecoder = new AudioDecoder({
        output: console.log,
        error: console.error,
      });

      const {supported} = await AudioDecoder.isConfigSupported(track);
      if (!supported) {
        return null;
      }
      // ...
    };
Perform these checks in addition to the previously mentioned ones.
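The two checks (does the decoder class exist, and does it report the config as supported) can be folded into one helper. The following is only a sketch: `DecoderClassLike` and `canDecodeTrack` are hypothetical names that are not part of @remotion/media-parser, and the structural type is an assumption about the minimum that VideoDecoder and AudioDecoder expose.

```typescript
// Hypothetical structural type: anything exposing a static isConfigSupported(),
// such as the browser's VideoDecoder or AudioDecoder classes.
type DecoderClassLike<Config> = {
  isConfigSupported(config: Config): Promise<{supported?: boolean}>;
};

// Resolves to true only if the decoder class exists
// and reports the given config as supported.
async function canDecodeTrack<Config>(
  decoderClass: DecoderClassLike<Config> | undefined,
  config: Config,
): Promise<boolean> {
  if (typeof decoderClass === 'undefined') {
    return false;
  }

  try {
    const {supported} = await decoderClass.isConfigSupported(config);
    return supported === true;
  } catch {
    // isConfigSupported() rejects for invalid configs; treat that as unsupported.
    return false;
  }
}
```

Inside onVideoTrack, this could be called as `canDecodeTrack(typeof VideoDecoder === 'undefined' ? undefined : VideoDecoder, track)`, returning null from the callback if the result is false.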
Error handling
If an error occurs, you get the error in the error callback that you passed to the VideoDecoder or AudioDecoder constructor.
The decoder state will switch to "closed"; however, you will still receive samples.
If the decoder is in the "closed" state, you should stop passing samples to it.
Error handling
tsx
    import type {OnVideoTrack} from '@remotion/media-parser';

    const onVideoTrack: OnVideoTrack = async (track) => {
      const videoDecoder = new VideoDecoder({
        output: console.log,
        error: console.error,
      });

      return async (sample) => {
        if (videoDecoder.state === 'closed') {
          return;
        }
      };
    };
- The same logic goes for AudioDecoder.
- You should still perform the checks previously mentioned, but they are omitted from this example.
Queuing samples
Extracting samples is the fast part; decoding them is the slow part.
If too many samples are in the queue, it will negatively impact the performance of the page.
Fortunately, the parsing process can be temporarily paused while the decoder is busy.
For this, make the sample processing function async. Remotion will await it before processing the file further.
This way, samples that are not yet needed are not kept in memory, which keeps the decoding process efficient.
Only keeping 10 samples in the queue at a time
tsx
    import type {OnVideoTrack} from '@remotion/media-parser';

    const onVideoTrack: OnVideoTrack = async (track) => {
      const videoDecoder = new VideoDecoder({
        output: console.log,
        error: console.error,
      });

      return async (sample) => {
        if (videoDecoder.decodeQueueSize > 10) {
          let resolve = () => {};
          const cb = () => {
            resolve();
          };

          await new Promise<void>((r) => {
            resolve = r;
            videoDecoder.addEventListener('dequeue', cb);
          });
          videoDecoder.removeEventListener('dequeue', cb);
        }

        videoDecoder.decode(new EncodedVideoChunk(sample));
      };
    };
- The same logic goes for AudioDecoder.
- You should still perform the checks previously mentioned, but they are omitted from this example.
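The waiting logic can also be pulled out into a reusable helper that works for either decoder. This is a minimal sketch under assumptions: `QueueLike` and `waitForQueueToDrain` are hypothetical names, and the type only describes the members the helper needs (decodeQueueSize and the dequeue event). It loops until the queue is at or below the limit, because a single dequeue event does not guarantee that. It does not handle a decoder that is closed while waiting.

```typescript
// Hypothetical minimal shape of a decoder with a decode queue.
// It is meant to describe the parts of VideoDecoder / AudioDecoder used here.
type QueueLike = {
  readonly decodeQueueSize: number;
  addEventListener(
    type: 'dequeue',
    listener: () => void,
    options?: {once?: boolean},
  ): void;
};

// Resolves once the decode queue holds at most `maxQueueSize` items.
async function waitForQueueToDrain(
  decoder: QueueLike,
  maxQueueSize: number,
): Promise<void> {
  while (decoder.decodeQueueSize > maxQueueSize) {
    await new Promise<void>((resolve) => {
      // `once: true` removes the listener after it fired.
      decoder.addEventListener('dequeue', () => resolve(), {once: true});
    });
  }
}
```

In a sample callback, `await waitForQueueToDrain(videoDecoder, 10)` would then come before the call to `videoDecoder.decode()`.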
Handling stretched videos
Some videos don't have the same dimensions internally as they are presented.
For example, this sample video has a coded width of 1440, but a presentation width of 1920.
Handling stretched videos
tsx
    import type {OnVideoTrack} from '@remotion/media-parser';

    const onVideoTrack: OnVideoTrack = async (track) => {
      const videoDecoder = new VideoDecoder({
        output: console.log,
        error: console.error,
      });
      videoDecoder.configure(track);

      return async (sample) => {
        console.log(sample);
        // {
        //   codedWidth: 1440,
        //   codedHeight: 1080,
        //   displayAspectWidth: 1920,
        //   displayAspectHeight: 1080,
        //   ...
        // }
      };
    };
This means the frame is internally encoded in a 4:3 aspect ratio, but should be displayed as 16:9.
By passing all of codedWidth, codedHeight, displayAspectWidth and displayAspectHeight to the decoder when configuring it (videoDecoder.configure(), as in the example above), the decoder should handle the stretching correctly.
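To make the arithmetic concrete, here is a small sketch that derives the presentation size from the four fields. `getDisplaySize` is a hypothetical helper, not part of the package. It assumes one possible convention: keep the coded height and stretch the width to match the display aspect ratio.

```typescript
// Field names follow the logged output above; the type itself is hypothetical.
type StretchInfo = {
  codedWidth: number;
  codedHeight: number;
  displayAspectWidth: number;
  displayAspectHeight: number;
};

// Keeps the coded height and scales the width to the display aspect ratio.
function getDisplaySize(info: StretchInfo): {width: number; height: number} {
  const width = Math.round(
    (info.codedHeight * info.displayAspectWidth) / info.displayAspectHeight,
  );
  return {width, height: info.codedHeight};
}

const displaySize = getDisplaySize({
  codedWidth: 1440,
  codedHeight: 1080,
  displayAspectWidth: 1920,
  displayAspectHeight: 1080,
});
console.log(displaySize); // {width: 1920, height: 1080}
```

For the sample values, each coded pixel is displayed 1920 / 1440 ≈ 1.33 times as wide as it is tall.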
Handling rotation
WebCodecs does not seem to consider rotation.
For example, this video recorded with an iPhone has metadata stating that it should be displayed at 90 degrees rotation.
VideoDecoder is not able to rotate the video for you, so you might need to do it yourself, for example by drawing it to a canvas.
Fortunately, parseMedia() returns the rotation of the track:
Handling rotation
tsx
    import type {OnVideoTrack} from '@remotion/media-parser';

    const onVideoTrack: OnVideoTrack = async (track) => {
      console.log(track.rotation); // -90
      return null;
    };
See here for an example of how a video frame is turned into a bitmap and rotated.
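Before drawing a rotated frame to a canvas, you need to know how large that canvas should be. The following sketch computes the size after rotation. `normalizeRotation` and `getRotatedSize` are hypothetical helpers, and the sketch assumes the rotation is a multiple of 90 degrees, as in the example above.

```typescript
// Maps any angle in degrees to the range [0, 360), e.g. -90 becomes 270.
function normalizeRotation(rotation: number): number {
  return ((rotation % 360) + 360) % 360;
}

// Swaps width and height for quarter turns (90 or 270 degrees).
// Assumes `rotation` is a multiple of 90 degrees.
function getRotatedSize(
  width: number,
  height: number,
  rotation: number,
): {width: number; height: number} {
  const isQuarterTurn = normalizeRotation(rotation) % 180 !== 0;
  return isQuarterTurn ? {width: height, height: width} : {width, height};
}

console.log(getRotatedSize(1920, 1080, -90)); // {width: 1080, height: 1920}
```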
Understanding the different dimensions of a video
As just mentioned, some videos might be stretched or rotated.
In an extreme case, you might stumble upon a video that has three different sets of dimensions.
Different dimensions of a video
tsx
    import type {OnVideoTrack} from '@remotion/media-parser';

    const onVideoTrack: OnVideoTrack = async (track) => {
      console.log(track);
      // {
      //   codedWidth: 1440,
      //   codedHeight: 1080,
      //   displayAspectWidth: 1920,
      //   displayAspectHeight: 1080,
      //   width: 1080,
      //   height: 1900,
      //   ...
      // }
      return null;
    };
These fields mean the following:
- codedWidth and codedHeight are the dimensions of the video in the codec's internal format.
- displayAspectWidth and displayAspectHeight are the scaled dimensions of how the video should be displayed, but with rotation not yet applied. They are therefore not necessarily the dimensions in which the video is presented to the user. The fields are named like this because they correspond to the fields of the config that is passed to VideoDecoder.configure().
- width and height are the dimensions of the video as it would be displayed by a Player.
Google Chrome quirks
We find that, as of now, AudioDecoder.isConfigSupported() is not 100% reliable. For example, Chrome marks this config as supported, but then throws an error nonetheless.
tsx
    const config = {codec: 'opus', numberOfChannels: 6, sampleRate: 44100};
    console.log(await AudioDecoder.isConfigSupported(config)); // {supported: true}

    const decoder = new AudioDecoder({error: console.error, output: console.log});
    decoder.configure(config); // Unsupported configuration. Check isConfigSupported() prior to calling configure().
Consider this in your implementation.
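One defensive option is to treat configure() itself as something that can fail, even after isConfigSupported() reported success. The sketch below is an assumption about how you might do that: `ConfigurableDecoder` and `tryConfigure` are hypothetical names. It only covers a synchronous throw and the decoder state right after the call. Failures reported later through the error callback still need the "closed" state check before each decode().

```typescript
// Hypothetical minimal shape of a decoder; the state values are the
// ones WebCodecs defines for VideoDecoder and AudioDecoder.
type ConfigurableDecoder<Config> = {
  readonly state: 'unconfigured' | 'configured' | 'closed';
  configure(config: Config): void;
};

// Returns true if configure() did not throw and the decoder
// reports the "configured" state immediately afterwards.
function tryConfigure<Config>(
  decoder: ConfigurableDecoder<Config>,
  config: Config,
): boolean {
  try {
    decoder.configure(config);
  } catch {
    return false;
  }

  return decoder.state === 'configured';
}
```

If `tryConfigure()` returns false inside onAudioTrack or onVideoTrack, the callback can return null to opt out of receiving samples for that track.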
Safari performance
We find that with our reference implementation, Safari chokes on decoding the full Big Buck Bunny video. Tips are welcome; otherwise, we encourage you to consider whether, and which parts of, the WebCodecs APIs you want to support.
Reference implementation
A testbed with many different codecs and edge cases is available here.
Follow these instructions to run the testbed locally.
License reminder
Like Remotion itself, this package is licensed under the Remotion License.
TL;DR: Individuals and small teams can use this package, but teams of 4+ people need a company license.