Andy
making things with software

Reversing videos efficiently with AVFoundation

The problem

One of the features I wanted to add to the next version of my app, GIF Grabber, was the ability to set different loop types. To support "reverse" and "forward-reverse" style loops, we needed a way to reverse a video (AVAsset) in Objective-C.

Note: GIF Grabber, despite its name, keeps all recordings in MP4 video format until the user wants to save one as a GIF. Manipulating video and adding effects to it is much easier and more efficient than working with GIFs directly.

Example of forward (normal) looping GIF:

Forward-style loop

The same GIF with a reversed loop:

Reverse-style loop

Again, with a forward-reverse (ping-pong) style loop:

Forward-reverse-style loop

Existing solutions

Most of the answers and tutorials I found online suggested using AVAssetImageGenerator to output frames as images and then compositing them back in reverse order into a video. Because of the way AVAssetImageGenerator works, there are some major drawbacks to this solution:

Since the reversed video was going to be concatenated with the original, any difference in quality or timing would be very noticeable. We needed them to be exactly the same.

A more efficient solution

Since we deal with relatively short videos (30 seconds or less) we wanted to perform the procedure completely in-memory.

This can be achieved in three steps:

  1. Use AVAssetReader to read in the video as an array of CMSampleBufferRef values (each one contains the raw pixel data along with the timing info for a frame).
  2. Extract the image/pixel data for each frame and append it with the timing info of its mirror frame. (This step is necessary because we can't simply append the CMSampleBufferRef structs in reverse order: the timing info is embedded in each one.)
  3. Use AVAssetWriter to write it back out to a video file.
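
Step 2 is the heart of the technique, so here is a minimal sketch of just the pairing logic, written in plain C (which also compiles inside an Objective-C file). The Sample struct and reverse_samples function are hypothetical stand-ins for illustration: frame_id plays the role of the pixel buffer and pts the role of the presentation timestamp. It is not the code from the project.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a decoded sample. */
typedef struct {
    int64_t pts;      /* presentation time, in timescale units */
    int     frame_id; /* stands in for the frame's pixel buffer */
} Sample;

/* Fill out[i] with the timestamp of in[i] and the pixels of its
   mirror frame in[count - 1 - i]. The timestamps keep their original
   order; only the pixel data is reversed. */
static void reverse_samples(const Sample *in, Sample *out, size_t count) {
    assert(in != out); /* reversing in place would overwrite unread frames */
    for (size_t i = 0; i < count; i++) {
        out[i].pts = in[i].pts;
        out[i].frame_id = in[count - 1 - i].frame_id;
    }
}
```

With three input frames at times 0, 100 and 250, the output still has timestamps 0, 100 and 250, but the frames appear in the order third, second, first.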

You can find the source code here. In the next section, we'll walk through some of the more complicated parts.

1 Read in the video samples

// Initialize the reader

NSError *error = nil;
AVAssetReader *reader = [[AVAssetReader alloc] initWithAsset:asset error:&error];
AVAssetTrack *videoTrack = [[asset tracksWithMediaType:AVMediaTypeVideo] lastObject];

NSDictionary *readerOutputSettings = [NSDictionary dictionaryWithObjectsAndKeys:
                                        [NSNumber numberWithInt:kCVPixelFormatType_420YpCbCr8BiPlanarFullRange], kCVPixelBufferPixelFormatTypeKey,
                                        nil];
AVAssetReaderTrackOutput* readerOutput = [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:videoTrack
                                                                                    outputSettings:readerOutputSettings];
[reader addOutput:readerOutput];
[reader startReading];

First, we initialize the AVAssetReader object that will be used to read in the video as a series of samples (frames). We also configure the pixel format for the frame. You can read more about the different pixel format types here.
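
Because every decoded frame stays in memory until the writer is done, it helps to estimate the footprint before committing to this approach. The sketch below is plain C and assumes the 4:2:0 bi-planar format chosen above, which stores about 12 bits per pixel (a full-size luma plane plus a half-size chroma plane). The function names are hypothetical, and the estimate ignores row padding and per-buffer overhead, so treat the result as a rough lower bound.

```c
#include <assert.h>
#include <stdint.h>

/* Approximate bytes for one 8-bit 4:2:0 frame: width * height luma
   bytes plus half as many chroma bytes. Ignores row padding. */
static uint64_t frame_bytes_420(uint32_t width, uint32_t height) {
    return (uint64_t)width * height * 3 / 2;
}

/* Approximate bytes needed to hold every decoded frame of a clip. */
static uint64_t clip_bytes_420(uint32_t width, uint32_t height,
                               uint32_t fps, uint32_t seconds) {
    assert(fps > 0 && seconds > 0); /* sanity check on the inputs */
    return frame_bytes_420(width, height) * fps * seconds;
}
```

For an assumed 1280x720 clip at 30 frames per second lasting 30 seconds, this comes to 1,244,160,000 bytes (roughly 1.24 GB), so the in-memory approach is best suited to short or low-resolution recordings.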

// Read in the samples

NSMutableArray *samples = [[NSMutableArray alloc] init];

CMSampleBufferRef sample;
while ((sample = [readerOutput copyNextSampleBuffer])) {
    [samples addObject:(__bridge id)sample];
    CFRelease(sample);
}

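Next, we store the samples in an array. Because CMSampleBufferRef is a Core Foundation (C) type, we cast it to the Objective-C type id using __bridge. The array retains each sample it holds, so we release our own reference with CFRelease right after adding it.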

2 Prepare the writer that will convert the frames back to video

// Initialize the writer

AVAssetWriter *writer = [[AVAssetWriter alloc] initWithURL:outputURL
                                                  fileType:AVFileTypeMPEG4
                                                     error:&error];

This part is pretty straightforward: the AVAssetWriter object takes an output URL and the file type of the output file.

NSDictionary *videoCompressionProps = [NSDictionary dictionaryWithObjectsAndKeys:
                                        @(videoTrack.estimatedDataRate), AVVideoAverageBitRateKey,
                                        nil];

NSDictionary *writerOutputSettings = [NSDictionary dictionaryWithObjectsAndKeys:
                                        AVVideoCodecH264, AVVideoCodecKey,
                                        [NSNumber numberWithInt:videoTrack.naturalSize.width], AVVideoWidthKey,
                                        [NSNumber numberWithInt:videoTrack.naturalSize.height], AVVideoHeightKey,
                                        videoCompressionProps, AVVideoCompressionPropertiesKey,
                                        nil];

AVAssetWriterInput *writerInput = [[AVAssetWriterInput alloc] initWithMediaType:AVMediaTypeVideo
                                                                 outputSettings:writerOutputSettings
                                                               sourceFormatHint:(__bridge CMFormatDescriptionRef)[videoTrack.formatDescriptions lastObject]];

[writerInput setExpectsMediaDataInRealTime:NO];

Next, we create the AVAssetWriterInput object that will feed the frames to the AVAssetWriter. The configuration will depend on your source video - here, we specify the codec, dimensions, and compression properties.

We set the expectsMediaDataInRealTime property to NO since we are not processing a live video stream and therefore the writer can take its time without dropping frames.

3 Reverse the frames and save to file

// Initialize an input adaptor so that we can append PixelBuffer

AVAssetWriterInputPixelBufferAdaptor *pixelBufferAdaptor = [[AVAssetWriterInputPixelBufferAdaptor alloc] initWithAssetWriterInput:writerInput
                                                                                                      sourcePixelBufferAttributes:nil];

[writer addInput:writerInput];

[writer startWriting];
[writer startSessionAtSourceTime:CMSampleBufferGetPresentationTimeStamp((__bridge CMSampleBufferRef)samples[0])];

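First, we create an AVAssetWriterInputPixelBufferAdaptor object that acts as an adaptor to the writer input. This allows us to append the pixel buffer of each frame to the input directly. We then add the input to the writer and start a writing session at the timestamp of the first sample.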

// Append the frames to the output.
// Notice we append the frames from the tail end, using the timing of the frames from the front.

for (NSUInteger i = 0; i < samples.count; i++) {
    // Get the presentation time for the frame

    CMTime presentationTime = CMSampleBufferGetPresentationTimeStamp((__bridge CMSampleBufferRef)samples[i]);

    // take the image/pixel buffer from tail end of the array

    CVPixelBufferRef imageBufferRef = CMSampleBufferGetImageBuffer((__bridge CMSampleBufferRef)samples[samples.count - i - 1]);

    while (!writerInput.readyForMoreMediaData) {
        [NSThread sleepForTimeInterval:0.1];
    }

    [pixelBufferAdaptor appendPixelBuffer:imageBufferRef
                     withPresentationTime:presentationTime];
}

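// Tell the input that no more frames are coming, then finish the file.
// Note: -finishWriting is deprecated; newer code should prefer
// -finishWritingWithCompletionHandler:.
[writerInput markAsFinished];
[writer finishWriting];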

Note: Each sample (CMSampleBufferRef) contains two key pieces of information: a pixel buffer (CVPixelBufferRef) holding the pixel data for the frame, and a presentation timestamp that describes when the frame is to be displayed.

Next, we loop through all the frames, taking the presentation timestamp from each one and the pixel buffer from its mirror frame (count - i - 1). We pass both to the pixelBufferAdaptor we created earlier, which feeds them into the writer. Before appending each frame, we also wait until the writerInput is ready for more media data.

And finally, we write the output to disk.

That's it! Your reversed video should be saved and accessible at the output path you specified when initializing the writer.
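
For the forward-reverse loop that motivated all this, the reversed clip is played right after the original. One way to picture the combined timeline is the plain C sketch below. It is only an illustration of the frame ordering, not how GIF Grabber joins the clips: LoopFrame and forward_reverse_plan are hypothetical names, and the sketch assumes the original timestamps start at 0 and that duration is the clip length in the same units.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry in the combined timeline (hypothetical type). */
typedef struct {
    int64_t pts;          /* when to show the frame */
    size_t  source_index; /* which original frame to show */
} LoopFrame;

/* Writes 2 * count entries into out: the clip played forward,
   followed by the same clip reversed and shifted by duration. */
static void forward_reverse_plan(const int64_t *pts, size_t count,
                                 int64_t duration, LoopFrame *out) {
    assert(pts != NULL && out != NULL);
    for (size_t i = 0; i < count; i++) {
        /* Forward half: original frame at its original time. */
        out[i].pts = pts[i];
        out[i].source_index = i;
        /* Reversed half: mirror frame, forward timing, offset by duration. */
        out[count + i].pts = duration + pts[i];
        out[count + i].source_index = count - 1 - i;
    }
}
```

With frames at times 0, 100 and 250 and a duration of 400, the second half shows frames 2, 1 and 0 at times 400, 500 and 650.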

Download the source code

You can download the final source code here.