Using FFMPEG to Concatenate and Embed Subtitles

I recently upgraded my drone to a DJI Mavic Air 2. Among other things, it can create h.265 videos directly. It still uses the MP4 container format and a separate SRT file for storing the subtitles, which hold the flight data. Following most camera standards, it creates video files that are individually smaller than 4 GB, which works out to be about five minutes of 4k video.

If I want to upload a longer raw video to social media, the video files need to be concatenated before uploading.

Concatenating the video with FFMPEG has been something I’ve known how to do for a long time using either of two methods. Today I learned how to properly embed the subtitles in either the MKV or MP4 container format.

The MP4 format is more widely supported than the MKV format, but is less flexible as to what it can contain.  The MKV (Matroska Multimedia Container) container format can hold almost any type of media, and so I’m able to copy the SRT format directly. The MP4 (MPEG-4 Part 14) container format only supports a limited selection of subtitle formats, so I’m required to have FFMPEG convert the SRT stream to a MP4 compatible stream.  If you are interested in video container formats, these tables are very helpful.

I’ll give several examples using the two video files and their associated subtitle files created by the drone named DJI00001.MP4, DJI00002.MP4, DJI00001.SRT, and DJI00002.SRT. The method I’m using should work for any number of files, up to the largest filesize you can store on your filesystem.

To simply concatenate the video files, create a text input file (I’m using mp4files.txt) with the contents as follows

file DJI00001.MP4
file DJI00002.MP4

then use the ffmpeg command to create a new concatenated file.

ffmpeg -f concat -safe 0 -i mp4files.txt -c copy ConcatenatedVideo.MP4

If you want to embed the subtitles, you need to create a second text file, do some stream mapping, and specify what format the subtitles should be. In this case I’m using srtfiles.txt

file DJI00001.SRT
file DJI00002.SRT

My FFMPEG command to create an MP4 file gets a lot more involved, because now I have multiple inputs and must specify the subtitle format.

ffmpeg -f concat -safe 0 -i mp4files.txt -f concat -safe 0 -i srtfiles.txt -map 0:v -map 1 -c:v copy -c:s mov_text ConcatenatedVideo.MP4

The FFMPEG command to create an MKV file is only a tiny bit different, and the resulting file is only a tiny bit smaller.

ffmpeg -f concat -safe 0 -i mp4files.txt -f concat -safe 0 -i srtfiles.txt -map 0:v -map 1 -c:v copy -c:s copy ConcatenatedVideo.MKV

When playing the ConcatenatedVideo files on my local machine, I can now enable or disable the closed caption track properly in the player for either format. Unfortunately in my initial testing with YouTube, neither format maintains the second stream of subtitles.

This is not all a waste of time and effort: an advantage of embedding the subtitles in the container format is that their timing has been matched to the video, so they can now be extracted in concatenated form for use with YouTube.

ffmpeg -i ConcatenatedVideo.MKV -c copy ConcatenatedVideo.SRT

You can exclude the “-c copy” when extracting the subtitles and FFMPEG will run the stream through its subrip codec, producing nearly identical results. This only works with the MKV file, because the subtitle format stored in the MP4 file is not easily converted to an SRT file.
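
For reference, the codec-converting variation is just the same command without -c copy:

ffmpeg -i ConcatenatedVideo.MKV ConcatenatedVideo.SRT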

Using the -f concat option invokes the concat demuxer in FFMPEG, which has the limitation that the format needs to be exactly the same for each file. If there are any changes between the files you want to concatenate, you must use a more involved command invoking the concat filter, as sketched below. In a different project I ran into a command-length issue with the concat filter when the command got to be much over 900 characters long.
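
I won't reproduce that project's command here, but a minimal sketch of the filter approach for two video-only inputs looks like this; note that the concat filter re-encodes, so a video codec must be chosen instead of -c copy (libx265 here is just an example):

ffmpeg -i DJI00001.MP4 -i DJI00002.MP4 -filter_complex "[0:v][1:v]concat=n=2:v=1:a=0[v]" -map "[v]" -c:v libx265 ConcatenatedVideo.MP4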


Raspberry Pi ZeroW Camera Focus with FFMPEG

I wanted a quick and dirty method to test the camera module installation on my Raspberry Pi ZeroW. I don't have a monitor connected to the Raspberry Pi, and explicitly did not install the desktop version of the operating system. This is especially important because the camera itself may not be properly focused after installation in the case, and the only way to easily focus the camera is with a video stream that lets me make small adjustments and see them in nearly real time.

I’ve used FFMPEG for years as it handles almost any kind of video or audio I can throw at it. I use VLC on my desktop machine for similar reasons.

I did a quick install of ffmpeg on my Pi with the following command, allowing it to install all of its requirements, which added up to almost 126 new packages and 56MB that needed to be downloaded and installed.

sudo apt-get install ffmpeg -y

After it finished installing, I was able to run the following command with the 192.168.0.16 address being my desktop computer.

ffmpeg -f video4linux2 -input_format h264 -video_size 1280x720 -framerate 30 -i /dev/video0 -vcodec copy -an -f mpegts udp://192.168.0.16:5000?pkt_size=1316

On my desktop computer I ran VLC, under the Media menu, selected Open Network Stream, and opened:

udp://@0.0.0.0:5000


What I'm doing is using FFMPEG to pull video from the device and push it as UDP datagrams to my desktop on port 5000. The pkt_size=1316 holds seven 188-byte MPEG-TS packets, keeping each datagram under the typical 1500-byte Ethernet MTU. VLC then opens port 5000 on the local machine to receive the datagrams, and decodes and displays the video. An interesting thing about this method is that I can stop transmitting from the Raspberry Pi, then restart it, and VLC will keep accepting the packets, since UDP is a connectionless protocol.
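
If VLC isn't handy, ffplay (which ships with FFMPEG) should be able to act as the receiver in the same way; a sketch I haven't tested as thoroughly:

ffplay -fflags nobuffer udp://0.0.0.0:5000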

What really surprised me was that when I logged in to my Raspberry Pi a second time to check the CPU usage while streaming, it was only running around 12% of the CPU. I was also interested in knowing what native formats the camera supported.

ffmpeg -f v4l2 -list_formats all -i /dev/video0
ffmpeg version 4.1.4-1+rpt1~deb10u1 Copyright (c) 2000-2019 the FFmpeg developers
  built with gcc 8 (Raspbian 8.3.0-6+rpi1)
  configuration: --prefix=/usr --extra-version='1+rpt1~deb10u1' --toolchain=hardened --libdir=/usr/lib/arm-linux-gnueabihf --incdir=/usr/include/arm-linux-gnueabihf --arch=arm --enable-gpl --disable-stripping --enable-avresample --disable-filter=resample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librsvg --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opengl --enable-sdl2 --enable-omx-rpi --enable-mmal --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
  libavutil      56. 22.100 / 56. 22.100
  libavcodec     58. 35.100 / 58. 35.100
  libavformat    58. 20.100 / 58. 20.100
  libavdevice    58.  5.100 / 58.  5.100
  libavfilter     7. 40.101 /  7. 40.101
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  3.100 /  5.  3.100
  libswresample   3.  3.100 /  3.  3.100
  libpostproc    55.  3.100 / 55.  3.100
[video4linux2,v4l2 @ 0x2367e40] Raw       :     yuv420p :     Planar YUV 4:2:0 : {32-3280, 2}x{32-2464, 2}
[video4linux2,v4l2 @ 0x2367e40] Raw       :     yuyv422 :           YUYV 4:2:2 : {32-3280, 2}x{32-2464, 2}
[video4linux2,v4l2 @ 0x2367e40] Raw       :       rgb24 :     24-bit RGB 8-8-8 : {32-3280, 2}x{32-2464, 2}
[video4linux2,v4l2 @ 0x2367e40] Compressed:       mjpeg :            JFIF JPEG : {32-3280, 2}x{32-2464, 2}
[video4linux2,v4l2 @ 0x2367e40] Compressed:        h264 :                H.264 : {32-3280, 2}x{32-2464, 2}
[video4linux2,v4l2 @ 0x2367e40] Compressed:       mjpeg :          Motion-JPEG : {32-3280, 2}x{32-2464, 2}
[video4linux2,v4l2 @ 0x2367e40] Raw       : Unsupported :           YVYU 4:2:2 : {32-3280, 2}x{32-2464, 2}
[video4linux2,v4l2 @ 0x2367e40] Raw       : Unsupported :           VYUY 4:2:2 : {32-3280, 2}x{32-2464, 2}
[video4linux2,v4l2 @ 0x2367e40] Raw       :     uyvy422 :           UYVY 4:2:2 : {32-3280, 2}x{32-2464, 2}
[video4linux2,v4l2 @ 0x2367e40] Raw       :        nv12 :         Y/CbCr 4:2:0 : {32-3280, 2}x{32-2464, 2}
[video4linux2,v4l2 @ 0x2367e40] Raw       :       bgr24 :     24-bit BGR 8-8-8 : {32-3280, 2}x{32-2464, 2}
[video4linux2,v4l2 @ 0x2367e40] Raw       :     yuv420p :     Planar YVU 4:2:0 : {32-3280, 2}x{32-2464, 2}
[video4linux2,v4l2 @ 0x2367e40] Raw       : Unsupported :         Y/CrCb 4:2:0 : {32-3280, 2}x{32-2464, 2}
[video4linux2,v4l2 @ 0x2367e40] Raw       :        bgr0 : 32-bit BGRA/X 8-8-8-8 : {32-3280, 2}x{32-2464, 2}
/dev/video0: Immediate exit requested

That output leads me to believe that the camera module can output either h264 or mjpeg without significant CPU overhead. What it doesn't tell me is which frame sizes are efficient. It seems to say that the horizontal and vertical dimensions can be anything from 32 to 3280 and from 32 to 2464, in steps of 2. I know that the specs on the camera say it will shoot still frames at the full resolution, but video at significantly lower resolutions.

Two Video4Linux commands that return interesting and similar results are:

pi@WimPiZeroCamera:~ $ v4l2-ctl --list-formats-ext
ioctl: VIDIOC_ENUM_FMT
        Type: Video Capture

        [0]: 'YU12' (Planar YUV 4:2:0)
                Size: Stepwise 32x32 - 3280x2464 with step 2/2
        [1]: 'YUYV' (YUYV 4:2:2)
                Size: Stepwise 32x32 - 3280x2464 with step 2/2
        [2]: 'RGB3' (24-bit RGB 8-8-8)
                Size: Stepwise 32x32 - 3280x2464 with step 2/2
        [3]: 'JPEG' (JFIF JPEG, compressed)
                Size: Stepwise 32x32 - 3280x2464 with step 2/2
        [4]: 'H264' (H.264, compressed)
                Size: Stepwise 32x32 - 3280x2464 with step 2/2
        [5]: 'MJPG' (Motion-JPEG, compressed)
                Size: Stepwise 32x32 - 3280x2464 with step 2/2
        [6]: 'YVYU' (YVYU 4:2:2)
                Size: Stepwise 32x32 - 3280x2464 with step 2/2
        [7]: 'VYUY' (VYUY 4:2:2)
                Size: Stepwise 32x32 - 3280x2464 with step 2/2
        [8]: 'UYVY' (UYVY 4:2:2)
                Size: Stepwise 32x32 - 3280x2464 with step 2/2
        [9]: 'NV12' (Y/CbCr 4:2:0)
                Size: Stepwise 32x32 - 3280x2464 with step 2/2
        [10]: 'BGR3' (24-bit BGR 8-8-8)
                Size: Stepwise 32x32 - 3280x2464 with step 2/2
        [11]: 'YV12' (Planar YVU 4:2:0)
                Size: Stepwise 32x32 - 3280x2464 with step 2/2
        [12]: 'NV21' (Y/CrCb 4:2:0)
                Size: Stepwise 32x32 - 3280x2464 with step 2/2
        [13]: 'BGR4' (32-bit BGRA/X 8-8-8-8)
                Size: Stepwise 32x32 - 3280x2464 with step 2/2
pi@WimPiZeroCamera:~ $ v4l2-ctl -L

User Controls

                     brightness 0x00980900 (int)    : min=0 max=100 step=1 default=50 value=50 flags=slider
                       contrast 0x00980901 (int)    : min=-100 max=100 step=1 default=0 value=0 flags=slider
                     saturation 0x00980902 (int)    : min=-100 max=100 step=1 default=0 value=0 flags=slider
                    red_balance 0x0098090e (int)    : min=1 max=7999 step=1 default=1000 value=1000 flags=slider
                   blue_balance 0x0098090f (int)    : min=1 max=7999 step=1 default=1000 value=1000 flags=slider
                horizontal_flip 0x00980914 (bool)   : default=0 value=0
                  vertical_flip 0x00980915 (bool)   : default=0 value=0
           power_line_frequency 0x00980918 (menu)   : min=0 max=3 default=1 value=1
                                0: Disabled
                                1: 50 Hz
                                2: 60 Hz
                                3: Auto
                      sharpness 0x0098091b (int)    : min=-100 max=100 step=1 default=0 value=0 flags=slider
                  color_effects 0x0098091f (menu)   : min=0 max=15 default=0 value=0
                                0: None
                                1: Black & White
                                2: Sepia
                                3: Negative
                                4: Emboss
                                5: Sketch
                                6: Sky Blue
                                7: Grass Green
                                8: Skin Whiten
                                9: Vivid
                                10: Aqua
                                11: Art Freeze
                                12: Silhouette
                                13: Solarization
                                14: Antique
                                15: Set Cb/Cr
                         rotate 0x00980922 (int)    : min=0 max=360 step=90 default=0 value=0 flags=modify-layout
             color_effects_cbcr 0x0098092a (int)    : min=0 max=65535 step=1 default=32896 value=32896

Codec Controls

             video_bitrate_mode 0x009909ce (menu)   : min=0 max=1 default=0 value=0 flags=update
                                0: Variable Bitrate
                                1: Constant Bitrate
                  video_bitrate 0x009909cf (int)    : min=25000 max=25000000 step=25000 default=10000000 value=10000000
         repeat_sequence_header 0x009909e2 (bool)   : default=0 value=0
            h264_i_frame_period 0x00990a66 (int)    : min=0 max=2147483647 step=1 default=60 value=60
                     h264_level 0x00990a67 (menu)   : min=0 max=11 default=11 value=11
                                0: 1
                                1: 1b
                                2: 1.1
                                3: 1.2
                                4: 1.3
                                5: 2
                                6: 2.1
                                7: 2.2
                                8: 3
                                9: 3.1
                                10: 3.2
                                11: 4
                   h264_profile 0x00990a6b (menu)   : min=0 max=4 default=4 value=4
                                0: Baseline
                                1: Constrained Baseline
                                2: Main
                                4: High

Camera Controls

                  auto_exposure 0x009a0901 (menu)   : min=0 max=3 default=0 value=0
                                0: Auto Mode
                                1: Manual Mode
         exposure_time_absolute 0x009a0902 (int)    : min=1 max=10000 step=1 default=1000 value=1000
     exposure_dynamic_framerate 0x009a0903 (bool)   : default=0 value=0
             auto_exposure_bias 0x009a0913 (intmenu): min=0 max=24 default=12 value=12
                                0: -4000 (0xfffffffffffff060)
                                1: -3667 (0xfffffffffffff1ad)
                                2: -3333 (0xfffffffffffff2fb)
                                3: -3000 (0xfffffffffffff448)
                                4: -2667 (0xfffffffffffff595)
                                5: -2333 (0xfffffffffffff6e3)
                                6: -2000 (0xfffffffffffff830)
                                7: -1667 (0xfffffffffffff97d)
                                8: -1333 (0xfffffffffffffacb)
                                9: -1000 (0xfffffffffffffc18)
                                10: -667 (0xfffffffffffffd65)
                                11: -333 (0xfffffffffffffeb3)
                                12: 0 (0x0)
                                13: 333 (0x14d)
                                14: 667 (0x29b)
                                15: 1000 (0x3e8)
                                16: 1333 (0x535)
                                17: 1667 (0x683)
                                18: 2000 (0x7d0)
                                19: 2333 (0x91d)
                                20: 2667 (0xa6b)
                                21: 3000 (0xbb8)
                                22: 3333 (0xd05)
                                23: 3667 (0xe53)
                                24: 4000 (0xfa0)
      white_balance_auto_preset 0x009a0914 (menu)   : min=0 max=9 default=1 value=1
                                0: Manual
                                1: Auto
                                2: Incandescent
                                3: Fluorescent
                                4: Fluorescent H
                                5: Horizon
                                6: Daylight
                                7: Flash
                                8: Cloudy
                                9: Shade
            image_stabilization 0x009a0916 (bool)   : default=0 value=0
                iso_sensitivity 0x009a0917 (intmenu): min=0 max=4 default=0 value=0
                                0: 0 (0x0)
                                1: 100000 (0x186a0)
                                2: 200000 (0x30d40)
                                3: 400000 (0x61a80)
                                4: 800000 (0xc3500)
           iso_sensitivity_auto 0x009a0918 (menu)   : min=0 max=1 default=1 value=1
                                0: Manual
                                1: Auto
         exposure_metering_mode 0x009a0919 (menu)   : min=0 max=2 default=0 value=0
                                0: Average
                                1: Center Weighted
                                2: Spot
                     scene_mode 0x009a091a (menu)   : min=0 max=13 default=0 value=0
                                0: None
                                8: Night
                                11: Sports

JPEG Compression Controls

            compression_quality 0x009d0903 (int)    : min=1 max=100 step=1 default=30 value=30


FFMPEG and ROAV Dash Cam C1 Pro

I recently purchased a dedicated dashcam on sale to replace my GoPro setup for trip videos, which gives me a new file format to understand.


The Roav Dashcam stores sequential mp4 files. When configuring the camera it’s possible to set the loop time, which is the duration of each mp4. There’s also an option to watermark the files. I have it turned on, and the only thing I’ve noticed is the ROAV logo, timestamp, and speed in the bottom right. It does not appear to have a way of adjusting the size of the text.

My initial recordings were set to run at 1080p 60 fps. I wanted to concatenate multiple files, add some text of my own, and speed up the video. This was my first experience using the -filter_complex option of FFMPEG. Here's what I came up with to put together three files, speed the output up by a factor of 60, and add some text. I'm dropping the audio completely. The ROAV can record audio inside the car, but I configured it not to, as I don't want to hear what I was listening to on the radio or what I might be saying if I make a phone call.

ffmpeg.exe -hide_banner -i 2018_0512_130537_050A.MP4 -i 2018_0512_131537_051A.MP4 -i 2018_0512_132537_052A.MP4 -filter_complex "[0:v] [1:v] [2:v] concat=n=3:v=1 [v];[v]setpts=0.01666*PTS,drawtext=fontfile=C\\:/WINDOWS/Fonts/consola.ttf:fontcolor=white:fontsize=80:y=main_h-text_h-50:x=50:text=WimsWorld[o]" -map "[o]" -c:v libx265 -crf 23 -preset veryfast -movflags +faststart -bf 2 -g 15 -pix_fmt yuv420p  FirstMixSpeed60Concat.mp4

This first video was recorded at 1080p60. The camera can record at 1440p30, which I will be trying soon to see if things like license plates are more legible. The setpts factor I'm currently using is 1/60, so that 1 minute of real time is compressed to 1 second of video, simply dropping the extra frames. I expect to need to change the setpts factor to 1/30 because of the decreased frame rate at the higher resolution.
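
Assuming that holds, only the setpts factor should need to change; a hypothetical sketch with placeholder input names (1/30 ≈ 0.0333):

ffmpeg.exe -hide_banner -i input1.MP4 -i input2.MP4 -filter_complex "[0:v] [1:v] concat=n=2:v=1 [v];[v]setpts=0.0333*PTS[o]" -map "[o]" -c:v libx265 -crf 23 -preset veryfast -movflags +faststart -pix_fmt yuv420p SpeedUp30Concat.mp4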

FFMPEG and h.265

I’ve noticed that YouTube transcodes my videos after I upload them and wanted to know more. It turns out that they are internally using a form of h.265 video encoding, which reduces the data size significantly without reducing perceived quality over h.264 video compression.

I decided to run some tests using my GoPro time lapse program to see how much compression I’d get versus how much extra time for encoding.

First I had to read up on the settings for using h.265 in FFmpeg. According to https://trac.ffmpeg.org/wiki/Encode/H.265, if I add -c:v libx265 into my existing FFmpeg command line without changing anything else, I'll get an h.265 output with the defaults of -crf 28 and -preset medium.

The -preset value affects how much work is done in the compression, but shouldn't affect the perceived quality. It will affect both the time to create and the output file size.

The -crf value affects the perceived quality. I've been using the defaults in my previous h.264 mp4 files, which should be approximately -crf 23, and supposedly -crf 28 in h.265 is visually equivalent to that h.264 default. A -crf 0 would be a completely lossless conversion. For my tests, I left crf at the h.265 default of 28.
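
For reference, here's a sketch of the general form of these test commands, reusing the crop and scaling from my time lapse workflow (the preset and crf varied per output file):

ffmpeg -r 30 -i Wim%05d.JPG -vf crop=in_w:3/4*in_h -s 3840x2160 -c:v libx265 -preset veryfast -crf 28 -pix_fmt yuv420p 20180324-veryfast-2160p30-cropped-crf28.mp4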

I ran all of these tests on a set of 7,129 photos I’d captured while sailboat racing on March 24th using my GoPro Hero 3+ Black. Each input image was 4000×3000 and I’m creating an output video with resolution 3840×2160 by cropping it at 3/4 of the height and scaling to fit.

The original h.264 conversion took 28:36 minutes to create and was 1,162,079,866 bytes long.

Here’s a trimmed down and sorted file listing. The first column is the time it took to create, second is filesize, and third is filename.

28:36 1,162,079,866 20180324-2160p30-cropped-h264.mp4
30:25 1,006,292,960 20180324-veryfast-2160p30-cropped-h265-crf20.mp4
21:35   328,700,100 20180324-ultrafast-2160p30-cropped-crf28.mp4
21:59   339,438,778 20180324-superfast-2160p30-cropped-crf28.mp4
25:12   350,085,434 20180324-veryfast-2160p30-cropped-crf28.mp4
25:17   349,720,030 20180324-faster-2160p30-cropped-crf28.mp4
27:24   348,582,310 20180324-fast-2160p30-cropped-crf28.mp4
52:07   365,050,252 20180324-medium-2160p30-cropped-crf28.mp4

I created a single test at -crf 20 because I was interested in seeing a higher quality video and how different it would be in size. It took slightly longer than the original h.264 compression and had a slight improvement in size.

Two things became obvious to me from this test. More time spent in compression doesn’t always mean better compression. For my application, running preset medium actually hurts the performance, both in file size and time taken.

I ran all of these tests only a single time on my desktop computer with an Intel(R) Core(TM) i7-4771 CPU @ 3.50GHz, 3501 Mhz, 4 Core(s), 8 Logical Processor(s) running the latest version of Windows Version 10.0.16299 Build 16299. I was running FFmpeg version N-88668-g723b6baaf8 from https://ffmpeg.zeranoe.com/builds/. The variations in time could be related to background tasks running on the machine, and I should have run a more comprehensive battery of tests, averaging time and with more accurate timekeeping.
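
If I do rerun these, FFmpeg's global -benchmark option prints user, system, and real time when a run finishes, which would be more accurate than my manual timing; a sketch:

ffmpeg -benchmark -r 30 -i Wim%05d.JPG -c:v libx265 -preset veryfast -crf 28 out.mp4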

FFMPEG and drawtext

Several years ago I wrote a program that consolidates time-lapse pictures into a directory and calls FFMPEG to create a video.

I had been wanting the time-code from when each picture was taken printed on the screen while the video was playing but had not figured out how to get it done until this weekend.

Video TimeCode: frame from the video showing the DateTimeOriginal timecode embedded.

I'd gone down multiple paths in an attempt to get this result before finally getting the drawtext feature to work. My program manually pulled the metadata from the images before feeding them to FFMPEG. I'd tried creating both text files and image files for overlaying. None of those got the result I was looking for.

When I finally got everything working, it seems simple, but the underlying problem has to do with the amount of string escaping required to get the command to work.

Here’s an example command I was issuing to ffmpeg that got the result I was looking for.

ffmpeg.exe -hide_banner -r 30 -i Wim%05d.JPG -vf crop=in_w:3/4*in_h,drawtext=fontfile=C\\:/WINDOWS/Fonts/OCRAEXT.ttf:fontcolor=white:fontsize=160:y=main_h-text_h-50:x=main_w-text_w-50:text=WimsWorld,drawtext=fontfile=C\\:/WINDOWS/Fonts/OCRAEXT.ttf:fontcolor=white:fontsize=160:y=main_h-text_h-50:x=50:text=%{metadata\\:DateTimeOriginal} -s 3840x2160 -pix_fmt yuv420p -n Test-2160p30-cropped.mp4

If you look at the -vf option parameter, I’m cropping my input pictures to 3/4 their original height, then using the drawtext feature twice. First I write the static text to the bottom right of the frame, then I extract metadata from the source image and write it to the bottom left of the frame.

Because I'm calling this from a program, I had to add an extra level of escaping for the \ character in my code. All of the escaping required a lot of trial and error to get things working. I'm using OCRAEXT as my font, but I could be using any fixed-width font. Because the time changes every frame, it's important that the font not be proportionally spaced, so the text stays easy to read.
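
To illustrate those layers, here's a hypothetical fragment (the variable name vfArg is mine, not from the real program) showing roughly what the drawtext portion looks like as a C++ string literal; each backslash FFMPEG must receive doubles again in the source.

// Hypothetical illustration of the escaping layers: the command line
// needs C\\: so the filtergraph parser yields C\: and drawtext finally
// sees C: -- in C++ source each of those backslashes doubles once more.
const TCHAR * vfArg =
	_T("crop=in_w:3/4*in_h,")
	_T("drawtext=fontfile=C\\\\:/WINDOWS/Fonts/OCRAEXT.ttf:")
	_T("fontcolor=white:fontsize=160:y=main_h-text_h-50:x=50:")
	_T("text=%{metadata\\\\:DateTimeOriginal}");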

GoPro Battery BacPac

I purchased a GoPro Battery BacPac recently because I realized that I'd rather have extended shooting time than two separate batteries that needed to be charged. I had already purchased a second standard battery for the GoPro, but I didn't have a way to charge it when it was not in the camera, so I found it less useful than I had hoped.

The fact that I am most often using my GoPro in harsh conditions means that I’d rather not open the case any more frequently than I need to. When I’m skiing, if I go into the lodge to change the battery, the very first thing I notice is that the cold GoPro case is suddenly steamed over by the indoor humidity. 

When I've been doing my stop motion photography at a picture every two seconds, the standard battery generally lasts just under two hours. My first test with the BacPac attached started at 8:47am and the last picture was at 12:40, so it looks like it gets me to just under 4 hours total.

It created 6999 files in that time frame. I’ve not figured out if the GoPro uses any less battery when taking sequential still photos vs when it’s taking movies. 

Saturday’s race on Different Drummer wasn’t fully captured in the time allotted because we went out early and did some practice work flying the spinnaker.

Hopefully I’ll get around to writing more about what comes with the BacPac in the next couple of days. I was mostly interested in sharing the extension of the recording time.

Time-lapse videos from GoPro

In November 2013 I purchased a GoPro HERO3+ Black Edition to play with video recording, but almost immediately became enthralled with taking sequences of photos over long time periods.

The GoPro can be configured to take a picture every 0.5, 1, 2, 5, 10, 30, or 60 seconds. I've found that I like taking pictures every 2 seconds and then converting them to video at 30fps (frames per second), which gets me an easy-to-convert time scale: 1 second of video comes from 1 minute of photos, and 1 minute of video comes from 1 hour of photos.

GoPro has a freely available software package to edit videos, as well as creating videos from sequences of images. Because of my past familiarity with FFMPEG I wanted a more scriptable solution for creating videos from thousands of photos.

https://trac.ffmpeg.org/wiki/Create%20a%20video%20slideshow%20from%20images has nice instructions for creating videos from sequences of images using FFMPEG. What it glosses over is that the first image in the sequence needs to be numbered zero or one. Another complication is that the GoPro uses the standard camera file format, where no more than 1000 images are stored in a single directory. This means that the 1800 images created in a single hour will span at least two directories. An interesting issue I ran across is that the GoPro will sometimes skip a number in its image sequence, especially when it has just moved to the next directory in the sequence. This is why I had to write my program using directory listings as opposed to simply looking for known file names.
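
As an aside, newer FFMPEG builds accept a -start_number input option for image sequences, which relaxes the zero-or-one starting requirement (though it doesn't help with skipped numbers or directory splits). A sketch, assuming a renamed Wim%05d.JPG sequence starting at 1234:

ffmpeg -framerate 30 -start_number 1234 -i Wim%05d.JPG -s 1920x1080 out.mp4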

The standard GoPro battery will record just about two hours' worth of photos. If the GoPro is connected to an external power supply, you are limited only by the amount of storage space.

Here’s yesterday morning’s weather changing in Seattle.

Here’s a comparison of cropping vs compressing the video. I took this video on a flight from Seattle to Pullman last weekend. You can see much more of the landscape in the compressed version, and see that the top of the propeller leaves the frame in the cropped version.

Compressed:


Cropped:

I've written a program that takes three parameters, copies all of the images to a temporary location with an acceptable filename sequence, runs FFMPEG to create a video, then deletes the temporary images. The GoPro is configured to take full resolution still frames, 4000×3000, and I convert those to a 1080p video format using FFMPEG. Because the aspect ratio is different, and the GoPro uses a fish eye lens to begin with, both vertical and horizontal distortion shows up. I run FFMPEG twice, once creating a compressed video and a second time creating a cropped video. This allows me to choose which level of distortion I prefer after the fact.

The three parameters are the video name, the first image in the sequence, and the last image in the sequence. I am currently doing very little error checking. I’m presenting this code here, just to document what I’ve done so far. If you find this useful, please let me know.

Here are some helper functions I regularly use.

#include <afxwin.h>     // MFC: CString, CFileFind, AfxGetResourceHandle
#include <shlwapi.h>    // PathRemoveFileSpec, PathAppend
#include <ctime>        // time_t, gmtime_s
#include <sstream>      // std::ostringstream
#include <string>

using namespace std;

/////////////////////////////////////////////////////////////////////////////
CString FindEXEFromPath(const CString & csEXE)
{
	CString csFullPath;
	CFileFind finder;
	if (finder.FindFile(csEXE))
	{
		finder.FindNextFile();
		csFullPath = finder.GetFilePath();
		finder.Close();
	}
	else
	{
		TCHAR filename[MAX_PATH];
		unsigned long buffersize = sizeof(filename) / sizeof(TCHAR);
		// Get the file name that we are running from.
		GetModuleFileName(AfxGetResourceHandle(), filename, buffersize );
		PathRemoveFileSpec(filename);
		PathAppend(filename, csEXE);
		if (finder.FindFile(filename))
		{
			finder.FindNextFile();
			csFullPath = finder.GetFilePath();
			finder.Close();
		}
		else
		{
			CString csPATH;
			csPATH.GetEnvironmentVariable(_T("PATH"));
			int iStart = 0;
			CString csToken(csPATH.Tokenize(_T(";"), iStart));
			while (csToken != _T(""))
			{
				if (csToken.Right(1) != _T("\\"))
					csToken.AppendChar(_T('\\'));
				csToken.Append(csEXE);
				if (finder.FindFile(csToken))
				{
					finder.FindNextFile();
					csFullPath = finder.GetFilePath();
					finder.Close();
					break;
				}
				csToken = csPATH.Tokenize(_T(";"), iStart);
			}
		}
	}
	return(csFullPath);
}
/////////////////////////////////////////////////////////////////////////////
static const CString QuoteFileName(const CString & Original)
{
	CString csQuotedString(Original);
	if (csQuotedString.Find(_T(" ")) >= 0)
	{
		csQuotedString.Insert(0,_T('"'));
		csQuotedString.AppendChar(_T('"'));
	}
	return(csQuotedString);
}
/////////////////////////////////////////////////////////////////////////////
std::string timeToISO8601(const time_t & TheTime)
{
	std::ostringstream ISOTime;
	struct tm UTC;// = gmtime(&timer);
	if (0 == gmtime_s(&UTC, &TheTime))
	{
		ISOTime.fill('0');
		ISOTime << UTC.tm_year+1900 << "-";
		ISOTime.width(2);
		ISOTime << UTC.tm_mon+1 << "-";
		ISOTime.width(2);
		ISOTime << UTC.tm_mday << "T";
		ISOTime.width(2);
		ISOTime << UTC.tm_hour << ":";
		ISOTime.width(2);
		ISOTime << UTC.tm_min << ":";
		ISOTime.width(2);
		ISOTime << UTC.tm_sec;
	}
	return(ISOTime.str());
}
std::wstring getTimeISO8601(void)
{
	time_t timer;
	time(&timer);
	std::string isostring(timeToISO8601(timer));
	std::wstring rval;
	rval.assign(isostring.begin(), isostring.end());
	
	return(rval);
}
/////////////////////////////////////////////////////////////////////////////

Here’s a routine I found useful to parse the standard camera file system naming format.

/////////////////////////////////////////////////////////////////////////////
bool SplitImagePath(
	CString csSrcPath,
	CString & DestParentDir,
	int & DestChildNum,
	CString & DestChildSuffix,
	CString & DestFilePrefix,
	int & DestFileNumDigits,
	int & DestFileNum,
	CString & DestFileExt
	)
{
	bool rval = true;
	DestFileExt.Empty();
	while (csSrcPath[csSrcPath.GetLength()-1] != _T('.'))
	{
		DestFileExt.Insert(0, csSrcPath[csSrcPath.GetLength()-1]);
		csSrcPath.Truncate(csSrcPath.GetLength()-1);
	}
	csSrcPath.Truncate(csSrcPath.GetLength()-1); // get rid of dot

	CString csDestFileNum;
	DestFileNumDigits = 0;
	while (iswdigit(csSrcPath[csSrcPath.GetLength()-1]))
	{
		csDestFileNum.Insert(0, csSrcPath[csSrcPath.GetLength()-1]);
		DestFileNumDigits++;
		csSrcPath.Truncate(csSrcPath.GetLength()-1);
	}
	DestFileNum = _wtoi(csDestFileNum.GetString());

	DestFilePrefix.Empty();
	while (iswalpha(csSrcPath[csSrcPath.GetLength()-1]))
	{
		DestFilePrefix.Insert(0, csSrcPath[csSrcPath.GetLength()-1]);
		csSrcPath.Truncate(csSrcPath.GetLength()-1);
	}
	csSrcPath.Truncate(csSrcPath.GetLength()-1); // get rid of backslash

	DestChildSuffix.Empty();
	while (iswalpha(csSrcPath[csSrcPath.GetLength()-1]))
	{
		DestChildSuffix.Insert(0, csSrcPath[csSrcPath.GetLength()-1]);
		csSrcPath.Truncate(csSrcPath.GetLength()-1);
	}

	CString csDestChildNum;
	while (iswdigit(csSrcPath[csSrcPath.GetLength()-1]))
	{
		csDestChildNum.Insert(0, csSrcPath[csSrcPath.GetLength()-1]);
		csSrcPath.Truncate(csSrcPath.GetLength()-1);
	}
	DestChildNum = _wtoi(csDestChildNum.GetString());

	DestParentDir = csSrcPath;
	return(rval);
}
/////////////////////////////////////////////////////////////////////////////

And here’s the main program.

/////////////////////////////////////////////////////////////////////////////
int _tmain(int argc, TCHAR* argv[], TCHAR* envp[])
{
	int nRetCode = 0;

	HMODULE hModule = ::GetModuleHandle(NULL);

	if (hModule != NULL)
	{
		// initialize MFC and print an error on failure
		if (!AfxWinInit(hModule, NULL, ::GetCommandLine(), 0))
		{
			// TODO: change error code to suit your needs
			_tprintf(_T("Fatal Error: MFC initialization failed\n"));
			nRetCode = 1;
		}
		else
		{
			CString csFFMPEGPath(FindEXEFromPath(_T("ffmpeg.exe")));
			CString csFirstFileName;
			CString csLastFileName;
			CString csVideoName;

			if (argc != 4)
			{
				std::wcout << "command Line Format:" << std::endl;
				std::wcout << "\t" << argv[0] << " VideoName PathToFirstFile.jpg PathToLastFile.jpg" << std::endl;
			}
			else
			{
				csVideoName = CString(argv[1]);
				csFirstFileName = CString(argv[2]);
				csLastFileName = CString(argv[3]);

				int DirNumFirst = 0;
				int DirNumLast = 0;
				int FileNumFirst = 0;
				int FileNumLast = 0;
				CString csFinderStringFormat;

				CString DestParentDir;
				CString DestChildSuffix;
				CString DestFilePrefix;
				CString DestFileExt;
				int DestFileNumDigits;
				SplitImagePath(csFirstFileName, DestParentDir, DirNumFirst, DestChildSuffix, DestFilePrefix, DestFileNumDigits, FileNumFirst, DestFileExt);
				csFinderStringFormat.Format(_T("%s%%03d%s\\%s*.%s"), DestParentDir.GetString(), DestChildSuffix.GetString(), DestFilePrefix.GetString(), DestFileExt.GetString());
				SplitImagePath(csLastFileName, DestParentDir, DirNumLast, DestChildSuffix, DestFilePrefix, DestFileNumDigits, FileNumLast, DestFileExt);

				std::vector<CString> SourceImageList;
				int DirNum = DirNumFirst;
				int FileNum = FileNumFirst;
				do 
				{
					CString csFinderString;
					csFinderString.Format(csFinderStringFormat, DirNum);
					CFileFind finder;
					BOOL bWorking = finder.FindFile(csFinderString.GetString());
					while (bWorking)
					{
						bWorking = finder.FindNextFile();
						SplitImagePath(finder.GetFilePath(), DestParentDir, DirNum, DestChildSuffix, DestFilePrefix, DestFileNumDigits, FileNum, DestFileExt);
						if ((FileNum >= FileNumFirst) && (FileNum <= FileNumLast))
							SourceImageList.push_back(finder.GetFilePath());
					}
					finder.Close();
					DirNum++;
				} while (DirNum <= DirNumLast);

				std::wcout << "[" << getTimeISO8601() << "] " << "First File: " << csFirstFileName.GetString() << std::endl;
				std::wcout << "[" << getTimeISO8601() << "] " << "Last File:  " << csLastFileName.GetString() << std::endl;
				std::wcout << "[" << getTimeISO8601() << "] " << "Total Files: " << SourceImageList.size() << std::endl;

				TCHAR szPath[MAX_PATH] = _T("");
				SHGetFolderPath(NULL, CSIDL_MYVIDEO, NULL, 0, szPath);
				PathAddBackslash(szPath);
				CString csImageDirectory(szPath);
				csImageDirectory.Append(csVideoName);
				if (CreateDirectory(csImageDirectory, NULL))
				{
					int OutFileIndex = 0;
					for (auto SourceFile = SourceImageList.begin(); SourceFile != SourceImageList.end(); SourceFile++)
					{
						CString OutFilePath(csImageDirectory);
						OutFilePath.AppendFormat(_T("\\Wim%05d.JPG"), OutFileIndex++);
						std::wcout << "[" << getTimeISO8601() << "] " << "CopyFile " << SourceFile->GetString() << " to " << OutFilePath.GetString() << "\r";
						CopyFile(SourceFile->GetString(), OutFilePath, TRUE);
					}
					std::wcout << "\n";

					CString csImagePathSpec(csImageDirectory); csImagePathSpec.Append(_T("\\Wim%05d.JPG"));
					CString csVideoFullPath(csImageDirectory); csVideoFullPath.Append(_T(".mp4"));
					if (csFFMPEGPath.GetLength() > 0)
					{
						csVideoFullPath = csImageDirectory + _T("-1080p-cropped.mp4");
						std::wcout << "[" << getTimeISO8601() << "] " << csFFMPEGPath.GetString() << " -i " << QuoteFileName(csImagePathSpec).GetString() << " -y " << QuoteFileName(csVideoFullPath).GetString() << std::endl;
						if (-1 == _tspawnlp(_P_WAIT, csFFMPEGPath.GetString(), csFFMPEGPath.GetString(), 
							#ifdef _DEBUG
							_T("-report"),
							#endif
							_T("-i"), QuoteFileName(csImagePathSpec).GetString(),
							_T("-vf"), _T("crop=in_w:3/4*in_h"),
							// _T("-vf"), _T("rotate=PI"), // Us this to rotate the movie if we forgot to put the GoPro in upside down mode.
							_T("-s"), _T("1920x1080"),
							_T("-y"), // Cause it to overwrite exiting output files
							QuoteFileName(csVideoFullPath).GetString(), NULL))
							std::wcout << "[" << getTimeISO8601() << "]  _tspawnlp failed: " /* << _sys_errlist[errno] */ << std::endl;
						csVideoFullPath = csImageDirectory + _T("-1080p-compressed.mp4");
						std::wcout << "[" << getTimeISO8601() << "] " << csFFMPEGPath.GetString() << " -i " << QuoteFileName(csImagePathSpec).GetString() << " -y " << QuoteFileName(csVideoFullPath).GetString() << std::endl;
						if (-1 == _tspawnlp(_P_WAIT, csFFMPEGPath.GetString(), csFFMPEGPath.GetString(), 
							#ifdef _DEBUG
							_T("-report"),
							#endif
							_T("-i"), QuoteFileName(csImagePathSpec).GetString(),
							// _T("-vf"), _T("rotate=PI"), // Us this to rotate the movie if we forgot to put the GoPro in upside down mode.
							_T("-s"), _T("1920x1080"),
							_T("-y"), // Cause it to overwrite exiting output files
							QuoteFileName(csVideoFullPath).GetString(), NULL))
							std::wcout << "[" << getTimeISO8601() << "]  _tspawnlp failed: " /* << _sys_errlist[errno] */ << std::endl;
					}
					do 
					{
						CString OutFilePath(csImageDirectory);
						OutFilePath.AppendFormat(_T("\\Wim%05d.JPG"), --OutFileIndex);
						std::wcout << "[" << getTimeISO8601() << "] " << "DeleteFile " << OutFilePath.GetString() << "\r";
						DeleteFile(OutFilePath);
					} while (OutFileIndex > 0);
					std::wcout << "\n[" << getTimeISO8601() << "] " << "RemoveDirectory " << csImageDirectory.GetString() << std::endl;
					RemoveDirectory(csImageDirectory);
				}
			}
		}
	}
	else
	{
		_tprintf(_T("Fatal Error: GetModuleHandle failed\n"));
		nRetCode = 1;
	}
	return nRetCode;
}

Time Lapse Videos using FFMPEG

I've been creating time lapse videos using FFMPEG from the output of a GoPro camera since the beginning of summer. I had always been interested in time lapse videos, but didn't have an easy method of creating them until recently.

I like the result best when I have the GoPro set to take one image every 5 seconds and then have FFMPEG create a default MP4 file at 25 frames per second. The standard GoPro battery seems to record just about two hours' worth of full resolution images in my Hero 3+ Black, which works out to just about a minute of video.

My first video using this method was of the sunset over Elliott Bay taken through the window of the elevator waiting room where I live, on the 13th floor.

I've written a program that copies the default GoPro image files to a filename sequence starting at image number 0000, so that FFMPEG will recognize the starting image with its default globbing method.

An example command line I use to start FFMPEG is:

ffmpeg -i \\MyServer\Pictures\GoPro\Sunset2\Wim%04d.JPG \\MyServer\Pictures\GoPro\Sunset2\Sunset2.mp4

That command line will actually create a video with a resolution of 4000×3000, which is the resolution at which I'm taking the individual pictures. I could have told FFMPEG to reshape the output, or crop it.
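
For example, here are sketches of the two variations my program above produces, with the server paths shortened and hypothetical output names:

ffmpeg -i Wim%04d.JPG -s 1920x1080 Sunset2-compressed.mp4
ffmpeg -i Wim%04d.JPG -vf crop=in_w:3/4*in_h -s 1920x1080 Sunset2-cropped.mp4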

The second video I created using this method was of a sunrise in essentially the same location.

https://trac.ffmpeg.org/wiki/Create%20a%20video%20slideshow%20from%20images has good examples of some of the options.

The most recent video was created after I purchased a suction cup mount for my GoPro. I went to the 17th floor and read a book during the hour before sunset.

Several things are apparent in this process.

  • I need a black out curtain surrounding the camera to block reflections when the light is directed at the camera. The camera has been positioned flush against the surface of the glass, but the thickness of the glass creates reflections inside the glass itself.
  • Videos with weather are much more interesting than just the movement of the sun itself.
  • I have been turning on the WiFi in the camera to make sure I've positioned the camera correctly. I need to try turning off the WiFi to see how it affects the battery rundown length.
  • I need to change the picture frequency to a longer or shorter time-span to see if the battery life is affected at all.

Logitech C920 Angle of View

I realized today that the Logitech C920 webcam produces images covering a different field of view (FOV) for the same image width, depending on the height. I was expecting the horizontal field to be the same for a given width, but it was not.

Using the command ffmpeg -f video4linux2 -list_formats all -i /dev/video0 to retrieve the available video sizes lists the same set of sizes for h264 and mjpeg: 640×480, 160×90, 160×120, 176×144, 320×180, 320×240, 352×288, 432×240, 640×360, 800×448, 800×600, 864×480, 960×720, 1024×576, 1280×720, 1600×896, and 1920×1080. In Raw/yuyv422 mode two additional sizes are available: 2304×1296 and 2304×1536.

I pointed my webcam at the building out my window, giving myself a rough grid pattern to look at and ran it through all of the h.264 sizes, and manually counted the horizontal and vertical blocks visible. 

I expected 640×480 and 640×360 to have the same horizontal FOV but different vertical FOVs. What actually happened was that they displayed the same vertical FOV but different horizontal FOVs.

I ran through all of the h264 resolutions, and the vertical FOV appeared to shrink slightly when I requested resolutions with heights below 200, but otherwise stayed the same.

Selecting 2304×1536 produced a slightly larger vertical FOV with the same horizontal FOV as 1920×1080, while 2304×1296 seemed to produce the same FOV in both directions as 1920×1080. Both of these resolutions run at lower frame rates and only in raw mode. I was testing them by using ffmpeg to transcode and send to my Windows desktop with the command: ffmpeg -re -f v4l2 -video_size 2304x1536 -framerate 2 -input_format yuyv422 -i /dev/video0 -f mpegts udp://192.168.0.10:8090

The C920 advertises a diagonal FOV of 78°, but I didn't find an official definition of what that covers. I found a nice bit of information at http://therandomlab.blogspot.com/2013/03/logitech-c920-and-c910-fields-of-view.html that describes it explicitly as applying when the camera is running in 16×9 mode.

I will probably get around to writing a program to more accurately produce the results.  Here’s my manual table:

(Blocks is the count of horizontal building blocks I could see; Floors is the count of vertical floors.)

Resolution  Width  Height  Blocks  Floors  Width/Height Ratio  MegaPixels
160×90        160      90       9       8        1.777778          0.01
160×120       160     120       7       8        1.333333          0.01
176×144       176     144       7       9        1.222222          0.02
320×180       320     180       9       8        1.777778          0.05
320×240       320     240       7       9        1.333333          0.07
352×288       352     288       7       9        1.222222          0.1
432×240       432     240      10       9        1.8               0.1
640×360       640     360      10       9        1.777778          0.23
640×480       640     480       7       9        1.333333          0.3
800×448       800     448      10       9        1.785714          0.35
800×600       800     600       7       9        1.333333          0.48
864×480       864     480      10       9        1.8               0.41
960×720       960     720       7       9        1.333333          0.69
1024×576     1024     576      10       9        1.777778          0.58
1280×720     1280     720      10       9        1.777778          0.92
1600×896     1600     896      10       9        1.785714          1.43
1920×1080    1920    1080      10       9        1.777778          2.07
2304×1296    2304    1296                        1.777778          2.98
2304×1536    2304    1536                        1.5               3.53
16×9 (reference)                                 1.777778
4×3 (reference)                                  1.333333


WimTiVoServer changes to use FFProbe

WimTiVoServer was originally written using the libraries that FFMPEG is based on to retrieve details about video files. I had downloaded the packages from http://ffmpeg.zeranoe.com/builds/ and used the DLLs for the library calls. In other programs I'm building related to FFMPEG, I update FFMPEG on a regular basis. Maintaining the correct link path any time I came back for a minor adjustment to WimTiVoServer became more of an effort than I wanted to deal with, so I investigated what else was available.

My solution has been to use FFProbe, which is distributed with FFmpeg. I spawn a child process, capture its standard output, and read the results into an IStream memory stream object, which I then parse with an IXmlReader object for the items I'm looking for.

The command line I’m using for FFProbe is ffprobe.exe -show_streams -show_format -print_format xml INPUT. An example of the output it produces is:

<?xml version="1.0" encoding="UTF-8"?>
<ffprobe>
    <streams>
        <stream index="0" codec_name="ac3" codec_long_name="ATSC A/52A (AC-3)" codec_type="audio" codec_time_base="1/48000" codec_tag_string="[0][0][0][0]" codec_tag="0x0000" sample_fmt="fltp" sample_rate="48000" channels="6" bits_per_sample="0" dmix_mode="-1" ltrt_cmixlev="-1.000000" ltrt_surmixlev="-1.000000" loro_cmixlev="-1.000000" loro_surmixlev="-1.000000" id="0x27" r_frame_rate="0/0" avg_frame_rate="0/0" time_base="1/10000000" start_pts="22054844" start_time="2.205484" duration_ts="19133694951" duration="1913.369495" bit_rate="384000">
            <disposition default="0" dub="0" original="0" comment="0" lyrics="0" karaoke="0" forced="0" hearing_impaired="0" visual_impaired="0" clean_effects="0" attached_pic="0"/>
        </stream>
        <stream index="1" codec_name="ac3" codec_long_name="ATSC A/52A (AC-3)" codec_type="audio" codec_time_base="1/48000" codec_tag_string="[0][0][0][0]" codec_tag="0x0000" sample_fmt="fltp" sample_rate="48000" channels="2" bits_per_sample="0" dmix_mode="-1" ltrt_cmixlev="-1.000000" ltrt_surmixlev="-1.000000" loro_cmixlev="-1.000000" loro_surmixlev="-1.000000" id="0x28" r_frame_rate="0/0" avg_frame_rate="0/0" time_base="1/10000000" start_pts="23039510" start_time="2.303951" bit_rate="192000">
            <disposition default="0" dub="0" original="0" comment="0" lyrics="0" karaoke="0" forced="0" hearing_impaired="0" visual_impaired="0" clean_effects="0" attached_pic="0"/>
        </stream>
        <stream index="2" codec_name="mpeg2video" codec_long_name="MPEG-2 video" profile="Main" codec_type="video" codec_time_base="1001/120000" codec_tag_string="[0][0][0][0]" codec_tag="0x0000" width="1280" height="720" has_b_frames="1" sample_aspect_ratio="1:1" display_aspect_ratio="16:9" pix_fmt="yuv420p" level="4" timecode="00:00:00:00" id="0x29" r_frame_rate="60000/1001" avg_frame_rate="60000/1001" time_base="1/10000000" start_pts="31875510" start_time="3.187551">
            <disposition default="0" dub="0" original="0" comment="0" lyrics="0" karaoke="0" forced="0" hearing_impaired="0" visual_impaired="0" clean_effects="0" attached_pic="0"/>
        </stream>
        <stream index="3" codec_type="subtitle" codec_time_base="1/10000000" codec_tag_string="[0][0][0][0]" codec_tag="0x0000" id="0x2a" r_frame_rate="0/0" avg_frame_rate="0/0" time_base="1/10000000" start_pts="32209177" start_time="3.220918">
            <disposition default="0" dub="0" original="0" comment="0" lyrics="0" karaoke="0" forced="0" hearing_impaired="0" visual_impaired="0" clean_effects="0" attached_pic="0"/>
        </stream>
        <stream index="4" codec_name="mjpeg" codec_long_name="MJPEG (Motion JPEG)" codec_type="video" codec_time_base="1/90000" codec_tag_string="[0][0][0][0]" codec_tag="0x0000" width="200" height="113" has_b_frames="0" sample_aspect_ratio="1:1" display_aspect_ratio="200:113" pix_fmt="yuvj420p" level="-99" id="0xffffffff" r_frame_rate="90000/1" avg_frame_rate="0/0" time_base="1/90000" start_pts="198494" start_time="2.205489" duration_ts="172203255" duration="1913.369500">
            <disposition default="0" dub="0" original="0" comment="0" lyrics="0" karaoke="0" forced="0" hearing_impaired="0" visual_impaired="0" clean_effects="0" attached_pic="1"/>
            <tag key="title" value="TV Thumbnail"/>
        </stream>
    </streams>

    <format filename="d:\Recorded TV\Archer_FXPHD_2013_02_28_22_00_00.wtv" nb_streams="5" nb_programs="0" format_name="wtv" format_long_name="Windows Television (WTV)" start_time="2.205484" duration="1913.369495" size="1956642816" bit_rate="8180930" probe_score="100">
        <tag key="WM/MediaClassPrimaryID" value="db9830bd-3ab3-4fab-8a371a995f7ff74"/>
        <tag key="WM/MediaClassSecondaryID" value="ba7f258a-62f7-47a9-b21f4651c42a000"/>
        <tag key="Title" value="Archer"/>
        <tag key="WM/SubTitle" value="Live and Let Dine"/>
        <tag key="WM/SubTitleDescription" value="Archer, Lana and Cyril go undercover in celebrity chef Lance Casteau&apos;s hellish kitchen."/>
        <tag key="genre" value="Comedy;General;Series"/>
        <tag key="WM/OriginalReleaseTime" value="0"/>
        <tag key="language" value="en-us"/>
        <tag key="WM/MediaCredits" value="H. Jon Benjamin/Jessica Walter/Aisha Tyler/George Coe/Chris Parnell/Judy Greer;;;Anthony Bourdain"/>
        <tag key="service_provider" value="FXPHD"/>
        <tag key="service_name" value="FX HD (Pacific)"/>
        <tag key="WM/MediaNetworkAffiliation" value="Satellite"/>
        <tag key="WM/MediaOriginalChannel" value="728"/>
        <tag key="WM/MediaOriginalChannelSubNumber" value="0"/>
        <tag key="WM/MediaOriginalBroadcastDateTime" value="2013-02-28T08:00:00Z"/>
        <tag key="WM/MediaOriginalRunTime" value="19144791872"/>
        <tag key="WM/MediaIsStereo" value="false"/>
        <tag key="WM/MediaIsRepeat" value="false"/>
        <tag key="WM/MediaIsLive" value="false"/>
        <tag key="WM/MediaIsTape" value="false"/>
        <tag key="WM/MediaIsDelay" value="false"/>
        <tag key="WM/MediaIsSubtitled" value="false"/>
        <tag key="WM/MediaIsMovie" value="false"/>
        <tag key="WM/MediaIsPremiere" value="false"/>
        <tag key="WM/MediaIsFinale" value="false"/>
        <tag key="WM/MediaIsSAP" value="false"/>
        <tag key="WM/MediaIsSport" value="false"/>
        <tag key="WM/Provider" value="MediaCenterDefault"/>
        <tag key="WM/VideoClosedCaptioning" value="false"/>
        <tag key="WM/WMRVEncodeTime" value="2013-03-01 06:00:05"/>
        <tag key="WM/WMRVSeriesUID" value="!MCSeries!225842780"/>
        <tag key="WM/WMRVServiceID" value="!MCService!188913961"/>
        <tag key="WM/WMRVProgramID" value="!MCProgram!285145704"/>
        <tag key="WM/WMRVRequestID" value="0"/>
        <tag key="WM/WMRVScheduleItemID" value="0"/>
        <tag key="WM/WMRVQuality" value="0"/>
        <tag key="WM/WMRVOriginalSoftPrePadding" value="300"/>
        <tag key="WM/WMRVOriginalSoftPostPadding" value="120"/>
        <tag key="WM/WMRVHardPrePadding" value="-300"/>
        <tag key="WM/WMRVHardPostPadding" value="0"/>
        <tag key="WM/WMRVATSCContent" value="true"/>
        <tag key="WM/WMRVDTVContent" value="true"/>
        <tag key="WM/WMRVHDContent" value="true"/>
        <tag key="Duration" value="19151788198"/>
        <tag key="WM/WMRVEndTime" value="2013-03-01 06:32:00"/>
        <tag key="WM/WMRVBitrate" value="8.173201"/>
        <tag key="WM/WMRVKeepUntil" value="-1"/>
        <tag key="WM/WMRVActualSoftPrePadding" value="294"/>
        <tag key="WM/WMRVActualSoftPostPadding" value="120"/>
        <tag key="WM/WMRVContentProtected" value="true"/>
        <tag key="WM/WMRVContentProtectedPercent" value="99"/>
        <tag key="WM/WMRVExpirationSpan" value="9223372036854775807"/>
        <tag key="WM/WMRVInBandRatingSystem" value="255"/>
        <tag key="WM/WMRVInBandRatingLevel" value="255"/>
        <tag key="WM/WMRVInBandRatingAttributes" value="0"/>
        <tag key="WM/WMRVWatched" value="false"/>
        <tag key="WM/MediaThumbWidth" value="352"/>
        <tag key="WM/MediaThumbHeight" value="198"/>
        <tag key="WM/MediaThumbStride" value="1056"/>
        <tag key="WM/MediaThumbRet" value="0"/>
        <tag key="WM/MediaThumbRatingSystem" value="9"/>
        <tag key="WM/MediaThumbRatingLevel" value="17"/>
        <tag key="WM/MediaThumbRatingAttributes" value="0"/>
        <tag key="WM/MediaThumbAspectRatioX" value="16"/>
        <tag key="WM/MediaThumbAspectRatioY" value="9"/>
        <tag key="WM/MediaThumbTimeStamp" value="4647772712253334203"/>
    </format>
</ffprobe>

I parse the XML, keep track of only the first video stream and first audio stream details, and then look for some specific items in the metadata tags. I store the information and return it to the TiVo when it requests a list of what programs are available to transfer, and again when I transfer the file itself.

An interesting side effect of moving from the libraries to XML is that the XML created by FFProbe properly carries extended characters outside the ASCII character set. Because I'm using an XML parser that works with Unicode by default, it takes care of those characters properly. When I was using the libraries, I was looping over AVDictionaryEntry values and doing comparisons with char values.

Here is the code that I’m currently using. It’s not the prettiest code but it gets the job done and runs quickly enough.

void cTiVoFile::PopulateFromFFProbe(void)
{
	static const CString csFFProbePath(FindEXEFromPath(_T("ffprobe.exe")));
	if (!csFFProbePath.IsEmpty())
	{
		// Set the bInheritHandle flag so pipe handles are inherited. 
		SECURITY_ATTRIBUTES saAttr;  
		saAttr.nLength = sizeof(SECURITY_ATTRIBUTES); 
		saAttr.bInheritHandle = TRUE; 
		saAttr.lpSecurityDescriptor = NULL; 

		// Create a pipe for the child process's STDOUT. 
		HANDLE g_hChildStd_OUT_Rd = NULL;
		HANDLE g_hChildStd_OUT_Wr = NULL;
		if ( ! CreatePipe(&g_hChildStd_OUT_Rd, &g_hChildStd_OUT_Wr, &saAttr, 0x800000) ) 
			std::cout << "[" << getTimeISO8601() << "] "  << __FUNCTION__ << "\t ERROR: StdoutRd CreatePipe" << endl;
		else
		{
			// Ensure the read handle to the pipe for STDOUT is not inherited.
			if ( ! SetHandleInformation(g_hChildStd_OUT_Rd, HANDLE_FLAG_INHERIT, 0) )
				std::cout << "[" << getTimeISO8601() << "] "  << __FUNCTION__ << "\t ERROR: Stdout SetHandleInformation" << endl;
			else
			{
				// Create a child process that uses the previously created pipes for STDIN and STDOUT.
				// Set up members of the PROCESS_INFORMATION structure.  
				PROCESS_INFORMATION piProcInfo; 
				ZeroMemory( &piProcInfo, sizeof(PROCESS_INFORMATION) );
 
				// Set up members of the STARTUPINFO structure. 
				// This structure specifies the STDIN and STDOUT handles for redirection.
				STARTUPINFO siStartInfo;
				ZeroMemory( &siStartInfo, sizeof(STARTUPINFO) );
				siStartInfo.cb = sizeof(STARTUPINFO); 
				siStartInfo.hStdError = GetStdHandle(STD_ERROR_HANDLE);
				siStartInfo.hStdInput = GetStdHandle(STD_INPUT_HANDLE);
				siStartInfo.hStdOutput = g_hChildStd_OUT_Wr;
				siStartInfo.dwFlags |= STARTF_USESTDHANDLES;
 
				CString csCommandLine(QuoteFileName(csFFProbePath));
				csCommandLine.Append(_T(" -show_streams -show_format -print_format xml "));
				csCommandLine.Append(QuoteFileName(m_csPathName));

				TRACE(_T("CreateProcess: %s\n"), csCommandLine.GetString());
				// Create the child process.
				if (CreateProcess(NULL, 
					csCommandLine.GetBuffer(),	// command line (CreateProcess may modify this buffer, so it must be writable)
					NULL,          // process security attributes 
					NULL,          // primary thread security attributes 
					TRUE,          // handles are inherited 
					0,             // creation flags 
					NULL,          // use parent's environment 
					NULL,          // use parent's current directory 
					&siStartInfo,  // STARTUPINFO pointer 
					&piProcInfo))  // receives PROCESS_INFORMATION 
				{
					CloseHandle(g_hChildStd_OUT_Wr);	// If I don't do this, then the parent will never exit!
					CComPtr<IStream> spMemoryStreamOne(::SHCreateMemStream(NULL, 0));
					if (spMemoryStreamOne != NULL)
					{
						const int RAWDataBuffSize = 0x1000;	// 0x1000 is 4k
						char * RAWDataBuff = new char[RAWDataBuffSize];
						for (;;)
						{
							DWORD dwRead = 0;
							BOOL bSuccess = ReadFile(g_hChildStd_OUT_Rd, RAWDataBuff, RAWDataBuffSize, &dwRead, NULL);
							if( (!bSuccess) || (dwRead == 0)) break;
							ULONG cbWritten;
							spMemoryStreamOne->Write(RAWDataBuff, dwRead, &cbWritten);
						} 
						delete[] RAWDataBuff;
						// reposition back to beginning of stream
						LARGE_INTEGER position;
						position.QuadPart = 0;
						spMemoryStreamOne->Seek(position, STREAM_SEEK_SET, NULL);
						HRESULT hr = S_OK;
						CComPtr<IXmlReader> pReader; 
						if (SUCCEEDED(hr = CreateXmlReader(__uuidof(IXmlReader), (void**) &pReader, NULL))) 
						{
							if (SUCCEEDED(hr = pReader->SetProperty(XmlReaderProperty_DtdProcessing, DtdProcessing_Prohibit))) 
							{
								if (SUCCEEDED(hr = pReader->SetInput(spMemoryStreamOne))) 
								{
									int indentlevel = 0;
									XmlNodeType nodeType; 
									const WCHAR* pwszLocalName;
									const WCHAR* pwszValue;
									CString csLocalName;
									bool bIsFormat = false;
									bool bVideoStreamInfoNeeded = true;
									bool bAudioStreamInfoNeeded = true;

									//read until there are no more nodes 
									while (S_OK == (hr = pReader->Read(&nodeType))) 
									{
										if (nodeType == XmlNodeType_Element)
										{
											if (SUCCEEDED(hr = pReader->GetLocalName(&pwszLocalName, NULL)))
											{
												csLocalName = CString(pwszLocalName);
												if ((bVideoStreamInfoNeeded || bAudioStreamInfoNeeded) && !csLocalName.Compare(_T("stream")))
												{
													CString cs_codec_name;
													CString cs_codec_type;
													CString cs_codec_time_base;
													CString cs_width;
													CString cs_height;
													CString cs_duration;
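													// Walk this <stream> element's attributes and capture the values of interest.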
													while (S_OK == pReader->MoveToNextAttribute())
													{
														if (SUCCEEDED(hr = pReader->GetLocalName(&pwszLocalName, NULL)) &&
															SUCCEEDED(hr = pReader->GetValue(&pwszValue, NULL)))
														{
															csLocalName = CString(pwszLocalName);
															if (!csLocalName.Compare(_T("codec_name")))
																cs_codec_name = CString(pwszValue);
															else if (!csLocalName.Compare(_T("codec_type")))
																cs_codec_type = CString(pwszValue);
															else if (!csLocalName.Compare(_T("codec_time_base")))
																cs_codec_time_base = CString(pwszValue);
															else if (!csLocalName.Compare(_T("width")))
																cs_width = CString(pwszValue);
															else if (!csLocalName.Compare(_T("height")))
																cs_height = CString(pwszValue);
															else if (!csLocalName.Compare(_T("duration")))
																cs_duration = CString(pwszValue);
														}
													}
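													// The first video stream determines codec compatibility, HD status, and duration; the first audio stream determines audio compatibility.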
													if (!cs_codec_type.Compare(_T("video")))
													{
														bVideoStreamInfoNeeded = false;
														if (!cs_codec_name.Compare(_T("mpeg2video")))
															m_VideoCompatible = true;
														m_SourceFormat = cs_codec_type + CString(_T("/")) + cs_codec_name;
														int width = 0;
														std::wstringstream ss;
														ss << cs_width.GetString();
														ss >> width;
														if (width >= 1280)
															m_VideoHighDefinition = true;
														double duration = 0;
														ss = std::wstringstream();
														ss << cs_duration.GetString();
														ss >> duration;
														m_Duration = duration * 1000 + 5;
													}
													else if (!cs_codec_type.Compare(_T("audio")))
													{
														bAudioStreamInfoNeeded = false;
														if (!cs_codec_name.Compare(_T("ac3")))
															m_AudioCompatible = true;
													}	
												}
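												// FFProbe prints the <format> element after the streams, so a format-level duration overrides any stream duration captured above.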
												else if (!csLocalName.Compare(_T("format")))
												{
													bIsFormat = true;
													const CString ccs_duration(_T("duration"));
													while (S_OK == pReader->MoveToNextAttribute())
													{
														if (SUCCEEDED(hr = pReader->GetLocalName(&pwszLocalName, NULL)) &&
															SUCCEEDED(hr = pReader->GetValue(&pwszValue, NULL)))
														{
															if (!ccs_duration.Compare(pwszLocalName))
															{
																double duration = 0;
																std::wstringstream ss;
																ss << pwszValue;
																ss >> duration;
																m_Duration = duration * 1000 + 5;
															}
														}
													}
												}
												// Here's where I need to dig deeper.
												else if (bIsFormat && (!csLocalName.Compare(_T("tag"))))
												{
													CString csAttributeKey;
													CString csAttributeValue;
													while (S_OK == pReader->MoveToNextAttribute())
													{
														if (SUCCEEDED(hr = pReader->GetLocalName(&pwszLocalName, NULL)) &&
															SUCCEEDED(hr = pReader->GetValue(&pwszValue, NULL)))
														{
															if (!CString(_T("key")).Compare(pwszLocalName))
																csAttributeKey = CString(pwszValue);
															else if (!CString(_T("value")).Compare(pwszLocalName))
																csAttributeValue = CString(pwszValue);
														}
													}
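													// Map the interesting metadata tags onto the corresponding program fields.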
													if (!csAttributeKey.CompareNoCase(_T("title")))
														m_Title = csAttributeValue;
													else if (!csAttributeKey.CompareNoCase(_T("episode_id")))
														m_EpisodeTitle = csAttributeValue;
													else if (!csAttributeKey.CompareNoCase(_T("description")))
														m_Description = csAttributeValue;
													else if (!csAttributeKey.CompareNoCase(_T("WM/SubTitle")))
														m_EpisodeTitle = csAttributeValue;
													else if (!csAttributeKey.CompareNoCase(_T("WM/SubTitleDescription")))
														m_Description = csAttributeValue;
													else if (!csAttributeKey.CompareNoCase(_T("genre")))
														m_vProgramGenre = csAttributeValue;
													else if (!csAttributeKey.CompareNoCase(_T("service_provider")))
														m_SourceStation = csAttributeValue;
													else if (!csAttributeKey.CompareNoCase(_T("WM/MediaOriginalChannel")))
														m_SourceChannel = csAttributeValue;
													else if (!csAttributeKey.CompareNoCase(_T("WM/MediaCredits")))
													{
														m_vActor = csAttributeValue;
														while (0 < m_vActor.Replace(_T(";;"),_T(";")));
														while (0 < m_vActor.Replace(_T("//"),_T("/")));
													}
													else if (!csAttributeKey.CompareNoCase(_T("WM/WMRVEncodeTime")))
													{
														CTime OriginalBroadcastDate = ISO8601totime(std::string(CStringA(csAttributeValue).GetString()));
														if (OriginalBroadcastDate > 0)
															m_CaptureDate = OriginalBroadcastDate;
													}
													else if (!csAttributeKey.CompareNoCase(_T("WM/MediaOriginalBroadcastDateTime")))
													{
														CTime OriginalBroadcastDate = ISO8601totime(std::string(CStringA(csAttributeValue).GetString()));
														if (OriginalBroadcastDate > 0)
															m_CaptureDate = OriginalBroadcastDate;
													}
													m_Description.Trim();
												}
											}
										}
										else if (nodeType == XmlNodeType_EndElement)
										{
											if (SUCCEEDED(hr = pReader->GetLocalName(&pwszLocalName, NULL)))
												if (!CString(pwszLocalName).Compare(_T("format")))
													bIsFormat = false;
										}
									}
								}
							}
						}
					}
					// Close handles to the child process and its primary thread.
					// Some applications might keep these handles to monitor the status
					// of the child process, for example. 
					CloseHandle(piProcInfo.hProcess);
					CloseHandle(piProcInfo.hThread);
				}
				else
					CloseHandle(g_hChildStd_OUT_Wr);	// CreateProcess failed, so close the write handle here to avoid leaking it
			}
			CloseHandle(g_hChildStd_OUT_Rd);
		}
	}
}