FFMPEG and drawtext

Several years ago I wrote a program that consolidates time-lapse pictures into a directory and calls FFMPEG to create a video.

I had been wanting the time-code from when each picture was taken printed on the screen while the video was playing but had not figured out how to get it done until this weekend.

Video TimeCode

Frame from video showing the DateTimeOriginal timecode embedded.

I’d gone down multiple paths in an attempt to get this result before finally getting the drawtext feature to work. My program manually pulled the metadata from the images before feeding them to ffmpeg. I’d tried creating both text files and image files for overlaying. None of those got the result that I was looking for.

Now that I have everything working, it seems simple, but the underlying problem was the amount of string escaping required to get the command to work.

Here’s an example command I was issuing to ffmpeg that got the result I was looking for.

ffmpeg.exe -hide_banner -r 30 -i Wim%05d.JPG -vf crop=in_w:3/4*in_h,drawtext=fontfile=C\\:/WINDOWS/Fonts/OCRAEXT.ttf:fontcolor=white:fontsize=160:y=main_h-text_h-50:x=main_w-text_w-50:text=WimsWorld,drawtext=fontfile=C\\:/WINDOWS/Fonts/OCRAEXT.ttf:fontcolor=white:fontsize=160:y=main_h-text_h-50:x=50:text=%{metadata\\:DateTimeOriginal} -s 3840x2160 -pix_fmt yuv420p -n Test-2160p30-cropped.mp4

If you look at the -vf option parameter, I’m cropping my input pictures to 3/4 their original height, then using the drawtext feature twice. First I write the static text to the bottom right of the frame, then I extract metadata from the source image and write it to the bottom left of the frame.

Because I’m calling this from a program, I had extra escaping of the \ character in my code. Getting all of the escaping right took a lot of trial and error. I’m using OCRAEXT as my font, but I could be using any fixed-width font. Because the time changes every frame, it’s important that the font not be proportional, so the text stays easy to read.
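To make the extra layer of escaping concrete, here is a minimal sketch of how the -vf argument from the command above could be assembled in C++. It is not the code from my program: the helper names EscapeColons, BuildDrawText and BuildVideoFilter are hypothetical, it uses std::string instead of CString so that it stands alone, and it assumes the colon is the only character that needs escaping, since that is all the command above shows.

```cpp
#include <string>

// Hypothetical helper (not from the original program): prefix every ':' in a
// value with two backslashes, matching the escaping in the command above.
std::string EscapeColons(const std::string& value)
{
    std::string escaped;
    for (char c : value)
    {
        if (c == ':')
            escaped += "\\\\"; // four backslashes in C++ source emit two characters
        escaped += c;
    }
    return escaped;
}

// Build one drawtext filter; only the x position and the text vary.
std::string BuildDrawText(const std::string& fontFile, const std::string& x, const std::string& text)
{
    return "drawtext=fontfile=" + EscapeColons(fontFile)
        + ":fontcolor=white:fontsize=160:y=main_h-text_h-50:x=" + x
        + ":text=" + EscapeColons(text);
}

// Assemble the complete -vf argument: crop, static label, per-frame timestamp.
std::string BuildVideoFilter()
{
    const std::string font("C:/WINDOWS/Fonts/OCRAEXT.ttf");
    return "crop=in_w:3/4*in_h,"
        + BuildDrawText(font, "main_w-text_w-50", "WimsWorld") + ","
        + BuildDrawText(font, "50", "%{metadata:DateTimeOriginal}");
}
```

BuildVideoFilter() returns the same -vf value that appears in the command above. The point of the sketch is the "\\\\" literal: two backslashes have to reach ffmpeg, so the C++ source has to contain four.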


GoPro Battery BacPac

I purchased a GoPro Battery BacPac recently because I realized that I’d rather have extended shooting than have two separate batteries that needed to be charged. I had already purchased a second standard battery for the GoPro, but I didn’t have a way to charge it when it was not in the camera, so found it less useful than I was hoping.

The fact that I am most often using my GoPro in harsh conditions means that I’d rather not open the case any more frequently than I need to. When I’m skiing, if I go into the lodge to change the battery, the very first thing I notice is that the cold GoPro case is suddenly steamed over by the indoor humidity. 

When I’ve been doing my stop motion photography at a picture every two seconds, generally the standard battery lasts just under two hours. My first test with the BacPac attached started at 8:47am and the last picture was at 12:40, so it looks like it gets me to just under 4 hours total.

It created 6999 files in that time frame. I’ve not figured out if the GoPro uses any less battery when taking sequential still photos vs when it’s taking movies. 

Saturday’s race on Different Drummer wasn’t fully captured in the time allotted because we went out early and did some practice work flying the spinnaker.

Hopefully I’ll get around to writing more about what comes with the BacPac in the next couple of days. I was mostly interested in sharing the extension of the recording time.

Time-lapse videos from GoPro

In November 2013 I purchased a GoPro HERO3+ Black Edition to play with video recording, but almost immediately became enthralled with taking sequences of photos over long time periods.

The GoPro can be configured to take a picture every 0.5, 1, 2, 5, 10, 30, or 60 seconds. I’ve found that I like taking pictures every 2 seconds and then converting them to video at 30fps (frames per second), which gives me an easy time scale to convert: 1 second of video comes from 1 minute of photos, and 1 minute of video comes from 1 hour of photos.

GoPro has a freely available software package to edit videos, as well as creating videos from sequences of images. Because of my past familiarity with FFMPEG I wanted a more scriptable solution for creating videos from thousands of photos.

https://trac.ffmpeg.org/wiki/Create%20a%20video%20slideshow%20from%20images has nice instructions for creating videos from sequences of images using FFMPEG. What it glosses over is that the first image in the sequence needs to be numbered zero or one. Another complication in the process is that the GoPro uses the standard camera file format where no more than 1000 images will be stored in a single directory. This means that with the 1800 images created in a single hour, at least two directories will hold the source images. An interesting issue I ran across is that sometimes the GoPro will skip a number in its image sequence, especially when it has just moved to the next directory in sequence. This is why I had to write my program using directory listings as opposed to simply looking for known files.

The standard GoPro battery will record just about two hours worth of photos. If the GoPro is connected to an external power supply, you can be limited only by the amount of storage space.

Here’s yesterday morning’s weather changing in Seattle.

Here’s a comparison of cropping vs compressing the video. I took this video on a flight from Seattle to Pullman last weekend. You can see much more of the landscape in the compressed version, and see that the top of the propeller leaves the frame in the cropped version.

Compressed:


Cropped:

I’ve written a program that takes three parameters, copies all of the images to a temporary location with an acceptable filename sequence, runs FFMPEG to create a video, then deletes the temporary images. The GoPro is configured to take full resolution still frames, 4000×3000, and I convert those to a 1080p video format using FFMPEG. Because the aspect ratio is different, and the GoPro uses a fish eye lens to begin with, both vertical and horizontal distortion show up. I run FFMPEG twice, once creating a compressed video and a second time creating a cropped video. This allows me to choose which level of distortion I prefer after the fact.

The three parameters are the video name, the first image in the sequence, and the last image in the sequence. I am currently doing very little error checking. I’m presenting this code here, just to document what I’ve done so far. If you find this useful, please let me know.

Here are some helper functions I regularly use.

using namespace std;

/////////////////////////////////////////////////////////////////////////////
CString FindEXEFromPath(const CString & csEXE)
{
	CString csFullPath;
	CFileFind finder;
	if (finder.FindFile(csEXE))
	{
		finder.FindNextFile();
		csFullPath = finder.GetFilePath();
		finder.Close();
	}
	else
	{
		TCHAR filename[MAX_PATH];
		unsigned long buffersize = sizeof(filename) / sizeof(TCHAR);
		// Get the file name that we are running from.
		GetModuleFileName(AfxGetResourceHandle(), filename, buffersize );
		PathRemoveFileSpec(filename);
		PathAppend(filename, csEXE);
		if (finder.FindFile(filename))
		{
			finder.FindNextFile();
			csFullPath = finder.GetFilePath();
			finder.Close();
		}
		else
		{
			CString csPATH;
			csPATH.GetEnvironmentVariable(_T("PATH"));
			int iStart = 0;
			CString csToken(csPATH.Tokenize(_T(";"), iStart));
			while (csToken != _T(""))
			{
				if (csToken.Right(1) != _T("\\"))
					csToken.AppendChar(_T('\\'));
				csToken.Append(csEXE);
				if (finder.FindFile(csToken))
				{
					finder.FindNextFile();
					csFullPath = finder.GetFilePath();
					finder.Close();
					break;
				}
				csToken = csPATH.Tokenize(_T(";"), iStart);
			}
		}
	}
	return(csFullPath);
}
/////////////////////////////////////////////////////////////////////////////
static const CString QuoteFileName(const CString & Original)
{
	CString csQuotedString(Original);
	if (csQuotedString.Find(_T(" ")) >= 0)
	{
		csQuotedString.Insert(0,_T('"'));
		csQuotedString.AppendChar(_T('"'));
	}
	return(csQuotedString);
}
/////////////////////////////////////////////////////////////////////////////
std::string timeToISO8601(const time_t & TheTime)
{
	std::ostringstream ISOTime;
	struct tm UTC;// = gmtime(&timer);
	if (0 == gmtime_s(&UTC, &TheTime))
	{
		ISOTime.fill('0');
		ISOTime << UTC.tm_year+1900 << "-";
		ISOTime.width(2);
		ISOTime << UTC.tm_mon+1 << "-";
		ISOTime.width(2);
		ISOTime << UTC.tm_mday << "T";
		ISOTime.width(2);
		ISOTime << UTC.tm_hour << ":";
		ISOTime.width(2);
		ISOTime << UTC.tm_min << ":";
		ISOTime.width(2);
		ISOTime << UTC.tm_sec;
	}
	return(ISOTime.str());
}
std::wstring getTimeISO8601(void)
{
	time_t timer;
	time(&timer);
	std::string isostring(timeToISO8601(timer));
	std::wstring rval;
	rval.assign(isostring.begin(), isostring.end());
	
	return(rval);
}
/////////////////////////////////////////////////////////////////////////////

Here’s a routine I found useful to parse the standard camera file system naming format.

/////////////////////////////////////////////////////////////////////////////
bool SplitImagePath(
	CString csSrcPath,
	CString & DestParentDir,
	int & DestChildNum,
	CString & DestChildSuffix,
	CString & DestFilePrefix,
	int & DestFileNumDigits,
	int & DestFileNum,
	CString & DestFileExt
	)
{
	bool rval = true;
	DestFileExt.Empty();
	while (csSrcPath[csSrcPath.GetLength()-1] != _T('.'))
	{
		DestFileExt.Insert(0, csSrcPath[csSrcPath.GetLength()-1]);
		csSrcPath.Truncate(csSrcPath.GetLength()-1);
	}
	csSrcPath.Truncate(csSrcPath.GetLength()-1); // get rid of dot

	CString csDestFileNum;
	DestFileNumDigits = 0;
	while (iswdigit(csSrcPath[csSrcPath.GetLength()-1]))
	{
		csDestFileNum.Insert(0, csSrcPath[csSrcPath.GetLength()-1]);
		DestFileNumDigits++;
		csSrcPath.Truncate(csSrcPath.GetLength()-1);
	}
	DestFileNum = _wtoi(csDestFileNum.GetString());

	DestFilePrefix.Empty();
	while (iswalpha(csSrcPath[csSrcPath.GetLength()-1]))
	{
		DestFilePrefix.Insert(0, csSrcPath[csSrcPath.GetLength()-1]);
		csSrcPath.Truncate(csSrcPath.GetLength()-1);
	}
	csSrcPath.Truncate(csSrcPath.GetLength()-1); // get rid of backslash

	DestChildSuffix.Empty();
	while (iswalpha(csSrcPath[csSrcPath.GetLength()-1]))
	{
		DestChildSuffix.Insert(0, csSrcPath[csSrcPath.GetLength()-1]);
		csSrcPath.Truncate(csSrcPath.GetLength()-1);
	}

	CString csDestChildNum;
	while (iswdigit(csSrcPath[csSrcPath.GetLength()-1]))
	{
		csDestChildNum.Insert(0, csSrcPath[csSrcPath.GetLength()-1]);
		csSrcPath.Truncate(csSrcPath.GetLength()-1);
	}
	DestChildNum = _wtoi(csDestChildNum.GetString());

	DestParentDir = csSrcPath;
	return(rval);
}
/////////////////////////////////////////////////////////////////////////////
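
To show what the routine above pulls out of a path, here is a portable sketch of the same split using std::regex instead of the character-by-character loop. This is a different technique from the one in my code, the names ImagePathParts and SplitImagePathSketch are hypothetical, and the example path D:\DCIM\100GOPRO\G0010001.JPG is only an assumed illustration of the camera file naming. Unlike the routine above, it reports failure when the path does not fit the pattern.

```cpp
#include <regex>
#include <string>

// Pieces of a camera file system path, mirroring SplitImagePath's outputs.
struct ImagePathParts
{
    std::string parentDir;    // everything before the numbered directory
    int childNum = 0;         // directory number, e.g. 100
    std::string childSuffix;  // directory letters, e.g. GOPRO
    std::string filePrefix;   // file letters, e.g. G
    int fileNumDigits = 0;    // how many digits the file number was written with
    int fileNum = 0;          // file number as an integer
    std::string fileExt;      // extension without the dot
};

bool SplitImagePathSketch(const std::string& path, ImagePathParts& parts)
{
    // parent, digits, letters, backslash, letters, digits, dot, extension
    static const std::regex pattern(
        R"re((.*?)([0-9]+)([A-Za-z]+)\\([A-Za-z]+)([0-9]+)\.([^.]+))re");
    std::smatch m;
    if (!std::regex_match(path, m, pattern))
        return false;
    parts.parentDir = m[1].str();
    parts.childNum = std::stoi(m[2].str());
    parts.childSuffix = m[3].str();
    parts.filePrefix = m[4].str();
    parts.fileNumDigits = static_cast<int>(m[5].length());
    parts.fileNum = std::stoi(m[5].str());
    parts.fileExt = m[6].str();
    return true;
}
```

For the assumed example path this yields the parent directory D:\DCIM\, directory number 100 with suffix GOPRO, file prefix G, a 7 digit file number of 10001, and the extension JPG.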

And here’s the main program.

/////////////////////////////////////////////////////////////////////////////
int _tmain(int argc, TCHAR* argv[], TCHAR* envp[])
{
	int nRetCode = 0;

	HMODULE hModule = ::GetModuleHandle(NULL);

	if (hModule != NULL)
	{
		// initialize MFC and print an error on failure
		if (!AfxWinInit(hModule, NULL, ::GetCommandLine(), 0))
		{
			// TODO: change error code to suit your needs
			_tprintf(_T("Fatal Error: MFC initialization failed\n"));
			nRetCode = 1;
		}
		else
		{
			CString csFFMPEGPath(FindEXEFromPath(_T("ffmpeg.exe")));
			CString csFirstFileName;
			CString csLastFileName;
			CString csVideoName;

			if (argc != 4)
			{
				std::wcout << "command Line Format:" << std::endl;
				std::wcout << "\t" << argv[0] << " VideoName PathToFirstFile.jpg PathToLastFile.jpg" << std::endl;
			}
			else
			{
				csVideoName = CString(argv[1]);
				csFirstFileName = CString(argv[2]);
				csLastFileName = CString(argv[3]);

				int DirNumFirst = 0;
				int DirNumLast = 0;
				int FileNumFirst = 0;
				int FileNumLast = 0;
				CString csFinderStringFormat;

				CString DestParentDir;
				CString DestChildSuffix;
				CString DestFilePrefix;
				CString DestFileExt;
				int DestFileNumDigits;
				SplitImagePath(csFirstFileName, DestParentDir, DirNumFirst, DestChildSuffix, DestFilePrefix, DestFileNumDigits, FileNumFirst, DestFileExt);
				csFinderStringFormat.Format(_T("%s%%03d%s\\%s*.%s"), DestParentDir.GetString(), DestChildSuffix.GetString(), DestFilePrefix.GetString(), DestFileExt.GetString());
				SplitImagePath(csLastFileName, DestParentDir, DirNumLast, DestChildSuffix, DestFilePrefix, DestFileNumDigits, FileNumLast, DestFileExt);

				std::vector<CString> SourceImageList;
				int DirNum = DirNumFirst;
				int FileNum = FileNumFirst;
				do 
				{
					CString csFinderString;
					csFinderString.Format(csFinderStringFormat, DirNum);
					CFileFind finder;
					BOOL bWorking = finder.FindFile(csFinderString.GetString());
					while (bWorking)
					{
						bWorking = finder.FindNextFile();
						SplitImagePath(finder.GetFilePath(), DestParentDir, DirNum, DestChildSuffix, DestFilePrefix, DestFileNumDigits, FileNum, DestFileExt);
						if ((FileNum >= FileNumFirst) && (FileNum <= FileNumLast))
							SourceImageList.push_back(finder.GetFilePath());
					}
					finder.Close();
					DirNum++;
				} while (DirNum <= DirNumLast);

				std::wcout << "[" << getTimeISO8601() << "] " << "First File: " << csFirstFileName.GetString() << std::endl;
				std::wcout << "[" << getTimeISO8601() << "] " << "Last File:  " << csLastFileName.GetString() << std::endl;
				std::wcout << "[" << getTimeISO8601() << "] " << "Total Files: " << SourceImageList.size() << std::endl;

				TCHAR szPath[MAX_PATH] = _T("");
				SHGetFolderPath(NULL, CSIDL_MYVIDEO, NULL, 0, szPath);
				PathAddBackslash(szPath);
				CString csImageDirectory(szPath);
				csImageDirectory.Append(csVideoName);
				if (CreateDirectory(csImageDirectory, NULL))
				{
					int OutFileIndex = 0;
					for (auto SourceFile = SourceImageList.begin(); SourceFile != SourceImageList.end(); SourceFile++)
					{
						CString OutFilePath(csImageDirectory);
						OutFilePath.AppendFormat(_T("\\Wim%05d.JPG"), OutFileIndex++);
						std::wcout << "[" << getTimeISO8601() << "] " << "CopyFile " << SourceFile->GetString() << " to " << OutFilePath.GetString() << "\r";
						CopyFile(SourceFile->GetString(), OutFilePath, TRUE);
					}
					std::wcout << "\n";

					CString csImagePathSpec(csImageDirectory); csImagePathSpec.Append(_T("\\Wim%05d.JPG"));
					CString csVideoFullPath(csImageDirectory); csVideoFullPath.Append(_T(".mp4"));
					if (csFFMPEGPath.GetLength() > 0)
					{
						csVideoFullPath = csImageDirectory + _T("-1080p-cropped.mp4");
						std::wcout << "[" << getTimeISO8601() << "] " << csFFMPEGPath.GetString() << " -i " << QuoteFileName(csImagePathSpec).GetString() << " -y " << QuoteFileName(csVideoFullPath).GetString() << std::endl;
						if (-1 == _tspawnlp(_P_WAIT, csFFMPEGPath.GetString(), csFFMPEGPath.GetString(), 
							#ifdef _DEBUG
							_T("-report"),
							#endif
							_T("-i"), QuoteFileName(csImagePathSpec).GetString(),
							_T("-vf"), _T("crop=in_w:3/4*in_h"),
							// _T("-vf"), _T("rotate=PI"), // Use this to rotate the movie if we forgot to put the GoPro in upside down mode.
							_T("-s"), _T("1920x1080"),
							_T("-y"), // Cause it to overwrite existing output files
							QuoteFileName(csVideoFullPath).GetString(), NULL))
							std::wcout << "[" << getTimeISO8601() << "]  _tspawnlp failed: " /* << _sys_errlist[errno] */ << std::endl;
						csVideoFullPath = csImageDirectory + _T("-1080p-compressed.mp4");
						std::wcout << "[" << getTimeISO8601() << "] " << csFFMPEGPath.GetString() << " -i " << QuoteFileName(csImagePathSpec).GetString() << " -y " << QuoteFileName(csVideoFullPath).GetString() << std::endl;
						if (-1 == _tspawnlp(_P_WAIT, csFFMPEGPath.GetString(), csFFMPEGPath.GetString(), 
							#ifdef _DEBUG
							_T("-report"),
							#endif
							_T("-i"), QuoteFileName(csImagePathSpec).GetString(),
							// _T("-vf"), _T("rotate=PI"), // Use this to rotate the movie if we forgot to put the GoPro in upside down mode.
							_T("-s"), _T("1920x1080"),
							_T("-y"), // Cause it to overwrite existing output files
							QuoteFileName(csVideoFullPath).GetString(), NULL))
							std::wcout << "[" << getTimeISO8601() << "]  _tspawnlp failed: " /* << _sys_errlist[errno] */ << std::endl;
					}
					do 
					{
						CString OutFilePath(csImageDirectory);
						OutFilePath.AppendFormat(_T("\\Wim%05d.JPG"), --OutFileIndex);
						std::wcout << "[" << getTimeISO8601() << "] " << "DeleteFile " << OutFilePath.GetString() << "\r";
						DeleteFile(OutFilePath);
					} while (OutFileIndex > 0);
					std::wcout << "\n[" << getTimeISO8601() << "] " << "RemoveDirectory " << csImageDirectory.GetString() << std::endl;
					RemoveDirectory(csImageDirectory);
				}
			}
		}
	}
	else
	{
		_tprintf(_T("Fatal Error: GetModuleHandle failed\n"));
		nRetCode = 1;
	}
	return nRetCode;
}

Time Lapse Videos using FFMPEG

I’ve been creating time lapse videos using FFMPEG from the output of a GoPro camera since the beginning of summer. I have always been interested in time lapse videos but did not have an easy method of creating them until recently.

I like the result best when I have the GoPro set to take one image every 5 seconds, and then I have FFMPEG create a default MP4 file at 25 frames per second. The standard GoPro battery seems to record just about two hours worth of full resolution images in my Hero 3+ Black, which works out to just about a minute of video.

My first video using this method was of the sunset over Elliott Bay taken through the window of the elevator waiting room where I live, on the 13th floor.

I’ve written a program that copies the default GoPro image file names to a sequence that starts with image number 0000 so that FFMPEG will recognize a starting sequence with the default globbing method.

An example command line I use to start FFMPEG is:

ffmpeg -i \\MyServer\Pictures\GoPro\Sunset2\Wim%04d.JPG \\MyServer\Pictures\GoPro\Sunset2\Sunset2.mp4

That command line will actually create a video that has a resolution of 4000×3000, which is the resolution at which I’m taking the individual pictures. I could have told FFMPEG to reshape the output, or trim it.

The second video I created using this method was of a sunrise in essentially the same location.

https://trac.ffmpeg.org/wiki/Create%20a%20video%20slideshow%20from%20images has good examples of some of the options.

The most recent video was created after I purchased a suction cup mount for my GoPro. I went to the 17th floor and read a book during the hour before sunset.

Several things are apparent in this process.

  • I need a black out curtain surrounding the camera to block reflections when the light is directed at the camera. The camera has been positioned flush against the surface of the glass, but the thickness of the glass creates reflections inside the glass itself.
  • Videos with weather are much more interesting than just the movement of the sun itself.
  • I have been turning on the WiFi in the camera to make sure I’ve positioned the camera correctly. I need to try turning off the WiFi to see how it affects the battery rundown length.
  • I need to change the picture frequency to a longer or shorter time-span to see if the battery life is affected at all.

Logitech C920 Angle of View

I realized today that the Logitech C920 webcam produces images covering a different field of vision (FOV) for the same width based on the height. I was expecting the horizontal field to be the same for a given width but it was not.

Running the command ffmpeg -f video4linux2 -list_formats all -i /dev/video0 to list the available video sizes reports the same set of sizes for h264 and mjpeg: 640×480, 160×90, 160×120, 176×144, 320×180, 320×240, 352×288, 432×240, 640×360, 800×448, 800×600, 864×480, 960×720, 1024×576, 1280×720, 1600×896 and 1920×1080. In raw (yuyv422) mode two additional sizes are available: 2304×1296 and 2304×1536.

I pointed my webcam at the building out my window, giving myself a rough grid pattern to look at and ran it through all of the h.264 sizes, and manually counted the horizontal and vertical blocks visible. 

I expected 640×480 and 640×360 to have the same horizontal FOV but different vertical FOV. What actually happened was that they displayed the same vertical FOV but different horizontal FOV.

I ran through all of the h264 resolutions, and the vertical FOV appeared to shrink slightly when I requested resolutions below 200, but otherwise stayed the same. 

Selecting 2304×1536 produced a slightly larger vertical FOV with the same horizontal FOV as 1920×1080. 2304×1296 seemed to produce the same FOV in both directions as 1920×1080. Both of these resolutions run at lower frame rates and only in raw mode. I was testing them using ffmpeg transcoding and sending to my Windows desktop with the command: ffmpeg -re -f v4l2 -video_size 2304x1536 -framerate 2 -input_format yuyv422 -i /dev/video0 -f mpegts udp://192.168.0.10:8090

The C920 advertises a diagonal FOV of 78°, but I didn’t find an official definition of what that means. I found a nice bit of information at http://therandomlab.blogspot.com/2013/03/logitech-c920-and-c910-fields-of-view.html that explicitly describes it as applying when the camera is running in 16×9 mode.
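
As a rough check on what a 78° diagonal would mean, here is a sketch that converts a diagonal FOV into horizontal and vertical FOV for a given aspect ratio. It assumes an ideal rectilinear (distortion free) lens, which the C920 may not be, and the names FieldOfView and FovFromDiagonal are made up for this illustration.

```cpp
#include <cassert>
#include <cmath>

struct FieldOfView { double horizontalDegrees; double verticalDegrees; };

// Split a diagonal field of view into horizontal and vertical parts,
// assuming a rectilinear projection where tan(angle/2) scales with the
// length of the side it covers.
FieldOfView FovFromDiagonal(double diagonalDegrees, double aspectWidth, double aspectHeight)
{
    assert(diagonalDegrees > 0 && diagonalDegrees < 180);
    const double toRadians = std::acos(-1.0) / 180.0;
    const double diagonal = std::hypot(aspectWidth, aspectHeight);
    const double tanHalfDiagonal = std::tan(diagonalDegrees * toRadians / 2.0);
    FieldOfView fov;
    fov.horizontalDegrees = 2.0 * std::atan(tanHalfDiagonal * aspectWidth / diagonal) / toRadians;
    fov.verticalDegrees = 2.0 * std::atan(tanHalfDiagonal * aspectHeight / diagonal) / toRadians;
    return fov;
}
```

Under that assumption, FovFromDiagonal(78, 16, 9) works out to roughly 70.4° horizontal and 43.3° vertical.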

I will probably get around to writing a program to more accurately produce the results.  Here’s my manual table:

Resolution  Width  Height  Blocks  Floors  Width/Height Ratio  MegaPixels
160×90        160      90       9       8            1.777778        0.01
160×120       160     120       7       8            1.333333        0.01
176×144       176     144       7       9            1.222222        0.02
320×180       320     180       9       8            1.777778        0.05
320×240       320     240       7       9            1.333333        0.07
352×288       352     288       7       9            1.222222         0.1
432×240       432     240      10       9                 1.8         0.1
640×360       640     360      10       9            1.777778        0.23
640×480       640     480       7       9            1.333333         0.3
800×448       800     448      10       9            1.785714        0.35
800×600       800     600       7       9            1.333333        0.48
864×480       864     480      10       9                 1.8        0.41
960×720       960     720       7       9            1.333333        0.69
1024×576     1024     576      10       9            1.777778        0.58
1280×720     1280     720      10       9            1.777778        0.92
1600×896     1600     896      10       9            1.785714        1.43
1920×1080    1920    1080      10       9            1.777778        2.07
2304×1296    2304    1296                            1.777778        2.98
2304×1536    2304    1536                                 1.5        3.53
               16       9                            1.777778
                4       3                            1.333333

 

WimTiVoServer changes to use FFProbe

WimTiVoServer was originally written using the libraries that FFMPEG is based on to retrieve details about video files. I had downloaded the packages from http://ffmpeg.zeranoe.com/builds/ and used the DLLs for the library calls. In other programs I’m building related to FFMPEG, I update FFMPEG on a regular basis. Maintaining the correct link path any time I came back for a minor adjustment to WimTiVoServer became more of an effort than I wanted to deal with, so I investigated what else was available.

My solution has been to use FFProbe, which is distributed with FFmpeg. I spawn it as a child process and capture its standard output. I read the results of my command into an IStream memory stream object, which I then hand to an IXmlReader object to parse the XML for the items I’m looking for.

The command line I’m using for FFProbe is ffprobe.exe -show_streams -show_format -print_format xml INPUT. An example of the output it produces is:

<?xml version="1.0" encoding="UTF-8"?>
<ffprobe>
    <streams>
        <stream index="0" codec_name="ac3" codec_long_name="ATSC A/52A (AC-3)" codec_type="audio" codec_time_base="1/48000" codec_tag_string="[0][0][0][0]" codec_tag="0x0000" sample_fmt="fltp" sample_rate="48000" channels="6" bits_per_sample="0" dmix_mode="-1" ltrt_cmixlev="-1.000000" ltrt_surmixlev="-1.000000" loro_cmixlev="-1.000000" loro_surmixlev="-1.000000" id="0x27" r_frame_rate="0/0" avg_frame_rate="0/0" time_base="1/10000000" start_pts="22054844" start_time="2.205484" duration_ts="19133694951" duration="1913.369495" bit_rate="384000">
            <disposition default="0" dub="0" original="0" comment="0" lyrics="0" karaoke="0" forced="0" hearing_impaired="0" visual_impaired="0" clean_effects="0" attached_pic="0"/>
        </stream>
        <stream index="1" codec_name="ac3" codec_long_name="ATSC A/52A (AC-3)" codec_type="audio" codec_time_base="1/48000" codec_tag_string="[0][0][0][0]" codec_tag="0x0000" sample_fmt="fltp" sample_rate="48000" channels="2" bits_per_sample="0" dmix_mode="-1" ltrt_cmixlev="-1.000000" ltrt_surmixlev="-1.000000" loro_cmixlev="-1.000000" loro_surmixlev="-1.000000" id="0x28" r_frame_rate="0/0" avg_frame_rate="0/0" time_base="1/10000000" start_pts="23039510" start_time="2.303951" bit_rate="192000">
            <disposition default="0" dub="0" original="0" comment="0" lyrics="0" karaoke="0" forced="0" hearing_impaired="0" visual_impaired="0" clean_effects="0" attached_pic="0"/>
        </stream>
        <stream index="2" codec_name="mpeg2video" codec_long_name="MPEG-2 video" profile="Main" codec_type="video" codec_time_base="1001/120000" codec_tag_string="[0][0][0][0]" codec_tag="0x0000" width="1280" height="720" has_b_frames="1" sample_aspect_ratio="1:1" display_aspect_ratio="16:9" pix_fmt="yuv420p" level="4" timecode="00:00:00:00" id="0x29" r_frame_rate="60000/1001" avg_frame_rate="60000/1001" time_base="1/10000000" start_pts="31875510" start_time="3.187551">
            <disposition default="0" dub="0" original="0" comment="0" lyrics="0" karaoke="0" forced="0" hearing_impaired="0" visual_impaired="0" clean_effects="0" attached_pic="0"/>
        </stream>
        <stream index="3" codec_type="subtitle" codec_time_base="1/10000000" codec_tag_string="[0][0][0][0]" codec_tag="0x0000" id="0x2a" r_frame_rate="0/0" avg_frame_rate="0/0" time_base="1/10000000" start_pts="32209177" start_time="3.220918">
            <disposition default="0" dub="0" original="0" comment="0" lyrics="0" karaoke="0" forced="0" hearing_impaired="0" visual_impaired="0" clean_effects="0" attached_pic="0"/>
        </stream>
        <stream index="4" codec_name="mjpeg" codec_long_name="MJPEG (Motion JPEG)" codec_type="video" codec_time_base="1/90000" codec_tag_string="[0][0][0][0]" codec_tag="0x0000" width="200" height="113" has_b_frames="0" sample_aspect_ratio="1:1" display_aspect_ratio="200:113" pix_fmt="yuvj420p" level="-99" id="0xffffffff" r_frame_rate="90000/1" avg_frame_rate="0/0" time_base="1/90000" start_pts="198494" start_time="2.205489" duration_ts="172203255" duration="1913.369500">
            <disposition default="0" dub="0" original="0" comment="0" lyrics="0" karaoke="0" forced="0" hearing_impaired="0" visual_impaired="0" clean_effects="0" attached_pic="1"/>
            <tag key="title" value="TV Thumbnail"/>
        </stream>
    </streams>

    <format filename="d:\Recorded TV\Archer_FXPHD_2013_02_28_22_00_00.wtv" nb_streams="5" nb_programs="0" format_name="wtv" format_long_name="Windows Television (WTV)" start_time="2.205484" duration="1913.369495" size="1956642816" bit_rate="8180930" probe_score="100">
        <tag key="WM/MediaClassPrimaryID" value="db9830bd-3ab3-4fab-8a371a995f7ff74"/>
        <tag key="WM/MediaClassSecondaryID" value="ba7f258a-62f7-47a9-b21f4651c42a000"/>
        <tag key="Title" value="Archer"/>
        <tag key="WM/SubTitle" value="Live and Let Dine"/>
        <tag key="WM/SubTitleDescription" value="Archer, Lana and Cyril go undercover in celebrity chef Lance Casteau&apos;s hellish kitchen."/>
        <tag key="genre" value="Comedy;General;Series"/>
        <tag key="WM/OriginalReleaseTime" value="0"/>
        <tag key="language" value="en-us"/>
        <tag key="WM/MediaCredits" value="H. Jon Benjamin/Jessica Walter/Aisha Tyler/George Coe/Chris Parnell/Judy Greer;;;Anthony Bourdain"/>
        <tag key="service_provider" value="FXPHD"/>
        <tag key="service_name" value="FX HD (Pacific)"/>
        <tag key="WM/MediaNetworkAffiliation" value="Satellite"/>
        <tag key="WM/MediaOriginalChannel" value="728"/>
        <tag key="WM/MediaOriginalChannelSubNumber" value="0"/>
        <tag key="WM/MediaOriginalBroadcastDateTime" value="2013-02-28T08:00:00Z"/>
        <tag key="WM/MediaOriginalRunTime" value="19144791872"/>
        <tag key="WM/MediaIsStereo" value="false"/>
        <tag key="WM/MediaIsRepeat" value="false"/>
        <tag key="WM/MediaIsLive" value="false"/>
        <tag key="WM/MediaIsTape" value="false"/>
        <tag key="WM/MediaIsDelay" value="false"/>
        <tag key="WM/MediaIsSubtitled" value="false"/>
        <tag key="WM/MediaIsMovie" value="false"/>
        <tag key="WM/MediaIsPremiere" value="false"/>
        <tag key="WM/MediaIsFinale" value="false"/>
        <tag key="WM/MediaIsSAP" value="false"/>
        <tag key="WM/MediaIsSport" value="false"/>
        <tag key="WM/Provider" value="MediaCenterDefault"/>
        <tag key="WM/VideoClosedCaptioning" value="false"/>
        <tag key="WM/WMRVEncodeTime" value="2013-03-01 06:00:05"/>
        <tag key="WM/WMRVSeriesUID" value="!MCSeries!225842780"/>
        <tag key="WM/WMRVServiceID" value="!MCService!188913961"/>
        <tag key="WM/WMRVProgramID" value="!MCProgram!285145704"/>
        <tag key="WM/WMRVRequestID" value="0"/>
        <tag key="WM/WMRVScheduleItemID" value="0"/>
        <tag key="WM/WMRVQuality" value="0"/>
        <tag key="WM/WMRVOriginalSoftPrePadding" value="300"/>
        <tag key="WM/WMRVOriginalSoftPostPadding" value="120"/>
        <tag key="WM/WMRVHardPrePadding" value="-300"/>
        <tag key="WM/WMRVHardPostPadding" value="0"/>
        <tag key="WM/WMRVATSCContent" value="true"/>
        <tag key="WM/WMRVDTVContent" value="true"/>
        <tag key="WM/WMRVHDContent" value="true"/>
        <tag key="Duration" value="19151788198"/>
        <tag key="WM/WMRVEndTime" value="2013-03-01 06:32:00"/>
        <tag key="WM/WMRVBitrate" value="8.173201"/>
        <tag key="WM/WMRVKeepUntil" value="-1"/>
        <tag key="WM/WMRVActualSoftPrePadding" value="294"/>
        <tag key="WM/WMRVActualSoftPostPadding" value="120"/>
        <tag key="WM/WMRVContentProtected" value="true"/>
        <tag key="WM/WMRVContentProtectedPercent" value="99"/>
        <tag key="WM/WMRVExpirationSpan" value="9223372036854775807"/>
        <tag key="WM/WMRVInBandRatingSystem" value="255"/>
        <tag key="WM/WMRVInBandRatingLevel" value="255"/>
        <tag key="WM/WMRVInBandRatingAttributes" value="0"/>
        <tag key="WM/WMRVWatched" value="false"/>
        <tag key="WM/MediaThumbWidth" value="352"/>
        <tag key="WM/MediaThumbHeight" value="198"/>
        <tag key="WM/MediaThumbStride" value="1056"/>
        <tag key="WM/MediaThumbRet" value="0"/>
        <tag key="WM/MediaThumbRatingSystem" value="9"/>
        <tag key="WM/MediaThumbRatingLevel" value="17"/>
        <tag key="WM/MediaThumbRatingAttributes" value="0"/>
        <tag key="WM/MediaThumbAspectRatioX" value="16"/>
        <tag key="WM/MediaThumbAspectRatioY" value="9"/>
        <tag key="WM/MediaThumbTimeStamp" value="4647772712253334203"/>
    </format>
</ffprobe>

I am parsing the XML and keeping track of only the first video stream details and the first audio stream details, and then looking for some specific items in the metadata tags. I store the information and return it to the TiVo as information when it’s requesting a list of what programs are available to transfer and then when I transfer the file itself.
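
The "first video stream, first audio stream" selection can be sketched in portable C++. This is a deliberate simplification and not what WimTiVoServer does: it scans the text with std::regex instead of using IXmlReader, it assumes attribute values never contain a > or " character, and the names AttributeMap, FirstStreams and FindFirstStreams are hypothetical.

```cpp
#include <map>
#include <regex>
#include <string>

using AttributeMap = std::map<std::string, std::string>;

// Collect name="value" pairs from the text of one opening tag.
AttributeMap ParseAttributes(const std::string& tag)
{
    static const std::regex attribute(R"re(([A-Za-z_][A-Za-z0-9_]*)="([^"]*)")re");
    AttributeMap result;
    for (std::sregex_iterator it(tag.begin(), tag.end(), attribute), end; it != end; ++it)
        result[(*it)[1].str()] = (*it)[2].str();
    return result;
}

struct FirstStreams
{
    AttributeMap video;  // attributes of the first video stream, empty if none
    AttributeMap audio;  // attributes of the first audio stream, empty if none
};

// Walk the <stream ...> tags in document order and keep only the first
// stream of each type, ignoring subtitles and later duplicates.
FirstStreams FindFirstStreams(const std::string& xml)
{
    static const std::regex streamTag(R"re(<stream\s[^>]*>)re");
    FirstStreams found;
    for (std::sregex_iterator it(xml.begin(), xml.end(), streamTag), end; it != end; ++it)
    {
        const AttributeMap attrs = ParseAttributes(it->str());
        const auto type = attrs.find("codec_type");
        if (type == attrs.end())
            continue;
        if (type->second == "video" && found.video.empty())
            found.video = attrs;
        else if (type->second == "audio" && found.audio.empty())
            found.audio = attrs;
    }
    return found;
}
```

Run against the sample output above, this would keep the 6 channel ac3 stream as the audio stream and the 1280×720 mpeg2video stream as the video stream, skipping the second audio track, the subtitle stream and the mjpeg thumbnail.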

An interesting side effect of moving to using XML from using the libraries is that the XML created by FFProbe handles extended characters that are not in the ASCII character set. Because I’m using the XML Parser that works with Unicode by default, it takes care of the characters properly. When I was using the libraries, I was looping on AVDictionaryEntry values and doing comparisons with char values.

Here is the code that I’m currently using. It’s not the prettiest code but it gets the job done and runs quickly enough.

void cTiVoFile::PopulateFromFFProbe(void)
{
	static const CString csFFProbePath(FindEXEFromPath(_T("ffprobe.exe")));
	if (!csFFProbePath.IsEmpty())
	{
		// Set the bInheritHandle flag so pipe handles are inherited. 
		SECURITY_ATTRIBUTES saAttr;  
		saAttr.nLength = sizeof(SECURITY_ATTRIBUTES); 
		saAttr.bInheritHandle = TRUE; 
		saAttr.lpSecurityDescriptor = NULL; 

		// Create a pipe for the child process's STDOUT. 
		HANDLE g_hChildStd_OUT_Rd = NULL;
		HANDLE g_hChildStd_OUT_Wr = NULL;
		if ( ! CreatePipe(&g_hChildStd_OUT_Rd, &g_hChildStd_OUT_Wr, &saAttr, 0x800000) ) 
			std::cout << "[" << getTimeISO8601() << "] "  << __FUNCTION__ << "\t ERROR: StdoutRd CreatePipe" << endl;
		else
		{
			// Ensure the read handle to the pipe for STDOUT is not inherited.
			if ( ! SetHandleInformation(g_hChildStd_OUT_Rd, HANDLE_FLAG_INHERIT, 0) )
				std::cout << "[" << getTimeISO8601() << "] "  << __FUNCTION__ << "\t ERROR: Stdout SetHandleInformation" << endl;
			else
			{
				// Create a child process that uses the previously created pipe for STDOUT.
				// Set up members of the PROCESS_INFORMATION structure.  
				PROCESS_INFORMATION piProcInfo; 
				ZeroMemory( &piProcInfo, sizeof(PROCESS_INFORMATION) );
 
				// Set up members of the STARTUPINFO structure. 
				// This structure specifies the STDIN and STDOUT handles for redirection.
				STARTUPINFO siStartInfo;
				ZeroMemory( &siStartInfo, sizeof(STARTUPINFO) );
				siStartInfo.cb = sizeof(STARTUPINFO); 
				siStartInfo.hStdError = GetStdHandle(STD_ERROR_HANDLE);
				siStartInfo.hStdInput = GetStdHandle(STD_INPUT_HANDLE);
				siStartInfo.hStdOutput = g_hChildStd_OUT_Wr;
				siStartInfo.dwFlags |= STARTF_USESTDHANDLES;
 
				CString csCommandLine(QuoteFileName(csFFProbePath));
				csCommandLine.Append(_T(" -show_streams -show_format -print_format xml "));
				csCommandLine.Append(QuoteFileName(m_csPathName));

				TRACE(_T("CreateProcess: %s\n"), csCommandLine.GetString());
				// Create the child process.
				if (CreateProcess(NULL, 
					(LPTSTR) csCommandLine.GetString(),     // command line 
					NULL,          // process security attributes 
					NULL,          // primary thread security attributes 
					TRUE,          // handles are inherited 
					0,             // creation flags 
					NULL,          // use parent's environment 
					NULL,          // use parent's current directory 
					&siStartInfo,  // STARTUPINFO pointer 
					&piProcInfo))  // receives PROCESS_INFORMATION 
				{
					CloseHandle(g_hChildStd_OUT_Wr);	// Close the parent's copy of the write end; otherwise ReadFile never sees end-of-file and the parent hangs.
					CComPtr<IStream> spMemoryStreamOne(::SHCreateMemStream(NULL, 0));
					if (spMemoryStreamOne != NULL)
					{
						const int RAWDataBuffSize = 0x1000;	// 0x1000 is 4k
						char * RAWDataBuff = new char[RAWDataBuffSize];
						for (;;)
						{
							DWORD dwRead = 0;
							BOOL bSuccess = ReadFile(g_hChildStd_OUT_Rd, RAWDataBuff, RAWDataBuffSize, &dwRead, NULL);
							if( (!bSuccess) || (dwRead == 0)) break;
							ULONG cbWritten;
							spMemoryStreamOne->Write(RAWDataBuff, dwRead, &cbWritten);
						} 
						delete[] RAWDataBuff;
						// reposition back to beginning of stream
						LARGE_INTEGER position;
						position.QuadPart = 0;
						spMemoryStreamOne->Seek(position, STREAM_SEEK_SET, NULL);
						HRESULT hr = S_OK;
						CComPtr<IXmlReader> pReader; 
						if (SUCCEEDED(hr = CreateXmlReader(__uuidof(IXmlReader), (void**) &pReader, NULL))) 
						{
							if (SUCCEEDED(hr = pReader->SetProperty(XmlReaderProperty_DtdProcessing, DtdProcessing_Prohibit))) 
							{
								if (SUCCEEDED(hr = pReader->SetInput(spMemoryStreamOne))) 
								{
									int indentlevel = 0;
									XmlNodeType nodeType; 
									const WCHAR* pwszLocalName;
									const WCHAR* pwszValue;
									CString csLocalName;
									bool bIsFormat = false;
									bool bVideoStreamInfoNeeded = true;
									bool bAudioStreamInfoNeeded = true;

									//read until there are no more nodes 
									while (S_OK == (hr = pReader->Read(&nodeType))) 
									{
										if (nodeType == XmlNodeType_Element)
										{
											if (SUCCEEDED(hr = pReader->GetLocalName(&pwszLocalName, NULL)))
											{
												csLocalName = CString(pwszLocalName);
												if ((bVideoStreamInfoNeeded || bAudioStreamInfoNeeded) && !csLocalName.Compare(_T("stream")))
												{
													CString cs_codec_name;
													CString cs_codec_type;
													CString cs_codec_time_base;
													CString cs_width;
													CString cs_height;
													CString cs_duration;
													while (S_OK == pReader->MoveToNextAttribute())
													{
														if (SUCCEEDED(hr = pReader->GetLocalName(&pwszLocalName, NULL)))
															if (SUCCEEDED(hr = pReader->GetValue(&pwszValue, NULL)))
														{
															csLocalName = CString(pwszLocalName);
															if (!csLocalName.Compare(_T("codec_name")))
																cs_codec_name = CString(pwszValue);
															else if (!csLocalName.Compare(_T("codec_type")))
																cs_codec_type = CString(pwszValue);
															else if (!csLocalName.Compare(_T("codec_time_base")))
																cs_codec_time_base = CString(pwszValue);
															else if (!csLocalName.Compare(_T("width")))
																cs_width = CString(pwszValue);
															else if (!csLocalName.Compare(_T("height")))
																cs_height = CString(pwszValue);
															else if (!csLocalName.Compare(_T("duration")))
																cs_duration = CString(pwszValue);
														}
													}
													if (bVideoStreamInfoNeeded && !cs_codec_type.Compare(_T("video")))	// only the first video stream
													{
														bVideoStreamInfoNeeded = false;
														if (!cs_codec_name.Compare(_T("mpeg2video")))
															m_VideoCompatible = true;
														m_SourceFormat = cs_codec_type + CString(_T("/")) + cs_codec_name;
														int width = 0;
														std::wstringstream ss;
														ss << cs_width.GetString();
														ss >> width;
														if (width >= 1280)
															m_VideoHighDefinition = true;
														double duration = 0;
														ss = std::wstringstream();
														ss << cs_duration.GetString();
														ss >> duration;
														m_Duration = duration * 1000 + 5;
													}
													else if (bAudioStreamInfoNeeded && !cs_codec_type.Compare(_T("audio")))	// only the first audio stream
													{
														bAudioStreamInfoNeeded = false;
														if (!cs_codec_name.Compare(_T("ac3")))
															m_AudioCompatible = true;
													}	
												}
												else if (!csLocalName.Compare(_T("format")))
												{
													bIsFormat = true;
													const CString ccs_duration(_T("duration"));
													while (S_OK == pReader->MoveToNextAttribute())
													{
														if (SUCCEEDED(hr = pReader->GetLocalName(&pwszLocalName, NULL)))
															if (SUCCEEDED(hr = pReader->GetValue(&pwszValue, NULL)))
														{
															if (!ccs_duration.Compare(pwszLocalName))
															{
																double duration = 0;
																std::wstringstream ss;
																ss << pwszValue;
																ss >> duration;
																m_Duration = duration * 1000 + 5;
															}
														}
													}
												}
												// Here's where I need to dig deeper.
												else if (bIsFormat && (!csLocalName.Compare(_T("tag"))))
												{
													CString csAttributeKey;
													CString csAttributeValue;
													while (S_OK == pReader->MoveToNextAttribute())
													{
														if (SUCCEEDED(hr = pReader->GetLocalName(&pwszLocalName, NULL)))
															if (SUCCEEDED(hr = pReader->GetValue(&pwszValue, NULL)))
														{
															if (!CString(_T("key")).Compare(pwszLocalName))
																csAttributeKey = CString(pwszValue);
															else if (!CString(_T("value")).Compare(pwszLocalName))
																csAttributeValue = CString(pwszValue);
														}
													}
													if (!csAttributeKey.CompareNoCase(_T("title")))
														m_Title = csAttributeValue;
													else if (!csAttributeKey.CompareNoCase(_T("episode_id")))
														m_EpisodeTitle = csAttributeValue;
													else if (!csAttributeKey.CompareNoCase(_T("description")))
														m_Description = csAttributeValue;
													else if (!csAttributeKey.CompareNoCase(_T("WM/SubTitle")))
														m_EpisodeTitle = csAttributeValue;
													else if (!csAttributeKey.CompareNoCase(_T("WM/SubTitleDescription")))
														m_Description = csAttributeValue;
													else if (!csAttributeKey.CompareNoCase(_T("genre")))
														m_vProgramGenre = csAttributeValue;
													else if (!csAttributeKey.CompareNoCase(_T("service_provider")))
														m_SourceStation = csAttributeValue;
													else if (!csAttributeKey.CompareNoCase(_T("WM/MediaOriginalChannel")))
														m_SourceChannel = csAttributeValue;
													else if (!csAttributeKey.CompareNoCase(_T("WM/MediaCredits")))
													{
														m_vActor = csAttributeValue;
														while (0 < m_vActor.Replace(_T(";;"),_T(";")));
														while (0 < m_vActor.Replace(_T("//"),_T("/")));
													}
													else if (!csAttributeKey.CompareNoCase(_T("WM/WMRVEncodeTime")))
													{
														CTime OriginalBroadcastDate = ISO8601totime(std::string(CStringA(csAttributeValue).GetString()));
														if (OriginalBroadcastDate > 0)
															m_CaptureDate = OriginalBroadcastDate;
													}
													else if (!csAttributeKey.CompareNoCase(_T("WM/MediaOriginalBroadcastDateTime")))
													{
														CTime OriginalBroadcastDate = ISO8601totime(std::string(CStringA(csAttributeValue).GetString()));
														if (OriginalBroadcastDate > 0)
															m_CaptureDate = OriginalBroadcastDate;
													}
													m_Description.Trim();
												}
											}
										}
										else if (nodeType == XmlNodeType_EndElement)
										{
											if (SUCCEEDED(hr = pReader->GetLocalName(&pwszLocalName, NULL)))
												if (!CString(pwszLocalName).Compare(_T("format")))
													bIsFormat = false;
										}
									}
								}
							}
						}
					}
					// Close handles to the child process and its primary thread.
					// Some applications might keep these handles to monitor the status
					// of the child process, for example. 
					CloseHandle(piProcInfo.hProcess);
					CloseHandle(piProcInfo.hThread);
				}
			}
			CloseHandle(g_hChildStd_OUT_Rd);
		}
	}
}
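
The stream selection buried in the nested XML loop above can be separated from the Win32 and XML reader plumbing. The following is a minimal, portable sketch of that logic only, not the code the program runs: `StreamInfo`, `MediaSummary`, `ParseOrZero`, and `Summarize` are hypothetical names introduced for illustration, and the sketch assumes the stream attributes have already been read into strings.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the attributes read from one <stream> element.
struct StreamInfo {
    std::wstring codec_type;
    std::wstring codec_name;
    std::wstring width;
    std::wstring duration;  // seconds, as text
};

// Hypothetical summary mirroring the members set by PopulateFromFFProbe().
struct MediaSummary {
    bool videoCompatible = false;
    bool audioCompatible = false;
    bool highDefinition = false;
    long long durationMs = 0;
    std::wstring sourceFormat;
};

// Parse a number from text; returns 0 when the text is empty or not numeric.
template <typename T>
T ParseOrZero(const std::wstring& text) {
    T value = 0;
    std::wstringstream ss(text);
    ss >> value;
    return value;
}

// Keep only the first video stream and the first audio stream.
MediaSummary Summarize(const std::vector<StreamInfo>& streams) {
    MediaSummary out;
    bool needVideo = true;
    bool needAudio = true;
    for (const StreamInfo& s : streams) {
        if (needVideo && s.codec_type == L"video") {
            needVideo = false;
            out.videoCompatible = (s.codec_name == L"mpeg2video");
            out.sourceFormat = s.codec_type + L"/" + s.codec_name;
            out.highDefinition = ParseOrZero<int>(s.width) >= 1280;
            // Seconds to milliseconds; the "+ 5" is copied unchanged from the original code.
            out.durationMs = static_cast<long long>(ParseOrZero<double>(s.duration) * 1000 + 5);
        } else if (needAudio && s.codec_type == L"audio") {
            needAudio = false;
            out.audioCompatible = (s.codec_name == L"ac3");
        }
    }
    return out;
}
```

Calling `Summarize` with an mpeg2video stream followed by an ac3 stream marks both as compatible, and any later streams of either type are ignored. Keeping this logic in a plain function like this would also make it testable without launching ffprobe.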

Researching DVD Subtitle Format

I am attempting to stream webcam video from a BeagleBoneBlack to other computers over ethernet. I want to add an overlay with details about the video. I am capturing video from a Logitech C920 webcam, which does the hard work of encoding the video in H.264 format, and using FFMPEG to mux the video into a network stream. The current video stream runs at 3Mb/s over ethernet, and seems to run at the same bitrate whether I’m sending 30 FPS video at 1920×1080, 1280×720, or any other resolution I’ve tried. With the BBB running at 1GHz, FFMPEG uses only 3% of the processor, while at 300MHz it uses 10%. At either processor speed I should have plenty of CPU available for creating one subtitle frame a second.

If I have FFMPEG transcode the H.264 coming from the C920 back to H.264, as burning an overlay into the picture would require, the BBB CPU is 100% used and I’ve not been able to get over 5 FPS. This has led me to the idea of adding a second stream with much more compressible data and requiring the client computer to know how to enable subtitles.

My understanding of DVD Subtitles is that they are stored as image overlays. The images seem to be 3 color plus transparency, with the color indexed. They are RLE (Run Length Encoded) images but don’t seem to conform to any standard that would be created by an image library such as OpenCV.
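
To make the idea of a run-length encoded, four-entry indexed image concrete, here is a minimal sketch of generic RLE over one row of 2-bit palette indices. It deliberately does not reproduce the DVD subtitle bitstream layout, which is not documented here; `Run`, `EncodeRow`, and `DecodeRow` are hypothetical names used only for this illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// One run: (number of consecutive pixels, palette index 0..3).
// Index 0 could stand for transparency and 1..3 for the three colors.
using Run = std::pair<std::uint32_t, std::uint8_t>;

// Collapse a row of palette indices into runs of identical values.
std::vector<Run> EncodeRow(const std::vector<std::uint8_t>& row) {
    std::vector<Run> runs;
    for (std::uint8_t px : row) {
        const std::uint8_t index = px & 0x3;  // keep only the 2-bit palette index
        if (!runs.empty() && runs.back().second == index)
            ++runs.back().first;              // extend the current run
        else
            runs.emplace_back(1u, index);     // start a new run
    }
    return runs;
}

// Expand runs back into a row of palette indices.
std::vector<std::uint8_t> DecodeRow(const std::vector<Run>& runs) {
    std::vector<std::uint8_t> row;
    for (const Run& r : runs)
        row.insert(row.end(), static_cast<std::size_t>(r.first), r.second);
    return row;
}
```

A mostly transparent overlay row, such as 1900 transparent pixels followed by 20 colored ones, collapses to two runs, which is why this kind of data should be far cheaper to produce and send than re-encoded video.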

The most useful links I’ve come across related to the DVD subtitles are these three:

Using FFMPEG to examine a video that was ripped from a DVD into an MKV file with several subtitle layers shows the following:

Stream #0:0(eng): Video: mpeg2video (Main), yuv420p, 720x480 [SAR 32:27 DAR 16:9], SAR 186:157 DAR 279:157, 29.97 fps, 29.97 tbr, 1k tbn, 59.94 tbc
Stream #0:1(eng): Audio: ac3, 48000 Hz, 5.1(side), fltp, 448 kb/s (default)
Metadata:
  title           : 3/2+1
Stream #0:2(eng): Audio: ac3, 48000 Hz, 5.1(side), fltp, 384 kb/s
Metadata:
  title           : 3/2+1
Stream #0:3(spa): Audio: ac3, 48000 Hz, stereo, fltp, 192 kb/s
Metadata:
  title           : 2/0
Stream #0:4(fre): Audio: ac3, 48000 Hz, stereo, fltp, 192 kb/s
Metadata:
  title           : 2/0
Stream #0:5(eng): Subtitle: dvd_subtitle (default)
Stream #0:6(spa): Subtitle: dvd_subtitle
Stream #0:7(eng): Subtitle: dvd_subtitle
Stream #0:8(spa): Subtitle: dvd_subtitle
Stream #0:9(fre): Subtitle: dvd_subtitle

All of the descriptions of creating subtitle tracks are directly related to creating textual subtitles using tools that are wonderful for mainstream movie content but not what I want to do. e.g.

I’ve not figured out how to create my own subtitle stream and am still looking for information on that. I’ve not figured out what parameters may need to be passed to FFMPEG to indicate that I’m passing in a subtitle track. I’ve not figured out if there’s a way in FFMPEG to indicate that the subtitles should be on by default, or forced subtitles, while still keeping them as a separate stream.

It doesn’t help that the DVD subtitle files seem to use the STL extension and that same extension is used for the input files for many 3D Printers.