Monday, November 27, 2006

SetSurfaceDesc memory leak

The CVideo object is leaking memory badly, and I'm surprised I never noticed it before. It leaks the entire DirectDraw surface every time a new video is opened. At 640 x 480 that's around 1 MB per open, which adds up fast. This could explain why Whorld misbehaves after many hours of triggering video clips.

Initially I suspected VfW, but then I verified that VfW definitely cleans up after itself. The problem turns out to be with DirectDraw, specifically with SetSurfaceDesc.

CVideo's constructor creates a default 1 x 1 memory surface. When a video is opened, CVideo attaches this surface to the video frame, using SetSurfaceDesc. This is a major optimization: it allows CVideo to avoid using GDI to blit each video frame to the DirectDraw surface, because the video frame IS the DirectDraw surface. In practice SetSurfaceDesc only needs to be called once, when the video opens, because VfW doesn't change the address of the video frame after that. In fact it only changes the address if you open a video with a different frame size or pixel format, sensibly enough. CVideo checks for a change in frame buffer address, and if one occurs, it reattaches its surface to the new address.

According to the MSDN on SetSurfaceDesc, "The DirectDrawSurface object will not deallocate surface memory that it didn't allocate. Therefore, when the surface memory is no longer needed, it is your responsibility to deallocate it. However, when SetSurfaceDesc is called, DirectDraw frees the original surface memory that it implicitly allocated when creating the surface."

I interpreted this to mean that once you've called SetSurfaceDesc at least once for a surface, you're on your own as far as memory management goes. But what actually happens is that when the video is closed, DirectDraw leaves something allocated that's the same size as the frame buffer. It can't be VfW's frame buffer, because VfW destroys that when you call AVIStreamGetFrameClose. I can't imagine how or why this happens, but it sure isn't documented.

I only found two ways to make DirectDraw release this mysterious hidden frame buffer. The obvious way is to destroy the surface, but I'd prefer not to do this, because it means destroying and re-creating the surface every time a video is opened, which seems wasteful. The other way is to call SetSurfaceDesc again, passing it a 1 x 1 dummy surface. This works fine, and only takes about 50 microseconds. The surface description never changes, so it can even be a static array.

DDSURFACEDESC CVideo::m_DefSurf = {
    sizeof(DDSURFACEDESC), // dwSize
    DDSD_WIDTH | DDSD_HEIGHT | DDSD_PITCH | DDSD_LPSURFACE | DDSD_PIXELFORMAT, // dwFlags
    1, // dwHeight
    1, // dwWidth
    4, // lPitch (Width * BitCount / 8)
    0, 0, 0, 0, // dwBackBufferCount, dwMipMapCount, dwAlphaBitDepth, dwReserved
    &m_DefSurfMem, // lpSurface
    {0, 0}, {0, 0}, {0, 0}, {0, 0}, // color keys
    {
        sizeof(DDPIXELFORMAT), // dwSize
        DDPF_RGB, // dwFlags
        0, // dwFourCC
        32, // dwRGBBitCount
        0xff0000, // dwRBitMask
        0x00ff00, // dwGBitMask
        0x0000ff // dwBBitMask
    }
};
DWORD CVideo::m_DefSurfMem; // pointed to by m_DefSurf.lpSurface
...
void CVideo::Close()
{
    // if surface exists, we must attach it to a default 1 x 1 memory surface,
    // otherwise DirectDraw leaves a mysterious hidden frame buffer allocated
    if (m_Surface != NULL)
        m_Surface->SetSurfaceDesc(&m_DefSurf, 0); // prevents a major leak
...

Friday, November 24, 2006

Undo performance

The undo manager uses CArray to implement the undo history. As a result, the performance of undo notification varies significantly depending on whether undo is limited, or unlimited. Performance was measured using a test function that repeatedly generates the same undo event, as shown below. To simulate realistic conditions, the test function was called from the timer hook, and the results were stored in an array and written after the test, avoiding potential interference from file I/O.

If undo is unlimited, notification time is mostly constant, except when the CArray has to grow. Since growing entails copying the entire array to a new memory location, the time required to grow the CArray increases linearly with the number of undoable edits. In a test of 10000 iterations, undo notification took an average of 50 microseconds. The actual samples were nearly indistinguishable from the average, except when the array grew, resulting in peaks which increased linearly, up to 1.6 milliseconds by the end of the test. The time between peaks also increased linearly as expected, due to MFC's heuristic method of computing the grow size. There were also a few seemingly random, unexplained spikes of nearly 2.5 milliseconds.

If undo is limited, notification time is constant. This is because once the limit is reached, adding a new notification deletes the oldest event from the history. Deleting from the front of a CArray requires copying the entire array down one element, but the array size is constant, so there's no memory reallocation, and the time required to do the copy doesn't change. In a test of 10000 iterations, undo notification took an average of 60 microseconds, only 10 microseconds more than the unlimited case. The actual samples were similar to the average, with randomly-spaced peaks up to around 150 microseconds. Again there were some unexplained spikes, though they were an order of magnitude lower, around 250 microseconds.

Note that OnPlugBypass with undo notification commented out takes an average of 38 microseconds, so in all cases undo notification takes longer than other work performed by OnPlugBypass.

Conclusion: undo performance is suboptimal, due to the use of CArray. An implementation based on CList would almost certainly perform better for unlimited undo, and probably the same or slightly better for limited undo. This optimization needs to be weighed against substantially increased complexity in the undo manager, e.g. array indexing would have to be replaced by iteration.

This hypothesis was tested by slapping together a minimally functional CList-based implementation and repeating the test. The result: for unlimited undo, the average time was 48 microseconds, and the actual samples showed only minor deviations, e.g. 80 or 150 microseconds, except for the occasional unexplained 2.5 millisecond spike. On the other hand, the undo manager complications look pretty formidable.
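To make the trade-off concrete, here is a minimal sketch in portable C++. It is not the undo manager's code: std::vector stands in for CArray and std::list stands in for CList (both substitutions are assumptions made for illustration), and the names ArrayHistory, ListHistory, Push, and At are hypothetical. Both containers keep at most `limit` events. The array version copies every element down when it drops the oldest event; the list version doesn't, but it has to iterate to reach an element by index.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <vector>

// Hypothetical history backed by a contiguous array (stand-in for CArray).
struct ArrayHistory {
    std::vector<int> events;   // undo event codes, oldest first
    std::size_t limit;         // 0 means unlimited
    explicit ArrayHistory(std::size_t lim) : limit(lim) {}
    void Push(int code) {
        if (limit != 0 && events.size() == limit)
            events.erase(events.begin());   // O(n): shifts every element down by one
        events.push_back(code);             // if unlimited, may reallocate and copy
    }
};

// Hypothetical history backed by a linked list (stand-in for CList).
struct ListHistory {
    std::list<int> events;     // undo event codes, oldest first
    std::size_t limit;         // 0 means unlimited
    explicit ListHistory(std::size_t lim) : limit(lim) {}
    void Push(int code) {
        if (limit != 0 && events.size() == limit)
            events.pop_front();             // O(1): no copying
        events.push_back(code);             // one node allocation per event
    }
    // No random access: indexing is replaced by iteration
    int At(std::size_t index) const {
        assert(index < events.size());
        std::list<int>::const_iterator it = events.begin();
        std::advance(it, static_cast<std::ptrdiff_t>(index));
        return *it;
    }
};
```

With a limit of 3, pushing the codes 0 through 9 leaves 7, 8, 9 in both containers; only the cost of getting there differs. The timer-hook test function used for the measurements is listed next.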

static const int MAX_SAMPS = 10000;
float samp[MAX_SAMPS];
int samps = 0;
void CMainFrame::OnTimer(UINT nIDEvent)
{
    if (m_Plugin[0].IsCreated()) {
#if 0 // zero for unlimited undo
        if (!samps)
            m_UndoMgr.SetLevels(100);
#endif
        OnPlugBypass();
        if (samps == MAX_SAMPS) { // test is done; write the results and quit
            FILE *fp = fopen("undo bench.txt", "wc"); // "c" is the MSVC commit flag
            for (int i = 0; i < samps; i++)
                fprintf(fp, "%d\t%f\n", i, samp[i]);
            fclose(fp);
            exit(0);
        }
    }
...

#include "benchmark.h"
extern float samp[];
extern int samps;
void CFFPlugsDlg::OnPlugBypass()
{
    int sel = GetCurSel();
    if (sel >= 0) {
        CBenchmark b;
        NotifyUndoableEdit(UCODE_BYPASS);
        samp[samps++] = float(b.Elapsed());
        BypassPlugin(sel, !IsPluginBypassed(sel));
    }
}

Thursday, November 09, 2006

Synchronizing automations to clip length

The manual method is pretty straightforward, though it does require a calculator. Take FFRend's ideal frame rate (NOT the video clip's frame rate, that doesn't matter), and divide it by the video clip's frame count. Now multiply the result by 100. Enter that number in the Master speed toolbar, and you're all set, though you might also want to pause, rewind the clip, and sync the oscillators.

This scheme redefines the frequency unit, from Hertz to clip passes. A frequency of 1 will repeat once per clip pass, 2 will repeat twice per clip pass, .5 will repeat every other clip pass, etc.

The X 100 accounts for the fact that master speed is a percentage.

For example, if the clip is 1859 frames long, and it's playing at 25 FPS:

Master Speed = 25 / 1859 * 100 = 1.3448
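
The reasoning behind the formula: at 25 FPS, 1859 frames take 1859 / 25 = 74.36 seconds per pass, so a frequency of 1 has to complete one cycle every 74.36 seconds instead of every second, i.e. run at 25 / 1859 of normal speed. As a small sketch, the calculation could be wrapped in a helper like this (the function name ClipSyncMasterSpeed is hypothetical, not something FFRend provides):

```cpp
#include <cassert>

// Master speed (a percentage) at which a frequency of 1 means one cycle per clip pass.
// frameRate is FFRend's frame rate in frames per second, NOT the clip's frame rate.
// frameCount is the clip's length in frames.
double ClipSyncMasterSpeed(double frameRate, int frameCount)
{
    assert(frameRate > 0 && frameCount > 0);
    return frameRate / frameCount * 100;   // x 100 because master speed is a percentage
}
```

Calling ClipSyncMasterSpeed(25, 1859) gives approximately 1.3448, matching the example above.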

Monday, November 06, 2006

frame buffer bit counts

PlayerFF works in Resolume and Flowmotion, but not in OpenTZT, because OpenTZT passes 24-bit frames to the plugin, even though the display's color depth is 32-bit. The underlying problem is that you can't use DirectDraw to blit between surfaces with different bit counts. I tell my AVI reader (AviToBmp) to uncompress the video into the best format for the display (by passing AVIStreamGetFrameOpen AVIGETFRAMEF_BESTDISPLAYFMT). I use SetSurfaceDesc to turn the video frame into a DirectDraw surface, which means if my display is set for 32 bits, my video frame is also 32 bits, regardless of the actual color depth of the video. That's optimal if the host frame buffers also have the display's bit count, which you would think they would, but in OpenTZT, they don't for some reason, so the blit fails with error E_NOTIMPL.

AVIStreamGetFrameOpen can also be passed a BITMAPINFOHEADER that tells it what format to decompress to. This allows me to force the video format to match the host's format, as follows:

BITMAPINFOHEADER bih;
ZeroMemory(&bih, sizeof(bih));
bih.biSize = sizeof(bih);
bih.biWidth = m_pBmpInfo->bmiHeader.biWidth;
bih.biHeight = m_pBmpInfo->bmiHeader.biHeight;
bih.biPlanes = 1;
bih.biBitCount = 24; // or whatever host wants
m_pGetFrame = AVIStreamGetFrameOpen(m_pStream, &bih);

Another solution is to just accept that PlayerFF won't work in OpenTZT. Most VJ software doesn't need a player plugin anyway, because it already has an elaborate media player built in. Let's not forget that PlayerFF is primarily designed for use in FFRend!

Another problem: OpenTZT and Flowmotion display PlayerFF's output upside-down, but it looks fine in Resolume and FFRend. Something's pretty wrong there...

plugin ID must be unique in Resolume

It appears that Resolume uses the plugin ID to keep track of its freeframe plugins. It came up because PlayerFF was using FFDemoSrc's plugin ID, and since FFDemoSrc happened to also be in Resolume's plugin folder, selecting PlayerFF actually selected FFDemoSrc instead. Flowmotion doesn't exhibit this behavior. So the freeframe documentation doesn't lie: plugin ID really does need to be unique! One wonders what non-authority is responsible for coordinating this...

PlayerFF: freeframe clip player

I got my standalone freeframe clip player up last night. It's called PlayerFF (OK maybe it needs a better name). It handles AVI/BMP/JPG/GIF, and has three parameters so far:

Clip Select (which clip you're playing)
Pause (0 is play, any other value is pause)
Position (0 is the start of the clip, 1 is the end)

The clips are hard-coded at the moment. :(

Here's what I propose for clip management. The plugin should have both a "Clip Select" and a "Bank Select" parameter. It will look in the magical folder "\My Documents\PlayerFF". Any clips it finds there will wind up in bank zero, UNLESS the magical folder contains an optional playlist file. The playlist file must be called playlist.txt, and it contains the paths of the clips to load, one per line, with optional bank separators (a colon followed by the bank number). Clips are loaded in the order they appear in the playlist, or if there's no playlist, in alphabetical order. For example:

:0
C:\temp\avi files\Night Traffic.avi
C:\temp\avi files\earth1.avi
C:\temp\avi files\Boat Ride to Punta Sal (xvid).avi
C:\temp\avi files\01_24_04-med.avi
:1
C:\temp\avi files\tint.avi
C:\temp\avi files\kissinggirls.avi
C:\Chris\images\debbie\DSC_0080.jpg
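
Since the playlist format is only a proposal at this point, the following is just a rough sketch of how it might be parsed, not PlayerFF's actual code. The names ClipBanks, ParsePlaylist, and PlaylistExample are hypothetical, and a few details are assumptions: blank lines are skipped, clips listed before any separator go into bank zero, and a separator with a negative number is ignored.

```cpp
#include <cassert>
#include <cstdlib>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// bank number -> clip paths, in playlist order
typedef std::map<int, std::vector<std::string> > ClipBanks;

// Parses playlist text: one clip path per line, with optional ":N" bank separators.
ClipBanks ParsePlaylist(std::istream& in)
{
    ClipBanks banks;
    int bank = 0;   // assumption: clips before any separator land in bank zero
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line[line.size() - 1] == '\r')
            line.erase(line.size() - 1);   // tolerate CRLF line endings
        if (line.empty())
            continue;   // assumption: blank lines are skipped
        if (line[0] == ':') {   // bank separator, e.g. ":1"
            int n = std::atoi(line.c_str() + 1);
            if (n >= 0)
                bank = n;   // assumption: negative bank numbers are ignored
        } else {
            banks[bank].push_back(line);   // anything else is a clip path
        }
    }
    return banks;
}

// Usage: two clips in bank 0 and one in bank 1 (the paths are made up)
void PlaylistExample()
{
    std::istringstream in(":0\nC:\\clips\\a.avi\nC:\\clips\\b.avi\n:1\nC:\\clips\\c.jpg\n");
    ClipBanks banks = ParsePlaylist(in);
    assert(banks[0].size() == 2);
    assert(banks[1].size() == 1);
}
```

A drive-letter path such as C:\temp also contains a colon, but never as its first character, so checking only the first character is enough to tell separators from paths.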

plugin and project info can have different parameter counts

I just found a neat bug. I added some parameters to my new PlayerFF plugin, and when I loaded up a FFRend project that uses it, there was garbage in the modulation settings for the new parameters.

It turns out I was assuming that the plugin's number of parameters, and the number of parameters I have information about in the project file, are always the same. That's normally the case of course, but a new version of the plugin with more (or fewer) parameters violates that assumption. Oops.

And the solution:

// the plugin's number of parameters might not match our info's parameter count,
// for example if it's a different version of the plugin; take whichever is less
int rows = min(GetPluginRows(PlugIdx), Info.m_Parm.GetSize());
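
The same idea in a self-contained form, for anyone who wants to try it outside FFRend. This is only a sketch: ModInfo and RestoreModulation are hypothetical stand-ins, since the real project and plugin structures aren't shown here. Only the rows that both sides have are copied; extra plugin parameters keep their defaults, and extra saved rows are ignored.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-parameter modulation settings
struct ModInfo {
    float freq;
    float amp;
    ModInfo() : freq(0), amp(0) {}   // defaults, so new parameters never hold garbage
    ModInfo(float f, float a) : freq(f), amp(a) {}
};

// Copies saved settings into the plugin's rows, touching only rows both sides have.
void RestoreModulation(std::vector<ModInfo>& pluginRows, const std::vector<ModInfo>& savedRows)
{
    // the counts can differ if the plugin version changed; take whichever is less
    std::size_t rows = std::min(pluginRows.size(), savedRows.size());
    assert(rows <= pluginRows.size() && rows <= savedRows.size());
    for (std::size_t i = 0; i < rows; i++)
        pluginRows[i] = savedRows[i];
}
```

If the plugin now has five parameters and the project file only knows about two, the first two rows get the saved values and the other three stay at their defaults.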