[NOTE: this bug also occurs in PeteRadialBlur, PeteSpiralBlur, and PeteTimeBlur. For more info see here.]
The memory violation only occurs with the MMX version of the plugin. Interestingly, the non-MMX version of PeteMixer also fails, but in a different way: it always outputs black.
In the non-MMX version, the output colors are shifted right 16 when 8 was intended.
// nOutputBlue>>=16; // ck: shifting twice as much as needed
nOutputBlue>>=8;
The MMX error is the use of movq when movd was intended. The output pointer pCurrentOutput is a 32-bit pointer, and it's being incremented by one (i.e. four bytes) for each iteration, which means that on the last iteration, writing 64 bits to *pCurrentOutput trashes the first two bytes of whatever happens to be above the output buffer in memory.
// movq [esi],mm7 // ck: TRASHES 2 bytes beyond output buffer
movd [esi],mm7 // ck: 32-bit move
The input pointers are also using movq, and while this doesn't overwrite memory it could cause a protection violation.
Theoretically the MMX version also uses more memory bandwidth than it needs to. I benchmarked it, and found that the corrected version is indeed slightly faster. For the original, 1000 calls to ProcessFrameCopy on a 640 x 480 frame took an average of 15.304 ms per frame, while the corrected version took an average of 14.820 per frame: a gain of 3%. I speculate that the gain is so small because the needless memory operations always hit L2 cache.
Heap-trashing bugs are notoriously difficult to find, and this one was no exception. They don't always cause a crash, and even when they do, it typically happens much later, in some unrelated component. In fact it was precisely this symptom--bizarre failures in things that never failed before--that pointed me in the right direction.
One disadvantage of writing such nice free plugins is that they're everywhere. I shudder to think how many mysterious crashes have been unjustly blamed on host applications. Ah, the joys and perils of inline assembler.
You can download an UNOFFICIAL patched DLL here. For more info see the patched source.
1 comment:
Hi Chris
Good work finding this, let's get this tested on freeframe-develop then we can get a general release out for your fix.
all the best
Russell
www.freeframe.org
www.vjamm.com
Post a Comment