I made a new tool, similar to what I had in DOS times -
http://shelwien.googlepages.com/windpmi.png
It lets you trace compression progress without modifying
the executable, and even wrapping the executable calls
(like with timers) is not necessary, which is convenient for
batch tests.
The method is simple - a replacement kernel32.dll is created,
which forwards most functions to the real one.
Windows also has a feature that lets an executable use specific
local dlls - eg. if we create a "ccm.exe.local" folder and put
a replacement kernel32.dll there, it would be used instead of
the original one.
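For batch tests, the folder setup can be scripted. Here is a minimal sketch in Python; the helper name install_local_dll and its arguments are hypothetical, and it only prepares the "<exe>.local" folder described above. It does not build the replacement dll itself.

```python
import shutil
from pathlib import Path


def install_local_dll(exe_path, replacement_dll):
    """Create '<exe name>.local' next to the executable and copy the dll into it.

    Hypothetical helper: both arguments are paths chosen by the caller.
    Returns the path of the copied dll.
    """
    exe = Path(exe_path)
    # ccm.exe -> ccm.exe.local, in the same directory as the executable
    local_dir = exe.with_name(exe.name + ".local")
    local_dir.mkdir(exist_ok=True)
    target = local_dir / "kernel32.dll"
    shutil.copyfile(replacement_dll, target)
    return target
```

Something like install_local_dll("ccm.exe", "build/kernel32.dll") would then be run once per tested executable; the "build/" path is only an example.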
Then, ReadFile and WriteFile calls are "extended" with data size
accumulation, and a timer thread is added, which writes the trace log.
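The accumulation and timer thread idea can be illustrated outside of Windows too. The following is only a Python analogy, not the actual dll code: IoTrace and its methods are made-up names, the read/write wrappers stand in for the extended ReadFile/WriteFile, and samples are kept in a list instead of being written to a log file.

```python
import threading
import time


class IoTrace:
    """Counts bytes passing through read()/write() and samples the totals on a timer."""

    def __init__(self, interval=1.0):
        self.interval = interval      # seconds between samples
        self.bytes_read = 0
        self.bytes_written = 0
        self.samples = []             # (elapsed seconds, bytes read, bytes written)
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._t0 = None

    def read(self, f, size):
        """Stand-in for the extended ReadFile: forward the call, then count the bytes."""
        data = f.read(size)
        with self._lock:
            self.bytes_read += len(data)
        return data

    def write(self, f, data):
        """Stand-in for the extended WriteFile (assumes a binary file object)."""
        written = f.write(data)
        with self._lock:
            self.bytes_written += written
        return written

    def _sample(self):
        with self._lock:
            elapsed = time.monotonic() - self._t0
            self.samples.append((elapsed, self.bytes_read, self.bytes_written))

    def _run(self):
        # Event.wait returns False on timeout, True once stop() was called
        while not self._stop.wait(self.interval):
            self._sample()

    def start(self):
        self._t0 = time.monotonic()
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()
        self._sample()                # final sample, so the totals are always recorded
```

The program under test would call trace.read(f, n) and trace.write(f, data) in place of the plain file calls; with the dll replacement that substitution happens without touching the program at all.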
I think this is very useful for compressor analysis, as we can
clearly see some behavior patterns (eg. in the image linked above,
DC is generally faster than YBS, but slows down greatly specifically
on redundant data (pic) - we wouldn't know that just by comparing
results on concatenated CCC.)
So, does anybody want to use this for a benchmark site or something?
The main problem is that some custom batch plotting tool is necessary for
good visualization.
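As a rough idea of how small such a tool could be, here is a sketch in Python that turns one trace into an SVG image using only the standard library. The sample layout (seconds, memory, bytes read, bytes written) is an assumption made for this example, not the actual log format of the tool, and the function name is made up; all three curves share one vertical scale.

```python
def trace_to_svg(samples, width=640, height=480):
    """Render (seconds, memory, bytes_read, bytes_written) samples as an SVG string.

    Assumed sample layout, see the note above. Time runs left to right,
    volume bottom to top.
    """
    t_max = max(s[0] for s in samples) or 1.0       # avoid division by zero
    v_max = max(max(s[1:]) for s in samples) or 1.0
    colors = ("red", "green", "blue")               # memory, read, written
    lines = []
    for column, color in enumerate(colors, start=1):
        points = " ".join(
            f"{s[0] / t_max * width:.1f},{height - s[column] / v_max * height:.1f}"
            for s in samples
        )
        lines.append(f'<polyline fill="none" stroke="{color}" points="{points}"/>')
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
        + "".join(lines)
        + "</svg>"
    )
```

A batch run would then just loop over the trace files, parse each into a list of samples and write the returned string to a .svg file next to it.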
Here are some example traces made while compressing enwik8
(red = memory used, green = bytes read, blue = bytes written)
(ppmd memory size is in kilobytes, others in bytes;
horizontal axis is time in seconds, vertical is volume in bytes)
http://shelwien.googlepages.com/bcm8e25.png
http://shelwien.googlepages.com/ccm_5.png
Here we can see that CCM spends a noticeable amount of
time on memory initialization without any i/o.
http://shelwien.googlepages.com/ppmd_o12_m256_r1.png
Here we can see that ppmd really frees its memory :)