I reproduced it on Debian. Something sets CMAKE_BUILD_TYPE to "" after my CMakeLists.txt sets it to Release.
I need to debug it later, but I'm not sure whether I can fix it.
For now, please set CMAKE_BUILD_TYPE to Release explicitly.
i suggest putting it in some database format and dividing the entire package into two sides: a "backend" that generates the data and puts it into the DB, and "frontends" that retrieve the data from the DB and render various representations: plain text, markdown, html, csv...
in particular, a frontend should allow extracting subsets of the data such as "named run", "all lz4*" and so on
i have developed this architecture for my own benchmarking framework
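A minimal sketch of that backend/frontend split, assuming SQLite as the database. The table layout, the function names and the numbers are all hypothetical placeholders, not anything taken from fsbench or from the framework mentioned above:

```python
import sqlite3

def backend_store(conn, rows):
    """Backend side: create the table if needed and insert (run, codec, speed) rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        "run TEXT, codec TEXT, speed_mb_s REAL)"
    )
    conn.executemany("INSERT INTO results VALUES (?, ?, ?)", rows)
    conn.commit()

def frontend_select(conn, run=None, codec_glob=None):
    """Frontend side: fetch a subset, e.g. one named run or every codec matching 'lz4*'."""
    query = "SELECT run, codec, speed_mb_s FROM results WHERE 1=1"
    params = []
    if run is not None:
        query += " AND run = ?"
        params.append(run)
    if codec_glob is not None:
        query += " AND codec GLOB ?"  # SQLite GLOB: case-sensitive, '*' wildcard
        params.append(codec_glob)
    query += " ORDER BY run, codec"
    return conn.execute(query, params).fetchall()

def render_markdown(rows):
    """One possible frontend representation: a markdown table."""
    lines = ["| run | codec | MB/s |", "| --- | --- | --- |"]
    lines += ["| {} | {} | {} |".format(*row) for row in rows]
    return "\n".join(lines)

# Placeholder values only, not measurements.
conn = sqlite3.connect(":memory:")
backend_store(conn, [
    ("run-a", "lz4", 1.0),
    ("run-a", "lz4hc", 2.0),
    ("run-a", "zstd", 3.0),
    ("run-b", "lz4", 4.0),
])
print(render_markdown(frontend_select(conn, codec_glob="lz4*")))
```

Other frontends (plain text, html, csv) would only need their own render function over the same query result.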
does anyone have a compiled windows binary of latest FSbench?
You'll probably want to output more than one format, at least, just so that one can be really simple and fool-proof.
For the really simple one, csv is the least common denominator and is easily importable into any spreadsheet software. Actually, tsv (tab-separated values) has some advantages, because it's easier to parse (as long as tabs don't appear inside columns), and it's about as universal as csv.
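To show how little code tsv needs, here is a rough sketch that writes and parses it with nothing but string splitting. It assumes, as noted above, that no field contains a tab or a newline; the column names and values are made-up placeholders:

```python
def write_tsv(header, rows):
    """Join fields with tabs and records with newlines."""
    lines = ["\t".join(header)]
    lines += ["\t".join(str(field) for field in row) for row in rows]
    return "\n".join(lines) + "\n"

def parse_tsv(text):
    """Split on newlines and tabs; the first line is taken as the header."""
    lines = [line for line in text.splitlines() if line]
    header = lines[0].split("\t")
    return [dict(zip(header, line.split("\t"))) for line in lines[1:]]

# Placeholder data, not benchmark results.
text = write_tsv(("codec", "speed"), [("lz4", 1.5), ("zstd", 2.5)])
print(parse_tsv(text))
```

All parsed values come back as strings, so a consumer would still convert the numeric columns itself.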
Another really good choice would be html: it's really easy to generate, it supports rich formatting, anyone with a browser can view it, and it can be posted directly to the web.
For the extreme fancy stuff, a good choice would be LaTeX. I just started learning some of it for the ability to write properly-formatted math equations, and it isn't that hard (I looked into several ways of writing math and settled on LaTeX; I may write a post about this). It's pretty standard on Linux and OSX systems, but most Windows users won't have it, and it might be difficult to install/use. The advantages are that it's a standard format, it's easy to generate with a computer and also read and edit by hand, there's no limit to its power, and it converts directly to postscript and PDF.
I have intimate experience with Microsoft formats, and the formats used natively by Excel are probably not great choices, unless you are going to integrate with Microsoft's own interop DLLs. The complexity of these formats is too great and they aren't that useful outside of Excel. "XML Spreadsheet 2003" isn't bad, though, if you want to target Excel. I think this is the documentation: http://msdn.microsoft.com/en-us/libr...ice.10%29.aspx
I always take the path of least resistance, which in this case probably means dumping the raw data in easily parseable ASCII, then adding scripts to process and format it as time and energy permit. Having the raw data is the important thing.
Last edited by nburns; 27th June 2014 at 08:24.
@Stephan Busch
Not me
@nburns
I'm not into making changes this big now...I have too little time for that.
I'm bitten by the same CMAKE_BUILD_TYPE being set to empty string issue, silently causing me to compile unoptimized versions of the benchmark. Maybe as a workaround for this issue you can just add a note to the BUILDING file about running cmake as follows:
cmake -DCMAKE_BUILD_TYPE=Release .
so other people are less likely to run into it? This small issue affects the otherwise simple and effective build for this component, and I think most people won't even notice their numbers are "off".
When running the "fast" codec list against enwik8 on the newest version of fsbench on OSX, I get the following error:
./fsbench fast ../../corpus/enwik8
<other results omitted>
wfLZ r10
fsbench(50759) malloc: *** error for object 0x7fa57940b508: incorrect checksum for freed object - object was probably modified after being freed.
*** set a breakpoint in malloc_error_break to debug
Abort trap: 6
I guess wfLZ is doing something naughty. How do I remove it from the list of "fast" codecs?
Last edited by PSHUFB; 5th March 2015 at 07:06.
To answer my own question about disabling wfLZ, just comment out line 573 in codecs.hpp:
// make_pair(raw_find_codec("wfLZ"), ""),
If it fails I suggest that you disable compiling it altogether.
You can do it in CMakeLists.txt, change
Code:
set(USE_WFLZ 1)
to
Code:
set(USE_WFLZ 0)
If you really want to remove it just from that list, in codecs.cpp change
Code:
static const pair<Codec*, const string> fast_compressors[] =
{
    make_pair(raw_find_codec("bcl-rle"), ""),
    make_pair(raw_find_codec("blosc"), ""),
    make_pair(raw_find_codec("density::chameleon"), ""),
    make_pair(raw_find_codec("density::mandala"), ""),
    make_pair(raw_find_codec("fastlz"), ""),
    make_pair(raw_find_codec("lrrle"), ""),
    make_pair(raw_find_codec("LZ4"), ""),
    make_pair(raw_find_codec("LZF"), "very"),
    make_pair(raw_find_codec("LZJB"), ""),
    make_pair(raw_find_codec("LZO"), ""),
    make_pair(raw_find_codec("QuickLZ"), ""),
    make_pair(raw_find_codec("RLE64"), ""),
    make_pair(raw_find_codec("Shrinker"), ""),
    make_pair(raw_find_codec("Snappy"), ""),
    make_pair(raw_find_codec("wfLZ"), ""),
    make_pair(raw_find_codec("ZSTD"), "")
};
MKLIST(FAST_COMPRESSORS, fast_compressors);
to
Code:
static const pair<Codec*, const string> fast_compressors[] =
{
    make_pair(raw_find_codec("bcl-rle"), ""),
    make_pair(raw_find_codec("blosc"), ""),
    make_pair(raw_find_codec("density::chameleon"), ""),
    make_pair(raw_find_codec("density::mandala"), ""),
    make_pair(raw_find_codec("fastlz"), ""),
    make_pair(raw_find_codec("lrrle"), ""),
    make_pair(raw_find_codec("LZ4"), ""),
    make_pair(raw_find_codec("LZF"), "very"),
    make_pair(raw_find_codec("LZJB"), ""),
    make_pair(raw_find_codec("LZO"), ""),
    make_pair(raw_find_codec("QuickLZ"), ""),
    make_pair(raw_find_codec("RLE64"), ""),
    make_pair(raw_find_codec("Shrinker"), ""),
    make_pair(raw_find_codec("Snappy"), ""),
    make_pair(raw_find_codec("ZSTD"), "")
};
MKLIST(FAST_COMPRESSORS, fast_compressors);
EDIT: I was 3 minutes too slow.
Thanks m^2 (sorry for wasting your time on the removal question). I will see if I can file a bug report against wfLZ!
Hi,
I downloaded the fsbench source from https://github.com/Ahti/fsbench.
Compiled successfully.
While running zopfli, (./fsbench zopfli test_data), I encounter the following error:
ERROR: zopfli is just an encoder.
Combine it with some decoder to test it.
If you don't want one, combine it with nop.
Tried with "./fsbench zopfli nop test_data", still no luck!
Can anyone please suggest the correct way to combine one encoder with another decoder?
Platform:
X86_64 GNU/Linux
"./fsbench zopfli/zlib test_data"
Saswat (12th June 2015)
Thanks m^2!
But the results I get with my Intel Xeon CPU are far lower than what is quoted in results.ods.
Were there any additional optimization flags or run-time options used for taking the numbers?
Example:
for "FNV1a-Jesteress(version: 16-6-2013)", the quoted speed is 27.29 GB/s, whereas I got just 5.4 GB/s !
My setup details:
Intel(R) Xeon(R) CPU E5-1607 v2 @ 3.00GHz
gcc version : 4.8.2 (-fno-tree-vectorize -O3)
No. of cores : 4
command: ./fsbench FNV1a-Jesteress -i10 -s1000 test_random_1MB
Note: the tests were run on an isolated system with no other jobs running except the benchmark.
There were no options not mentioned in the odf and you seem to have them all right.
Look into CMakeCache.txt and ensure that:
* CMAKE_BUILD_TYPE is "Release"
* CMAKE_C_FLAGS_RELEASE contains -O3 -DNDEBUG
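If you'd rather not eyeball the file, a small script can do the two checks above. This is only a sketch: it assumes the usual `NAME:TYPE=VALUE` layout of CMakeCache.txt entries, and the function names are hypothetical helpers, not part of fsbench or CMake:

```python
def read_cmake_cache(text):
    """Parse CMakeCache.txt content into a {name: value} dict."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the '#' and '//' comment lines CMake writes.
        if not line or line.startswith(("#", "//")):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        name = key.split(":", 1)[0]  # drop the ':TYPE' suffix
        entries[name] = value
    return entries

def check_release(entries):
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    build_type = entries.get("CMAKE_BUILD_TYPE", "")
    if build_type != "Release":
        problems.append("CMAKE_BUILD_TYPE is %r, expected 'Release'" % build_type)
    flags = entries.get("CMAKE_C_FLAGS_RELEASE", "").split()
    for wanted in ("-O3", "-DNDEBUG"):
        if wanted not in flags:
            problems.append("CMAKE_C_FLAGS_RELEASE lacks " + wanted)
    return problems

# Sample content for illustration; a real run would read the file from the build directory.
sample = """\
// Choose the type of build
CMAKE_BUILD_TYPE:STRING=
CMAKE_C_FLAGS_RELEASE:STRING=-O3 -DNDEBUG
"""
print(check_release(read_cmake_cache(sample)))
```

For a real build directory, something like `check_release(read_cmake_cache(open("CMakeCache.txt").read()))` should work.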
Last edited by m^2; 12th June 2015 at 22:15.
With the following flags, I am able to get a maximum of 5.6 GB/s with single thread (default configuration).
The speed increases up to 21 GB/s with 4 threads.
But since no "-tX" switch is mentioned in the .ods file, I assume the published numbers were observed with a single thread only.
CMAKE_BUILD_TYPE=Release
CMAKE_C_FLAGS_RELEASE=-O3 -DNDEBUG
C_FLAGS = -fno-tree-vectorize -static -DNDEBUG -O3
compilation : cmake -DCMAKE_BUILD_TYPE=Release .
execution : ./fsbench FNV1a-Jesteress -b131072 -i10 -s1000 -t4 ../test_data/random_1MB
BTW, my version of "FNV1a-Jesteress" is : FNV1a-Jesteress 2013-06-16
the one mentioned in the ods is : FNV1a-Jesteress 2013-05-12
I don't see that as the cause of the lower performance though.
A few more clarifications, please:
1. The ods file also reports Speed in terms of ticks/B. How do I get that number?
2. The C.Ratio (compression ratio?) always is 1.000. Can that be configured?
3. E.Eff and D.Eff show hexa numbers, example: 16e15.
How should I interpret that?
Last edited by Saswat; 15th June 2015 at 15:34.
Oh well, I see no reason for slowness. Maybe you can try to write a trivial benchmark to be able to tell for sure if fsbench is in the right ballpark for your platform?
1. Pure calculation. I use -c to get raw results as csv and work on that.
2. No, it's calculated. A checksum is just a transform for fsbench. It doesn't compress data, so it gets a ratio of about 1 (actually it makes the data longer by the digest size).
3. This is scientific notation: 16e15 means 16 * 10^15. I should do something to make it clearer.
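A quick way to convince yourself about the notation (the variable names here are just for illustration): reading the same string as a decimal exponent and as hex gives very different numbers.

```python
value = float("16e15")        # "e" read as a decimal exponent: 16 * 10**15
as_hex = int("16e15", 16)     # the same text misread as a hexadecimal integer

print("scientific:", value)
print("hex misreading:", as_hex)
```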
Saswat (16th June 2015)
Thanks M^2 ! That definitely improved my understanding.
I tried compiling fsbench on Ubuntu (3.13.0-32-generic #57-Ubuntu x86_64 GNU/Linux) and it compiled fine.
But now when I try to compile it on FC21 (3.19.3-200.fc21.x86_64 x86_64 GNU/Linux), it fails with the log below:
Linking CXX executable fsbench
/usr/bin/ld: cannot find -lrt
/usr/bin/ld: cannot find -lpthread
/usr/bin/ld: cannot find -lstdc++
/usr/bin/ld: cannot find -lm
/usr/bin/ld: cannot find -lc
collect2: error: ld returned 1 exit status
CMakeFiles/fsbench.dir/build.make:4735: recipe for target 'fsbench' failed
make[2]: *** [fsbench] Error 1
CMakeFiles/Makefile2:91: recipe for target 'CMakeFiles/fsbench.dir/all' failed
make[1]: *** [CMakeFiles/fsbench.dir/all] Error 2
Makefile:76: recipe for target 'all' failed
make: *** [all] Error 2
In CMakeLists.txt, if I replace
ELSEIF (UNIX)
target_link_libraries (fsbench -lrt -lpthread)
WITH
ELSEIF (UNIX)
set(TARGET_LINK_LIBRARIES "fsbench -lrt -lpthread -lstdc++ -lm -lc")
then the -lrt and -lpthread errors vanish, but the remaining 3 libraries still don't get linked.
Setup where it fails:
gcc version: gcc (GCC) 4.9.2 20150212 (Red Hat 4.9.2-6)
cmake version : 3.0.2
OS: 3.19.3-200.fc21.x86_64 x86_64 GNU/Linux
LDD version: ldd (GNU libc) 2.20
NOTE: I cmaked with -DCMAKE_BUILD_TYPE=Release
Last edited by Saswat; 16th June 2015 at 16:33.
I'm not good at debugging Linux toolchain issues; I suggest that you ask your OS provider.
Found the "-static" flag to be causing the linker errors.
My lib64 had only the .so libraries and not the static .a ones.
So removing the -static flag from CMakeLists.txt solves the linker issues.
Tried on fc20 / fc21 / fc22. Works cool.
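For anyone hitting the same thing, a rough way to see up front whether -static can work is to look for the .a archives the linker complained about. This is only a sketch: the helper name is hypothetical, and the directory list is an assumption to adjust per distribution (libstdc++.a in particular may sit in a compiler-specific directory rather than in lib64):

```python
import os

def missing_static_libs(lib_dirs, names=("rt", "pthread", "stdc++", "m", "c")):
    """Return the lib<name>.a archives that are not found in any of lib_dirs."""
    missing = []
    for name in names:
        archive = "lib" + name + ".a"
        found = any(os.path.isfile(os.path.join(d, archive)) for d in lib_dirs)
        if not found:
            missing.append(archive)
    return missing

# Assumed search path; not exhaustive.
print(missing_static_libs(["/usr/lib64", "/usr/lib"]))
```

If the list comes back non-empty, either install the static library packages or drop -static as described above.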
Now got to analyse the cause for low performance on my Intel Xeon CPUs.
Can you please share how to calculate the speed in terms of ticks/B ?
if your cpu runs, for example, at 4000 MHz while performing the operation, each cpu tick is 1/4 ns. so if compression runs at 200 MB/s then each byte requires 4000/200=20 cpu ticks
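The same arithmetic as a tiny sketch, in case it helps. The function names are made up for illustration, and MB is taken as 10^6 bytes to match the round numbers in the example:

```python
def ticks_per_byte(cpu_hz, bytes_per_second):
    """CPU ticks spent per byte: clock frequency divided by throughput."""
    return cpu_hz / bytes_per_second

def bytes_per_second(cpu_hz, ticks_per_b):
    """The inverse: throughput implied by a given ticks/B figure."""
    return cpu_hz / ticks_per_b

# 4000 MHz CPU compressing at 200 MB/s
print(ticks_per_byte(4000e6, 200e6))   # → 20.0
```

Note that the clock frequency has to be the one the CPU actually ran at during the test, which may differ from the nominal one.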
Saswat (23rd June 2015)
Taking the example of xxhash (sheet 5 of results.ods):
Frequency is 2 GHz.
Speed is 29.12 GB/s
So, as per the calculation, each byte should take 2 / 29.12 ticks = about 0.07 ticks.
But the noted speed is 0.51 ticks/B ! How's that?
Last edited by Saswat; 23rd June 2015 at 13:33.
Or is the speed in Giga "bits" per sec and not giga "bytes" per sec?
Last edited by Saswat; 23rd June 2015 at 16:00.
There is a discrepancy between the performance numbers quoted in results.ods and the actual numbers.
Can anyone please clarify:
1. In results.ods (for Intel Xeon CPUs), are the performance numbers expressed in gigabits per second (Gb/s) or gigabytes per second (GB/s)?
2. With the latest snapshot of FSbench (https://github.com/Ahti/fsbench), is the speed expressed in Gb/s or GB/s?
I assume results.ods uses Gb/s and the benchmark displays GB/s.
Please clarify.
you should ask the creator of this ods. from official results, xxhash is 0.5 cpb = 6 GB/s on a 3 GHz cpu
I need to take a closer look to answer that. I probably made something wrong. Maybe I actually used all cores and didn't mark it? Maybe I had some math wrong?
I'll give you the answer as soon as I can, not yet, sorry.
Thank you for spotting it.
Last edited by m^2; 26th June 2015 at 23:47.
In the sequoia part I have the exact formula that calculates ticks/B entered:
Code:
=IF(H2;666.66667*1000*1000/H2/1024/1024;"??")
So it's CPU frequency divided by speed * unit conversion.
If I take this formula, apply it to the E5335 and shuffle it a bit, I can calculate the CPU frequency from the speed in GB/s and ticks/B.
It's 15.41-15.85 GHz.
That machine had 8 cores, 2 GHz each.
My hypothesis right now is that I measured speed twice, with -t1 and -t8. GB/s is from -t8 and ticks/B from -t1.
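Here is that "shuffle" as a sketch, so anyone can repeat it on other rows. It inverts ticks/B = frequency / speed, and it assumes GB/s in the sheet means 2^30 bytes per second, in line with the 1024-based conversion in the spreadsheet formula. The function name is hypothetical; the inputs are the xxhash figures quoted earlier in the thread:

```python
GIB = 1024 ** 3  # assumption: "GB/s" means 2**30 bytes per second

def implied_cpu_hz(ticks_per_byte, speed_gib_per_s):
    """Frequency that a (ticks/B, GB/s) pair implies if both came from one run."""
    return ticks_per_byte * speed_gib_per_s * GIB

# xxhash row: 0.51 ticks/B at 29.12 GB/s
hz = implied_cpu_hz(0.51, 29.12)
print("implied frequency: %.2f GHz" % (hz / 1e9))
print("equivalent 2 GHz cores: %.1f" % (hz / 2e9))
```

For this row the result comes out close to 16 GHz, i.e. roughly eight 2 GHz cores, which fits the -t1 / -t8 hypothesis but doesn't prove it.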
I'll try to get access to that machine again and verify it, but no promises. And sorry for the confusion.
i had this idea but ARK says that this CPU has 4 cores. your file never mentions that this box has two cpus
The default configuration is single core (-t1).
If the published results are with 8 cores (-t8), then maybe that needs to be specified explicitly in the results file.