
Thread: Filesystem benchmark

  1. #211
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,610
    Thanks
    30
    Thanked 65 Times in 47 Posts
I reproduced it on Debian. It sets CMAKE_BUILD_TYPE to "" after my CMakeLists.txt sets it to Release.
    Need to debug it later, but I'm not sure if it's possible for me to fix it.

    For now, please set CMAKE_BUILD_TYPE to Release explicitly.
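One way to guard against this on the build side (a sketch of a common CMake pattern, not the actual fsbench CMakeLists.txt) is to default an empty CMAKE_BUILD_TYPE to Release:

```cmake
# If the build type was left unset (or reset to the empty string),
# fall back to Release so optimized flags (-O3 -DNDEBUG) are used.
if(NOT CMAKE_BUILD_TYPE)
  set(CMAKE_BUILD_TYPE Release CACHE STRING "Build type" FORCE)
endif()
```

With this guard in place, a plain `cmake .` behaves like `cmake -DCMAKE_BUILD_TYPE=Release .`, while an explicit `-DCMAKE_BUILD_TYPE=Debug` still wins.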

  2. #212
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,593
    Thanks
    801
    Thanked 698 Times in 378 Posts
    Quote Originally Posted by m^2 View Post
    I'd rather keep it in some format richer than csv because this enables me to add some features like...
    i suggest you put it in some database format and divide the entire package into two sides - "backend" generates the data and puts it into the DB, and "frontends" retrieve the data from the DB and render various representations: plain text, markdown, html, csv...

    in particular, the frontend should allow extracting subsets of data such as "named run", "all lz4*" and so on

    i have developed this architecture for my own benchmarking framework

  3. #213
    Tester
    Stephan Busch's Avatar
    Join Date
    May 2008
    Location
    Bremen, Germany
    Posts
    876
    Thanks
    474
    Thanked 175 Times in 85 Posts
    does anyone have a compiled windows binary of the latest FSbench?

  4. #214
    Member
    Join Date
    Feb 2013
    Location
    San Diego
    Posts
    1,057
    Thanks
    54
    Thanked 72 Times in 56 Posts
    Quote Originally Posted by Cyan View Post
    I couldn't open the file "results.ods".
    Neither MS Office nor Google docs would accept it.
    I guess Open Office might do the trick, but it's not installed on my system.
    If you wish your document to be easily read by anyone, I would suggest using a more widespread format, typically .csv.

    It's always good to have more reference points.
    I feel it's a good idea to report performance on a wide range of systems.
    Most of them are not easily accessible by regular programmers, so it will help.
    Quote Originally Posted by m^2 View Post
    Thanks for feedback.
    I'd rather keep it in some format richer than csv because this enables me to add some features like:
    * colouring top results
    * charts
    * separation of source and presented data (far future)
    * integrated macro suite to make colouring / separation from the previous point automated (and this is a thing that can be run offline, so users won't have to run them)

    However, I may push it in several formats simultaneously, so there's minimal trouble caused.
    BTW, does Google have trouble only with this spreadsheet, or did they drop odf support?
    You'll probably want to output more than one format, at least, just so that one can be really simple and fool-proof.

    For the really simple one, csv is the least common denominator and is easily importable into any spreadsheet software. Actually, tsv (tab-separated values) has some advantages: it's easier to parse (as long as tabs don't appear inside columns) and about as universal as csv.
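As an illustration of how little code a TSV consumer needs, here is a minimal C++ sketch (the sample line below is made up for the example) that splits one line on tabs:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Split a single TSV line into fields. This is safe as long as
// fields themselves never contain tab characters.
std::vector<std::string> split_tsv(const std::string& line)
{
    std::vector<std::string> fields;
    std::istringstream in(line);
    std::string field;
    while (std::getline(in, field, '\t'))
        fields.push_back(field);
    return fields;
}
```

For example, `split_tsv("LZ4\t421.9\t2.101")` yields the three fields `{"LZ4", "421.9", "2.101"}` with no quoting or escaping rules to worry about.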

    Another really good choice would be html: it's really easy to generate, it supports rich formatting, anyone with a browser can view it, and it can be posted directly to the web.

    For the extreme fancy stuff, a good choice would be LaTeX. I just started learning some of it for the ability to write properly-formatted math equations, and it isn't that hard (I looked into several ways of writing math and settled on LaTeX; I may write a post about this). It's pretty standard on Linux and OSX systems, but most Windows users won't have it, and it might be difficult to install/use. The advantages are that it's a standard format, it's easy to generate with a computer and also read and edit by hand, there's no limit to its power, and it converts directly to postscript and PDF.

    I have intimate experience with Microsoft formats, and the formats used natively by Excel are probably not great choices, unless you are going to integrate with Microsoft's own interop DLLs. The complexity of these formats is too great and they aren't that useful outside of Excel. "XML Spreadsheet 2003" isn't bad, though, if you want to target Excel. I think this is the documentation: http://msdn.microsoft.com/en-us/libr...ice.10%29.aspx

    I always take the path of least resistance, which, in this case, looks like probably dumping the raw data in easily-parseable ASCII, then adding-on scripts to process and add formatting as time and energy permit. Having the raw data is the important thing.
    Last edited by nburns; 27th June 2014 at 08:24.

  5. #215
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,610
    Thanks
    30
    Thanked 65 Times in 47 Posts
    @Stephan Busch
    Not me
    @nburns
    I'm not into making changes this big now...I have too little time for that.

  6. #216
    Member
    Join Date
    Feb 2014
    Location
    Canada
    Posts
    17
    Thanks
    23
    Thanked 0 Times in 0 Posts
    I'm bitten by the same issue of CMAKE_BUILD_TYPE being set to the empty string, which silently caused me to compile unoptimized versions of the benchmark. Maybe as a workaround you can just add a note to the BUILDING file about running cmake as follows:

    cmake -DCMAKE_BUILD_TYPE=Release .

    so other people are less likely to run into it? This small issue affects the otherwise simple and effective build for this component, and I think most people won't even notice their numbers are "off".

  7. #217
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,610
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Done.

  8. Thanks (2):

    Cyan (14th July 2014),PSHUFB (14th July 2014)

  9. #218
    Member
    Join Date
    Feb 2014
    Location
    Canada
    Posts
    17
    Thanks
    23
    Thanked 0 Times in 0 Posts
    When running the "fast" codec list against enwik8 on the newest version of fsbench on OSX, I get the following error:


    ./fsbench fast ../../corpus/enwik8
    <other results omitted>
    wfLZ r10
    fsbench(50759) malloc: *** error for object 0x7fa57940b508: incorrect checksum for freed object - object was probably modified after being freed.
    *** set a breakpoint in malloc_error_break to debug
    Abort trap: 6


    I guess wfLZ is doing something naughty. How do I remove it from the list of "fast" codecs?
    Last edited by PSHUFB; 5th March 2015 at 07:06.

  10. #219
    Member
    Join Date
    Feb 2014
    Location
    Canada
    Posts
    17
    Thanks
    23
    Thanked 0 Times in 0 Posts
    To answer my own question about disabling wfLZ, just comment out line 573 in codecs.hpp:

    // make_pair(raw_find_codec("wfLZ"), ""),

  11. #220
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,610
    Thanks
    30
    Thanked 65 Times in 47 Posts
    If it fails I suggest that you disable compiling it altogether.
    You can do it in CMakeLists.txt, change
    Code:
    set(USE_WFLZ              1)
    to
    Code:
    set(USE_WFLZ              0)
    If you really want to remove it just from that list, in codecs.cpp change
    Code:
    static const pair<Codec*, const string> fast_compressors[] =
        { make_pair(raw_find_codec("bcl-rle"), ""),
          make_pair(raw_find_codec("blosc"), ""),
          make_pair(raw_find_codec("density::chameleon"), ""),
          make_pair(raw_find_codec("density::mandala"), ""),
          make_pair(raw_find_codec("fastlz"), ""),
          make_pair(raw_find_codec("lrrle"), ""),
          make_pair(raw_find_codec("LZ4"), ""),
          make_pair(raw_find_codec("LZF"), "very"),
          make_pair(raw_find_codec("LZJB"), ""),
          make_pair(raw_find_codec("LZO"), ""),
          make_pair(raw_find_codec("QuickLZ"), ""),
          make_pair(raw_find_codec("RLE64"), ""),
          make_pair(raw_find_codec("Shrinker"), ""),
          make_pair(raw_find_codec("Snappy"), ""),
          make_pair(raw_find_codec("wfLZ"), ""),
          make_pair(raw_find_codec("ZSTD"), "") };
    MKLIST(FAST_COMPRESSORS, fast_compressors);
    to
    Code:
    static const pair<Codec*, const string> fast_compressors[] =
        { make_pair(raw_find_codec("bcl-rle"), ""),
          make_pair(raw_find_codec("blosc"), ""),
          make_pair(raw_find_codec("density::chameleon"), ""),
          make_pair(raw_find_codec("density::mandala"), ""),
          make_pair(raw_find_codec("fastlz"), ""),
          make_pair(raw_find_codec("lrrle"), ""),
          make_pair(raw_find_codec("LZ4"), ""),
          make_pair(raw_find_codec("LZF"), "very"),
          make_pair(raw_find_codec("LZJB"), ""),
          make_pair(raw_find_codec("LZO"), ""),
          make_pair(raw_find_codec("QuickLZ"), ""),
          make_pair(raw_find_codec("RLE64"), ""),
          make_pair(raw_find_codec("Shrinker"), ""),
          make_pair(raw_find_codec("Snappy"), ""),
          make_pair(raw_find_codec("ZSTD"), "") };
    MKLIST(FAST_COMPRESSORS, fast_compressors);
    EDIT: I was 3 minutes too slow.

  12. #221
    Member
    Join Date
    Feb 2014
    Location
    Canada
    Posts
    17
    Thanks
    23
    Thanked 0 Times in 0 Posts
    Thanks m^2 (sorry for wasting your time on the removal). I will see if I can file a bug report against wfLZ!

  13. #222
    Member
    Join Date
    Jun 2015
    Location
    Bangalore
    Posts
    13
    Thanks
    3
    Thanked 0 Times in 0 Posts
    Hi,

    I downloaded the fsbench source from https://github.com/Ahti/fsbench.
    Compiled successfully.
    While running zopfli, (./fsbench zopfli test_data), I encounter the following error:

    ERROR: zopfli is just an encoder.
    Combine it with some decoder to test it.
    If you don't want one, combine it with nop.

    Tried with "./fsbench zopfli nop test_data", still no luck!
    Can anyone please suggest the correct way to combine one encoder with another decoder?

    Platform:
    X86_64 GNU/Linux

  14. #223
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,610
    Thanks
    30
    Thanked 65 Times in 47 Posts
    "./fsbench zopfli/zlib test_data"

  15. Thanks:

    Saswat (12th June 2015)

  16. #224
    Member
    Join Date
    Jun 2015
    Location
    Bangalore
    Posts
    13
    Thanks
    3
    Thanked 0 Times in 0 Posts
    Thanks m^2!

    But the results I get with my Intel Xeon CPU are way lower than what is quoted in results.ods.
    Were there any additional optimization flags or run-time options used for taking the numbers?

    Example:
    for "FNV1a-Jesteress(version: 16-6-2013)", the quoted speed is 27.29 GB/s, whereas I got just 5.4 GB/s !

    My setup details:

    Intel(R) Xeon(R) CPU E5-1607 v2 @ 3.00GHz

    gcc version : 4.8.2 (-fno-tree-vectorize -O3)
    No. of cores : 4

    command: ./fsbench FNV1a-Jesteress -i10 -s1000 test_random_1MB

    Note: the tests were run on an isolated system with no other jobs running except the benchmark.

  17. #225
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,610
    Thanks
    30
    Thanked 65 Times in 47 Posts
    There were no options not mentioned in the odf and you seem to have them all right.
    Look into CMakeCache.txt and ensure that:
    * CMAKE_BUILD_TYPE is "Release"
    * CMAKE_C_FLAGS_RELEASE contains -O3 -DNDEBUG
    Last edited by m^2; 12th June 2015 at 22:15.

  18. #226
    Member
    Join Date
    Jun 2015
    Location
    Bangalore
    Posts
    13
    Thanks
    3
    Thanked 0 Times in 0 Posts
    With the following flags, I am able to get a maximum of 5.6 GB/s with a single thread (default configuration).
    The speed increases up to 21 GB/s with 4 threads.
    But since no "-tX" switch is mentioned in the .ods file, I assume the published numbers were observed with single thread only.

    CMAKE_BUILD_TYPE=Release
    CMAKE_C_FLAGS_RELEASE=-O3 -DNDEBUG
    C_FLAGS = -fno-tree-vectorize -static -DNDEBUG -O3

    compilation : cmake -DCMAKE_BUILD_TYPE=Release .
    execution : ./fsbench FNV1a-Jesteress -b131072 -i10 -s1000 -t4 ../test_data/random_1MB

    BTW, my version of "FNV1a-Jesteress" is: FNV1a-Jesteress 2013-06-16
    the one mentioned in the ods is: FNV1a-Jesteress 2013-05-12
    I don't see that as the cause for less performance though.

    A few more clarifications, please:
    1. The ods file also reports Speed in terms of ticks/B. How do I get that number?
    2. The C.Ratio (compression ratio?) always is 1.000. Can that be configured?
    3. E.Eff and D.Eff show hexa numbers, example: 16e15.
    How should I interpret that?
    Last edited by Saswat; 15th June 2015 at 15:34.

  19. #227
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,610
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Quote Originally Posted by Saswat View Post
    With the following flags, I am able to get a maximum of 5.6 GB/s with single thread (default configuration).
    The speed increases upto 21 GB/s with 4 threads.
    But since no "-tX" switch is mentioned in the .ods file, I assume the published numbers were observed with single thread only.

    CMAKE_BUILD_TYPE=Release
    CMAKE_C_FLAGS_RELEASE=-O3 -DNDEBUG
    C_FLAGS = -fno-tree-vectorize -static -DNDEBUG -O3

    compilation : cmake -DCMAKE_BUILD_TYPE=Release .
    execution : ./fsbench FNV1a-Jesteress -b131072 -i10 -s1000 -t4 ../test_data/random_1MB

    BTW, my version of "FNV1a-Jesteress" is: FNV1a-Jesteress 2013-06-16
    the one mentioned in the ods is: FNV1a-Jesteress 2013-05-12
    I don't see that as the cause for less performance though.
    Oh well, I see no reason for the slowness. Maybe you can write a trivial benchmark to tell for sure whether fsbench is in the right ballpark for your platform?

    Quote Originally Posted by Saswat View Post
    A few more clarifications, please:
    1. The ods file also reports Speed in terms of ticks/B. How do I get that number?
    Pure calculation. I use -c to get raw results as csv and work on that.
    Quote Originally Posted by Saswat View Post
    2. The C.Ratio (compression ratio?) always is 1.000. Can that be configured?
    No, it's calculated. Checksum is just a transform for fsbench. It doesn't compress data and gets ratio of about 1 (actually it makes data longer by the digest size).
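    As a worked example (the 1 MiB block size and 8-byte digest below are illustrative, not fsbench's actual defaults), the ratio of a checksum "transform" sits just under 1:

```cpp
// A checksum "transform" emits the input plus a fixed-size digest,
// so the input/output ratio is marginally below 1 and rounds to
// 1.000 in a three-decimal report.
double checksum_ratio(double input_bytes, double digest_bytes)
{
    return input_bytes / (input_bytes + digest_bytes);
}
```

    For instance, `checksum_ratio(1024.0 * 1024.0, 8.0)` is about 0.999992, which displays as 1.000 when rounded to three decimals.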
    Quote Originally Posted by Saswat View Post
    3. E.Eff and D.Eff show hexa numbers, example: 16e15.
    How should I interpret that?
    This is scientific notation: 16 * 10^15.
    I should do something to make it clearer.

  20. Thanks:

    Saswat (16th June 2015)

  21. #228
    Member
    Join Date
    Jun 2015
    Location
    Bangalore
    Posts
    13
    Thanks
    3
    Thanked 0 Times in 0 Posts
    Thanks M^2 ! That definitely improved my understanding.

    I tried compiling fsbench on Ubuntu(3.13.0-32-generic #57-Ubuntu x86_64 GNU/Linux) and it compiled fine.

    But now when I try to compile it on FC21 (3.19.3-200.fc21.x86_64 x86_64 GNU/Linux), it fails with below log!

    Linking CXX executable fsbench
    /usr/bin/ld: cannot find -lrt
    /usr/bin/ld: cannot find -lpthread
    /usr/bin/ld: cannot find -lstdc++
    /usr/bin/ld: cannot find -lm
    /usr/bin/ld: cannot find -lc
    collect2: error: ld returned 1 exit status
    CMakeFiles/fsbench.dir/build.make:4735: recipe for target 'fsbench' failed
    make[2]: *** [fsbench] Error 1
    CMakeFiles/Makefile2:91: recipe for target 'CMakeFiles/fsbench.dir/all' failed
    make[1]: *** [CMakeFiles/fsbench.dir/all] Error 2
    Makefile:76: recipe for target 'all' failed
    make: *** [all] Error 2

    In CMakeLists.txt, if I replace

    ELSEIF (UNIX)
    target_link_libraries (fsbench -lrt -lpthread)

    WITH



    ELSEIF (UNIX)
    set(TARGET_LINK_LIBRARIES "fsbench -lrt -lpthread -lstdc++ -lm -lc")

    then, -lrt and -lpthread error vanishes but the remaining 3 libraries still don't get linked.


    Setup where it fails:
    gcc version: gcc (GCC) 4.9.2 20150212 (Red Hat 4.9.2-6)
    cmake version : 3.0.2
    OS: 3.19.3-200.fc21.x86_64 x86_64 GNU/Linux
    LDD version: ldd (GNU libc) 2.20

    NOTE: I cmaked with -DCMAKE_BUILD_TYPE=Release
    Last edited by Saswat; 16th June 2015 at 16:33.

  22. #229
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,610
    Thanks
    30
    Thanked 65 Times in 47 Posts
    I'm not good at debugging linux toolchain issues; I suggest that you ask your OS provider.

  23. #230
    Member
    Join Date
    Jun 2015
    Location
    Bangalore
    Posts
    13
    Thanks
    3
    Thanked 0 Times in 0 Posts
    Found the "-static" flag to be causing the linker errors.
    My lib64 had only the .so libraries and not the static .a ones.

    So removing the -static flag from CMakeLists.txt solves the linker issues.

    Tried on fc20 / fc21 / fc22. Works cool.

    Now got to analyse the cause for low performance on my Intel Xeon CPUs.

  24. #231
    Member
    Join Date
    Jun 2015
    Location
    Bangalore
    Posts
    13
    Thanks
    3
    Thanked 0 Times in 0 Posts
    Can you please share how to calculate the speed in terms of ticks/B ?

  25. #232
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,593
    Thanks
    801
    Thanked 698 Times in 378 Posts
    if your cpu runs, for example, at 4000 MHz while performing the operation, each cpu tick is 1/4 ns. so if compression runs at 200 MB/s then each byte requires 4000/200=20 cpu ticks
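    Bulat's arithmetic can be written down directly: both the clock (MHz) and the speed (MB/s) are per-microsecond quantities, so their ratio is ticks per byte.

```cpp
// ticks/B = clock in MHz / throughput in MB/s.
// E.g. 4000 MHz and 200 MB/s give 20 ticks per byte, as above.
double ticks_per_byte(double cpu_mhz, double speed_mb_per_s)
{
    return cpu_mhz / speed_mb_per_s;
}
```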

  26. Thanks:

    Saswat (23rd June 2015)

  27. #233
    Member
    Join Date
    Jun 2015
    Location
    Bangalore
    Posts
    13
    Thanks
    3
    Thanked 0 Times in 0 Posts
    Taking the example of xxhash (sheet 5 of results.ods):
    Frequency is 2 GHz.
    Speed is 29.12 GB/s

    So, as per the calculation, each byte should take: 2 / 29.12 ≈ 0.07 ticks.

    But the noted speed is 0.51 ticks/B! How's that?
    Last edited by Saswat; 23rd June 2015 at 13:33.

  28. #234
    Member
    Join Date
    Jun 2015
    Location
    Bangalore
    Posts
    13
    Thanks
    3
    Thanked 0 Times in 0 Posts
    Or is the speed in Giga "bits" per sec and not giga "bytes" per sec?
    Last edited by Saswat; 23rd June 2015 at 16:00.

  29. #235
    Member
    Join Date
    Jun 2015
    Location
    Bangalore
    Posts
    13
    Thanks
    3
    Thanked 0 Times in 0 Posts
    There is a discrepancy between the performance numbers quoted in results.ods and the actual numbers.

    Can anyone please clarify:
    1. In result.ods : (for Intel xeon CPUs) the performance numbers are expressed in terms of giga bits per second(Gb/s) or giga bytes per second(GB/s) ?
    2. With the latest snapshot of FSbench(https://github.com/Ahti/fsbench), the unit of speed is expressed in terms of Gb/s or GB/s ?

    I assume result.ods reports in Gb/s and the benchmark displays in GB/s.
    Please clarify.

  30. #236
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,593
    Thanks
    801
    Thanked 698 Times in 378 Posts
    you should ask the creator of this ods. from the official results, xxhash is 0.5 cpb = 6 GB/s on a 3 GHz cpu

  31. #237
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,610
    Thanks
    30
    Thanked 65 Times in 47 Posts
    I need to take a closer look to answer that. I probably got something wrong. Maybe I actually used all cores and didn't mark it? Maybe I had some math wrong?
    I'll give you the answer as soon as I can, not yet, sorry.
    Thank you for spotting it.
    Last edited by m^2; 26th June 2015 at 23:47.

  32. #238
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,610
    Thanks
    30
    Thanked 65 Times in 47 Posts
    In the sequoia part, the exact formula that calculates ticks/B is entered as:
    Code:
    =IF(H2;666.66667*1000*1000/H2/1024/1024;"??")
    So it's CPU frequency divided by speed, times a unit conversion.
    If I take this formula, apply it to the E5335 and shuffle a bit, I can calculate the CPU frequency from the speed in GB/s and ticks/B.
    It's 15.41-15.85 GHz.
    That machine had 8 cores, 2 GHz each.
    My hypothesis right now is that I measured speed twice, with -t1 and -t8. GB/s is -t8 and ticks/B is -t1.
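    This hypothesis is easy to sanity-check: multiplying the GB/s figure by the ticks/B figure gives the clock the two numbers jointly imply (assuming 1 GB = 2^30 bytes here; the exact constant in the spreadsheet formula may differ slightly).

```cpp
// Clock implied by a throughput (GB/s) and a cost (ticks/B) figure.
// For the xxhash numbers quoted earlier, implied_ghz(29.12, 0.51)
// is about 15.9 GHz -- close to 8 cores x 2 GHz, consistent with
// GB/s being a -t8 measurement and ticks/B a -t1 one.
double implied_ghz(double gb_per_s, double ticks_per_byte)
{
    const double bytes_per_s = gb_per_s * 1024.0 * 1024.0 * 1024.0;
    return bytes_per_s * ticks_per_byte / 1e9;
}
```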

    I'll try to get access to that machine again and verify it, but no promises. And sorry for confusion.

  33. #239
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,593
    Thanks
    801
    Thanked 698 Times in 378 Posts
    i had this idea but ARK says that this CPU has 4 cores. your file never mentions that this box has two cpus

  34. #240
    Member
    Join Date
    Jun 2015
    Location
    Bangalore
    Posts
    13
    Thanks
    3
    Thanked 0 Times in 0 Posts
    The default configuration is single-core (-t1).
    If the published results are with 8 cores (-t8), then maybe that needs to be specified explicitly in the results file.
