
Thread: Turbo Transpose compressor filter for binary/integer/floating point data

  1. #1
    Member
    Join Date
    Mar 2013
    Location
    Worldwide
    Posts
    487
    Thanks
    52
    Thanked 182 Times in 133 Posts

    Turbo Transpose compressor filter for binary/integer/floating point data

    TurboTranspose: Integer + Floating Point Compression Filter
    - Byte/Nibble transpose/shuffle for improving compression of binary data (e.g. floating point data)
    - Scalar/SIMD Transpose/Shuffle 8,16,32,64,... bits
    - Dynamic CPU detection and JIT scalar/sse/avx2 switching
    - 100% C (C++ headers), usage as simple as memcpy
    - Ready-to-use, simple library without dependency hassles

    + Byte Transpose
    - Fastest byte transpose

    + Nibble Transpose
    - nearly as fast as byte transpose
    - more efficient in most binary data files, up to 6 times faster than Bitshuffle
    - more robust in the worst case than Bitshuffle
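
    To make the byte transpose idea concrete, here is a minimal scalar sketch. The names byte_transpose and byte_untranspose are hypothetical and are not the TurboTranspose API; the library's SIMD code paths and the nibble variant are not shown. The sketch only illustrates the layout change: byte 0 of every element is stored first, then byte 1 of every element, and so on, so that similar bytes (for example float exponents) end up next to each other.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar byte transpose (illustration only, not the library API).
 * in:  n bytes holding n/esize elements of esize bytes each
 * out: byte 0 of every element, then byte 1 of every element, ...
 * Trailing bytes that do not fill a whole element are copied unchanged. */
static void byte_transpose(const uint8_t *in, size_t n, uint8_t *out, size_t esize)
{
    assert(esize > 0);
    size_t count = n / esize;                 /* number of whole elements */
    for (size_t i = 0; i < count; i++)
        for (size_t b = 0; b < esize; b++)
            out[b * count + i] = in[i * esize + b];
    for (size_t k = count * esize; k < n; k++)
        out[k] = in[k];
}

/* Inverse of byte_transpose: restores the original element order. */
static void byte_untranspose(const uint8_t *in, size_t n, uint8_t *out, size_t esize)
{
    assert(esize > 0);
    size_t count = n / esize;
    for (size_t i = 0; i < count; i++)
        for (size_t b = 0; b < esize; b++)
            out[i * esize + b] = in[b * count + i];
    for (size_t k = count * esize; k < n; k++)
        out[k] = in[k];
}
```

    Input and output buffers must not overlap in this sketch. A compressor is then run on the transposed buffer, and the inverse is applied after decompression.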

    Last edited by dnd; 11th June 2018 at 12:18.

  2. Thanks (2):

    Bekk (1st April 2018),oltjon (4th March 2017)

  3. #2
    Member
    Join Date
    Mar 2013
    Location
    Worldwide
    Posts
    487
    Thanks
    52
    Thanked 182 Times in 133 Posts
    TurboTranspose: Integer + Floating Point Compression Filter update:
    - Faster
    - More benchmarks
    Last edited by dnd; 11th June 2018 at 12:19.

  4. #3
    Member
    Join Date
    Mar 2013
    Location
    Worldwide
    Posts
    487
    Thanks
    52
    Thanked 182 Times in 133 Posts
    TurboTranspose: Integer + Floating Point Compression Filter update:
    + Scalar and SIMD Transform
    - Delta encoding for sorted lists
    - Zigzag encoding for unsorted lists
    - Xor encoding
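
    The three transforms listed above can be sketched in scalar C as follows. This is an illustration under assumptions: the function names are hypothetical, the element width is fixed at 32 bits, and the library's own SIMD implementations are not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>

/* Delta: store the difference to the previous value (small for sorted lists). */
static void delta_enc32(uint32_t *v, size_t n)
{
    uint32_t prev = 0;
    for (size_t i = 0; i < n; i++) { uint32_t cur = v[i]; v[i] = cur - prev; prev = cur; }
}

static void delta_dec32(uint32_t *v, size_t n)
{
    uint32_t prev = 0;
    for (size_t i = 0; i < n; i++) { prev += v[i]; v[i] = prev; }
}

/* Zigzag: map signed to unsigned so small magnitudes stay small
 * (0->0, -1->1, 1->2, -2->3, ...). Useful for deltas of unsorted lists. */
static uint32_t zigzag_enc32(int32_t x)
{
    return ((uint32_t)x << 1) ^ (0u - ((uint32_t)x >> 31));
}

static int32_t zigzag_dec32(uint32_t u)
{
    return (int32_t)((u >> 1) ^ (0u - (u & 1u)));
}

/* Xor: store value XOR previous value (equal high bits cancel to zero). */
static void xor_enc32(uint32_t *v, size_t n)
{
    uint32_t prev = 0;
    for (size_t i = 0; i < n; i++) { uint32_t cur = v[i]; v[i] = cur ^ prev; prev = cur; }
}

static void xor_dec32(uint32_t *v, size_t n)
{
    uint32_t prev = 0;
    for (size_t i = 0; i < n; i++) { prev ^= v[i]; v[i] = prev; }
}
```

    For an unsorted list, one would typically compute the wrapped difference to the previous value and pass it through zigzag_enc32, so that small negative steps do not turn into large unsigned numbers.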
    Last edited by dnd; 11th June 2018 at 12:19.

  5. #4
    Member
    Join Date
    May 2017
    Location
    UK
    Posts
    10
    Thanks
    1
    Thanked 3 Times in 3 Posts
    I tried to compile this with VS2010 on my Westmere Xeon (no AVX2, but does have SSE4.2) and had a few issues:

    1. _xgetbv isn't present in VS2010, but I fixed this by changing line 123 of transpose.c to
    Code:
    #if(defined _MSC_VER && (_MSC_FULL_VER >= 160040219) || defined __INTEL_COMPILER)
    2. Couldn't find getopt.h; this seems to be a MinGW file? A Google search turned up this link on GitHub: https://github.com/skandhurkat/Getop...aster/getopt.h ; it seemed to fix the problem.

    3. line 85 of conf.h is inside #elif _MSC_VER but references __builtin_clz which is a GCC intrinsic I believe, so wouldn't compile. I replaced it with
    Code:
    static inline int bsr32(int x) { return x ? 32 - __lzcnt(x) : 0; }
    Not sure how correct this is, but googling around suggested __lzcnt was a VS drop-in replacement for __builtin_clz

    4. Lines 91 and 92 also threw compile errors because _BitScanForward and _BitScanReverse do not accept unsigned int arguments.
    According to https://msdn.microsoft.com/en-us/library/wfd9z0bb.aspx, the first argument is expected to be a pointer to unsigned long:
    Code:
    unsigned char _BitScanForward(unsigned long *Index, unsigned long Mask);
    So I changed lines 91 and 92 as follows:
    Code:
    static inline int clz32(unsigned int          x) { unsigned long z = 0; _BitScanForward(  &z, x); return 32 - z; }
    static inline int ctz32(unsigned int          x) { unsigned long z = 0; _BitScanReverse(  &z, x); return z; }
    After all this, I couldn't get any further and had the following compile/link errors that I didn't know how to fix:
    Code:
    c_tt_transpose_double.obj : error LNK2001: unresolved external symbol tp4dec128v4 
    c_tt_transpose_double.obj : error LNK2001: unresolved external symbol tp4enc128v8 
    c_tt_transpose_double.obj : error LNK2001: unresolved external symbol tp4enc128v4 
    c_tt_transpose_double.obj : error LNK2001: unresolved external symbol tp4enc128v2 
    c_tt_transpose_double : fatal error LNK1120: 4 unresolved externals
    Would appreciate any advice, would love to use your library. FP transformations seem to be mostly forgotten in practical compression, as everyone seems to focus on English text, but for scientific use they are invaluable.
    I'm able to compile blosc-shuffle without issue, but would like something faster.
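
    A note on the bit-scan replacements in items 3 and 4, as a sketch rather than a verified patch for conf.h. With the usual meaning of the names, counting leading zeros needs the index of the highest set bit (_BitScanReverse, then 31 minus the index), and counting trailing zeros needs the index of the lowest set bit (_BitScanForward), so the pairing in the snippet above looks swapped. Also, as far as I know, __lzcnt compiles to the LZCNT instruction, which CPUs without that extension (Westmere among them) execute as BSR and therefore return a different value, so _BitScanReverse is the safer base. The helper names below are hypothetical, and a 32-bit unsigned int is assumed.

```c
#include <assert.h>
#if defined(_MSC_VER)
#include <intrin.h>
#endif

/* Count trailing zero bits; x must be non-zero. */
static int my_ctz32(unsigned int x)
{
    assert(x != 0);
#if defined(_MSC_VER)
    unsigned long idx;
    _BitScanForward(&idx, x);      /* index of the lowest set bit */
    return (int)idx;
#else
    return __builtin_ctz(x);
#endif
}

/* Count leading zero bits; x must be non-zero. */
static int my_clz32(unsigned int x)
{
    assert(x != 0);
#if defined(_MSC_VER)
    unsigned long idx;
    _BitScanReverse(&idx, x);      /* index of the highest set bit */
    return 31 - (int)idx;
#else
    return __builtin_clz(x);
#endif
}

/* Bit width: index of the highest set bit plus one, 0 for x == 0. */
static int my_bsr32(unsigned int x)
{
    return x ? 32 - my_clz32(x) : 0;
}
```

    Whether the library expects exactly these semantics for its own bsr32, clz32 and ctz32 should be checked against the GCC branch of conf.h.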

  6. #5
    Member
    Join Date
    Mar 2013
    Location
    Worldwide
    Posts
    487
    Thanks
    52
    Thanked 182 Times in 133 Posts
    Glad you are interested in TurboTranspose, and thanks for your hints about VS.
    TurboTranspose has been tested with MinGW-w64 under Windows.


    For compiling your app with VS:
    - include the header "transpose.h" in all files using TurboTranspose functions.
    - Build the TWO object files "transpose.o" and "transpose_sse.o" from "transpose.c".
    Use the same procedure and the same defines ("USE_SSE" and "SSE2_ON") as the makefile in TurboTranspose.
    You can also copy "transpose.c" to "transpose_sse.c" and build the two object files in Visual Studio.


    FP compression is highly dependent on the distribution of the values.
    You can use tpbench for benchmarking.
    If you want more compression it is better to use a stronger compressor.
    You can also try the floating point functions in TurboPFor.


    P.S. : you can find "getopt" here

  7. #6
    Member
    Join Date
    Mar 2013
    Location
    Worldwide
    Posts
    487
    Thanks
    52
    Thanked 182 Times in 133 Posts
    Included a makefile for visual c++ in TurboTranspose: Integer + Floating Point Compression Filter.
    This builds the static library "libtb.lib" and the benchmark app "tpbench.exe" with
    Code:
    nmake /f makefile.vs
    Last edited by dnd; 11th June 2018 at 12:20.

  8. #7
    Member
    Join Date
    May 2017
    Location
    UK
    Posts
    10
    Thanks
    1
    Thanked 3 Times in 3 Posts
    Thanks very much! I have it compiled now.
    I seem to get lower benchmark figures in my testing environment than using tpbench.exe, but this may be due to obscure features of my particular use case.

    Also, tpbench seems to detect my CPU as SSE3, not SSE4.1 ("detected SIMD=sse3"). Is this an issue with the CPU detection code, or just because the code doesn't need more than SSE2?

    I'll post some benchmarks soon.

  9. #8
    Member
    Join Date
    Mar 2013
    Location
    Worldwide
    Posts
    487
    Thanks
    52
    Thanked 182 Times in 133 Posts
    There was a typo in the display function cpustr. Fixed now.
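
    For readers wondering how a feature level gets turned into a display string, here is a small sketch. It does not reproduce the library's cpustr function, whose internals are not shown in this thread; the enum, the function names and the strings are assumptions. Only the CPUID leaf 1 bit positions are taken from the Intel manuals. AVX and AVX2 are left out because they also need an OS support check.

```c
#include <stdint.h>

enum simd_level { SIMD_NONE, SIMD_SSE, SIMD_SSE2, SIMD_SSE3,
                  SIMD_SSSE3, SIMD_SSE41, SIMD_SSE42 };

/* Map the ECX/EDX feature registers of CPUID leaf 1 to the highest
 * SSE-family level they advertise (hypothetical helper). */
static enum simd_level simd_level_from_cpuid(uint32_t ecx, uint32_t edx)
{
    if (ecx & (1u << 20)) return SIMD_SSE42;   /* ECX bit 20: SSE4.2 */
    if (ecx & (1u << 19)) return SIMD_SSE41;   /* ECX bit 19: SSE4.1 */
    if (ecx & (1u <<  9)) return SIMD_SSSE3;   /* ECX bit  9: SSSE3  */
    if (ecx & (1u <<  0)) return SIMD_SSE3;    /* ECX bit  0: SSE3   */
    if (edx & (1u << 26)) return SIMD_SSE2;    /* EDX bit 26: SSE2   */
    if (edx & (1u << 25)) return SIMD_SSE;     /* EDX bit 25: SSE    */
    return SIMD_NONE;
}

/* Display name for a level; a wrong entry in a table like this would
 * change only the printed string, not the code path that is selected. */
static const char *simd_level_name(enum simd_level level)
{
    static const char *const names[] = {
        "none", "sse", "sse2", "sse3", "ssse3", "sse4.1", "sse4.2"
    };
    return names[level];
}
```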

  10. Thanks:

    djubik (6th June 2017)

  11. #9
    Member
    Join Date
    May 2017
    Location
    UK
    Posts
    10
    Thanks
    1
    Thanked 3 Times in 3 Posts
    Some benchmarks on my Dual X5690

    This is using tpbench:
    Code:
    File='G:\tmp\data1' Length=41918956 element size=4. detected simd=sse3
    
    
    tp_byte         3847.51   4094.03
    tp_nibble       3606.50   3796.99
    memcpy          3592.37
    This is embedded in my application:
    Code:
    size = 40 MB, type = float, numIter = 20, calculating inter-quartile mean
    copy  (memcpy)  = 0.021968 sec :: 1820 MB/s :: comp ratio LZ4 100.39%
    blosc (shuffle) = 0.050594 sec ::  790 MB/s :: comp ratio LZ4 89.59%
    turtt (tpenc)   = 0.021133 sec :: 1892 MB/s :: comp ratio LZ4 89.60%
    
    blosc (unshuff) = 0.053062 sec ::  753 MB/s
    turtt (tpdec)   = 0.019419 sec :: 2059 MB/s
    And a larger file:
    Code:
    size = 1108 MB, type = double, numIter = 20, calculating inter-quartile mean
    copy  (memcpy)  = 0.603125 sec :: 1837 MB/s :: comp ratio LZ4 100.39%
    blosc (shuffle) = 1.882178 sec ::  589 MB/s :: comp ratio LZ4 87.93%
    turtt (tpenc)   = 0.664861 sec :: 1667 MB/s :: comp ratio LZ4 87.93%
    
    blosc (unshuff) = 1.523911 sec ::  727 MB/s
    turtt (tpdec)   = 0.572066 sec :: 1937 MB/s
    My application calls C++ from Matlab, and includes a malloc.
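
    For anyone reproducing figures like these, the sketch below shows one way to compute an interquartile mean over repeated timings and convert it to throughput. The exact method used for the numbers above is not shown in this post, so this is an assumption: timings are sorted, the lowest and highest quarter are dropped (n/4 each, rounded down), and the rest are averaged. Function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Interquartile mean of n timings; sorts t in place.
 * With n = 20 this averages the middle 10 samples. */
static double interquartile_mean(double *t, size_t n)
{
    assert(n > 0);
    qsort(t, n, sizeof t[0], cmp_double);
    size_t q = n / 4;                      /* samples dropped at each end */
    double sum = 0.0;
    for (size_t i = q; i < n - q; i++)
        sum += t[i];
    return sum / (double)(n - 2 * q);
}

/* Throughput in MB/s for a buffer of the given size. */
static double throughput_mb_s(double megabytes, double seconds)
{
    return megabytes / seconds;
}
```

    As a check against the table, 40 MB in 0.021968 sec works out to about 1820 MB/s, which matches the memcpy row.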

    Either way, turbotranspose blows blosc out of the water for me. Many congratulations - this is excellent.
    Last edited by djubik; 6th June 2017 at 16:34.

  12. Thanks:

    dnd (7th June 2017)

  13. #10
    Member
    Join Date
    May 2017
    Location
    Sealand
    Posts
    15
    Thanks
    7
    Thanked 2 Times in 2 Posts
    How do I use this with a solid compressor?
    I have compiled the tpbench.exe but it doesn't return any files

  14. #11
    Member
    Join Date
    Mar 2013
    Location
    Worldwide
    Posts
    487
    Thanks
    52
    Thanked 182 Times in 133 Posts
    TurboTranspose is a transform library only and not a complete compressor.
    "tpbench.exe" is an in memory benchmark program to test TurboTranspose.

    You can look at "tpbench.c" to figure out how TurboTranspose can be used in combination with other compressors.
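
    The general pattern is: transpose the buffer, hand the result to a compressor, and after decompression apply the inverse transpose. The sketch below illustrates only why this helps. Both functions are stand-ins and assumptions: filter_enc is a plain scalar byte transpose in place of the library's calls (the benchmarks in this thread refer to them as tpenc and tpdec), and rle_size is a toy run-length size estimate in place of a real compressor such as LZ4.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the transpose filter (hypothetical, scalar).
 * n must be a multiple of esize; in and out must not overlap. */
static void filter_enc(const uint8_t *in, size_t n, uint8_t *out, size_t esize)
{
    assert(esize > 0 && n % esize == 0);
    size_t count = n / esize;
    for (size_t i = 0; i < count; i++)
        for (size_t b = 0; b < esize; b++)
            out[b * count + i] = in[i * esize + b];
}

/* Toy back-end: size in bytes of a (run length, value) encoding
 * with runs capped at 255. Stands in for a real compressor. */
static size_t rle_size(const uint8_t *p, size_t n)
{
    size_t out = 0, i = 0;
    while (i < n) {
        size_t run = 1;
        while (i + run < n && p[i + run] == p[i] && run < 255)
            run++;
        out += 2;                          /* one length byte + one value byte */
        i += run;
    }
    return out;
}
```

    With 4-byte elements whose upper three bytes are constant, the raw buffer has no runs at all, while the transposed buffer groups the constant bytes into three long runs, so the back-end sees far more redundancy. Real data and real compressors behave less cleanly, which is why measuring with tpbench on your own files is the better guide.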

  15. #12
    Member
    Join Date
    Mar 2013
    Location
    Worldwide
    Posts
    487
    Thanks
    52
    Thanked 182 Times in 133 Posts
    NIBBLE transpose demolished "bitshuffle", "blosc" and "SPDP", one of the best floating point compressors,
    on a standard 32/64-bit scientific dataset.

    Better compression and several times faster.

  17. #14
    Member
    Join Date
    Mar 2013
    Location
    Worldwide
    Posts
    487
    Thanks
    52
    Thanked 182 Times in 133 Posts
    Update: TurboTranspose Integer + Floating point binary filter

    - All Turbo Transpose functions are now available for 64-bit ARM, including optimized NEON SIMD
    - 2D, 3D and 4D transpose
    - Optimized transform for relative error bound lossy floating point compression
    - New ARM Benchmarks: Nibble transpose up to 10 times faster than bitshuffle


