
Thread: TurboPFor Library usage

  1. #1
    Member
    Join Date
    Jul 2020
    Location
    USA
    Posts
    4
    Thanks
    0
    Thanked 0 Times in 0 Posts

    TurboPFor Library usage

    Newbie here. I have been looking to use TurboPFor and similar libraries for compression of floating point data. I'm not an expert by any means, so I would really appreciate detailed help with implementation. I got a pretty good start from browsing other posts on the forum, but I am getting stuck in my implementation.

    The data is from a text file, so I wrote a parser that searches the text file and saves the values into a vector of floats. It does this by scanning each line and pushing back the value onto the vector. Is this the best way to format my data for the compressor? If not, how should I store them, and how would my implementation have to change?
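    A minimal sketch of the parsing step described above, assuming one value per line; the helper name `parse_floats` is hypothetical and not part of TurboPFor. It shows that a `std::vector<float>` already gives one contiguous buffer, which is the shape a memory-to-memory compressor expects.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads one float per line and skips any line that
// does not start with a parseable number (blank lines, headers, comments).
std::vector<float> parse_floats(std::istream& text) {
    std::vector<float> values;
    std::string line;
    while (std::getline(text, line)) {
        std::istringstream fields(line);
        float v;
        if (fields >> v) {
            values.push_back(v);
        }
    }
    return values;
}
```

    The result can be handed to a C-style API as `values.data()` together with `values.size()`; no further reformatting of the data should be needed for that.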

    I ran the icapp benchmark with the 32-bit and TXT file flags, and it pointed to p4nzzenc128v32 as the best encoder for compression ratio. How would I add code to my parser to run just this encoder (and its decoder)? I know that I would have to include the fp.h header file, but I am lost as to what code I need in order to call p4nzzenc128v32.

    Thank you for any help!!

    Baker

  2. #2
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,982
    Thanks
    298
    Thanked 1,309 Times in 745 Posts
    There was a similar thread: https://encode.su/threads/3429-Turbo...sion-questions

    > I have been looking to use TurboPFor and similar libraries for compression
    > of floating point data. I'm not an expert by any means, so I would really
    > appreciate detailed help with implementation.

    I think you're supposed to contact the author (powturbo@gmail.com or via github issues).

    > Is this the best way to format my data for the compressor?

    Probably not, if you need high speed; whatever works otherwise.

    > If not, how should I store them, and how would my implementation have to change?

    For high speed you'd need a low-level implementation - standard library is too slow.

    > How would I add code to my parser to just run this encoder(and the decoder)?
    > I know that I would have to include the fp.h header file,
    > but I am lost as far as what code to include for the p4nzzenc128v32 method implementation.

    On source side just fp.h should be ok - it contains definitions of simple memory-to-memory functions,
    you can just pass myvector.data() to it or something.
    Otherwise, I think you can build icapp, then add all *.o except icapp.o to your project?
    Or you can track the dependencies manually, by including .o files which contain
    functions requested in linker errors, starting with fp.o.

  3. #3
    Member
    Join Date
    Jul 2020
    Location
    USA
    Posts
    4
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Thank you for your help! I am running into one issue with my implementation. The floating point compressors (like the ones listed in fp.h) take a uint32_t pointer as the input. Here's a snippet of fp.h:

    // ---------- TurboPFor Zigzag of delta (=delta of delta + zigzag encoding) (TurboPFor)
    size_t p4nzzenc128v32( uint32_t *in, size_t n, unsigned char *out, uint32_t start);
    size_t p4nzzdec128v32( unsigned char *in, size_t n, uint32_t *out, uint32_t start);

    //----------- Zigzag (bit/io) -------------------------------------------------------
    size_t bvzenc32( uint32_t *in, size_t n, unsigned char *out, uint32_t start);
    size_t bvzdec32( unsigned char *in, size_t n, uint32_t *out, uint32_t start);

    //----------- Zigzag of delta (bit/io) ---------------------------------------------
    size_t bvzzenc32( uint32_t *in, size_t n, unsigned char *out, uint32_t start);
    size_t bvzzdec32( unsigned char *in, size_t n, uint32_t *out, uint32_t start);
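    The comments in this header name two building blocks, delta and zigzag. The sketch below illustrates a first-order delta followed by a zigzag mapping on plain `uint32_t` values; it is not TurboPFor's code, all function names are hypothetical, and treating `start` as the value preceding the first element is an assumption. Applying the delta step twice would give the "delta of delta" variant.

```cpp
#include <cstdint>
#include <vector>

// Zigzag maps signed differences to unsigned values so that small
// magnitudes stay small: 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, ...
inline std::uint32_t zigzag_encode(std::int32_t d) {
    return (static_cast<std::uint32_t>(d) << 1) ^ (d < 0 ? 0xFFFFFFFFu : 0u);
}

inline std::int32_t zigzag_decode(std::uint32_t z) {
    return static_cast<std::int32_t>((z >> 1) ^ (0u - (z & 1u)));
}

// Replace each value by the zigzag-mapped difference to its predecessor.
// `start` is assumed to be the predecessor of the first element.
std::vector<std::uint32_t> delta_zigzag_encode(const std::vector<std::uint32_t>& in,
                                               std::uint32_t start) {
    std::vector<std::uint32_t> out;
    out.reserve(in.size());
    std::uint32_t prev = start;
    for (std::uint32_t v : in) {
        // Unsigned subtraction wraps modulo 2^32, so this is safe for any input.
        out.push_back(zigzag_encode(static_cast<std::int32_t>(v - prev)));
        prev = v;
    }
    return out;
}

// Inverse of delta_zigzag_encode: a running sum of the decoded differences.
std::vector<std::uint32_t> delta_zigzag_decode(const std::vector<std::uint32_t>& in,
                                               std::uint32_t start) {
    std::vector<std::uint32_t> out;
    out.reserve(in.size());
    std::uint32_t prev = start;
    for (std::uint32_t z : in) {
        prev += static_cast<std::uint32_t>(zigzag_decode(z));
        out.push_back(prev);
    }
    return out;
}
```

    When neighbouring inputs are close, the transformed values are small and need few bits each, which is what a bit-packing stage can then exploit.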

    To get this uint32_t pointer, I am currently using memcpy to reinterpret my floats as uint32_t values and pointing to those. This does work in the sense that the compressor runs and I get my input back after compressing and decompressing, but the compression ratios are awful (greater than 1, i.e. the output is larger than the input). I'm guessing this means I am creating my uint32_t pointer incorrectly. How else should I do it?
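    For reference, a small sketch of the memcpy approach on its own, assuming 32-bit IEEE 754 floats; the helper names are hypothetical. Copying the bytes preserves every bit pattern, so this step by itself is lossless and is a well-defined way to get a `uint32_t` view of float data.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch expects 32-bit float");

// Copy the raw bit patterns of the floats into a uint32_t buffer.
std::vector<std::uint32_t> float_bits(const std::vector<float>& values) {
    std::vector<std::uint32_t> bits(values.size());
    if (!values.empty()) {
        std::memcpy(bits.data(), values.data(), values.size() * sizeof(float));
    }
    return bits;
}

// Inverse: copy the bit patterns back into floats.
std::vector<float> bits_to_float(const std::vector<std::uint32_t>& bits) {
    std::vector<float> values(bits.size());
    if (!bits.empty()) {
        std::memcpy(values.data(), bits.data(), bits.size() * sizeof(float));
    }
    return values;
}
```

    If a round trip through these two helpers returns the original values, the reinterpretation is not the place where data is lost; the element counts and byte counts passed to the encoder and to fwrite are worth checking next.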

    I also tried premultiplying my floats by 10^6 and storing the results as uint32_t values. This did work better, but it seems like it shouldn't be necessary, since TurboPFor explicitly advertises floating point compression. Is there something that I am missing?
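    A sketch of that premultiplication idea, assuming non-negative values that fit in 32 bits after scaling; `to_fixed` and `from_fixed` are hypothetical names. Note that the conversion is only lossless when the data really has no more decimal digits than the scale keeps.

```cpp
#include <cmath>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Scale each float and round to the nearest integer, e.g. scale = 1e6
// keeps six decimal places. Throws if a value does not fit in uint32_t.
std::vector<std::uint32_t> to_fixed(const std::vector<float>& values, double scale) {
    std::vector<std::uint32_t> out;
    out.reserve(values.size());
    for (float v : values) {
        long long q = std::llround(static_cast<double>(v) * scale);
        if (q < 0 || q > 0xFFFFFFFFLL) {
            throw std::out_of_range("scaled value does not fit in uint32_t");
        }
        out.push_back(static_cast<std::uint32_t>(q));
    }
    return out;
}

// Undo the scaling after decompression.
std::vector<float> from_fixed(const std::vector<std::uint32_t>& values, double scale) {
    std::vector<float> out;
    out.reserve(values.size());
    for (std::uint32_t q : values) {
        out.push_back(static_cast<float>(static_cast<double>(q) / scale));
    }
    return out;
}
```

    Scaled integers from slowly varying data differ by small amounts, which suits delta-based integer coders; raw float bit patterns do not generally have that property.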

    I posted a snippet of my code below. I'm sure there are plenty of errors in it (I'm not the most experienced), but maybe it will help with answering my question. The code opens a binary file f and writes the compressed version to w. After decoding, it writes the reconstructed data to h so that I can check that compression and decompression had no unintended effects.

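    else {
        // Open the input (raw 32-bit floats) and the two output files.
        FILE* f = fopen( argv[2], "rb" ); if( f==0 ) return 2;
        fseek(f, 0, SEEK_END);
        long n = ftell(f);                 // input size in bytes
        rewind(f);

        FILE* w = fopen( argv[3], "wb" ); if( w==0 ) return 2;
        FILE* h = fopen( argv[4], "wb" ); if( h==0 ) return 2;

        // Oversized buffers: n bytes plus 1 MiB of slack each.
        uint32_t *in = (uint32_t*)malloc(n + 1024*1024);
        unsigned char *out = (unsigned char*)malloc(n + 1024*1024);
        uint32_t *cpy = (uint32_t*)malloc(n + 1024*1024);
        if (in == NULL || out == NULL || cpy == NULL) {
            printf("Memory not allocated.\n");
            exit(1);
        }

        // Read the floats and copy their bit patterns into the uint32_t buffer.
        // The index must advance with every value read.
        size_t numEl = 0;
        float a;
        while( fread( &a, sizeof(float), 1, f )==1 ) {
            memcpy(&in[numEl], &a, sizeof(a));
            numEl++;
        }

        // Encode. The return value is used as the compressed size in bytes,
        // so newEl bytes are written, not newEl floats.
        size_t newEl = p4ndenc128v32(in, numEl, out);
        float ratio = (float)newEl/(numEl*sizeof(float));
        uint32_t count = (uint32_t)numEl;  // header: number of values
        fwrite(&count, sizeof(count), 1, w);
        fwrite(out, 1, newEl, w);

        //size_t p4nzzdec128v32( unsigned char *in, size_t n, uint32_t *out, uint32_t start);
        size_t finalSize = p4nddec128v32(out, numEl, cpy);

        // Write the decoded floats so the round trip can be compared with the input.
        float b;
        for (size_t j=0; j<numEl; j++){
            memcpy(&b, &cpy[j], sizeof(b));
            fwrite(&b, sizeof(float), 1, h);
        }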

  4. #4
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,982
    Thanks
    298
    Thanked 1,309 Times in 745 Posts
    Seems to work like this?
    Attached Files

  5. #5
    Member
    Join Date
    Jul 2020
    Location
    USA
    Posts
    4
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Oh wow, I was being pretty dumb. I didn't realize that you could just create byte pointers and cast them later. Thank you for your help!!

  6. #6
    Member
    Join Date
    Jul 2020
    Location
    USA
    Posts
    4
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Just as a quick follow-up, I think there might be a small error in how you size the out byte stream. Its size seems to be based on the size of the file that was read, even when decoding. So I think the out stream can end up too small, which results in a segfault. This happened for me when using the p4nzzenc128v32 compressor (the compressed file was around 4% of the original size, since delta encoding was very effective on my data).

    To fix this, I just declared the byte stream first and set its size later, based on f_len*4 for the encoding and n*4 for the decoding.

    It is also very possible I was doing something incorrectly, but I just wanted to let you know what I found!
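    One way to make the decoder's buffer size independent of the compressed size is to store the element count in front of the payload. The sketch below works on in-memory byte vectors; the layout and the names `pack`, `read_count`, and `decoded_bytes` are assumptions for illustration, not TurboPFor's file format, and the count is stored in native byte order.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical container layout: [uint32_t element count][compressed payload].
std::vector<unsigned char> pack(std::uint32_t count,
                                const std::vector<unsigned char>& payload) {
    std::vector<unsigned char> file(sizeof(count) + payload.size());
    std::memcpy(file.data(), &count, sizeof(count));
    if (!payload.empty()) {
        std::memcpy(file.data() + sizeof(count), payload.data(), payload.size());
    }
    return file;
}

// Read the element count back from the header.
std::uint32_t read_count(const std::vector<unsigned char>& file) {
    if (file.size() < sizeof(std::uint32_t)) {
        throw std::runtime_error("truncated header");
    }
    std::uint32_t count;
    std::memcpy(&count, file.data(), sizeof(count));
    return count;
}

// Bytes needed to hold the decoded 32-bit values; this depends only on the
// stored count, never on how small the compressed payload happens to be.
std::size_t decoded_bytes(std::uint32_t count) {
    return static_cast<std::size_t>(count) * sizeof(std::uint32_t);
}
```

    With a 4% ratio, a buffer sized from the compressed file would be roughly 25 times too small for the decoded data, which matches the segfault described above.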

  7. #7
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,982
    Thanks
    298
    Thanked 1,309 Times in 745 Posts
    Sure, I just did that before finding out that I have to write the number of floats to the compressed file.
