
Thread: Schindler Transform (STX)

  1. #1 by CUDALIKE (Member, Russia)

    The question is this. Suppose we have two sequences:
    1: AAABAA
    2: AABAAA

    Apply ST2 (STX). For both the 1st and the 2nd string, the sorted two-symbol contexts come out the same:
    AA
    AA
    AA
    AA
    AB
    BA

    So different strings would give the same result.
    Where am I wrong?

  2. #2
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,375
    Thanks
    214
    Thanked 1,023 Times in 544 Posts
    With STn you still sort the whole set of string shifts, like in BWT, and still output the last column.
    The difference is that only the first n symbols are included in the comparisons.

    Also, with STn you'd still need an EOF symbol or an index position, like in BWT.
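    For illustration, a minimal sketch of ST2 done exactly that way (hypothetical code, not from any particular implementation; st2_naive is just an example name, and the stable sort here breaks ties by the original rotation position, which real STn implementations may handle differently):
    Code:
    #include <algorithm>
    #include <string>
    #include <utility>
    #include <vector>
    
    // Sort all rotations of s, but compare only their first 2 symbols; ties keep
    // the original rotation order. Return the last column and the position of
    // the original string among the sorted rotations.
    std::pair<std::string,int> st2_naive( const std::string& s ) {
      int n = (int)s.size();
      std::vector<int> rot( n );
      for( int i=0; i<n; i++ ) rot[i] = i;            // rotation start positions
      std::stable_sort( rot.begin(), rot.end(), [&]( int a, int b ) {
        for( int k=0; k<2; k++ ) {                    // only 2 symbols are compared
          char ca = s[(a+k)%n], cb = s[(b+k)%n];
          if( ca!=cb ) return ca<cb;
        }
        return false;                                 // equal 2-symbol prefix: keep order
      });
      std::string last( n, 0 );
      int index0 = 0;
      for( int i=0; i<n; i++ ) {
        last[i] = s[ (rot[i]+n-1)%n ];                // last column symbol
        if( rot[i]==0 ) index0 = i;                   // row of the original string
      }
      return { last, index0 };
    }
    The exact output depends on how rotations with equal 2-symbol keys are ordered, so a forward transform and its inverse have to agree on that rule.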

  3. #3
    Member
    Join Date
    Jul 2011
    Location
    Russia
    Posts
    13
    Thanks
    0
    Thanked 0 Times in 0 Posts
    1: AAABAA
    2: AABAAA

    Use ST2 (STX):
    1.
    shift matrix:
    AAABAA <- index0
    AAAABA
    AAAAAB
    BAAAAA
    ABAAAA
    AABAAA

    after sort:
    AAABAA <- index0
    AAAABA
    AAAAAB
    AABAAA
    ABAAAA
    BAAAAA

    Result : AAAABA
    index0: 0

    2.
    shift matrix:
    AABAAA <- index0
    AAABAA
    AAAABA
    AAAAAB
    BAAAAA
    ABAAAA

    after sort:
    AABAAA <- index0
    AAABAA
    AAAABA
    AAAAAB
    ABAAAA
    BAAAAA

    Result : AAAABA <- equal to 1
    index0: 0


    Or should I take not the second column but the last column as the result?
    For 1: AABAAA
    For 2: AAABAA

  4. #4
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,474
    Thanks
    26
    Thanked 121 Times in 95 Posts
    You have to take the last column.

  5. #5
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,375
    Thanks
    214
    Thanked 1,023 Times in 544 Posts
    Well, there's a different (and much simpler) way to build ST2.
    Just count the occurrences of symbol pairs, then allocate a buffer for each pair, then put
    the symbols encountered in the context of each pair into its buffer.
    Something like this:
    Code:
    #include <string.h>  // memset
    
    typedef unsigned char byte;
    typedef unsigned int  uint;
    
    // assumed declarations (not shown in the post): CNUM = alphabet size,
    // o1/o2 = static count/offset tables
    enum { CNUM = 256 };
    static uint o1[CNUM+1];
    static uint o2[CNUM*CNUM+1];
    
    uint Forward_ST2( byte* inpbuf, byte* outbuf, uint inplen ) {
      uint i,c,p,q;
    
      memset( o2, 0, sizeof(o2) );
    
      // count symbols per order-2 context (p,q); the context wraps around,
      // starting from the last two input symbols
      p = inpbuf[inplen-2];
      q = inpbuf[inplen-1]; 
      for( i=0; i<inplen; i++ ) {
        c = inpbuf[i];
        o2[1+p+(q<<8)]++;
        p = q; q = c;
      }
    
      // counts -> start offsets; the data starts at 2 because outbuf[0..1]
      // store the last two input symbols (the initial cyclic context)
      for( i=0,c=2; i<CNUM*CNUM; i++ ) o2[i]=c, c+=o2[i+1];
    
      // scatter each symbol into its context's bucket
      p = inpbuf[inplen-2]; outbuf[0] = p;
      q = inpbuf[inplen-1]; outbuf[1] = q;
      for( i=0; i<inplen; i++ ) {
        c = outbuf[ o2[p+(q<<8)]++ ] = inpbuf[i]; 
        p = q; q = c; 
      }
    
      return inplen+2;
    }
    
    uint Inverse_ST2( byte* inpbuf, byte* outbuf, uint inplen ) {
      uint i,j,c,p,q;
    
      memset( o1, 0, sizeof(o1) );
      memset( o2, 0, sizeof(o2) );
    
      for( i=2; i<inplen; i++ ) c=inpbuf[i], o1[1+c]++; // symbol freqs
    
      for( i=0,c=2; i<CNUM; i++ ) o1[i]=c, c+=o1[i+1]; o1[CNUM]=c; // now offsets
    
      // recover the pair freqs: positions o1[j]..o1[j+1] hold the symbols that
      // were emitted in contexts ending with j, so each such symbol c marks
      // one occurrence of context (j,c)
      for( j=0; j<CNUM; j++ ) { // symbol loop
        q = j;
        for( i=o1[j]; i<o1[j+1]; i++ ) {
          c = inpbuf[i]<<8;
          o2[1+q+c]++; // pair freqs
        }
      }
      for( i=0,c=2; i<CNUM*CNUM; i++ ) o2[i]=c, c+=o2[i+1];
    
      // walk the contexts starting from the stored initial pair
      p = inpbuf[0]; 
      q = inpbuf[1]; 
      for( i=0; i<inplen-2; i++ ) {
        c = outbuf[i] = inpbuf[ o2[p+(q<<8)]++ ];
        p = q; q = c; 
      }
    
      return inplen-2;
    }
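    A hypothetical round-trip check for the routines above (example code, assuming the CNUM/o1/o2 declarations shown with them; note that this variant keys each symbol on the two symbols that precede it):
    Code:
    #include <stdio.h>
    
    int main() {
      byte in[8] = { 'A','A','A','B','A','A' };
      byte st[16], out[16];
      uint n = 6;
      uint m = Forward_ST2( in, st, n );   // returns n+2: data plus the 2-byte initial context
      uint k = Inverse_ST2( st, out, m );  // returns n and restores the input
      printf( "%.*s -> %.*s -> %.*s\n", (int)n, (char*)in, (int)m, (char*)st, (int)k, (char*)out );
      return 0;
    }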

  6. #6
    Member biject.bwts's Avatar
    Join Date
    Jun 2008
    Location
    texas
    Posts
    449
    Thanks
    23
    Thanked 14 Times in 10 Posts

    Talking

    With something as simple as AABAAA and AAABAA, why not use the BWTS transform? Then you don't need a stupid INDEX.

    AABAAA > AAB AAA, Lyndon words; the strings of interest
    are AAA AAA AAA AAB ABA BAA, which sort to
    AAA
    AAA
    AAA
    AAB
    ABA
    BAA

    giving AAABAA as the BWTS of AABAAA.

    Meanwhile
    AAABAA > AAAB AA, Lyndon words, with the strings
    AA AA AAAB AABA ABAA BAAA
    sorting to
    AAA  (note: the string is repeated to match the length of the longest Lyndon word)
    AAA
    AAAB
    AABA
    ABAA
    BAAA
    Use the last symbol of each string of interest; in the case of the length-2 Lyndon words that is the second symbol.

    giving AABAAA as the BWTS string of AAABAA, and again no need for an INDEX.
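    For reference, a minimal forward-only sketch of this construction (hypothetical code, not the actual BWTS sources; bwts_forward and lyndon_factors are just example names): factor the input into Lyndon words with Duval's algorithm, sort all rotations of all the words by their infinite periodic extensions, and output the last symbol of each rotation.
    Code:
    #include <algorithm>
    #include <string>
    #include <utility>
    #include <vector>
    
    // Duval's algorithm: Lyndon factorization as (start, length) pairs.
    static std::vector<std::pair<int,int>> lyndon_factors( const std::string& s ) {
      std::vector<std::pair<int,int>> f;
      int n = (int)s.size(), i = 0;
      while( i<n ) {
        int j = i+1, k = i;
        while( j<n && s[k]<=s[j] ) { if( s[k]<s[j] ) k = i; else k++; j++; }
        while( i<=k ) { f.push_back( { i, j-k } ); i += j-k; }
      }
      return f;
    }
    
    std::string bwts_forward( const std::string& s ) {
      struct Rot { int start, len, r; };              // rotation r of the word at [start, start+len)
      std::vector<Rot> rots;
      for( auto& w : lyndon_factors(s) )
        for( int r=0; r<w.second; r++ ) rots.push_back( { w.first, w.second, r } );
      // i-th symbol of a rotation, read cyclically within its own word
      auto at = [&]( const Rot& a, int i ) { return s[ a.start + (a.r+i)%a.len ]; };
      // sort by the infinite periodic extension of each rotation;
      // comparing len(a)+len(b) symbols is enough to decide the order
      std::sort( rots.begin(), rots.end(), [&]( const Rot& a, const Rot& b ) {
        for( int i=0; i<a.len+b.len; i++ ) {
          char ca = at(a,i), cb = at(b,i);
          if( ca!=cb ) return ca<cb;
        }
        return false;                                 // equal extensions end with the same symbol
      });
      std::string out;
      for( auto& a : rots ) out += at( a, a.len-1 );  // last symbol of each rotation
      return out;
    }
    With this sketch, bwts_forward("AABAAA") returns AAABAA and bwts_forward("AAABAA") returns AABAAA, matching the results above.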

  7. #7
    Member
    Join Date
    Jul 2011
    Location
    Russia
    Posts
    13
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Thanks, everyone.
    biject.bwts: I don't fully understand it, the algorithm is tricky. Is it a simple BWT?

    Shelwien: thanks, that significantly speeds up ST2 decoding.
    Could you explain how the inverse transform differs for different X, i.e. how ST8_decode would differ from ST2_decode?
    I'm also curious what the general and the optimized versions look like.

  8. #8
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,375
    Thanks
    214
    Thanked 1,023 Times in 544 Posts
    You can see the ST3, ST4, ST5 implementations in libbsc.
    In short, ST4+ has to be much more complicated than ST2, because allocating 16G static arrays for it is still troublesome atm.
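    For reference, the 16G figure corresponds to keeping one counter per order-n context; a quick sketch of the arithmetic (assuming 4-byte counters, as in the ST2 code above):
    Code:
    #include <stdio.h>
    
    int main() {
      // one 4-byte counter per order-n context over a 256-symbol alphabet
      for( int n=2; n<=4; n++ ) {
        unsigned long long contexts = 1;
        for( int i=0; i<n; i++ ) contexts *= 256;   // 256^n contexts
        printf( "ST%d: %llu counters, %llu bytes\n", n, contexts, contexts*4 );
      }
      return 0;                                     // ST4: 256^4 * 4 bytes = 16 GB
    }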

  9. #9
    Member
    Join Date
    Jul 2011
    Location
    Russia
    Posts
    13
    Thanks
    0
    Thanked 0 Times in 0 Posts
    I looked at the STX decoding in bsc_st.cpp: it is not so difficult to implement, but it takes a long time to execute.
    In effect, the sorting over the full context length has been shifted from compression to decompression. But decompression is usually used much more often than compression, so on the whole we lose. Or is something wrong with that reasoning?

    And which is expected to be faster in total, encoding + decoding for STX or for BWT, i.e. is unSTX(STX) faster than unBWT(BWT)?

  10. #10
    Programmer Gribok's Avatar
    Join Date
    Apr 2007
    Location
    USA
    Posts
    159
    Thanks
    0
    Thanked 1 Time in 1 Post
    Only the reverse ST3 transform is faster than BWT. BWT can also easily be optimized for multi-core systems, whereas ST cannot.

  11. #11
    Member
    Join Date
    Jul 2011
    Location
    Russia
    Posts
    13
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by Gribok View Post
    Only the reverse ST3 transform is faster than BWT. BWT can also easily be optimized for multi-core systems, whereas ST cannot.
    About BSC: why are there secondary indexes in the BWT, and why are they not used?
    I guess they are just for parallelizing the inverse BWT, right?

    Quote Originally Posted by bwt.h
    * @param indexes - the secondary indexes array, can be NULL.
    int bsc_bwt_encode(unsigned char * T, int n, unsigned char * num_indexes, int * indexes, int features);

  12. #12
    Programmer Gribok's Avatar
    Join Date
    Apr 2007
    Location
    USA
    Posts
    159
    Thanks
    0
    Thanked 1 Time in 1 Post
    Quote Originally Posted by CUDALIKE View Post
    About BSC: why are there secondary indexes in the BWT, and why are they not used?
    I guess they are just for parallelizing the inverse BWT, right?
    They are used in parallel decompression and they are on by default.

  13. #13
    Member
    Join Date
    Jul 2011
    Location
    Russia
    Posts
    13
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by Gribok View Post
    They are used in parallel decompression and they are on by default.
    I think they could just as easily be used to speed up the inverse BWT on the GPU. Do you plan to do this?

  14. #14
    Programmer Gribok's Avatar
    Join Date
    Apr 2007
    Location
    USA
    Posts
    159
    Thanks
    0
    Thanked 1 Time in 1 Post
    Quote Originally Posted by CUDALIKE View Post
    I think they could just as easily be used to speed up the inverse BWT on the GPU. Do you plan to do this?
    Random memory access is slow on CUDA. It is not clear to me whether CUDA will be faster than the CPU, but I may try.

  15. #15
    Member
    Join Date
    Jul 2011
    Location
    Russia
    Posts
    13
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by Gribok View Post
    Random memory access is slow on CUDA. It is not clear to me whether CUDA will be faster than the CPU, but I may try.
    Yes, but there is the possibility of using all the cores: counting the occurrences of the different characters on all cores and threads, and keeping the intermediate data in shared memory.
    I think it could be sped up tenfold compared to the CPU.
    And even if it only gets a little faster, it would still share the load between the CPU and the GPU.

  16. #16
    Member
    Join Date
    Jul 2011
    Location
    Russia
    Posts
    13
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Random memory access is slow on the CPU too.
    RAM works well with sequential access, and with random access it is the cache that adds speed.
    A block larger than 1 MB can hardly use the L2 cache, so only the L3 cache helps.
    With the -b10 .. -b100 switches the CPU gets almost no benefit from the cache.

    http://www.1024cores.net/home/in-rus...ata-structures
    http://www.1024cores.net/home/parall...ous-algorithms
