

  1. #1
    Member biject.bwts's Avatar
    Join Date
    Jun 2008
    Thanked 14 Times in 10 Posts

    BIJECTIVE DC

    A long time ago I noticed Binder's discussion of DC (distance coding), though he provided only a few examples, such as
    "aaabccca", where he assumed you already had the set of symbols a, b, c as well as their
    starting positions. He got 2 DC values, namely 3 and 0. I felt that was too high: I think one DC
    value is all that is needed, and it is a 2.
    Yuta has code that I think reflects the original internet discussion of DC. Over a period of a few
    days I showed several examples of what I felt was wrong with the DC values. Since I wanted to
    make it bijective, Yuta and I made several changes. I made another one recently. I am not sure if
    it is the last.

    What I did was write a program to encode DC as follows
    bij_dc_32.exe e A B
    and decode as follows
    bij_dc_32.exe d B C

    B would be the bijective DC of file A
    C should be the same as A

    Since it is bijective, you can also take any file, decode it first and then encode the result,
    and get back the same file.
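
The two round trips described above (encode then decode, and decode then encode) can be checked mechanically. The following is only a minimal sketch: `check_bijective` and the two toy codecs are hypothetical names made up for illustration, not part of bij_dc_32.exe. To test the real program you would wrap its `e` and `d` modes in functions with the same shape.

```python
from typing import Callable, Iterable, List

Codec = Callable[[bytes], bytes]

def check_bijective(encode: Codec, decode: Codec, samples: Iterable[bytes]) -> List[bytes]:
    """Return every sample that fails one of the two round trips."""
    failures = []
    for data in samples:
        forward_ok = decode(encode(data)) == data   # A -> B -> C, C must equal A
        reverse_ok = encode(decode(data)) == data   # treat data as B: decode first, then encode
        if not (forward_ok and reverse_ok):
            failures.append(data)
    return failures

# Toy bijection (assumption, for illustration only): XOR every byte with 0x5A.
def xor_codec(data: bytes) -> bytes:
    return bytes(b ^ 0x5A for b in data)

# Toy NON-bijective codec: encoding adds a header byte, decoding strips one byte.
# The forward round trip always works, but the reverse one does not.
def prefix_encode(data: bytes) -> bytes:
    return b"\x00" + data

def prefix_decode(data: bytes) -> bytes:
    return data[1:]

samples = [b"", b"\xff", b"\xfe\xff", b"aaabccca"]
good_failures = check_bijective(xor_codec, xor_codec, samples)
bad_failures = check_bijective(prefix_encode, prefix_decode, samples)
print(len(good_failures), len(bad_failures))
```

The prefix codec shows why the reverse direction matters: an ordinary compressor passes the encode-then-decode test, while only a bijective one also survives decode-then-encode on arbitrary input.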

    The output of the DC is nothing but numbers written in a bijective way:
    0xxxxxxx is a number
    1xxxxxxx 0xxxxxxx is a larger number
    A 1 in the first position of a byte means continue, unless it is the last number in the file.
    A 0 means this is the last byte in the file. Note that the coding of the last number
    is special.

    I did this so one could easily test it in the reverse direction: if you decode
    random data and then encode the result, you don't end up with a
    monster file. Here is the hex output for "aaabccca":
    6 BYTES. It could be smaller, since using a whole byte for each number is wasteful.

    E203 codes the b position
    62 codes the c position
    61, by what I call wrapping around, codes the a position
    F903 codes NO MORE NEW SYMBOLS together with the first DC value
    That's it, since there is only one DC value.

    Another example: SNNBAAA, the BWTS of BANANAS,
    codes to 4EC10141CC02, which is 6 bytes. However, by using a different number system it is easy to get to 5 bytes.
    4E is for N
    C101 is for B
    41 is for A
    CC02 is for S; no need for the "no more new symbols" symbol
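
The byte groups listed in the two examples can be recovered from the raw hex by reading the top bit of each byte as a "continue" flag. The sketch below does only that grouping. It is an assumption-laden illustration: `split_numbers` is a hypothetical helper, it does not decode the numeric values, and it does not model the special coding of the last number in the file.

```python
from typing import List

def split_numbers(stream: bytes) -> List[bytes]:
    """Group bytes into numbers: a set top bit means the number continues."""
    groups: List[bytes] = []
    current = bytearray()
    for byte in stream:
        current.append(byte)
        if byte & 0x80 == 0:          # top bit clear: this byte ends the number
            groups.append(bytes(current))
            current = bytearray()
    if current:                       # leftover bytes whose flag was still set
        groups.append(bytes(current)) # (assumption: kept as-is, not interpreted)
    return groups

def as_hex(groups: List[bytes]) -> List[str]:
    return [g.hex().upper() for g in groups]

aaabccca_groups = as_hex(split_numbers(bytes.fromhex("E2036261F903")))
snnbaaa_groups = as_hex(split_numbers(bytes.fromhex("4EC10141CC02")))
print(aaabccca_groups)
print(snnbaaa_groups)
```

Both outputs contain four numbers, which matches the breakdowns given for "aaabccca" and SNNBAAA.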

    Last edited by biject.bwts; 28th February 2012 at 04:26. Reason: corrected english errors

  2. #2
    Join Date
    Dec 2012
    Thanked 71 Times in 44 Posts
    I found a bug in bij_dc.
    input    decoded
    ff    -> ff ff
    fe ff -> fe ff ff

  3. #3
    Member biject.bwts's Avatar
    Join Date
    Jun 2008
    Thanked 14 Times in 10 Posts
    I am not in the best shape. I no longer have most of my code; it was wiped out by a computer virus and my inability to maintain backups. But this is something worth fixing if you are correct. Did you run this using binary files for data? I seem to remember people trying to run many of my programs using pipes and such. My code was for BASIC BINARY FILES. I used to have a test set that included the first 256 basic files plus several more where trouble occurred. That said, I used to have another, updated version that I was going to post. But alas, that is gone.

    If you have a zip, or better a 7z, with all the data files in question, I can recreate the exe and run it to see if I can get the error on my current machine. Anyway, glad someone looked at it.

    If you notice, I think this is the version where I have entries for all the symbols used. In my later version you pretend that every symbol is used, and it still compresses very small since it uses arithmetic coding for the output. In other words, you are not wasting a whole bit per symbol if there are long stretches of unused symbols. If this is wrong, when I get healthier I will try to write a new version again and post it.
