Page 2 of 2
Results 31 to 40 of 40

Thread: Bit Archive Format

  1. #31
    Programmer osmanturan's Avatar
    Join Date
    May 2008
    Location
    Mersin, Turkiye
    Posts
    651
    Thanks
    0
    Thanked 0 Times in 0 Posts
    At least, can I ask a question? Why did you merge match lengths with literals in a single 9-bit alphabet? What are the advantages and disadvantages of this? What are the advantages and disadvantages of separating them?
    BIT Archiver homepage: www.osmanturan.com

  2. #32
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    4,002
    Thanks
    391
    Thanked 377 Times in 147 Posts
    Quote Originally Posted by osmanturan
    Why did you merge match lengths with literals in a single 9-bit alphabet? What are the advantages and disadvantages of this? What are the advantages and disadvantages of separating them?
    Well, I tried lots of schemes. This one came out the best and the simplest. I noticed that other authors, like Igor Pavlov and Malcolm Taylor, use different approaches with flags and other stuff... I don't know, it's just that with my implementations the LZH-like approach comes out the best.
    With my scheme, match lengths are encoded using an order-N context, just as literals are. This gives a compression gain. I repeat, this is about my implementations and according to my tests...
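    For readers following along, here is a minimal sketch of what a merged 9-bit (512-symbol) alphabet could look like. It is an illustration only, not encode's actual code: the layout (0..255 for literals, 256..511 for match lengths), the `MIN_MATCH` value, and the order-1 count table are all assumptions made for the example.

```python
NUM_SYMBOLS = 512  # 9-bit alphabet: 256 literals + 256 match-length codes
MIN_MATCH = 3      # assumed minimum match length (not stated in the thread)


def literal_symbol(byte):
    """Literals map directly onto symbols 0..255."""
    return byte


def match_symbol(length):
    """Match lengths map onto symbols 256..511."""
    code = length - MIN_MATCH
    if not 0 <= code < 256:
        raise ValueError("match length out of range for this sketch")
    return 256 + code


def decode_symbol(sym):
    """Recover what a symbol of the merged alphabet stands for."""
    if sym < 256:
        return ("literal", sym)
    return ("match", sym - 256 + MIN_MATCH)


class Order1Counts:
    """Adaptive counts over the merged alphabet, one table per previous byte.

    Because literals and match lengths share one table, both are
    modelled under the same context.
    """

    def __init__(self):
        self.tables = {}

    def update(self, prev_byte, sym):
        table = self.tables.setdefault(prev_byte, [0] * NUM_SYMBOLS)
        table[sym] += 1

    def count(self, prev_byte, sym):
        table = self.tables.get(prev_byte)
        return table[sym] if table is not None else 0
```

    A real coder would feed these counts to an arithmetic or Huffman coder; that part is left out here.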


  3. #33
    Programmer osmanturan's Avatar
    Join Date
    May 2008
    Location
    Mersin, Turkiye
    Posts
    651
    Thanks
    0
    Thanked 0 Times in 0 Posts
    I think a single bit can distinguish a literal from a match. If we assume 1 indicates a literal and 0 indicates a match, the main stream can be coded like this:

    [0][Match Length][Match Index][1][Literal]...

    This way we can use three different entropy coding approaches, one for each field (match length, match index, literal). That should give a compression gain. Isn't that right? I'm not sure.
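    The layout above can be sketched as follows. This is only an illustration under assumptions: the token tuples and the function names `split_streams` and `merge_streams` are made up for the example, and the entropy coders that would consume each stream are omitted.

```python
def split_streams(tokens):
    """Split parsed tokens into a flag stream plus three payload streams.

    tokens: list of ("lit", byte) or ("match", length, index).
    Flag 1 marks a literal, flag 0 marks a match, as in the layout above.
    """
    flags, literals, lengths, indexes = [], [], [], []
    for tok in tokens:
        if tok[0] == "lit":
            flags.append(1)
            literals.append(tok[1])
        else:
            flags.append(0)
            lengths.append(tok[1])
            indexes.append(tok[2])
    return flags, literals, lengths, indexes


def merge_streams(flags, literals, lengths, indexes):
    """Inverse of split_streams: rebuild the token list from the streams."""
    lit_it, len_it, idx_it = iter(literals), iter(lengths), iter(indexes)
    tokens = []
    for flag in flags:
        if flag == 1:
            tokens.append(("lit", next(lit_it)))
        else:
            tokens.append(("match", next(len_it), next(idx_it)))
    return tokens
```

    Each payload stream can then get its own model, which is the separation being asked about. Whether it beats a merged alphabet depends on the implementation and the data.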
    BIT Archiver homepage: www.osmanturan.com

  4. #34
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    4,002
    Thanks
    391
    Thanked 377 Times in 147 Posts
    Quote Originally Posted by osmanturan
    Isn't that right? I'm not sure.
    Try for yourself.

    I spent a few years arriving at what you see. Of course, I've drawn a few theoretical conclusions about the approaches I use. In other words, be sure that I tried almost ALL known variants, including the variant you described above.

  5. #35
    Programmer osmanturan's Avatar
    Join Date
    May 2008
    Location
    Mersin, Turkiye
    Posts
    651
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Okay, thanks for the advice.
    BIT Archiver homepage: www.osmanturan.com

  6. #36
    Member
    Join Date
    Dec 2006
    Posts
    611
    Thanks
    0
    Thanked 1 Time in 1 Post
    Quote Originally Posted by osmanturan
    What are the advantages and disadvantages of this?
    Go through the forums as encode recommended, because there are a lot of ideas posted, AND a lot of those labelled "improvement guaranteed" didn't turn out to be that good in encode's programs. You will have to try what works best for you.

  7. #37
    Programmer osmanturan's Avatar
    Join Date
    May 2008
    Location
    Mersin, Turkiye
    Posts
    651
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Hi everyone,
    Bit v0.2 compressor is coming! The change list is:

    1 - ROLZ match context has been changed to order-1. The old implementation was order-2.
    2 - ROLZ index range has been increased from 64 to 32768.
    3 - Literal / match length encoding has been separated. Also, match index encoding has been changed.

    New compression results are:
    1 - valley.cmb -> 8.881.405 bytes
    2 - calgary corpus -> 831.253 bytes
    Old compression results were:
    1 - valley.cmb -> 9.569.057 bytes
    2 - calgary corpus -> 952.156 bytes
    As you can see, there is a 7-13% compression gain! Note that, placed in the old comparison table, Bit v0.2 ranks 4th for both files!
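    The gain can be checked from the sizes listed above. A small sketch (the helper name `gain_percent` is made up for the example) gives roughly 7.2% for valley.cmb and 12.7% for the Calgary corpus:

```python
def gain_percent(old_size, new_size):
    """Size reduction of the new result relative to the old one, in percent."""
    return 100.0 * (old_size - new_size) / old_size


# (old size, new size) in bytes, taken from the results in this post
results = {
    "valley.cmb": (9569057, 8881405),
    "calgary corpus": (952156, 831253),
}

gains = {name: gain_percent(old, new) for name, (old, new) in results.items()}

for name, gain in gains.items():
    print(f"{name}: {gain:.1f}% smaller")
```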

    Currently, it is very slow. So I'll publish it after reviewing the code!
    BIT Archiver homepage: www.osmanturan.com

  8. #38
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    4,002
    Thanks
    391
    Thanked 377 Times in 147 Posts
    Quote Originally Posted by osmanturan
    calgary corpus -> 831.253 bytes
    Looking forward!

  9. #39
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,537
    Thanks
    758
    Thanked 676 Times in 366 Posts
    Quote Originally Posted by osmanturan
    1 - ROLZ match context has been changed to order-1. The old implementation was order-2.
    2 - ROLZ index range has been increased from 64 to 32768.
    These settings are exactly the same as in rolz3. Ilya, am I right?

  10. #40
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    4,002
    Thanks
    391
    Thanked 377 Times in 147 Posts
    Quote Originally Posted by Bulat Ziganshin
    These settings are exactly the same as in rolz3. Ilya, am I right?
    Quote Originally Posted by Malcolm Taylor
    With order 1 context, you will find you want at least 1024 offsets. The
    earlier ROLZ implementations called this the table size (since the
    offsets were stored in a circular table). Without optimal parsing, you
    will probably find that a larger table size can sometimes hurt
    compression, but this problem disappears with optimal parsing.

    The ROLZ2 and ROLZ3 algorithms both are able to use table sizes of 32000
    (older versions were restricted to 32k) and more, in fact if you ask for
    a 64MB model size you'll be using a table size of 64000 (IIRC). This
    means it can efficiently encode very large files.
    Yep!
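
    To make the "table size" in the quote concrete, here is a rough sketch of a per-context circular offset table for an order-1 ROLZ scheme. It is not the ROLZ2/ROLZ3 or Bit code: the class name `RolzTable`, its methods, and the default size are assumptions for illustration only.

```python
class RolzTable:
    """Circular table of recent positions, one ring per order-1 context.

    A match is referenced by an index into the ring for the current
    context (the previous byte) rather than by a raw offset.
    """

    def __init__(self, table_size=1024):
        self.table_size = table_size
        self.rings = [[] for _ in range(256)]  # one ring per previous byte
        self.heads = [0] * 256                 # insertions seen per context

    def insert(self, ctx, pos):
        """Record that position pos was seen after context byte ctx."""
        ring = self.rings[ctx]
        if len(ring) < self.table_size:
            ring.append(pos)
        else:
            # Ring is full: overwrite the oldest entry.
            ring[self.heads[ctx] % self.table_size] = pos
        self.heads[ctx] += 1

    def lookup(self, ctx, index):
        """Return the stored position for index (0 = most recent)."""
        count = min(self.heads[ctx], self.table_size)
        if not 0 <= index < count:
            raise IndexError("no such entry for this context")
        slot = (self.heads[ctx] - 1 - index) % self.table_size
        return self.rings[ctx][slot]
```

    A larger `table_size` lets the coder reach older positions but widens the range of indexes that must be encoded, which fits the remark that bigger tables can hurt without optimal parsing.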
