Thread: Guide to the new compression

  1. #1
    Programmer
    Join Date
    Jul 2008
    Location
    Finland
    Posts
    102
    Thanks
    0
    Thanked 1 Time in 1 Post

    Guide to the new compression

    I've been waiting for new compressors to enter the top rankings of current
    benchmarks, due to the Kuhnian paradigm shift that was announced by the
    experts here at encode.su. The fruits of this new knowledge have not been
    put to use by amateurs so far. To fix this, here is my guide to how you can
    write a compressor that is top ranked on various benchmarks. You do
    not need any experience or expertise in compression (as explained by
    Shelwien); furthermore, you need only about an hour of your time to do
    this. Because I don't think my theories about how to do this are relevant,
    I will keep only to the fresh theories presented here at encode.su by
    the experts.

    The guide is based on applying the new theories to the knowledge of
    nz. As we know it was very aggressively nullified by experts and
    geniuses here by announcements that it is a set of magic filters.
    The expert Shelwien speculated that some "general programming" is
    required to make it, defining his term as "the stuff like described
    in Knuth's books". This guide will present a way to do a session of
    general programming to summon magic filters for some good and proven
    quality piece of work like bzip2 (or gzip) and make it to the top
    rankings in benchmarks.

    How difficult is it to write a magic filter? Michael Maniscalco
    provided the answer to this: "anybody can write a filter". Very rarely
    does it happen that one can agree with Michael, but here we can.

    So how much time do you need to reach the top rankings? For nz,
    about 5% of the time was spent on the magic filters. After being
    unable to generate data where blizzard would outperform nz, even
    though he was able to generate data where the nz filters cause
    compression and speed loss, Christian Martelock declared: "Blizzard was
    written on one day". So we use this as our timeframe. Suppose the
    genius woke up early, began working on his compressor at 9:00am
    and arrived at version 0.24 at 9:00pm. That gives about 10 hours
    of effective working time on this busy day, and 5% of this time is
    30 minutes. Since we are not all geniuses, suppose we take slightly
    longer to summon the filters and do some general programming.
    Using these estimates, I presume it takes you one hour in total.

    So what results can we expect? Let's derive an approximation by
    looking at the enwik9 results:

    nanozipltcb 166,251,135 348s 185s
    bzip2 1.0.2 -9 253,977,839 379s 129s

    Suppose you have some skill, then we can expect much better
    results, something like this:

    <your_compressor> 100,000,000 ???s ???s
    M99 v2.1 178,910,174 713s 535s

    So you are also eligible for the Hutter Prize. That is good because
    it means the expert Shelwien will consider that you have something he
    describes as "compression skill".

    The guide is divided into two parts: the first part is for "general
    programming" and the second part is for summoning the magic filters.

    General programming:

    1. go to amazon.com and buy TAOCP
    2. download bzip2 source code
    3. skim TAOCP for shellsort optimizations
    4. apply shellsort optimizations to blocksort.c
    5. remove the blocksize limitations from the blocksort algorithm;
    if you get assert failures or segfaults, just replace the whole sort
    with the best algorithm given by Knuth
    6. scan the first kb of a file for the word "the" and, if it exists,
    apply the magic text filter to it, or, if the file begins with MZ,
    apply the magic exe filter to it.
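
    Step 6 above amounts to a small dispatch on the first kilobyte of the
    file. A minimal sketch in Python, assuming the MZ check runs first and
    that "the" is matched as a whole word (the post does not say either
    way). The name pick_filter and the returned labels are hypothetical
    and are not taken from bzip2 or nz:

```python
import re


def pick_filter(head: bytes) -> str:
    """Guess which filter to apply from the start of a file.

    Returns "exe", "text" or "none". Both the labels and the order of
    the checks are assumptions made for this sketch.
    """
    head = head[:1024]  # only the first kb is inspected
    if head.startswith(b"MZ"):  # DOS/Windows executables begin with "MZ"
        return "exe"
    if re.search(rb"\bthe\b", head):  # crude test for English text
        return "text"
    return "none"
```

    For example, pick_filter(open(path, "rb").read(1024)) would be called
    once per input file before the data is handed to the block sorter.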

    Summon Magic Filters:

    1. before bzip2, filter the data to remove capital letters by
    inserting flags, so that "Genius" becomes "#genius".
    2. surround words and flags by spaces, so that "expert Shelwien"
    becomes " expert # shelwien ".
    3. replace common words with dictionary indexes, such that "Magic Filter"
    becomes " # 5 # 9 ".
    4. after encountering E8 or E9 bytes, add the pointer address
    to the following doubleword (if you are unsure what a doubleword
    or a pointer is, just look it up on Wikipedia)
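
    The four steps above can be sketched roughly as follows. This is only
    an illustration under stated assumptions, not nz's actual filters: the
    dictionary is hypothetical (the indexes 5 and 9 are taken from the
    example in step 3), only ASCII letters count as words, and "the
    pointer address" in step 4 is assumed to mean the offset of the byte
    after the doubleword, as in the usual x86 CALL/JMP transform.

```python
import re
import struct

# Hypothetical dictionary; 5 and 9 come from the example in step 3.
DICTIONARY = {"magic": 5, "filter": 9}


def text_filter(text: str, dictionary=DICTIONARY) -> str:
    """Steps 1-3: flag capitals, space-separate tokens, index common words."""
    out = []
    for word in re.findall(r"[A-Za-z]+", text):
        if word[0].isupper():
            out.append("#")  # flag marking a capitalised word
            word = word[0].lower() + word[1:]
        out.append(str(dictionary.get(word, word)))
    return " " + " ".join(out) + " "


def _e8e9(data: bytes, sign: int) -> bytes:
    """Add (sign=+1) or subtract (sign=-1) the position after each E8/E9 operand."""
    buf = bytearray(data)
    i = 0
    while i + 5 <= len(buf):
        if buf[i] in (0xE8, 0xE9):  # x86 CALL rel32 / JMP rel32 opcodes
            (value,) = struct.unpack_from("<I", buf, i + 1)
            value = (value + sign * (i + 5)) & 0xFFFFFFFF  # wrap to 32 bits
            struct.pack_into("<I", buf, i + 1, value)
            i += 5  # skip the operand so the decoder sees the same opcodes
        else:
            i += 1
    return bytes(buf)


def e8e9_encode(data: bytes) -> bytes:
    """Step 4: turn relative targets into absolute ones."""
    return _e8e9(data, +1)


def e8e9_decode(data: bytes) -> bytes:
    """Inverse of e8e9_encode."""
    return _e8e9(data, -1)
```

    Note that text_filter as written throws away punctuation and the
    original spacing, so it is not reversible; a usable filter would also
    have to escape those, along with any "#" or digits already in the
    input. The E8/E9 part is reversible as it stands.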

    Now you are done. To study your work more closely, permute sfc.tar
    or generate some other data until you get the results you want. Compare
    the results for various permuted and randomized files and see if
    everything is working as the orthodoxy supposes.

    I hope this guide is useful for aspiring amateurs and dilettantes
    like toffer and osmanturan if they have time to do some general
    programming instead of accusing other work of being clones etc. Good luck.
    Just ask Shelwien, Christian or me if something isn't clear.

  2. #2
    Programmer
    Join Date
    Feb 2007
    Location
    Germany
    Posts
    420
    Thanks
    28
    Thanked 151 Times in 18 Posts
    Sorry, I've only read the first couple of lines - but whatever it is, you should really let it go, Sami. Everyone appreciates your work and your compressors, but well...
    Maybe Ilia should close this thread before it becomes another flame-war filled with senseless accusations and half-truths.

  3. #3
    Member Vacon's Avatar
    Join Date
    May 2008
    Location
    Germany
    Posts
    523
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Hello everyone,

    Originally Posted by Sami:
    [...] Because I don't think my theories about how to do this are relevant,
    [...]
    I hope this guide is useful for aspiring amateurs and dilettantes
    like toffer and osmanturan if they have time to do some general
    programming instead of accusing other work of being clones etc. Good luck.
    Just ask Shelwien, Christian or me if something isn't clear.
    I'm _not_ a programmer, but at least to my knowledge some members of the forum asked here for some theories. They got _nothing_ from you (<- Sami)!
    And I really ask myself whether writing a surprisingly _good_ compression algo has to come along with an invisible (and un-guessable) ability in social communication.
    In other forums, people behaving like that (similar to a troll) would soon have been thrown out.
    Please be more polite!
    Just my two cents...

  4. #4
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,979
    Thanks
    376
    Thanked 347 Times in 137 Posts
    I think someone thinks that it is impossible to be banned at this forum... I'm not so sure about that...

  5. #5
    Member
    Join Date
    Sep 2007
    Location
    Denmark
    Posts
    870
    Thanks
    47
    Thanked 105 Times in 83 Posts
    I must admit, being a heavy user of nanozip, that it seems to me Sami takes things way too personally, and every test where NZ doesn't come out on top he takes as an assault on his coder.

    Here I'm thinking about his response to my simple test of MMA vs NZ on a cue/bin file.

    Sami... your attitude seems borderline paranoid.
    Just focus on nanozip. I'm looking forward to every new version.



    --- edit ---
    Mod, please feel free to delete this if you find it too offensive
    Last edited by SvenBent; 19th August 2008 at 23:37.

  6. #6
    Programmer
    Join Date
    Jul 2008
    Location
    Finland
    Posts
    102
    Thanks
    0
    Thanked 1 Time in 1 Post
    I find it interesting, but expected, that each of my points about there being a number of trolls in this forum is now answered by the reverse and by personal slander. This is due to the very authoritarian and even totalitarian (now thanks to encode) atmosphere in this forum.

    SvenBent, what is it about my suggestion, that you should use WAV format when comparing audio compression, that is not clear? My campaign for the last 10 years has been a consistent one: use meaningful test data to produce meaningful results.

  7. #7
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    Guys, please cool it. This is getting a little over the top now.
