Thread: The Hutter Prize

  1. #1
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    Will we be seeing a specially tuned version of PIMPLE or Dark entered sometime in the near future?

    http://prize.hutter1.net/

  2. #2
    Programmer kvark's Avatar
    Join Date
    Aug 2006
    Location
    Toronto, Canada
    Posts
    74
    Thanks
    1
    Thanked 1 Time in 1 Post
    Only for participating?
    As I said before, I won't lose any speed by gaining compression (in fact, hard as it is to believe, I'd gain some speed from it). So if I could improve the ratio, I would do it immediately...

  3. #3
    Moderator

    Both PIMPLE and DARK are already compressing this file really well.

    PIMPLE: 20,992,830

    DARK: 21,231,325

    Alexander Rhatushnyak's PAQ8hp1 compresses it to 17,397,023.

    The text below was copied from the Hutter Prize FAQ:

    The PAQ8 compressors are already so good that it will be difficult to beat them.

    Yes, it will not be easy (nobody said this), but there is likely a lot of room for improvement. PAQ8 models text only. There has been lots of research in language modelling (mostly for speech recognition) at the syntactic and semantic levels, but these are usually offline models on preprocessed text where words are mapped to tokens, and messy details like capitalization, punctuation, formatting, and rare words are removed. So far, nobody has figured out how to integrate these two approaches, but when that happens we will have a very powerful text compressor. Also from Shannon's estimates, that human text contains about 1 bit per character information, enwik8 should be compressible down to 12MB.
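
    To put the sizes quoted in this post next to Shannon's 1 bit per character figure, here is a rough sketch that converts a compressed size into bits per character. It assumes enwik8 is exactly 10**8 bytes and that one byte counts as one character; the function names are made up for illustration. Under those assumptions, 1 bit per character works out to 12.5 million bytes, which the FAQ rounds to 12MB.

```python
# Assumption: enwik8 is exactly 10**8 bytes, and one byte counts as one character.
ENWIK8_BYTES = 10**8

# Compressed sizes in bytes, as quoted in this thread.
compressed_sizes = {
    "PIMPLE": 20_992_830,
    "DARK": 21_231_325,
    "PAQ8hp1": 17_397_023,
}


def bits_per_character(compressed_bytes: int, original_bytes: int = ENWIK8_BYTES) -> float:
    """Average number of output bits spent per input character."""
    return compressed_bytes * 8 / original_bytes


def size_at_rate(bits_per_char: float, original_bytes: int = ENWIK8_BYTES) -> float:
    """Compressed size in bytes if every character cost bits_per_char bits."""
    return original_bytes * bits_per_char / 8


if __name__ == "__main__":
    for name, size in compressed_sizes.items():
        print(f"{name}: {bits_per_character(size):.3f} bits per character")
    print(f"Size at 1 bit per character: {size_at_rate(1.0):,.0f} bytes")
```

    On these numbers all three programs are still well above 1 bit per character, which is the gap the FAQ is pointing at.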

  4. #4
    Moderator

    I'm surprised by the apparent lack of interest in this subject.

  5. #5
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,982
    Thanks
    377
    Thanked 351 Times in 139 Posts
    I think it's similar to compressing random data. In other words, it's just a PR campaign.

  6. #6
    Moderator

    Alexander Rhatushnyak is easily winning this at the moment. His paq8hp4 takes ENWIK8 to 17'039'173 bytes which is now almost a 6% improvement over the baseline.

    If this continues he could soon be 50'000€ richer.

  7. #7
    Member
    Join Date
    May 2006
    Location
    Uruguay
    Posts
    30
    Thanks
    0
    Thanked 1 Time in 1 Post
    If this continues he could soon be 50'000€ richer.

    6% improvement is ~3000€
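
    The arithmetic behind that figure, as a minimal sketch. It assumes the 50'000 is the total prize fund in euros and that the payout scales linearly with the relative improvement over the baseline, which is what the numbers in this post imply. The function names are hypothetical, and the baseline size is not given in this thread, so it is left as a parameter.

```python
PRIZE_FUND_EUR = 50_000  # total fund, as quoted in the thread


def relative_improvement(baseline_bytes: int, new_bytes: int) -> float:
    """Fraction by which new_bytes is smaller than baseline_bytes."""
    return (baseline_bytes - new_bytes) / baseline_bytes


def payout_eur(improvement: float, fund: float = PRIZE_FUND_EUR) -> float:
    """Payout under the assumed linear rule: fund times fractional improvement."""
    if not 0.0 <= improvement <= 1.0:
        raise ValueError("improvement must be a fraction between 0 and 1")
    return fund * improvement


if __name__ == "__main__":
    # 6% of the fund, matching the rough figure in this post.
    print(f"{payout_eur(0.06):.0f} EUR")
```

    So the full 50'000 would only be reached at a 100% improvement, i.e. compressing the file to nothing.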

  8. #8
    Moderator

    PAQ8hp5 compresses ENWIK8 to 16'898'402 bytes, which is almost a 7% improvement over the baseline.

    Awesome!
