Thread: Even when not uncomputable, optimal compression is intractable

  1. #1
    Member
    Join Date
    Feb 2013
    Location
    San Diego
    Posts
    1,057
    Thanks
    54
    Thanked 72 Times in 56 Posts

    Even when not uncomputable, optimal compression is intractable

    As it turns out, a lot of work has gone into studying the computational complexity of optimal compression under various models (setting aside Kolmogorov's famously uncomputable model). A lot of the work was done by Jim Storer and/or his students. There is a wealth of references on his web page:

    http://www.cs.brandeis.edu/~storer/

    I haven't looked through all of it, and I don't want to risk misquoting something, so I won't restate many results here. But optimal grammar-based compression (of the tree type) is NP-complete, and many other practical compression methods have likewise been proven NP-complete. Brandeis has a compression group (which I've somehow been systematically ignoring) that has studied a great many interesting questions related to compression.
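    To make the grammar-based setting concrete: the goal is a small context-free grammar that generates exactly the input string, and minimizing grammar size is the NP-hard part. As an illustration (my own sketch, not taken from the linked papers), here is a minimal greedy Re-Pair-style compressor: it repeatedly replaces the most frequent adjacent pair with a fresh nonterminal. Being greedy, it gives no optimality guarantee, which is exactly why approximation results like the one in the paper below matter.

    ```python
    from collections import Counter

    def repair(seq):
        """Greedy Re-Pair-style grammar compression.

        Repeatedly replaces the most frequent adjacent pair of symbols
        with a fresh nonterminal, recording one grammar rule per merge.
        Greedy, so not optimal: finding the smallest such grammar is NP-hard.
        Returns (final sequence, rules) where rules maps nonterminal -> pair.
        """
        seq = list(seq)
        rules = {}
        next_id = 0
        while True:
            pairs = Counter(zip(seq, seq[1:]))
            if not pairs:
                break
            pair, count = pairs.most_common(1)[0]
            if count < 2:  # no pair repeats; nothing left to factor out
                break
            nt = ('N', next_id)
            next_id += 1
            rules[nt] = pair
            # Rewrite the sequence, replacing non-overlapping occurrences.
            out, i = [], 0
            while i < len(seq):
                if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                    out.append(nt)
                    i += 2
                else:
                    out.append(seq[i])
                    i += 1
            seq = out
        return seq, rules
    ```

    On "abababab" this first produces rule N0 -> ab, then N1 -> N0 N0, leaving a two-symbol sequence plus two rules.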

    There are too many interesting things to try to stuff them all into one post and do them justice. It would take a week to write the post (not even counting the time to read it). I'll add interesting info as I read more.

    Unfortunately, I couldn't find many free papers. However, here is an interesting one that deals with approximating optimal dictionary compression: http://people.csail.mit.edu/arasala/papers/grammar.pdf

    Here is a short paper showing that the problem of finding optimal binary decision trees is NP-complete. A binary decision tree is a procedure that identifies an object out of a set through a series of binary decisions. Many equivalent trees are possible; a tree is optimal when the expected number of decisions is minimized. A compressor can be viewed as an (infinitely large) decision tree: objects are replaced with the sequence of decisions that identifies them. Clearly this result is applicable to compression; exactly where the boundaries of the affected subset lie requires some consideration.
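    It is worth noting (my observation, not the paper's) that the prefix-code special case is tractable: when every "decision" is a bit of a codeword, Huffman's greedy algorithm minimizes the expected number of decisions in O(n log n). The NP-completeness result concerns more general decision trees. A small sketch computing the optimal expected depth for the prefix-code case:

    ```python
    import heapq

    def huffman_expected_depth(freqs):
        """Expected number of binary decisions (average code length)
        under an optimal prefix code, via Huffman's greedy merging.

        Each merge places one decision node above its two subtrees,
        so it adds the combined weight of those subtrees to the total
        weighted depth; dividing by the total weight at the end gives
        the expected depth per symbol.
        """
        # Heap of (weight, tiebreak); the tiebreak keeps comparisons
        # well-defined when weights are equal.
        heap = [(w, i) for i, w in enumerate(freqs)]
        heapq.heapify(heap)
        counter = len(freqs)
        cost = 0.0
        while len(heap) > 1:
            w1, _ = heapq.heappop(heap)
            w2, _ = heapq.heappop(heap)
            cost += w1 + w2
            heapq.heappush(heap, (w1 + w2, counter))
            counter += 1
        return cost / sum(freqs)
    ```

    For frequencies [1, 1, 2] the optimal tree asks about the heavy symbol first, giving an expected depth of 1.5 decisions instead of 2.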
    Last edited by nburns; 10th July 2014 at 06:13.

  2. Thanks:

    encode (17th July 2014)

  3. #2
    Member
    Join Date
    Feb 2014
    Location
    Canada
    Posts
    17
    Thanks
    23
    Thanked 0 Times in 0 Posts
    Having such a large fraction of important papers (in computer science, no less) behind some type of paywall is one of the tragedies of modern academia. The odd thing is that paywalling provides next to zero benefit to the papers' authors after publication, and comes with significant downsides.

  4. #3
    Member
    Join Date
    Feb 2013
    Location
    San Diego
    Posts
    1,057
    Thanks
    54
    Thanked 72 Times in 56 Posts
    Quote Originally Posted by PSHUFB View Post
    Having such a large fraction of important papers (in computer science, no less) behind some type of paywall is one of the tragedies of modern academia. The odd thing is that paywalling provides next to zero benefit to the papers' authors after publication, and comes with significant downsides.
    I agree. You can find leaks in the paywalls, though, and learn a lot in spite of them.

  5. #4
    Member
    Join Date
    Apr 2012
    Location
    Stuttgart
    Posts
    448
    Thanks
    1
    Thanked 101 Times in 61 Posts
    Quote Originally Posted by PSHUFB View Post
    Having such a large fraction of important papers (in computer science, no less) behind some type of paywall is one of the tragedies of modern academia. The odd thing is that paywalling provides next to zero benefit to the papers' authors after publication, and comes with significant downsides.
    Jim and Brandeis sponsor DCC (the Data Compression Conference), where you find many interesting papers every year, so I assume most of those papers are on IEEE. Most universities (including mine) pay for IEEE services, so academic staff can download papers from the repository. That said, IEEE also sponsors such conferences; whether the price is fair I do not know. The business model of IEEE (and, to an even higher degree, of other publishers) is questionable, though: scientists pay for papers to be published, review papers for free, and then pay again when retrieving them; "triple pay" for a single publication.

    Unfortunately, there is no recognized open-access journal comparable to IEEE's, so there are not really many alternatives.

  6. #5
    Member biject.bwts's Avatar
    Join Date
    Jun 2008
    Location
    texas
    Posts
    449
    Thanks
    23
    Thanked 14 Times in 10 Posts
    From past experience I would say that papers you pay for are often of poor quality, and the ones that are really good don't make it to the people who could use them for the advancement of mankind. I also think that in the 70's and 80's the people actually coding aircraft and missile software, for example, were way ahead of academia. The problem with software today is that the people from the 70's and 80's who were actually good are not in charge of it. Some of these people had only a high school education; most had just a BS in math or engineering. But they were far better than the PhD types lacking real-world experience who are trusted today. In those days the managers were also engineers and scientists. Another major problem is that if someone has a real understanding of a method, it may not get published at all because of the way peer reviews are being done. Just look at the global warming bullshit: you don't get an honest review, or published, if you don't pay homage to the religion of global warming. It kind of reminds me of stories an early math teacher told us about how people fared who dared to believe in irrational numbers.

  7. #6
    Member
    Join Date
    Feb 2013
    Location
    San Diego
    Posts
    1,057
    Thanks
    54
    Thanked 72 Times in 56 Posts
    Quote Originally Posted by biject.bwts View Post
    It kind of reminds me of stories an early math teacher told us about how people fared that dared to believe in irrational numbers.
    Not believing in irrational numbers is like not believing in punctuation. They exist because we say so. Unless you are talking about people who refused to believe that the ratio of the radius to the circumference of a circle is not equal to any ratio of whole numbers. That's a factual issue. I think lots of people misspent their time trying to show otherwise, sort of like compression of random data.
    Last edited by nburns; 15th July 2014 at 20:29.

