Results 1 to 13 of 13

Thread: Papers - Summary of Data Compression Algorithms

  1. #1
    Member
    Join Date
    Aug 2014
    Location
    Argentina
    Posts
    538
    Thanks
    238
    Thanked 92 Times in 72 Posts

    Papers - Summary of Data Compression Algorithms

    Even when in this forum lies the probably largest source of information about data compression, I want to share with you some documents, including a very useful brief description of numerous algorithms from RLE to PAQ, including BWT, LZ*, PPM, etc.
    The purpose is to resume the almost confusing bunch of papers about compression, specially for newbies' well . Actually, I am one of those...
    After having this overview can be done further investigation on the chosen algorithm..

    The link

    The summary:




    Note: Don't be afraid to add your preferred papers in a clean way.
    Also, as I don't want to make public a somewhat defective text, nor even a word, please be that gently and point out any wrong or unclear redaction via PM. I'll correct them as soon as I take notice. Thanks in advance.
    Last edited by Gonzalo; 12th January 2015 at 07:13.

  2. Thanks (2):

    jman (14th April 2017),just a worm (16th January 2015)

  3. #2
    Member
    Join Date
    Aug 2014
    Location
    Argentina
    Posts
    538
    Thanks
    238
    Thanked 92 Times in 72 Posts
    Another easy to understand document about LZ derivatives.

  4. #3
    Member
    Join Date
    Aug 2014
    Location
    Argentina
    Posts
    538
    Thanks
    238
    Thanked 92 Times in 72 Posts
    And the well known, must-have Data Compression Explained


    Edit: on this post I'll be writing new additions, including yours. That's in order to maintain a clean presentation.

    -----------------------------------------------------------------------------------

    The Data Compression Book
    - 370 pages...

    -----------------------------------------------------------------------------------

    Added by Bulat Ziganshin:
    http://www.drdobbs.com/cpp/data-comp...0169251?pgno=1

    -----------------------------------------------------------------------------------

    From the same site as above:

    * LZW data compression - Include demonstration code.

    * A simple data-compression technique - Look at page 3 first. Fast compression in comparable ratio with LZW. Code included.

    * A new algorithm for data compression - Look at page 3 first. Algorithm: "Byte Pair Encoding"

    * Improving data compression fidelity without sacrificing speed - Now lossy is not that lossy.

    * The enduring challenge of compressing random data - Or how the numbers laugh at you.

    * Combinatorial data compression - Yet another article against universal compression.

    * Data Compression with the Burrows-Wheeler transform

    * Differential compression algorithms

    * Dynamic Markov Compression - Go to page 5 first.

    * An algorithm for online data compression - "LZH-Light" LZ77+Huffman. Go to page 5 first.

    * Image compression using Laplacian Pyramid Encoding - Go to page 12 first.

    * Image compression with Wavelets

    * Faster fractal compression - Speeding up the process. Go to page 9 first.

    * Audio Compression

    * Digital speech compression - About GSM standard. Go to page 5 first.

    * MPEG-2 and video compression

    * H.264 and video compression

    * Easy analog data compression - Delta applied to medicinal data logging.

    * Compression shrinks digital mammograms down to practical size - Less than 1% using wavelet...

    * Bounding Box data compression - Very simple bitmap fonts compressor. Go to page 17.

    Note: the documents in this section are mostly old. Despite that, they constitute a good source of original knowledge. Almost everyone have source code included.

    -----------------------------------------------------------------------------------

    At WikiBooks:

    * Data Coding Theory

    * Data Compression

    -----------------------------------------------------------------------------------

    More incoming... let the comments show up
    Last edited by Gonzalo; 20th January 2015 at 08:18. Reason: Content update

  5. #4
    Member just a worm's Avatar
    Join Date
    Aug 2013
    Location
    planet "earth"
    Posts
    96
    Thanks
    29
    Thanked 6 Times in 5 Posts
    Quote Originally Posted by Gonzalo
    And the well known, must-have Data Compression Explained
    You don't miss much if you haven't read this one.
    Quote Originally Posted by Data Compression Explained
    Symbol ranking, or move-to-front (MTF), is a transform
    Move to front is a conversion algorithm, not a transformation algorithm. A transformation algorithm moves the elements around but does not change the value (like the Wheelers Transform does) while a conversion algorithm changes the value but does not change the position of the elements (like a conversion from ISO 8859-1 to UTF-8 or MTF does).

  6. #5
    Member
    Join Date
    Oct 2013
    Location
    Filling a much-needed gap in the literature
    Posts
    350
    Thanks
    177
    Thanked 49 Times in 35 Posts
    just a worm,

    A transform can change the values around. Think FFT---there's nothing in the output of a Fourier transform that corresponds one-to-one and literally with anything in the input.

    MTF is a transform. You can invert it and get all the original symbols back, just as you can invert the Fourier transform to go from the frequency domain back to the original domain.

  7. #6
    Member
    Join Date
    Jan 2014
    Location
    Bothell, Washington, USA
    Posts
    695
    Thanks
    153
    Thanked 183 Times in 108 Posts
    Doesn't jpeg use a discrete cosine transform?

  8. #7
    Member
    Join Date
    Oct 2013
    Location
    Filling a much-needed gap in the literature
    Posts
    350
    Thanks
    177
    Thanked 49 Times in 35 Posts
    Quote Originally Posted by Kennon Conrad View Post
    Doesn't jpeg use a discrete cosine transform?
    Yes. The DCT (Discrete Cosine Transform) is the most common version of the FFT (Fast Fourier Transform), which is the most common implementation of the Fourier transform. It's used in jpg and lots of other image & video compression algorithms.

  9. #8
    Member biject.bwts's Avatar
    Join Date
    Jun 2008
    Location
    texas
    Posts
    449
    Thanks
    23
    Thanked 14 Times in 10 Posts
    I don't like your write up on BWT for several reasons.
    1) The original BWT did not use a EOF character to be added to the string so from a historical point of view its wrong.
    2) I feel if talking about the Real BWT or the more used one where an EOF character is added one like you used you should point out that in general most such strings formed at random would not have a UNBWT which means the UNBWT needs to have special handling because it can fail for most files or strings. There is no easy way to check a file or string to see if it has a UNBWT with out doing most of the work needed in doing an actually UNBWT.
    4) There is an alternate BWT that solves these problems called the BWTS and it is a pure permutation of the string or file and always has an inverse. Also the information can be stored in the same amount of space as original data without an index or specially placed EOF character in some string.
    5) In today's world files are often compressed and then encrypted to be sent around the world on the net. If one used Real BWT or the Modifed one you described. If one had a handful or possible encryption keys it most likely most false keys could be immediately rejected do to the poor choice of using a nonbijective method of BWT

    Why not discuss the orignal at least for historical reasons

    Then discuss the current common approach like you did

    And then talk about the bijective BWT approach.

    Just my few cents on it.

  10. #9
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,572
    Thanks
    780
    Thanked 687 Times in 372 Posts

  11. #10
    Member
    Join Date
    Aug 2014
    Location
    Argentina
    Posts
    538
    Thanks
    238
    Thanked 92 Times in 72 Posts
    Added 24 new items. Please refer to post #3.

  12. #11
    Member just a worm's Avatar
    Join Date
    Aug 2013
    Location
    planet "earth"
    Posts
    96
    Thanks
    29
    Thanked 6 Times in 5 Posts
    .
    Last edited by just a worm; 6th February 2015 at 02:38.

  13. #12
    Member
    Join Date
    Feb 2013
    Location
    San Diego
    Posts
    1,057
    Thanks
    54
    Thanked 72 Times in 56 Posts
    Quote Originally Posted by Bulat Ziganshin View Post
    Is that a good article? I skimmed it, and I have not personally written an arithmetic coder, but I have looked at the code for a few. I thought it was odd that they used floating points. There's also a bit too much c++ for my taste. Those aren't fatal flaws for an article, and the rest may be on point. It just seems like you could probably save time by skipping the extra stuff and seeing the way experts write it right from the start. Matt's arithmetic coder is pretty simple and readable as-is.
    Last edited by nburns; 10th February 2015 at 07:13.

  14. #13
    Member
    Join Date
    Feb 2013
    Location
    San Diego
    Posts
    1,057
    Thanks
    54
    Thanked 72 Times in 56 Posts
    Referring to DCE:
    Quote Originally Posted by just a worm View Post
    You don't miss much if you haven't read this one.
    There are some valuable parts. I think the biggest flaw is the way Matt's opinions are mixed in with the objective information. I think it's probably a good distance removed from what might be considered a standard introduction to compression.
    Last edited by nburns; 10th February 2015 at 13:35.

Similar Threads

  1. loseless data compression method for all digital data type
    By rarkyan in forum Random Compression
    Replies: 244
    Last Post: 23rd March 2020, 16:33
  2. Papers 2011
    By BetaTester in forum Download Area
    Replies: 2
    Last Post: 5th July 2015, 10:47
  3. Sorting Algorithms Benchmark
    By Piotr Tarsa in forum The Off-Topic Lounge
    Replies: 39
    Last Post: 1st June 2015, 09:38
  4. Replies: 2
    Last Post: 18th April 2011, 04:13
  5. identity of obscure algorithms
    By asmodean in forum Data Compression
    Replies: 2
    Last Post: 6th August 2009, 08:50

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •