Thread: Files ordering + content compression

  1. #1
    Member CompressMaster
    Join Date
    Jun 2018
    Location
    Lovinobana, Slovakia
    Posts
    198
    Thanks
    58
    Thanked 15 Times in 15 Posts

    Files ordering + content compression

    Hello users,

    I need to compress the following set of data:

    1.TXT - AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
    2.TXT - AAAAAAAAAAAAA
    3.TXT - AAAAAAAAAAAAAAAAAAAAAAAAA
    4.TXT - AAAAAAAAAAAAA
    5.TXT - AAAAAAAAAAAAAAAAAAAAAAAAAAAA
    6.TXT - AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAA
    7.TXT - AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
    and so on...

    Description: Each text file consists of a single repeated character (in this case it's "A", but it can be replaced with any other symbol from the ASCII table) that I generated RANDOMLY myself, but in this case the randomness will be irrelevant, because each file contains only one letter.
    Filenames will be in descending order.

    My questions:
    1. Since the aforementioned text files will be in descending order, I suppose that I don't need to store the filenames at all... they would follow directly from the file contents. Thus the set COULD BE highly compressible, I suppose.
    2. What about smart ordering algorithms? I mean: files "2.txt" and "4.txt" contain the same number of symbols, so that could be simplified... but I don't know how.
    3. If the "A"s cause problems with the compression ratio, it is, of course, possible to replace the string with any other symbols (interleaving is also possible), but the length must be preserved as it is.

    Thanks a lot.
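
    A minimal sketch of how questions 1 and 2 could be read: if every file is one symbol repeated N times, the whole set reduces to that symbol plus a list of lengths, and the filenames are implied by position. The names pack and unpack, the "<index>.TXT" naming rule, the varint encoding and the sample lengths are all assumptions made for illustration; the lengths are not counted from the files listed above.

```python
# Sketch only: store one symbol plus a list of run lengths.
# Assumption: the i-th stored length belongs to the file "<i>.TXT".

def encode_varint(n: int) -> bytes:
    """LEB128-style varint: 7 bits per byte, high bit set means more bytes follow."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)

def pack(symbol: str, lengths: list[int]) -> bytes:
    """First byte is the symbol, then one varint per file length."""
    data = bytearray(symbol.encode("ascii"))
    for n in lengths:
        data += encode_varint(n)
    return bytes(data)

def unpack(blob: bytes) -> dict[str, str]:
    """Rebuild {filename: content}; filenames come from position only."""
    symbol = chr(blob[0])
    files = {}
    index = 1
    value = shift = 0
    for byte in blob[1:]:
        value |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7          # more bytes of this length follow
        else:
            files[f"{index}.TXT"] = symbol * value
            index += 1
            value = shift = 0
    return files

lengths = [39, 13, 25, 13, 28]   # hypothetical lengths, not taken from the post
blob = pack("A", lengths)
restored = unpack(blob)
```

    With lengths below 128 each file costs one byte, so the five sample files fit in six bytes. Equal lengths (as with 2.txt and 4.txt) simply show up as repeated numbers in the list; for a list this short there is little left to gain from reordering.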

  2. #2
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,497
    Thanks
    26
    Thanked 132 Times in 102 Posts
    Conventional compression programs aren't designed to approach the Kolmogorov complexity (https://en.wikipedia.org/wiki/Kolmogorov_complexity) of unusual files. If you have a short generator program, then it's probably smaller than the output of any conventional file compressor.
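
    A rough illustration of the generator point, assuming a file of 1000 identical bytes (a made-up size) and using Python's zlib as a stand-in for "any conventional file compressor". The five-byte description below does not include the size of the generator code itself, which a proper Kolmogorov-style accounting would have to count.

```python
import struct
import zlib

symbol, count = b"A", 1000          # hypothetical file: 1000 copies of "A"
original = symbol * count

# "Generator input": one byte for the symbol, four bytes for the count.
description = struct.pack("<cI", symbol, count)

def generate(desc: bytes) -> bytes:
    """Tiny generator: expand (symbol, count) back into the full file."""
    sym, n = struct.unpack("<cI", desc)
    return sym * n

deflated = zlib.compress(original, 9)

print(len(original), len(deflated), len(description))
```

    Both routes restore the file exactly; the difference is only in how many bytes have to be stored.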

    Look at: http://mattmahoney.net/dc/#sharnd
    If you're able to compress the file http://mattmahoney.net/dc/sharnd_challenge.dat so that the size of your program plus the size of the compressed data is smaller than the original file, then you've basically broken the cryptographic hash algorithm and/or correctly guessed the input to the generator. So far nobody has done that, and the challenge seems uninteresting, just like your challenges.

  3. #3
    Programmer Bulat Ziganshin
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,572
    Thanks
    780
    Thanked 687 Times in 372 Posts
    I bet that all these approaches are just circling around random data compression.

  4. #4
    Member Gotty
    Join Date
    Oct 2017
    Location
    Switzerland
    Posts
    554
    Thanks
    356
    Thanked 356 Times in 193 Posts
    Quote Originally Posted by Shelwien View Post

  5. #5
    Member Gotty
    Join Date
    Oct 2017
    Location
    Switzerland
    Posts
    554
    Thanks
    356
    Thanked 356 Times in 193 Posts
    Originally Posted by CompressMaster:
    "I need to compress the following set of data"
    The questions are, as always: how did you generate those files? What do they represent?
    And why would you like to compress these files? Are there any practical reasons, or would you just like to experiment with them?

    As I told you earlier in your other threads: we can help you better (or help you at all), if you tell us this information.

    If you generate something random, please don't ask us how to compress it. In that case you really must take Shelwien's advice and deepen your knowledge in information theory, and Gonzalo's advice to reread our previous posts with attention.

  6. #6
    Member CompressMaster
    Join Date
    Jun 2018
    Location
    Lovinobana, Slovakia
    Posts
    198
    Thanks
    58
    Thanked 15 Times in 15 Posts
    I need to compress the files only for fun. They don't represent anything.

    OK, so I won't talk about randomness.
