
Thread: My new compression algorithm

  1. #1
    Member
    Join Date
    Jan 2016
    Location
    Vietnam
    Posts
    15
    Thanks
    2
    Thanked 0 Times in 0 Posts

My new compression algorithm

Hello, I recently invented a new algorithm. I think it is new because it is not based on entropy or on data repetition like the LZ family or arithmetic coding. Because I am not a software developer, I have not created a compression application yet. All information below is only my prediction.
My algorithm has some advantages:
1. It can compress ALL TYPES OF FILES, including files compressed by itself, so we can perform multi-level compression to maximize the compression ratio (the ratio can reach 100:1 if necessary).
2. Compression speed is not affected by the input string; it is faster than arithmetic coding but slower than LZW. Moreover, my algorithm does not consume as much memory and CPU as PAQ (it is close to the LZ family); the more memory and CPU speed we have, the more compression speed and ratio we gain.
3. Decompression is about 4 times faster than compression (we compress once to decompress millions of times).
4. Security feature: encoding produces a control string about 1/1000 the length of the source string, and the data cannot be decompressed if that string is hidden.
Disadvantage:
Low speed: every compress-decompress cycle takes as long as the LZ family but reduces the source string by less than 5.5% (this may improve with future research), because we are compressing incompressible data.

Is it valuable enough to develop an application?
    Last edited by tefara; 7th January 2016 at 14:58.

  2. #2
    Member
    Join Date
    Nov 2015
    Location
Ślůnsk, PL
    Posts
    81
    Thanks
    9
    Thanked 13 Times in 11 Posts
    Quote Originally Posted by tefara View Post
Is it valuable enough to develop an application?
    No.

  3. #3
    Member
    Join Date
    Nov 2015
    Location
Ślůnsk, PL
    Posts
    81
    Thanks
    9
    Thanked 13 Times in 11 Posts
Sorry, I couldn't resist the temptation to reply sarcastically.
Long answer:
Math doesn't lie. Your algorithm may be able to compress incompressible data, but it won't be able to decompress it back. Your program can't work, which implies that it won't work. If you don't believe that it can't work (and you probably don't), I suggest that you read the literature:
http://mattmahoney.net/dc/dce.html#Section_11
If you don't believe the literature either, think about creating a decompressor.
If you still think you have designed a correct decompressor, implement it.
If you have implemented a decompressor and believe it's correct, develop a compressor too.
You should find soon after that the pair doesn't always work together.
    Last edited by m^3; 7th January 2016 at 14:49.

  4. #4
    Member RichSelian's Avatar
    Join Date
    Aug 2011
    Location
    Shenzhen, China
    Posts
    167
    Thanks
    20
    Thanked 59 Times in 28 Posts
    too young too simple.

  5. #5
    Member
    Join Date
    Jan 2016
    Location
    Vietnam
    Posts
    15
    Thanks
    2
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by tefara View Post
Decompression is about 4 times faster than compression (we compress once to decompress millions of times).
Is it valuable enough to develop an application?
Did you guys read my post carefully? Any compression algorithm has to come with a decompression algorithm, or it would be useless.

  6. #6
    Member RichSelian's Avatar
    Join Date
    Aug 2011
    Location
    Shenzhen, China
    Posts
    167
    Thanks
    20
    Thanked 59 Times in 28 Posts
    Quote Originally Posted by tefara View Post
Did you guys read my post carefully? Any compression algorithm has to come with a decompression algorithm, or it would be useless.
No compressor can compress all types of data. When you say you have invented such an algorithm, it means you lack some knowledge of math.

  7. #7
    Member
    Join Date
    Jan 2016
    Location
    Vietnam
    Posts
    15
    Thanks
    2
    Thanked 0 Times in 0 Posts
In the past, no one believed that the Earth is round, but in fact it is round. Never say something is impossible. My question is only about the relative compression ratio and speed. Compared with the LZW algorithm, which achieves a compression ratio of about 30% on English text files, my algorithm can only reduce the data by about 5% in the same time. But the advantage here is that the output is re-compressible, which creates an unbounded compression ratio.

  8. #8
    Member RichSelian's Avatar
    Join Date
    Aug 2011
    Location
    Shenzhen, China
    Posts
    167
    Thanks
    20
    Thanked 59 Times in 28 Posts
    Quote Originally Posted by tefara View Post
In the past, no one believed that the Earth is round, but in fact it is round. Never say something is impossible. My question is only about the relative compression ratio and speed. Compared with the LZW algorithm, which achieves a compression ratio of about 30% on English text files, my algorithm can only reduce the data by about 5% in the same time. But the advantage here is that the output is re-compressible, which creates an unbounded compression ratio.
It doesn't matter if you implement your compressor and bring it here to prove us wrong. But I think no one is interested in helping you implement it, because we all know it is impossible.

  9. #9
    Member
    Join Date
    Jan 2016
    Location
    Vietnam
    Posts
    15
    Thanks
    2
    Thanked 0 Times in 0 Posts
I'm not blaming anyone; it was just a question. I'm not trying to convince you guys to believe me. I only want to ask: if my algorithm is real, then based on that speed and ratio, is it valuable enough to develop an application? However, it is great to hear you say "it is impossible", so I can become the first one to make it possible. I asked this question because I heard a rumor that KGB Archiver can compress a 600 MB WinXP disc to 1.5 MB.
To spoil an example: I can encode and decode a string of 256 totally random chars (all of them different from each other):
input string: 256 chars × 8 bits = 2048 bits
output string: 240 chars × 8 bits = 1920 bits

  10. #10
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,537
    Thanks
    758
    Thanked 676 Times in 366 Posts
tefara, if i found a proof that 0=1, what should i do next: apply for a PhD, or search for errors in my proof?

> so I can become the first one to make it possible

actually, people like you arrive at compression forums every month or so. the comp.compression newsgroup was filled with them some years ago; i don't know the current situation.

> I can encode and decode a string of 256 totally random chars

do you mean that you can encode ANY 256 bytes to 240 bytes? that's impossible, and i hope you understand why. and if you can encode only some of the 256-byte sequences to 240 bytes, it's very easy to compute how many of those 256-byte strings can be encoded to 240 bytes. please make that computation for me.
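A minimal sketch of that computation in Python (not Bulat's code; it assumes only that a decodable encoder must map distinct inputs to distinct outputs):
Code:
# counting the question with exact big integers: of all 256-byte
# strings, how many can a lossless encoder map into 240-byte strings?
inputs  = 256 ** 256   # number of distinct 256-byte strings
outputs = 256 ** 240   # number of distinct 240-byte strings

# a lossless (decodable) encoder must be injective, so at most
# `outputs` of the `inputs` strings can be shortened to 240 bytes
print(inputs // outputs == 2 ** 128)  # True: only 1 string in 2**128
print(outputs / inputs)               # ~2.94e-39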
    Last edited by Bulat Ziganshin; 7th January 2016 at 16:05.

  11. #11
    Member jibz's Avatar
    Join Date
    Jan 2015
    Location
    Denmark
    Posts
    122
    Thanks
    104
    Thanked 71 Times in 51 Posts
    The problem is (as others have pointed out), if you can compress all strings, then you cannot decompress them all to the original string, because at least some strings have to compress to the same shorter string, and given that shorter compressed string, you cannot know which longer string it originated from.

For argument's sake, consider a function that can compress every integer from 1 to 1000 into an integer between 1 and 10. At least two of the integers from 1 to 1000 have to compress to the same integer between 1 and 10 (actually a lot more, of course). Say both 600 and 700 compress to 5. Now somebody gives you a compressed 5 -- how do you decompress it? To 600 or to 700?
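A tiny Python sketch of that example (the compress function here is a made-up stand-in, not anyone's actual algorithm):
Code:
# pigeonhole demo: any map from 1..1000 into 1..10 must send many
# inputs to the same output, so "decompression" is ambiguous
from collections import defaultdict

def compress(n):           # hypothetical compressor, for illustration
    return n % 10 + 1      # maps 1..1000 into 1..10

groups = defaultdict(list)
for n in range(1, 1001):
    groups[compress(n)].append(n)

# every output value is shared by 100 different inputs
print({y: len(xs) for y, xs in groups.items()})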

    And just to note, it is unlikely that 256 random bytes are all different from each other.

  12. #12
    Member RichSelian's Avatar
    Join Date
    Aug 2011
    Location
    Shenzhen, China
    Posts
    167
    Thanks
    20
    Thanked 59 Times in 28 Posts
    Quote Originally Posted by tefara View Post
I'm not blaming anyone; it was just a question. I'm not trying to convince you guys to believe me. I only want to ask: if my algorithm is real, then based on that speed and ratio, is it valuable enough to develop an application? However, it is great to hear you say "it is impossible", so I can become the first one to make it possible. I asked this question because I heard a rumor that KGB Archiver can compress a 600 MB WinXP disc to 1.5 MB.
To spoil an example: I can encode and decode a string of 256 totally random chars (all of them different from each other):
input string: 256 chars × 8 bits = 2048 bits
output string: 240 chars × 8 bits = 1920 bits
There are 256^256 different 256-byte strings, but only 256^240 different 240-byte strings. If you compress all 256-byte strings to 240 bytes, at least 256^256 − 256^240 ≈ 3.23e616 inputs must collide on the same output. How can you recover two identical 240-byte compressed outputs into two different 256-byte inputs?
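A quick check of those numbers with Python's exact integers (just the counting; nothing is assumed about the algorithm itself):
Code:
# exact pigeonhole count for 256-byte inputs vs 240-byte outputs
inputs  = 256 ** 256           # all 256-byte strings
outputs = 256 ** 240           # all 240-byte strings
collisions = inputs - outputs  # inputs that must share an output, at minimum
print(len(str(collisions)))    # 617 digits, i.e. about 3.23e616
print(collisions / 10**616)    # leading digits: ~3.23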

  13. Thanks:

    tobijdc (7th January 2016)

  14. #13
    Member
    Join Date
    Jan 2016
    Location
    Vietnam
    Posts
    15
    Thanks
    2
    Thanked 0 Times in 0 Posts
Split the source string into blocks of 256 chars.
If all the chars in a block are different, the result is the best case: 256 chars encoded to 240 chars (it could be even better with another algorithm as a helper).
If they are not all different, a dynamic dictionary 32 chars long is required to decode; if one 32-char dictionary can be used for more than 3 blocks... boom, we gain some compression ratio.

  15. #14
    Member
    Join Date
    Jan 2016
    Location
    Vietnam
    Posts
    15
    Thanks
    2
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by RichSelian View Post
There are 256^256 different 256-byte strings, but only 256^240 different 240-byte strings. If you compress all 256-byte strings to 240 bytes, at least 256^256 − 256^240 ≈ 3.23e616 inputs must collide on the same output. How can you recover two identical 240-byte compressed outputs into two different 256-byte inputs?
We are encoders; don't do something stupid like this, OK? If you keep thinking like that, no compression algorithm on Earth could work. Moreover, please practice your English and math skills some more!

  16. #15
    Member
    Join Date
    Sep 2015
    Location
    Italy
    Posts
    268
    Thanks
    111
    Thanked 153 Times in 112 Posts
    Quote Originally Posted by tefara View Post
To spoil an example: I can encode and decode a string of 256 totally random chars (all of them different from each other):
input string: 256 chars × 8 bits = 2048 bits
output string: 240 chars × 8 bits = 1920 bits
You can do better: if I'm not wrong, a string of 256 totally random chars (all of them different from each other) can be compressed down to 1683.9962... bits (211 bytes) = log2(256) + log2(255) + log2(254) + ... + log2(1).
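That bound is easy to verify in Python: 256 all-different bytes form a permutation of all 256 values, so log2(256!) bits suffice to index it.
Code:
# information content of a permutation of all 256 byte values
import math

bits = sum(math.log2(k) for k in range(1, 257))  # log2(256!)
print(bits)                  # ~1683.9962
print(math.ceil(bits / 8))   # 211 bytes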

  17. Thanks (3):

    jibz (9th January 2016),Matt Mahoney (8th January 2016),tefara (7th January 2016)

  18. #16
    Member RichSelian's Avatar
    Join Date
    Aug 2011
    Location
    Shenzhen, China
    Posts
    167
    Thanks
    20
    Thanked 59 Times in 28 Posts
    Quote Originally Posted by tefara View Post
We are encoders; don't do something stupid like this, OK? If you keep thinking like that, no compression algorithm on Earth could work. Moreover, please practice your English and math skills some more!
Yes, there is no working "universal" compressor on Earth. All the compressors we have invented only work on redundant data.

It is easy to prove that your algorithm is impossible:
Code:
Y = YourCompressionAlgorithm(X)
X ∈ Sx    (the set of all possible inputs)
Y ∈ Sy    (the set of all shorter outputs)
For a "compressor that can compress all kinds of data", card(Sx) must be LARGER than card(Sy). In that case you can NEVER find an X = YourDecompressionAlgorithm(Y) for every input, because at least two inputs share the same Y.

I am Chinese and my English is poor; sorry about that.

  19. #17
    Member
    Join Date
    Jan 2016
    Location
    Vietnam
    Posts
    15
    Thanks
    2
    Thanked 0 Times in 0 Posts
Your brain has been hacked!!! All types of data are redundant: if we have a string of 257 8-bit chars, at least one char must be repeated, and that repetition is the redundancy we need to compress in some way. Once again, I don't care whether you believe it or not. I just wonder whether speed equal to LZW with only about a 5% reduction per pass is enough, so please don't say something useless.

  20. #18
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,537
    Thanks
    758
    Thanked 676 Times in 366 Posts
and if we have a sequence of 3 bits, either 0 or 1 is repeated in it too, so it can also be compressed?
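The 3-bit case is small enough to check exhaustively (a sketch; "shorter" here means any bit string of length 0, 1, or 2):
Code:
# there are 8 three-bit strings but only 7 strictly shorter bit strings,
# so no lossless compressor can shorten all of them
from itertools import product

inputs  = [''.join(p) for p in product('01', repeat=3)]
shorter = [''.join(p) for n in range(3) for p in product('01', repeat=n)]

print(len(inputs), len(shorter))  # 8 7 -> no injective map into shorter
# strings exists; at least one 3-bit string is incompressible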

  21. #19
    Member
    Join Date
    Jan 2016
    Location
    Vietnam
    Posts
    15
    Thanks
    2
    Thanked 0 Times in 0 Posts
Great to hear that everyone believes it is impossible; that means my algorithm is unique. But I'm disappointed that you guys always try to hack into other people's brains instead of thinking seriously about my question on compression speed and ratio. This topic closes here until I have finished studying Python and developed my software. A video or some pictures of my work will be uploaded later.

  22. #20
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,537
    Thanks
    758
    Thanked 676 Times in 366 Posts
it was a rather easy case, colleagues

  23. #21
    Member RichSelian's Avatar
    Join Date
    Aug 2011
    Location
    Shenzhen, China
    Posts
    167
    Thanks
    20
    Thanked 59 Times in 28 Posts
    Quote Originally Posted by tefara View Post
Great to hear that everyone believes it is impossible; that means my algorithm is unique. But I'm disappointed that you guys always try to hack into other people's brains instead of thinking seriously about my question on compression speed and ratio. This topic closes here until I have finished studying Python and developed my software. A video or some pictures of my work will be uploaded later.
All of us believe (and can prove) that it is impossible. Once you implement it, I believe you will become the richest man in the world (at the very least, you can claim the entire bonus of the Hutter Prize).

  24. #22
    Member
    Join Date
    Jan 2016
    Location
    Vietnam
    Posts
    15
    Thanks
    2
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by RichSelian View Post
All of us believe (and can prove) that it is impossible. Once you implement it, I believe you will become the richest man in the world (at the very least, you can claim the entire bonus of the Hutter Prize).
WOW, thanks so much. I have been trying to find a website like that ever since I invented my algorithm. So all I need to do is compress enwik.zip to smaller than 34.8 MB; too simple.

  25. #23
    Member
    Join Date
    Nov 2014
    Location
    California
    Posts
    143
    Thanks
    47
    Thanked 40 Times in 29 Posts
    Quote Originally Posted by Bulat Ziganshin View Post
it was a rather easy case, colleagues
Until the next one comes around.
There seems to be a near-constant flow of these people, with big dreams and no understanding of math.

  26. #24
    Member
    Join Date
    Nov 2014
    Location
    California
    Posts
    143
    Thanks
    47
    Thanked 40 Times in 29 Posts
    Quote Originally Posted by tefara View Post
WOW, thanks so much. I have been trying to find a website like that ever since I invented my algorithm. So all I need to do is compress enwik.zip to smaller than 34.8 MB; too simple.
Too bad you did not listen to what people were trying to tell you. Start by reading about compression before jumping to conclusions.
Being skeptical is smart; being constantly wrong is not.

  27. #25
    Member jibz's Avatar
    Join Date
    Jan 2015
    Location
    Denmark
    Posts
    122
    Thanks
    104
    Thanked 71 Times in 51 Posts
Actually, the hard part is not compressing enwik8 to 15 MB, it is decompressing it again.

    I think if you've actually considered the comments in this thread about universally compressing random data, and still feel there is merit to your method, you should definitely go ahead and implement it. It will be a good learning experience.

  28. Thanks:

    xcrh (7th January 2016)

  29. #26
    Member
    Join Date
    May 2009
    Location
    France
    Posts
    99
    Thanks
    13
    Thanked 75 Times in 45 Posts

  30. #27
    Member
    Join Date
    Nov 2015
    Location
    boot ROM
    Posts
    95
    Thanks
    27
    Thanked 17 Times in 15 Posts
BARF does it better, at least on the Calgary corpus, but I think there is some little catch...

Another viable option is to read the source file, encode part of it as a filename, and write that to the filesystem as a zero-sized file. Repeat until all the data has been encoded into filenames. Decoding is the reverse of this process. At the end of the day, you can store your 20-megabyte file as a bunch of zero-sized files, and their total size is still going to be 0 bytes. Sounds like a bargain, right?! But once more, there is some little catch...
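A sketch of that trick in Python (the encode/decode helpers and the chunk size are made up for illustration; the "catch" is that the bytes still occupy directory metadata):
Code:
# store data in filenames of zero-sized files; the "compressed size"
# is 0 bytes only because the data now lives in the directory entries
import os, base64

def encode(data: bytes, outdir: str, chunk: int = 120) -> None:
    os.makedirs(outdir, exist_ok=True)
    for i in range(0, len(data), chunk):
        name = base64.urlsafe_b64encode(data[i:i+chunk]).decode()
        # zero-sized file; the numeric prefix keeps chunks ordered
        open(os.path.join(outdir, f"{i//chunk:08d}.{name}"), "w").close()

def decode(outdir: str) -> bytes:
    parts = sorted(os.listdir(outdir))
    return b"".join(base64.urlsafe_b64decode(p.split(".", 1)[1])
                    for p in parts)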
    Last edited by xcrh; 7th January 2016 at 23:35.

  31. #28
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,537
    Thanks
    758
    Thanked 676 Times in 366 Posts
actually, if you have a good memory, you can compress a file of any size into an empty file, and it will still have a good old 8.3 name

  32. #29
    Member
    Join Date
    Jan 2016
    Location
    Vietnam
    Posts
    15
    Thanks
    2
    Thanked 0 Times in 0 Posts
Well, I used JavaScript to generate a string of about 20 random 3-bit chars (numbers between 0 and 7), used my algorithm to encode it (by hand), and got a string of 18.3 3-bit chars. Then I decoded it and got back exactly the original string. I don't care whether you guys believe it or not. I created this topic to find out whether there is any other algorithm that can compress random data, and to discuss compression speed and ratio.
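A quick counting check on that claim, in the same spirit as the earlier posts (taking 18.3 chars at 3 bits each as roughly 55 bits; one hand-worked example says nothing about all inputs):
Code:
# 20 random octal digits carry 60 bits; count how many such strings
# could even in principle fit into about 55 bits
inputs  = 8 ** 20        # all 20-digit octal strings = 2**60
outputs = 2 ** 55        # everything expressible in ~55 bits

print(outputs / inputs)  # 0.03125 -> at most 1 random string in 32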

  33. #30
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,257
    Thanks
    307
    Thanked 795 Times in 488 Posts
    Quote Originally Posted by tefara View Post
WOW, thanks so much. I have been trying to find a website like that ever since I invented my algorithm. So all I need to do is compress enwik.zip to smaller than 34.8 MB; too simple.
Well, first you need to learn to write code. Then your second problem is that you will discover your program doesn't work.

    Really, we are trying to help. But if you want to be stubborn about it, then you're on your own.


