
Thread: Data compression explained

  1. #61
    Member Alexander Rhatushnyak's Avatar
    Join Date
    Oct 2007
    Location
    Canada
    Posts
    252
    Thanks
    49
    Thanked 107 Times in 54 Posts
    The last paragraph in section Conclusion:
    Quote Originally Posted by Matt Mahoney
    We now make two observations. First, if the universe were divided into regions the size of bits, then each volume would be about the size of a proton or neutron. This is rather remarkable because the number is derived only from the physical constants T, c, h, and G, which are unrelated to the properties of any particles. Second, if the universe were squashed flat, it would form a sheet about one neutron thick. Occam's Razor, which the computability of physics makes true, suggests that these two observations are not coincidences.
    I understand why c, h, and G are physical constants, but why is T a physical constant? A few sentences above:
    The universe has a finite age, T, about 13.7 billion years. Because information cannot travel faster than the speed of light, c, our observable universe is limited to an apparent 13.7 billion light years, although the furthest objects we can see have since moved further away. Its mass is limited by the gravitational constant, G, to a value that prevents the universe from collapsing on itself.
    Wasn't the estimated value of T different years ago? How much was it different 100 years ago?

    What about the mass of the universe? What is the most popular estimate, and how much did it change in the last 100 years?

    The last sentence in section Encryption:
    Quote Originally Posted by Matt Mahoney
    If your system ever does become popular, then others will reverse engineer it and they will break it.
    If your car is much better than 99.9% of cars in your city, then others will steal it, and they will get a lot of money for it.

  2. #62
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,257
    Thanks
    307
    Thanked 798 Times in 489 Posts
    Yes, you're right. The age of the universe is not constant.

    In fact the bound on the length of a description of the state of the universe must be increasing with the square of its age. But knowing the current state exactly would also allow you to know the state for all time, which seems to be a contradiction. An alternative view is that the universe we observe is the result of a simple program running on a powerful abstract computer: try an infinite number of universes with all possible laws of physics. We must necessarily observe a universe with physics that supports intelligent life. That universe might only need a few hundred bits to describe the free parameters of string theory and general relativity needed to form planets and to make the chemistry for life come out right. I wrote a bit more about this at http://mattmahoney.net/singularity.html (OK, not really compression related).

    My comments about encryption are for those who would design their own secret encryption algorithms and not publish source code to "improve" security. You are actually doing the opposite. The problem is that many amateurs in cryptography notice that compressed data and encrypted data both look random and get the idea that you can use a key to fiddle with the compression algorithm a bit and have a neat new encryption algorithm. Such algorithms are almost always broken because they don't understand the basic principles of cryptography. If you are not one of the 100 or so people on this planet that really know how to create new, unbreakable encryption algorithms, then use a well tested algorithm like AES and publish your source code so others can make sure you did it right. (If you can't explain why AES or RSA or ECC or SHA-256 are secure, or can't come up with new attacks against them, then you probably aren't one of those 100 people).
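    As a concrete illustration of that advice, here is a minimal sketch using libsodium's secretbox API (it happens to use XSalsa20-Poly1305 rather than AES, but the principle is the same: call a widely reviewed library, publish your source, and don't invent the primitive yourself):

        #include <sodium.h>
        #include <stdexcept>
        #include <vector>

        // Authenticated symmetric encryption via libsodium's crypto_secretbox.
        // The key point: key generation, the cipher, and the authenticator all
        // come from a widely reviewed library instead of home-grown code.
        std::vector<unsigned char> encrypt_message(
                const std::vector<unsigned char>& msg,
                const unsigned char key[crypto_secretbox_KEYBYTES],
                unsigned char nonce_out[crypto_secretbox_NONCEBYTES]) {
            if (sodium_init() < 0)
                throw std::runtime_error("libsodium failed to initialize");
            randombytes_buf(nonce_out, crypto_secretbox_NONCEBYTES);  // fresh nonce per message
            std::vector<unsigned char> ct(crypto_secretbox_MACBYTES + msg.size());
            crypto_secretbox_easy(ct.data(), msg.data(), msg.size(), nonce_out, key);
            return ct;  // send or store the nonce alongside the ciphertext
        }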

    Or better yet, don't encrypt at all. Most people won't use the feature so it just adds bloat. For those that do need it, there are other tools that they probably already have. A good program should do one thing and do it well.

  3. #63
    Member biject.bwts's Avatar
    Join Date
    Jun 2008
    Location
    texas
    Posts
    449
    Thanks
    23
    Thanked 14 Times in 10 Posts
    Quote Originally Posted by Matt Mahoney View Post
    Yes, you're right. The age of the universe is not constant.

    In fact the bound on the length of a description of the state of the universe must be increasing with the square of its age. But knowing the current state exactly would also allow you to know the state for all time, which seems to be a contradiction. An alternative view is that the universe we observe is the result of a simple program running on a powerful abstract computer: try an infinite number of universes with all possible laws of physics. We must necessarily observe a universe with physics that supports intelligent life. That universe might only need a few hundred bits to describe the free parameters of string theory and general relativity needed to form planets and to make the chemistry for life come out right. I wrote a bit more about this at http://mattmahoney.net/singularity.html (OK, not really compression related).

    My comments about encryption are for those who would design their own secret encryption algorithms and not publish source code to "improve" security. You are actually doing the opposite. The problem is that many amateurs in cryptography notice that compressed data and encrypted data both look random and get the idea that you can use a key to fiddle with the compression algorithm a bit and have a neat new encryption algorithm. Such algorithms are almost always broken because they don't understand the basic principles of cryptography. If you are not one of the 100 or so people on this planet that really know how to create new, unbreakable encryption algorithms, then use a well tested algorithm like AES and publish your source code so others can make sure you did it right. (If you can't explain why AES or RSA or ECC or SHA-256 are secure, or can't come up with new attacks against them, then you probably aren't one of those 100 people).

    Or better yet, don't encrypt at all. Most people won't use the feature so it just adds bloat. For those that do need it, there are other tools that they probably already have. A good program should do one thing and do it well.
    I know I think differently than most; at least that is what my friends and enemies say. But if the description of the state of the Universe is increasing with the square of its age, as you say, then knowing it perfectly at any time would not let you know the state perfectly in the future. Since it's increasing, it must be a random process that cannot be known. I don't think there are an infinite number of universes. I think all it means is that the future is not known, which means that sometimes we have a free-will choice. That is what quantum theory means: the future of the Universe cannot be known from the present. However, if the state at some point is such that it's contracting, we may be able to deduce the end state of the universe without knowing all the intermediate states.

    I think people should make their own encryption systems, because you learn from your mistakes. If you never try to write or break one, you are missing some of the fun of using your mind. It is a learning process.

    But I wrote and still like scott19u. I have new features, but with the changing state of the laws, who knows what one can do that's legal. Encryption is like global warming in that many so-called experts define the direction of research and keep newcomers out. And if you're a newcomer you will make mistakes, or you may not care about certain modes of attack. But as you play with it you see the reasons to expand your views. One problem for a newcomer is that even if your code is sound, you and your code will be dismissed without anyone really looking. I know I was a pain in the ass to some encryption people. The person who set up my XOOM account did it because he claimed to certain higher-ups that I was a pain in the ass, and he likes scott19u.

    That code was called snake oil; I thought of renaming it mamba or whatever. Shortly after it came out, a recognized expert in encryption called it worthless and stated that a slide attack breaks it easily. So that's what the crypto people spread and believed. The guy who set up my crypto site knew that method would not break the code. So he set up the expert by telling him how he had tried to use this so-called slide attack to break it. When the expert explained why it should work, my unknown friend had to explain how parts of the encryption method worked, so that the attack failed. Finally the so-called expert admitted he never could follow the C code that I wrote, so he really did not know what it did. But he had already done the work of calling it no good. So I really think that if an outsider develops good crypto it will be ignored and trashed. The exchange where the guy tried to break it was in a series of posts on sci.crypt.

    Another point: look at Matt Timmermans' BICOM. It does compression and AES, fully bijective, using only full blocks of AES. I wrote to the inventors of AES (or Rijndael, as it was called then); only one bothered to write back, and he stated that was obviously not possible. He was too lazy to check Matt Timmermans' code. The so-called experts in encryption live in such a narrow box that it appears to me they hate outsiders. Of course, for the press they allow in 16-year-old girls who do stuff for a science project and call them geniuses, so they can pretend some outsiders get in.

    Encryption is no better than its weakest link. When you improve some of the weak links, such as how to do a real bijective CBC mode for AES, so that it's bijective yet exactly like the standard except that you often don't have the extra wasted block, it's obviously better. Yet you only get hate.

    Compression is at least easier to test and show people, and anyone with half a brain can repeat it and see the results. Encryption is not like that. It is hard to show even intelligent people what gains you make, especially if you piss off the encryption gods.

    Bijective compression also caused long-running arguments on comp.compression, with so-called experts either quoting or showing proofs of why it's not possible. Plus, if you can't write like me, people are even more sure that it's safe to attack my work while being too lazy to get off their asses to test it. But fortunately some people actually try it, and then it starts to take on a life of its own.

    Look, I am an old man in his sixties and I still came up with BWTS. I tried to show it to the gods of BWT, but the few that answered stated it was not possible. It seems they repeat to themselves the myth and belief that BWT has so much order that you need extra information in an index to do the UNBWT. I knew in my gut that was crap. I tried many methods, I think mostly when I sleep. When I wake I have headaches and vertigo and often feel like vomiting. But after many tries, and coming up with things that are bijective but not close enough to BWT, I finally hit on it. I can't write, but after telling many people about it, Gil from Israel offered to write it up. It was rejected right away, so it was not really finished; Gil expected that they would raise a few objections and then we would polish it. Near the end he asked where I got my PhD. I told him I don't have one. I think he was surprised. Needless to say, the people who reviewed it felt it was nothing new, and someone pointed out that all you need to do to make it bijective is add a special unique character at the end, to do a bijective BWTS. Matt, in his write-up, stated right away that that is nonsense. Of course, what do you expect from the people they use to review stuff? We got no feedback and no chance to respond. Gil stated that you have to pay your dues to get published, whatever that means. I wrote back and stated that in some ways I am glad I was not published, since the global warming people publish so much crap that is peer-reviewed and obviously false that it is best not to have it published.

    Anyway, thanks for opening the door for my rant. I actually looked over what I wrote several times, but I am sure there are still typos and bad spelling. When I type and then reread, I see what I wanted to type instead of what I typed, if that makes any sense. At least the spell checker has found no more errors. Trust me, this long post would be worse without the spell checker.
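
    For anyone who wants to see the core idea in code: BWTS starts by factoring the input into Lyndon words (each factor lexicographically no smaller than the next) and then sorts rotations of those factors instead of rotations of the whole string, which is what removes the need for an index. Here is a minimal sketch of that first factorization step, Duval's algorithm; this is just my illustration of the standard algorithm, not the actual scott code:

        #include <string>
        #include <vector>

        // Duval's algorithm: factor s into Lyndon words w1 >= w2 >= ... >= wk
        // in O(n) time. In BWTS the rotations that get sorted are rotations of
        // these factors rather than rotations of the whole string, so no index
        // is needed to invert the transform.
        std::vector<std::string> lyndon_factorization(const std::string& s) {
            std::vector<std::string> factors;
            std::size_t i = 0, n = s.size();
            while (i < n) {
                std::size_t j = i + 1, k = i;
                while (j < n && s[k] <= s[j]) {
                    if (s[k] < s[j]) k = i;   // still extending one Lyndon word
                    else ++k;                 // repeating a period
                    ++j;
                }
                while (i <= k) {              // emit factors of length j - k
                    factors.push_back(s.substr(i, j - k));
                    i += j - k;
                }
            }
            return factors;
        }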

  4. #64
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,505
    Thanks
    26
    Thanked 136 Times in 104 Posts
    You should have shown them a block diagram instead of the C code. Reading someone else's code is usually a pain.


    Please remember algorithms like ZipCrypto, which were custom designed and are easily broken. It's not good to use them in a commercial archiver.
    Last edited by Piotr Tarsa; 24th May 2010 at 11:33.

  5. #65
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    1,040
    Thanks
    104
    Thanked 420 Times in 293 Posts
    Quote Originally Posted by Alexander Rhatushnyak View Post
    Wasn't the estimated value of T different years ago? How much was it different 100 years ago?
    Some numbers from 100 years ago and longer:

    First humanoid on our chain of planets:
    1,664,500,987 years ago (as estimated in 1887).

    First human as we know it, split into male and female:
    18,618,728 years ago (as estimated in 1887).

    The universe is endless and repeatedly disappears (destructs/dies) and appears (is created/born) again, but the biggest cycle known to humans is:
    311,040,000,000,000 years, of which our universe has completed around 158,630,400,000,000 years (51%), so 152,409,600,000,000 years to go; then it disappears for the same length of time as it existed, also 311,040,000,000,000 years, so the total cycle is 622,080,000,000,000 years.
    Last edited by Sportman; 23rd May 2010 at 18:34.

  6. #66
    Member Alexander Rhatushnyak's Avatar
    Join Date
    Oct 2007
    Location
    Canada
    Posts
    252
    Thanks
    49
    Thanked 107 Times in 54 Posts
    Quote Originally Posted by Matt Mahoney
    Yes, you're right. The age of the universe is not constant.
    I mean, can you measure T, mass of the universe, and number of nats in it, directly, as you do with c and G? Don't these values depend on what hypotheses you support, what theories you pick, what assumptions you make?
    What are the most popular values nowadays, and what were the most widespread estimates 50 years ago, in 1960?
    There are facts that no modern theory can explain, aren't there?

    Quote Originally Posted by Matt Mahoney
    My comments about encryption are for those who would design their own secret encryption algorithms and not publish source code to "improve" security.
    I understood what you meant to say in section Encryption, but the style of statements is a bit... surprising.
    Last edited by Alexander Rhatushnyak; 24th May 2010 at 07:18.

  7. #67
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,257
    Thanks
    307
    Thanked 798 Times in 489 Posts
    The age of the universe is 13.58 to 13.92 billion years old by current estimates. Maybe not everyone agrees, just like there are people who believe that the world is flat. But my point is that whatever the number, the universe has a finite description length. True infinite incompressible (random) data sources do not exist because the laws of physics make it impossible. So even though most strings of a given finite length are not compressible in the Kolmogorov sense (there is no shorter description), most strings that you actually encounter are compressible.

    When we model data as coming from an infinite random source and make no attempt to compress it, it is not because it really is random, but because we can't do the computation to predict it. We must do this because (1) deciding whether finite strings are random is not computable and (2) it is not possible for a physical computer to model the universe that contains it. The second fact was proved by Wolpert.
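
    The counting argument behind that last claim is simple: there are 2^n strings of n bits but fewer than 2^(n-k) descriptions shorter than n-k bits, so less than a fraction 2^-k of all n-bit strings can be compressed by more than k bits. A throwaway check (my own toy illustration, not from the book):

        #include <cmath>
        #include <cstdio>

        // Counting argument for incompressibility: fewer than 2^(n-k) binary
        // descriptions are shorter than n-k bits, so fewer than 2^-k of the 2^n
        // strings of length n can be compressed by more than k bits.
        int main() {
            const int n = 64;                               // string length in bits
            for (int k = 1; k <= 10; ++k) {
                double shorter  = std::pow(2.0, n - k) - 1; // descriptions shorter than n-k bits
                double fraction = shorter / std::pow(2.0, n);
                std::printf("compressed by more than %2d bits: fewer than %.6f of all %d-bit strings\n",
                            k, fraction, n);
            }
            return 0;
        }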

  8. #68
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,257
    Thanks
    307
    Thanked 798 Times in 489 Posts
    Quote Originally Posted by biject.bwts View Post
    Compression is at least easier to test and show people, and anyone with half a brain can repeat it and see the results. Encryption is not like that. It is hard to show even intelligent people what gains you make, especially if you piss off the encryption gods.
    I think the reason is that it is easy to make up new encryption algorithms, harder to break them, and even harder to prove them to be strong. So of course encryption experts get tired of breaking amateur attempts. They have better things to do. I think if you want to get attention, then you have to at least put as much effort into proving your system is strong as they would have to do to prove it is weak. That is a lot of work, so most people don't bother. Instead they just post some random string and challenge the experts to break it. They are rightly flamed and ignored.

    So that is what I mean by explain why AES is strong. What is the reason for the math behind it? Most people including me can't answer that. If I could, then I would probably be smart enough to break it, or at least tell which variations could be broken and which couldn't. Since I'm not that smart, I am better off using something well tested instead.

  9. #69
    Member Alexander Rhatushnyak's Avatar
    Join Date
    Oct 2007
    Location
    Canada
    Posts
    252
    Thanks
    49
    Thanked 107 Times in 54 Posts
    Quote Originally Posted by Matt Mahoney View Post
    The age of the universe is 13.58 to 13.92 billion years old by current estimates.
    Good! What are the mass and the number of nats in the universe by current estimates, and what were these three ranges 50 years ago? Why not put these numbers at the end of the Conclusion?

    Quote Originally Posted by Matt Mahoney View Post
    Maybe not everyone agrees, just like there are people who believe that the world is flat.
    I suppose every normal person can make an experiment to find out if the Earth is flat or not. But less than 0.01% of people try to find out (using scientific methods) what the universe was like 5...15 billion years ago.

    Quote Originally Posted by Matt Mahoney View Post
    But my point is that whatever the number, the universe has a finite description length. True infinite incompressible (random) data sources do not exist because the laws of physics make it impossible. So even though most strings of a given finite length are not compressible in the Kolmogorov sense (there is no shorter description), most strings that you actually encounter are compressible.

    When we model data as coming from an infinite random source and make no attempt to compress it, it is not because it really is random, but because we can't do the computation to predict it. We must do this because (1) deciding whether finite strings are random is not computable and (2) it is not possible for a physical computer to model the universe that contains it. The second fact was proved by Wolpert.
    This is interesting, but my concern is about the last paragraph in section Conclusion:
    "We now make two observations. First, if the universe were divided into regions the size of bits, then each volume would be about the size of a proton or neutron. ... Second, if the universe were squashed flat, it would form a sheet about one neutron thick. Occam's Razor, which the computability of physics makes true, suggests that these two observations are not coincidences."
    Are there other publications discussing or mentioning these two observations, except the Data Compression Explained?

  10. #70
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,257
    Thanks
    307
    Thanked 798 Times in 489 Posts
    Quote Originally Posted by Alexander Rhatushnyak View Post
    I suppose every normal person can make an experiment to find out if the Earth is flat or not. But less than 0.01% of people try to find out (using scientific methods) what the universe was like 5...15 billion years ago.
    You can look back to galaxies from when the universe was about 1 billion years old, and to the microwave background radiation from when the universe was even younger. But it doesn't really matter, because even 50 years ago we knew the constants c, G, and h (not as accurately), so we knew that the universe had to have finite size, mass, and resolution due to Heisenberg uncertainty, assuming it had finite age. It wasn't until later that Hawking gave an exact value for Bekenstein's approximate limit on the entropy of a black hole. I think that 50 years ago black holes, quantum mechanics, and the big bang theory were controversial. We knew then that the universe was expanding, but there was a competing steady state theory.

    Also, Kolmogorov complexity and algorithmic information theory were new ideas. Shannon used the term "equivocation" instead of "entropy" to measure information, and I don't think it was appreciated that thermodynamic entropy also measured information or that it had an absolute value. (In thermodynamics you could only measure changes in entropy).

    Quote Originally Posted by Alexander Rhatushnyak View Post
    This is interesting, but my concern is about the last paragraph in section Conclusion:
    "We now make two observations. First, if the universe were divided into regions the size of bits, then each volume would be about the size of a proton or neutron. ... Second, if the universe were squashed flat, it would form a sheet about one neutron thick. Occam's Razor, which the computability of physics makes true, suggests that these two observations are not coincidences."
    Are there other publications discussing or mentioning these two observations, except the Data Compression Explained?
    I could not find any other sources where these ideas were published before. Originally I was unaware of the Bekenstein bound and came up with the same number myself (10^122 bits) by calculating the number of possible states of a wave with given energy in an enclosed volume. I think this is what Bekenstein did, without Hawking's refinement. Then I calculated the number for a volume the size of the universe (13.7 billion light years) and an energy equal to a mass sufficient to create a black hole of the same size (where the escape velocity is the speed of light at 13.7 billion light years from the center) since that is the largest mass you could have, and actually pretty close to the estimated mass of the universe of about 10^80 atoms.

    I was surprised that when you divide the universe into 10^122 parts, you get 1 bit equal to the size of a proton or neutron. I did not understand why atomic particles are nowhere near the Planck scale, so where did these constants come from? Adding the age of the universe makes it come out right, but I am not sure what it means. Perhaps the properties or numbers of particles are changing? In any case, notice that 10^80 ~ (10^122)^(2/3), which is where I got the idea that a flattened universe would be 1 particle thick.
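
    For anyone who wants to check the arithmetic, here is the calculation in rough code form. The constants are rounded and the approximations are as gross as described above, so only the orders of magnitude mean anything:

        #include <cmath>
        #include <cstdio>

        // Back-of-envelope version of the numbers in the Conclusion: treat the
        // observable universe as a sphere of radius R = cT, give it the largest
        // mass that fits (Schwarzschild radius R), and apply the Bekenstein-
        // Hawking entropy bound S = A c^3 / (4 G hbar) nats, with A = 4 pi R^2.
        int main() {
            const double PI   = 3.14159265358979;
            const double c    = 2.998e8;     // m/s
            const double G    = 6.674e-11;   // m^3 kg^-1 s^-2
            const double hbar = 1.055e-34;   // J s
            const double h    = 6.626e-34;   // J s
            const double year = 3.156e7;     // s
            const double T    = 13.7e9 * year;
            const double m_n  = 1.675e-27;   // neutron mass, kg

            double R      = c * T;                            // ~1.3e26 m
            double M      = R * c * c / (2 * G);              // mass whose Schwarzschild radius is R
            double A      = 4 * PI * R * R;                   // horizon area
            double S_bits = A * c * c * c / (4 * G * hbar) / std::log(2.0);  // ~1e122
            double V      = 4.0 / 3.0 * PI * R * R * R;
            double cell   = std::cbrt(V / S_bits);            // linear size of one "bit"

            std::printf("R (light travel distance)   = %.2e m\n", R);
            std::printf("M (Schwarzschild mass)      = %.2e kg (~%.1e nucleons)\n", M, M / m_n);
            std::printf("S (Bekenstein-Hawking)      = %.2e bits\n", S_bits);
            std::printf("side of one bit-sized cell  = %.2e m\n", cell);
            std::printf("nucleon Compton wavelength  = %.2e m\n", h / (m_n * c));
            std::printf("S^(2/3)                     = %.2e (compare with the nucleon count)\n",
                        std::pow(S_bits, 2.0 / 3.0));
            return 0;
        }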

  11. #71
    Member biject.bwts's Avatar
    Join Date
    Jun 2008
    Location
    texas
    Posts
    449
    Thanks
    23
    Thanked 14 Times in 10 Posts
    Quote Originally Posted by Matt Mahoney View Post
    I think the reason is that it is easy to make up new encryption algorithms, harder to break them, and even harder to prove them to be strong. So of course encryption experts get tired of breaking amateur attempts. They have better things to do. I think if you want to get attention, then you have to at least put as much effort into proving your system is strong as they would have to do to prove it is weak. That is a lot of work, so most people don't bother. Instead they just post some random string and challenge the experts to break it. They are rightly flamed and ignored.

    So that is what I mean by explain why AES is strong. What is the reason for the math behind it? Most people including me can't answer that. If I could, then I would probably be smart enough to break it, or at least tell which variations could be broken and which couldn't. Since I'm not that smart, I am better off using something well tested instead.
    I would not use pure AES. It may or may not be strong; there really is no proof. The elites want you to think it's strong, but they are always hedging their bets. The so-called crypto experts tend to ignore Shannon and such things as unicity distance. In fact, if you know the text is pure English, the attacker has all the information needed to break the cipher with a ciphertext-only attack using the data from only a few blocks. But just because all the information is contained in a few short blocks does not mean that someone has a break. Call me what you want, but I don't think the spooks would have allowed AES to win the contest when a new cipher was designed, any more than they did when DES was first used. I think they have simple breaks.

    That is why compression before encryption is so important: Shannon showed that with even simple compression one needs more ciphertext blocks before there exists enough information for a break.

    I think that if AES won, it may be due to weaknesses when one encrypts English text, because of the short unicity distance. Why give an attacker a break? Compress the damn file, add random data to the front and back, then do a binary BWTS on the file, then a bijective AES. That way at least you have a very large unicity distance, and the problem is harder for an attacker using a ciphertext-only type of attack.

    I am sorry, but I think it's a mistake to trust it. You should never trust just one cipher; ask the Germans what happens when your elite people think it's secure and it's the only thing you use. The attacker may use all his resources to come up with an unexpected break. I could be wrong, but I have very strong feelings about this.
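
    To put numbers on it: Shannon's unicity distance is roughly U = H(K)/D, where H(K) is the key entropy and D is the redundancy per character of the plaintext. A quick sketch with my own rough redundancy figures (about 6.8 of the 8 bits in each byte of plain English are redundant; good compression leaves maybe 0.1 bit per byte):

        #include <cstdio>

        // Unicity distance U = H(K) / D: the amount of ciphertext beyond which,
        // information-theoretically, only about one key remains consistent with
        // an intelligible plaintext.  D = redundancy in bits per plaintext byte.
        int main() {
            const double key_bits[]   = {56, 128, 256};   // DES, AES-128, AES-256
            const double redundancy[] = {6.8, 0.1};       // raw English vs. well compressed
            const char*  label[]      = {"raw English", "compressed "};
            for (double kb : key_bits)
                for (int i = 0; i < 2; ++i) {
                    double u = kb / redundancy[i];        // in bytes
                    std::printf("%3.0f-bit key, %s: unicity distance ~ %7.0f bytes (~%.1f AES blocks)\n",
                                kb, label[i], u, u / 16.0);
                }
            return 0;
        }

    With raw English a 128-bit key is pinned down after roughly one AES block; with strong compression it takes on the order of a kilobyte, which is exactly the point of compressing before encrypting.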

  12. #72
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,257
    Thanks
    307
    Thanked 798 Times in 489 Posts
    Plenty of people have floated conspiracy theories about the US government having secret backdoors, but the fact is that AES was developed by Dutch cryptologists and selected in an open worldwide competition. This is unlike DES, which was developed by IBM under contract with NSA and where some of the design principles were kept secret. There were conspiracy theories about that too, but as it turned out, it was designed to resist certain types of attacks that were not known to the public at the time but independently discovered later. After many decades the only known weakness was its 56 bit key size which everyone knew about from the beginning.

    If you are worried about AES then there are plenty of other ciphers to choose from like twofish, blowfish, serpent, RC4, XTEA, 3DES, etc. Any good cipher has the property that it is just as strong against chosen plaintext attacks as against ciphertext only attacks, so there is no need to compress bijectively. Either way it takes 2^keysize work to break (as far as we know. If you want proof of security then use a one time pad. Bijective compression won't make any difference here either).
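
    And since the one time pad came up: it really is just XOR with a truly random key that is as long as the message and never reused. The code is trivial; the entire difficulty is generating, distributing, and destroying the key, which this sketch deliberately leaves out:

        #include <cstddef>
        #include <stdexcept>
        #include <vector>

        // One-time pad: out[i] = data[i] XOR key[i].  Provably secure only if the
        // key is truly random, at least as long as the message, kept secret, and
        // never reused.  The same call decrypts.
        std::vector<unsigned char> otp_xor(const std::vector<unsigned char>& data,
                                           const std::vector<unsigned char>& key) {
            if (key.size() < data.size())
                throw std::runtime_error("pad must be at least as long as the message");
            std::vector<unsigned char> out(data.size());
            for (std::size_t i = 0; i < data.size(); ++i)
                out[i] = data[i] ^ key[i];
            return out;
        }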

  13. #73
    Member biject.bwts's Avatar
    Join Date
    Jun 2008
    Location
    texas
    Posts
    449
    Thanks
    23
    Thanked 14 Times in 10 Posts

    1WDBN

    Quote Originally Posted by Matt Mahoney View Post
    Plenty of people have floated conspiracy theories about the US government having secret backdoors, but the fact is that AES was developed by Dutch cryptologists and selected in an open worldwide competition. This is unlike DES, which was developed by IBM under contract with NSA and where some of the design principles were kept secret. There were conspiracy theories about that too, but as it turned out, it was designed to resist certain types of attacks that were not known to the public at the time but independently discovered later. After many decades the only known weakness was its 56 bit key size which everyone knew about from the beginning.

    If you are worried about AES then there are plenty of other ciphers to choose from like twofish, blowfish, serpent, RC4, XTEA, 3DES, etc. Any good cipher has the property that it is just as strong against chosen plaintext attacks as against ciphertext only attacks, so there is no need to compress bijectively. Either way it takes 2^keysize work to break (as far as we know. If you want proof of security then use a one time pad. Bijective compression won't make any difference here either).
    I believe the US had the ability to break DES with a full search in a reasonable amount of time from the very beginning. It was based on the IBM cipher LUCIFER:

    http://www.quadibloc.com/crypto/co0401.htm

    The NSA could easily have modified it to be a 128-bit cipher with a 128-bit key and still have added the features that made DES more resistant to differential cryptanalysis, but instead they made a much shorter one, with a 64-bit block and a very short 56-bit key. There was no reason they could not have left it at 128 bits with a 128-bit key. I am surprised they didn't at least make it a 64-bit key. But maybe they decided that 56 bits was about all they could brute-force in a reasonable amount of time with the hardware that they had.

    I was around at the time of the "open competition"; I think it was about as open as the IPCC is about real facts about the climate. Also, when there were questions about various modes of operation, I suggested making a bijective CBC mode so that at least those who wanted to use it could. It was easy to do this in a way where the only difference from the normal CBC method is that sometimes you write fewer ciphertext blocks. There is no way it could be weaker, since it is exactly the same except that sometimes less information is given to an attacker. It would even have saved space. They talk as if minor mistakes like those made with Enigma lead to breaks, yet in real life they don't care about these little pieces of extra information. What else could it be, other than that they want to break it? Of course, the guy who set up my XOOM page said he heard my name a lot in the big-boy talks about crypto. They really don't like those from the outside. He enjoyed me poking fun at them. Wish I could have met him.


    I don't believe you can prove that even AES is as strong against all known-plaintext attacks as it is against a ciphertext-only attack. I can't decode an English message that is encrypted with AES using just a handful of ciphertext blocks, but the information for a break is there. It's possible nobody will find the break. But it's also possible that China or the US already has one.

    But when it's cheap to increase the unicity distance to almost the whole length of the compressed file, why would one want to use the method in such a poor way that any small number of blocks contains enough information for a possible break?

    You would think we learned something from WWII, but it seems we have not.

  14. #74
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    1,040
    Thanks
    104
    Thanked 420 Times in 293 Posts
    Quote Originally Posted by biject.bwts View Post
    I believe the US had the ability to break DES with a full search in a
    reasonable amount of time from the very beginning.
    Not only DES: almost all 512-bit/1024-bit encryption was, at least in the year 2000, no problem to decode in a reasonable time. There is more than backdoors; think of weaknesses, shortcut brute-force tricks, and, most important, the CPUs we can buy are maybe about 10-15 years...

  15. #75
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,257
    Thanks
    307
    Thanked 798 Times in 489 Posts
    NSA/IBM was at least open about DES being vulnerable to brute force attacks by choosing a 56 bit key. http://en.wikipedia.org/wiki/Data_Encryption_Standard

    As for AES, I don't know how the design and selection process could have been more open. http://en.wikipedia.org/wiki/Advance...andard_process

    Unicity distance offers only theoretical protection against unlimited computing power when you can prove that the compressed size minus the Kolmogorov complexity is less than the key size. But since Kolmogorov complexity is not computable even with unlimited computing power, you can't prove the security either.

    Bijective compression eliminates the possibility that an adversary could detect a wrong key guess by detecting invalid input to the decompresser. But this offers no protection against a known plaintext attack. In a ciphertext only attack, the attacker can still detect a correct key guess by looking at the decompressed output. Of course you want a program to look at it, which means you only need a better compression model. So for bijective compression to add security, you need for the algorithm to compress close to the best an adversary could achieve, so that the difference is less than the size of the key. This gets harder as the message gets longer.
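
    To make the last point concrete: even with a bijective compressor, an attacker can score the output of each trial key with a crude model and keep the keys whose output does not look random. Something as simple as order-0 byte entropy already separates decompressed text from garbage; the decrypt and decompress steps below are placeholders, since they depend on whichever cipher and compressor are in use:

        #include <cmath>
        #include <cstdint>
        #include <vector>

        // Order-0 entropy in bits per byte.  Output from a wrong key decompresses
        // to something near 8.0; decompressed English text scores closer to 4-5.
        double order0_entropy(const std::vector<uint8_t>& data) {
            if (data.empty()) return 0.0;
            double count[256] = {0};
            for (uint8_t b : data) count[b] += 1.0;
            double h = 0.0, n = double(data.size());
            for (int i = 0; i < 256; ++i)
                if (count[i] > 0) {
                    double p = count[i] / n;
                    h -= p * std::log2(p);
                }
            return h;
        }

        // Hypothetical attack loop (decrypt_with_key and decompress are stand-ins):
        //   auto candidate = decompress(decrypt_with_key(ciphertext, k));
        //   if (order0_entropy(candidate) < 6.0) report k as a likely key;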

  16. #76
    Member Alexander Rhatushnyak's Avatar
    Join Date
    Oct 2007
    Location
    Canada
    Posts
    252
    Thanks
    49
    Thanked 107 Times in 54 Posts


    Quote Originally Posted by Matt Mahoney View Post
    I could not find any other sources where these ideas were published before.
    It's very likely that you made an important discovery, either about the universe or about properties of the most popular model. Did you discuss your two observations with physicists, astronomers, cosmologists?
    Also, what is the worst-case ratio? Can you please write something like this:
    "The radius of the universe is V1+-D1, the number of (baryons?) in the universe is V2+-D2. If the universe were squashed flat, it would form a sheet with thickness V3+-D3, while the radius of a neutron is V4+-D4, thus in the worst case the ratio is ..." (either (V4+D4)/(V3-D3) or (V3+D3)/(V4-D4))
    And similarly for the other observation.
    Last edited by Alexander Rhatushnyak; 27th May 2010 at 08:00.

  17. #77
    Member biject.bwts's Avatar
    Join Date
    Jun 2008
    Location
    texas
    Posts
    449
    Thanks
    23
    Thanked 14 Times in 10 Posts
    Quote Originally Posted by Matt Mahoney View Post
    Unicity distance offers only theoretical protection against unlimited computing power when you can prove that the compressed size minus the Kolmogorov complexity is less than the key size. But since Kolmogorov complexity is not computable even with unlimited computing power, you can't prove the security either.


    Bijective compression eliminates the possibility that an adversary could detect a wrong key guess by detecting invalid input to the decompresser. But this offers no protection against a known plaintext attack. In a ciphertext only attack, the attacker can still detect a correct key guess by looking at the decompressed output. Of course you want a program to look at it, which means you only need a better compression model. So for bijective compression to add security, you need for the algorithm to compress close to the best an adversary could achieve, so that the difference is less than the size of the key. This gets harder as the message gets longer.
    Like you said earlier, a one-time pad is the only provably secure method.

    I am not saying that we have a system that is 100% secure. I am saying that if you encrypt a plain English text file with AES, then the better the compression, the more ciphertext blocks are needed before there is even enough information for an attack, meaning enough that there is likely only one key that could lead to a solution.

    Shannon, in his proofs, was implying smooth compression, that is, bijective compression. If part of a file compresses better than another part, then taking even a smaller number of blocks from the part where low compression occurred would make a break even easier. If the file has a known header or trailer, then you have the best world of all, since you know part of the plaintext.

    This is where a proper bijective BWTS compressor, especially at the bit level, would shine. For a long file the effective unicity distance would be much larger, since there is no real local rate of compression; it is sort of spread-spectrum throughout the whole file. There is not enough information in the first 10% of the ciphertext to know what any of the English is. That was part of the driving force behind seeking a good BWTS.

    It is so obvious that this is a better approach that I would bet the three-letter agencies are using it, or will, even though they may not mention it in public. It just makes so much sense.

    In the future, and maybe even now, as quantum computers or whatever become more real, unicity distance will again be a concern.

    At least in my world view.

  18. #78
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,257
    Thanks
    307
    Thanked 798 Times in 489 Posts
    Quote Originally Posted by Alexander Rhatushnyak View Post
    It's very likely that you made an important discovery, either about the universe or about properties of the most popular model. Did you discuss your two observations with physicists, astronomers, cosmologists?
    Also, what is the worst-case ratio? Can you please write something like this:
    "The radius of the universe is V1+-D1, the number of (baryons?) in the universe is V2+-D2. If the universe were squashed flat, it would form a sheet with thickness V3+-D3, while the radius of a neutron is V4+-D4, thus in the worst case the ratio is ..." (either (V4+D4)/(V3-D3) or (V3+D3)/(V4-D4))
    And similarly for the other observation.
    The size of the universe is not a well defined quantity because it has no boundaries, and which parts you can see depends on where you are, so I used Tc = 13.7 billion light years. Also, the size of a proton or neutron is not well defined because particles are not hard spheres. They are defined by probability densities given by a quantum wave equation, but it is roughly h/mc ~ 10^-15 meters, where m is its mass. Another problem is that the universe is believed to be something like 4% ordinary matter, 21% dark matter and 75% "dark energy" according to some theories. Dark matter might be ordinary matter that did not form into stars, but nobody really understands dark energy which some say causes the universe to accelerate outward. So I just set the Schwarzschild radius to Tc, which is a gross approximation.

  19. #79
    Member Alexander Rhatushnyak's Avatar
    Join Date
    Oct 2007
    Location
    Canada
    Posts
    252
    Thanks
    49
    Thanked 107 Times in 54 Posts
    Quote Originally Posted by Matt Mahoney View Post
    So I just set the Schwarzschild radius to Tc, which is a gross approximation.
    This is what I am concerned about: your assumptions and approximations. How much did they influence your observations? When you say a sheet about one neutron thick, do you mean A=B+-10%, A=B+-100%, or +-500% ? You didn't calculate the worst-case ratio yet, did you?

  20. #80
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,257
    Thanks
    307
    Thanked 798 Times in 489 Posts
    I mean like between 0.1 and 10 neutrons thick, although I think even that is overstating the accuracy of my calculations. But I think that is enough to show a coincidence. Otherwise you get into disagreements about too many details, like how to measure the size of the universe, do you squash flat to a disk or to a surface of a sphere, and how do you take into account relativity, dark energy, universe expansion, etc.

    But this is secondary to my main point that the universe has a finite description length and is Turing computable. It explains the lack of infinite random data sources in nature, and why real observations are usually compressible. It justifies Solomonoff induction and Kolmogorov complexity, which are formal statements but not proofs of Occam's Razor, as a universal probability distribution to which all compression models should aspire.

  21. #81
    Member Alexander Rhatushnyak's Avatar
    Join Date
    Oct 2007
    Location
    Canada
    Posts
    252
    Thanks
    49
    Thanked 107 Times in 54 Posts
    Quote Originally Posted by Matt Mahoney View Post
    I mean like between 0.1 and 10 neutrons thick, although I think even that is overstating the accuracy of my calculations.
    If you discussed the set of assumptions and approximations and then showed calculations, that would look better, I think. If those calculations led you to a range "between 0.01 and 10 neutrons thick" or "between 0.1 and 100 neutrons thick", would you still write "about one neutron thick" ?

    I know little about astrophysics and cosmology, so I would fail to complete the list of assumptions (BTW, I'd rather believe that you have such a list but prefer not to show it, than that you don't have it), but I guess the conclusions "between 0.5 and 2 neutrons thick" and "between 50 and 200 neutrons thick" would lead to very different consequences.

    Don't you believe that your two observations are important for astrophysics and/or cosmology, regardless of how you apply them to prove that the universe has a finite description length and is Turing computable?

    Quote Originally Posted by Matt Mahoney View Post
    But this is secondary to my main point that the universe has a finite description length and is Turing computable. It explains the lack of infinite random data sources in nature, and why real observations are usually compressible.
    Are there no simpler explanations?
    If the universe had no finite description length, would real observations be usually incompressible?
    Which of our assumption(s) are so essential that we get the opposite conclusion if they are incorrect?
    These are examples of questions a thoughtful reader of DCE is likely to ask.
    Last edited by Alexander Rhatushnyak; 28th May 2010 at 17:37.

  22. #82
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    1,040
    Thanks
    104
    Thanked 420 Times in 293 Posts
    Quote Originally Posted by Sportman View Post
    ....311,040,000,000,000 years, of which our universe has completed around 158,630,400,000,000 years (51%)....
    Here I wrongly wrote 'universe'; it should be 'solar system', so:
    ....311,040,000,000,000 years, of which our solar system has completed around 158,630,400,000,000 years (51%)....

    (as written in 1500-1000 BCE)

  23. #83
    Member
    Join Date
    Aug 2008
    Location
    Planet Earth
    Posts
    1,040
    Thanks
    104
    Thanked 420 Times in 293 Posts
    Quote Originally Posted by Matt Mahoney View Post
    Another problem is that the universe is believed to be something like 4% ordinary matter, 21% dark matter and 75% "dark energy" according to some theories. Dark matter might be ordinary matter that did not form into stars, but nobody really understands dark energy which some say causes the universe to accelerate outward.
    Only 1/7 (14.29%) of what scientists call 'matter' or 'energy' is visible in solar systems; the other 6/7 (85.71%) is invisible to the five known human senses and to scientists' instruments. The reason for this is to act as a natural border between life forms, each at their own level of development.

    As I wrote earlier, the universe is endless but is born, becomes mature and dies, and then disappears for the same time as it existed, to be born again and repeat this cycle. In other words, nothing in the universe is not in motion. The force that causes this lies outside the universe but penetrates everything in the universe. This force is endless and forever causing these cycles. The reason for this is to bring every new universe and its life forms to a higher development level.

    Everything in the universe is alive and has some form of development level and consciousness, from low to high. What scientists call 'matter' are clusters of life forms, and every cluster of life forms is part of a bigger cluster, which is also a life form on its own, often not aware of the life forms it is built from; and this repeats endlessly in both directions.

    Universes, galaxies, milky ways, solar systems, planets, humans, animals, plants, minerals, atoms, etc. all act as life forms with their own development level and conscious level. For this reason there are no fixed sizes, because every life form has its own development level and its own limited space to act freely. This is clear to see in humans, where some eat too much and others too little, and have a different size than the average human of the same race, gender and age.
    Last edited by Sportman; 2nd June 2010 at 01:43.

  24. #84
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    4,023
    Thanks
    414
    Thanked 416 Times in 158 Posts
    http://mattmahoney.net/dc/dce.html#Section_557

    Typo: "the engine for WinRK by Malcolm Tayler" - His name is Malcolm Taylor

    You may list the BCM compressor's home page:
    http://encode.narod.ru

  25. #85
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,257
    Thanks
    307
    Thanked 798 Times in 489 Posts
    Thanks.

  26. #86
    Member
    Join Date
    Mar 2009
    Location
    California
    Posts
    6
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by Matt Mahoney View Post
    Without source code we'll never know
    It's a typo. It should be: sum += abs(Bn[x]) + abs(Bw[x])

  27. #87
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,257
    Thanks
    307
    Thanked 798 Times in 489 Posts
    That would be logical, but how do you know?

  28. #88
    Member
    Join Date
    Mar 2009
    Location
    California
    Posts
    6
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Sorry for the very late post; I somehow missed this thread. I know the code. Nice write-up, BTW.

  29. #89
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,257
    Thanks
    307
    Thanked 798 Times in 489 Posts
    OK thanks. I updated the section. http://mattmahoney.net/dc/dce.html#Section_6163 (computation of avg(u,v)).

  30. #90
    Member
    Join Date
    May 2012
    Location
    Finland
    Posts
    2
    Thanks
    0
    Thanked 0 Times in 0 Posts
    I see you linked to my errata in section 6.1.6.3, but it seems you missed a few of them in the description still.

    One is that this section:

    pU = U00 - 2.2076(U01 - S01)
    pL = L00 - 2.2076(L10 - S10)

    Should be this (subtraction changed to addition):

    pU = U00 - 2.2076(U01 + S01)
    pL = L00 - 2.2076(L10 + S10)

    There might be some more, but I am not sure I want to hurt my brain any more by trying to remember that algorithm and all its errors. The errata should basically be correct (unless I made some typos of my own), as confirmed by the fact that I implemented an actual, working decompressor (which did not work without all of the changes listed in the errata, and possibly more in case I forgot any when I wrote it down).

    (Also, as you point out, the entropy coder is pretty bad. For fun, I tried changing it to a much simpler one like the one used by LZMA and friends, and compression became noticeably better. I also fixed some of the other stupidities in the algorithm, like the entirely unnecessary slicing and separation of components.)
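
    For reference, the kind of binary coder LZMA uses is only a few lines per bit. This is my own sketch of the general scheme (encoder side only: 11-bit probabilities, shift-by-5 adaptation, byte-wise renormalization with carry handling), not the exact coder I dropped in:

        #include <cstdint>
        #include <vector>

        // Minimal LZMA-style binary range encoder.  Each context supplies a 16-bit
        // probability in [0, 2048) that the next bit is 0, updated adaptively.
        struct RangeEncoder {
            std::vector<uint8_t> out;
            uint64_t low = 0;
            uint32_t range = 0xFFFFFFFFu;
            uint8_t  cache = 0;
            uint64_t cache_size = 1;

            void encode_bit(uint16_t& prob, int bit) {
                uint32_t bound = (range >> 11) * prob;
                if (bit == 0) { range  = bound;  prob += (2048 - prob) >> 5; }
                else          { low   += bound;  range -= bound;  prob -= prob >> 5; }
                while (range < (1u << 24)) { shift_low(); range <<= 8; }  // renormalize
            }
            void flush() { for (int i = 0; i < 5; ++i) shift_low(); }

        private:
            void shift_low() {                       // emit a byte, propagating carries
                if (low < 0xFF000000ull || low > 0xFFFFFFFFull) {
                    uint8_t carry = uint8_t(low >> 32);
                    out.push_back(uint8_t(cache + carry));
                    while (--cache_size) out.push_back(uint8_t(0xFF + carry));
                    cache = uint8_t(low >> 24);
                }
                ++cache_size;
                low = (low << 8) & 0xFFFFFFFFull;
            }
        };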
