
Thread: US$3 Million Data compression Prize

  1. #1
    Member
    Join Date
    Apr 2012
    Location
    London
    Posts
    263
    Thanks
    13
    Thanked 0 Times in 0 Posts

    Lightbulb US$3 Million Data compression Prize

    Greetings :

    Check out:

    http://groups.google.com/group/comp....2f82e171?hl=en#

    Many will be happy just to have been part of contributing to this historic 'breakthrough' [if it indeed turns out to be one]!
    However, I am also happy to confirm the REWARDS:

    . your choice of accepting a one-time, immediate US$1,000 payment on delivery of the software (C# preferred)

    OR

    . accepting a retainer (a standard-type agreement will be made available for you to review), under which you are entitled to a minimum revenue share of US$3 million in return for competent part-time R&D development over a 3-year period

    You may also opt to communicate your solutions by private email first.

    Cheers,
    LawCounsels

  2. #2
    Member biject.bwts's Avatar
    Join Date
    Jun 2008
    Location
    texas
    Posts
    449
    Thanks
    23
    Thanked 14 Times in 10 Posts
    To be honest, it does not look too hard. But why on earth would anyone pay that kind of money for the code to do it? I, for one, believe that if an offer seems too good to be true, then it's not a real offer. Why did you post this here?

  3. #3
    Member
    Join Date
    Apr 2012
    Location
    London
    Posts
    263
    Thanks
    13
    Thanked 0 Times in 0 Posts

    YOU CAN GUARANTEE THIS WON'T BE SIMPLE

    Hello :

    I think you did not read / missed this in the POSTING:


    https://groups.google.com/forum/?fro...on/mWqW8C-C4XE
    " .... On Tuesday, April 3, 2012 9:32:39 AM UTC, Thomas Richter wrote:
    > For all
    > practical matters, pick a MQ coder, and encode the symbols by binary
    > decisions like 0->a, 10->b, 11-> c.


    Yes, this is a good start.... this certainly satisfies the 'self-delimiting' criterion.

    Every sequence starts with no symbols (nothing) and then progressively accumulates an 'a', or a 'b', or a 'c'.... The initial (simplifying) statistics are such that, among ALL the sequences (one can always take exactly, e.g., 1,000 sequences, since the start positions of all the sequences are ascertainable), the total # of 'a' is exactly = the total # of 'b', and the total # of 'c' is exactly = 2 * the total # of 'b'.

    ... Most 'startling' of all, you will most certainly, soon enough and in exasperation, conclude that there is simply no possible way for existing known compression techniques to do this! ... which is somewhat surprising, given that each of these sequences is definite mathematics, NOT RANDOM, endowed with clear, unambiguous structure and a distribution bias!

    You may then find that this further statistic may or may not help you toward a completely new kind of compression method: the length of each sequence (the total # of symbols within it) follows mathematics derivable from the 1 : 1 : 2 distribution ratio of the 3 symbols.
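    To make the quoted 'binary decisions' suggestion concrete, here is a minimal round-trip sketch of the mapping 0->a, 10->b, 11->c from the quote, showing that it is indeed self-delimiting. The example message below is an illustrative assumption, not part of the challenge.

CODE = {'a': '0', 'b': '10', 'c': '11'}   # the mapping quoted above

def encode(symbols):
    """Concatenate the per-symbol codes; no separators are needed."""
    return ''.join(CODE[s] for s in symbols)

def decode(bits):
    """Recover the symbols; the code is prefix-free, hence self-delimiting."""
    out, i = [], 0
    while i < len(bits):
        if bits[i] == '0':
            out.append('a'); i += 1
        else:
            out.append('b' if bits[i + 1] == '0' else 'c'); i += 2
    return out

if __name__ == '__main__':
    msg = list('abccacbccba')              # arbitrary illustrative message
    assert decode(encode(msg)) == msg      # lossless round trip
    print(encode(msg))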

    YOU CAN GUARANTEE THIS WON'T BE SIMPLE !

    Warm Regards,
    LawCounsels

  4. #4
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,257
    Thanks
    307
    Thanked 796 Times in 488 Posts
    This is why I stopped reading comp.compression.

  5. #5
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,568
    Thanks
    775
    Thanked 687 Times in 372 Posts
    i propose to move it into the runglish/chinglish section

  6. #6
    Member
    Join Date
    Apr 2012
    Location
    London
    Posts
    263
    Thanks
    13
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by Matt Mahoney View Post
    This is why I stopped reading comp.compression.
    This topic is very different now: it is SPECIFIC, and clearly capable of scientific investigation and a clear conclusion. For the very 1ST TIME EVER it has become possible to frame the problem this clearly; also for the very 1ST TIME EVER, 'random' bits have been transformed into a non-random, clearly 'structured' output sub-sequence with a distribution bias [unlike Mark Nelson's AMillionRandomDigits, which is completely devoid of any discernible structure whatsoever].

    . Can the specified symbol sequences (definitely NOT random; they show a distribution bias and clear structure) be encoded more economically? IF SO, HOW?

  7. #7
    Member
    Join Date
    Dec 2010
    Location
    Netherlands
    Posts
    8
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Could someone please explain to me more clearly what this guy is going on about? I've read his posts here and on comp.compression several times and the way he writes is very confusing.

  8. #8
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,495
    Thanks
    26
    Thanked 131 Times in 101 Posts
    I've read somewhere a quote saying something like: if you are unable to describe something in a simple way, then you don't understand it well. Unfortunately I've forgotten the exact quote and its author. Nevertheless, if that quote is true, then the thread's author doesn't understand the problem well.

  9. #9
    Member
    Join Date
    Apr 2012
    Location
    London
    Posts
    263
    Thanks
    13
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by LawCounsels View Post
    Greetings :

    Check out:

    http://groups.google.com/group/comp....2f82e171?hl=en#

    Many will be happy just to have been part of contributing to this historic 'breakthrough' [if it indeed turns out to be one]!
    However, I am also happy to confirm the REWARDS:

    . your choice of accepting a one-time, immediate US$1,000 payment on delivery of the software (C# preferred)

    OR

    . accepting a retainer (a standard-type agreement will be made available for you to review), under which you are entitled to a minimum revenue share of US$3 million in return for competent part-time R&D development over a 3-year period

    You may also opt to communicate your solutions by private email first.

    Cheers,
    LawCounsels

    NOTE: to win the REWARDS your solution needs to attain a net compression saving of 8 bits or more when compressing 100 such sequences, 80 bits or more when compressing 1,000 such sequences, 800 bits or more when compressing 10,000 such sequences, and so forth.

  10. #10
    Member
    Join Date
    Apr 2012
    Location
    London
    Posts
    263
    Thanks
    13
    Thanked 0 Times in 0 Posts
    On Sunday, April 8, 2012 11:25:47 AM UTC, James Dow Allen wrote:
    > On Apr 8, 2:41 pm, LawCouns...@aol.com wrote:
    > > NOTE : to win the REWARDS your solution needs attain 8 bits Net compression savings or more
    >
    > I enjoy compression puzzles and might investigate this one except ...
    >
    > Skimming your posts I find I have no idea whatsoever
    > what problem you're posing, nor how the compression savings would
    > be measured. You do mention some ternary system with a
    > token termination condition, but any (terminated) sequence
    > of trits would be valid, just with different token boundaries.
    >
    > At one point you imply a trit is 1.5 bits.
    > Wrong, it's 1.5849625 bits. Don't know if this
    > makes your puzzle easier or harder.
    >
    > Before you waste time trying to tell us what your
    > actual requirement is, be aware that, if I solve it,
    > I will not divulge my solution until the REWARD is in escrow.
    >
    > James


    You choose a number, e.g. 1,000, of such subsequences (each subsequence terminates when # of 'c' = # of 'b' + 2).... Yes, each such sequence can be of various lengths, as you mentioned (in the # of symbols within it), BUT you know the distribution of the sequence lengths; e.g. you can ALWAYS generate any # of such sequences (for testing your compression algorithm) from a source with a probability of producing an 'a' symbol 25% of the time, a 'b' symbol 25% of the time, and a 'c' symbol 50% of the time.

    Yes, a trit is 1.5849625 bits (as when using an arithmetic coder)... but I was
    thinking that perhaps, using the combinatorial coefficient C(100; 50, 25, 25), this comes to an average of near 1.5 bits per trit? (ignoring the cost of recording the multiplicities)
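    A quick check of the two figures mentioned here, as a sketch only: it computes log2(3) for an unconstrained trit and the enumerative cost implied by the multinomial coefficient C(100; 50, 25, 25), ignoring the cost of transmitting the counts, exactly as the caveat above says.

import math

def log2_multinomial(n, parts):
    """log2 of the multinomial coefficient n! / (k1! * k2! * ...)."""
    assert sum(parts) == n
    ln = math.lgamma(n + 1) - sum(math.lgamma(k + 1) for k in parts)
    return ln / math.log(2)

print(round(math.log2(3), 7))                  # 1.5849625 bits per unconstrained trit
bits = log2_multinomial(100, [50, 25, 25])
print(round(bits, 2), round(bits / 100, 4))    # ~143.19 bits in total, ~1.4319 bits per trit

    As later replies in the thread point out, the roughly 0.07 bits per trit that this appears to save is, on average, what it then costs to tell the decoder the counts, so the total cannot drop below 1.5 bits per trit on average.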

    You should provide an .exe that takes in any generated # of such sequences, encodes them to something smaller, then decodes back to the same # of such sequences [it NEEDS ONLY SHOW that it ATTAINS THIS ON AVERAGE, so it won't be 'faulted' on very rare, extreme generated input sequences].

    YES, the REWARDS WILL BE placed IN ESCROW on request, provided a time-expired .exe first clearly shows a saving of 0.08 bits per sequence encoded.


    LawCounsels

  11. #11
    Member
    Join Date
    Dec 2011
    Location
    Germany / Hessen
    Posts
    18
    Thanks
    0
    Thanked 1 Time in 1 Post
    Quote Originally Posted by TheVoid View Post
    I've read his posts here and on comp.compression several times and the way he writes is very confusing.

    Yes, indeed! Very confusing....

  12. #12
    Programmer michael maniscalco's Avatar
    Join Date
    Apr 2007
    Location
    Boston, Massachusetts, USA
    Posts
    140
    Thanks
    26
    Thanked 94 Times in 31 Posts
    Quote Originally Posted by Matt Mahoney View Post
    This is why I stopped reading comp.compression.
    Same for me. I remember the comp.compression of ten years back, which was a great meeting place for great minds. Now most of the posts on comp.compression read as if the poster were perhaps bipolar/manic and unable to construct a complete thought without losing focus. This particular individual appears to specialize in posts which are completely incoherent gibberish.

    I hope that encode.su does not suffer the same fate as comp.compression.

  13. #13
    Member BetaTester's Avatar
    Join Date
    Dec 2010
    Location
    Brazil
    Posts
    43
    Thanks
    0
    Thanked 3 Times in 3 Posts
    Suppose I win the prize.
    Who will pay me money?

    You?
    Some unknown millionaire?
    Bill Gates?
    DARPA?
    Any specific institution?

  14. #14
    Member
    Join Date
    May 2008
    Location
    England
    Posts
    325
    Thanks
    18
    Thanked 6 Times in 5 Posts
    Quote Originally Posted by TheVoid View Post
    Could someone please explain to me more clearly what this guy is going on about? I've read his posts here and on comp.compression several times and the way he writes is very confusing.
    Yes he is. He has his location set as London, but I doubt that very, very much; English obviously isn't his native language. I'd put my money on India/China.

  15. #15
    Member
    Join Date
    Apr 2012
    Location
    London
    Posts
    263
    Thanks
    13
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by LawCounsels View Post
    On Monday, 9 April 2012 17:03:04 UTC+1, biject wrote:
    > On Apr 9, 5:19 am, James Dow Allen <jdallen2...@yahoo.com> wrote:
    > > On Apr 9, 12:58 pm, lawcouns...@gmail.com wrote:
    > >
    > > > a source with probability of producing an 'a' symbol 25% of time
    > > > a 'b' symbol 25% of times a 'c' symbol 50% of times
    > >
    > > Allow me to recommend the optimal Huffman code:
    > > c - 0
    > > a - 10
    > > b - 11
    > > This can be improved, though only slightly, using details
    > > you've omitted from your summary.
    > >
    > > This was so trivial, I'll discount it down to, say $950.
    > >
    > > If this is unsatisfactory, I'll withdraw from the contest.
    > > Even paid at minimum wage I'm afraid it would take significant
    > > funds (payable in advance, please!) just to elicit a
    > > proper problem statement from you.
    > >
    > > I don't have PayPal. Contact me for instructions on how
    > > to pay the $950.
    > >
    > > James
    >
    > Let's see: c is .5 * 1 = .5, b is .25 * 2 = .5, a is .25 * 2 = .5;
    > that's .5 + .5 + .5 = 1.5 bits for the average symbol, while
    > if you encode each with 1.5849625 you save about .0849625, which
    > is more than the .08. It appears you're in the money. I have a
    > hunch that there is still something missing, in which case I would
    > not count on the money yet.
    >
    > First of all, does he want at least .08 bits saved in every case,
    > or just in the average case? If it's the average case, you could be
    > on the right track. If it's every case, then since you write only
    > whole numbers of bits, the .08 saving gets a little harder. It
    > would be nice to know, if the guy decides you haven't won, just what
    > he does want. I have read it several times and yet I do not think it's
    > clear enough to tackle without him saying "oh, I meant this and not
    > that".
    >
    > Assuming he doesn't declare you the winner:
    > 1) is the saving an average thing, or does each file have to be smaller?
    > 2) how do you measure the saving: is it .08 off 1.5849625 bits per
    > symbol, or is it .08 less than 1.5?
    > 3) I'm not sure why you say the source has C = .5 while A and B = .25; the
    > fact is, even if the source were A = B = C = 1/3, for short files,
    > if you ran each source enough times and created 100 files from each,
    > you could still get the same set of 100 files in both cases.
    > So your test set-up is not valid. There is nothing magical about
    > your source. Except that if I know it's a fixed IID source from, say, 2 or
    > 3 different models, then as you create more files you can, with increasing
    > probability, determine which one it most likely is. But you can't be
    > 100% certain which one it is unless you use an ever-increasing number
    > of files.
    >
    >
    > David A. Scott
    > --
    > My Crypto code
    > http://bijective.dogma.net/crypto/scott19u.zip
    > http://www.jim.com/jamesd/Kong/scott19u.zip old version
    > My Compression code http://bijective.dogma.net/
    > **TO EMAIL ME drop the roman "five" **
    > Disclaimer: I am in no way responsible for any of the statements
    > made in the above text. For all I know I might be drugged.
    > As a famous person once said, "any cryptographic
    > system is only as strong as its weakest link"

    THE COMPLETE SPECIFICATIONS :
    =============================

    1. Generate a number, e.g. 1,000, of such sequences (each sequence is composed of the ternary symbols 'a', 'b', 'c'; when # of 'c' = # of 'b' + 2, the sequence ENDS and the next sequence begins), using a source producing symbol 'a' 25% of the time, symbol 'b' 25% of the time and symbol 'c' 50% of the time. Call the total # of symbols in these 1,000 sequences N. NOTE: among these 1,000 sequences the # of 'a' is invariable near = the # of 'b' & the # of 'c' is invariable near = 2 * the # of 'b', THUS the probability model here is 25% : 25% : 50%.

    2. Compress these, e.g., 1,000 generated sequences using your .exe; it must decode back to the same 1,000 sequences.

    3. IF your compressed file's bit length =< 1.5 * N - ( 0.08 * N ), THEN YOU WIN THE REWARDS! I.e. if your .exe saves 0.08 bits per sequence 'on average' you have WON (it need not do so invariably, every time, on every conceivable file!), but note that the original set of sequences is here taken to be 1.5 * N bits long (as originally explicitly stated: 1.5 * N bits, NOT 1.5849625 * N bits).

    4. There are no restrictions on memory or storage requirements; you may even show that your .exe works on a 'research network supercomputer cluster', BUT processing must complete within a day.

    LawCounsels

  16. #16
    Member
    Join Date
    Apr 2012
    Location
    London
    Posts
    263
    Thanks
    13
    Thanked 0 Times in 0 Posts
    THE COMPLETE SPECIFICATIONS :
    =============================

    1. Generate a number, e.g. 1,000, of such sequences (each sequence is composed of the ternary symbols 'a', 'b', 'c'; when # of 'c' = # of 'b' + 2, the sequence ENDS and the next sequence begins), using a source producing symbol 'a' 25% of the time, symbol 'b' 25% of the time and symbol 'c' 50% of the time. Call the total # of symbols in these 1,000 sequences N. NOTE: among these 1,000 sequences the # of 'a' is invariable near = the # of 'b' & the # of 'c' is invariable near = 2 * the # of 'b', THUS the probability model here is 25% : 25% : 50%.

    2. Compress these, e.g., 1,000 generated sequences using your .exe; it must decode back to the same 1,000 sequences.

    3. IF your compressed file's bit length =< 1.5 * N - ( 0.08 * # OF SEQUENCES ENCODED ), THEN YOU WIN THE REWARDS! I.e. if your .exe saves 0.08 bits per sequence 'on average' you have WON (it need not do so invariably, every time, on every conceivable file!), but note that the original set of sequences is here taken to be 1.5 * N bits long (as originally explicitly stated: 1.5 * N bits, NOT 1.5849625 * N bits).

    4. There are no restrictions on memory or storage requirements; you may even show that your .exe works on a 'research network supercomputer cluster', BUT processing must complete within a day.

    LawCounsels

    NB: should anyone win, I certainly recommend opting for the US$3M (this minimum of US$3M is GUARANTEED BY YOUR WINNING SOLUTION ITSELF!).
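    For concreteness, here is a minimal sketch of the generator and win criterion described in the specification above. The source model and stop rule are taken from the spec itself; the function names, the Python form (rather than a packaged .exe) and the random seed are assumptions of this sketch, not part of the challenge.

import random

def generate_sequence(rng):
    """One sequence from the 25%/25%/50% memoryless source; stops as soon as #c == #b + 2."""
    seq, b, c = [], 0, 0
    while c != b + 2:
        s = rng.choices('abc', weights=[1, 1, 2])[0]
        seq.append(s)
        if s == 'b':
            b += 1
        elif s == 'c':
            c += 1
    return ''.join(seq)

def wins(compressed_bits, total_symbols, num_sequences):
    """Win criterion as restated in item 3 above: beat 1.5 bits/symbol by 0.08 bits per sequence."""
    return compressed_bits <= 1.5 * total_symbols - 0.08 * num_sequences

if __name__ == '__main__':
    # Sketch only: the spec asks for an .exe; this just illustrates the generator and the target.
    rng = random.Random(42)
    seqs = [generate_sequence(rng) for _ in range(1000)]
    N = sum(len(s) for s in seqs)
    print('total symbols N:', N)
    print('baseline bits (1.5 * N):', 1.5 * N)
    print('bit budget to win:', 1.5 * N - 0.08 * len(seqs))

    With this source the expected sequence length is 8 symbols (each symbol moves the count #c - #b by +1 with probability 1/2, by -1 with probability 1/4, and not at all otherwise, giving a drift of +1/4 per symbol towards the stopping level of +2), so 1,000 sequences carry roughly 12,000 baseline bits at 1.5 bits per symbol, and the required saving is 80 bits.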

  17. #17
    Member
    Join Date
    Dec 2011
    Location
    Cambridge, UK
    Posts
    504
    Thanks
    184
    Thanked 177 Times in 120 Posts
    This makes no sense!

    If the data is truly randomly generated with p(A)=.25, p(B)=.25 and p(C)=.5, then on average the entropy will be 1.5 bits per symbol. It's been proven that you cannot go lower than this. (And a simple Huffman tree of C=0, A=10, B=11 will work fine.) So you're just wasting time.

    If the data is NOT randomly generated, then it needs to be explained better. Are you saying that no randomly generated string of symbols can ever have more than 2 more c's than b's (yet we still have p(C) = 2*p(B))? However, you just slice the string of symbols at that point and keep going? If so, it is completely identical to the initial random case, just split into segments. That cannot possibly improve your compression, as on average it's the exact same data.

    The only practical way I see is to find a weakness in, or a prediction of, the random number generator, but then it's not truly random - only pseudo-random - and it's all fake and irrelevant.
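    As a quick numerical check of the entropy claim above (a minimal sketch; the probabilities and the Huffman code C=0, A=10, B=11 are exactly as stated in the post):

import math

probs = {'A': 0.25, 'B': 0.25, 'C': 0.5}
code_lengths = {'C': 1, 'A': 2, 'B': 2}   # Huffman tree: C=0, A=10, B=11

entropy = -sum(p * math.log2(p) for p in probs.values())
avg_code_len = sum(probs[s] * code_lengths[s] for s in probs)

print(entropy, avg_code_len)   # both print 1.5 (bits per symbol)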

  18. #18
    Member
    Join Date
    Apr 2012
    Location
    London
    Posts
    263
    Thanks
    13
    Thanked 0 Times in 0 Posts
    On Tuesday, April 10, 2012 1:43:45 PM UTC+1, James Dow Allen wrote:

    > On Apr 10, 4:29 pm, LawCouns...@aol.com wrote:

    > > THE COMPLETE SPECIFICATIONS :
    > > =============================
    > >
    > > 1. Generate a number, e.g. 1,000, of such sequences (each sequence is composed of the ternary symbols 'a', 'b', 'c'; when # of 'c' = # of 'b' + 2, the sequence ENDS and the next sequence begins), using a source producing symbol 'a' 25% of the time, symbol 'b' 25% of the time and symbol 'c' 50% of the time. Call the total # of symbols in these 1,000 sequences N. NOTE: among these 1,000 sequences the # of 'a' is invariable near = the # of 'b' & the # of 'c' is invariable near = 2 * the # of 'b', THUS the probability model here is 25% : 25% : 50%.


    > I'm not clear on what "invariable near =" means. I think you specify
    > that the trit is from a random memoryless source. (Anyway, a
    > different intepretation would have smallish effect.)

    If the source produces symbol 'a' 25% of the time, symbol 'b' 25% of the time and symbol 'c' 50% of the time, THEN after, e.g., 1,000 sequences have been generated (with a total of N symbols within these 1,000 sequences), one can say with a very high confidence level that the # of 'a' will be around N/4, the # of 'b' will be around N/4, and the # of 'c' will be around N/2.

    > > 2. Compress these, e.g., 1,000 generated sequences using your .exe; it must decode back to the same 1,000 sequences.


    > Does the decompressor know, in advance, the exact number of bits in
    > the sequence? (Even if it does, the compression savings will be tiny,
    > when amortized over 1000 strings.)

    You can ALWAYS choose ONLY a particular fixed # of sequences, e.g. 1,000 or 10,000 etc., to compress, so the decompressor knows in advance the EXACT # of sequences compressed..... but it may ONLY guess (albeit quite accurately) at the total # of bits, since this is not known in advance.


    > > 3. IF your compressed file's bit length =< 1.5 * N - ( 0.08 * N ), THEN YOU WIN THE REWARDS!


    > Starting with a source of exactly 1.500 bits/token of info,
    > we wait for the 1000th terminal, then compress it to
    > 1.420 bits/token. Right? Good luck!

    > I'll leave my bet on Shannon, Kraft, and the pigeons.


    SO WOULD I, were these 1,000 sequences 'random' and structureless.... but here the structure is that within each sequence the # of 'c' is EXACTLY = the # of 'b' + 2.

    IN FACT... every sequence MUST END with a 'c', and immediately before this final 'c' ONLY an 'a' OR a 'c' can occur [NEVER a 'b'!].... If you look more carefully, there are many more restrictions on the possible permutations / ordered arrangements of the 'a's, 'b's and 'c's within a sequence; e.g., if there are 6 symbols in a sequence, the 1st 2 symbols can't both be 'a's and the last 3 symbols can't ALL be 'b's.... and so forth.

    .... IT IS THESE THAT MAY MAKE YOUR SOUGHT-FOR 1.420 bits/token possibly achievable (?)

    >
    > James Dow Allen

  19. #19
    Member
    Join Date
    Apr 2012
    Location
    London
    Posts
    263
    Thanks
    13
    Thanked 0 Times in 0 Posts
    > > Starting with a source of exactly 1.500 bits/token of info,
    > > we wait for the 1000th terminal, then compress it to
    > > 1.420 bits/token. Right? Good luck!
    > >
    > > I'll leave my bet on Shannon, Kraft, and the pigeons.

    SO WOULD I, were these 1,000 sequences 'random' and structureless.... but here the structure is that within each sequence the # of 'c' is EXACTLY = the # of 'b' + 2.

    IN FACT... every sequence MUST END with a 'c', and immediately before this final 'c', IF the # of symbols within the sequence is > 2, THEN ONLY an 'a' OR a 'c' can occur [NEVER a 'b'!].... If you look more carefully, there are many more restrictions on the possible permutations / ordered arrangements of the 'a's, 'b's and 'c's within a sequence; e.g., if there are 6 symbols in a sequence, the 1st 2 symbols can't both be 'a's and the last 3 symbols can't ALL be 'b's.... and so forth.... IT IS THESE THAT MAY MAKE YOUR SOUGHT-FOR 1.420 bits/token possibly achievable (?)
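    The two structural claims just made (every sequence ends with a 'c', and the symbol immediately before that final 'c' is never a 'b') can be checked empirically. A minimal sketch, assuming the 25%/25%/50% source and the #c = #b + 2 stop rule stated earlier in the thread; the generator and seed here are illustrative only:

import random

def generate_sequence(rng):
    """One sequence from the 25%/25%/50% source; stops when #c == #b + 2."""
    seq, b, c = [], 0, 0
    while c != b + 2:
        s = rng.choices('abc', weights=[1, 1, 2])[0]
        seq.append(s)
        if s == 'b':
            b += 1
        elif s == 'c':
            c += 1
    return seq

if __name__ == '__main__':
    rng = random.Random(0)
    for _ in range(100000):
        seq = generate_sequence(rng)
        assert seq[-1] == 'c'                  # always ends with 'c'
        assert len(seq) < 2 or seq[-2] != 'b'  # penultimate symbol is never 'b'
    print('both structural claims held over 100,000 generated sequences')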

  20. #20
    Member
    Join Date
    Apr 2012
    Location
    London
    Posts
    263
    Thanks
    13
    Thanked 0 Times in 0 Posts

    Exclamation

    Quote Originally Posted by JamesB View Post
    This makes no sense!

    If the data is truly randomly generated with p(A)=.25, p(B)=.25 and p(C)=.5, then on average the entropy will be 1.5 bits per symbol. It's been proven that you cannot go lower than this. (And a simple Huffman tree of C=0, A=10, B=11 will work fine.) So you're just wasting time.

    If the data is NOT randomly generated, then it needs to be explained better. Are you saying that no randomly generated string of symbols can ever have more than 2 more c's than b's (yet we still have p(C) = 2*p(B))? However, you just slice the string of symbols at that point and keep going? If so, it is completely identical to the initial random case, just split into segments. That cannot possibly improve your compression, as on average it's the exact same data.

    The only practical way I see is to find a weakness in, or a prediction of, the random number generator, but then it's not truly random - only pseudo-random - and it's all fake and irrelevant.
    PROFOUND ! Thanks

    I would refer you to Entropy Definition : Request for Comments

    https://groups.google.com/forum/#!to...on/w7QSfqtfeg4


    These sequences have already been encoded; the compression saves a minimum net 0.063 bits per sequence!!!

    The 1st to reach a net compression saving of 0.08 bits per sequence WINS US$3M.

    LawCounsels





  21. #21
    Member
    Join Date
    Apr 2012
    Location
    London
    Posts
    263
    Thanks
    13
    Thanked 0 Times in 0 Posts
    On Apr 10, 5:03 pm, James Dow Allen <jdallen2...@yahoo.com> wrote:

    > Does it seem strange that I knew I was going to dislike
    > New Google Groups? (though I didn't know *what* I'd dislike
    > about it.) Almost every change made in some systems seems
    > for the worse not better.
    >
    > In the example below, I click on lawcounsel's link only
    > to get some "overview" page with no content.
    >
    > (Old Google Groups is amusing too, of course. I'm now
    > looking at a window with no less than five scroll-bars,
    > 3 of which I had to manipulate just to get this far!)
    >
    > On Apr 10, 9:29 pm, lawcouns...@gmail.com wrote:
    >
    > > I would refer you to Entropy Definition : Request for Comments
    > >https://groups.google.com/forum/#!to...on/w7QSfqtfeg4
    > > These sequences have already been encoded; the compression saves a minimum net 0.063 bits per sequence!!!
    > > The 1st to reach a net compression saving of 0.08 bits per sequence WINS US$3M+
    >
    > Three comments:
    > 1. Your earlier post had
    > <= 1.5 * N - ( 0.08 * N )
    > Are we now to understand that the first N here is number of trits,
    > and the second number of sequences?

    YES... it has since been corrected to read <= 1.5 * N - ( 0.08 * # OF SEQUENCES )
    > 2. I didn't read about the "saves minimum Net 0.063 bits per
    > sequence."
    > (As I said the link doesn't work.) By "minimum" do you mean
    > "actual in one experiment"? I'm guessing fluke.

    FOR "ANY" ONE SUCH SEQUENCE [ mathematics 'guaranteed' minimum 0.063 bits savings ]
    .... ALSO VERIFIED OVER LARGE # OF SUCH SEQUENCES

    > 3. It may seem paradoxical that no compression savings are available
    > despite the c=b+2 constraint. But consider a simpler related problem:
    > Flip a coin and stop on the first Heads. Compress the result.
    > H occurs with prob. 1/2
    > TH occurs with prob 1/4
    > TTH with prob 1/8, etc.
    > No way to compress despite the constraint.

    > James

    Perhaps the particular constraint here is superficial... fundamentally reducible to an equivalent of the common 'fair coin' tosses case here.
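    For concreteness, a small numerical illustration of the quoted coin-flip analogy (a sketch only; the truncation at 60 terms is an assumption that loses a negligible tail): the natural code that simply writes the flips spends, on average, exactly the entropy of the outcome, so the stopping constraint offers no room for compression.

import math

# Outcomes of "flip a fair coin until the first Heads":
# 'H' with prob 1/2, 'TH' with prob 1/4, 'TTH' with prob 1/8, ...
probs = [2 ** -(k + 1) for k in range(60)]                      # truncated geometric distribution
entropy = -sum(p * math.log2(p) for p in probs)                 # -sum p*log2(p)
expected_bits = sum((k + 1) * p for k, p in enumerate(probs))   # cost of writing the k+1 flips verbatim

print(round(entropy, 6), round(expected_bits, 6))               # both ~2.0 bits: nothing left to save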

  22. #22
    Member
    Join Date
    Apr 2012
    Location
    London
    Posts
    263
    Thanks
    13
    Thanked 0 Times in 0 Posts
    On Tuesday, April 10, 2012 6:34:11 PM UTC+1, James Dow Allen wrote:

    > On Apr 10, 11:56 pm, LawCounsels <lawcouns...@gmail.com> wrote:

    > > FOR "ANY" ONE SUCH SEQUENCE [ mathematics 'guaranteed' minimum 0.063
    > > bits savings ]

    > Congratulations! It sounds like you've disproved
    > the Kraft's Inequality. Have you published or is
    > it secret? And why the REWARD ($3 million or $1000?)
    > to extend the 0.063 win to 0.080?
    > Once you have a perpetual compressor, can't you just
    > run several copies in series to get as much compression
    > as you want?

    > James

    Because the 'smaller' compressed version of the 1,000 such sequences is NOT in
    the same sequence format any more.....

    TO BE REPEATABLE it first needs the bit saving improved to an average of 0.08 bits per sequence
    .... possible improvement schemes are plenty, and promising!

  23. #23
    Member
    Join Date
    Apr 2012
    Location
    London
    Posts
    263
    Thanks
    13
    Thanked 0 Times in 0 Posts
    On Wednesday, April 11, 2012 3:36:05 PM UTC+1, James Dow Allen wrote:

    > On Apr 11, 9:26 pm, Thomas Richter <t...@math.tu-berlin.de> wrote:

    > > That you cannot compress below 1.5bits/sample is the Shannon
    > > result of lossless channel coding.


    > Yes, he already knows this much. What you seem to overlook is
    > that he's refuted, both experimentally and theoretically,
    > Shannon's theory, the pigeonhole principle, and even the
    > Kraft's Inequality. Naturally he's keeping details of his
    > method secret; wouldn't you?


    > I repeat my offer to OP. If you supply an .exe which achieves
    > the guaranteed reduction you describe, I will supply an .exe
    > which invokes that .exe and wins the $1,000,000 prize for
    > MillionDigits.
    > Can we split the $1,000,000 fifty-fifty?
    >
    > > Greetings,
    > > Thomas


    > Hello yourself,
    > James


    YES ... 50/50 will do nicely

    .... give this a few more days, to see if anyone gets to 0.08 bits saving per sequence
    1st

    I will be in touch by private email re signing.

    ALSO, let me know if you have acquaintances at companies / compression research institutes / prominent scientists happy to collaborate and co-develop this for the market.

    Cheers,
    LawCounsels

  24. #24

  25. #25
    Member
    Join Date
    Apr 2012
    Location
    London
    Posts
    263
    Thanks
    13
    Thanked 0 Times in 0 Posts

    UPDATE ANNOUNCEMENT : US$3M DATA COMPRESSION PRIZE

    =============================================================
    UPDATE ANNOUNCEMENT :
    US$3M DATA COMPRESSION PRIZE
    =============================================================

    I AM NOW AWARE OF A COUPLE OF SOLUTIONS PUT FORTH THAT ARE CERTAIN
    TO EASILY AND FAR EXCEED THE REQUIRED 0.08-BIT COMPRESSION SAVING PER SEQUENCE....
    IT WILL NOW TAKE SOME TIME TO TEST, DEVELOP AND CONFIRM THE SOLUTION.
    KEEP YOUR SOLUTIONS COMING.....
    Warm Regards,
    LawCounsels


    THE COMPLETE SPECIFICATIONS :
    =============================

    1. Generate a number, e.g. 1,000, of such sequences (each sequence is composed of the ternary symbols 'a', 'b', 'c'; when # of 'c' = # of 'b' + 2, the sequence ENDS and the next sequence begins), using a source producing symbol 'a' 25% of the time, symbol 'b' 25% of the time and symbol 'c' 50% of the time. Call the total # of symbols in these 1,000 sequences N. NOTE: among these 1,000 sequences the # of 'a' is invariable near = the # of 'b' & the # of 'c' is invariable near = 2 * the # of 'b', THUS the probability model here is 25% : 25% : 50%.

    2. Compress these, e.g., 1,000 generated sequences using your .exe; it must decode back to the same 1,000 sequences.

    3. IF your compressed file's bit length =< 1.5 * N - ( 0.08 * N ), THEN YOU WIN THE REWARDS! I.e. if your .exe saves 0.08 bits per sequence 'on average' you have WON (it need not do so invariably, every time, on every conceivable file!), but note that the original set of sequences is here taken to be 1.5 * N bits long (as originally explicitly stated: 1.5 * N bits, NOT 1.5849625 * N bits).

    4. There are no restrictions on memory or storage requirements; you may even show that your .exe works on a 'research network supercomputer cluster', BUT processing must complete within a day.



  26. #26
    Member
    Join Date
    Apr 2012
    Location
    London
    Posts
    263
    Thanks
    13
    Thanked 0 Times in 0 Posts
    Quote Originally Posted by LawCounsels View Post
    =============================================================
    UPDATE ANNOUNCEMENT :
    US$3M DATA COMPRESSION PRIZE
    =============================================================

    I AM NOW AWARE OF A COUPLE OF SOLUTIONS PUT FORTH THAT ARE CERTAIN
    TO EASILY AND FAR EXCEED THE REQUIRED 0.08-BIT COMPRESSION SAVING PER SEQUENCE....
    IT WILL NOW TAKE SOME TIME TO TEST, DEVELOP AND CONFIRM THE SOLUTION.
    KEEP YOUR SOLUTIONS COMING.....
    Warm Regards,
    LawCounsels


    THE COMPLETE SPECIFICATIONS :
    =============================

    1. Generate a number, e.g. 1,000, of such sequences (each sequence is composed of the ternary symbols 'a', 'b', 'c'; when # of 'c' = # of 'b' + 2, the sequence ENDS and the next sequence begins), using a source producing symbol 'a' 25% of the time, symbol 'b' 25% of the time and symbol 'c' 50% of the time. Call the total # of symbols in these 1,000 sequences N. NOTE: among these 1,000 sequences the # of 'a' is invariable near = the # of 'b' & the # of 'c' is invariable near = 2 * the # of 'b', THUS the probability model here is 25% : 25% : 50%.

    2. Compress these, e.g., 1,000 generated sequences using your .exe; it must decode back to the same 1,000 sequences.

    3. IF your compressed file's bit length =< 1.5 * N - ( 0.08 * # of sequences ) BITS, THEN YOU WIN THE REWARDS! I.e. if your .exe saves 0.08 bits per sequence 'on average' you have WON (it need not do so invariably, every time, on every conceivable file!), but note that the original set of sequences is here taken to be 1.5 * N bits long (as originally explicitly stated: 1.5 * N bits, NOT 1.5849625 * N bits).

    4. There are no restrictions on memory or storage requirements; you may even show that your .exe works on a 'research network supercomputer cluster', BUT processing must complete within a day.


    On Apr 16, 10:27 pm, James Dow Allen <jdallen2...@yahoo.com> wrote:


    > On Apr 17, 12:31 am, Thomas Richter <t...@math.tu-berlin.de> wrote:

    > > As far as I understand
    > > the problem now, the problem is that *each* of the individual
    > > subsections of the file have to end at 513 bits (output bits). Why that
    > > makes sense I do not know, ... All that depends of
    > > course on how this "513 bits" constraint comes into play, which I do not
    > > understand fully.

    > I apologize; it looks like I missed the latest memo.

    > Do we still get the $1 MILLION REWARD for satisfying the
    > FINAL and COMPLETE specification, with this latest frill just
    > an extra-credit problem for the $3 MILLION bonus?

    > Is there a YouTube link?
    > Best ever,
    > James



    Hi :

    Initial tests confirmed the target of 8.0 bits saving per sequence was easily
    exceeded.... I am making this into an .exe now.

    Can any of you help lead/arrange R&D at Intel / Google / NASA / CERN
    or the like?

    BTW: have any of you got close to 8.0 yet? Please tell.
    Warm Regards,
    LawCounsels

  27. #27
    Member
    Join Date
    Apr 2012
    Location
    London
    Posts
    263
    Thanks
    13
    Thanked 0 Times in 0 Posts
    On Monday, April 30, 2012 5:58:38 PM UTC+1, Fibonacci Code wrote:
    > On Apr 30, 11:34 pm, lawcouns...@gmail.com wrote:
    > > On Monday, April 30, 2012 3:36:33 PM UTC+1, Fibonacci Code wrote:
    > > > Any very high entropy binary file, when chopped into ternary trits, will
    > > > have a distribution close to 50 percent '0', 25 percent '10', 25 percent
    > > > '11',

    > > I am awfully busy at the moment, with hardly any time to spare.... can you kindly first help by quickly, clearly and simply explaining how arbitrarily choosing any # of bits from a random binary file will ALWAYS give 50% '0', 25% '10', 25% '11'?

    > > ... to proceed then : )

    > > BTW: are you, or anyone here, fluent with C# & leading-edge combinatorics ('multinomial' enumerative lexicographic rank index etc., or able to pick it up quickly)? I am looking for a few more high-calibre 'confidential' co-researchers/developers on this unprecedented, immensely rewarding profit-share basis.

    > > email your short details private to LawCounsels at aol dot com

    > > > where the reverse Huffman code 0-A, 10-B, 11-C applies.

    > > > So this is what Law means by turning a random file into biased probabilities.

    > > > He is now trying to ask this room to compress the reverse-Huffman
    > > > probabilities,

    > > > an attempt to create recursive compression:

    > > > 0101100010101111000001111
    > > > 0
    > > > 10
    > > > 11

    > > > Well, 3 million USD would be nothing for this kind of feat, well, until now,
    > > > luckily.

    > > > No one has found the solution yet, me included!

    > > > It won't work, dear; I worked on the same bias 7 years ago.

    > > > I do have a few close-to-jackpot algorithms, but I still had to abandon
    > > > them. The reason is simple:

    > > > 50% 25% 25% is not precise and not constant for all the files you try
    > > > to resolve.

    > > > The differences need entropy, and that wipes out the savings.

    > > > Regards,

    > > > Fibonacci

    > Not always, as I mentioned

    > 50% 25% 25% is not precise and not constant for all the files you try
    > > to resolve.

    > It is very likely for a high-entropy file, e.g. a densely packed one.

    > So it won't work.



    Agreed! It certainly will not work, as you correctly said, on just any very high entropy binary file chopped into ternary trits (since it will very likely have a distribution close to 50 percent '0', 25 percent '10', 25 percent '11').....

    IMPORTANT TO NOTE: your very same high-entropy binary file chopped into ternary trits above IS COMPLETELY DIFFERENT from the 'constrained' US$3M data-compression sequences, NAMELY that EACH SEQUENCE WHICH IS COMPRESSIBLE ENDS WITH EXACTLY 2 MORE 'c's THAN 'b's within each sequence [this is completely different from, and absent in, mere 50% '0', 25% '10', 25% '11' probabilities].
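    As an illustration of the 'chopping' being discussed, a minimal sketch that parses random bits with the prefix code 0-A, 10-B, 11-C quoted above and tallies the resulting symbol frequencies; the random source and seed are assumptions of this sketch.

import random

def chop_to_trits(bits):
    """Parse a bit string into ternary symbols via the prefix code 0->A, 10->B, 11->C."""
    out, i = [], 0
    while i < len(bits):
        if bits[i] == '0':
            out.append('A'); i += 1
        elif i + 1 < len(bits):
            out.append('B' if bits[i + 1] == '0' else 'C'); i += 2
        else:
            break  # dangling final '1' cannot be decoded
    return out

if __name__ == '__main__':
    random.seed(7)
    bits = ''.join(random.choice('01') for _ in range(600000))
    trits = chop_to_trits(bits)
    n = len(trits)
    print({s: round(trits.count(s) / n, 3) for s in 'ABC'})  # roughly A: 0.5, B: 0.25, C: 0.25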


    Warm regards,
    LawCounsels

  28. #28
    Tester
    Black_Fox's Avatar
    Join Date
    May 2008
    Location
    [CZE] Czechia
    Posts
    471
    Thanks
    26
    Thanked 9 Times in 8 Posts
    I think it's time to cut this off, before encode.su follows comp.compression's fate.
    I am... Black_Fox... my discontinued benchmark
    "No one involved in computers would ever say that a certain amount of memory is enough for all time? I keep bumping into that silly quotation attributed to me that says 640K of memory is enough. There's never a citation; the quotation just floats like a rumor, repeated again and again." -- Bill Gates

  29. #29
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,257
    Thanks
    307
    Thanked 796 Times in 488 Posts
    I agree.

  30. #30
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,568
    Thanks
    775
    Thanked 687 Times in 372 Posts
    i wonder why it's still here


