I developed an algorithm that can compress text from 10,000 letters to 9 letters, and it can decompress it too. But it's useless to me, so I want to sell it.
What kind of text? Are there many repetitions? If so, it's easily compressible.
If your file is random, then you are out of luck. Random content is compressible, at least by a few percent, but not by as much as you mentioned. But never say never - I am working on my own custom data compression software that will be able to handle ANY filetype and compress it by at least 90% losslessly, but it will be terribly slow.
Could you post some screenshots of your algorithm or at least a compressed sample? Maybe then we can tell you more about it, and we could also help you compress it much better.
Thanks.
You don't need to sell it directly.
Just apply to http://prize.hutter1.net/ or http://mailcom.com/challenge/ or https://marknelson.us/posts/2012/10/...turns-ten.html .
There're also plenty of other contests where you can advertise your work.
Sorry, I can't show you. Mine is just an algorithm, not software, so it has some shortcomings. It can compress random letters, and the 10k letters are random.
Thanks, but my algorithm is not that advanced; it can't reach that level.
Just split it into blocks. If you can compress 10,000 letters to 9 bytes, it means you can split enwik8 into 10k blocks and compress them to 10k*9 = 90k total.
It means you can claim the whole 50k euro prize.
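For concreteness, a back-of-the-envelope check of that arithmetic - a minimal sketch in Perl, assuming enwik8 is exactly 10^8 bytes and every 10,000-letter block really shrinks to the claimed 9 bytes:
Code:
use strict;
use warnings;

# Assumptions: enwik8 = 10^8 bytes, and each 10,000-letter block -> 9 bytes (the claim)
my $enwik8_size = 100_000_000;   # bytes
my $block_size  = 10_000;        # letters per block
my $per_block   = 9;             # claimed compressed size per block

my $blocks = $enwik8_size / $block_size;   # 10,000 blocks
my $total  = $blocks * $per_block;         # 90,000 bytes total
printf "%d blocks -> %d bytes total\n", $blocks, $total;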
But when decompressing, it needs a huge database or a supercomputer, and it can't compress Chinese words. That's why I want to let it go.
Try to compress these pure random files and post the compressed results as an attachment.
Database size is not a problem for me, and Chinese strings can be completely filtered out, so that's not a problem. I don't need you to decompress your results back to the original files; I only want the compressed archive of the original files. Thanks.
It's not a problem even if it can only compress valid English... Just type out the data as text, i.e. 0xFF = 255 = "two five five".
Even if enwik8 becomes 1G, it should still be compressible to 900k, so you'd still get the full prize.
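If it really only handles plain English letters, arbitrary bytes could in principle be pre-encoded as spelled-out digits first - a minimal, hypothetical sketch of that idea (not anything the poster actually uses):
Code:
use strict;
use warnings;

# Hypothetical pre-encoding: spell out each byte's decimal digits as English words,
# so a "letters only" compressor can still accept arbitrary binary data.
my @word = qw(zero one two three four five six seven eight nine);

sub bytes_to_words {
    my ($data) = @_;
    return join ' ',
           map { join ' ', map { $word[$_] } split //, ord($_) }
           split //, $data;
}

print bytes_to_words("\xFF"), "\n";   # prints "two five five"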
Btw, hashing is not a solution for compression - not because it needs a "huge database or super computer"
to restore the input data from the hash value, but because of collisions.
Even with an assumed charset [\x20a-z] of 27 letters, you'd already start getting collisions with 16 input symbols
and 9 bytes of output:
Code:
16 letters = 27^16 = 79766443076872509863361
 9 bytes   = 256^9 =  4722366482869645213696
Also, you can't sell software rights that easily - the trade has to be officially registered in some way;
usually you'd get a patent for your algorithm, then sell it.
Otherwise you can always claim that your software was stolen once the buyer starts making money from it.
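Going back to the collision counting above, a quick sanity check of those two numbers - a minimal sketch (bigint because both values exceed native 64-bit integers):
Code:
use strict;
use warnings;
use bigint;   # exact arithmetic; both values overflow 64-bit integers

my $inputs  = 27 ** 16;    # distinct 16-symbol strings over a 27-letter charset
my $outputs = 256 ** 9;    # distinct 9-byte outputs

print "inputs  = $inputs\n";
print "outputs = $outputs\n";
print "pigeonhole: collisions are unavoidable\n" if $inputs > $outputs;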
For "1 million random alphabet letters.txt" - 1,028 bytes result.
You are so nice, leading a stranger straight to the point. It's so good that you are here.
Input:
1,000,000 bytes - 1 million random alphabet letters.txt
Output:
588,286 bytes - paq8px v178
588,001 bytes - cmix v17
-------------------------------------------------------
Input:
1,000,000 bytes - 1 million pure random data.txt
Output:
749,400 bytes - paq8px v178
748,956 bytes - cmix v17
1 million random alphabet letters.txt
charset = [a-z], size 26
1000000*Log[256.,26] = 587555
1 million pure random data.txt
charset = [\x0C\x1E07-9;=?ABD-FHIKMO-QTY\x5D\x5Ea-z\x7F\x83\x8D\x9E\xAF\xB0\xC6\xC7\xCE\xD3\xD5\xD8\xDF\xE0\xE5\xE7\xEC-\xF0\xF3\xF6\xF8\xFA\xFC], size 79
1000000*Log[256.,79] = 787973
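Those bounds can be reproduced directly: n symbols drawn from a charset of k distinct values need at least n*log_256(k) bytes. A minimal sketch of that calculation:
Code:
use strict;
use warnings;

# Theoretical minimum size in bytes for n symbols drawn from a charset
# of k distinct values: n * log_256(k)
sub bound_bytes {
    my ($n, $k) = @_;
    return $n * log($k) / log(256);
}

printf "random alphabet letters: %.0f bytes\n", bound_bytes(1_000_000, 26);   # ~587555
printf "pure random data:        %.0f bytes\n", bound_bytes(1_000_000, 79);   # ~787973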
How do you know the charset?
This script prints it. It's in Perl.
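The attached script itself isn't reproduced here; a minimal sketch of how such a charset printer could look in Perl (hypothetical, not the actual attachment):
Code:
use strict;
use warnings;

# Read a file in binary mode and print which of the 256 byte values occur in it.
my $file = shift or die "usage: $0 <file>\n";
open my $fh, '<:raw', $file or die "cannot open $file: $!\n";

my %seen;
while (read $fh, my $buf, 65536) {
    $seen{$_}++ for unpack 'C*', $buf;
}
close $fh;

my @bytes = sort { $a <=> $b } keys %seen;
printf "charset size = %d\n", scalar @bytes;
print join(' ', map { sprintf '\x%02X', $_ } @bytes), "\n";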
Applying for a patent needs around RM15k. My algorithm can compress data without limit (I think - I just tried compressing 100k to 11 letters). If I apply for a patent, is my algorithm worth it or not?
You mean 1,000,000 bytes input, 1,028 bytes output is 973 times smaller?
Did decompression also work, and did a file compare show the output equal to the input?
What file size does your compress and decompress software have, and does it use a database? If yes, what size is the database?
How long did it take to compress and decompress?
What programming language did you use?
Any idea at what price you want to sell your algorithm?
It is obvious that this fellow is pulling our chains and pressing our buttons. Let's make a graceful exit from his nonsense.
1,000,000 random digits to 1028 bytes? Absolute rubbish, and no way even given 10^9 years of time...
It depends on how you define "random" really: https://encode.su/threads/3099-Compr...ll=1#post59940
> How much compress text worth ?
I had heard of your algorithm in the high places during the Cold War. But there are only 2 ^ (8,224) files addressed or compressed by your algorithm, not enough to cover all the files in your 2 ^ (8,000,000) source file space, of course.
But how much is a compression algorithm worth? I guess it must be more than the $120 million Microsoft paid Stac over the DoubleSpace infringement. Maybe a compression algorithm must be worth more than $210 million or $220 million, considering that DeepMind was bought by Google for only $400 million.
I have skimmed through this thread for the third time and it strikes me again that this is a "Nigerian prince" kind of thing - bogus claims for a gullible audience. But what if it could be true? Well, it reminds me that patent offices in some countries are prohibited by law from granting patents on any kind of perpetual motion machine. The laws should be amended to also prohibit patents on compression below the entropy limit.
Disallowing perpetual motion machines came, I think, at the height of the thermodynamics debates. Heat is entropy; heat is disorder, or causes disorder, in a system. As such, we don't want perpetual machines that generate heat and clog the universe with unnecessary motion.
If he's the real Barack Obama, then he's from early information theory days, more like Claude Shannon adherents, who continue to revise the Shannon papers to suit the times.
It could seem true to those who don't write actual code for their compression ideas. But from what I remember of the Cold War, highly classified geniuses lurking in the background might actually have their own breakthrough fast compressors, only limited by an NDA. Think men in black suits from the early days of information theory history.
And I recall that, back in the early days, some compilers were actually "rigged" or bugged for lesser compression, as was MS-DOS etc., maybe corrupting the file_length() function or the length printed on the console. Thanks to open source nowadays, expert programmers can scrutinize how the file_length() function is implemented or how the file length is actually printed on screen. Or have they? Have any anomalies been found? If you were a crooked Windows or Linux systems programmer, how would you do it? [Beware, this is of the conspiracy theory kind. I was making decisions for the computing industry before; we were definitely concerned about these things.]
All patents that are a threat to a country or its allies (militarily, or economically against big companies) are wiped out after compensation is paid (agreed by both parties):
"If the office considers that keeping the contents of a patent application secret may be in the interest of the defense of the country or its allies, it shall make this known as soon as possible, but no later than three months after the submission of the application. Our minister of defense may give instructions to the agency regarding the assessment of whether such an interest exists."
It depends on whether you want to use a database or just calculate.
With a database it will surely be faster; if you only calculate, you need a supercomputer.