Activity Stream

  • Trench's Avatar
    Today, 20:40
    Compression implies decompression; there is no point to compression without it. But if you need another example: imagine the entire internet - lossless videos, music, images, programs, etc. - in 1 GB or 1 MB. I am not saying it is possible to achieve; I am asking whether, if it were possible, it would be allowed, when so many people of power and influence, who have invested billions in other technologies directly or indirectly, would be upset by it. The examples of specific people may or may not be true, but the point is the reasoning behind not allowing it. Isn't there evidence, even at the 1% level, that someone with power could prevent it?

    If you invested a billion dollars in something today, and tomorrow someone came up with something that made your investment worthless, would you be happy about losing your money? Would nations be happy to lose money and revenue too? Obviously not. People's livelihoods depend, to a degree, on the infrastructure that has been built up and that sustains millions. Wouldn't at least 1% of those who lose out be enraged enough to do something? Obviously not everyone, but as the saying goes, whatever can go wrong will go wrong. You may be content if someone takes from you, but another person might not be when you are taking their livelihood. Not everyone is the same.

    Existing solutions can, in a sense, be replaced overnight if the difference is that great. Look at the stock market now: closing the border to China lost billions, and many people are not happy, but that is temporary. It will bounce back, though it will take time. If it were permanent, people could lose their minds. You can't speak for someone else when they have their own experiences that validate the contrary - and I am not talking about second-hand experience but first-hand. Everything is political, from the food you eat and the clothes you wear to the news you are presented and the discoveries of science. If what I say is even 1% true, how would you deal with it rather than dismiss it? Can it be?
    4 replies | 189 view(s)
  • Jarek's Avatar
    Today, 18:06
    Maybe there is, but getting a piece of it as an outsider is an impossible task. For example, WaveOne had revolutionary image compression in 2017, then video compression in 2018... then silence: http://www.wave.one/video-compression
    3 replies | 204 view(s)
  • User's Avatar
    Today, 17:48
    User replied to a thread paq8px in Data Compression
    All versions of PAQ8px, PAQ8pxd and FP8 fail to detect the jpg in this file
    1819 replies | 520834 view(s)
  • Jyrki Alakuijala's Avatar
    Today, 16:34
    The money in compression is in video compression - literally billions per year. The Hutter prize may give out a thousand per year, which is likely about 1 : 7'777'777 of the financial impact of video compression. It is more symbolic than financial.
    3 replies | 204 view(s)
  • schnaader's Avatar
    Today, 13:09
    My vote is yes, but I guess you won't like my explanation. It is possible to compress even millions of movies below 1 GB, yes. And there are many services and companies that do so. One of the currently best known is YouTube. They take the movie, compress it to a fraction of its size (a rough ratio is 1:100 from uncompressed -> video codec). After that, they make the movie accessible to users by assigning it a URL that is only about 30 bytes (for example: youtube.com/watch?v=NN75im_us4k) or even shorter (youtu.be short URLs).

    Yeah, I know. This is not what you meant. But this is the problem with all this random compression stuff here - you just don't ask the right questions. The right question would be "Can I losslessly compress all possible files by 99% and get the original file back afterwards by decompression?" And the answer to that is a straightforward and easy "No", backed by the pigeonhole principle. Note that neither your post nor the poll even mentions the two words that are essential here.

    Now let's address some other things from your post. I guess you mean Jan Sloot. Well, let me explain my theory: 1. It was a hoax. 2. He died. Two totally unrelated things. As a matter of fact, everyone will die sometime, regardless of their inventions or discoveries. And it's not that I deny that people might sometimes get killed by others; it's just that I don't believe the claims he made. Can I prove my theory? No. Should I? I don't think so - there are better things for me to do than hunting down mysterious stories. A similar recent story is the death of Mad Mike. I don't think his death implies that the earth is flat, or that he got killed by someone to hide this - because some basic science can show that the earth is in fact not flat. Same thing here: what if those "free energy" concepts just don't work? The laws of thermodynamics say so, and I think they're valid, so I don't care to "research" "free energy". Of course you can go that way, deny scientific facts and go hunting for magical things using all those pseudo-science buzzwords. But don't expect others to follow your way, especially in a forum filled with data compression experts.

    Don't get me wrong, you should think outside the box - that is what made me develop Precomp, which does a good job compressing most of the files you'll throw at it to about 30-50% less than pure LZMA2. In a way, it was me denying the statement that "compressed files can't be compressed further". Well, again, watch your wording. Compressed files can be compressed further if you decompress them first. Also, compressing them further losslessly does work, but it involves some advanced techniques and effort because usually, compression is not bijective.

    And now we can proceed to your "Can't compete, can't release, evil companies" part. Did I get killed because I developed Precomp? No - but of course you could say that Precomp isn't one of the revolutionary, foundation-shattering random compression things you're talking about. Did people try to "steal" my idea? Well, in the beginning there was some tool using Precomp without my permission, but it quickly went away. Also, a Google employee had a failed attempt with Grittibanzli. Back when Precomp was closed source, AntiZ was a more successful open source attempt. Dirk Steinke made Preflate, and PowerArchiver integrates a commercial solution. And I even got a job at Ocarina Networks because of my work in the field, so yeah, I indirectly made money with the ideas from Precomp.

    But did I sell my soul to some evil company, and did they steal all my ideas? No. All this "competition" isn't threatening me - quite the contrary, I'm happy that each one of them exists. First, because the recompression idea spreads, and second, because smart people do hard work to push and promote the field of (re)compression. So why doesn't recompression go mainstream very quickly - isn't this evil and unfair? No. It's hard work to replace existing solutions (in a safe and robust way that fulfills different requirements), even if your core techniques are clearly superior. Same for FLIF -> FUIF -> JPEG XL, for example. Smart people doing hard work. For me, those are the people I look up to, not some random-compression techno-babble kiddies discussing bullshit. Sorry for addressing the elephant in the room that harshly, but that's the truth I believe in.
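    The counting argument in miniature (a sketch; the tiny n just keeps the counts inside 64 bits):

        #include <cstdint>
        #include <cstdio>

        // There are 2^n distinct files of exactly n bits, but only
        // 2^0 + 2^1 + ... + 2^(n-1) = 2^n - 1 files that are strictly shorter.
        // So no lossless compressor can shrink every n-bit file, let alone by 99%.
        int main() {
            const int n = 16;
            uint64_t inputs  = 1ULL << n;        // all n-bit files
            uint64_t shorter = (1ULL << n) - 1;  // all files of 0..n-1 bits
            printf("%llu inputs vs %llu shorter outputs -> at least one collision\n",
                   (unsigned long long)inputs, (unsigned long long)shorter);
            return 0;
        }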
    4 replies | 189 view(s)
  • Darek's Avatar
    Today, 12:31
    Darek replied to a thread Paq8pxd dict in Data Compression
    First enwik scores for paq8pxd_v75:
    16'339'122 - enwik8 -s8 by Paq8pxd_v74_AVX2
    15'993'409 - enwik8 -s15 by Paq8pxd_v74_AVX2
    15'956'537 - enwik8.drt -s15 by Paq8pxd_v74_AVX2
    16'279'540 - enwik8 -x8 by Paq8pxd_v74_AVX2
    15'928'916 - enwik8 -x15 by Paq8pxd_v74_AVX2
    15'880'133 - enwik8.drt -x15 by Paq8pxd_v74_AVX2
    16'319'686 - enwik8 -s8 by Paq8pxd_v75_AVX2 - 0.12% improvement
    15'976'838 - enwik8 -s15 by Paq8pxd_v75_AVX2 - 0.10% improvement
    15'934'372 - enwik8.drt -s15 by Paq8pxd_v75_AVX2 - 0.14% improvement
    16'260'265 - enwik8 -x8 by Paq8pxd_v75_AVX2 - 0.12% improvement
    15'912'509 - enwik8 -x15 by Paq8pxd_v75_AVX2 - 0.10% improvement -> this score could extrapolate to about 125'58x'xxx bytes for enwik9_1423
    15'859'187 - enwik8.drt -x15 by Paq8pxd_v75_AVX2 - 0.13% improvement, best score for the paq8pxd series
    As for timing, the changes vary -> from 5-7% up to 18% quicker (for enwik8.drt -x15)!
    717 replies | 283131 view(s)
  • Shelwien's Avatar
    Today, 12:16
    We can easily extrapolate the results of a sudden compression breakthrough from previous similar cases. In the 1990s people were busy collecting plaintext books and floppy-sized games. Then hardware improved, and it became possible to quickly download whole collections with millions of books at once. The result? Book downloading is not a mainstream topic anymore, and OCR'ers started posting books as 10-100M pdfs instead of <1M plaintext. Also, the main content type now is video; it takes 90%+ of traffic and storage. But it's lossy, so better compression methods would just lead to an increase in quality until the same balance is reached. P.S. I'll be moving this and your other thread to "random compression".
    4 replies | 189 view(s)
  • Shelwien's Avatar
    Today, 12:00
    It's possible, but I think it's too early for that. Killing all activity in the main forum is risky too. Why don't you post some on-topic thread instead? :)
    3 replies | 204 view(s)
  • michael maniscalco's Avatar
    Today, 08:26
    I worry that this new Hutter prize is bringing the worst sorts of influences to this group in the form of gold prospectors. Eugene, might it be wise to create a sub forum for the Hutter Prize similar to the very effective "random compression" sub forum?
    3 replies | 204 view(s)
  • michael maniscalco's Avatar
    Today, 08:18
    I, and almost every long-time user of this board, I would wager, are here because we love what we do. There is no money, fame, or glory involved. It is, for the most part, just another example of the "hacker ethic". We follow our passion, work our craft, and trust that it will be appreciated or beneficial in some way. Those who come here for money, fame, etc. are soon forgotten. Those who remain are likely the finest kinds of people. - Michael
    4 replies | 189 view(s)
  • Trench's Avatar
    Today, 08:02
    Imagine 1000 movies or more fitting on a 1 GB flash drive. I am not saying it is possible, but imagine.

    1. What would be the negative issues with that? None? Are you sure?
    2. What do you hope to achieve? Money? For how long? I assume a small percentage gain per year, to please everyone and not rock the boat.

    Even if you were able to compress every file by 99%, it logically would not matter, for two main reasons.

    A. You can't compete against free versions to sell it:
    1. Companies most likely won't buy it, since again you have to go against free - people are not waiting in line to buy WinRAR.
    2. It can be reverse engineered, and multiple alterations can change it, which makes it risky.
    3. Current devices have been good enough to transfer files for a while now.
    4. Money easy come, easy go, due to lawsuits against you for disrupting companies without warning - and even if warned, not allowed. And many more.

    B. You can't release it for free, since it would disrupt the market and nations:
    1. More people would be able to download entire libraries and programs in an instant, which would spread like wildfire and could not easily be stopped - companies would hate that.
    2. The stock market for many companies would fall, causing a market panic, which goes against fiduciary laws to some degree. Countless parties would be hit: storage device makers, politicians, online storage companies, phone companies, internet providers.
    3. It would put many people out of a job, which would create many angry people who would see it as an international threat to trade.
    4. If a company buys it, the company becomes a target for many, and nobody wants to be seen as the bad guy. And many more.

    Does anyone remember that guy from the 1990s who said he could fit an entire movie on a floppy disk and was going to sell it? The day before he sold it, he mysteriously died of a heart attack, and his files could not be found. It would be you against millions of people and plenty of rich and powerful people's livelihoods; they would feel threatened, and one out of a million might be nuts. As the saying goes, "hurt people hurt people" - people who have been hurt will hurt others.

    There is more money to be made in searching for a cancer cure than in the cure itself; more money in the effort toward world peace than in world peace. And the list goes on. Some say even free energy is against the law, since it would destabilize nations: tax revenue, billions lost, a "national security" threat, etc. Even Edison and his financiers tried everything to stop Tesla and cut off his funding, and most people had not heard of Tesla before the internet. Money makes the world go round, and to stop that is to go against the world, it seems. And the masses won't side with the smart people, since they do as the people on top say - those people provide the security and money they live on. People are not morally ready and can't handle such a great difference. It feels like if you rock the boat, the boat will rock you off. This seems like a logic issue. Or maybe I am wrong. So tell me how what I just said is wrong, and what guarantee you have if you do it?
    4 replies | 189 view(s)
  • michael maniscalco's Avatar
    Today, 07:59
    michael maniscalco replied to a thread cmix in Data Compression
    8.76 days ... think about that (for 8.76 days). :_crazy2:
    436 replies | 107669 view(s)
  • Trench's Avatar
    Today, 06:46
    Trench replied to a thread 2019-nCoV in The Off-Topic Lounge
    So far, more people have died from medical treatment or the average flu than from the coronavirus. So don't stress over it, since stress gets you more than the virus and/or can compound the issue. Map of cases, deaths, and recoveries: https://gisanddata.maps.arcgis.com/apps/opsdashboard/index.html#/bda7594740fd40299423467b48e9ecf6

    Some tips:
    1. Vitamin D (needs K2 with it so it doesn't calcify arteries)
    2. Zinc (might need copper with it)
    3. Fulvic acid
    4. Garlic
    5. Selenium
    6. Olive leaf
    7. Oregano
    8. Colloidal silver
    9. Curcumin
    10. Licorice root
    11. Vitamin E
    12. Green tea
    13. Monolaurin
    Also elderberry, echinacea, pau d'arco, and St John's wort (black pepper helps absorption 100 or 1000 times over).

    On masks: an N95 mask gives 95% protection at 0.3 micron, enough to stop bacteria but not viruses (though viruses need a host); the coronavirus itself measures between 0.05 and 0.2 microns in diameter. Masks won't work well over facial hair. Masks coated in salt could neutralize viruses like the coronavirus in 5 minutes and destroy them within 30 minutes; medical face masks can block some germs, but germs also linger on their surfaces, and since salt is crystalline, its hard, sharp corners can pierce viruses. The flu virus is transported from patient to patient on droplets of excretions from sneezing and coughing. These particles are typically 5 microns or larger, but when the droplets dry on the mask, they might get blown off or get past the barrier if not destroyed. Overall, a salty, humid environment in a warmer climate is safer. For comparison, 1000-thread-count tight weave has a 3 micron pore size, which is big. An N95 mask offers 50% better breathing than an N99 mask. Salt can be as small as 30 microns or as big as 30 mm. Colloidal silver on masks also destroys the virus. Even mold spores are as small as 3 microns, and pollen is about 10 microns. As soon as you experience that sore, tickly feeling in your throat that precedes a full-blown cold, gargle with warm salt water. High-efficiency particulate air (HEPA) filters are rated at 0.3 micron; HEPA filters are composed of fiberglass fibers with diameters between 0.5 and 2.0 microns, and newer versions are better.

    Reports say Asians are more vulnerable to the virus, with a greater number of men than women among the 99 cases of 2019-nCoV infection; half of the patients infected by 2019-nCoV had chronic underlying diseases, mainly cardiovascular and cerebrovascular diseases and diabetes. The elderly are also susceptible. A keto diet can help: when you stop eating carbohydrates, you starve out the virus.

    References:
    https://en.wikipedia.org/wiki/Polypropylene
    https://www.fda.gov/medical-devices/personal-protective-equipment-infection-control/masks-and-n95-respirators
    https://marketing.industrialspec.com/acton/attachment/30397/f-0045/1/-/-/-/-/mesh-micron-sizes-chart-ebook-from-ism/
    https://www.businessinsider.com/mask-coated-in-salt-neutralizes-viruses-like-coronavirus-2020-2/
    https://www.bibliotecapleyades.net/ciencia/ciencia_influenza82.htm
    https://en.wikipedia.org/wiki/HEPA
    https://www.businessinsider.com/wuhan-coronavirus-face-masks-not-entirely-effective-2020-1
    https://www.thelancet.com/journals/lancet/article/PIIS0140-6736(20)30211-7/fulltext
    https://www.envirosafetyproducts.com/resources/dust-masks-whats-the-difference.html
    https://www.oransi.com/page/particle-size
    https://www.youtube.com/watch?v=PthYoKKS6FE
    https://www.youtube.com/watch?v=K_qmjTJ6RLQ
    10 replies | 305 view(s)
  • Sportman's Avatar
    Yesterday, 21:46
    Pinetools random file generator, 1 MB file: option c compresses it 1 byte smaller; decompression and file compare are OK (paq8pxd_v75 compresses the same file 4 bytes smaller). 1 MB random.org file: option c compresses to 0 bytes, c1 compresses to 1,050,441 bytes, and c2 also to 0 bytes. If you can make it work on 10 different files from https://archive.random.org/binary you have something.
    32 replies | 1823 view(s)
  • Sportman's Avatar
    Yesterday, 21:32
    Sportman replied to a thread cmix in Data Compression
    cmix (v18 ) -c english.dic enwik9: 115,739,547 bytes, 756,510.756 sec. (8.76 days, 25GB memory use, cross entropy 0.926)
    436 replies | 107669 view(s)
  • kaitz's Avatar
    Yesterday, 19:59
    kaitz replied to a thread paq8px in Data Compression
    PIF.mht -> the px version detects a large part as text; it's base64. This is the main speed difference.
    1819 replies | 520834 view(s)
  • well's Avatar
    Yesterday, 19:57
    well replied to a thread Hutter Prize update in Data Compression
    Jarek, big thanks! :_yahoo2: That's all I need to know about the money. If this video is not designed with 3D design software, then everything is OK! If it is a computer graphics product, then you get just what it cost: a huge amount of human labour...
    42 replies | 1748 view(s)
  • Jyrki Alakuijala's Avatar
    Yesterday, 14:36
    Mathematical analysis works best on the y channel, and punishes proper color modeling significantly. Consider repeating the effort for an image that has been converted to gray after compression (or before compression).
    92 replies | 19705 view(s)
  • Jyrki Alakuijala's Avatar
    Yesterday, 14:32
    Thank you so much for doing this! This is very useful. FYI: the first image is 0.1 BPP, i.e., 1 : 240 compression. The second image is 0.28 BPP, i.e., 1 : 86 compression. I consider both of these aggressive compression settings; in practical use people would use from 1 : 50 to 1 : 15 compression.
    92 replies | 19705 view(s)
  • Shelwien's Avatar
    Yesterday, 14:24
    1) Are you sure that get_prob() gives the probability of 1? 2) Normally you'd need at least one context and context set. It may be simpler to insert an extra input into an existing mix. Just see here how the mixer works: https://github.com/hxim/paq8px/blob/master/Mixer.hpp#L124 https://github.com/hxim/paq8px/blob/master/SimdMixer.hpp#L111
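    For reference, a minimal sketch of the logistic mixing idea behind those links (not the actual paq8px Mixer API; it assumes each input is a probability of the next bit being 1, mixed in the stretched domain):

        #include <cmath>
        #include <vector>

        double stretch(double p) { return std::log(p / (1 - p)); }
        double squash(double x)  { return 1 / (1 + std::exp(-x)); }

        struct Mixer {
            std::vector<double> w, st;  // weights and stretched inputs
            double lr = 0.01;           // learning rate
            explicit Mixer(size_t n) : w(n, 0.0), st(n, 0.0) {}

            // Mix model probabilities in the logit domain.
            double predict(const std::vector<double>& p) {
                double dot = 0;
                for (size_t i = 0; i < w.size(); i++) {
                    st[i] = stretch(p[i]);
                    dot += w[i] * st[i];
                }
                return squash(dot);
            }
            // After coding the bit, nudge weights to reduce coding loss.
            void update(double p_mixed, int bit) {
                double err = bit - p_mixed;
                for (size_t i = 0; i < w.size(); i++) w[i] += lr * err * st[i];
            }
        };

        int main() {
            Mixer mx(2);
            std::vector<double> p = {0.9, 0.6};  // two model predictions of a 1-bit
            double pm = mx.predict(p);
            mx.update(pm, 1);                    // the coded bit turned out to be 1
            return 0;
        }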
    12 replies | 2363 view(s)
  • martinradev's Avatar
    Yesterday, 14:10
    Just wanted to bump this up.
    12 replies | 2363 view(s)
  • Self_Recursive_Data's Avatar
    Yesterday, 12:42
    Time to release my code. I made this from scratch in Blockly: tree, online learning, searching, context mixing, arithmetic encoding, decompressor. It is my first actual coding project (I'm 24 btw :p). It can compress 100 MB down to ~23 MB. I skipped refactoring it for now (there's a lot to shave off...), and it is still small code: https://blockly-demo.appspot.com/static/demos/code/index.html# Attached below is the big-input version - try that one, the Blockly one just shows the toy version. It makes 266,700 bytes into 74,959 bytes; Shelwien's compressor, which makes 100 MB into 21.8 MB, makes the same 266,700 bytes into 70,069 bytes. Note I didn't exhaustively search for the best params - there are a few more bytes that can be squeezed out of it, and I'm not done with it just yet :-). To switch to decompression, put 'no' at the top, make the input only the first 15 letters (not all 266,700 letters of the input), and put the code it encoded into the decode input at the top, e.g. 0.. You can use the following link to turn the code into bits: https://www.rapidtables.com/convert/number/binary-to-decimal.html (although I just divide the length of the encoding by 3, then multiply one third by 4 and two thirds by 3, which gives approximately the same bit length - e.g. you create 0.487454848, which is 9 digits long, so: 9 / 3 = 3, 3*4 = 12 and 6*3 = 18, 18+12 = 30, so 0.487454848 is 30 bits long).
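    For what it's worth, that digit rule is close to the exact conversion factor log2(10) ~ 3.32 bits per decimal digit; a quick check (the digit count is taken from the example above):

        #include <cmath>
        #include <cstdio>

        // "A third of the digits get weight 4, the rest get weight 3" works out
        // to 10/3 ~ 3.33 bits per digit, close to the exact log2(10) ~ 3.3219.
        int main() {
            int digits = 9;  // e.g. 0.487454848 has 9 digits
            double exact = digits * std::log2(10.0);
            int heuristic = digits / 3 * 4 + (digits - digits / 3) * 3;
            printf("exact: %.1f bits, heuristic: %d bits\n", exact, heuristic);
            return 0;
        }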
    113 replies | 4890 view(s)
  • Darek's Avatar
    Yesterday, 12:42
    Darek replied to a thread Paq8pxd dict in Data Compression
    Scores of the 4 corpora for paq8pxd_v75 -> very nice improvement on Silesia; moreover, these are the best scores for all corpora :). Additionally, there is a -x15 option comparison of v75 vs. v74.
    717 replies | 283131 view(s)
  • Darek's Avatar
    Yesterday, 12:40
    Darek replied to a thread Paq8pxd dict in Data Compression
    Here you are - the only file I have. There is an official source that was published on the forum; however, I don't know if it's OK.
    717 replies | 283131 view(s)
  • Jarek's Avatar
    Yesterday, 09:51
    Jarek replied to a thread Hutter Prize update in Data Compression
    https://lexfridman.com/marcus-hutter/
    42 replies | 1748 view(s)
  • Trench's Avatar
    Yesterday, 04:32
    It's great... for kids... maybe. Well, some things are fine, but the issue is what the end goal of it all is. What about the Sphinx's secret chambers that were talked about by the news, historical references, and records? Or the ancient Greek artifacts in America predating Native American artifacts? Or ancient Greek copper identical to the copper ore of the US? Or the Petralona skull, whose archaeologist believed 90% of Greek history was intentionally destroyed? Maybe it doesn't fit the world narrative. Human skeletons that were dated to be 800,000 years old. And the list goes on. How can one learn history when it is presented tied up in a nice bow, and everything that can't be explained is thrown away and never shown? Hiding or destroying history is like book burning, which everyone says is bad yet accepts in other formats, as if a dyslexia of the mind is happening. History is literally becoming history: a distant, forgotten memory, displaced by other pointless info. Can't make a future with historical facts in the way. The math does not add up; money talks, and that math does make sense. ;) In the end, if the means do not justify the ends, and people believe the ends justify the means, then they will find a means to their end. If something can go wrong it will go wrong, and if people can be compromised they will be compromised.
    https://www.google.com/search?q=sphinx+chambers&source=lnms&tbm=isch&sa=X
    https://en.wikipedia.org/wiki/Aris_Poulianos
    https://www.jstor.org/stable/2742205?seq=1
    1 replies | 90 view(s)
  • djubik's Avatar
    Yesterday, 01:55
    I tried to compile this on Linux, and it fails. Firstly, it looks like the makefile format is old? I'm not a make wiz, but it uses the old-style suffix rule ".c.o"; I think the modern way of writing this is the pattern rule "%.o : %.c" (target first, prerequisite second). But when I tried changing that, I get a different error. Any idea how to fix it?
    14 replies | 6463 view(s)
  • lzhuff's Avatar
    Yesterday, 01:08
    lzhuff replied to a thread Paq8pxd dict in Data Compression
    @darek, do you still have the paq8pxd48_bwt1 source code ??
    717 replies | 283131 view(s)
  • Darek's Avatar
    26th February 2020, 22:32
    Darek replied to a thread Paq8pxd dict in Data Compression
    paq8pxd_v75 scores on my testset with -s9 option. Some better scores mainly for exe and txt files.
    717 replies | 283131 view(s)
  • Shelwien's Avatar
    26th February 2020, 17:57
    Shelwien replied to a thread Paq8pxd dict in Data Compression
    DRT uses a static dictionary with only ~45k words. I guess enwik9 has too many unconverted words. I wonder if the nncp preprocessor could be used instead - either directly, or followed by utf16-to-utf8 conversion.
    717 replies | 283131 view(s)
  • image28's Avatar
    26th February 2020, 17:10
    Early in the morning down under - thought I would quickly introduce myself. As in my bio, I dug up some old code from my 20s yesterday after finding this site. Still early stages yet. I wrote a quick bash test that compresses the base enwik9(*) 1 GB file down to ~500 MB, using mostly grep regexes to generate dictionaries, while I sift through 20 years of backups.

    (*) Updated - I had written enwik8 but was talking about enwik9. I had a look at improving the regexes a minute ago (separating out the XML tags may yield better results), e.g.:
    cat enwik9 | grep -iEo "(<|<\/)*>" | sort | uniq > enwik9-tags
    cat enwik9 | grep -iEv "(<|<\/)*>" | grep -iEo "* " | sort | uniq > enwik9-words
    I forgot the regex I used to extract all the other symbols :T

    One of my best algos may be ready to go if I can find it. I have only ever tested it on already-compressed data (mp3/xvid files), but it could out-compress tar/gz/bz2/rar/zip in those test cases. (Once I ran it for 5 days of iterations before it was compressing less than 8 bits per iteration; decompression was much faster.) I will try to find something to post each week over the coming weeks if y'all are keen!

    Anyway, hello, and happy Wednesday/Thursday depending on where you are,
    Kevin
    84 replies | 28893 view(s)
  • fab's Avatar
    26th February 2020, 16:20
    fab replied to a thread Hutter Prize update in Data Compression
    I think the rules should be modified so that the program is not required to run on CPUs older than the Intel Haswell family. That way it is possible to assume that AVX2+FMA is always available. Otherwise the performance of machine learning algorithms such as the ones implemented in NNCP will be too low to be usable (without AVX2+FMA, performance per cycle is 4x slower...). So I suggest removing the "Intel Core i7-620M" as the test machine: it is a CPU from 2010 that supports neither AVX2 nor FMA.
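    One way a submission could guard such a requirement at runtime (a sketch; __builtin_cpu_supports is a GCC/Clang builtin, MSVC would need __cpuid instead):

        #include <cstdio>
        #include <cstdlib>

        // Refuse to run on pre-Haswell hardware that lacks AVX2 or FMA.
        int main() {
            if (!__builtin_cpu_supports("avx2") || !__builtin_cpu_supports("fma")) {
                fprintf(stderr, "This build requires AVX2 and FMA (Haswell or newer).\n");
                return EXIT_FAILURE;
            }
            puts("AVX2+FMA available.");
            return EXIT_SUCCESS;
        }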
    42 replies | 1748 view(s)
  • Darek's Avatar
    26th February 2020, 11:31
    Darek replied to a thread Paq8pxd dict in Data Compression
    Quite an interesting thing: enwik8.drt compresses 0.23% better than pure enwik8, but enwik9.drt compresses 0.31% worse than pure enwik9 -> that's a 0.54% difference. Looks like the text/word model is more efficient on bigger files than the default model. Is the learning rate different? Hmmmm....
    717 replies | 283131 view(s)
  • pacalovasjurijus's Avatar
    26th February 2020, 10:49
    Yes, and more than once. Version 1.0.0.1.7 of the White Hole software can compress and extract files.
    32 replies | 1823 view(s)
  • brispuss's Avatar
    26th February 2020, 07:53
    brispuss replied to a thread Paq8pxd dict in Data Compression
    I've run some further tests with paq8pxd v75 and added the results to the table below. Brief tests were done with paq8pxd v74, but it gave no improvement in compression, only slightly quicker compression times, so I didn't think it was worth posting the v74 results. Tests were run under Windows 7 64-bit, with an i5-3570K CPU and 8 GB RAM, using SSE4 compiles of paq8pxd.

    Compressor              Total file(s) size (bytes)   Time (s)   Options
    Original 171 jpg files  64,469,752
    paq8pxd v69             51,365,725                   7,753      -s9
    paq8pxd v72             51,338,132                   7,533      -s9
    paq8pxd v73             51,311,533                   7,629      -s9
    paq8pxd v75             51,311,427                   7,509      -s9
    Tarred jpg files        64,605,696
    paq8pxd v69             50,571,934                   7,897      -s9
    paq8pxd v72             50,552,930                   7,756      -s9
    paq8pxd v73             50,530,038                   7,521      -s9
    paq8pxd v75             50,528,772                   7,501      -s9

    For v75: improved compression and a slight reduction in compression time!
    717 replies | 283131 view(s)
  • Scope's Avatar
    26th February 2020, 06:56
    https://www.smithsonianmag.com/smithsonian-institution/smithsonian-releases-28-million-images-public-domain-180974263/ https://www.si.edu/openaccess
    1 replies | 90 view(s)
  • Sportman's Avatar
    25th February 2020, 23:29
    I generated a 1 MB random file with this Pinetools random file generator, but paq8pxd_v75 compresses it 4 bytes smaller, so it is not a good tool for generating random files. Did you ever program a decompressor to check whether one of your compressor's outputs can be reversed back to the input, binary identical?
    32 replies | 1823 view(s)
  • Scope's Avatar
    25th February 2020, 21:36
    Some metrics comparisons (best to worst):

    2048x1320_nitish-kadam-34748
    Butteraugli: AVIF - HEIC - HTJ2K - WebP - JPEG XL - MozJpeg
    3-norm: AVIF - HEIC - HTJ2K - JPEG XL - WebP - MozJpeg
    DSSIM: AVIF - HEIC - JPEG XL - HTJ2K - WebP - MozJpeg
    SSIMULACRA: AVIF - HEIC - JPEG XL - HTJ2K - WebP - MozJpeg
    VMAF: AVIF - HEIC - HTJ2K - JPEG XL - WebP - MozJpeg

    But at a larger bpp (in other examples the result is almost similar):

    2048x1320_alex-siale-95113 (~506,000 bytes each image)
    Butteraugli: JPEG XL - MozJpeg - HEIC - AVIF - HTJ2K - WebP
    3-norm: JPEG XL - MozJpeg - HEIC - WebP - HTJ2K - AVIF
    DSSIM: JPEG XL - HEIC - AVIF - MozJpeg - WebP - HTJ2K
    SSIMULACRA: JPEG XL - HEIC - AVIF - WebP - HTJ2K - MozJpeg
    VMAF: MozJpeg - HTJ2K - HEIC - AVIF - WebP - JPEG XL
    SSIM: MozJpeg - HEIC - AVIF - WebP - HTJ2K - JPEG XL

    2048x1320_andrew-coelho-46449 (~432,000 bytes each image)
    Butteraugli, DSSIM, SSIMULACRA: JPEG XL best ... HTJ2K worst
    VMAF, SSIM: MozJpeg best ... JPEG XL worst

    For example, Netflix uses these metrics in its framework and on its blog (with a note that VMAF is more suitable for video).
    92 replies | 19705 view(s)
  • Piglet's Avatar
    25th February 2020, 19:46
    I am curious about the 10 KB landscape (https://imgsli.com/MTIyMzA/) in mathematical numbers. Unique colors counted with IrfanView, error statistics with DiffImg 2.2.0:

    10KB (lila)   unique colors   mean err   min   max   std dev   RMS err    error num (pixels)   error (% pixels)
    original      107245
    avif          24780           3.91458    0     52    1.45583   4.17662    2 446 559            90.50067
    heic          29750           5.23441    0     88    2.06841   5.62827    2 607 109            96.43958
    jxl           98873           5.01614    0     104   2.65103   5.67359    2 432 727            89.98901 (!!!)
    mozjpeg       764             15.14926   0     116   4.80290   15.89238   2 578 094            95.36629

    + mozjpeg beats heic on error pixel count. :_rofl2:
    92 replies | 19705 view(s)
  • kaitz's Avatar
    25th February 2020, 19:37
    kaitz replied to a thread Paq8pxd dict in Data Compression
    It's detected as default data, and it's slower. Compression is probably worse for this reason.
    717 replies | 283131 view(s)
  • kaitz's Avatar
    25th February 2020, 19:00
    kaitz replied to a thread paq8px in Data Compression
    Text detection -> https://github.com/hxim/paq8px/issues/122#issue-565946723
    1819 replies | 520834 view(s)
  • kaitz's Avatar
    25th February 2020, 18:56
    kaitz replied to a thread Paq8pxd dict in Data Compression
    paq8pxd_v75 - changed wordModel1. enwik8 -s8 is about 16 KB smaller. Time should be the same; memory usage maybe a bit smaller.
    717 replies | 283131 view(s)
  • Sportman's Avatar
    25th February 2020, 16:26
    Sportman replied to a thread 2019-nCoV in The Off-Topic Lounge
    Yesterday Iran's Minister of Health and Medical Education signed a document: 9,761 infected, 468 dead (4.8%). Prepare.
    10 replies | 305 view(s)
  • Mauro Vezzosi's Avatar
    25th February 2020, 12:08
    Mauro Vezzosi replied to a thread paq8px in Data Compression
    paq8px_v183fix1 also has this problem: it enters an infinite loop while decoding a base64 segment. I compared paq8px and paq8pxd: paq8pxd fixed this problem by adding the variable "g" in decode_base64():
    int tlf=0,g=0;
    inn=0,g=1;
    if (g) break; //if past eof, break
    197.959 paq8px_v183fix1+g -9
    106.607 paq8pxd_74_SSE4 -s9 (much faster than paq8px_v183fix1+g, ~~3x)
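    A minimal sketch of that guard-flag pattern, assuming a plain getc() read loop (not the actual decode_base64() code):

        #include <cstdio>

        // When a read hits EOF mid-segment, set a flag and break out of the
        // decode loop instead of spinning forever on EOF.
        int main() {
            FILE* in = stdin;
            int g = 0;                  // "past eof" flag, as in the quoted fix
            for (;;) {
                int inn = getc(in);
                if (inn == EOF) g = 1;
                if (g) break;           // if past eof, break
                // ... decode the base64 byte in `inn` ...
            }
            return 0;
        }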
    1819 replies | 520834 view(s)
  • Shelwien's Avatar
    24th February 2020, 22:40
    AI/ML/NN are totally useless for cryptography (too slow and inefficient).
    10 replies | 445 view(s)
  • LawCounsels's Avatar
    24th February 2020, 22:11
    With a few deliberately crafted inputs differing from each other by the smallest possible amounts (e.g. by just 1 single bit, etc.), might an AI neural network extract some meaningful underlying salient patterns (in a manner totally unintelligible to us humans) despite encryption - the kind of thing AI neural networks are most adept at?
    10 replies | 445 view(s)
  • Shelwien's Avatar
    24th February 2020, 21:27
    > With AI this can be much less than billion ?

    No, unless we're talking about detecting a specific version of a known algorithm, or its parameters. We basically need to extract the algorithm description from sample comparison... at the very least we need to test every path through the algorithm (every unique combination of taken branches). I guess you can find some complexity estimations for https://en.wikipedia.org/wiki/Fuzzing since it does something similar.

    > Again with quantum AI can reverse engineer despite encryption of compressed output ?

    As I said, quantum computing is not magical - it's just computing based on elementary particles and physical laws. But even if we could test 2^100 keys in parallel, that is still only equivalent to reducing the key size by 100 bits... which doesn't change anything if the full key has 1024 bits. Also, the actually existing "quantum" hardware is still slower than modern electronics.
    10 replies | 445 view(s)
  • LawCounsels's Avatar
    24th February 2020, 20:46
    >> You're talking about this: https://en.wikipedia.org/wiki/Chosen-plaintext_attack With billions of samples it should be possible to reverse-engineer normal compression algorithms

    With AI, could this take much less than a billion?

    >> but its a known case in cryptography, so adding encryption after compression would still beat this type of attacks.

    Again: with quantum AI, could one reverse engineer it despite encryption of the compressed output?
    10 replies | 445 view(s)
  • pacalovasjurijus's Avatar
    24th February 2020, 20:24
    In version 1.0.0.1.7 of the White Hole software I used the paq algorithm for c1 and u1, my own algorithm for c and u, and my algorithm Calculus for c2 and u2. Now I am working on version 1.0.0.1.8, where I use a new algorithm for c3 and u3 that sorts information of 1 and 0, like yes and no.
    32 replies | 1823 view(s)
  • User's Avatar
    24th February 2020, 17:48
    User replied to a thread paq8px in Data Compression
    Test file PIF.mht:
    PAQ8pxd (all versions) - ok
    PAQ8px up to v71 - ok
    FP8 up to v4 - ok
    PAQ8px over v72 (to v132) - error (file not created)
    FP8 over v5 (to v6) - error (file not created)
    i3-4130, 16 GB, Win 8.1
    1819 replies | 520834 view(s)
  • Marco_B's Avatar
    24th February 2020, 17:06
    I finished elaborating Lens with standard context machinery: now every node in the trie has a presence in all the recency lists associated with the characters of an order-1 set; for an actually encountered context the node is placed at the head of the list, for the other ones at the tail. This is necessary to keep the consistency of the specific system the Lens series uses (see above) to transmit a symbol. I kept a single AC statistic for the fatherhood instead of replicating it for every context, because on average the differences should be zero. I was forced to decouple the LRU for the full dictionary from the list for zero children, because it would be difficult to make a choice regarding the various ranks of the symbols with respect to their contexts. Unfortunately, though the compression ratio is better than that of Lens3, it remains far worse than simply emitting the appropriate bit index as in classical LZW. So I must admit this is a sterile path, and I stop it here.
    1 replies | 432 view(s)
  • Sportman's Avatar
    24th February 2020, 13:24
    Sportman replied to a thread 2019-nCoV in The Off-Topic Lounge
    Today, an Iranian news agency interview with a member of parliament confirmed that by Feb 13, 2020 there were already 50 deaths in Iran. Patient zero was a trader who flew regularly to China for work. Last week's elections were not canceled; 42.6% voted (of people with voting rights).
    10 replies | 305 view(s)
  • Sportman's Avatar
    24th February 2020, 04:03
    Sportman replied to a thread 2019-nCoV in The Off-Topic Lounge
    Italy passed Singapore and Japan (patient zero still not found): South Korea 602, Italy 157, Japan 146, Singapore 89, Hong Kong 74. Real CN, IR and HK counts are probably 10-60 times higher than reported. Serious or critical: 20-22% (each hospital can handle only a limited number of cases). Deaths: 2-15% (flu: 0.1-0.2%). A (finger) pulse oximeter costs around 15-20 euro, for early detection.
    10 replies | 305 view(s)
  • Shelwien's Avatar
    24th February 2020, 03:05
    You're talking about this: https://en.wikipedia.org/wiki/Chosen-plaintext_attack With billions of samples it should be possible to reverse-engineer normal compression algorithms, but it's a known case in cryptography, so adding encryption after compression would still defeat this type of attack.
    10 replies | 445 view(s)
  • LawCounsels's Avatar
    24th February 2020, 02:52
    How many input and compressed file sets would you need to be able to reverse engineer it? You may also want to design your own special input file sets!
    10 replies | 445 view(s)
  • Shelwien's Avatar
    24th February 2020, 02:32
    > it has 0-bits of advantages over ordinary bruteforce

    Well, the main idea of quantum computing is to use elementary particles and laws of physics for computing. But even normal electronics have some non-zero probability of errors, and require more and more compensation in circuit design (parity checks, ECC etc). And quantum logic has tens of percent of error probability per operation, so it was necessary to invent a whole new type of algorithm as a workaround. Still, there's potential for higher density and parallelism than what we can get from further evolution of semiconductor electronics.
    10 replies | 445 view(s)
  • well's Avatar
    24th February 2020, 01:16
    Quantum computing is a term for selling more oil and gas, as usual ;) An ordinary computer has one pseudo-random number generator (PRNG); a quantum computer has n true random number generators, where n is the number of qubits... Under a Gaussian distribution it has 0 bits of advantage over ordinary bruteforce, but in a human associative style of action it may be useful, maybe not :p It is nice to play Cossacks with 8,000 units acting independently - that's all quantum computing was invented for :D For your task in the common case, the set of functions mapping one file to a smaller file through instruction sets is countable but very big; that's why this task, like the game of chess, will stay unsolved for a long time, maybe till the end of humanity, for ia-32 and amd64! I'm sorry - it cannot be solved in one defined way, but you can get many algorithms and many programs, like a hex-rays C-style decompiler :rolleyes:
    10 replies | 445 view(s)
  • Shelwien's Avatar
    24th February 2020, 00:32
    1) It may be possible to guess a known compression method from compressed data. There're even "recompression" programs (precomp etc) which make use of this to undo existing compression and apply something better. But even simple detection can be pretty hard (for example, lzham streams are encoded for specific window-size value - basically its necessary to try multiple codec versions and all possible window-size-log values to attempt decoding). 2) For some data samples (where the structure is easy to understand and predict from small substrings of data) and static bitcode algorithms it may be possible to reverse-engineer an unknown compression method - though even this requires luck and a lot of work. 3) Universal automatic solution to this task should be equivalent to Kolmogorov compression, and also breaking all cryptography etc. According to my estimations in some previous related threads here (based on Planck length and time), even the impossibly perfect quantum computers would just add ~150 bits to the key size that can be realistically bruteforced. So I'd say that a _new_ adaptive AC-based compression method can't be reverse-engineered from the data.
    10 replies | 445 view(s)
  • Sportman's Avatar
    23rd February 2020, 22:19
    Sportman replied to a thread 2019-nCoV in The Off-Topic Lounge
    No comment. https://www.youtube.com/watch?v=F_TPjbu4FAE
    10 replies | 305 view(s)
  • Bulat Ziganshin's Avatar
    23rd February 2020, 22:19
    It's so easy that no one bothers to implement it. You may be the first.
    10 replies | 445 view(s)
  • Matt Mahoney's Avatar
    23rd February 2020, 21:29
    The new contest was entirely Marcus Hutter's idea. His money, his rules, although we reviewed them before the announcement. To me, the compressor is irrelevant to Kolmogorov complexity, but I think it does give an interesting twist to the contest. It will be interesting to see how you could take advantage of the shared code between the compressor and decompressor.
    42 replies | 1748 view(s)
  • well's Avatar
    23rd February 2020, 19:23
    well replied to a thread Hutter Prize update in Data Compression
    I'm just playing with bits... I do not like C++; I prefer pure C or assembler. Btw, cmix v18 can compress to ~115,900,000 bytes with an SSD swap file and 10 GiB of RAM within 100 hours; the task is to squeeze it into 400k and code a proper memory manager for cmix. I'm too lazy to take part; my price begins with... it depends on many factors, but maybe someone wants to win the hill ;)
    42 replies | 1748 view(s)
  • LawCounsels's Avatar
    23rd February 2020, 19:18
    How easy would this be with quantum computation? Can it be prevented?
    10 replies | 445 view(s)
  • Jarek's Avatar
    23rd February 2020, 18:59
    Jarek replied to a thread Hutter Prize update in Data Compression
    enwik10, as it is built of 10 languages, does not only test knowledge extraction but also the ability to find correspondences between languages - a kind of automatic translation. It is an interesting question whether (and which) compressors can do this. It can be tested as in this "Hilberg conjecture" approach for finding long-range dependencies. For example, at http://www.byronknoll.com/cmix.html we can read that enwik9, enwik8 and enwik6 are compressed into 115714367, 14838332 and 176377 bytes respectively. This is sublinear behavior - larger files can be compressed better thanks to exploiting long-range dependencies. The question is whether we get something similar for enwik10: what is the size difference between compressing its 10 language files together and separately? An improvement from compressing them together can be interpreted as a kind of automatic translation ability (probably only for the best compressors). OK, not exactly translation - they probably contain different texts, so it is rather the ability to exploit similarities between the different languages (their models).
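    For reference, the bits per input byte behind those cmix numbers (computed from just the sizes quoted above):

        #include <cstdio>

        // Compressed-output bits per input byte for the quoted cmix results;
        // the ratio falls as the file grows - the sublinear behavior above.
        int main() {
            const char* name[] = { "enwik6", "enwik8", "enwik9" };
            double in[]  = { 1e6, 1e8, 1e9 };
            double out[] = { 176377, 14838332, 115714367 };
            for (int i = 0; i < 3; i++)
                printf("%s: %.3f bits/byte\n", name[i], 8.0 * out[i] / in[i]);
            // -> ~1.411, ~1.187, ~0.926: the larger the file, the better it compresses.
            return 0;
        }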
    42 replies | 1748 view(s)
  • bwt's Avatar
    23rd February 2020, 18:20
    bwt replied to a thread Hutter Prize update in Data Compression
    Maybe it is more interesting to compress enwik10 in <48 hours, like Sportman did before. I think that is more practical than compressing enwik9. It takes so long to compress a 1 GB file - 4-5 days... :eek:
    42 replies | 1748 view(s)
  • bwt's Avatar
    23rd February 2020, 18:12
    bwt replied to a thread paq8lab 1.0 archiver in Data Compression
    @darek, have you saved the paq8pxd48_bwt1 source code? My HDD crashed a long time ago, so I don't have a copy of the source.
    35 replies | 2565 view(s)
  • Jarek's Avatar
    23rd February 2020, 17:58
    Jarek replied to a thread Hutter Prize update in Data Compression
    It got some interest at https://old.reddit.com/r/MachineLearning/comments/f7z5sa/news_500000_prize_for_distilling_wikipedia_to_its/ and https://news.slashdot.org/story/20/02/22/0434243/hutter-prize-for-lossless-compression-of-human-knowledge-increased-to-500000 There are compressors based on these huge BERT, GPT, etc. models, but the contest requirements disqualify them. Also, such advanced ML methods are deadly for a CPU - the "no GPU usage" rule can discourage some interested people.
    42 replies | 1748 view(s)
  • Shelwien's Avatar
    23rd February 2020, 16:39
    This one actually compiles, but has the same problem with TextModel::p() not returning a value.
    35 replies | 2565 view(s)
  • Shelwien's Avatar
    23rd February 2020, 16:20
    I want to try writing an "Anti-LZ" coder - one that would encode some bitstrings that don't appear in the data, then the enumeration index of the data given those known exclusions. As for Nelson's file, it was produced with a known process: https://www.rand.org/pubs/monograph_reports/MR1418/index2.html
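    A toy sketch of the enumeration-with-exclusions half of that idea, assuming a single hypothetical excluded bitstring: count the strings that avoid it with a KMP-automaton DP, and the enumeration index needs only log2(count) bits instead of the raw length:

        #include <algorithm>
        #include <cmath>
        #include <cstdio>
        #include <string>
        #include <vector>

        int main() {
            std::string p = "11010";       // hypothetical bitstring known not to occur
            int m = (int)p.size(), n = 64; // pattern length, data length in bits
            // nxt[s][b] = prefix-match length after reading bit b in state s
            std::vector<std::vector<int>> nxt(m, std::vector<int>(2));
            for (int s = 0; s < m; s++)
                for (int b = 0; b < 2; b++) {
                    std::string t = p.substr(0, s) + char('0' + b);
                    int k = std::min((int)t.size(), m);
                    while (k > 0 && t.substr(t.size() - k) != p.substr(0, k)) k--;
                    nxt[s][b] = k;
                }
            // DP: count n-bit strings that never reach a full match (state m).
            std::vector<double> cnt(m, 0.0);
            cnt[0] = 1;
            for (int step = 0; step < n; step++) {
                std::vector<double> nc(m, 0.0);
                for (int s = 0; s < m; s++)
                    for (int b = 0; b < 2; b++)
                        if (nxt[s][b] < m) nc[nxt[s][b]] += cnt[s];
                cnt = nc;
            }
            double total = 0;
            for (double c : cnt) total += c;
            printf("%d-bit strings avoiding %s: index needs %.2f bits (vs %d raw)\n",
                   n, p.c_str(), std::log2(total), n);
            return 0;
        }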
    21 replies | 1059 view(s)
  • bwt's Avatar
    23rd February 2020, 16:19
    bwt replied to a thread paq8lab 1.0 archiver in Data Compression
    How about this source - can it be compiled? Thank you.
    35 replies | 2565 view(s)