Activity Stream

  • djubik's Avatar
    Today, 01:55
    I tried to compile this on Linux, and it fails. Firstly, it looks like the makefile format is old? I'm not a make wiz, but for the line ".c.o" I think the modern way of writing this is "%.c : %.o". But when I tried changing that, I get a different error. Any idea how to fix it?
    14 replies | 6449 view(s)
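For reference, a minimal sketch of the two rule styles, assuming generic `CC`/`CFLAGS` variables (not this project's actual makefile). GNU make still accepts old-style suffix rules, and in the equivalent pattern rule the target pattern comes first, which is the reverse of the order tried above:

```make
# Old-style suffix rule, still valid in GNU make:
.c.o:
	$(CC) $(CFLAGS) -c $< -o $@

# Equivalent modern pattern rule; note the order is
# "target: prerequisite", so it is "%.o: %.c", not "%.c: %.o":
%.o: %.c
	$(CC) $(CFLAGS) -c $< -o $@
```

If changing the rule direction still fails, the new error likely comes from elsewhere in the makefile, not from the rule syntax itself.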
  • lzhuff's Avatar
    Today, 01:08
    lzhuff replied to a thread Paq8pxd dict in Data Compression
    @darek, do you still have the paq8pxd48_bwt1 source code ??
    714 replies | 282850 view(s)
  • Darek's Avatar
    Yesterday, 22:32
    Darek replied to a thread Paq8pxd dict in Data Compression
    paq8pxd_v75 scores on my testset with -s9 option. Some better scores mainly for exe and txt files.
    714 replies | 282850 view(s)
  • Shelwien's Avatar
    Yesterday, 17:57
    Shelwien replied to a thread Paq8pxd dict in Data Compression
    DRT uses a static dictionary with only ~45k words. I guess enwik9 gets too many unconverted words. I wonder if nncp preprocessor could be used instead - either directly, or followed by utf16-to-utf8 conversion.
    714 replies | 282850 view(s)
  • image28's Avatar
    Yesterday, 17:10
    Early in the morning down under, thought I would quickly introduce myself. As in my bio, I dug up some old code from my 20s yesterday after finding this site. Still early stages yet. I wrote a quick bash test to compress the base enwik9(*) 1 GB file down to ~500 MB using mostly grep regexes to generate dictionaries, while I sift through 20 years of backups. (*) Updated: I had written enwik8 but was talking about enwik9. Had a look at improving the regexes a minute ago (separating the XML tags may yield better results), e.g.:
    cat enwik9 | grep -iEo "(<|<\/)*>" | sort | uniq > enwik9-tags
    cat enwik9 | grep -iEv "(<|<\/)*>" | grep -iEo "* " | sort | uniq > enwik9-words
    Forgot the regex I used to extract all the other symbols :T One of my best algos may be ready to go if I can find it. I've only ever tested it on already-compressed data (mp3/xvid files), but it could out-compress tar/gz/bz2/rar/zip in those test cases. (Once ran it for 5 days of iterations before it was compressing less than 8 bits per iteration. Decompression was much faster.) Will try to find something to post each week over the coming weeks if y'all are keen! Anyway, hello, and happy Wednesday/Thursday depending on where you are, Kevin
    84 replies | 28799 view(s)
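The tag/word dictionary extraction described above can be sketched in a self-contained way; the sample file and regexes below are illustrative stand-ins, not the poster's exact pipeline over enwik9:

```shell
#!/bin/sh
# Tiny stand-in for enwik9 so the sketch is self-contained.
cat > sample.xml <<'EOF'
<page><title>Foo</title><text>hello world hello</text></page>
EOF

# Dictionary of XML tags (unique, sorted):
grep -Eo '</?[a-z]+>' sample.xml | sort -u > sample-tags

# Dictionary of words, with the tags stripped first:
sed -E 's/<[^>]*>/ /g' sample.xml | grep -Eo '[A-Za-z]+' | sort -u > sample-words
```

Separating tags from body text this way keeps each dictionary homogeneous, which is usually what helps a downstream dictionary coder.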
  • fab's Avatar
    Yesterday, 16:20
    fab replied to a thread Hutter Prize update in Data Compression
    I think the rules should be modified so that the program is not required to run on CPUs older than the Intel Haswell family. This way it is possible to assume that AVX2+FMA is always available. Otherwise the performance of machine learning algorithms such as the ones implemented by NNCP will be too low to be usable (without AVX2+FMA, performance per cycle is about 4x lower). So I suggest removing the "Intel Core i7-620M" as test machine; it is a 2010 CPU which supports neither AVX2 nor FMA.
    40 replies | 1507 view(s)
  • Darek's Avatar
    Yesterday, 11:31
    Darek replied to a thread Paq8pxd dict in Data Compression
    Quite an interesting thing: enwik8.drt is compressed 0.23% better than pure enwik8, but enwik9.drt is compressed 0.31% worse than pure enwik9 -> that's a 0.54% difference. Looks like the text/word model is more efficient for bigger files than the default model. Is the learning rate different? Hmmmm....
    714 replies | 282850 view(s)
  • pacalovasjurijus's Avatar
    Yesterday, 10:49
    Yes, and not for the first time. Version 1.0.0.1.7 of White hole Software can compress and extract files.
    31 replies | 1762 view(s)
  • brispuss's Avatar
    Yesterday, 07:53
    brispuss replied to a thread Paq8pxd dict in Data Compression
    I've run some further tests with paq8pxd v75 and added the results to the table below. Brief tests were done with paq8pxd v74, but there was no improvement in compression, only slightly quicker compression times, so I didn't think it was worth posting the paq8pxd v74 results. Tests were run under Windows 7 64-bit, with an i5-3570K CPU and 8 GB RAM, using SSE4 compiles of paq8pxd.
    Compressor            Size (bytes)   Time (s)   Options
    171 jpg files         64,469,752
    paq8pxd v69           51,365,725     7,753      -s9
    paq8pxd v72           51,338,132     7,533      -s9
    paq8pxd v73           51,311,533     7,629      -s9
    paq8pxd v75           51,311,427     7,509      -s9
    Tarred jpg files      64,605,696
    paq8pxd v69           50,571,934     7,897      -s9
    paq8pxd v72           50,552,930     7,756      -s9
    paq8pxd v73           50,530,038     7,521      -s9
    paq8pxd v75           50,528,772     7,501      -s9
    For v75: improved compression and a slight reduction in compression time!
    714 replies | 282850 view(s)
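The benchmarking pattern behind the table above (same input, several compressor settings, record size and time) can be sketched as follows. This is a hedged stand-in: gzip levels substitute for the different paq8pxd versions, and the input is toy data, not the jpg set:

```shell
#!/bin/sh
# Generate a small, highly compressible test file.
head -c 200000 /dev/zero | tr '\0' 'a' > input.dat

# Run each "version" (here: gzip level) on the same input,
# recording output size and wall-clock time.
for lvl in 1 6 9; do
    start=$(date +%s)
    gzip -c -"$lvl" input.dat > "input.$lvl.gz"
    end=$(date +%s)
    echo "level $lvl: $(wc -c < "input.$lvl.gz") bytes, $((end - start)) s"
done
```

Swapping in real compressor binaries and the actual test files gives the kind of size/time table posted above.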
  • Scope's Avatar
    Yesterday, 06:56
    https://www.smithsonianmag.com/smithsonian-institution/smithsonian-releases-28-million-images-public-domain-180974263/ https://www.si.edu/openaccess
    0 replies | 39 view(s)
  • Sportman's Avatar
    25th February 2020, 23:29
    I generated a 1MB random file with this Pinetools random file generator, but paq8pxd_v75 compresses it 4 bytes smaller, so it's not a good tool to generate random files. Did you ever program a decompressor to see if one of your compressor's outputs can be reversed back to the input, binary identical?
    31 replies | 1762 view(s)
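The round-trip check asked about above can be done mechanically: compress, decompress, and compare byte-for-byte. A minimal sketch, with gzip standing in for the compressor under test (any codec can be dropped in instead):

```shell
#!/bin/sh
# 1 MB of random data as the test input.
head -c 1048576 /dev/urandom > random.bin

gzip -c  random.bin    > random.bin.gz   # compress
gzip -dc random.bin.gz > random.out      # decompress

# Byte-for-byte comparison: silent success means lossless round trip.
cmp -s random.bin random.out && echo "round trip OK: output is binary identical"

# On truly random input the compressed file is normally slightly
# LARGER than the original (header/stored-block overhead), never
# meaningfully smaller.
```

A compressor whose output cannot pass this check is not a compressor in any useful sense, whatever size it reports.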
  • Scope's Avatar
    25th February 2020, 21:36
    Some metrics comparison: 2048x1320_nitish-kadam-34748
    Butteraugli: (AVIF - HEIC - HTJ2K - WebP - JPEG XL - MozJpeg)
    3-norm: (AVIF - HEIC - HTJ2K - JPEG XL - WebP - MozJpeg)
    DSSIM: (AVIF - HEIC - JPEG XL - HTJ2K - WebP - MozJpeg)
    SSIMULACRA: (AVIF - HEIC - JPEG XL - HTJ2K - WebP - MozJpeg)
    VMAF: (AVIF - HEIC - HTJ2K - JPEG XL - WebP - MozJpeg)
    But at a larger bpp (in other examples, the result is almost similar): 2048x1320_alex-siale-95113
    Butteraugli: (JPEG XL - MozJpeg - HEIC - AVIF - HTJ2K - WebP)
    3-norm: (JPEG XL - MozJpeg - HEIC - WebP - HTJ2K - AVIF)
    DSSIM: (JPEG XL - HEIC - AVIF - MozJpeg - WebP - HTJ2K)
    SSIMULACRA: (JPEG XL - HEIC - AVIF - WebP - HTJ2K - MozJpeg)
    VMAF: (MozJpeg - HTJ2K - HEIC - AVIF - WebP - JPEG XL)
    SSIM: (MozJpeg - HEIC - AVIF - WebP - HTJ2K - JPEG XL)
    90 replies | 19344 view(s)
  • Piglet's Avatar
    25th February 2020, 19:46
    I am curious: 10 KB landscape (https://imgsli.com/MTIyMzA/), mathematical numbers (unique colors from IrfanView; error stats from DiffImg 2.2.0):
                unique    mean      min  max  standard   RMS       error num   error
                colors    error     err  err  deviation  error     (pixels)    (% pixels)
    original    107245
    avif        24780     3.91458   0    52   1.45583    4.17662   2 446 559   90.50067
    heic        29750     5.23441   0    88   2.06841    5.62827   2 607 109   96.43958
    jxl         98873     5.01614   0    104  2.65103    5.67359   2 432 727   89.98901 (!!!)
    mozjpeg     764       15.14926  0    116  4.80290    15.89238  2 578 094   95.36629
    + mozjpeg better than heic on error num pixels. :_rofl2:
    90 replies | 19344 view(s)
  • kaitz's Avatar
    25th February 2020, 19:37
    kaitz replied to a thread Paq8pxd dict in Data Compression
    It's detected as default data and it's slower. Compression is probably worse for this reason.
    714 replies | 282850 view(s)
  • kaitz's Avatar
    25th February 2020, 19:00
    kaitz replied to a thread paq8px in Data Compression
    Text detection -> https://github.com/hxim/paq8px/issues/122#issue-565946723
    1817 replies | 519834 view(s)
  • kaitz's Avatar
    25th February 2020, 18:56
    kaitz replied to a thread Paq8pxd dict in Data Compression
    paq8pxd_v75 - changed wordModel1. enwik8 -s8 is about 16 KB smaller. Time should be the same, memory usage maybe a bit smaller.
    714 replies | 282850 view(s)
  • Sportman's Avatar
    25th February 2020, 16:26
    Sportman replied to a thread 2019-nCoV in The Off-Topic Lounge
    Yesterday Iran Minister of Health and Medical Education signed document: 9761 infected, 468 died (4.8%). Prepare.
    9 replies | 266 view(s)
  • Mauro Vezzosi's Avatar
    25th February 2020, 12:08
    Mauro Vezzosi replied to a thread paq8px in Data Compression
    Also paq8px_v183fix1 has this problem: it executes an infinite loop while decoding a base64 segment. I compared paq8px and paq8pxd: paq8pxd fixed this problem by adding the variable "g" in decode_base64():
    int tlf=0,g=0;
    inn=0,g=1;
    if (g) break; //if past eof, break
    197.959 paq8px_v183fix1+g -9
    106.607 paq8pxd_74_SSE4 -s9 (much faster than paq8px_v183fix1+g, ~3x)
    1817 replies | 519834 view(s)
  • Shelwien's Avatar
    24th February 2020, 22:40
    AI/ML/NN are totally useless for cryptography (too slow and inefficient).
    10 replies | 424 view(s)
  • LawCounsels's Avatar
    24th February 2020, 22:11
    With a few deliberately crafted specific inputs, each differing from the next by the smallest possible amount (e.g. by just 1 single bit, etc.), may an AI neural network extract some meaningful underlying salient patterns (in a manner totally intelligible to us humans) despite encryption, which is the kind of thing AI neural networks are most adept at?
    10 replies | 424 view(s)
  • Shelwien's Avatar
    24th February 2020, 21:27
    > With AI this can be much less than billion ?
    No, unless we're talking about detecting a specific version of a known algorithm, or its parameters. We basically need to extract the algorithm description from sample comparison... at the very least we need to test every path through the algorithm (every unique combination of taken branches). I guess you can find some complexity estimations for https://en.wikipedia.org/wiki/Fuzzing since it does something similar.
    > Again with quantum AI can reverse engineer despite encryption of compressed output ?
    As I said, quantum computing is not magical, it's just computing based on elementary particles and physical laws. But even if we can test 2^100 keys in parallel, that's still only equivalent to reducing the key size by 100 bits... which doesn't change anything if the full key has 1024 bits. Also the actually existing "quantum" hardware is still slower than modern electronics.
    10 replies | 424 view(s)
  • LawCounsels's Avatar
    24th February 2020, 20:46
    >>You're talking about this: https://en.wikipedia.org/wiki/Chosen-plaintext_attack With billions of samples it should be possible to reverse-engineer normal compression algorithms With AI this can be much less than billion ? >>but its a known case in cryptography, so adding encryption after compression would still beat this type of attacks. Again with quantum AI can reverse engineer despite encryption of compressed output ?
    10 replies | 424 view(s)
  • pacalovasjurijus's Avatar
    24th February 2020, 20:24
    In version 1.0.0.1.7 of White hole software I used the paq algorithm for c1 and u1; for c and u I use my own algorithm; and for c2 and u3 I use my algorithm Calculus. Now I am working on version 1.0.0.1.8. In that version I use my new algorithm for c3 and u3: an algorithm sorting information of 1 and 0, like yes and no.
    31 replies | 1762 view(s)
  • User's Avatar
    24th February 2020, 17:48
    User replied to a thread paq8px in Data Compression
    Test file: PIF.mht
    PAQ8pxd (all versions) - ok
    PAQ8px up to v71 - ok
    FP8 up to v4 - ok
    PAQ8px after v72 (to v132) - error (file not created)
    FP8 after v5 (to v6) - error (file not created)
    i3-4130, 16 GB, Win 8.1
    1817 replies | 519834 view(s)
  • Marco_B's Avatar
    24th February 2020, 17:06
    I have finished elaborating Lens with standard context machinery; now every node in the trie has a presence in all the recency lists associated with the characters of an order-1 set: for a context actually encountered, the node is placed at the head of the list; for the other ones, at the tail. This is necessary to keep the consistency of the specific system of the Lens series (see above) for transmitting a symbol. I kept a single AC statistic for the fatherhood instead of replicating it for every context, because on average the differences should be zero. I was forced to decouple the LRU for the full dictionary from the list for zero children, because it would be difficult to make a choice regarding the various ranks of the symbols with respect to their contexts. Unfortunately, though the compression ratio is better than that of Lens3, it remains far worse than simply emitting the appropriate bit index as in classical LZW. So I must admit this is a sterile path, and I stop it here.
    1 replies | 420 view(s)
  • Sportman's Avatar
    24th February 2020, 13:24
    Sportman replied to a thread 2019-nCoV in The Off-Topic Lounge
    Today an Iranian news agency interview with a member of parliament confirmed that by Feb 13, 2020 there were already 50 deaths in Iran. Patient zero was a trader who flew regularly to China for work. Last week's elections were not canceled; 42.6% voted (of people with voting rights).
    9 replies | 266 view(s)
  • Sportman's Avatar
    24th February 2020, 04:03
    Sportman replied to a thread 2019-nCoV in The Off-Topic Lounge
    Italy passed Singapore and Japan (patient zero still not found): South Korea 602, Italy 157, Japan 146, Singapore 89, Hong Kong 74. Real CN, IR and HK counts are probably 10-60 times higher than reported. Serious or critical: 20-22% (each hospital can handle only a limited number of cases). Deaths: 2-15% (flu: 0.1-0.2%). A (finger) pulse oximeter costs around 15-20 euro for early detection.
    9 replies | 266 view(s)
  • Shelwien's Avatar
    24th February 2020, 03:05
    You're talking about this: https://en.wikipedia.org/wiki/Chosen-plaintext_attack With billions of samples it should be possible to reverse-engineer normal compression algorithms, but it's a known case in cryptography, so adding encryption after compression would still beat this type of attack.
    10 replies | 424 view(s)
  • LawCounsels's Avatar
    24th February 2020, 02:52
    How many input and compressed file sets would you need to be able to reverse-engineer it? You may also want to design your special input file set!
    10 replies | 424 view(s)
  • Shelwien's Avatar
    24th February 2020, 02:32
    > it has 0-bits of advantages over ordinary bruteforce
    Well, the main idea of quantum computing is to use elementary particles and laws of physics for computing. But even normal electronics have some non-zero probability of errors, and require more and more compensation in circuit design (parity checks, ECC etc). And quantum logic has tens of percents of error probability per operation, so it was necessary to invent a whole new type of algorithms as a workaround. Still, there's potential for higher density and parallelism than what we can have with further evolution of semiconductor electronics.
    10 replies | 424 view(s)
  • well's Avatar
    24th February 2020, 01:16
    Quantum computing is a term for selling more oil and gas, as usual ;) An ordinary computer has one pseudo-random number generator (PRNG); a quantum computer has n true random number generators, where n is the quantity of qubits... In a Gaussian distribution it has 0 bits of advantage over ordinary bruteforce... but in a human associative style of actions it may be useful, may be not :p It is nice to play Cossacks with 8,000 units acting independently; that's all quantum computing was invented for :D For your task in the common case, the set of functions mapping one file to a smaller file through instruction sets is countable but very big; that's why this task, like the game of chess, has stayed unsolved for a long time, maybe till the end of humanity, for ia-32 and amd64! I'm sorry, it cannot be solved in one defined way, but many algorithms and many programs you can get with a hex-rays C-style decompiler :rolleyes:
    10 replies | 424 view(s)
  • Shelwien's Avatar
    24th February 2020, 00:32
    1) It may be possible to guess a known compression method from compressed data. There're even "recompression" programs (precomp etc.) which make use of this to undo existing compression and apply something better. But even simple detection can be pretty hard (for example, lzham streams are encoded for a specific window-size value; basically it's necessary to try multiple codec versions and all possible window-size-log values to attempt decoding).
    2) For some data samples (where the structure is easy to understand and predict from small substrings of data) and static bitcode algorithms it may be possible to reverse-engineer an unknown compression method, though even this requires luck and a lot of work.
    3) A universal automatic solution to this task would be equivalent to Kolmogorov compression, and also to breaking all cryptography etc. According to my estimations in some previous related threads here (based on Planck length and time), even impossibly perfect quantum computers would just add ~150 bits to the key size that can be realistically bruteforced.
    So I'd say that a _new_ adaptive AC-based compression method can't be reverse-engineered from the data.
    10 replies | 424 view(s)
  • Sportman's Avatar
    23rd February 2020, 22:19
    Sportman replied to a thread 2019-nCoV in The Off-Topic Lounge
    No comment. https://www.youtube.com/watch?v=F_TPjbu4FAE
    9 replies | 266 view(s)
  • Bulat Ziganshin's Avatar
    23rd February 2020, 22:19
    It's so easy that no one bothers to implement it. You may be the first.
    10 replies | 424 view(s)
  • Matt Mahoney's Avatar
    23rd February 2020, 21:29
    The new contest was entirely Marcus Hutter's idea. His money, his rules, although we reviewed them before the announcement. To me, the compressor is irrelevant to Kolmogorov complexity, but I think it does give an interesting twist to the contest. It will be interesting to see how you could take advantage of the shared code between the compressor and decompressor.
    40 replies | 1507 view(s)
  • well's Avatar
    23rd February 2020, 19:23
    well replied to a thread Hutter Prize update in Data Compression
    If I'm just playing with bits... I don't like C++, I rather prefer pure C or assembler... Btw, cmix v18 can compress to ~115,900,000 bytes with an SSD swap file and 10 GiB of RAM within 100 hours. The task is to squeeze that into 400k of RAM and code a proper memory manager for cmix. I'm too lazy to take part; my price begins at... it depends on many factors, but maybe someone wants to win the hill ;)
    40 replies | 1507 view(s)
  • LawCounsels's Avatar
    23rd February 2020, 19:18
    How easy with quantum computation ? Can it be prevented ?
    10 replies | 424 view(s)
  • Jarek's Avatar
    23rd February 2020, 18:59
    Jarek replied to a thread Hutter Prize update in Data Compression
    enwik10, being built of 10 languages, does not only test knowledge extraction but also the ability to find correspondences between languages, a kind of automatic translation. It is an interesting question if/which compressors can do it. It can be tested as in this "Hilberg conjecture" approach for finding long-range dependencies. For example at http://www.byronknoll.com/cmix.html we can read that enwik9, 8, 6 are compressed into correspondingly 115714367, 14838332, 176377 bytes. We see it is sublinear behavior: larger files can be compressed better thanks to exploiting long-range dependencies. The question is whether we have something similar with enwik10: what is the size difference between compressing its 10 language files together and separately? An improvement from compressing them together can be interpreted as a kind of automatic translation ability (rather only for the best compressors). Ok, not exactly translation; they probably contain different texts, so it is rather the ability to exploit similarities between different languages (their models).
    40 replies | 1507 view(s)
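The together-vs-separately measurement proposed above can be sketched directly. This is a toy stand-in: gzip substitutes for a strong compressor, and two small repetitive files substitute for the per-language enwik10 parts:

```shell
#!/bin/sh
# Two toy "language" files standing in for enwik10's per-language parts.
yes "the quick brown fox jumps over the lazy dog" | head -500 > lang1.txt
yes "der schnelle braune fuchs springt ueber den faulen hund" | head -500 > lang2.txt

# Total compressed size when each file is compressed separately:
sep=$(( $(gzip -9c lang1.txt | wc -c) + $(gzip -9c lang2.txt | wc -c) ))

# Compressed size when the files are concatenated first:
joint=$(cat lang1.txt lang2.txt | gzip -9c | wc -c)

echo "separate: $sep bytes, joint: $joint bytes"
# If joint < sep, the compressor is exploiting cross-file redundancy;
# for real multilingual corpora that gap hints at shared structure
# between the language models.
```

For a compressor with real long-range modeling (cmix, paq), the gap between the two totals is the quantity of interest, measured over the actual language files rather than toy data.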
  • bwt's Avatar
    23rd February 2020, 18:20
    bwt replied to a thread Hutter Prize update in Data Compression
    Maybe it is more interesting to compress enwik10 in <48 hours like Sportman did before. I think it is more practical than compressing enwik9. It takes so long to compress a 1 GB file, up to 4-5 days... :eek:
    40 replies | 1507 view(s)
  • bwt's Avatar
    23rd February 2020, 18:12
    bwt replied to a thread paq8lab 1.0 archiver in Data Compression
    @darek have you saved the paq8pxd48_bwt1 source code? My HDD crashed a long time ago, so I don't have a copy of the source.
    35 replies | 2535 view(s)
  • Jarek's Avatar
    23rd February 2020, 17:58
    Jarek replied to a thread Hutter Prize update in Data Compression
    It got some interest in https://old.reddit.com/r/MachineLearning/comments/f7z5sa/news_500000_prize_for_distilling_wikipedia_to_its/ https://news.slashdot.org/story/20/02/22/0434243/hutter-prize-for-lossless-compression-of-human-knowledge-increased-to-500000 There are compressors based on these huge BERT, GPT, etc. models, but that is a disqualifying requirement here. Also such advanced ML methods are deadly for a CPU; the "no GPU usage" rule can discourage some interested people.
    40 replies | 1507 view(s)
  • Shelwien's Avatar
    23rd February 2020, 16:39
    This one actually compiles, but has the same problem with TextModel::p() not returning a value.
    35 replies | 2535 view(s)
  • Shelwien's Avatar
    23rd February 2020, 16:20
    I want to try writing an "Anti-LZ" coder: one that would encode some bitstrings that don't appear in the data, then an enumeration index of the data with the known exclusions. As for Nelson's file though, it's produced with a known process: https://www.rand.org/pubs/monograph_reports/MR1418/index2.html
    21 replies | 1047 view(s)
  • bwt's Avatar
    23rd February 2020, 16:19
    bwt replied to a thread paq8lab 1.0 archiver in Data Compression
    How about this source, can it be compiled? Thank you.
    35 replies | 2535 view(s)
  • Shelwien's Avatar
    23rd February 2020, 16:05
    Maybe it was a different source version. This source is broken, it won't compile.
    35 replies | 2535 view(s)
  • Shelwien's Avatar
    23rd February 2020, 16:03
    Problem is that it's basically the same contest, just with better prizes. New participants would still have to compete with all the time that Alex spent on tweaking and testing all kinds of things. So we can't expect sudden 10% breakthroughs here, at least not while using the same paq framework (cmix is also paq-based). Even the 1% required for a prize would be not that easy to reach.
    > students who earn part-time 4 euro a hour
    That's very unlikely in this case. The potential winner has to be a C++ programmer (which is out of fashion) with a good knowledge of state-of-the-art compression methods, and with access to hardware for cmix testing.
    40 replies | 1507 view(s)
  • bwt's Avatar
    23rd February 2020, 15:55
    bwt replied to a thread paq8lab 1.0 archiver in Data Compression
    Could you compile it using gcc 7.0, please? Because your earlier compile with gcc 7.0 was successful. Thank you.
    35 replies | 2535 view(s)
  • Kaw's Avatar
    23rd February 2020, 15:40
    50 bits of redundancy is not a very big deal if you talk about randomness. For compression you basically have 2 options:
    1. On-the-fly statistics, trying to compress the data with partial knowledge (time series prediction).
    2. Use knowledge of the entire file, but then you have to include a description of this knowledge in the output file.
    If you look at time series prediction and randomness, it's really hard to find working patterns that will help you to compress the file. In a 100% random file, 50% of the patterns will have a bias to 0 and 50% a bias to 1. Half of those patterns will end up having a bias the other way around by the end of the file. There might be a pattern finding 50 bits of redundancy, but which is it? And will it hold up for the second half of the file? If you are able to make an algorithm that finds those 50 bits of redundancy in this file, it will be very strong AI. If you look at prior knowledge: how do you describe this redundancy or bias within 50 bits? It would be a major improvement if you were able to describe fairly complex patterns in fewer bits than the advantage you get from describing them.
    21 replies | 1047 view(s)
  • Sportman's Avatar
    23rd February 2020, 14:05
    Agree, but because you need to publish source code it can be strategic to submit the best possible version as your first version, and who knows how many % improvement that is. I'm thinking of outsiders: students who earn 4 euro an hour part-time in a restaurant or supermarket, or who have no income at all except parents' money or a government/bank loan.
    40 replies | 1507 view(s)
  • Shelwien's Avatar
    23rd February 2020, 13:08
    I tried, but TextModel::p() seems to be missing the return statement, and there's a bracket mismatch somewhere. So it doesn't compile.
    35 replies | 2535 view(s)
  • Shelwien's Avatar
    23rd February 2020, 12:07
    You have to understand that both 50k and 500k values are just for advertisement. Realistically new people would have to invest months of work even to make $5k.
    40 replies | 1507 view(s)
  • bwt's Avatar
    23rd February 2020, 11:56
    bwt replied to a thread paq8lab 1.0 archiver in Data Compression
    paq8pxdv48_bwt2 @shelwien could you compile this source code please? Or maybe could you remove the zlib and gif functions so I can compile it myself? Thank you.
    35 replies | 2535 view(s)
  • Sportman's Avatar
    23rd February 2020, 11:51
    Is (Marcus) Hutter's prize money sourced from grants or from his Google DeepMind Senior Researcher salary (maybe with employee stock/options)? Grants:
    2019 - 2023 A$ 7'500'000,- ANU Grand Challenge. 10 CIs. Human Machine Intelligence (HMI).
    2019 - 2021 US$ 276'000,- Future of Life Project grant. Sole CI. The Control Problem for Universal AI: A Formal Investigation (CPUAI).
    2015 - 2019 A$ 421'500,- Australian Research Council DP grant. Sole CI. Unifying Foundations for Intelligent Agents (UFIA).
    40 replies | 1507 view(s)
  • bwt's Avatar
    23rd February 2020, 11:37
    bwt replied to a thread Hutter Prize update in Data Compression
    :eek:
    40 replies | 1507 view(s)
  • Sportman's Avatar
    23rd February 2020, 11:30
    This CPU and motherboard only support 64 GB max, and the CPU has 8 cores (16 threads), so in theory you can run 16 instances with 4 GB memory each (with 64 GB installed).
    For 64GB: Memory: Crucial Ballistix Sport LT 64GB (2666MHz), 285 euro. Total: 660 euro.
    For 128GB: Motherboard: ASRock X570M Pro4 (micro-ATX), 195 euro. CPU: AMD Ryzen 9 3900X (12 cores, 3.8-4.6GHz), 470 euro. Memory: HyperX Fury black 128GB (3200MHz), 675 euro. Total: 1515 euro.
    40 replies | 1507 view(s)
  • bwt's Avatar
    23rd February 2020, 05:56
    bwt replied to a thread Hutter Prize update in Data Compression
    Looking at the LTCB site, cmix v17 beats phda9, but it uses 25 GB RAM and more time. How about reducing the variable settings? Would it still be better than phda9?
    40 replies | 1507 view(s)
  • well's Avatar
    23rd February 2020, 05:39
    Ok, I believe you, since I have a sense of who you are irl, and it is best not to go beyond the pale in my search for truth... Thanks Evgeniy and Sportman for answering!
    7 replies | 1397 view(s)
  • Shelwien's Avatar
    23rd February 2020, 04:15
    @CompressMaster: I don't provide that kind of service. You can see mcm results here: http://mattmahoney.net/dc/text.html#1449 and download enwik9.pmd here: http://mattmahoney.net/dc/textdata.html
    40 replies | 1507 view(s)
  • Shelwien's Avatar
    23rd February 2020, 03:57
    > What about building your own PC Keep in mind that a single run takes 5-7 days... I suppose it would be better to install 256GB of RAM (so 490+345=835?), then run 8 instances at once. Actually for some tasks, like submodel memory usage tweaking, or submodel contribution estimation it would be better to run individual submodels, write their predictions to files, then do the final mix/SSE pass separately - it would require recomputing only the results of modified submodel. But unfortunately many other tasks - like testing new preprocessing ideas, or article reordering, or WRT dictionary optimization - would still require complete runs. Btw I actually did experiment with article reordering for enwik8... but I tried to speed-optimize it by compressing reordered files with ppmd instead of paq. Unfortunately after actual testing it turned out that article order that improves ppmd compression hurts it for paq.
    40 replies | 1507 view(s)
  • Shelwien's Avatar
    23rd February 2020, 03:35
    @Self_Recursive_Data:
    > Matt says "The task is now to compress (instead of decompress) enwik9 to a self extracting archive"
    > Why the change to compress it and not decompress?
    I think it's an attempt to buy the phda sources from Alex (and then to prevent anybody else monopolizing the contest), since they require the compressor's sources now. It was a decompressor before, but with only a decompressor source it might still be hard to reproduce the result if the algorithm is asymmetric (i.e. the compressor does some data optimization, then encodes the results), which might be the case for phda. Also it looks to me like some zpaq ideas affected the new rules: I think Matt expects the compressor to generate a custom decompressor based on file analysis; that's probably why both compressor and decompressor size are counted as part of the result. Not sure why he decided not to support the more common case where enc/dec are symmetric; maybe it's an attempt to promote asymmetry?
    > Shouldn't the time measured measure the total time to compress+decompress?
    The time doesn't affect the results. Based on "Each program must run on 1 core in less than 100 hours" we can say that "compress+decompress" is allowed 200 hours.
    > Strong AI would cycle/recurse through finding and digesting new information,
    > then extracting new insights, repeat.
    > Kennon's algorithm for example compresses very slowly but extracts super fast.
    Sure, but 100 hours is more than 4 days. That should be enough time to do multiple passes or whatever, if necessary. Matt doesn't have a dedicated server farm for testing contest entries, so they can't really run for too long.
    40 replies | 1507 view(s)
  • Sportman's Avatar
    23rd February 2020, 02:25
    Sportman replied to a thread 2019-nCoV in The Off-Topic Lounge
    Looks like it's still not found; hope for Italy that it was a tourist and not somebody who still walks around in Italy. Italy passed Hong Kong: South Korea 433, Japan 134, Singapore 89, Italy 79, Hong Kong 70. Improve your health while there is still time.
    9 replies | 266 view(s)
  • Sportman's Avatar
    23rd February 2020, 01:26
    That's the chicken-and-egg problem. Best is to start working on something, and if you win, ask them to send the prize (in multiple smaller transactions, once a month) to somebody you trust (via bank wire transfer); this way you stay anonymous and the bank asks no questions.
    7 replies | 1397 view(s)
  • Sportman's Avatar
    23rd February 2020, 00:57
    What about building your own PC:
    Case: Cooler Master MasterBox Q300L (micro-ATX), 45 euro
    Power supply: Seasonic Focus Gold 450W (80 Plus Gold), 60 euro
    Motherboard: Gigabyte B450M DS3H (micro-ATX), 70 euro
    CPU: AMD Ryzen 7 1700 + cooler (8 cores, 3.0-3.7GHz), 130 euro
    Memory: Crucial Ballistix Sport LT 32GB (2666MHz), 115 euro
    Storage: Adata XPG SX6000 Lite 512GB (NVMe M.2), 70 euro
    Total: 490 euro
    Costs excl. energy, 1 year: 41 euro p/m
    Costs excl. energy, 2 years: 20 euro p/m
    Costs excl. energy, 3 years: 14 euro p/m
    Grabbed the parts, did not check if they all match.
    40 replies | 1507 view(s)
  • Self_Recursive_Data's Avatar
    23rd February 2020, 00:50
    Can someone answer my post #9 above ?
    40 replies | 1507 view(s)
  • CompressMaster's Avatar
    23rd February 2020, 00:44
    @Shelwien, could I request that you upload enwik9 compressed alongside a decompressor (mcm 0.84 with options -x3 and -x10), as you did with enwik10 on the MEGA cloud? As always, I'm stuck with low HDD space... Thank you very much.
    40 replies | 1507 view(s)
  • well's Avatar
    23rd February 2020, 00:33
    The problem is "they" do not want to talk about money, but rather only about compression :confused: Random data is not compressible :) I'm only interested now in the Hutter Prize; it seems to me a rather profitable task :D And yes, I have very little knowledge of compression techniques, but I don't give this field much of my time, just the basics... My goal is money; if this goal breaks the spirit of the compression competition then everything will stay in place and in time, just let me know! :_shuffle2:
    7 replies | 1397 view(s)
  • Sportman's Avatar
    23rd February 2020, 00:22
    Ask if they can pay you out in a top-10 cryptocurrency if you win; they can exchange dollars to crypto and send it to you. I had a random data contest with a prize, but it was not claimed before the end of last year. You can always send what you have to me to verify your claim. If you have managed to create a working random data compressor, you can indeed better stay very low-profile.
    7 replies | 1397 view(s)
  • Shelwien's Avatar
    23rd February 2020, 00:05
    I actually can run 3-4 instances of cmix at home. I just wanted to point out that running one costs considerable money; it'd be $5 per run on my home PC (which is probably much faster than a Hetzner VM) just in electricity bills. It's just that some optimizations like what Alex suggests (optimizing article reordering etc.) would really benefit from having access to 100s of free instances. Anyway, Intel devcloud would likely be a better choice atm, since they provide a few months of free trial... but I think they don't allow a single task to run for more than 24 hours, so it would be necessary to implement some kind of save/load feature. Btw https://encode.su/threads/3242-google-colab-compression-testing
    40 replies | 1507 view(s)
  • well's Avatar
    22nd February 2020, 23:59
    Sorry, I can't go to Australia in person, but I can call by Skype and drop an e-mail, and I already did this: no answer by phone, no answer by e-mail... I can also receive money on my credit card, but the Ukrainian bank system is too restricted... Just treat me right; I'm living in a political and social ghetto and just want to be sure that I'll get the prize money and that there will be no media hype and so on connected with my person... I want to stay in as cool a shadow as possible...
    7 replies | 1397 view(s)
  • schnaader's Avatar
    22nd February 2020, 23:41
    Hetzner has better pricing, 0.006 €/h or 36 €/month (CX51, 32 GB RAM, 240 GB disk). There's a "dedicated vCPU" alternative for 83 €/month, but I couldn't see a big performance difference last time I tried. Apart from that, Byron's offer for Google credit might still be available.
    40 replies | 1507 view(s)