Activity Stream

  • rarkyan (Today, 04:43)
    I don't know how to read this. Maybe I can ask my friend to "translate" it. I'll see what I can do to help.
    206 replies | 75431 view(s)
  • AiZ (Yesterday, 22:55)
    AiZ replied to a thread CHK Hash Tool in Data Compression
    French.
    208 replies | 78769 view(s)
  • Jyrki Alakuijala (Yesterday, 19:28)
    This might be of some interest here, as AI is related to compression, and the people behind this have worked on Pik and JPEG XL. Ihmehimmeli is a brain-inspired way to do artificial neural nets (recurrent/dynamic/temporal/spiking instead of function/tensor/static). https://ai.googleblog.com/2019/09/project-ihmehimmeli-temporal-coding-in.html - code at https://github.com/google/ihmehimmeli
    0 replies | 91 view(s)
  • kaitz (Yesterday, 17:18)
    kaitz replied to a thread paq8px in Data Compression
    paq8px_v182fix2 -8: 20321247 bytes, 687.30 s, 2372 MB
    1717 replies | 479265 view(s)
  • pacalovasjurijus (Yesterday, 16:34)
        from multiprocessing import Pool, Value

        def f(x):
            return (x + 1) ** 16383

        if __name__ == '__main__':
            pool = Pool(processes=4)            # start 4 worker processes
            result = pool.apply_async(f, [10])  # evaluate "f(10)" asynchronously
            # print(result.get(timeout=1))      # prints "100" unless your computer is *very* slow
            # print(pool.map(f, range(147456)))
            bnkw = pool.map(f, range(147456))

    [The remaining ~200 statements of the program were posted as a single unformatted line; their indentation and list literals did not survive, so only this recoverable opening is reproduced.]
    206 replies | 75431 view(s)
  • Mauro Vezzosi (Yesterday, 16:07)
    Italian.
    208 replies | 78769 view(s)
  • maadjordan (Yesterday, 14:50)
    maadjordan replied to a thread smpdf in Data Compression
    The conclusion from my post is that PSO can get a better result than running cpdf several times, and the gain is much better. The difference between cpdf runs is mainly the result of reordering streams.
    11 replies | 3111 view(s)
  • encode (Yesterday, 14:34)
    encode replied to a thread CHK Hash Tool in Data Compression
    Okay, I think it's time to test CHK v3.20: https://compressme.net/
    208 replies | 78769 view(s)
  • Jyrki Alakuijala (Yesterday, 12:37)
    Have more fun and productivity with low-level, platform-independent SIMD programming in C++ with https://github.com/google/highway. Highway was designed by Dr. Jan Wassenberg. We used a similar approach for HighwayHash, Randen, Pik, and JPEG XL, and have now split the SIMD code out into a separate library to make it more appealing for others to use. The project name "highway" is a reference to multiple lanes of computation.
    0 replies | 83 view(s)
  • necros (Yesterday, 11:51)
    necros replied to a thread smpdf in Data Compression
    This doesn't answer the question of why every iteration gives a different size.
    11 replies | 3111 view(s)
  • Darek (Yesterday, 10:45)
    Darek replied to a thread paq8px in Data Compression
    Looks like the v182fix change is the biggest improvement for JPG files in many versions. @LucaBiondi - could you format your second table's numbers to 0 decimal places and use a thousands separator (a small formatting example follows below)? It would be slightly easier to read.
    1717 replies | 479265 view(s)
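    (A concrete illustration of the formatting Darek asks for, added for clarity - not from the thread. In Excel, the custom number format #,##0 gives 0 decimals with a thousands separator; in Python, the "," format specifier does the same for a raw byte count:)

        sizes = [20321247, 16411564]   # sample byte counts from this thread
        for n in sizes:
            print(f"{n:,.0f}")         # -> 20,321,247 / 16,411,564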
  • Shelwien (Yesterday, 00:52)
    I think it's mainly targeted at NN/TPU workloads (8-bit floats etc.).
    2 replies | 85 view(s)
  • LucaBiondi (17th September 2019, 23:51)
    LucaBiondi replied to a thread paq8px in Data Compression
    Hi guys! We got big improvements from v182 to v182fix2:
    JPEG files gain 225 KB (WOW!)
    PDF files gain 11 KB
    DOC files gain 400 bytes
    ISO files gain 1 KB
    Other files gain 0 bytes
    NEW OVERALL RECORD! NEW JPEG RECORD! NEW PDF RECORD! NEW DOC RECORD! NEW ISO RECORD!
    Well done Gotty & Kaitz, you are pros! I love it when a plan comes together!
    Luca
    Follow my cutting-edge blog at https://sqlserverperformace.blogspot.com/
    1717 replies | 479265 view(s)
  • pacalovasjurijus (17th September 2019, 22:17)
    Please help me with Python. I want to add processes here; I have an 8-core computer. I can't find what I need on this website: https://docs.python.org/3.1/library/multiprocessing.html. What do I need to do with my code? (A multiprocessing sketch follows below.) Here is the core of my Python code:

        block = 147456
        bnk = 1
        virationc = 16383
        kl = block
        bnk = pow(virationc, kl)

    Here is my whole code:
    206 replies | 75431 view(s)
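    (A minimal sketch of an answer, based on the Pool example from the Python docs; the worker f and the range mirror the poster's full program above, and the process count matches the 8 cores mentioned:)

        from multiprocessing import Pool

        VIRATIONC = 16383

        def f(x):
            # One independent piece of work per input value.
            return (x + 1) ** VIRATIONC

        if __name__ == '__main__':
            # One worker process per core on an 8-core machine.
            with Pool(processes=8) as pool:
                bnkw = pool.map(f, range(147456))
            print(len(bnkw))   # 147456 results, computed in parallel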
  • fhanau (17th September 2019, 20:36)
    It is not run after ECT's deflate. There is some usage of deflate in the brute force and incremental strategies that may have confused you.
    416 replies | 104683 view(s)
  • encode (17th September 2019, 20:19)
    encode replied to a thread CHK Hash Tool in Data Compression
    + Fixed Status Bar flicker
    + Added CSV export (Excel format - semicolon-separated values)
    + Added "Set Font..." option (Font Name + Font Size)
    208 replies | 78769 view(s)
  • Gotty (17th September 2019, 19:32)
    Gotty replied to a thread paq8px in Data Compression
    Nice! Well done! The explanation for the v153 speedup is here (exemodel is not applied to text blocks anymore). The explanation for the v179fix1 and v179fix2 speedup is here (removed stuff) and here (divisions were eliminated).
    1717 replies | 479265 view(s)
  • Gotty (17th September 2019, 18:53)
    Gotty replied to a thread paq8px in Data Compression
    Wow, I had no idea what that 1 and 2 could have meant. Now I know. A mystery is resolved in my head. Thanx.
    1717 replies | 479265 view(s)
  • kaitz (17th September 2019, 18:16)
    kaitz replied to a thread paq8px in Data Compression
    px was worse because the old wordmodel (v181) had: if ((c>='a' && c<='z') || c==1 || c==2 || (c>=128 && (c!=3))) { - it affected word0. These (1 and 2) are the WRT FirstUpper and UpperWord codes. In pxd they are part of text0 but not word0, etc. Now it's almost the same compression as pxd.
    1717 replies | 479265 view(s)
  • Darek (17th September 2019, 17:53)
    Darek replied to a thread paq8px in Data Compression
    Some of my scores for FlashMX.pdf (difference to v182fix1 with option -9et = -8 bytes):
    paq8px_v182fix2 -9    1,320,362
    paq8px_v182fix2 -9a   1,327,992
    paq8px_v182fix2 -9et  1,319,236
    paq8px_v182fix2 -9eta 1,326,909
    Looks like the "et" option gives some gain, and option "a" hurts compression this time.
    1717 replies | 479265 view(s)
  • schnaader (17th September 2019, 15:18)
    Sounds interesting - nice to see that there is research in that direction, though it will be even harder for a new number format to replace IEEE floats than for new image formats like APNG or FLIF to catch on. The old IEEE floats have many flaws/pitfalls, but they have been used and researched for a long time (a classic example of such a pitfall follows below). After a quick glance, it looks like a useful experimental new format that focuses on precision instead of performance (although it tries to address this, too).
    2 replies | 85 view(s)
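    (A classic illustration of the IEEE-float pitfalls mentioned above; binary floats cannot represent 0.1 exactly, which is part of what formats like posits try to improve on:)

        print(0.1 + 0.2 == 0.3)      # False
        print(f"{0.1 + 0.2:.17f}")   # 0.30000000000000004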
  • maadjordan (17th September 2019, 12:50)
    maadjordan replied to a thread smpdf in Data Compression
    I tried before to replicate your case on my files, but with no success. Currently the new compile gives the following:
    original 350,140
    1st  347,923
    2nd  347,921
    3rd  347,921
    4th  347,918
    5th  347,915
    6th  347,916
    7th  347,918
    8th  347,918
    9th  347,915
    10th 347,922
    But with PDFsizeopt I get 344,569, and cpdf after PSO gives 344,674. cpdf can be used for quick optimization and to repair some damaged files, but PSO gains far more in most cases. The file still has metadata after optimization, and no tool can remove it safely. Still, multiple runs could be added to FileOptimizer to automate the file-size comparison, or you can write a batch script to do so - I think such a script has been made on this forum before (a sketch follows below).
    11 replies | 3111 view(s)
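    (A minimal sketch of the batch idea mentioned above, not an existing forum script: rerun cpdf until the file stops shrinking. It assumes cpdf's -squeeze option; verify the flag against your cpdf version.)

        import os, shutil, subprocess

        def squeeze_to_fixpoint(path, max_rounds=10):
            best = os.path.getsize(path)
            for _ in range(max_rounds):
                tmp = path + ".tmp"
                subprocess.run(["cpdf", "-squeeze", path, "-o", tmp], check=True)
                if os.path.getsize(tmp) >= best:
                    os.remove(tmp)         # no further gain - keep the current file
                    break
                shutil.move(tmp, path)     # keep the smaller file and try again
                best = os.path.getsize(path)
            return best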
  • necros (17th September 2019, 12:26)
    necros replied to a thread smpdf in Data Compression
    Please support my bug report - the author closed it and ignores it, but it still exists in 2.3 (multiple iterations of optimizing the same file give random results): https://github.com/coherentgraphics/cpdf-binaries/issues/12
    11 replies | 3111 view(s)
  • encode (17th September 2019, 12:14)
    encode replied to a thread CHK Hash Tool in Data Compression
    8)
    208 replies | 78769 view(s)
  • schnaader (17th September 2019, 11:45)
    schnaader replied to a thread paq8px in Data Compression
    Results for the reymont and FlashMX.pdf PDF files on a Hetzner cloud server (2 vCPU, 4 GB RAM, 64-bit Ubuntu) to test the PDF part from v182:

    reymont (6,627,202 bytes), Polish text, uncompressed PDF:
                              comp. size (bytes)  time (s)  memory (MiB)
    paq8px_v181fix1 -8        771,345             1795.53   2204
    paq8px_v181fix1 -9        770,107             1770.64   4028
    paq8px_v181fix1 -9a       765,737             1804.49   4028
    paq8px_v182fix1 -9        758,969             1802.48   4018
    paq8px_v182fix1 -9a       755,188             1836.77   4018

    FlashMX.pdf (4,526,946 bytes), English text, compressed PDF, many images:
    paq8px_v181fix1 -8        1,329,386           1898.80   2529
    paq8px_v181fix1 -8a       1,336,691           1920.17   2529
    paq8px_v181fix1 -9        out of memory
    paq8px_v182fix1 -8        1,321,460           1844.51   2523
    paq8px_v182fix1 -8a       1,328,893           1868.74   2523
    1717 replies | 479265 view(s)
  • Darek (17th September 2019, 11:03)
    Darek replied to a thread paq8px in Data Compression
    And my testset scores for paq8px v182fix2 - for the F.JPG file this version got the best overall score! That means the paq8px variant now holds records for 9 files! For A10.JPG from MaximumCompression, paq8px v182fix2 also got the best overall score: 628'405 bytes! Additionally, here are the enwik8 scores for v182fix1:
    16'832'420 - enwik8 -s7eta -Paq8px_v182fix1
    16'437'368 - enwik8.drt -s7eta -Paq8px_v182fix1
    16'411'564 - enwik8 -s9eta -Paq8px_v182fix1 - best score for the paq8px series
    16'086'836 - enwik8.drt -s9eta -Paq8px_v182fix1
    1717 replies | 479265 view(s)
  • CompressMaster (17th September 2019, 10:14)
    As for randomness: random data are incompressible, that's true. But since randomness does not exist, all data are compressible - we only need to find the correct patterns. We still have 256 possible byte values even in a text document where each character is unique. It depends only on the selected interpretation of your data. Mr. Matt Mahoney tries to persuade you that you have a method that simply does not work at all. I DISAGREE WITH THAT! Well, randomness DOES NOT EXIST AT ALL; it's all about finding better and better ways to minimize randomness and improve compression. As for BARF, it's fake software that does not compress anything. It was written as a joke to debunk claims of recursive random compression, because some people claimed that they were able to compress random data recursively. Again, it's not possible to compress random data, because random data does not exist at all - we still have some (and lots of, yet randomly occurring) patterns in it. Of course, infinite compression is impossible - some info MUST be stored. But recursive random data compression might be possible some day. It's all about finding better and better ways to do something. I believe... maybe I have overestimated expectations, but never say never...
    206 replies | 75431 view(s)
  • Shelwien (17th September 2019, 10:06)
    https://en.wikipedia.org/wiki/Unum_(number_format)#Type_III_Unum_%E2%80%93_Posit https://gitlab.com/cerlane/SoftPosit https://github.com/milankl/Sonums.jl
    2 replies | 85 view(s)
  • Darek (17th September 2019, 08:54)
    Darek replied to a thread paq8px in Data Compression
    enwik7 test scores with timings, in an Excel file. I've made it for all versions that I have on the laptop, plus two charts: scores and timings by version, and scores over time (based on file dates). I think it could also be a good input for Jarek's database.
    1717 replies | 479265 view(s)
  • rarkyan (17th September 2019, 05:16)
    I have a question. Maybe someone can answer it. On this page: https://encode.su/threads/1176-loseless-data-compression-method-for-all-digital-data-type?p=31483&viewfull=1#post31483 Mr. Matt Mahoney gave me code to generate 2^75000000, which, by his statement, might run for a very long time. I don't know whether it is possible to get the result or not; I did test it a long, long time ago, but since it doesn't have a pause option, I couldn't continue it. But on this page: https://encode.su/threads/1176-loseless-data-compression-method-for-all-digital-data-type?p=60768&viewfull=1#post60768 Schnaader states that:
    I once asked on some math forum about a computer's ability to generate 2^75000000. I forget the link, but as I remember, one of their representatives answered that it would run for an "infinite" time.
    -------------------------------------------
    And the question is: 2^75000000 - is it possible? How long would the process take, if it is possible? Infinite or not? (See the note below.)
    -------------------------------------------
    I'm continuing my experiment.
    206 replies | 75431 view(s)
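    (The two readings of "2^75000000" have very different answers, which a few lines of Python make concrete: computing the integer itself is cheap, while enumerating all 2^75000000 bit strings - what Matt Mahoney's counting argument requires - cannot finish in any physical amount of time:)

        import math

        n = 1 << 75_000_000                      # the integer 2**75000000, built in well under a second
        print(n.bit_length())                    # 75000001 bits, roughly 9 MB of memory
        print(int(75_000_000 * math.log10(2)))   # about 22.5 million decimal digits

        # Enumerating 2**75000000 strings is another matter: even at 10**18
        # strings per second, the loop needs about 2**74999940 seconds, so
        # "infinite" is fair for practical purposes.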
  • LucaBiondi (17th September 2019, 01:05)
    LucaBiondi replied to a thread paq8px in Data Compression
    Thanks Gotty... ready, just started to test! Luca
    1717 replies | 479265 view(s)
  • kaitz (16th September 2019, 23:17)
    kaitz replied to a thread paq8px in Data Compression
    IMG080.jpg (967711 bytes):
    paq8px_182.fix1  -8  737230  Time 23.43 sec, used 2372 MB (2487680938 bytes) of memory
    paq8px_v182fix2  -8  736736  Time 21.69 sec, used 2372 MB (2487680938 bytes) of memory
    paq8pxd_v68_AVX2 -s8 736627  Time 19.29 sec, used 2209 MB (2316655105 bytes) of memory
    1717 replies | 479265 view(s)
  • Gotty (16th September 2019, 22:42)
    Gotty replied to a thread paq8px in Data Compression
    Aham, that helps indeed (with larger jpegs), and it's logical, too! Going in the next release. Thanx so much! ;-) Luca will be happy. "Preview" attached. Luca, it's all yours.
    1717 replies | 479265 view(s)
  • kaitz (16th September 2019, 22:16)
    kaitz replied to a thread paq8px in Data Compression
    In the SSE class, like this: case JPEG: { pr = pr0; break; } In pxd I don't have the final APM there; it really hurts compression.
    1717 replies | 479265 view(s)
  • Gotty (16th September 2019, 22:00)
    Gotty replied to a thread paq8px in Data Compression
    Line 11105 in v182fix1? pr = (pr+pr0+1)>>1; Hmmm... it's worse if I remove it (just tested with smaller and larger files as well). Is this the line you meant? Edit: @Luca: I tested it on your 3 large files :-) Of course - that is my large test set :-)
    1717 replies | 479265 view(s)
  • LucaBiondi (16th September 2019, 21:51)
    LucaBiondi replied to a thread paq8px in Data Compression
    If you want to add an option, I will be happy to test it! Luca
    1717 replies | 479265 view(s)
  • kaitz (16th September 2019, 21:41)
    kaitz replied to a thread paq8px in Data Compression
    More ... :D JPEG -> what if you removed the final APM in the SSE class for JPEG? Would compression be better?
    1717 replies | 479265 view(s)
  • Gotty (16th September 2019, 20:28)
    Gotty replied to a thread paq8px in Data Compression
    Thanx! It has been on my to-do list for a long time - ever since Darek suggested it, and you gave these hints.
    1717 replies | 479265 view(s)
  • Gotty (16th September 2019, 20:22)
    Gotty replied to a thread paq8px in Data Compression
    I noticed that when you posted the results last time, they matched my results exactly (I also run tests at level -8) - except for some files where I used "-a" (adaptive learning rate). We are on the same wavelength.
    1717 replies | 479265 view(s)
  • kaitz (16th September 2019, 20:13)
    kaitz replied to a thread paq8px in Data Compression
    The nci improvement comes from the WRT filter, as with all other large files. The main DEC Alpha improvement comes mostly from the byte-order swap and the call filter. The osdb improvement comes from the WordModel, I think - I can't remember what context/check it was. Not sure about the others. As for testing with option -t: I always test without it, at least when comparing against pxd versions.
    1717 replies | 479265 view(s)
  • Gotty (16th September 2019, 19:53)
    Gotty replied to a thread paq8px in Data Compression
    Yes, you are absolutely right! Fix1 contains the extended text pre-training. Any pre-training helps only during the first few kilobytes (of text files, of course), when the NormalModel and WordModel of paq8px don't know anything yet about words and their morphology. As soon as the NormalModel and WordModel have learnt enough from the real data, the effect of pre-training fades away and the model takes over. That means the larger the file, the less text pre-training helps proportionally. I don't know exactly when that happens, but your feeling of 100K-200K seems right. The truth is: text pre-training is not advantageous. Look:
    paq8px_v182fix1 -9a : 16'456'404 (no pre-training)
    paq8px_v182fix1 -9at: 16'411'564 (with pre-training)
    The difference is 41'840 bytes. In order to decompress, you'll need paq8px_v182fix1.exe (its size must be added to both results), and in the second case, with pre-training, you'll need the pre-training files as well. So how large are they? Let's see:
    paq8px_v182fix1 -9a: 109'701 (the input is a list file containing english.dic, english.emb, english.exp)
    16'411'564 + 109'701 = 16'521'265
    We lost 64'861 bytes! The result without pre-training is better! (See the check below.) I suggest that we don't use pre-training at all in any benchmarks - or, when we do use it, we must add the compressed size of the pre-training files to the final result. If we don't take these files into account, the result gives us a false sense that paq8px has beaten cmix. I suppose that if you used pre-training neither for paq8px nor for cmix, cmix would still beat paq8px. When you have some time, could you run a test only on the files in your testset where paq8px has beaten cmix? I wonder what the results would be.
    1717 replies | 479265 view(s)
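    (The accounting above checks out; a two-line verification:)

        with_pretraining = 16_411_564 + 109_701   # compressed output + compressed pre-training files
        print(with_pretraining)                   # 16521265
        print(with_pretraining - 16_456_404)      # 64861 bytes worse than no pre-training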
  • maadjordan (16th September 2019, 17:27)
    maadjordan replied to a thread smpdf in Data Compression
    It seems that Unicode file-name support was missed during compilation. A temporary Windows compile is available at http://www.coherentpdf.com/16thSeptember2019.zip
    11 replies | 3111 view(s)
  • boxerab (16th September 2019, 15:42)
    Cool, thanks @pter: Let the HTJ2K vs XS battle begin!
    28 replies | 3441 view(s)
  • Darek (16th September 2019, 12:00)
    Darek replied to a thread paq8px in Data Compression
    Scores of the 4 corpora for paq8px v182fix1 - amazing improvements, especially for smaller files (the Calgary and Canterbury corpora): almost all the best scores for paq8px, and the biggest gain I've ever seen for paq8px between consecutive versions (0.8-0.9%)! For ob1, progb, progc (Calgary), fields.c, grammar.lsp, sum, xargs.1 (Canterbury), and FlashMX.pdf (MaximumCompression), this version has the best overall scores and beats the latest cmix v18!

    One observation (maybe I'm wrong): most of the fix1 changes give 200-500 bytes of gain independent of file size (it's similar on R.DOC and G.EXE, and even smaller for K.WAD) - it looks like this improvement works only, or mostly, on the first 100-200 KB. Or I'm wrong...

    One tip for further improvement on the Silesia corpus (I know it means tuning mostly for this corpus) -> there are some changes in the paq8pxd version by Kaitz: a) a DEC Alpha parser/model, which gives about 500 KB of gain on the mozilla file; b) a model which gives about 60 KB of gain on the nci file. The ooffice, osdb, and x-ray files also compress better, but maybe that's specific to this version of paq.

    Additionally, here are the enwik8 and enwik9 scores for paq8px v182 (without the fix yet):
    16'838'907 - enwik8 -s7eta -Paq8px_v182
    16'435'259 - enwik8.drt -s7eta -Paq8px_v182
    16'428'290 - enwik8 -s9eta -Paq8px_v182
    16'086'695 - enwik8.drt -s9eta -Paq8px_v182
    133'672'575 - enwik9 -s9eta -Paq8px_v182
    129'948'994 - enwik9.drt -s9eta -Paq8px_v182
    133'591'653 - enwik9_1423 -s9eta -Paq8px_v182 - best score for all paq8px versions (except paq8pxd)
    129'809'666 - enwik9_1423.drt -s9eta -Paq8px_v182 - best score for all paq8px versions (except paq8pxd)
    1717 replies | 479265 view(s)
  • Krishty (16th September 2019, 09:05)
    I forgot … there is one thing you could help me with. I see that genetic filtering is implemented in lodepng’s encoder, which seems to run after Zopfli. If so, what are the reasons for running it *after* deflate optimization instead of before – wouldn’t that affect compression negatively, especially block splitting?
    416 replies | 104683 view(s)
  • pter (16th September 2019, 06:17)
    pter replied to a thread JPEG 3000 Anyone ? in Data Compression
    The HTJ2K (ISO/IEC 15444-15 | ITU T.814) specification has been published and is available free of charge at: https://www.itu.int/rec/T-REC-T.814/en
    28 replies | 3441 view(s)
  • Krishty (16th September 2019, 00:57)
    Yes, but I didn’t get to the actual tests yet because I wanted to isolate the deflate part first. I’ll let you know once I have the results! Sorry if I was unclear – with -60 and -61 I mean -10060/-20060/-30060/etc. It would be a pity to remove those as the fun starts at -xxx11 and the sweet spot for maximal compression seems to be at -xxx30 to -xxx60 :) Yes, that is absolutely right and it’s absolutely possible that my test set was just bad. However, looking at ECT’s PNG performance – where it is almost never beaten, Leanify being not even close – that could imply some sort of error (if the benchmarks turn out to be valid, again). Sorry, I should rather have expressed this as “TODO for me to check out” rather than “questions” … I’m trying not to bother you with guesses here, rather trying to find out what’s going on in my tests and documenting it for others in case it’s useful to them :)
    416 replies | 104683 view(s)
  • fhanau (16th September 2019, 00:17)
    1. -3 does not perform substantially better than -4 in my tests. Have you considered using a different test set?
    2. -60 and -61 are not supported options. In a future version, ECT will reject those arguments so that questions like these don't come up anymore.
    3. That depends on the settings used for the tools and the files contained in the zip. ECT was mostly tuned on PNG and text files. On the example you provided, ECT does nineteen bytes worse than Leanify; I think occasionally doing that amount worse is acceptable.
    416 replies | 104683 view(s)
  • Krishty (16th September 2019, 00:02)
    Great work, thanks a lot! Guess I’ll do some tests anyway, just out of curiosity :) This helps me a lot to get a high-level overview, thanks. So – just to establish a checkpoint here – my open questions with ECT are:

    Why does -3 perform substantially better than -4 or any higher level?
    - What I know so far: it’s a filter thing (it does not show in my deflate-only benchmarks); a workaround is using the --allfilters option; it could be rooted in OptiPNG.

    How can -61 sometimes take a thousand times longer than -60? (Not -62 vs -61, sorry for the error in my previous post!)
    - Definitely a deflate thing, and ECT-only; it could be related to long runs of identical pixels (it does not show with Lenna & Co., but does with comics and renderings).

    How can Leanify & advzip outperform ECT on ZIP files when my benchmarks show such a high superiority of ECT with PNGs?

    I’ll try to find answers in subsequent benchmarks …
    416 replies | 104683 view(s)
  • fhanau (15th September 2019, 23:10)
    This is a simple heuristic that tunes the LZ cost model based on the results gained from running lazy LZ first when we only have time for one iteration. It is only enabled for PNG when using a low compression level, where it really helps in making ECT with one iteration competitive.
    416 replies | 104683 view(s)
  • fhanau (15th September 2019, 23:06)
    I wrote most of ECT years ago, but it mostly comes down to performance improvements in pretty much every part of ECT's deflate, much better caching, a new match finder and better handling of the iterative cost model.
    416 replies | 104683 view(s)
  • MegaByte (15th September 2019, 21:20)
    Some of ECT's filtering code was written by me -- including a genetic algorithm inspired by PNGwolf (as long as you activate it) but with better overall performance especially due to better seeding from the other filter methods. I don't expect PNGwolf to win in any cases currently. A number of the other filter algorithms were inspired by Cedric Louvier's post about TruePNG. Since that time, he wrote pingo, which does many of those filters much more efficiently than the brute-force methods included in the ECT code.
    416 replies | 104683 view(s)
  • Krishty (15th September 2019, 16:02)
    Me as well. Unfortunately, no clue. ECT’s source code is very different, and for example in squeeze.c I see vast floating-point math on symbol costs with comments like:
    Sorry, but this is the first time I’ve looked into compression code; even plain zlib is still overwhelming to me, and ECT looks like a master’s or doctoral thesis to me. Maybe Felix could elaborate on that? (Also, I’m getting carried away from the original question – whether ECT’s filtering is better than PNGwolf’s :) )
    416 replies | 104683 view(s)
  • maadjordan (15th September 2019, 15:07)
    maadjordan replied to a thread smpdf in Data Compression
    CPDF v2.3 has been released: https://coherentpdf.com/blog/?p=92 - binaries for Win, Mac & Linux: https://github.com/coherentgraphics/cpdf-binaries
    11 replies | 3111 view(s)
  • Jyrki Alakuijala (15th September 2019, 14:40)
    Do we know why? Better block split heuristics? I'd love to see such improvements integrated into the original Zopfli, too.
    416 replies | 104683 view(s)
  • Krishty (15th September 2019, 13:11)
    In order to make the Deflate benchmarks fairer, I downloaded all compressors I know of, compiled them on Windows for x64, and ran them. All sample images had row filtering entirely disabled (filter type zero) and were compressed with the Z_STORE setting to avoid bias in case tools re-use compression choices from the original input. The tests typically take a day or two, so there are just a few data points so far: Lenna, Euclid, PNG transparency demonstration; all shown below. We’re looking at very tight size differences here (often just a per mille of the image).

    First, it can definitely be stated that ECT’s Zopfli blows everything else away. For light compression, it’s always several times faster than the Zopfli variants. For long run times, it consistently achieves higher compression ratios – so high that the worst run of ECT often compresses better than the best run of any Zopfli-related tool. But ECT has a weird anomaly above 62 iterations, where it sometimes becomes incredibly inefficient and suddenly takes ten or even a thousand(!) times longer to run than 61 or fewer iterations. This can be seen clearly on Euclid, but it is worse on transparency, where I had to omit all runs above 61 iterations because the run time jumped from twelve seconds to 24,000 (a two-thousand-fold increase)!

    Second, advpng’s 7-Zip seems to be broken. You don’t see it in the benchmarks because it compresses so badly that it didn’t make it into any of the graphs. It’s consistently some percent(!) worse than Zopfli & Co, and I just can’t believe that. There has to be a bug in the code, but I couldn’t investigate that yet.

    Now, Zopfli. advpng made very minor adjustments to its Zopfli (or is it just an outdated version?) and, apart from the higher constant overhead, it’s basically the same. Leanify’s Zopfli has had some significant changes; it sometimes compresses better, sometimes worse, but at low compression levels it often compresses better.

    The one problem I see with ECT is that its performance is almost unpredictable. Though better than Zopfli, the difference from -10032 to -10033 can be as large as the difference between Zopfli and ECT. This will be a problem for my upcoming filter benchmarks. I should check whether it smooths out when I apply defluff/DeflOpt to the output.

    Input images are attached.
    416 replies | 104683 view(s)
  • Krishty (14th September 2019, 22:01)
    Fixed. A few years ago, I wrote a custom PNG variation with PPMd instead of Deflate, which worked pretty well with the “expand for 7z” function in my optimizer. However, I ditched it because non-standard formats are pretty much useless. Now I’m investigating ECT’s efficiency. Nothing else comes to my mind right now. The Optimizer has a (non-critical) memory problem with GIF optimization: FlexiGIF outputs a *lot* of progress information, sometimes as much as a GiB over a few days of run time, and the Optimizer keeps all of it (needlessly) in memory. I’ll fix that in the next version.
    18 replies | 899 view(s)
  • Krishty (14th September 2019, 21:54)
    Krishty replied to a thread FileOptimizer in Data Compression
    I noticed that the specific order of operations in Papa’s often yields 18-byte savings over almost all other JPEG optimizers, but I haven’t had time yet to investigate the cause. In case anyone wants to find out, I attached Papa’s JPG-handling code. I’d be glad to learn what causes this gain because I’m sure it can be achieved more efficiently!
    652 replies | 185763 view(s)
  • CompressMaster (14th September 2019, 20:52)
    @Krishty, 1. By attaching, I mean your 1st post. Could you repair that? Thanks. 2. What other unpublished stuff do you have? (compression field)
    18 replies | 899 view(s)
  • maadjordan (14th September 2019, 17:34)
    maadjordan replied to a thread 7-zip plugins in Data Compression
    New Plugin Added: ExFat7z
    1 replies | 406 view(s)
  • Krishty (14th September 2019, 16:46)
    For pixels, yes. Metadata not.
    18 replies | 899 view(s)
  • necros (14th September 2019, 08:08)
    necros replied to a thread FileOptimizer in Data Compression
    Why does Papa’s Optimizer output smaller JPGs by default than FO? Not a great size difference, but still.
    652 replies | 185763 view(s)
  • necros (14th September 2019, 07:30)
    necros replied to a thread Papa’s Optimizer in Data Compression
    Is BMP/GIF-to-PNG conversion lossless?
    18 replies | 899 view(s)
  • Gonzalo (13th September 2019, 18:10)
    Hopefully this is the first step towards mass production and the reduction of costs.
    2 replies | 98 view(s)
  • pklat (13th September 2019, 15:00)
    pklat replied to a thread 7-Zip in Data Compression
    Could it perhaps use CUDA?
    545 replies | 287553 view(s)
  • Darek (13th September 2019, 12:20)
    Darek replied to a thread paq8px in Data Compression
    Scores of my testset for paq8px v182fix1 - very good work on smaller files -> the average gain for my textual files is on the level of 2.1%! That means my testset's textual files now lose to the best cmix (v17) by only 1.1%! It's very, very close now. This version also gets the best overall scores (beating the best cmix scores) for O.APR, T.DOC, and Y.CFG!
    1717 replies | 479265 view(s)
  • schnaader (13th September 2019, 10:42)
    schnaader replied to a thread paq8px in Data Compression
    Good work on this! I tried to do this as a separate transformation tool recently, and it didn't work out. The problems were: the length of the strings between brackets had to be coded, so e.g. replacing them with spaces achieves this, but it adds information instead of just moving the text. Also, for many files, the kerning numbers between the brackets correlate with the previous and next characters in the brackets, so separating them hurts compression. So context modelling is the way to go here.

    Another PDF thing that would be relatively easy to implement is xrefs. This is the big table at the end of PDF files. It encodes the offsets of all the "x 0 obj" markers (where x is an incrementing number), and sorting this list by x yields the xref table (although there can be some deleted objects between entries that don't appear in the previous part). Not a big saving, as xref tables compress well anyway, but some KB per PDF (a small sketch follows below).
    1717 replies | 479265 view(s)
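    (A small sketch of the xref idea above - purely illustrative, not paq8px code: collect the byte offset of every "x 0 obj" marker and sort by object number to approximate the xref table. A real implementation would have to skip matches inside strings and streams.)

        import re

        def approximate_xref(pdf_bytes):
            # Map object number -> byte offset of its "<num> 0 obj" marker.
            offsets = {}
            for m in re.finditer(rb'(\d+)\s+0\s+obj', pdf_bytes):
                offsets.setdefault(int(m.group(1)), m.start())
            # Ascending object number = xref-table order.
            return [offsets[k] for k in sorted(offsets)]

        with open('example.pdf', 'rb') as f:      # hypothetical input file
            print(approximate_xref(f.read())[:10])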
  • Darek (13th September 2019, 10:26)
    Darek replied to a thread lstm-compress in Data Compression
    Here are the lstm-compress v3b scores for my testset. In total, a 1.32% gain, but some files, especially small textual files, got double-digit gains! It's my best option now, without any optimizations, which could give some additional gains. The second table compares lstm-compress v3b with the latest NNCP rc1 scores.
    74 replies | 8228 view(s)
  • Gotty (13th September 2019, 08:06)
    Gotty replied to a thread paq8px in Data Compression
    Paq8px_v181     -9ta 16'446'172
    Paq8px_v182     -9ta 16'428'290
    Paq8px_v182fix1 -9ta 16'411'564
    1717 replies | 479265 view(s)
  • LucaBiondi (13th September 2019, 00:14)
    LucaBiondi replied to a thread paq8px in Data Compression
    Hi! Gotty, this time you have done a great, great job! Wow! These are the results from my big testset, v181 vs v182:
    JPEG gains 10 KB!
    PDF gains 134 KB!
    TXT gains 30 KB!
    ISO gains 50 KB!
    MP3 and XML lose some.
    New overall record! New records for PDF, MP4, TXT, BAK, EXE, and ISO files! Thank you!!!
    Luca
    1717 replies | 479265 view(s)