Activity Stream

  • Shelwien's Avatar
    Today, 01:44
    On win10, Large Pages only work when:
    1) Program executable is x64
    2) Runs under admin
    3) Policy allows large pages ("Computer Configuration" > "Windows Settings" > "Security Settings" > "Local Policies" > "User Rights Assignment" > "Lock Pages in memory")
    4) Unfragmented 2MB pages physically exist in the memory manager (i.e. the test is done soon after reboot)
    z:\021>2mpages.exe
    OpenProcessToken: <The operation completed successfully. >
    LookupPrivilegeValue: <The operation completed successfully. >
    AdjustTokenPrivileges: <The operation completed successfully. >
    LPM.size=200000: VirtualAlloc flags=20001000: <The operation completed successfully. > p=00C00000 Flags=20001000
    Z:\021>timetest 7z a -mx1 -md27 -slp -mmt=1 1 "D:\000\enwik8"
    7-Zip 19.02 alpha (x64) : Copyright (c) 1999-2019 Igor Pavlov : 2019-09-05
    Archive size: 31739530 bytes (31 MiB)
    Tested program has wasted 12.563s
    Z:\021>timetest 7z a -mx1 -md27 -slp- -mmt=1 2 "D:\000\enwik8"
    7-Zip 19.02 alpha (x64) : Copyright (c) 1999-2019 Igor Pavlov : 2019-09-05
    Archive size: 31739530 bytes (31 MiB)
    Tested program has wasted 15.547s
    (15.547/12.563-1)*100 = 23.75%
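    For reference, a minimal sketch of the WinAPI call sequence that 2mpages.exe apparently exercises (OpenProcessToken, LookupPrivilegeValue, AdjustTokenPrivileges, then VirtualAlloc with MEM_LARGE_PAGES). This is illustrative code, not Shelwien's test program, and it assumes the "Lock pages in memory" right has already been granted as described above:
    // Minimal large-page allocation test (sketch, C++ / Win32).
    #include <windows.h>
    #include <stdio.h>

    int main() {
        HANDLE token;
        if (!OpenProcessToken(GetCurrentProcess(),
                              TOKEN_ADJUST_PRIVILEGES | TOKEN_QUERY, &token)) {
            printf("OpenProcessToken failed: %lu\n", GetLastError());
            return 1;
        }
        TOKEN_PRIVILEGES tp = {};
        tp.PrivilegeCount = 1;
        tp.Privileges[0].Attributes = SE_PRIVILEGE_ENABLED;
        if (!LookupPrivilegeValue(NULL, SE_LOCK_MEMORY_NAME, &tp.Privileges[0].Luid)) {
            printf("LookupPrivilegeValue failed: %lu\n", GetLastError());
            return 1;
        }
        // Needs the "Lock pages in memory" user right from the policy path above.
        AdjustTokenPrivileges(token, FALSE, &tp, 0, NULL, NULL);
        if (GetLastError() != ERROR_SUCCESS) {
            printf("AdjustTokenPrivileges: privilege not assigned (%lu)\n", GetLastError());
            return 1;
        }
        SIZE_T large = GetLargePageMinimum();          // typically 2 MB on x64
        void* p = VirtualAlloc(NULL, large,
                               MEM_RESERVE | MEM_COMMIT | MEM_LARGE_PAGES,
                               PAGE_READWRITE);
        printf("large page min = %zu, p = %p, err = %lu\n", large, p, GetLastError());
        if (p) VirtualFree(p, 0, MEM_RELEASE);
        CloseHandle(token);
        return 0;
    }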
    0 replies | 12 view(s)
  • Gotty's Avatar
    Today, 00:09
    :_good2: Interesting solution, indeed. I believe the time complexity is O(n), not O(1): you need n-1 multiplications, n-1 divisions, and n-1 modulus operations with the current implementation. Sorting just a few very small numbers puts this algorithm in a different category from the general sorting algorithms (like the mentioned quicksort), so saying that your method is faster than quicksort is kind of cheating. In the same way I could say that I have just implemented a sorting method even faster than yours: it sorts two numbers: if(a<b)printf("%d, %d",a,b); else printf("%d, %d",b,a); It is certainly faster than yours and uses no extra memory. ;-) >>Here the range is 0-50 and the number of items n is 7. Memory requirement in this case is... how much? To be fixed: "database" should be renamed to "array", "speed" to "time complexity", and "searching" to "lookup".
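    (For readers following the complexity argument: the sketch below is not urntme's Instant Sort, just a generic value-indexed placement over a bounded range. It makes the point above concrete: even with one "instant" placement per item, placing n items is still O(n) work, and memory grows with the value range, here 0-50.)
    #include <cstdio>

    int main() {
        const int RANGE = 51;                       // values 0..50, as in the example above
        int input[] = {17, 3, 42, 3, 0, 50, 9};     // n = 7
        const int n = sizeof(input) / sizeof(input[0]);

        int slots[RANGE] = {0};                     // memory: one counter per possible value
        for (int i = 0; i < n; ++i)                 // n placements -> O(n) work, not O(1)
            ++slots[input[i]];

        for (int v = 0; v < RANGE; ++v)             // reading out costs O(RANGE) on top of that
            while (slots[v]-- > 0)
                std::printf("%d ", v);
        std::printf("\n");                          // prints 0 3 3 9 17 42 50
        return 0;
    }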
    1 replies | 83 view(s)
  • algorithm's Avatar
    Yesterday, 23:40
    It is funny that he is not measuring gate length. You need to look from above to measure gate length (L). He is measuring something closer to gate width. Also notice that a transistor can have multiple fins to drop Rds and increase Ids.
    2 replies | 62 view(s)
  • JamesWasil's Avatar
    Yesterday, 22:22
    That was a really good video. Thanks for sharing it. I kind of figured that they were using the nomenclature for marketing rather than for the actual nm feature size they claimed after the early 2000s. This really put it into perspective with great detail. It's probably best to use the transistor count as a gauge of whether a chip is or isn't more compact than its predecessor, even with new technologies that are ahead of extreme ultraviolet lithography. You may have already seen this link, but if not, there is a good suggestion for that here, where they suggest using a combination of characteristics to make the metric more accurate again: https://spectrum.ieee.org/semiconductors/devices/a-better-way-to-measure-progress-in-semiconductors
    2 replies | 62 view(s)
  • fcorbelli's Avatar
    Yesterday, 21:22
    fcorbelli replied to a thread zpaq updates in Data Compression
    I apologize for this problem. I have just removed all (2020+) uploaded executables. I will not post compiled code anymore.
    2554 replies | 1105278 view(s)
  • Shelwien's Avatar
    Yesterday, 19:50
    Shelwien replied to a thread zpaq updates in Data Compression
    Thanks for the description, but hosting executables here is still risky - google has a lot of false positives and easily blocks sites in chrome.
    2554 replies | 1105278 view(s)
  • Shelwien's Avatar
    Yesterday, 19:43
    The main problem is that default irolz has a 256kb window (d18). You can look for ROLZ here: http://mattmahoney.net/dc/text.html
    > I've installed codeblocks with mingw, why I can not run debug?
    Maybe it needs to be compiled with debug options? http://wiki.codeblocks.org/index.php/Debugging_with_Code::Blocks
    43 replies | 2046 view(s)
  • Shelwien's Avatar
    Yesterday, 19:27
    https://www.virustotal.com/gui/file/bda1fb41d38429620596d0c73f0d9d8dcf94dd9ae63a3f763dc00959eadb1ba8/behavior Malware most likely. Script is obfuscated, so you'd need a debugger (MSE).
    2 replies | 46 view(s)
  • lz77's Avatar
    Yesterday, 19:12
    I found an article http://www.ezcodesample.com/rolz/rolz_article.html and saw some of his examples of iROLZ... http://www.ezcodesample.com/rolz/skeleton_irolz_2_dictionaries.txt on enwik8, output:
    === Original and compressed data sizes 100000000 50804922 Approximate ratio relative to original size 0.345271 ==
    Hm, 50 Mb is 34% of 100 Mb? Bad... http://www.ezcodesample.com/rolz/irolzstream.txt compresses ts40.txt to 42%, which is also bad... Maybe I looked at the wrong ROLZ sources? By the way: I've installed codeblocks with mingw, why can't I run the debugger? F8 etc. does not work...
    43 replies | 2046 view(s)
  • snowcat's Avatar
    Yesterday, 18:43
    I didn't see any statement, so maybe not. But I'm not really familiar with vbs, so... :) Note: This post is very off-topic. It should be in Off-Topic.
    2 replies | 46 view(s)
  • LawCounsels's Avatar
    Yesterday, 17:44
    Hi: https://drive.google.com/uc?id=1yJne2_x3uhOf0nrb8Di1qBPtM9Q-Qoz8&export=download Document password: 1320
    2 replies | 46 view(s)
  • Lithium Flower's Avatar
    Yesterday, 16:23
    Sorry, my English is not good. Hello, I compress a lot of non-photographic images (Japanese anime and manga): PNG RGB24 with mozjpeg lossy JPEG q95~q99, PNG RGBA32 with cwebp WebP near-lossless 60/80 and pingo lossy PNG pngfilter=100, and I get some problems with cwebp lossy WebP. I use butteraugli to check compressed image quality, but I have some questions about butteraugli distance and need some hints or suggestions, thank you very much. My image set is like this image set (tabs Anime, AW, Manga, Pixiv): https://docs.google.com/spreadsheets/d/1ju4q1WkaXT7WoxZINmQpf4ElgMD2VMlqeDN2DuZ6yJ8/edit#gid=2135584682 - thanks to Scope for providing this image set: https://encode.su/threads/2274-ECT-an-file-optimizer-with-fast-zopfli-like-deflate-compression?p=64829&viewfull=1#post64829
    1. butteraugli vs. butteraugli JPEG XL assessment difference
    I use butteraugli and butteraugli XL to check images. Per *Reference 01, butteraugli's XYB is likely more accurate, but for some images butteraugli reports a good distance (1.3) while butteraugli XL reports a bad one (near 2.0), and for some images butteraugli rejects what butteraugli XL rates as a good distance (1.3). How should I correctly interpret butteraugli distance and the butteraugli XL 3-norm?
    2. butteraugli safe area or great area
    For PNG RGBA32 images, my process is to first compress with near-lossless 60 and pngfilter=100; if the compressed image is not below a safe butteraugli distance, I compress again with near-lossless 80 (a sketch of automating this loop follows after this post). I collected Jyrki Alakuijala's comments into a table (*Reference 02): '1.0 ~ 1.3 definitely works as designed', '1.0 ~ 1.6 a value below 1.6 is great'. If I want my compressed images to have great quality, should I target the 1.0 ~ 1.3 range or the 1.0 ~ 1.6 range? If I made a mistake, please let me know.
    (pngfilter=100 butteraugli distance, webp near-lossless 60 butteraugli distance, near-lossless 60 => near-lossless 80)
    commands:
    pingo_rc3.exe -pngfilter=100 -noconversion -nosrgb -nodate -sa "%%A"
    cwebp.exe -mt -m 6 -af -near_lossless 60 -alpha_filter best -progress "%%A" -o "%%~nA.webp"
    3. non-photographic images and mozjpeg encoder JPEG quality
    For PNG RGB24 images, my process is to first compress with quality 95; if the compressed image is not below a safe butteraugli distance, I increase the quality and compress again. In my PNG RGB24 image set, butteraugli does not rate JPEG quality 95 with a good distance (JPEG quality 95 butteraugli distance, JPEG quality 95 butteraugli XL distance), but cjpeg's usage.txt says 'specifying a quality value above about 95 will increase the size of the compressed file dramatically, and while the quality gain from these higher quality values is measurable': https://github.com/mozilla/mozjpeg/blob/3fed7e016bb2a02a24799acd24030417cf4d6a6d/usage.txt#L115 If I compress non-photographic images to JPEG and want near psychovisually lossless results, is it necessary to use quality above 95 for those images? Or, per *Reference 03, is butteraugli possibly too sensitive on some non-photographic images?
    command: cjpeg.exe -optimize -progressive -quality 95 -quant-table 3 -sample 1x1 -outfile "mozjpeg\%%~nA.jpg" "%%A"
    4. webp lossy q100 and butteraugli distance
    I tested another non-photographic image set with WebP lossy q100, but some images get a larger butteraugli distance. Is it possible that WebP lossy 4:2:0 subsampling and fancy upsampling make larger errors in some areas? I also tested the WebP lossy alpha (alpha_q) feature; it increases the butteraugli distance, but I don't understand why lossy alpha affects butteraugli distance:
    q100.png 2.013666
    q100_lossy_alpha 80.png 2.035022
    q100_lossy_alpha 50.png 2.099735
    (webp lossy q100 butteraugli distance, dssim)
    command: cwebp.exe -mt -m 6 -q 100 -sharp_yuv -pre 4 -af -alpha_filter best -progress "%%A" -o "%%~nA.webp"
    I am creating some data tables and quality test data and will upload them later, thank you very much. 2d art bg png file: https://mega.nz/file/FDBHmYjT#0EruxqhmJGZ4xKLh4tcgMGl_tgn1aV8FcTfPuFBEGmg
    Reference Area
    *Reference 01, from Jyrki Alakuijala's comments (butteraugli vs. butteraugli JPEG XL): butteraugli's XYB is likely more accurate, because of asymptotic log behaviour for high intensity values (instead of raising to a power); JPEG XL's XYB modeling is going to be substantially faster to compute, because gamma is exactly three there.
    *Reference 02, from Jyrki Alakuijala's comments:
    0.6 ~ 0.7 // most critical use
    1.0 // normal use
    1.0 ~ 1.3 // definitely works as designed
    1.0 ~ 1.6 // a value below 1.6 is great
    1.6 ~ 2.1 // a value below 2.1 is okayish
    2.1+ // above 2.1 there is likely a noticeable artefact in an in-place flip test
    2.5+ // not necessarily psychovisually relevant and fair
    4.0+ // at larger errors (4.0+) butteraugli becomes less useful: current versions only extrapolate these values as multiples of the just-noticeable difference, but the human visual system is highly non-linear and large extrapolation doesn't bring much value
    https://github.com/google/butteraugli/issues/22
    *Reference 03, from Jyrki Alakuijala's comments: butteraugli is a lot more sensitive to lines (displaced, emerging or removed) than any other visual measure.
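    (A minimal sketch of the quality-escalation loop described in point 2, for anyone wanting to automate it; this is not Lithium Flower's actual script. It assumes cwebp, dwebp and the butteraugli command-line tool are on PATH, that butteraugli prints a single distance value for two PNG/JPEG inputs, and it uses hypothetical file names; the 1.3 threshold is the "works as designed" band from Reference 02.)
    #include <cstdio>
    #include <cstdlib>
    #include <initializer_list>
    #include <string>

    // Run a command and read the first number it prints (here: the butteraugli distance).
    static double run_and_read(const std::string& cmd) {
        FILE* p = _popen(cmd.c_str(), "r");              // popen() on non-Windows systems
        double value = -1.0;
        if (p) { if (std::fscanf(p, "%lf", &value) != 1) value = -1.0; _pclose(p); }
        return value;
    }

    int main() {
        const std::string src = "input.png";             // hypothetical file name
        const double safe = 1.3;                         // target butteraugli distance
        for (int nl : {60, 80}) {                        // escalate near_lossless until good enough
            std::string enc = "cwebp -mt -m 6 -af -near_lossless " + std::to_string(nl)
                            + " -alpha_filter best " + src + " -o out.webp";
            std::system(enc.c_str());
            std::system("dwebp out.webp -o out.png");    // decode back to PNG for measurement
            double d = run_and_read("butteraugli " + src + " out.png");
            std::printf("near_lossless %d -> butteraugli %.6f\n", nl, d);
            if (d >= 0.0 && d <= safe) break;            // below the target distance: stop
        }
        return 0;
    }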
    0 replies | 84 view(s)
  • fcorbelli's Avatar
    Yesterday, 16:18
    fcorbelli replied to a thread zpaq updates in Data Compression
    1) This is not a virus (but an MPRESS-packed file with some .EXEs as resources, as usual for Delphi code) dating from 2013 (www.francocorbelli.it/pakka); you can see here http://www.francocorbelli.it/nuovosito/vari.html
    In other words, it is a monolithic EXE that extracts executable programs from its resources (into the %temp%\pkz folder) so as not to depend on anything and to be portable. However, since the directly linked programs (custom zpaq executables) may be obsolete, it has a mechanism that offers to download them directly (Windows-update style), useful for debugging:
    zpaq       RCDATA zpaqfranz.exe
    zpaq32     RCDATA zpaqfranz32.exe
    testa      RCDATA testa.exe
    sortone    RCDATA sortone.exe
    sortone32  RCDATA sortone32.exe
    codepage   RCDATA codepage.exe
    Those are:
    - my zpaq.cpp patched for 64 bit (source already posted, here it is http://www.francocorbelli.it/pakka/zpaqfranz/)
    - my zpaq.cpp patched for 32 bit (source already posted)
    - my "head"-like software (to refresh the file size while zpaq is running)
    - sortone, my Delphi 64-bit special sorter (for zpaq output)
    - sortone32, the same for 32 bit
    - codepage, my C software to set codepage UTF-8 in Windows's shell (to restore UTF-8 filenames)

    program testa;
    {$APPTYPE CONSOLE}
    uses SysUtils, Classes;
    var
      filename: string;
      Stream: TFileStream;
      Value: char;

    function prendiDimensioneFile(i_nomefile: string): int64;
    var
      F: file of byte;
      SearchRec: TSearchRec;
    begin
      Result := 0;
      if FindFirst(i_nomefile, faAnyFile, SearchRec) = 0 then // if found
        Result := Int64(SearchRec.FindData.nFileSizeHigh) shl Int64(32) + // calculate the size
                  Int64(SearchRec.FindData.nFileSizeLow);
      SysUtils.FindClose(SearchRec);
    end;

    begin
      { TODO -oUser -cConsole Main : Insert code here }
      if ParamCount <> 1 then
      begin
        Writeln('Testa 1.0 - by Franco Corbelli');
        Writeln('Need just 1 parameter (filename)');
        Exit;
      end;
      filename := ParamStr(1);
      if not FileExists(filename) then
      begin
        Writeln('File name does not exists ' + filename);
        Exit;
      end;
      Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyNone);
      try
        Stream.ReadBuffer(Value, SizeOf(Value)); // read a 4 byte integer
        /// writeln('#1 '+inttostr(Integer(value)));
      except
        /// writeln('Except');
      end;
      Stream.Free;
      Writeln(prendiDimensioneFile(filename));
    end.

    program sortone;
    {$APPTYPE CONSOLE}
    {$R *.res}
    {$I defines.inc}
    uses System.Classes, System.SysUtils;
    var
      gf_start: integer;
      gf_version: integer;

    function miacompara(List: TStringList; Index1, Index2: Integer): Integer;
    var
      i: integer;
      s1, s2: string;
    begin
      s1 := List[Index1].Substring(gf_start); //+List[Index1].Substring(42,4);
      s2 := List[Index2].Substring(gf_start); //+List[Index2].Substring(42,4);
      if s1 = s2 then
      begin
        /// stessa porzione, sortiamo per parte iniziale (same portion: sort by the initial part)
        s1 := List[Index1].Substring(gf_version);
        s2 := List[Index2].Substring(gf_version);
        if s1 = s2 then
          result := 0
        else if s1 < s2 then
          result := -1
        else
          result := 1;
      end
      else if s1 < s2 then
        result := -1
      else
        result := 1;
    end;

    var
      sl: tstringlist;
      inizio: tdatetime;
      i: integer;
      filename: string;
      outputfile: string;
      totale: tdatetime;
    begin
      try
        { TODO -oUser -cConsole Main : Insert code here }
      except
        on E: Exception do
          Writeln(E.ClassName, ': ', E.Message);
      end;
      gf_start := 0;
      if paramcount <> 4 then
      begin
        writeln('Sortone V1.1 - 64 bit');
        writeln('4 parameters filename version path outputfile');
        writeln('Example z:\1.txt 42 47 z:\2.txt');
        exit;
      end;
      filename := paramstr(1);
      if not fileexists(filename) then
      begin
        writeln('File not found ' + filename);
        exit;
      end;
      try
        gf_version := strtointdef(paramstr(2), 0);
      finally
      end;
      if gf_version = 0 then
      begin
        writeln('Strange version start');
        exit;
      end;
      try
        gf_start := strtointdef(paramstr(3), 0);
      finally
      end;
      if gf_start = 0 then
      begin
        writeln('Strange column start');
        exit;
      end;
      outputfile := paramstr(4);
      if fileexists(outputfile) then
        deletefile(outputfile);
      if fileexists(outputfile) then
      begin
        writeln('We have a immortal ' + outputfile);
        exit;
      end;
      sl := tstringlist.create;
      totale := now;
      inizio := now;
      writeln(timetostr(now) + ' load/column ' + inttostr(gf_start));
      sl.loadfromfile(filename);
      writeln(timetostr(now) + ' end load in ' + floattostr((now - inizio) * 100000));
      inizio := now;
      writeln(timetostr(now) + ' purge');
      for i := sl.Count - 1 downto 0 do
      begin
        if sl[i] = '' then
          sl.Delete(i)
        else if sl[i][1] <> '-' then // keep only lines starting with '-' (the zpaq listing rows)
          sl.Delete(i);
      end;
      writeln(timetostr(now) + ' end purge in ' + floattostr((now - inizio) * 100000));
      writeln(timetostr(now) + ' lines/sort ' + inttostr(sl.Count - 1));
      inizio := now;
      sl.CustomSort(miacompara);
      writeln(timetostr(now) + ' end sort in ' + floattostr((now - inizio) * 100000));
      inizio := now;
      sl.SaveToFile(outputfile);
      writeln(timetostr(now) + ' end save in ' + floattostr((now - inizio) * 100000));
      writeln(timetostr(now) + ' total time ' + floattostr((now - totale) * 100000));
    end.

    /* gcc -O3 codepage.c -o codepage.exe */
    #include <stdio.h>
    #include <windows.h>
    #define str_to_int(str) strtoul(str, (TCHAR **) NULL, 10)
    int main(int argc, char *argv[])
    {
      UINT in_cp;
      UINT out_cp;
      in_cp = 65001;
      out_cp = 65001;
      SetConsoleCP(in_cp);
      SetConsoleOutputCP(out_cp);
      in_cp = GetConsoleCP();
      out_cp = GetConsoleOutputCP();
      printf("CodePage in=%u out=%u\n", in_cp, out_cp);
      return 0;
    }

    2) As previously stated, it's a form (a Delphi form made into a separate EXE with $ifdef and so on) of my little ERP (with its own commercial license), this one: http://www.francocorbelli.it/nuovosito/zarc.html In this case, of course, I have stripped the "real" time license (briefly: if you do not pay every year, you do not get updates) for a free one that is "always good". If it's a problem, I can modify the code to turn it off (a lot of $ifdef required, but doable).
    As you can see, on the first run you can download the updates directly from my site:
    if frmMainPakka.GetInetFile('http://www.francocorbelli.it/jita/' + extractfilename(i_nomefile), filetemp) then
      Result := CopyFile(PChar(filetemp), PChar(i_nomefile), false);
    Obviously in this case I could log the applicant's IP, maybe from the web server. But who cares? Should I turn it off altogether?
    3) These are the current builds with and without MPRESS:
    http://www.francocorbelli.it/pakka/mpressed.exe SHA1 d46996e1d265a94ea3f0e439d2de3328db71135a
    http://www.francocorbelli.it/pakka/not-mpressed.exe SHA1 3294a876f43be64d0a9567dbf39a873bab27e850
    4) In the Delphi code there is a function that heuristic antivirus maybe does not like very much. It is a "kill every file matching a filemask in a folder":
    procedure SterminaFileConMaschera(i_path: string; i_maschera: string);
    var
      elencofile: tstringlist;
      i: integer;
      nomefile: TStringList;
    begin
      if i_path = '' then exit;
      if not saggiascrivibilitacartella(i_path) then exit;
      elencofile := tstringlist.create;
      nomefile := tstringlist.create;
      enumerafile(i_path, i_maschera, elencoFile, nomefile);
      for i := 0 to elencofile.count - 1 do
        if fileexists(elencofile.strings[i]) then
        begin
          /// toglireadonly(elencofile.strings[i]);
          deletefile(pchar(elencofile.strings[i]));
        end;
      elencofile.free;
      nomefile.free;
    end;
    And it is used like this:
    procedure TfrmMainPakka.stermina;
    begin
      SterminaFileConMaschera(GetTempDirectory, '*.txt');
      SterminaFileConMaschera(GetTempDirectory, '*.bat');
      SterminaFileConMaschera(GetTempDirectory, '*.bin');
    end;
    In fact, it deletes the temporary files extracted into %temp%\pkz.
    5) There is a "strange" function too (from the virus heuristic detector's point of view), a "language checker":
    function g_getLanguage: string;
    var
      wLang: LangID;
      szLang: Array of Char;
    begin
      wLang := GetSystemDefaultLCID;
      VerLanguageName(wLang, szLang, SizeOf(szLang));
      Result := szLang;
    end;

    function isItaliano: boolean;
    begin
      result := pos('italia', lowercase(g_getlanguage)) > 0
    end;
    By default it turns on Italian strings if... running on an Italian Windows. Otherwise it turns on English (not fully translated, in fact; it's heavy and boring).
    6) If you run as admin, you get another "risky" function: it registers the ZPAQ extension to the software, so you can double-click and open the file directly. Maybe the Google antivirus does not like that very much, I do not know. I hope I've been exhaustive.
    2554 replies | 1105278 view(s)
  • paleski's Avatar
    Yesterday, 11:09
    paleski replied to a thread Kraken compressor in Data Compression
    PlayStation 5 IO System to Be ‘Supercharged’ by Oodle Texture, Bandwidth Goes Up to 17.38GB/s https://wccftech.com/ps5-io-system-to-be-supercharged-by-oodle-texture-bandwidth-goes-up-to-17-38gb-s/
    50 replies | 26220 view(s)
  • urntme's Avatar
    Yesterday, 10:15
    Hello Encode.su community members! My name is Ananth Gonibeed and my username is urntme on this forum. I previously posted in the data compression section of this forum. Following some messages by other members of this community on other threads that I posted in, I decided to try and figure out how to code. However, instead of trying to code the data compression algorithm I mentioned in those threads, I decided to try something much simpler and much more accessible to my coding level for a first attempt. I decided to try and code my "Instant sort" sorting algorithm. So this sorting algorithm is called "Instant sort" and it basically instantly sorts numbers. It is an algorithm that I came up with. It has a time complexity of O(1) and it accomplishes this because it's a new and different type of sorting algorithm. You can read all about it in the attached paper. I have attached a couple of things to the zip archive attached to this thread:
    1) The paper describing "Instant Sort", the algorithm, and the basic concept around it.
    2) The executable you can run to test out a very, very primitive version of the algorithm.
    3) A document explaining how the code for the very, very primitive version of the algorithm works.
    4) The source code for the program I created.
    I used C++ and the syntax was a bit unfamiliar to me at first, but the thing works the way it's supposed to, so I can't complain. This is version 1 and I will probably build upon it further over time once I figure out how to do more complex versions of it. Let me know your thoughts and anything you want to say. Kudos, Ananth. P.S: You can contact me at this email id: ananth.gonibeed@gmail.com if you want to contact me personally for any reason.
    1 replies | 83 view(s)
  • suryakandau@yahoo.co.id's Avatar
    Yesterday, 08:52
    This .rar file contains the source code and the binary file.
    37 replies | 2209 view(s)
  • suryakandau@yahoo.co.id's Avatar
    Yesterday, 02:22
    In Windows, you can use drag and drop to compress or decompress files.
    37 replies | 2209 view(s)
  • Shelwien's Avatar
    Yesterday, 02:13
    https://youtu.be/1kQUXpZpLXI?t=784
    2 replies | 62 view(s)
  • suryakandau@yahoo.co.id's Avatar
    Yesterday, 02:08
    Fp8sk17 - improved 24-bit image compression ratio.
    astro-01.pnm (GDCC test file):
    fp8sk16: Total 8997713 bytes compressed to 4612625 bytes in 103.97 sec
    fp8sk17: Total 8997713 bytes compressed to 4509505 bytes in 106.42 sec
    37 replies | 2209 view(s)
  • Shelwien's Avatar
    24th September 2020, 23:42
    Shelwien replied to a thread zpaq updates in Data Compression
    Google complained, had to remove attachment from this post: https://www.virustotal.com/gui/file/d67b227c8ae3ea05ea559f709a088d76c24e024cfc050b5e97ce77802769212c/details Also it does seem to have some inconvenient things, like accessing "https://www.francocorbelli.com:443/konta.php?licenza=PAKKA&versione=PAKKA.DEMO".
    2554 replies | 1105278 view(s)
  • Shelwien's Avatar
    24th September 2020, 19:29
    http://imagecompression.info/test_images/ http://imagecompression.info/gralic/LPCB-data.html
    2 replies | 150 view(s)
  • Adreitz's Avatar
    24th September 2020, 18:53
    Hello. First post here. I've been lurking for a long time due to my enthusiast interest in lossless image compression, starting with PNG, then WebP, and now I'm playing with JPEG XL. My reason for creating my account was because I don't understand something fundamental about the use of the JPEG XL lossless compressor. So this question will be for either Jyrki or Jon.
    I've been experimenting with Jamaika's build of release 6b5144cb of cjpegXL with the aim of maximum lossless compression. (I tried building myself with Visual Studio 2019 in Windows 10, but was unsuccessful as it couldn't understand a function that apparently should be built-in. I don't know enough about programming, Visual Studio, or Windows to figure it out.) The issue that I'm encountering is that, for some images, fewer flags are better and I don't understand why. Take, for instance, the "artificial" image from the 8-bit RGB corpus at http://imagecompression.info/test_images/. Using the specified release of cjpegXL above, I reach a compressed file size of 808110 bytes with simply running cjpegxl.exe -q 100 -s 9. However, when I brute-force all combinations of color type and predictor to find the optimal, the best I can get is cjpegxl.exe -q 100 -s 9 -C 12 -P 5, which outputs a file of 881747 bytes.
    I figured I must be missing something, so I tried experimenting with all of the other documented command line flags, but didn't get any improvement. So then I went searching and ended up finding five undocumented flags: -E, -I, -c, -X, and -Y. I don't know what they do beyond affecting the resulting file size, and my only knowledge of their arguments is experimental. My best guess is the following:
    E -- 0, 1, or 2
    I -- 0 or 1
    c -- 0, 1, or 2 (but 2 produces invalid files)
    X -- takes any positive integer, but only seems to vary the output from approx. 85 to 100 (with many file size repeats in this range)
    Y -- similar to X, but ranging from approx. 65 to 100.
    I also discovered that -C accepts values up to 75 (though most, but not all, arguments above 37 produce invalid output) and -P also accepts 8 and 9 as arguments (which produce valid output and distinct file sizes compared to all documented predictors, and are even better than the defined predictors for certain files). Even with all of this, though, my best valid result from tweaking all of the flags I could access is 830368 bytes from cjpegXL -q 100 -s 9 -C 19 -P 5 -E 2 -I 1 -c 1 -X 99 -Y 97, which is still 21 KB greater than when I simply use -q 100 -s 9.
    So, what's going on here? From using libpng and cwebp, I am used to compressors that use heuristics to set default values to flags if they are not specified by the user (and therefore you get a compression benefit if you spend the effort to manually find the best settings). But that doesn't seem to be the case with cjpegXL. What am I missing? Also, it would be great if you could provide an official description of what the undocumented flags do and what arguments they take. Thanks, Aaron
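    (Since brute-forcing comes up above, here is a hedged sketch of such a search loop; it is not an official JPEG XL tool. It shells out to cjpegxl.exe with the same -q/-s/-C/-P flags mentioned in the post, assumes the input and output files are passed positionally, and keeps the smallest result.)
    #include <cstdint>
    #include <cstdio>
    #include <cstdlib>
    #include <filesystem>
    #include <string>

    int main() {
        const std::string src = "artificial.png";                 // hypothetical input name
        std::uintmax_t best_size = static_cast<std::uintmax_t>(-1);
        int best_C = -1, best_P = -1;
        for (int C = 0; C <= 37; ++C) {                           // color types reported valid above
            for (int P = 0; P <= 9; ++P) {                        // documented predictors plus 8 and 9
                std::string cmd = "cjpegxl.exe -q 100 -s 9 -C " + std::to_string(C)
                                + " -P " + std::to_string(P) + " " + src + " out.jxl";
                if (std::system(cmd.c_str()) != 0) continue;      // skip combinations that fail outright
                std::error_code ec;
                auto size = std::filesystem::file_size("out.jxl", ec);
                // NOTE: a real script should also decode out.jxl and verify it, since some
                // combinations reportedly produce invalid files while still writing output.
                if (!ec && size < best_size) { best_size = size; best_C = C; best_P = P; }
            }
        }
        std::printf("best: -C %d -P %d -> %ju bytes\n", best_C, best_P, best_size);
        return 0;
    }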
    133 replies | 9784 view(s)
  • suryakandau@yahoo.co.id's Avatar
    24th September 2020, 16:35
    I mean the lossless photo compression benchmark...
    2 replies | 150 view(s)
  • Sportman's Avatar
    24th September 2020, 12:39
    Fixed.
    110 replies | 11438 view(s)
  • Piotr Tarsa's Avatar
    24th September 2020, 08:14
    Nuvia doesn't even have any timeline on when their servers will hit the market, and it seems it could take them 2+ years to do so, so they need a high IPC jump versus at least Intel Skylake. In the meantime the landscape is changing:
    - Intel released laptop Tiger Lake, which is basically laptop Ice Lake with much higher frequencies (there's a small IPC change, mostly due to beefier caches), nearly 5 GHz. This means Intel at least figured out how to clock their 10nm high, but since laptop Tiger Lake is still limited to max quad core, it seems that yield is still poor.
    - Arm has prepared two new cores: V1 for HPC workloads (2 x 256-bit SIMD) and N2 for business apps (2 x 128-bit SIMD): https://fuse.wikichip.org/news/4564/arm-updates-its-neoverse-roadmap-new-bfloat16-sve-support/ https://www.anandtech.com/show/16073/arm-announces-neoverse-v1-n2 The IPC jump is quite big, but it remains to be seen when the servers will hit the market, as it previously took much time for Neoverse N1 to become available after the announcement. At least those are SVE (Scalable Vector Extension) enabled cores (both V1 and N2), so apps can finally be optimized using a decent SIMD ISA, comparable to AVX (AVX512 probably has more features than SVE1, but SVE is automatically scalable without the need for recompilation).
    - Apple already presented the iPad Air 2020 with the Apple A14 Bionic 5nm SoC, but the promised performance increase over A13 seems to be small. I haven't found a reliable source mentioning Apple A14 clocks, so maybe they kept them constant to reduce power draw in mobile devices like iPad and iPhone? Right now there are people selling water cooling cases for iPhone (WTF?): https://a.aliexpress.com/_mtfZamJ
    - Oracle will offer ARM servers in their cloud next year: https://www.anandtech.com/show/16100/oracle-announces-upcoming-cloud-compute-instances-ice-lake-and-milan-a100-and-altra and IIRC they say they will compete on price.
    15 replies | 1222 view(s)
  • suryakandau@yahoo.co.id's Avatar
    24th September 2020, 01:58
    Where can I get the LPCB test files?
    2 replies | 150 view(s)
  • Jyrki Alakuijala's Avatar
    24th September 2020, 01:52
    I fully agree with that. We are working on getting some improvement in this domain. We have very recently made a significant (~20 %) improvement in the modular mode by defaulting to butteraugli's XYB colorspace there, too. Not sure if it is published yet; it could be this week or next Monday. We are making the number of iterations of the final filtering configurable (0, 1, 2, or 3 iterations), allowing for some more variety in the compromise between smoothness and artefacts.
    133 replies | 9784 view(s)
  • Raphael Canut's Avatar
    23rd September 2020, 23:43
    Wavelets have advantages and drawbacks. Personally I find that the DCT block-based codecs have precision and NHW has neatness, also due to the fact that there is no deblocking in NHW, and so it's a choice. Personally, I prefer neatness to precision; for me it's visually more pleasant, but that's only my taste... The other advantage of NHW is that it's a lot (and a lot) faster to encode and decode. Cheers, Raphael
    195 replies | 22474 view(s)
  • Scope's Avatar
    23rd September 2020, 23:38
    Yes, I notice that many people compare codecs at low bpp, and if a codec is visually more acceptable there, it is also believed to be more efficient at higher bpp. Low quality is also quite in demand: especially with the spread of AVIF, people try to compress images as much as possible to reduce page size, where the accuracy of these images is not so important. According to my tests, the problem at low bpp for JPEG XL (VarDCT) is images with clear lines, line art, graphics/diagrams and the like; there, artifacts and distortions of these lines or loss of sharpness become visible very quickly, and in modular mode there is noticeable pixelization and also loss of sharpness. On such images AVIF has strong advantages. If it were possible to give priority to preserving contours and lines with more aggressive filtering, or to select a preset (in WebP such presets sometimes helped), that would be good.
    133 replies | 9784 view(s)
  • fabiorug's Avatar
    23rd September 2020, 23:28
    So the image can't be divided into separate parts to perform the quality measure; it won't be as efficient as JPEG XL or as good for video or for web images like design/fashion. I understood that.
    195 replies | 22474 view(s)
  • Raphael Canut's Avatar
    23rd September 2020, 23:26
    Yes, it's very difficult to perform block processing for wavelets, because wavelets are FIR filters and so have a transient response, and from my experience this transient response is quite dirty with wavelets, so it would cause noticeable artifacts at block boundaries... But I can be wrong. Yes, there is no block processing, and so the quantization parameters, for example, are the same for the whole image, hence the importance of a good psychovisual optimization. The advantage of this is that you don't have deblocking artifacts, for example...
    195 replies | 22474 view(s)
  • Jyrki Alakuijala's Avatar
    23rd September 2020, 23:21
    I suspect there is some more sharpness and preservation of textures. The improvements are encoder-only, so it will be possible to go back to less sharpness later. I think on Sep 07 we didn't yet have the filtering control field in use. Now we turn off filtering in flat areas, making it easier to preserve fine texture. We control the filtering using an 8x8 grid, so we may now filter non-uniformly across a larger integral transform (such as a 32x32 DCT).
    133 replies | 9784 view(s)
  • Jyrki Alakuijala's Avatar
    23rd September 2020, 23:16
    Nooo ;-). I promise we didn't put the big technologies into it :-D More seriously, JPEG XL is a small underdog effort when compared to the AOM industry coalition. We have built PIK/JPEG XL during 5.5 years, but mostly with 2-5 engineers (of course, Alex Rhatushnyak and Jon Sneyers brought their expertise, too). We tried to keep our eyes open for new approaches and produced intermediate milestones (webp lossless, webp delta-palettization, webp near-lossless, ZopfliPNG, guetzli, butteraugli, brunsli, knusperli) to check that we are not lost. Alex and Jon had done previously the same with QLIC, GRALIC, SSIMULACRA, FLIF and FUIF.
    133 replies | 9784 view(s)
  • fabiorug's Avatar
    23rd September 2020, 23:14
    And how does a distance of 2.469 to 1, or 2.24 at -s 8, look? Are there really improvements? How is the progress these days?
    133 replies | 9784 view(s)
  • fabiorug's Avatar
    23rd September 2020, 23:11
    So a wavelet codec or a wavelet denoiser can't process pixels in blocks? If you want a pixel to have higher quality, or a different quantization, different settings, a different -l, isn't that possible? Is there a setting for it, or can't it be done?
    195 replies | 22474 view(s)
  • Raphael Canut's Avatar
    23rd September 2020, 23:05
    Hi,
    > does your codecs can support qualities from 1 to 100
    For now, there are 23 quality settings for nhw; I still need to code high quality and extreme compression, and with a finer compression step, maybe we can reach 100 quality settings.
    > can increase quality, decrease quality from pixels, without dividing in colors like wavelet do?
    Could you explain/give us more detail about what you mean here, as I don't get it for now...
    > Can you give information of wavelets and qualities from 1 to 100?
    There are many different algorithms for wavelet compression. For example in NHW there are 2 quantizations: one applied directly on the spatial YUV pixels right after colorspace conversion, and a second quantization on the wavelet coefficients after the transform, which is a uniform scalar quantization with deadzone (and with some refinements...).
    > Also, can wavelet codecs be written in Rust?
    Yes, I think so.
    > raphael canut i want to use only one codec. You may get wavelet to perform like jpeg xl.
    That's a lot of work to make a fully professional NHW codec - full-time work - and so I am searching for a sponsor for that... Cheers, Raphael
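    (For readers unfamiliar with the term: a generic sketch of a uniform scalar quantizer with a deadzone, the kind of second-stage quantization Raphael mentions; NHW's actual step sizes and refinements are not reproduced here.)
    #include <cmath>
    #include <cstdio>
    #include <initializer_list>

    // Quantize: the zero bin spans (-delta, +delta), i.e. it is twice as wide as the
    // other bins, so small coefficients are discarded entirely.
    int quantize(float c, float delta) {
        int q = static_cast<int>(std::floor(std::fabs(c) / delta));
        return (c < 0.0f) ? -q : q;
    }

    // Dequantize: reconstruct at the middle of the selected bin.
    float dequantize(int q, float delta) {
        if (q == 0) return 0.0f;
        float v = (std::fabs(static_cast<float>(q)) + 0.5f) * delta;
        return (q < 0) ? -v : v;
    }

    int main() {
        const float delta = 8.0f;                        // illustrative step size
        for (float c : {-21.0f, -6.5f, 0.9f, 7.9f, 13.2f, 40.0f})
            std::printf("c=%6.1f -> q=%3d -> c'=%6.1f\n",
                        c, quantize(c, delta), dequantize(quantize(c, delta), delta));
        return 0;
    }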
    195 replies | 22474 view(s)
  • Jyrki Alakuijala's Avatar
    23rd September 2020, 22:55
    We are now looking into improving density at lowest qualities. These lowest bitrates can be important for bloggers/image compression influencers wanting to demonstrate compression artefacts, but are not used in actual day-to-day use.
    133 replies | 9784 view(s)
  • fabiorug's Avatar
    23rd September 2020, 22:36
    raphael canut I would like to use only one codec. You may get wavelet to perform like jpeg xl.
    195 replies | 22474 view(s)
  • Raphael Canut's Avatar
    23rd September 2020, 22:32
    From what I've read, it's AV2 that will be based on neural networks, much more so than AV1, which however was the first to introduce neural/machine-learning processing.
    133 replies | 9784 view(s)
  • Raphael Canut's Avatar
    23rd September 2020, 22:26
    Hi Jyrki, Sounds interesting! I didn't try it, but it could be interesting to give more neatness to JPEG XL with NHW pre-processing; I even think that the files will then be smaller, and furthermore NHW compress/uncompress is really very fast!... Many thanks. Cheers, Raphael
    133 replies | 9784 view(s)
  • fabiorug's Avatar
    23rd September 2020, 22:13
    Hi, can your codec support qualities from 1 to 100, or increase/decrease the quality of pixels, without dividing by colors like wavelets do? Some years ago I tested a wavelet denoiser in GIMP and it divided the pixels based on their colors, so it couldn't separate individual pixels into tiles or blocks and select a quality or strength. Can you give some information about wavelets and qualities from 1 to 100? Maybe that is also a different technology with no success; it's also that a video codec requires control of individual pixels, you can't just reconstruct colors from brightness (a wavelet isn't AV1, it's a bit different, it doesn't work like AV1), but anyway I do not understand wavelets. Also, can wavelet codecs be written in Rust?
    195 replies | 22474 view(s)
  • fabiorug's Avatar
    23rd September 2020, 22:10
    AV3 will likely be based on WebP v2 or neural networks. JPEG XL has big technologies from Google, but that doesn't guarantee that AOM will be interested. AOM ≠ Google. Also, I have a point I will ask you about in your thread; wait.
    133 replies | 9784 view(s)
  • Jyrki Alakuijala's Avatar
    23rd September 2020, 22:03
    You could try to get both the neatness of NHW and the efficiency of JPEG XL by: 1. first compress and uncompress with NHW, and get a new kind of neat image. 2. compress the neat image with JPEG XL.
    133 replies | 9784 view(s)
  • fcorbelli's Avatar
    23rd September 2020, 20:36
    fcorbelli replied to a thread zpaq updates in Data Compression
    After a bit of digging, the answer for this piece of code...
    string append_path(string a, string b) {
      int na=a.size();
      int nb=b.size();
    #ifndef unix
      if (nb>1 && b[1]==':') {  // remove : from drive letter
        if (nb>2 && b[2]!='/') b[1]='/';
        else b=b[0]+b.substr(2), --nb;
      }
    #endif
      if (nb>0 && b[0]=='/') b=b.substr(1);
      if (na>0 && a[na-1]=='/') a=a.substr(0, na-1);
      return a+"/"+b;
    }
    ...is the extraction of all versions, with x -all, to create different paths numbered progressively. I think I will correct my "franz28" version of zpaq in the future to support this functionality. I am adding an updated version of PAKKA which I am testing with rather large archives (5M+ files, 700GB+ size): a command to extract from version x to y, check instead of extract, test everything. Improved UTF-8 filename list, with double click (in the log tab) to search directly (~4/8 seconds for a ~2M node tree).
    2554 replies | 1105278 view(s)
  • Mauro Vezzosi's Avatar
    23rd September 2020, 19:21
    Wrong order:
    2,872,160,117 bytes, 240.566 sec. - 20.319 sec., brotli -q 5 --large_window=30 (v1.0.7)
    2,915,934,603 bytes, 102.544 sec. - 13.302 sec., zstd -6 --ultra --single-thread (v1.4.4)
    2,915,934,603 bytes, 103.971 sec. - 12.798 sec., zstd -6 --ultra --single-thread (v1.4.5)
    2,812,779,013 bytes, 412.488 sec. - 64.311 sec., 7z -t7z -mx3 -mmt1 (v19.02)
    110 replies | 11438 view(s)
  • Raphael Canut's Avatar
    23rd September 2020, 17:28
    Hi Scope, Thank you very much for testing NHW and for this very interesting image comparison with JPEG XL. On my computer screen, this confirms my tests, that is to say that I find that NHW has more neatness and JPEG XL has more precision. I just wanted to note, in case this demo makes others want to try NHW, that the new entropy coding schemes of NHW are not optimized for now; I have quick ideas to improve them, and so we can save on average 2.5KB per .nhw compressed file, and even more with the Chroma-from-Luma technique for example. Many thanks again. Cheers, Raphael
    133 replies | 9784 view(s)
  • Shelwien's Avatar
    23rd September 2020, 17:10
    > After preprocessing TS40.txt by my preprocessor my compressor compresses it on 16Mb better, is it great?
    Seems reasonable:
    100,000,000 enwik8
    61,045,002 enwik8.mcm // mcm -store
    25,340,456 enwik8.zst // zstd -22
    24,007,111 enwik8.mcm.zst // zstd -22
    > Did I correctly understand that these are the disadvantages of the ROLZ algorithm?
    No, ROLZ matchfinder can be exactly the same as LZ77 one - the only required (to be called ROLZ) difference is encoding of match rank (during search) instead of distance/position, thus forcing decoder to also run the matchfinder.
    43 replies | 2046 view(s)
  • no404error's Avatar
    23rd September 2020, 15:43
    no404error replied to a thread FileOptimizer in Data Compression
    I was unexpectedly kicked out of many communities for plagiarism. Upon looking into it, I realized that this is due to the mention of my old nickname in your changelog. I think you should change the changelog. I never said that any of the tools you use were written by me. I only recommended to you what I was familiar with, mainly from the demoscene. PCXLite was written by Sol, as were some others in your set. Sol's webpage: https://sol.gfxile.net/ /Thanks
    664 replies | 202535 view(s)
  • Scope's Avatar
    23rd September 2020, 14:56
    I made a comparison for myself with NHW back when the first public version of JPEG XL was released. The main problem for normal testing by enthusiasts like me is the 512x512 resolution limitation: I have to either resize large images, split them into tiles, or compare only small images. Ready-made formats also require saving additional data for the image (container, structure, metadata, etc.), and this gives some advantage to experimental, unfinished formats with only raw data, especially on very small images. Here's another small visual quick comparison on 512x512 images with the latest available versions of JPEG XL and NHW. For NHW I chose -l5 (at higher compression I already see unacceptable distortions and loss of detail in many images); JPEG XL was then encoded to the same size with -s 8 settings (VarDCT; also, using a faster speed doesn't always make it worse). 12 images that can be switched with the keyboard up and down arrows, and the images themselves with the number keys; the first is the original, the second NHW, the third JXL. https://slow.pics/c/ivA8aKHO
    133 replies | 9784 view(s)
  • lz77's Avatar
    23rd September 2020, 10:58
    After preprocessing TS40.txt with my preprocessor, my compressor compresses it 16 MB better - is that great? Yesterday I thought about ROLZ twice: 1. The compressor can calculate hashes with bytes (for example, with cdef) which are located to the right of the current position: abcd|efgh, but the decompressor can't: abcd|????. Are matches near the current position impossible in ROLZ? 2. At the beginning of the data, classic LZ can compress abcd but ROLZ can't: abcdEabcdFabcd...Eabcd... The first appearance of abcd has no predecessor char, the second doesn't have the right one, and only the second Eabcd will match... Did I understand correctly that these are the disadvantages of the ROLZ algorithm?
    43 replies | 2046 view(s)
  • fabiorug's Avatar
    22nd September 2020, 22:50
    Jyrki, in your opinion: a photo compressed with squoosh.app online at mozjpeg q23, then at distance 8.81, speed 7, with the 07 September 2020 JPEG XL build - is that too low a quality? Would JPEG XL introduce these bitrates in next versions, maybe with forced deblocking? I'm a bit confused; maybe with the ImageReady presentation I will be more or less so.
    133 replies | 9784 view(s)
  • Raphael Canut's Avatar
    22nd September 2020, 21:12
    Yes, I am also looking forward to seeing the new update of WebP v2, because if I remember correctly, Pascal Massimino stated in October 2019 that WebP v2 was (a little) inferior to AVIF, but they were working hard to improve it. It would be awesome if they announce at the end of this year that WebP v2 has become better than AVIF (and with less complexity, furthermore)! There is also a huge research effort in (machine/deep learning) learned image compression. The problem for me and NHW in finding financing is that a lot of people and experts say that the future will be all machine learning, and so unfortunately a lot of people absolutely don't care about a wavelet codec (made by an individual, furthermore)...
    133 replies | 9784 view(s)
  • fabiorug's Avatar
    22nd September 2020, 20:56
    Honestly, now I'm more hyped for the ImageReady presentation of WebP v2 by Pascal Massimino. He gave a brief look at it at the AOMedia Symposium 2019. But it would be good to have one codec that, with a 1.81 difference, is transparent for all images.
    133 replies | 9784 view(s)
  • Raphael Canut's Avatar
    22nd September 2020, 20:41
    @fabiorug, OK. Thanks for the clarification. I would like to thank you very much for testing the NHW codec - too few people have looked at it - and I wanted to let you know that I appreciate your comments on NHW, even if some people will say that it is not appropriate here: it's the JPEG XL thread, the NHW thread is another page... Just a very quick remark: I don't agree when you say that wavelet codecs all have the same results. For example, NHW does not have the same results as JPEG 2000; if you want a wavelet codec that retains details, try the Rududu codec (enhanced SPIHT) - but it will have less neatness than NHW... Many thanks again. Cheers, Raphael
    133 replies | 9784 view(s)
  • fabiorug's Avatar
    22nd September 2020, 20:14
    Hi, thanks for the comments. It is not a program, it is a text: I wanted to find a ratio between image and audio for personal use, and to know which image codec to use and what butteraugli to use. Nothing scientific or complex. But 2.469 to 1 is the JPEG XL ratio where it should produce better quality in comparison to PNG or other lossy codecs; that's reasonable, and people will like to use these values because they expect good quality. I'm sorry if I made comparisons, it's not intended. Pleasantness isn't a metric; it is a subjective score I made up: 4.1 pleasantness (or it inflates file size) / 1.660 minimum is what I like for JPEG XL, and I found the 2.469 butteraugli values. And I don't know if there is a limit after 4.1. But as Jyrki Alakuijala said, the comment before isn't accurate, as it is an odd value, and JPEG XL is good as it is; I only suggested that at a 2.469 to 1 distance it can be improved. Citing Jyrki Alakuijala: "Why do you use so precise distance values in your recommendations? 2.6 and 2.581 should produce roughly identical results." Sorry for the bold text abuse. Scope has said it's because I like more wavelet blurring in less pleasant, low-bitrate images, and that your codec, as a wavelet codec, is blurry at low bpp and it enhances the image, like all wavelets. The comment is: "For myself I didn't see very big differences from other wavelet formats, and it's much harder to compare it on such small resolutions." But it is not a comparison. Even JPEG XL modular is interesting for text and some graphics. -l8 (1.660 pleasantness), -l16 (1.1250 pleasantness): that's what I use, and indeed the result isn't perfect, because it doesn't support all resolutions, but for some Instagram photos it is acceptable.
    133 replies | 9784 view(s)
  • Raphael Canut's Avatar
    22nd September 2020, 19:36
    Hi, sorry for interrupting the JPEG XL thread, but just a clarification, @fabiorug: do you mean that you have created a program/metric that computes/evaluates the pleasantness of an image? If so, could you give us some details? Is there a version to download? Otherwise, very quickly: the -l12 to -l19 quality settings are absolutely not optimized in NHW. They can be better; I will have to work on it, so don't trust them too much for now. Very quickly to finish: on the contrary, I find that for example the -l4, -l5, -l6 quality settings have good visual results, which you don't seem to agree with, but it's true that I mainly tested against AVIF... Cheers, Raphael
    133 replies | 9784 view(s)
  • fabiorug's Avatar
    22nd September 2020, 18:18
    Objective: 2.469 to 1 is the ratio where it should produce better quality in comparison to PNG or other lossy codecs.
    Subjective: max 4.1 pleasantness (or it inflates the file size); min 1.1250 to 1.660 pleasantness for low-quality images, which are better compressed with the NHW codec if you want a better ratio and you accept a washed-out image. JPEG XL, in my opinion, isn't good for that type of low-quality image, and if you have an image that pleases more than 4.1 (I don't know the maximum, it's subjective), it could be better to leave the PNG, as I notice more of a loss of details.
    -l8 (1.660), -l16 (1.1250): -l8 is an NHW codec (wavelet) setting; it works only for 512x512 24-bit BMP. I found that it can compress more of the images I have in the "min 1.1250 to 1.660 pleasantness, low-quality image" range (a value I invented). But anyway, returning to JPEG XL: 2.469 to 1 (butteraugli distance) is the ratio where, in my opinion, it should produce better quality in comparison to PNG or other lossy codecs.
    133 replies | 9784 view(s)
  • Jarek's Avatar
    22nd September 2020, 14:00
    Sure order might improve a bit, but I am talking about exploiting dependencies inside block - the tests show that linear predictions from already decoded coefficients in this block can give a few percent improvement, especially for width prediction (#2 post here).
    10 replies | 1173 view(s)
  • Jyrki Alakuijala's Avatar
    22nd September 2020, 13:43
    Why do you use so precise distance values in your recommendations? 2.6 and 2.581 should produce roughly identical results.
    133 replies | 9784 view(s)
  • Jyrki Alakuijala's Avatar
    22nd September 2020, 13:36
    In PIK and JPEG XL we optimize the (zigzag) order to get best zero rle characteristics, too. There is also some additional benefit for keeping a lowering variance for other reasons in the chosen codings.
    10 replies | 1173 view(s)
  • Jarek's Avatar
    22nd September 2020, 07:32
    Finally added some evaluation with quantization to https://arxiv.org/pdf/2007.12055 - the blue improvements on the right use 1D DCT of column toward left and row above (plots are coefficients: prediction mainly uses corresponding frequencies - we can focus on them) - prediction of widths has similar cost as prediction of values, but gives much larger improvements here. The red improvements use already decoded values in zigzag order - it is costly and gain is relatively small, but its practical approximations should have similar gains.
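    (Schematically, and in my own notation rather than verbatim from arXiv:2007.12055, both kinds of prediction being compared are linear in already-decoded context features:)
    \hat{c}_k = \sum_j a_{kj} f_j, \qquad \hat{b}_k = \sum_j \beta_{kj} |f_j|,
    where the f_j are already-decoded context features (e.g. same-frequency coefficients of the 1D DCT of the column to the left and the row above, or earlier coefficients of the current block in zigzag order), \hat{c}_k is the predicted value and \hat{b}_k the predicted Laplace width (scale) of coefficient c_k. Modeling c_k ~ Laplace(\hat{c}_k, \hat{b}_k) makes its coded cost roughly \log_2(2e\,\hat{b}_k/\Delta) bits for quantization step \Delta, which is why a better width prediction translates directly into a rate improvement.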
    10 replies | 1173 view(s)
  • elit's Avatar
    22nd September 2020, 03:52
    Those are not from US labs.
    5 replies | 458 view(s)
  • Shelwien's Avatar
    21st September 2020, 20:08
    0 replies | 211 view(s)
  • Shelwien's Avatar
    21st September 2020, 20:02
    > the speed is multiplied by 2...
    Yes, ROLZ can provide much better compression with a fast parsing strategy, which might be good for the competition.
    > Are there too many literals?
    Mostly the same as with normal LZ77. LZ77 would usually already work like that - take a context hash and go through a hash-chain list to check previous context matches - the difference is that LZ77 then encodes match distance, while ROLZ would encode the number of hashchain steps.
    > Is LZP also like ROLZ?
    LZP is a special case of ROLZ with only one match per context hash. So it encodes only length or literals, no distance equivalent. But LZP is rarely practical on its own - it's commonly used as a dedup preprocessor for some stronger algorithm.
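    (A toy illustration of the difference just described, not irolz or any real codec: the encoder keeps, per 2-byte-context hash, a list of previous positions, and a match is coded as the rank in that list plus a length, so a decoder would have to maintain the same lists. The parameters - 16-bit context hash, 16 slots, minimum match length 3 - are arbitrary.)
    #include <cstdint>
    #include <cstdio>
    #include <cstring>
    #include <vector>

    constexpr int CTX_BITS = 16;           // context = hash of the previous 2 bytes
    constexpr int SLOTS    = 16;           // candidates considered per context (the "rank" range)
    constexpr int MIN_LEN  = 3;

    static uint32_t ctx_hash(const uint8_t* p) {
        return (uint32_t(p[-1]) << 8 | p[-2]) & ((1u << CTX_BITS) - 1);
    }

    int main() {
        const char* text = "abcdEabcdFabcdEabcd";      // small demo input (cf. lz77's example)
        size_t n = std::strlen(text);
        const uint8_t* in = reinterpret_cast<const uint8_t*>(text);

        // table[ctx] = previous positions seen after this context, most recent first
        std::vector<std::vector<size_t>> table(1u << CTX_BITS);

        std::printf("%c%c", text[0], text[1]);          // first bytes go out as literals
        size_t pos = 2;                                 // need 2 bytes of context
        while (pos < n) {
            auto& cands = table[ctx_hash(in + pos)];
            int best_rank = -1; size_t best_len = 0;
            for (size_t r = 0; r < cands.size() && r < SLOTS; ++r) {   // rank = steps down the list
                size_t cand = cands[r], len = 0;
                while (pos + len < n && text[cand + len] == text[pos + len]) ++len;
                if (len > best_len) { best_len = len; best_rank = int(r); }
            }
            cands.insert(cands.begin(), pos);           // register the current position
            if (best_len >= MIN_LEN) {
                std::printf("<rank=%d,len=%zu>", best_rank, best_len);  // ROLZ token: rank, not distance
                pos += best_len;                        // (a real codec would usually also register
            } else {                                    //  the positions skipped inside the match)
                std::printf("%c", text[pos]);           // literal
                ++pos;
            }
        }
        std::printf("\n");
        return 0;
    }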
    43 replies | 2046 view(s)
  • Gotty's Avatar
    21st September 2020, 19:45
    (These from Lucas are important ones. Let me grab you the links.) So, here they are:
    - the thread of WBPE: https://encode.su/threads/1301-Small-dictionary-prepreprocessing-for-text-files
    - the XWRT paper: http://pskibinski.pl/papers/07-AsymmetricXML.pdf
    And maybe also:
    - the StarNT paper: https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.163.3821&rep=rep1&type=pdf
    The text preprocessors page is certainly worth a look; some of these tools are in your league: https://encode.su/threads/3240-Text-ish-preprocessors
    30 replies | 1341 view(s)
  • lz77's Avatar
    21st September 2020, 18:46
    Decoding is slower, but in the scoring formula (F=c_time+2*d_time+...), the speed is multiplied by 2... As I understand ROLZ, if we have in current position 'the ' and the predecessor symbol is 'n', then we see in a history buffer 256 saved hashes for 'the ' only with the predecessor symbol 'n'? And if the search fails then 't' is a literal? Are there too many literals? Is LZP also like ROLZ?
    43 replies | 2046 view(s)
  • Gotty's Avatar
    21st September 2020, 18:01
    Writing Disha version V2 was just too quick. Please spend more time on it, don't rush. Start again.
    - I would like to suggest that you phrase v2 in such a way that it does not reference anything in v1. Even if you are not intending to publish v1 anymore, you would make it easier for a future reader.
    - Be more specific. V2 is still formulated as an idea and not a concrete algorithm. You will need to write down concretely...
    ...Define what the symbols are exactly. Only characters and words? Or expressions ("United States of America")? Fragments? Combinations of characters? "http://" or ".com"?
    ...How you handle capitalization.
    ...How the dictionary would be constructed - this one is not straightforward - there are many ways. You'd like to create one or more static dictionaries, right? How exactly? What decides whether a word becomes part of one of them? Where would you get your corpus from?
    ...How compression would work - this one is also not straightforward: how do you imagine splitting the input file into its parts (words)? An example: you have found "the" in your input and it's in the dictionary. Would you go on processing more characters in the hope that there will be a longer match? Let's say you do. And you are lucky: it's actually "there" and it is still in the dictionary. Nice. But wait. If you just did that, you will find that the following characters are "scue". Hm, there is no such word. What happened? Ah, it is "www.therescue.com", and you should have really stopped at "the". So let's stop early then (after "the"). But now you will find "then", and that (the+n) will be expensive; you should not have stopped this time. See the problem? It's not straightforward (see the toy sketch after this post). Should it be greedy, should it do some backtracking? Should it try multiple ways and decide on some calculated metric? Optimally splitting a text into parts is an NP-hard problem, so you would probably need heuristics (except when doing it greedily). The current description is too general - there are too many ways to implement it. If you asked 10 programmers to implement it, there would probably be 10 different results. So we need more specific details, please.
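    (A toy sketch of the greedy pitfall above; the dictionary and input are made up for the example.)
    #include <cstdio>
    #include <set>
    #include <string>

    int main() {
        std::set<std::string> dict = {"the", "then", "there", "rescue", "www", "com"};
        std::string input = "www.therescue.com";

        // Greedy longest match: always take the longest dictionary word starting here.
        size_t i = 0;
        while (i < input.size()) {
            size_t best = 0;
            for (size_t len = 1; len <= input.size() - i; ++len)
                if (dict.count(input.substr(i, len))) best = len;
            if (best) {
                std::printf("[%s]", input.substr(i, best).c_str());   // dictionary token
                i += best;
            } else {
                std::printf("%c", input[i]);                          // raw literal
                ++i;
            }
        }
        std::printf("\n");   // prints [www].[there]scue.[com] -- greedy picked "there",
                             // so "rescue" can no longer be matched; backtracking or an
                             // optimal (dynamic-programming) parse would find [the][rescue].
        return 0;
    }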
    30 replies | 1341 view(s)
  • Gotty's Avatar
    21st September 2020, 17:35
    From my side, I can see your passion and your will to do something interesting and unique. It's a very good start. Passion is an absolute must to reach greatness, but it's not enough. I have a strong feeling that before you go on you'll need to attain more knowledge of algorithms and learn some programming. You will need to learn more and read more about data compression, too. You will then probably know how to describe an algorithm properly and how to be more specific. If you intend to publish it sometime in the future, you'll need to learn some scientific writing. Your document is not there yet. (You probably know that.) Collaboration works only if we are "equal" in some sense. What do you bring to the table? Until now it's only an idea. Not very specific, so I'd even say it's a "vague idea". Please read again those links above. Read the source code of the open-source software that is similar to yours. Learn their advantages and disadvantages. Then you'll know where you would "break into the market". In v2 you mention that "The difference between this algorithm and others is that instead of using code words at the bit level, byte level code words are used". Homework: you'll really need to read those links above again. But most importantly: if you learned some programming, you could give birth to your child. Please don't expect such a collaboration from us where you drop in an idea without specific details and someone eventually sits down, takes the time, elaborates the missing parts and creates your software - which would not really be yours. It may be very far from your idea. Implementing V1 was (fortunately) (almost) straightforward. But V2 is not. You'll need to be experimenting, failing, experimenting more, failing more, until you get better and better. Eventually an expert. Then you will be able to create a unique compression algorithm. If you invest your time, I believe you'll get there. Your passion will get you there. We are here, we can help you, we can spot missing stuff, we can point you in different directions, but please don't expect us to elaborate on the details of your compression idea; you'll need to do that.
    30 replies | 1341 view(s)
More Activity