Activity Stream

  • Jarek's Avatar
    Today, 10:43
    In one paragraph you write that we shouldn't ask, in the next that we don't know ... maybe the reason theoretical physics is stuck in place is all these artificial mental barriers ... how can we combine QM and gravity if we forbid asking basic questions about literally the best object combining these two worlds: the electron? And no, I tried a few times to publish ANS since ~2007, but academia was not interested - until it was widely used outside (2015). And no again, at this point they know, but having papers and patents on the old methods, it is not in their interest to acknowledge that those methods are currently being replaced. Data compression academia has lost its relevance in recent years - it stays in place and focuses on self-adoration: lots of peer-reviewed papers but nearly no meaningful development, while the compressors that are actually used and better come from outside.
    45 replies | 3404 view(s)
  • thorfdbg's Avatar
    Today, 10:13
    It is not mental censorship, it is a nonsensical question. I could also ask you "which color does an electron have" or "which flavour does it have". Depending on the physical process, one may assign it one radius or another, but none of that is the "ground truth", because you don't have a scale or a meter stick to compare to. The process of measuring a radius is ill-defined. You can make several assumptions and define physical processes that may provide a length unit one might attempt to identify with a radius, but given that the processes are different, don't expect consistent answers. For which, at this time, we do not have an answer, because we don't have a theory that combines quantum mechanics and general relativity. So all we can say here is: at this point, our science ends. But attempting to apply classical concepts doesn't help either, because we already *know* that they are *not* how physics works at its core. Sigh. No one is *pretending* that. The problem is that nobody *knows* it, and that, my friend, is your fault for denying publication.
    45 replies | 3404 view(s)
  • Jarek's Avatar
    Today, 07:25
    Thomas, you don't address my arguments, but write some pop-sci book texts. This is not the place for that, but let's look one more time. Claiming that it is not defined is an example of mental censorship: forbidding the search for answers to basic questions, to protect the status quo. But they are asking for this radius, trying to bound it from above. The first question is its scale: ~10^-15 m? 10^-18? 10^-21? Going much further down you finally drop below the Schwarzschild radius - the electron becomes a black hole. Even in QM we can define an effective radius in many ways - e.g. the probability cloud allows us to define a "median radius": the one containing half of the probability. And Dehmelt's reasoning in the diagram I attached assumed we can ask for this radius, but extrapolated it by fitting a parabola to two points - the latter is mathematical nonsense. Another perspective: the electron-positron scattering cross-section, which is interpreted as the area of the particle. The peaks are resonances corresponding to the creation of some particles. This is for huge energies (in GeV) where particles are "squeezed" due to Lorentz contraction - wanting the size of a resting electron, we need to unsqueeze them - extrapolate to the 2 x 511 keV energy of the resting particles. There is a marked general linear trend - using it to extrapolate to the energy of resting particles, we get a ~100 mb cross-section, which corresponds to a ~10^-15 m electron radius - at the scale of the classical one, orders of magnitude larger than believed from this "fitting a parabola to two points". Returning to data compression, I don't feel a need to join the mutual adoration society to help protect the status quo - like pretending that ANS does not exist. Even getting it through the peer-review filter first required gaining popularity outside academia. The peer-review system is made for mass production of similar articles glorifying the status quo - to change the deep pathologies of a field, one also has to act outside it.
    45 replies | 3404 view(s)
  • thorfdbg's Avatar
    Yesterday, 23:28
    There is no "magic" here at all - look, if you don't know, you don't know, but please don't tell people that they are doing nonsense if you aren't into it. The problem with renormalization goes away if you use an operator-algebra type approach, so quantum electro dynamics are not a problem. The problem begins with the strong interaction where up to my knowledge we do not yet have a consistent theory. Nothing is "forbidden",but I don't think that the model of an electron as a "point" is adequate either. The problem is that you seem to follow a "classic model", and this can only go wrong. The important question is not "what an electron is" because it is so distant from our all-day experience - we know how to describe an electron perfectly with the mathematics we have, but we cannot say what it "is" because this requires an answer in terms we are used to think in - which are inappropriate for quantum physics. So there is no problem. The problem is how come that the laws of classical physics emerge from the quantum world and how they are compatible to it. So how come the classical world looks how it looks given the well-tested laws of quantum physics. There you have a problem. But look, the question of a "radius" for a non-classical particle is not well defined in first place. So what do you expect as answer? Wrong model, nonsense answer. Clear case of "garbage in, garbage out". Because the model of a radius makes no sense. "Length" is as much a classical concept as that of a particle. An electron is just that: An electron. It's not a particle, and not a wave either. Both concepts are classical concepts, and they don't apply fully to what you call "electrons". So if it is not a particle, how can it have a radius? Then change that and publish so people know. I don't promise that it is going to be easy (well noting that the original BWT paper was refused at DCC (-; ) but it is necessary. I can't make your job. That depends, and it also depends on what you count as "academia". One of the harder working parties in MPEG is Fraunhofer HHI, and HHI is a half-goverment funded, half industry funded research institute. So is this academia? Or industry? Oh well, but by that you'll never put your work to discussion, which is a necessary part of every scientific achievement.
    45 replies | 3404 view(s)
  • thorfdbg's Avatar
    Yesterday, 23:10
    What certainly needs to be understood is how you define HDR. There are two approaches. The approach taken by MPEG assumes an n-bit integer representation which then undergoes an EOTF to map it to radiance values, with the EOTF signaled by additional metadata. There is also the approach taken by JPEG XT Part 7 (unlike Part 6, which follows the EOTF approach), where you assume that your images are represented in relative or absolute radiance, and the radiance is then the pixel value. This follows the workflow in computational photography, and the approach OpenEXR uses. In the latter case, you want to have at least 16 bits, which are then however floating point. While 32 bits are probably overkill, you may want to look into JPEG 2000 Part 2, where the number of bits in exponent and mantissa is signaled and a floating-point format is used. By that I mean that a simple "int only" approach may not be fully sufficient, and 16 bits may not be fully sufficient if you have radiance images.
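    To make the floating-point option above concrete, here is a minimal Python sketch of decoding a pixel stored in a custom unsigned float format whose exponent/mantissa widths are signaled in metadata. The IEEE-style layout, bias and denormal handling are assumptions for illustration, not the exact JPEG 2000 Part 2 definition:

        def decode_float(bits, ebits, mbits):
            # Interpret 'bits' as [exponent | mantissa], no sign bit.
            # Bias and denormal handling follow the usual IEEE convention.
            bias = (1 << (ebits - 1)) - 1
            e = (bits >> mbits) & ((1 << ebits) - 1)
            m = bits & ((1 << mbits) - 1)
            if e == 0:  # denormal range
                return (m / (1 << mbits)) * 2.0 ** (1 - bias)
            return (1.0 + m / (1 << mbits)) * 2.0 ** (e - bias)

        # e.g. a half-float-like split (5 exponent bits, 10 mantissa bits):
        print(decode_float(0b011110000000000, ebits=5, mbits=10))  # -> 1.0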
    9 replies | 989 view(s)
  • Krishty's Avatar
    Yesterday, 21:20
    Very glad that you remember :) Yes, the two most urgent extensions are 1. CFBF Optimizer and 2. Ultra7z. CFBF did not yet make it because I don’t have so many CFBF files; Ultra7z did not yet make it because converting that giant batch script to C is quite some hassle. It’ll come, be assured!
    5 replies | 174 view(s)
  • Krishty's Avatar
    Yesterday, 21:14
    When I last tested pngwolf, it was outperformed by ECT by a considerable margin, so I switched over to ECT entirely. That was in 2017, though … OptiPNG is not explicitly called when optimizing PNG because it is already contained in ECT. Instead it's used to reduce color palettes & transparency before expanding PNGs (ArchiPNG doesn't have that), and to convert BMP/GIF/TIFF to PNG (except for TGA, which is done via pngout), because ECT can't do that. Yes, pngout is not included due to its license. You can copy it into the tools folder yourself to make the TGA conversion work. ECT works on all kinds of ZIP files, and Office documents are ZIP files (and so are epubs). However, Office as well as epub require the first stream in the archive to be the uncompressed MIME type, so running ECT on them is de facto not allowed. It seems to work with most software, though.
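    As an illustration of the MIME-type constraint mentioned above, a minimal Python sketch (the literal entry name "mimetype" follows the epub convention; Office containers differ in detail) that repacks a container so the MIME type entry comes first and stays uncompressed:

        import zipfile

        def repack_with_stored_mimetype(src, dst):
            # Write 'mimetype' first and uncompressed, then deflate the rest.
            with zipfile.ZipFile(src) as zin, zipfile.ZipFile(dst, "w") as zout:
                names = zin.namelist()
                if "mimetype" in names:
                    zout.writestr("mimetype", zin.read("mimetype"),
                                  compress_type=zipfile.ZIP_STORED)
                for name in names:
                    if name != "mimetype":
                        zout.writestr(name, zin.read(name),
                                      compress_type=zipfile.ZIP_DEFLATED)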
    5 replies | 174 view(s)
  • maadjordan's Avatar
    Yesterday, 21:13
    Only Office 2007 and later formats are supported, since those fall under the ZIP format.
    5 replies | 174 view(s)
  • maadjordan's Avatar
    Yesterday, 21:12
    I welcome any new tool. I am interested in your compound file optimizer (which you did not add to your Papa's Optimizer). Can you add optimization of media streams in compound files?
    5 replies | 174 view(s)
  • SvenBent's Avatar
    Yesterday, 20:52
    I don't see pngwolf-zopfli in there for PNG optimizing. Also no pngout, but I'm assuming this is due to license issues. Also, in my testing I have never seen OptiPNG add anything useful to a PNG optimizing routine (I've tested it a couple of times to see if I wanted to add it to my pngbest batch file). P.S. I did not know ECT worked on office documents.
    5 replies | 174 view(s)
  • Krishty's Avatar
    Yesterday, 18:06
    A file optimization program I wrote (Windows, 64-bit): https://papas-best.com/downloads/optimizer/stable/x64/Optimizer.7z (7.53 MiB)
    Adjust settings in the tabs, select a folder with files for optimization; press Analyze, then Optimize.
    Warning: Experimental – use at your own risk, always back up your files! Anything DEFLATE-related can take *very* long. I don't care; I just let it run in the background for weeks. While the creation/modification date stays roughly the same, the latter is changed minimally to mark files as optimized so they are not considered in subsequent runs. In case of errors, a directory with a log file and artifacts may remain in your temporary directory. I may clean that up in later versions. If your username or the installation directory contains special characters, JPEG optimization may not work due to problems with EXIFTool's Perl runtime. Sorry; I'm investigating!
    Short: This is similar to Nikkho's File Optimizer, but the goals are a little different :) My optimizer tries to squeeze out every last bit without any regard to sanity. Files may take days or weeks to finish, so it's fully multi-threaded. Under the hood, it's the usual calls to ECT & Co. It's a small project and I've only used it personally, so it doesn't support many file formats.
    Long: I've been running BAT scripts on backups for quite some time, but
    - the problems with Unicode paths kept adding up,
    - good use of multiple CPU cores became more and more of a concern when I cranked up compression levels,
    - sanity checks on the results became incredibly complicated.
    So I decided to write my own C++ Win32 frontend to solve these problems. I used it successfully for some years on my personal backups and projects (mostly PNG, JPG, GIF) and adjusted the UI/toolset whenever I needed to improve on anything. The UI is inspired by Ken Silverman's PngoutWin, which I used in the 2000s. It's multi-threaded – it runs one thread per optimization, but as many parallel optimizations as you like. (Multiple threads per optimization turned out very hard to control; maybe later.) Optimization runs at low priority, so you can keep working normally. There's nothing exciting to say compression-wise; all work is done by external tools. Except maybe for ArchiPNG, which I wrote a few years ago to prepare PNGs in a way that compresses better with 7-Zip (especially with PPMd). No magic there, just brute-forcing filters with special zlib settings. Optimization aborts on damaged files (I think it should be up to the user to fix damaged files, not software taking educated guesses). In general I take errors very seriously and abort to avoid data loss. However, I've not yet managed to enforce an actual before-after SHA check.
    PNG:
    - Keep/delete metadata.
    - Full optimization via ECT, or optimization for later archiving via a custom tool I wrote (optimal for PPMd).
    - Clear/keep transparent pixels (important for textures with premultiplied alpha).
    - Runs OptiPNG, ECT -60500 --allfilters-b --pal_sort=120, DeflOpt, defluff (takes very long).
    JPEG:
    - Delete metadata, keep only Date Taken, keep anything but the thumbnail, or keep everything.
    - Derotate automatically or don't; if derotating, force losslessness or don't.
    - Runs EXIFTool for metadata, mozjpegtran for derotation, ECT for optimization.
    Office (docx, xlsx, odt & Co.):
    - WARNING – nonstandard! The specification requires an uncompressed MIME type as the first stream in the archive, but I blindly re-compress everything. Works fine with MS/Libre Office, though.
    - Recursive optimization – optimizes the contained PNG/JPEG/etc., but does not convert BMP/TGA/TIFF to PNG even if you asked for it on top-level files. (So it doesn't destroy complex packages with specific layouts.)
    - Runs 7-Zip, ECT, DeflOpt, defluff.
    ZIP/gzip:
    - Extract, optimize, or ignore.
    - Recursive optimization – optimizes the contained PNG/JPEG/etc.
    - GZ optimization is a little bit weaker than Nikkho's (because it does not remove file names).
    - Runs 7-Zip, ECT, DeflOpt, defluff.
    BMP/TGA/TIFF:
    - Can be converted to PNG (without changing the file extension).
    - Runs pngout in addition to the PNG tools because it's better at converting. It's not permitted to redistribute pngout, so please download it and place it in the tools subdirectory if you want to use it.
    GIF:
    - Optimization via flexiGIF. Can take weeks.
    - You can choose to convert non-animated GIFs to PNG, applying all PNG settings.
    Windows:
    - Delete Folder.jpg *if* it is a system file, i.e. auto-generated by Windows Media Player & Co.
    - Delete Album Art files from Windows Media Player.
    - Delete thumbs.db.
    I'm now trying to publish my stuff instead of having it lying around on my hard drive, so it got a major rewrite last week (mostly changing the polling of sub-process output to an event-based system), and now I hope it's useful to someone out there :)
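    A minimal sketch of the kind of before-after check mentioned above, assuming Pillow is available: it hashes the decoded pixels rather than the file bytes, so a lossless re-compression keeps the digest unchanged.

        import hashlib
        from PIL import Image  # assumption: Pillow is installed

        def pixel_sha256(path):
            # Hash decoded RGBA pixels, not the container bytes.
            with Image.open(path) as im:
                return hashlib.sha256(im.convert("RGBA").tobytes()).hexdigest()

        # usage: compare the digest before and after optimization
        # assert pixel_sha256("before.png") == pixel_sha256("after.png")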
    5 replies | 174 view(s)
  • Shelwien's Avatar
    Yesterday, 01:46
    Standard PRNGs are pretty bad: https://encode.su/threads/3099-Compressing-pseudo-random-files?p=59940&viewfull=1#post59940 https://encode.su/threads/3131-Hexadecimal-compression?p=60558&viewfull=1#post60558 Still, even for AES there are some options. 1) We can try passwords from a good password list and test whether a file decrypted with some password is more compressible. 2) We can try cracking it with SAT - with modern hardware it's not totally impossible, and we could also try to invent some advanced method for SAT solving. Anyway, my point is that there's a known method to compress zeroes.aes to 32 bytes or so (though it's very expensive). Does anybody think there's _another_ method to compress it that doesn't require decrypting?
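    A minimal sketch of option 1) above, assuming the file was produced with the openssl enc -aes-128-cbc command quoted elsewhere in the thread: it shells out to the openssl CLI for each candidate password and keeps the one whose output is highly compressible (the wordlist path and threshold are placeholders):

        import subprocess, zlib

        def try_passwords(encrypted_path, wordlist_path):
            for pw in open(wordlist_path, encoding="utf-8", errors="ignore"):
                pw = pw.strip()
                if not pw:
                    continue
                proc = subprocess.run(
                    ["openssl", "enc", "-d", "-aes-128-cbc", "-nopad",
                     "-pass", "pass:" + pw, "-in", encrypted_path],
                    capture_output=True)
                data = proc.stdout
                # A megabyte of zeroes compresses to almost nothing.
                if data and len(zlib.compress(data)) < len(data) // 10:
                    return pw
            return None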
    193 replies | 74344 view(s)
  • Jyrki Alakuijala's Avatar
    23rd August 2019, 22:43
    For general purpose lossy image compression. It is rather close to being a traditional jpeg, just more dense.
    9 replies | 989 view(s)
  • WinnieW's Avatar
    23rd August 2019, 21:02
    ... Expected result for me. AES-cbc mode works just like a PRNG (pseudo random number generator).
    193 replies | 74344 view(s)
  • Shelwien's Avatar
    23rd August 2019, 19:11
    I already have a sticky thread in "random compression". Just post a warning or something there.
    3 replies | 224 view(s)
  • SvenBent's Avatar
    23rd August 2019, 18:19
    My apologies, I mixed up ECRYPT with NESSIE. NESSIE is the EU crypto competition where all the candidates failed.
    7 replies | 181 view(s)
  • SvenBent's Avatar
    23rd August 2019, 18:17
    I agree. I am a big fan of ChaCha20 (but I don't really have enough knowledge for that to make a difference), and I've changed from AES to ChaCha20 on devices with no AES-specific hardware (i.e. my phone). I did read into some ECRYPT and SUPERCOP material, but I believe ECRYPT totally bonked, as none of the algorithms was found to be good enough, which then led into eSTREAM; and SUPERCOP (Asian?) had a very weird setup of recommendations, so I gave up on it. I wonder if AES, Twofish and Serpent had already received so much attention that there was only really room in that attention span for one more from eSTREAM. Or did Salsa/ChaCha20 just outperform the rest of the eSTREAM portfolio by such a big margin that we don't really care about the runners-up? Just wondering why HC-256 and Rabbit are so unheard of.
    7 replies | 181 view(s)
  • SvenBent's Avatar
    23rd August 2019, 18:12
    "So maybe is a good idea to write a sticky post right in the top of the subforum, explaining clearly that the following threads are not to be taken to the letter, and consist mostly of unsustained claims that are either mathematically impossible or not implemented ergo not tested or validated. And, that if any one of them actually happens to be true, it will be moved to the main section. " I agree with this. just to counter the BS claims with a bit of sound skepticism
    3 replies | 224 view(s)
  • Sebastian's Avatar
    23rd August 2019, 17:02
    I have written one. It works on config files and uses stochastic search methods. Even multithreading is possible. To make it usable with command-line compressors, we can add a simple Python script to the pipeline. I'll take a look at it next week.
    88 replies | 8253 view(s)
  • Darek's Avatar
    23rd August 2019, 15:57
    @Sebastian - what kind of automatic optimization tool are you writing about? I must have missed something... I've used hand-tuned parameter settings to review the full scope of parameters, which is more time-consuming and not optimal - that's absolutely right - so an auto-optimization tool would probably be more accurate.
    88 replies | 8253 view(s)
  • compgt's Avatar
    23rd August 2019, 15:44
    On the other hand, if you're not a good programmer, you don't need to implement all known algorithms. Don't get bitten by the compression coding bug. There should be no compulsion.
    4 replies | 280 view(s)
  • compgt's Avatar
    23rd August 2019, 15:08
    Aim high: "Invent a super compression algorithm and sell it for hundreds of million$." You sell it. It's not just "hey, it improves your credibility with tech companies." It is worth that much already - your deal of a lifetime realized.
    4 replies | 280 view(s)
  • Jarek's Avatar
    23rd August 2019, 14:00
    While I generally agree with Michael, I prefer Sportsman's view - e.g. there might be no ANS now if I had known AC in 2006. Fields are mostly developed by specialists - which is often necessary, but they also "know well what can and should be done" - there is systematic evolution, but it is tough to get out of the standard ways of thinking. Having a strong base (!), it is worth attacking new problems with a clear mind first - before studying and comparing with known approaches.
    4 replies | 280 view(s)
  • pacalovasjurijus's Avatar
    23rd August 2019, 12:14
    I agree on the thread. Compression by calculus https://github.com/pjmidgard/Spring?files=1
    193 replies | 74344 view(s)
  • pacalovasjurijus's Avatar
    23rd August 2019, 12:09
    import binascii a=0 b=0 l="" j=0 b=0 aq=0 qfl=0 t=0 h=0 byteb="" notexist="" lenf=0 numberschangenotexistq = numberschangenotexistqz = qwa=0 m = p=0 namea="" asd="" b=0 szx="" asf2="0b" while b<1790: m+=-1] b=b+1 k = wer="" numberschangenotexist = numbers = name = input("What is name of file? ") namea="file.Spring" namem=name+"/" s="" with open(namea, "w") as f4: f4.write(s) with open(namea, "a") as f3: f3.write(namem) with open(name, "rb") as binary_file: # Read the whole file at once data = binary_file.read() s=str(data) with open(namea, "ab") as f2: for byte in data: av=bin(byte) a=a+1 if a<=1790: byte=int(byte) m = byte numbers.append(byte) h=h+1 if a == 1790: p=0 while p<1790: if p!=m: k.append(p) p=p+1 #lenf count lenfg=len(k) if lenfg>0: wer=wer+"0" notexist=k0] szx=bin(notexist)2:] lenf=len(szx) xc=8-lenf z=0 while z<xc: szx="0"+szx z=z+1 wer=wer+szx szx="" if lenfg==0: wer=wer+"1" b=-1 bb=0 kl=1789 bnk=0 cb=0 bb=-1 er=-1 ghj=0 ghjd=1 bnk=1 p=-1 cvz=0 qwa=qwa+1 while p<1789: p=p+1 if lenfg>0: if 255!=numbers: byteb=numbers numberschangenotexist.append(byteb) if 255==numbers: numberschangenotexist.append(notexist) if lenfg==0: byteb=numbers numberschangenotexist.append(byteb) #count 1789 ghj=numberschangenotexist qfl=qfl+1 ghjd=ghj bnk=1 bnks=1 bb=-1 bnkd=1 kl=kl-1 if qwa<=1: while bb<kl: if qwa<=1: bb=bb+1 if qwa<=1: bnk=bnk*255 if qwa<=1: bnks=bnks*256 if qwa<=1: numberschangenotexistq.append(bnk) numberschangenotexistqz.append(bnks) if lenfg>0: bnk=numberschangenotexistq ghjd=ghjd*bnk if lenfg==0: bnks=numberschangenotexistqz ghjd=ghjd*bnks cvz=cvz+ghjd szx=bin(cvz)2:] lenf=len(szx) if lenfg>0: xc=14310-lenf z=0 if xc!=14310: while z<xc: szx="0"+szx z=z+1 wer=wer+szx lenf=len(szx) szx="" if lenfg==0: xc=14320-lenf z=0 if xc!=14320: while z<xc: szx="0"+szx z=z+1 wer=wer+szx lenf=len(szx) szx="" a=0 numberschangenotexist = del k del numbers m = b=0 while b<1790: m+=-1] b=b+1 b=0 b=0 s=h%1790 if s!=0: s=s-1 p=-1 if s!=1789: b=-1 bb=0 kl=s bnk=0 cb=0 er=0 bb=-1 cvz=0 ghj=0 ghjd=1 bnk=1 while p<s: p=p+1 byteb=numbers numberschangenotexist.append(byteb) #count 1789 ghj=numberschangenotexist ghjd=ghj bnk=1 bb=-1 kl=kl-1 while bb<kl: bb=bb+1 bnk=bnk*256 ghjd=ghjd*bnk cvz=cvz+ghjd szx=bin(cvz)2:] lenf=len(szx) ert=0 s=s+1 ert=s*8 xc=ert-lenf z=0 if xc!=ert: while z<xc: szx="0"+szx z=z+1 wer=wer+szx lenf=len(szx) szx="" a=0 wer="0b1"+wer+"1" lenf=len(wer) xc=8-lenf%8 z=0 if xc!=ert: while z<xc: szx="0"+szx z=z+1 wer=wer+szx lenf=len(szx) szx="" n = int(wer, 2) jl=binascii.unhexlify('%x' % n) f2.write(jl)
    47 replies | 2936 view(s)
  • Jarek's Avatar
    23rd August 2019, 11:57
    Video from Sneyers' presentation in May: https://www.youtube.com/watch?v=RYJf7kelYQQ
    42 replies | 5020 view(s)
  • Sportman's Avatar
    23rd August 2019, 11:50
    I agree: start small, test and improve. But to get radical new ideas it can be better not to first study a subject in order to micro-improve already existing things, but to first try your own thing and then compare it with what already exists. Because once a brain is resonating with a set of popular ideas or rules, it's difficult to bend away from them.
    4 replies | 280 view(s)
  • Sportman's Avatar
    23rd August 2019, 11:36
    Indeed: https://docs.python.org/2.0/ref/indentation.html Text editors: https://atom.io/ https://www.sublimetext.com/
    47 replies | 2936 view(s)
  • Gonzalo's Avatar
    23rd August 2019, 05:49
    I remember when I first started reading the forum, some ten years ago. God, how time flies. Anyway, I distinctly remember being dazzled by the claims of impossible ratios, promises of infinite storage and tales of 'unknown' or 'undiscovered' math. Heck, I even had a few of these ideas of my own. Luckily I never said anything out loud, because I wanted to be sure first. No need to explain what happened next. The thing is, new members or visitors are very susceptible to that type of claim. By the time they understand what data compression is all about, they may have participated in more than a few pointless arguments, defending the 'right to dream' of some other misguided soul. So maybe it is a good idea to write a sticky post right at the top of the subforum, explaining clearly that the following threads are not to be taken to the letter, and consist mostly of unsubstantiated claims that are either mathematically impossible or not implemented, ergo not tested or validated. And that if any one of them actually happens to be true, it will be moved to the main section. This is not meant to sound dogmatic or elitist.
    3 replies | 224 view(s)
  • Shelwien's Avatar
    23rd August 2019, 04:35
    I made a new subforum: https://encode.su/forums/19-Random-compression Threads that disappeared might be there. Give me links to other related threads still remaining in "Data Compression".
    3 replies | 224 view(s)
  • michael maniscalco's Avatar
    23rd August 2019, 03:17
    I first got hooked on data compression in the mid-1990s, when the fellow who ran the local video tape store (yes, some of us are that old), and who used to build computers for people to supplement his business, showed me how amazed he was by the ability of ARJ to compress data better than pkzip (at least I think those were the two compressors, if I remember correctly). Anyhow, it got me thinking about data compression and how in the heck this stuff actually worked. I picked up a copy of Mark Nelson's book on data compression at the local bookstore that specialized in computer books, and I got to reading. At that time BWT was little known, and the book largely covered dictionary-based and statistical compression (LZ and PPM). But I was mostly hooked on the 'magic' of the Burrows-Wheeler Transform. Sure, I built a few PPM and LZ compressors (and many more professionally as well). But my muse was BWT. At the time I had to write my own BWT algorithm because, frankly, back then everything sucked. That was the beginning of some of my ideas for MSufSort. The prototype v1.0 would be nothing to write about today, but it is still the basis for a lot of the methods that make the fastest BWT algorithms even 20 years on. But that was just something I had to build to get to writing a real BWT compressor which would "change everything". I then wrote M99, and a year later I conceived of what was later published as M03 (published several years after I had conceived of the idea). Did they "change the world"? No. But I wouldn't change how it all unfolded for anything. It's been a hell of a ride. The point of all of this is this: compression, computer science, algorithms, etc. ... those of us who are compelled to pursue these muses are truly a rare breed. But we each have to start small. Yes, enjoy the wonder of not quite knowing how it all works for as long as you can, but do not fail to listen to the advice of those who have already walked the path. If you are truly blessed with creativity, then it will be there for you when you have built the tool set needed to manifest that creativity. First you must crawl, then you can walk. So rather than trying to start out with the algorithm that will 'change the world', instead spend a month building the 'algorithm that will enable you to build the skill set to change the world'. Start with a simple LZ compressor. There are still huge improvements introduced in this class even today (just look at Christian's RAZOR). Then move on to tougher challenges from there. The compulsion to pursue these challenges will make itself known to you, don't worry. Please don't start with radical claims and no achievements. Instead, start with small achievements, and the next thing you know your muse will find you, and then you will have the tools necessary to unleash your creativity and give this group something that is truly innovative. - Michael
    4 replies | 280 view(s)
  • rarkyan's Avatar
    23rd August 2019, 02:40
    I agree to move this thread into the subforum, sir. At this point I don't want to disturb any working known theory or other proofs which state that we can't create lossless compression of random data. Also, I am still unable to prove my method. I have tested it with everything I know and still have zero results. But I'm still trying other ways over and over again. I don't want to disturb anyone. Only those who have the same curiosity, or maybe want to solve lossless random data compression by studying the pattern, may support the discussion. I wish any kind of invalid idea to be moved to the subforum. Visit anytime to help us improve. Thank you ;)
    193 replies | 74344 view(s)
  • JamesB's Avatar
    23rd August 2019, 02:27
    My point wasn't so much to tell people they're nutters (although in this case I'm starting to wonder - sorry folks!) or to lose patience, but to channel the obvious enthusiasm into something *useful*. People may lack the necessary skills to make any useful contribution right now, but you have to start somewhere - even if it's RLE. Pick something that works and play with it. Then try some variations and play with those. Who knows, from a position of ignorance you may just go in a direction no one else considered due to being blinded by "knowing" the right way. However be a scientist. Test your hypothesis. Code it: encode, decode, compare. Who'd have dreamt BWT was around the corner just before it arrived? However just forget about the random data stuff. It's pointless on so many fronts, the first being it's impossible and even more importantly no one wants to compress random data anyway! Concentrate on the non-random. It's so much more useful.
    193 replies | 74344 view(s)
  • michael maniscalco's Avatar
    23rd August 2019, 02:19
    How about we simply differentiate between publicly available source code and no publicly available source code? I realize that there are some instances where legitimate contributions happen but where the code is currently not public. But I think the vast majority of the 'random compression' noise would be filtered out simply by virtue of not having a publicly available demonstration which can be peer reviewed or have its conclusions verified or reproduced. This is sadly true. But I think there is a tipping point as well, where the insane start to run the asylum. Eventually, even those of us who have been here for over a decade would simply stop checking in, and worse yet, those who actually have something to show will eventually be drowned out by the noise of the crackpots. We have seen this all play out before. I remind us all of this post on comp.compression by none other than Mark Adler: This must not happen to encode.su
    193 replies | 74344 view(s)
  • Shelwien's Avatar
    23rd August 2019, 01:17
    There's kinda a plan to make another subforum for this and put these threads there. But it would only make sense once there's some other activity in this forum. People kinda stop posting and even visiting after several days of no activity. > Maybe an established test set of 'random' data for the site would help? Why, we have a perfect solution for this - http://prize.hutter1.net/
    193 replies | 74344 view(s)
  • introspec's Avatar
    23rd August 2019, 01:00
    I am sympathetic to your patience, but I'd like to mention that it is really unfair to say that JamesB was speaking from a position of authority. His point had nothing to do with his position or authority. He simply operates under the fundamental assumption that one cannot disprove something one does not understand in the first place. I think it is fair to treat the refusal of some people in this thread to engage with the counting argument as an admission of a lack of even basic understanding of what would be involved in countering it. And then it is just as fair not to support a discussion that surely will not lead to any advancements in the theory or practice of data compression. This thread is a local equivalent of the offices in various Academies of Sciences that used to deal with inventors of perpetual motion machines. Some of these offices existed for decades. Zero perpetual motion machines were invented, but not for lack of trying, and definitely not because scientists somehow lacked the imagination to think outside the box or abused their positions of power.
    193 replies | 74344 view(s)
  • michael maniscalco's Avatar
    23rd August 2019, 00:29
    Sure, I get your point here. But I think James' frustration is completely understandable. Perhaps there's some way to break these threads into groups? Theory, practice, etc.? This thread is clearly not 'in practice', since it mathematically cannot be. Either there isn't a working demonstration or the understanding of the test data is wrong. Maybe an established test set of 'random' data for the site would help? New member: "I have an algorithm that can compress random data!" Community: "How does it do on encode.su's 'random test suite'?" New member: "Um, I haven't tried yet." Community: "Come back when you do. And bring your decoder with you." Just thinking out loud. This "random data" nonsense was the beginning of the end for the compression groups of the past.
    193 replies | 74344 view(s)
  • Sebastian's Avatar
    22nd August 2019, 23:57
    Why don't you guys use an automatic optimization tool instead of trying out parameters by hand? It's not as if these are independent of one another.
    88 replies | 8253 view(s)
  • dream2014fly's Avatar
    22nd August 2019, 23:53
    I am working on unpacking. Test: Spring
    Size before: 1,048,576 bytes (random file from the FAQ)
    Size after: 1,047,171 bytes (file.Spring)
    Compression time: 1 hour 19 minutes
    Made by Jurijus Pacalovas
    Bitcoin wallet: bitcoincash:qrvmgpa8nlfq4n3pnvx4ws6f674su9zhgsrjt536ym
    https://github.com/pjmidgard/Spring?files=1
    47 replies | 2936 view(s)
  • Shelwien's Avatar
    22nd August 2019, 22:41
    Actually I didn't like JamesB's post here: https://encode.su/threads/1176-loseless-data-compression-method-for-all-digital-data-type?p=61331&viewfull=1#post61331 Authority-based arguments just... don't work. So I'm trying a different approach - it'd be nice to have one that works, since this kind of discussion repeats infinitely.
    193 replies | 74344 view(s)
  • rainerzufalldererste's Avatar
    22nd August 2019, 22:23
    dnd's recent changes to TurboRLE are very impressive, and even the encoder is now incredibly fast as well. However (following my horrible naming convention) I have added rle8 extreme (which also has 16, 32 and 64 bit RLE variants). So, as of a few days ago, there are again certain scenarios (seemingly files that run-length-encode down to ~20-50%) that decompress considerably faster using rle8 than with TurboRLE.
    1034.db (Checkers program "End Game Table Base")
    Type | Compressed Size (Ratio) | Encoding Speed | Decoding Speed | Size (Ratio) with rans_static_32x16
    - | 419.225.625 Bytes (100.00%) | - | - | 56.728.176 Bytes (13.53%)
    rle8 | 88.666.372 Bytes (21.15%) | 412.501 MB/s | 2469.096 MB/s | 43.940.318 Bytes (10.48%)
    rle8 single | 88.666.372 Bytes (21.15%) | 414.679 MB/s | 2474.987 MB/s | 43.940.318 Bytes (10.48%)
    rle8 ultra | 104.249.934 Bytes (24.87%) | 393.790 MB/s | 2715.048 MB/s | 48.666.568 Bytes (11.61%)
    rle8 ultra single | 104.249.934 Bytes (24.87%) | 394.349 MB/s | 2710.763 MB/s | 48.666.568 Bytes (11.61%)
    rle8 extreme 8 bit | 96.495.695 Bytes (23.02%) | 677.071 MB/s | 6521.398 MB/s | 51.048.980 Bytes (12.18%)
    rle8 extreme 8 bit single | 86.326.906 Bytes (20.59%) | 361.772 MB/s | 6820.208 MB/s | 50.940.427 Bytes (12.15%)
    rle8 extreme 16 bit | 104.335.593 Bytes (24.89%) | 753.408 MB/s | 6456.242 MB/s | 51.955.612 Bytes (12.39%)
    rle8 extreme 32 bit | 118.999.253 Bytes (28.39%) | 1149.847 MB/s | 6663.612 MB/s | 52.076.188 Bytes (12.42%)
    rle8 extreme 64 bit | 139.860.053 Bytes (33.36%) | 1419.075 MB/s | 7263.287 MB/s | 50.986.906 Bytes (12.16%)
    - | - | - | - | -
    trle | 73.108.990 (17.4%) | 633.02 MB/s | 2493.27 MB/s | -
    srle 0 | 84.671.759 (20.2%) | 390.43 MB/s | 4783.11 MB/s | -
    srle 8 | 92.369.848 (22.0%) | 886.09 MB/s | 5300.75 MB/s | -
    srle 16 | 113.561.537 (27.1%) | 804.69 MB/s | 5948.99 MB/s | -
    srle 32 | 136.918.311 (32.7%) | 1310.77 MB/s | 7372.94 MB/s | -
    srle 64 | 165.547.365 (39.5%) | 2140.93 MB/s | 8391.23 MB/s | -
    mrle | 88.055.360 (21.0%) | 207.23 MB/s | 1206.61 MB/s | -
    memcpy | 419.225.625 (100.0%) | 7686.57 MB/s | - | -
    video-frame.raw (heavily quantized video frame DCTs)
    Type | Compressed Size (Ratio) | Encoding Speed | Decoding Speed | Size (Ratio) with rans_static_32x16
    - | 88.473.600 Bytes (100.00%) | - | - | 11.378.953 Bytes (12.86%)
    rle8 | 17.630.322 Bytes (19.93%) | 449.978 MB/s | 1497.118 MB/s | 8.099.993 Bytes (9.16%)
    rle8 single | 17.657.837 Bytes (19.96%) | 450.186 MB/s | 2378.925 MB/s | 8.138.195 Bytes (9.20%)
    rle8 ultra | 21.306.466 Bytes (24.08%) | 432.428 MB/s | 1602.872 MB/s | 9.306.772 Bytes (10.52%)
    rle8 ultra single | 21.332.661 Bytes (24.11%) | 428.820 MB/s | 2657.480 MB/s | 9.342.219 Bytes (10.56%)
    rle8 extreme 8 bit | 17.147.077 Bytes (19.38%) | 771.743 MB/s | 7730.117 MB/s | 8.435.522 Bytes (9.53%)
    rle8 extreme 8 bit single | 16.242.653 Bytes (18.36%) | 403.398 MB/s | 7738.341 MB/s | 8.628.014 Bytes (9.75%)
    rle8 extreme 16 bit | 17.980.330 Bytes (20.32%) | 1009.200 MB/s | 7662.235 MB/s | 8.526.123 Bytes (9.64%)
    rle8 extreme 32 bit | 19.473.112 Bytes (22.01%) | 1522.979 MB/s | 7853.659 MB/s | 8.636.199 Bytes (9.76%)
    rle8 extreme 64 bit | 21.703.102 Bytes (24.53%) | 1858.194 MB/s | 8227.052 MB/s | 8.595.611 Bytes (9.72%)
    - | - | - | - | -
    trle | 14.187.432 (16.0%) | 690.37 MB/s | 2974.10 MB/s | -
    srle 0 | 15.743.523 (17.8%) | 423.49 MB/s | 5686.69 MB/s | -
    srle 8 | 16.555.349 (18.7%) | 1003.01 MB/s | 6193.03 MB/s | -
    srle 16 | 18.868.388 (21.3%) | 1033.75 MB/s | 7139.00 MB/s | -
    srle 32 | 21.390.380 (24.2%) | 1689.23 MB/s | 8122.06 MB/s | -
    srle 64 | 24.311.530 (27.5%) | 2820.68 MB/s | 8809.48 MB/s | -
    mrle | 17.420.113 (19.7%) | 215.72 MB/s | 1320.74 MB/s | -
    memcpy | 88.473.600 (100.0%) | 7568.31 MB/s | - | -
    9 replies | 690 view(s)
  • michael maniscalco's Avatar
    22nd August 2019, 22:17
    Eugene, you have more patience than most. ;)
    193 replies | 74344 view(s)
  • Mauro Vezzosi's Avatar
    22nd August 2019, 22:12
    My opinion is:
    Option | Description | Default | Note
    -time_steps n | Number of time steps for TBTT | 20 | No special suggestion, say 5-30. Test 18, 22, 16, 24, ... and go in the best direction.
    -seed n | Random number generator seed | 123 | In some cases, initializing weights with different values slightly improves the compression (see "change seed from 1 to 2" in https://encode.su/threads/2882-lstm-compress?p=61172&viewfull=1#post61172).
    -adam_beta1 n | ADAM beta1 parameter | 0.0 | The widely suggested value is 0.9; I tested 0.01 and 0.1, maybe 0.001 can also be tested (and, of course, any other value < 1.0).
    -adam_beta2 n | ADAM beta2 parameter | 0.9999 | The widely suggested value is 0.999; I tested 0.999, maybe 0.99999 can also be tested (and, of course, any other value < 1.0).
    -adam_eps n | ADAM epsilon parameter | 0.00001 | I tested 0.000001, 0.0000001, 0.00000001 (any other value, say < 0.0001, can be tested). Must be > 0.
    -n_embed_out n | Number of layers in output embedding | = -n_layer | Try n_layer - 1, n_layer - 2, ... (especially when n_layer is "big"). Must be >= 1.
    88 replies | 8253 view(s)
  • mo0n_sniper's Avatar
    22nd August 2019, 22:11
    What will the PIK-based codec be used for in JPEG XL?
    9 replies | 989 view(s)
  • encode's Avatar
    22nd August 2019, 16:01
    encode replied to a thread CHK Hash Tool in Data Compression
    In case someone has some spare time - please create a CHK Language Pack for your native language. The file "lang.txt" must be UTF-8 with BOM. :_coffee:
    180 replies | 77494 view(s)
  • Shelwien's Avatar
    22nd August 2019, 15:32
    > I don't think so. I mean using arithmetics/models in conjunction with programming abilities.
    > Because computers are designed to work only with numbers (bits) 0 and 1.
    It's not so simple. Current CPUs are designed to support up to 64-bit data types. Yes, we can simulate any data type with bit operations, but in practice there'd be a 100x speed difference between a hardware 128/64 division and a boolean simulation of it. For example, I made this: https://encode.su/threads/3122-dec2bin-converter-for-Nelson-s-million-digits-file It's necessary to access some known redundancy in the data, and processing is pretty slow even now; making it 100x slower would render it impractical. The same applies to most other operations. For example, arithmetic coding ideally can be implemented with precise long numbers, but that would have quadratic complexity, so we'd have to wait years to encode a megabyte file with it. So in practice we have imprecise ACs modified for CPU precision - there's some invisible (under a bit) redundancy, but they can actually be used.
    > Lastly, but not the least, remember that one member here wrote that
    > "we thought that arithmetic coding is the end of it.
    > Now there is ANS, or Asymmetric Numeral Systems"?
    ANS doesn't provide better compression than AC/RC; basically you can treat it as an AC speed optimization method.
    > It explains that it's all about finding the best way to compress data, even incompressible data.
    Sure, for example here I made a coder which can compress some otherwise incompressible files: https://encode.su/threads/2742-Compressed-data-model?p=52493&viewfull=1#post52493 It may also be possible to compress some other filetypes via cryptography, recompression or advanced models. But if you can't compute how many bits you need to encode a 256-byte string where each byte value occurs only once, it means you can't provide any useful ideas for data compression.
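    For reference, that last question can be answered with a one-line calculation: such strings are exactly the permutations of the 256 byte values, so an ideal code needs log2(256!) bits. A small sketch:

        import math

        # log2(256!) via lgamma(n+1) = ln(n!)
        bits = math.lgamma(257) / math.log(2)
        print(bits, bits / 8)   # ~1684 bits, i.e. ~210.5 bytes instead of 256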
    193 replies | 74344 view(s)
  • rarkyan's Avatar
    22nd August 2019, 15:18
    I like the spirit :_yahoo2: Thank you CompressMaster
    193 replies | 74344 view(s)
  • CompressMaster's Avatar
    22nd August 2019, 13:00
    I don't think so. I mean using arithmetics/models in conjunction with programming abilities. Because computers are designed to work only with numbers (bits) 0 and 1. Next, we have 256 possible values in one byte. Instead of encoding the data sequence as 00011101 we can simply write "1D" to save some space. This can be further simplified by converting the file to the right interpretation - EXE, JPG, CHM, DOC, RAR etc., although I'm not sure HOW that works, because I need to interpret data as text, and binary files in RAW text mode are useless unless we use a hexadecimal (or other) converter. But the base is still THE SAME. That's a big advantage for us. Well, say you have 1,000,000 identical characters. This can be simplified to "01000000" (that's not a byte, it's a simplification), where the first character is yours and the second part represents how many there are (or whatever you want; it depends on the software's ability to understand what it means). The simplest algorithm, RLE, would be useless here, because we have to store an additional character - 0x1000000. Further, if we don't have any prior knowledge of the input, the reduction would not be that strong. Second part. We are all able to see repeated patterns at every 256th position (the farthest possible position). And if we can predict them with the best probability, then we will win. Lastly, but not least, remember that one member here wrote that "we thought that arithmetic coding is the end of it. Now there is ANS, or Asymmetric Numeral Systems"? It explains that it's all about finding the best way to compress data, even incompressible data. That means: for now, random data is incompressible, but what about improvements in 10 years? I explained that in detail in rarkyan's welcome post.
    193 replies | 74344 view(s)
  • xcrh's Avatar
    22nd August 2019, 12:56
    At the end of the day, that's what you get for not-so-random randomness, I guess. It can get worse. For example, the Debian operating system at one point applied a small "harmless" patch to the OpenSSL library; the OpenSSL devs even acknowledged it as "harmless". Ironically, it proved to be not so harmless, killing off a lot of entropy during key generation and reducing the whole set of possible RSA keys to just several thousand or so. At that point an attacker can simply generate all possible RSA keys and brute-force which of them actually works. As a result, thousands of systems faced break-ins via SSH (while the SSH protocol isn't TLS or SSL, OpenSSH used OpenSSL to process keys, so the generated keys suffered from the very same problem) - and it was a full-scale "emergency" for hosting providers and the like. There are also some attacks on e.g. embedded systems like wi-fi routers, abusing the fact that embedded systems often lack good entropy sources, and therefore their generated keys could be less random than desirable, jeopardising otherwise secure cryptography. I'd say crypto is extremely unforgiving of shortcuts, careless coding and lack of attention to small details. Touching the inner workings of a crypto algorithm takes full understanding of the underlying math, possible cryptanalysis, and a very decent understanding of how the hardware works. That's why most mortals shouldn't try to "improve" crypto algorithms unless they understand all of that, which is very challenging. As a concrete example, AES on its own looks more or less secure for now. However, if a careless implementation is used, an attacker can, say, measure the run time of the algorithm and its parts - so it would turn out rather insecure overall, up to the ability to reconstruct the key. That's one of the reasons why recent encryption algorithms are generally moving away from array operations (= memory accesses, subject to cache effects and related timing issues) in favor of math-only operations that do not access memory, ensuring the algorithm completes in constant time regardless of input. The Salsa/ChaCha design is a good illustration.
    1 replies | 343 view(s)
  • rarkyan's Avatar
    22nd August 2019, 12:07
    Thanks in advance sir. Gonna search for that :_banana2:
    193 replies | 74344 view(s)
  • xcrh's Avatar
    22nd August 2019, 11:54
    Rather funny attack; it somewhat resembles the wi-fi weakness where management frames are neither encrypted nor authenticated. While it doesn't break wi-fi crypto in e.g. WPA2, it still leaves room for nasty attacks. I guess it's possible to work around it by refusing to communicate if the key is too short, but that can cause some compatibility problems and also requires changing both firmware and OS bluetooth stacks, which makes deployment rather troublesome - so I can imagine this vuln will stay around for a while.
    1 replies | 82 view(s)
  • Darek's Avatar
    22nd August 2019, 11:26
    @Mauro - do you have parameter ranges which can be used for time_steps, adam_beta2 and seed?
    88 replies | 8253 view(s)
  • JamesB's Avatar
    22nd August 2019, 11:13
    There's an irony there. :-) However, to help you: use the Unix "cut" tool. If you run Windows, then try installing the Windows Subsystem for Linux (WSL). I'm sure there are equivalent Windows tools out there, but generally I find Unix to have readily available methods for basic file manipulation: cut, join, split, sort, in addition to trivial one-liners in simple programming languages (e.g. awk '{print $1}' would do the same).
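    For the same task without a Unix toolchain, a small Python sketch (the input file name is a placeholder) prints one byte value per line and then counts how often each of the 256 values occurs:

        from collections import Counter

        data = open("10000bytes.bin", "rb").read()  # hypothetical file name
        for b in data:
            print(f"{b:02X}")                       # one hex value per row
        counts = Counter(data)
        for value in range(256):
            print(f"{value:02X}: {counts.get(value, 0)}")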
    193 replies | 74344 view(s)
  • xcrh's Avatar
    22nd August 2019, 10:45
    Interestingly... 1) I've never seen even a single dev from Iran or North Korea. As far as I know, the people of North Korea are mostly denied access to the global Internet, at which point I fail to imagine how they could contribute to modern software development to begin with. If there is nothing to lose, there is nothing to worry about, right? Well, totalitarian governments put themselves in a trap here, virtually sabotaging modern approaches in high-tech areas. But that's not GitHub's fault. 2) More or less advanced Chinese devs can eventually access GitHub. Their life is hard: the Great Firewall of China is in the way, censoring communications and occasionally denying access to GitHub. Also, the Chinese police seem to be rather active on GitHub; I've seen a few projects removed at their request. But that only works if someone has specified they're from China, so GitHub has to enforce Chinese laws on them; a ton of devs all around the globe quickly forked the removed projects, and the Chinese police have no authority over many of them. So censorship efforts tend to be undermined by swarms of angered devs all around the globe. Interestingly, hardship brings interesting fruits: Chinese VPN/proxy/other advanced networking software is a masterpiece, beating easy-to-detect-and-block "western" software to dust. Sometimes it can even communicate quite reliably over rather shitty connections and so on, using unorthodox tricks to improve the overall experience. 3) Nobody forces people to use GitHub. They only use it because it is convenient, free of charge, and there are a ton of devs & related people around, which improves the chance a project gets some attention from "outsiders". Git on its own is really neutral about how you do development. You can even exchange commits using floppy disks, if you prefer it that way. However, you'd need a team that is willing to work that way, otherwise cooperation won't happen. 4) Feel free to compete with GitHub if you think you can do it better. But, realistically, I'd have a problem naming a competitor. Maybe GitLab, but its user interface is awkward and it feels more like corporate groupware than a tool to work with source. ...and if someone wants real reasons to worry about GitHub, here is one: it was bought by MS!!! So... 1) The NSA could and would spy on it, getting access to stockpiles of data. Though if they were really curious they could obtain comparable data by other means; it just takes more effort than (ab)using some "hub". 2) The privacy policy will deteriorate over time for the reason mentioned. Say, you already can't register on GH using Tor. Sure, there are countless ways around that, but it raises the bar a bit. 3) MS has historically considered open source a kind of "free lunch". So they may try to police e.g. software licensing - they are known to dislike GPL/LGPL, where the license demands sharing improvements rather than just getting away with a free lunch. So after being bought by MS, GH faced quite an exodus of open-source projects that care about this aspect. But they mostly went to GitLab, and it feels like a downgrade. I'd say it was kind of an "emergency" measure to get rid of such an unpleasant neighbor as MS, but it has its price. So far it's hard to imagine a decent competitor to GitHub: they have a nice and clean UI, it's all about the source, everything else is secondary. Great idea, reasonable implementation. No other service comes close.
    1 replies | 241 view(s)
  • xcrh's Avatar
    22nd August 2019, 10:13
    xcrh replied to a thread AES Alternatives in The Off-Topic Lounge
    I can imagine the following advantages of Salsa/ChaCha: 1) Rather simple and fast. Even more or less OK on small MCUs and the like, such as Cortex-M. It can be better than AES in terms of speed and/or program memory & RAM requirements, as it mostly boils down to pure math with a fairly small state. It's possible to trade security for speed via the number of rounds, and even the reduced-round versions haven't suffered major breakdowns (however, extra margin never hurts - it can eventually save the day). 2) Designed by independent cryptographers who are experts in the area and long-known proponents of crypto. It's hard to expect foul play from someone like DJB and his fellow cryptographers. 3) There are numerous implementations, and these have been used for some years already without major known problems, as long as the implementations are correct. DJB himself has a fairly brief and readable implementation in the tweetnacl library, where the whole library doing public-key crypto, signatures and symmetric crypto fits in about 100 (140-character) tweets. 4) Plenty of cryptanalysis has been done; the full versions have not faced any major problems so far, and even the faster, lower-round versions have faced only mild attacks, fairly useless in practice. So I'd say Salsa/ChaCha (they are quite similar in design) are fairly good things, especially when you lack hardware AES. And actually, the whole idea of trusting hardware that does hell knows what, implemented hell knows how, jeopardizes security. Even if the implementation lacks backdoors, it can have unexpected data leak paths (e.g. timing attacks) and so on. As Spectre/Meltdown has shown us, CPU manufacturers are willing to take quite debatable shortcuts in pursuit of speed, and with no implementation sources available for scrutiny, the implementation is not to be trusted. At which point Salsa/ChaCha get the advantage of being fast even without HW acceleration, while the implementation can be thoroughly examined. p.s. The ECRYPT challenges, and also the SUPERCOP benchmark and the initiatives around it, got a ton of various algorithms, including many things people have never heard of. Some algorithms fell apart in the early phases, some lasted longer. Before rushing to use a cool new algorithm, one has to consider the fact that it could be relatively poorly researched - and therefore eventually broken. Though that is relatively unlikely for algorithms surviving a few phases of such competitions, it can still happen. So the more an algorithm is used and researched, the better (as long as no major flaws are found).
    7 replies | 181 view(s)
  • rarkyan's Avatar
    22nd August 2019, 07:14
    I need some time to validate the idea. I know I lack too much information, so I need help from the experts. I still have several formulas, and they need to be proven to fail. Let me use my own ideas, and let me see my own mistakes. Can anyone help me make a little experiment on the pattern? I attached a 10,000-byte file. I need help to cut the hex editor view into only 1 column, like this: I want to know how often each hex value fills every row. Anyone?
    193 replies | 74344 view(s)
  • Shelwien's Avatar
    22nd August 2019, 05:56
    > I just want to find another way to deal with the pattern in lossless compression.
    That's ok, but you lack too much information, so you can't find any valid ideas. It's like wanting to beat 5G protocols while only knowing that a smartphone is a shiny black box. At the very least you have to understand some basic concepts like information, entropy, probability, combinatorics, enumeration, and Kolmogorov complexity. The funny thing is that it's possible to design successful compression algorithms without any mathematical foundation - but only for known types of compression algorithms, like LZ77 or RLE. While what you seem to want - compressing data which is usually incompressible - is actually much harder to do and requires much more knowledge from all areas of computer science.
    193 replies | 74344 view(s)
  • rarkyan's Avatar
    22nd August 2019, 04:25
    Because life is like a puzzle, sir. I mean, some pieces lack information, others have a lot of information but maybe miss a tiny piece of it. I'm not assuming my information is useful, because the big pieces already contain what they need. But maybe in some cases a little rusty bolt is still needed to make the whole engine work. Another thing: I actually need help from mathematicians or programmers to work out the method, because they have experience in this field of study. I only propose an idea and learn from the feedback whether it fails or not. Try to find another path when one is blocked: how to get there, what if we use this or that way, etc. Humans just try to evolve. A problem surely needs a method to solve it. I don't know, but when someone tries something new to develop a good thing and the others just say "stop, it's useless" - well, maybe we would never have held a smartphone nowadays. My apologies for being stupid, but I really don't want to mess up this forum. I just feel like I'm sleeping in a big hall, surrounded by experts doing their great work. Somehow I can learn from everyone here. Even if it's very hard for me to understand, I want to try and I need help. Forget it. I just want to find another way to deal with the pattern in lossless compression.
    193 replies | 74344 view(s)
  • Shelwien's Avatar
    22nd August 2019, 03:37
    Shelwien replied to a thread Introducing... in Data Compression
    We don't really have much information about repacks here. It's more about finding the right IS plugins/scripts, container analysis and so on; you don't really need programming or a deep understanding of data compression. Here, I googled some links for you, but I don't know where to find a complete tutorial in English. https://reverseengineering.stackexchange.com/questions/18620/how-to-unpack-inno-setup-bundles-with-arcsrep-data http://web.archive.org/web/20161125232839/http://freearc.org/InnoSetup.aspx https://tech.tiq.cc/2013/05/how-to-make-a-repack-of-a-game/ https://www.fileforums.com/showthread.php?t=99273
    1 replies | 167 view(s)
  • Gaudy's Avatar
    22nd August 2019, 02:19
    Gaudy started a thread Introducing... in Data Compression
    Hi everybody, I have usually worked with game ROMs (12 years) and at the moment I manage projects of nearly 200 GB, but when I discovered FreeARC I fell in love. Since then I have been interested in learning a bit about the world of lossless data compression. Now, I want to be clear: I know nothing about this (nothing). I want to introduce myself because I would like to experiment with FreeARC customs like Fox Kompressor, but I don't know how to decompress those files, and I want to generate some SFX, etc... Maybe I should start by studying the concepts (mask, addon, CLS, the difference between method and format, advantages and disadvantages of each algorithm, filters, etc.). I've been looking, but this information is generally more programmer-oriented, rather than for people like me. Could you help me, or recommend what I can read that isn't so exhaustive?
    1 replies | 167 view(s)
  • Shelwien's Avatar
    22nd August 2019, 01:23
    okay, let's make it specific: openssl enc -e -aes-128-cbc -nopad -in zeroes -out zeroes.aes
    193 replies | 74344 view(s)
  • WinnieW's Avatar
    21st August 2019, 23:09
    You have to take the cipher mode into account when you perform such an encryption. Some modes add additional randomness, others – like ECB – don't.
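    A small sketch of that difference, assuming the pycryptodome package: encrypting a megabyte of zeroes in ECB mode yields identical 16-byte blocks (still very compressible), while CBC chaining produces PRNG-like output:

        import zlib
        from Crypto.Cipher import AES  # assumption: pycryptodome is installed

        key, zeros = bytes(16), bytes(1 << 20)
        ecb = AES.new(key, AES.MODE_ECB).encrypt(zeros)
        cbc = AES.new(key, AES.MODE_CBC, iv=bytes(16)).encrypt(zeros)
        print(len(zlib.compress(ecb)))  # tiny: every ciphertext block is identical
        print(len(zlib.compress(cbc)))  # ~1 MiB: output looks random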
    193 replies | 74344 view(s)
  • Shelwien's Avatar
    21st August 2019, 22:25
    1. The counting argument is not something to "solve"; it's just a visualization of the obvious fact that there are more n-bit numbers than there are (n-1)-bit numbers. It only says that it's impossible to compress all n-bit files without expanding some others. It doesn't say that you can't compress some random-looking files like jpegs or other compressed formats. The problem is that people tend to underestimate the number of files without any useful patterns in them. It may be counter-intuitive, but (n-1)/n of all n-bit files would have near-equal counts of 0s and 1s, for example. And when a file doesn't have any patterns to identify it with, it also means that it's likely to be expanded rather than compressed. 2. Actually, we can safely pretend that random data doesn't exist on a PC. 99%+ of the files one can download from the net won't be random, even if random-looking. Their compression can also be considered solved for any personal purpose - there are enough free hosting options that can be used to store the data forever and just keep a hash for identification. But why do you think that it's possible to losslessly compress a file without any programming skill, just with simple arithmetic? Let's say we have a megabyte of zeroes encrypted with AES using some unknown password. Would you compress it by finding the password, or would you believe that there's a magical formula that can do it some other way?
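    Putting numbers on both points, a small sketch: there are 2**n n-bit files but only 2**n - 1 files shorter than n bits, and files with near-balanced bit counts dominate:

        from math import comb

        n = 100
        shorter = sum(2**k for k in range(n))          # 2**n - 1 < 2**n
        balanced = sum(comb(n, k) for k in range(40, 61)) / 2**n
        print(2**n - shorter)                          # 1: the pigeonhole gap
        print(balanced)                                # ~0.96 of all 100-bit files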
    193 replies | 74344 view(s)
  • fhanau's Avatar
    21st August 2019, 22:18
    Hmm, I'm not sure why this is happening. ECT was initially developed with speed improvements over zopfli in mind, not maximum compression. It has also only been tested extensively on PNG files, so this might be a case where ECT is not well tuned for the data encountered here. It might help to examine the zip file and see whether the block splitting differs from zopfli's, or whether ECT underperforms on small files.
    400 replies | 103554 view(s)
  • CompressMaster's Avatar
    21st August 2019, 20:14
    Completely agree. Of course I've read the counting argument and also the random data compression thread, and I think that everything COULD be possible (although there are some limits, as always, and limits apply to other things too), even if some things are harder to compress than others. I'm not a believer in lossless infinite compression, because some information MUST be stored, of course, but if we become able to express even random patterns with less information, then we will be able to compress even random data such as SHARND much better.
    193 replies | 74344 view(s)
  • Shelwien's Avatar
    21st August 2019, 18:40
    > then you can simply combine these two to get a distribution
    > you can encode the data according to (with an entropy coder like AC)?
    Yes. Or just encode the compressor's output to add redundancy. Here's an example that uses this approach to re-encode data to a specified alphabet: http://nishi.dreamhosters.com/u/marc_v1.rar (like base64, but for any alphabet). So in the case of ECC, for example, we can generate an alphabet containing only codes with Hamming distance >N, and thus detect N-bit errors.
    > Interesting. Although surely it could be better to incorporate some kind of
    > error correcting process into the approximation of the data-generating
    > distribution itself?
    It would be much more limited (and redundant) because of precision issues. Also, it's not too easy to design good codecs even without ECC considerations, so solutions like taking an existing codec (e.g. lzma) and tweaking its entropy model seem much more practical.
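    A greedy Python sketch of the ECC alphabet idea above: collect byte codewords whose pairwise Hamming distance is at least d, so up to d-1 flipped bits are detectable when only these codes are emitted (the greedy choice is illustrative, not an optimal code construction):

        def hamming(a, b):
            return bin(a ^ b).count("1")

        def build_alphabet(bits=8, min_dist=3):
            alphabet = []
            for cand in range(1 << bits):
                if all(hamming(cand, c) >= min_dist for c in alphabet):
                    alphabet.append(cand)
            return alphabet

        codes = build_alphabet()
        print(len(codes))   # number of usable 8-bit codewords at distance >= 3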
    3 replies | 187 view(s)
  • dougg3's Avatar
    21st August 2019, 17:37
    I never figured it out. I realized that it's some kind of custom CPU as well, so I'm not sure how much I would have been able to do with the decompressed firmware anyway.
    11 replies | 2126 view(s)
  • h0m3us3r's Avatar
    21st August 2019, 16:37
    Was anyone able to figure this one out?
    11 replies | 2126 view(s)
  • tbird's Avatar
    21st August 2019, 16:07
    To make sure I understand this point - are you saying that if you had knowledge of the (possibly approximate) data-generating distribution and also knowledge of the ECC you wish to use, then you can simply combine these two to get a distribution you can encode the data according to (with an entropy coder like AC)? Interesting. Although surely it could be better to incorporate some kind of error correcting process into the approximation of the data-generating distribution itself? Assuming we don't have access to the true data-generating distribution.
    3 replies | 187 view(s)
  • Shelwien's Avatar
    21st August 2019, 13:28
    Well, it can be important for any kind of analog channel or storage (TLC+ SSD), since it's common to use both compression and ECC there. But I think compression and ECC are normally separate layers. Also, any full-precision arithmetic coder can be combined with ECC easily enough (AC decoding can be used to produce data fitting any given model), so there's little point in considering higher-level compression algorithms, since error correction can be solved at the level of the entropy coder. But some formats (jpeg, mp3) do in fact provide an option to re-sync after a data error. Then there are some compression algorithms based on steganography, which is kind of related.
    3 replies | 187 view(s)