# Thread: Compressing pseudo random files

1. >>
It's basically the same brute-force search (well, Monte Carlo),
but implemented using elementary particles as memory and physical laws for operations.
So it's both faster than current electronics and massively parallel.

Is there an NP-hard problem that quantum computers cannot solve?

Like, whether quantum computers can brute-force it,
or otherwise, whether P=NP?

Can we really say quantum computers can never become a reasoning intelligence (like humans, or animals... what else can reason?) WITHOUT MORE?

2. 1. Currently available quantum computers cannot solve any practical problems yet.
Even the Fujitsu thing seems more practical.

2. NP-hard problems are basically exponential.
With quantum computing, in theory, this can be compensated by the level of parallelism, to an extent.
For example, the number of atoms in 1 kg of hydrogen is around 2^89... it should be possible
to test that many inputs in parallel.
Also, there seem to be around 2^75 chronons per second,
so that's 2^164 inputs checked per second (just the theoretical upper limit, though).
It becomes much harder to scale up after that - working for a year instead of a second only adds 25 bits.

Some specific problems (like RSA factoring) actually have lower complexity than the size of their input would normally imply,
but in general it should be like this.
I mean, a perfect quantum computer would be able to crack 100-150 bits more than normal CPUs, but that's it.
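The back-of-envelope numbers above can be sanity-checked; a sketch, assuming ~6x10^26 atoms in 1 kg of hydrogen (the 2^75 chronons-per-second figure is taken at face value):

```python
import math

# Atoms in 1 kg of hydrogen: moles * Avogadro's number
atoms = 1000 / 1.008 * 6.022e23
print(math.log2(atoms))               # ~88.9, i.e. roughly 2^89 atoms

# Scaling runtime from 1 second to 1 year:
print(math.log2(365.25 * 24 * 3600))  # ~24.9, i.e. only ~25 extra bits of search space
```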

3. >>Like whether quantum computers can brute-force,
or otherwise, whether P=NP?

>>Can we really say quantum computers can never become a reasoning intelligence (like humans, or animals... what else can reason?) WITHOUT MORE?

Before these hard questions: can quantum computers multiply two 2^60-bit numbers (assuming one has 2^120 qubits)?

4. Multiplication of two N-bit numbers basically has to be unrolled into a sum of N 2N-bit numbers.
And addition generates the result plus a carry vector.
So even with some optimization it would require ~2*N^2 intermediate bits,
which would be 2^121 in your case, so nope.

Of course, it would likely be possible to implement it as a series of additions on the interface side,
but that would be normal digital computing.
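The unrolling described above can be sketched as a toy model: schoolbook multiplication as a sum of shifted partial products, counting the intermediate bits generated along the way.

```python
def schoolbook_mul(a: int, b: int, n: int):
    """Multiply two n-bit ints as a sum of n shifted partial products.
    Returns the product and a count of intermediate partial-sum bits."""
    total, partial_bits = 0, 0
    for i in range(n):
        if (b >> i) & 1:
            total += a << i      # one up-to-2n-bit addition per set bit of b
        partial_bits += 2 * n    # each partial sum is up to 2n bits wide
    return total, partial_bits

prod, bits = schoolbook_mul(0b1011, 0b1101, 4)
print(prod)   # 143, i.e. 11 * 13
print(bits)   # 32 = 2 * 4^2, matching the ~2*N^2 estimate
```

For N = 2^60 the same estimate gives 2 * (2^60)^2 = 2^121 intermediate bits, as stated in the post.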

5. Can quantum computers reason about things in superposition (even backtrack in superposition), just like with variables?

If so, they're already better than humans.

6. Quantum computing weirdness comes from error correction.
Even normal modern electronics have something like a 10^-20 probability of a bit flipping in a circuit - that's why servers need ECC memory, etc.
And stuff like generating electrons with a specific spin is even less reliable - easily tens of percent of error.
So it gets compensated by mathematics.
In particular, entanglement is a statistical property - there are some processes that generate multiple particles at once,
so by analysing one of them we can predict the properties of the others.
Otherwise the idea is the same as with normal computers - there's no magic in quantum computing,
it's just (potentially) faster because it uses a lower level of physics.

> If so, they're already better than humans

Normal computers are already better anyway.
Quantum computers are actually still useless atm.

The problem is, where do we get the data to train our software?
Like, where do I download videos of the first 5 or so years of a baby's life?

As it is, computers beat humans in areas where we have reliable data.
But in the US/EU you can't sell software trained on copyrighted works, so we don't see many results.

7. Sportman, could you post your random test file? I want to test it. Thanks.

8. Suppose one is consistently able to reduce an input bitstring of e.g. 10,000 bits by 1 bit, 3/4 of the time.

How would you go about implementing an algorithm that practically and usefully reduces e.g. a large input file by a few bits?

For certain, this is definitely not simple or straightforward.

9. Originally Posted by CompressMaster
First post.
https://mega.nz/#!hxFTTCIA!BulO0GWml...yfsWLSiZi64V8E

10. ## Thanks:

CompressMaster (10th June 2019)

11. Originally Posted by LawCounsels
Suppose one is consistently able to reduce an input bitstring of e.g. 10,000 bits by 1 bit, 3/4 of the time.

How would you go about implementing an algorithm that practically and usefully reduces e.g. a large input file by a few bits?

For certain, this is definitely not simple or straightforward.
You need log2(C(400,300)) bits to distinguish the 300 'reduced by 1' blocks from the 100 'not reduced' ones, i.e. more than the 300 bits total reduced.

Perhaps some other method of grouping a few of these together and manipulating them??
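The counting argument can be checked directly; a minimal sketch, assuming 400 blocks of which 300 get shortened by 1 bit:

```python
import math

blocks, reduced = 400, 300
# Bits needed to record WHICH 300 of the 400 blocks were shortened:
index_cost = math.log2(math.comb(blocks, reduced))
print(index_cost)            # ~320.1 bits
print(index_cost > reduced)  # True: bookkeeping exceeds the 300 bits saved
```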

12. Any good general idea worth trying, anyone?

13. If you can compress even just one block in a file by a few bytes (3-4 is enough), it's pretty simple to compress the whole file.
For that, we can generate and compress a bitstring of filesize/blocksize bits which marks the compressible blocks.
Or we can use a unique signature for compressed blocks.

But probability theory won't let you get consistent compression on real random data.
For example, we can expect one of P random blocks to produce zero mod P - block%P=0.
So we can use that to compress this block by ~log2(P) bits.
But then we'd also need ~log2(P) bits to select the 1 block of P which is compressed.
Well, by tweaking constant parameters in the coder (like the P value) it's possible to
tune it to compress any one given file.
But you can't expect consistent compression of random data - it would work like I described
with _any_ function of the data - that's basically the definition of randomness.

Of course, most of the actual random-looking "incompressible" files one can find on a computer or the internet are not really random,
but rather generated by a PRNG, encrypted, compressed, or a combination of these.
So you'd need SAT to defeat a PRNG, SAT+cryptography for encrypted data, and recompression for compressed data.
Somewhat generic algorithms are also possible (like cdm) - based on common properties of multiple
generation algorithms (like the lack of order-1+ modelling in bitcodes, for cdm).
But no kind of mindless data manipulation will get you consistent compression of random data.
The only way is to target specific filetypes with specific solutions.
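The block%P=0 trick can be simulated; a sketch with hypothetical parameters (P=256, 64-bit random blocks):

```python
import random

P = 256
rng = random.Random(0)
trials = 300
hits = 0
for _ in range(trials):
    data = [rng.getrandbits(64) for _ in range(P)]
    # A random block is divisible by P with probability 1/P,
    # so among P blocks we expect about one hit per trial.
    hits += sum(1 for b in data if b % P == 0)
print(hits / trials)  # ~1.0 on average
```

Compressing that one block by log2(P) = 8 bits is cancelled out by the ~8 bits needed to say which of the 256 blocks it was, which is the point of the post.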

14. >>
But probability theory won't let you get consistent compression on real random data.

3/4 of the time 'reduced by 1 bit' (& decodes back correctly!)
is not 'random' anymore!!! (even if the underlying input file data was originally 'random')

15. Yes, but like that you'd need ((3/4)*log(3/4) + (1/4)*log(1/4)) / -log(2) = 0.811278 bits for the compression flag per block,
while saving ((3/4)*1 + (1/4)*0) = 0.75 bits per block.
Still no luck :)
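The same arithmetic in code: the entropy of the per-block flag versus the expected saving.

```python
import math

p = 3 / 4   # probability a block is 'reduced by 1 bit'
flag_cost = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
saving = p * 1 + (1 - p) * 0
print(round(flag_cost, 6))  # 0.811278 bits spent on the flag per block
print(saving)               # 0.75 bits saved per block -> a net loss
```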

16. Is any scheme devisable whereby the decompressor auto-detects the juncture at which the cumulative total of blocks reduced by 1 bit yields a NET overall saving of 1 or more bits? (The current cumulative 'surplus' fluctuates above/below the standard average...)

Would a current 'surplus' be sufficiently large to allow the simple expenditure of a few bits to record the particular juncture / block number?

17. It's basically what cdm does: https://encode.su/threads/2742-Compr...ll=1#post52493

But it looks for a clearly defined type of redundancy - occasional chunks with a significantly skewed bit probability distribution.
For example, some codecs write redundant block headers, even though most of their output is random.

18. >> Would a current 'surplus' be sufficiently large to allow the simple expenditure of a few bits to record the particular juncture / block number?

Initial results are indeed good!!! : )

Watch this space..

19. Originally Posted by Shelwien
Multiplication of two N-bit numbers basically has to be unrolled to a sum of N of 2N-bit numbers.
While slightly off-topic: multiplication of two N-bit numbers does not have complexity O(N^2), but O(N log N); see Knuth, "The Art of Computer Programming", Vol. 2, "Seminumerical Algorithms". The essential observation is that multiplication of N-bit numbers is similar to a convolution of two N-point sequences. Under Fourier transformation, convolution becomes pointwise multiplication, which is an O(N) operation, and the Fourier transformation itself is O(N log N), so multiplication has an upper-bound complexity of O(N log N).
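The convolution view of multiplication can be sketched as follows (the convolution step is written naively here for clarity; the FFT is what brings that step down from O(N^2) to O(N log N)):

```python
def mul_via_convolution(a_digits, b_digits, base=10):
    """Multiply numbers given as little-endian digit lists by convolving
    the digit sequences, then propagating carries in a single O(N) pass."""
    conv = [0] * (len(a_digits) + len(b_digits) - 1)
    for i, a in enumerate(a_digits):
        for j, b in enumerate(b_digits):
            conv[i + j] += a * b           # pointwise products, no carries yet
    out, carry = [], 0
    for c in conv:                         # carry propagation
        carry, digit = divmod(c + carry, base)
        out.append(digit)
    while carry:
        carry, digit = divmod(carry, base)
        out.append(digit)
    return out

# 12 * 13 = 156, digits little-endian:
print(mul_via_convolution([2, 1], [3, 1]))  # [6, 5, 1]
```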

20. ## Thanks:

Shelwien (10th June 2019)

21. Yes, but I was talking about a boolean decomposition of multiplication for SAT, so the specific complexity of that multiplication would be pretty hard to estimate.
For example, if we count the precomputation of coefficients for the Fourier transform, it might actually turn out slower than the naive implementation?

22. I had an idea for how to create a random file: let the Bitcoin Core "miners' rat race" block time decide the values. I used all blocks from the start till today; no idea if it's really random.

580,880 bytes - block time last digit from 580,880 Bitcoin core blocks

273,794 bytes gzip
273,053 bytes rar
260,849 bytes 7zip
243,518 bytes rz
241,975 bytes nz
241,586 bytes paq9pxd
241,486 bytes paq8px
241,426 bytes cmix
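If the last digits were uniform over 0-9, each ASCII digit would carry log2(10) ≈ 3.32 bits, giving a theoretical floor for this file of roughly 241,205 bytes, close to where the strongest compressors above land. A back-of-envelope sketch, assuming uniform digits:

```python
import math

n_digits = 580_880              # one ASCII digit per Bitcoin block
bits_per_digit = math.log2(10)  # ~3.32 bits if last digits are uniform
floor_bytes = n_digits * bits_per_digit / 8
print(round(floor_bytes))       # ~241,205 bytes
```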

23. Originally Posted by Sportman
I had an idea for how to create a random file: let the Bitcoin Core "miners' rat race" block time decide the values. I used all blocks from the start till today; no idea if it's really random.

580,880 bytes - block time last digit from 580,880 Bitcoin core blocks

273,794 bytes gzip
273,053 bytes rar
260,849 bytes 7zip
243,518 bytes rz
241,975 bytes nz
241,586 bytes paq9pxd
241,486 bytes paq8px
241,426 bytes cmix
I do not know why it is needed.
I compressed this file to 241,206 bytes:

241,486 bytes paq8px
241,426 bytes cmix
241,206 bytes nelson_dec2bin_v0

24. ## Thanks:

Sportman (15th June 2019)

25. Block time is every 10 minutes (600 seconds) on average (ranging between a few seconds and a few hours), so I expected more last digits close to 0, but as far as I know there is no (atomic/GPS) time sync in the network or in the miners' hardware.

26. 241,206 bytes - binary version of bcrnd.txt

241,444 bytes paq8pxd
241,423 bytes rz
241,344 bytes 7zip
241,335 bytes paq8px
241,279 bytes rar
241,277 bytes nz
241,274 bytes gzip
241,253 bytes cmix

27. Originally Posted by xinix
I do not know why it is needed.
To have a "true" random file for compression tests.

28. Originally Posted by Shelwien
Of course it's impossible to compress all files (without tricks like abusing the filesystem for this).
Can you explain it more?

29. There are various side channels for information in the file system, like names, attributes, timestamps, and extra streams.
Also, file data normally occupies an integer number of clusters, so it's kind of possible to store 8k of data in a file with a visible size of 4k+1.
So it's not easy to set up an objective compression evaluation.

30. Compressed by my method.
Time: 3 minutes (19.22-19.25)
Speed: 5.8 KB/s

31. @Sportman
2020 challenge? Please attach archive if possible.
