
Thread: A pattern in random data

  1. #61
    Member CompressMaster
    Join Date
    Jun 2018
    Location
    Lovinobana, Slovakia
    Posts
    216
    Thanks
    66
    Thanked 18 Times in 18 Posts
    A pattern in random data - there is ALWAYS some pattern; the problem is how to express it using less information (bytes) than the original...

  2. #62
    Administrator Shelwien
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    4,136
    Thanks
    320
    Thanked 1,397 Times in 802 Posts
    Enumerate all possible data strings with a specific pattern, then encode the index of the given string in that list.
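
As a rough sketch of that idea, assume the "pattern" is "a binary string of length n with exactly k ones" (my choice for illustration, not something specified in the post). The hypothetical helpers `rank` and `unrank` map such a string to its index in the lexicographically sorted list and back, without building the list:

```python
from math import comb


def rank(bits: str) -> int:
    """Index of `bits` among all strings with the same length and the same
    number of ones, taken in lexicographic order ('0' sorts before '1')."""
    n = len(bits)
    k = bits.count("1")
    index = 0
    for i, b in enumerate(bits):
        if b == "1":
            # Every string with '0' at this position (and k ones in the
            # remaining n - i - 1 positions) sorts before this one.
            index += comb(n - i - 1, k)
            k -= 1
    return index


def unrank(index: int, n: int, k: int) -> str:
    """Inverse of rank(): rebuild the string from its index, n and k."""
    out = []
    for i in range(n):
        zeros_first = comb(n - i - 1, k)  # strings that put '0' here
        if index < zeros_first:
            out.append("0")
        else:
            out.append("1")
            index -= zeros_first
            k -= 1
    return "".join(out)


if __name__ == "__main__":
    s = "0100000000100000"
    print(rank(s), unrank(rank(s), len(s), s.count("1")))
```

The index fits in about log2(comb(n, k)) bits, which is less than n whenever the pattern excludes enough strings. The decoder also has to know n and k, so those have to be sent or agreed on as well.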

  3. #63
    Member
    Join Date
    Sep 2018
    Location
    Philippines
    Posts
    121
    Thanks
    31
    Thanked 2 Times in 2 Posts
    The algorithm I described here is an attack on random data; it assumes, or works best on, random data:

    https://encode.su/threads/3024-Best-...ental-integers

    If I remember right, my gr32 program in the Miscellany page of "The Data Compression Guide" was designed with random data in mind. Somehow, when things got complicated, I just opted for repetition i/o codes. That's why I decided to write a decoder. It's like a failed BPE, with minuscule compression.

    https://sites.google.com/site/dataco...ide/miscellany

    Random Data quips here:

    https://grtamayoblog.blogspot.com/20...ndom-data.html

    Random data compression is futile unless, perhaps, you start from the most basic unit of information, the bit.
    Last edited by compgt; 15th May 2019 at 08:13.

  4. #64
    Member
    Join Date
    Jun 2018
    Location
    Yugoslavia
    Posts
    82
    Thanks
    8
    Thanked 6 Times in 6 Posts
    But wouldn't such a 'compressor', if it exists, then enlarge 'nonrandom data'? I don't see how that could be useful.

  5. #65
    Administrator Shelwien
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    4,136
    Thanks
    320
    Thanked 1,397 Times in 802 Posts
    Actually it would: switching codecs requires only 1 bit, so a hybrid can fall back to an ordinary codec at the cost of 1 extra bit.
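
A minimal sketch of such codec switching, under some assumptions of mine: zlib stands in for "the other codec", the fallback is plain storage, and the flag is kept as a whole byte for simplicity even though a single bit would do. `pack` and `unpack` are hypothetical names.

```python
import zlib

FLAG_STORED = 0      # payload is the raw input
FLAG_COMPRESSED = 1  # payload is zlib-compressed


def pack(data: bytes) -> bytes:
    """Try the codec, keep whichever result is shorter, and prefix a flag."""
    compressed = zlib.compress(data, 9)
    if len(compressed) < len(data):
        return bytes([FLAG_COMPRESSED]) + compressed
    return bytes([FLAG_STORED]) + data


def unpack(blob: bytes) -> bytes:
    """Read the flag and undo whichever branch pack() took."""
    flag, payload = blob[0], blob[1:]
    if flag == FLAG_COMPRESSED:
        return zlib.decompress(payload)
    return payload


if __name__ == "__main__":
    for sample in (b"abc" * 1000, bytes(range(256))):
        packed = pack(sample)
        print(len(sample), "->", len(packed), unpack(packed) == sample)
```

With this layout the output is never more than one flag unit longer than the input, whichever codec wins.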

    The problem is that too many of the possible data strings are random.
    I'd say only something like 2^(n/log(n)) of the 2^n strings (where n is the length of a binary string) are compressible.
    So it would be hard to encode all random (i.e. incompressible) strings with fewer than n bits.
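
For a feel of the numbers, here is a small sketch of the standard pigeonhole bound. Note that this is not the 2^(n/log(n)) estimate above, which is a rough guess about practically compressible strings. The bound only counts how many shorter outputs exist, and the function name is made up for illustration.

```python
from fractions import Fraction


def max_compressible_fraction(n: int, k: int) -> Fraction:
    """Upper bound on the fraction of n-bit strings that any lossless codec
    can shorten by at least k bits.

    There are only 2**(n - k + 1) - 1 binary strings of length 0..n-k, and
    distinct inputs need distinct outputs.
    """
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    shorter_outputs = 2 ** (n - k + 1) - 1
    return Fraction(shorter_outputs, 2 ** n)


if __name__ == "__main__":
    for k in (1, 8, 16, 32):
        print(k, float(max_compressible_fraction(1024, k)))
```

So fewer than 1 in 2^(k-1) strings can lose k or more bits, regardless of how clever the codec is.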

  6. #66
    Member
    Join Date
    Jun 2018
    Location
    Yugoslavia
    Posts
    82
    Thanks
    8
    Thanked 6 Times in 6 Posts
    Perhaps there is a finite amount of information in the universe, if good enough compression/deduplication/whatever is used. That theory of everything, or something.

  7. #67
    Member
    Join Date
    Sep 2018
    Location
    Philippines
    Posts
    121
    Thanks
    31
    Thanked 2 Times in 2 Posts
    Quote: Originally posted by pklat
    Perhaps there is a finite amount of information in the universe, if good enough compression/deduplication/whatever is used. That theory of everything, or something.
    Pklat, your "information" = data; "theory" = (set of) functions, or equations.

    The space of all possible 640x480 pictures, for example, is clearly finite. If you could somehow brute-force or generate all these pictures, there would be finite sets of very interesting ones carrying a lot of "information", such as pictures that show you as a child or at ten years old, and even what you are doing at this very moment. However, that is only a theoretical or logical play of the mind. You cannot catch up with this theoretical machine, this "camera in infinite time," but perhaps you could find the set of functions for a camera that would show only "the most important" events in your life.
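
To put a size on "finite" here, a quick sketch that counts the pictures. The 24 bits per pixel is my assumption, since the post only gives the 640x480 resolution. The count is 2 to the power of the total number of bits, so the code reports the exponent and the approximate number of decimal digits instead of printing the number itself.

```python
import math


def picture_space_bits(width: int, height: int, bits_per_pixel: int) -> int:
    """Number of bits in one picture; the number of pictures is 2**bits."""
    return width * height * bits_per_pixel


def decimal_digits_of_power_of_two(exponent: int) -> int:
    """Number of decimal digits of 2**exponent, computed via logarithms."""
    return math.floor(exponent * math.log10(2)) + 1


if __name__ == "__main__":
    bits = picture_space_bits(640, 480, 24)  # 24 bpp is an assumption
    print("pictures = 2 **", bits)
    print("decimal digits:", decimal_digits_of_power_of_two(bits))
```

Finite, but the count has over two million decimal digits, which is why the "generate them all" machine stays a thought experiment.
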
    Last edited by compgt; 17th May 2019 at 08:53.

  8. #68
    Member
    Join Date
    Sep 2018
    Location
    Philippines
    Posts
    121
    Thanks
    31
    Thanked 2 Times in 2 Posts
    Quote: Originally posted by CompressMaster
    A pattern in random data - there is ALWAYS some pattern; the problem is how to express it using less information (bytes) than the original...
    One slow early approach is to guess the pattern or symbol. Try to guess the input byte in fewer than 32 tries, so that a hit costs just 5 bits (the number of the successful try). (You can do this by randomly setting the bits of a dummy byte on or off and comparing it with the input byte.) If the byte is not guessed, output 00000 followed by the 8-bit byte. How would you initialize the dummy byte? Maybe by context, in a crude LZP-like way. What else? Build on this. Improve this.
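
A rough sketch of how this could look, with several assumptions filled in by me: the guesses come from a seeded PRNG shared by encoder and decoder (so the decoder can replay them), the 5-bit code is the 1-based number of the successful try, and there is no context-based initialization yet. The output is a string of '0'/'1' characters to keep the bit handling readable.

```python
import random

MAX_TRIES = 31  # codes 1..31 fit in 5 bits; 00000 is the escape code


def _guesses(rng: random.Random) -> list:
    """Draw the 31 pseudo-random dummy bytes used for one input byte."""
    return [rng.randrange(256) for _ in range(MAX_TRIES)]


def encode(data: bytes, seed: int = 0) -> str:
    rng = random.Random(seed)
    out = []
    for byte in data:
        guesses = _guesses(rng)
        if byte in guesses:
            # Hit: 5-bit number of the first successful try (1..31).
            out.append(format(guesses.index(byte) + 1, "05b"))
        else:
            # Miss: escape code followed by the literal 8-bit byte.
            out.append("00000" + format(byte, "08b"))
    return "".join(out)


def decode(bits: str, seed: int = 0) -> bytes:
    rng = random.Random(seed)  # same seed, so the same guesses are replayed
    out = bytearray()
    pos = 0
    while pos < len(bits):
        guesses = _guesses(rng)
        code = int(bits[pos:pos + 5], 2)
        pos += 5
        if code == 0:
            out.append(int(bits[pos:pos + 8], 2))
            pos += 8
        else:
            out.append(guesses[code - 1])
    return bytes(out)


if __name__ == "__main__":
    sample = bytes(range(256))
    coded = encode(sample)
    print(len(sample) * 8, "bits in,", len(coded), "bits out,",
          decode(coded) == sample)
```

With guesses that are independent of the input, at most 31 of the 256 byte values can hit, so most bytes cost 13 bits and the output grows. That is why the initialization question matters: the guesses would have to come from context to hit often enough.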
    Last edited by compgt; 7th January 2020 at 13:22.
