Thread: Random Data Question.

  1. #1
    Member
    Join Date
    Jun 2008
    Posts
    26
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Random Data Question.

    So I have been messing with different ideas in the spirit of random data compression... and yes, it has driven me slightly mad.

    I was wondering if anyone had some ideas to make use of the following scenario.

    You have, say, 1k of random bytes, but one value, any value, say 42, will never appear in the data (yes, there is some overhead in providing a free value).

    So you can insert this number to denote "I'm changing something now, or in the next piece/pieces of data".

    Can anyone think of anything useful for that?

    My previous idea involved a pseudo-random sequence: whenever the next data item matched your pre-generated sequence, it was replaced with the 42, increasing the redundancy in the data with every match.

    So, if anyone has any thoughts...

    Trib
    Last edited by LovePimple; 14th June 2008 at 04:04.
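    The substitution scheme described above can be sketched in Python. This is only an illustration of the post's premise, not anyone's actual implementation: FREE, encode, and decode are made-up names, and Python's random module stands in for the "pre-generated pseudo-random sequence".

```python
import random

FREE = 42  # the byte value guaranteed absent from the input (the post's premise)

def encode(data: bytes, seed: int) -> bytes:
    """Replace every byte that matches the pseudo-random prediction with FREE."""
    rng = random.Random(seed)
    out = bytearray()
    for b in data:
        pred = rng.randrange(256)
        out.append(FREE if b == pred else b)
    return bytes(out)

def decode(data: bytes, seed: int) -> bytes:
    """Reverse the substitution: FREE means 'the prediction was correct'."""
    rng = random.Random(seed)
    out = bytearray()
    for b in data:
        pred = rng.randrange(256)
        out.append(pred if b == FREE else b)
    return bytes(out)
```

    The round trip is lossless precisely because FREE never occurs in the input, so any FREE in the encoded stream unambiguously means "match". Note this only redistributes symbol frequencies; it does not by itself shrink the data.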

  2. #2
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,237
    Thanks
    192
    Thanked 968 Times in 501 Posts
    With a single missing value in 1k of near-random data you'd be able to compress that 1k
    into Log2(255^1024)/8 = 1023.28 bytes (8186.22 bits), so a whole 5 bits can be saved.
    That's the best you can do for near-random data, and talk of using it for alphabet
    extension or the like is just too sad nowadays.

    Check out:
    http://compression.ru/sh/marcdemo.rar
    http://compression.ru/sh/sh002.rar http://compression.ru/sh/sh002s.rar
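    Shelwien's figures can be reproduced directly: 1024 symbols drawn from a 255-value alphabet carry 1024 * log2(255) bits of information. A quick check:

```python
import math

n = 1024          # bytes of near-random data
alphabet = 255    # one of the 256 byte values never occurs

bits = n * math.log2(alphabet)  # information content in bits
print(f"{bits:.2f} bits = {bits / 8:.2f} bytes")   # 8186.22 bits = 1023.28 bytes
print(f"saved vs. {8 * n} bits: {8 * n - bits:.2f} bits")  # about 5.78 bits
```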

  3. #3
    Member
    Join Date
    Jun 2008
    Posts
    26
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Data

    I did leave some information out of the question, but its main premise is not based entirely on probability for a result.

    My intention was to prompt some speculation, or possibilities to try.

    For example, if you like probability:

    Pi is infinite, so the entire dataset you wish to compress could be equated to an offset from the start of Pi.

    Or alternately, as the offset would be so HUGE (read: bigger than the data itself), you could choose a sequence in Pi that gives an offset that is your data. Whichever takes less space. lol

    Let's not get distracted with the Pi stuff.

    Anyway, the main point I would like to get across is: please speculate on any ideas.

    I was hoping for some cool suggestions.

    Trib.
    Last edited by Tribune; 13th June 2008 at 18:35.

  4. #4
    Tester
    Black_Fox's Avatar
    Join Date
    May 2008
    Location
    [CZE] Czechia
    Posts
    471
    Thanks
    26
    Thanked 9 Times in 8 Posts
    Hi Tribune,
    Shelwien and some other people on this forum have 'Expert' under their nickname because they don't have to speculate; they know precisely.
    I am... Black_Fox... my discontinued benchmark
    "No one involved in computers would ever say that a certain amount of memory is enough for all time. I keep bumping into that silly quotation attributed to me that says 640K of memory is enough. There's never a citation; the quotation just floats like a rumor, repeated again and again." -- Bill Gates

  5. #5
    Member
    Join Date
    Jun 2008
    Posts
    26
    Thanks
    0
    Thanked 0 Times in 0 Posts
    There must be alternate methods, as in the example I hinted at. Given random seed X, you may be able to generate enough matches against your random 1k to exceed those 5 bits.

    If I remove 2 values, for example 127 and 255 ("7*1" and "8*1" in binary), then I know I can read just 7 bits: I'll never need to read an 8th bit if I get a 127, as 255 will not be in the stream either. That would save 1 bit for every match, if you wanted to ignore redundancy.

    The entries don't have to be consecutive either, as 127 is equivalent to the next match in your pre-generated list.

    So that method alone could perhaps produce some better results.

    Just some random thoughts, but obviously whatever is done would need to result in a lower bit count for the 1k stream.

    Trib
    Last edited by Tribune; 13th June 2008 at 18:54.
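    For reference, the information-theoretic bound when byte values are excluded from the alphabet can be computed the same way as in post #2. This sketch (my own check, not from the thread) shows that excluding two values buys roughly 11.6 bits over the whole 1k block, which is the ceiling for any scheme built on that premise:

```python
import math

n = 1024  # bytes in the block
for excluded in (0, 1, 2):
    alphabet = 256 - excluded
    bits = n * math.log2(alphabet)
    print(f"{excluded} excluded value(s): {bits:.2f} bits "
          f"(saves {8 * n - bits:.2f} bits)")
```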

  6. #6
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 778 Times in 485 Posts
    Looking for an algorithm to compress random data will just waste your time in a fruitless search. It is proven that no such algorithm exists.

    http://datacompression.dogma.net/ind..._and_others%29
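    The proof Matt refers to is the standard counting (pigeonhole) argument: there are 2^n bit strings of length n, but only 2^n - 1 strings strictly shorter, so no lossless code can shrink every input. The arithmetic in miniature:

```python
# Pigeonhole check: 2**n inputs of length n cannot all map to
# distinct outputs shorter than n bits.
n = 4
inputs = 2 ** n                          # 16 strings of length n
shorter = sum(2 ** k for k in range(n))  # 1 + 2 + 4 + 8 = 15 shorter strings
print(inputs, shorter)  # 16 15 -- one input must not shrink
```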

  7. #7
    Member
    Join Date
    Jun 2008
    Posts
    26
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Matt, ty.

    Yes, that does seem like solid logical evidence, and good advice from you.

    I still find it a little interesting though.

    Trib.

  8. #8
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,976
    Thanks
    374
    Thanked 345 Times in 136 Posts
    Well, threads about compressing the incompressible have already killed all the data compression related forums, including the comp.compression newsgroup. So I will keep my forum clean, keeping only this thread, for reference...
