
Thread: Block Cipher to produce more skewed or just different distributions of symbols ?

  1. #1
    Member
    Join Date
    Apr 2012
    Location
    London
    Posts
    263
    Thanks
    13
    Thanked 0 Times in 0 Posts

    Block Cipher to produce more skewed or just different distributions of symbols ?

    Can a block cipher ( n bits input, n bits output ) ever produce a distribution of symbols different from the input distribution of symbols ?

  2. #2
    Member
    Join Date
    Nov 2013
    Location
    Kraków, Poland
    Posts
    767
    Thanks
    236
    Thanked 244 Times in 149 Posts
    An entropy coder (an accurate one: AC/ANS), with encoder and decoder swapped, can transform a random-looking sequence of bits (e.g. encrypted data) into a sequence of symbols/bits of practically any chosen statistical model ... without making it easier to decrypt.

    Something like that can be done for steganography/watermarking ( https://en.wikipedia.org/wiki/Steganography ) applications, e.g. encrypting data as a QR code (0/1 sequence) resembling a chosen image (the grayness of a pixel determines the probability of using '1' in the code for that pixel): http://ieeexplore.ieee.org/xpl/login...number=7348695

    Examples of such codes obtained with https://github.com/JarekDuda/PRcodes/ ; the rate is the number of hidden bits per pixel:
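A minimal sketch of the "decoder as generator" idea (my own illustration; it uses a simple prefix code for pairs of bits instead of AC/ANS, so the achieved skew is only a dyadic approximation of the target): running the *decoder* of a code built for a skewed source on uniformly random bits yields output following roughly that skewed model, and re-encoding recovers the consumed bits exactly.

```python
import random

# Prefix code for pairs of bits from a skewed (Pr(1) ~ 0.25) source:
# "00" is most likely and gets the shortest codeword.
code = {"00": "0", "01": "10", "10": "110", "11": "111"}
decode = {v: k for k, v in code.items()}

def decoder_as_generator(bits):
    """Decode a random bitstring as if it were compressed data."""
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in decode:          # prefix-free, so match is unambiguous
            out.append(decode[buf])
            buf = ""
    return "".join(out)            # trailing incomplete codeword is dropped

def encoder(symbols):
    """Re-encode pairs of output bits back into the original stream."""
    return "".join(code[symbols[i:i+2]] for i in range(0, len(symbols), 2))

random.seed(0)
cipher_like = "".join(random.choice("01") for _ in range(100000))
skewed = decoder_as_generator(cipher_like)
ones = skewed.count("1") / len(skewed)
print(f"Pr(1) in output: {ones:.3f}")   # ~0.31 instead of 0.50
# Round trip: encoding the skewed sequence recovers the consumed bits.
assert cipher_like.startswith(encoder(skewed))
```

With a real AC/ANS coder the output distribution can match an arbitrary (non-dyadic) model, not just this 0.3125 approximation.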


  3. Thanks:

    Mike (22nd March 2020)

  4. #3
    Member
    Join Date
    Apr 2012
    Location
    London
    Posts
    263
    Thanks
    13
    Thanked 0 Times in 0 Posts
    It would appear that input symbols of 49 1s and 51 0s could not possibly be encrypted into 100 bits of,
    e.g., 40 1s and 60 0s ?

  5. #4
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,552
    Thanks
    767
    Thanked 684 Times in 370 Posts
    any sane cipher produces white noise with an equal distribution of byte values

  6. #5
    Member
    Join Date
    Nov 2013
    Location
    Kraków, Poland
    Posts
    767
    Thanks
    236
    Thanked 244 Times in 149 Posts
    Sure, enforcing some distribution (constraints, e.g. statistical) on the encoding sequence comes at the cost of reduced information content - the "rate" in the example above says how many bits per constrained bit we can hide there.

    The theoretical limit is just Shannon entropy: enforcing probability p of '1' in a 0/1 sequence, we can store at most h(p) = -p lg(p) - (1-p) lg(1-p) bits/value.
    It is achieved with an entropy coder, but requires that both encoder and decoder know the statistical model (e.g. p here, or the image to resemble for the QR code).
    This requirement can be weakened so that only the encoder knows the statistical model (p, or the image here), still approaching the same theoretical boundary (at a higher computational cost for the encoder) - a generalization of the Kuznetsov-Tsybakov problem; see the github repo and paper above.
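For concreteness, the capacity h(p) from the formula above is easy to evaluate (a small illustration, not from the thread):

```python
from math import log2

def h(p):
    """Binary Shannon entropy: max hidden bits per constrained bit."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

# Enforcing Pr('1') = 0.3 on the visible sequence still lets us
# hide about 0.881 bits of payload per constrained bit:
print(round(h(0.3), 3))   # 0.881
print(round(h(0.5), 3))   # 1.0 -- unbiased, no constraint, full rate
```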

  7. #6
    Member
    Join Date
    Apr 2012
    Location
    London
    Posts
    263
    Thanks
    13
    Thanked 0 Times in 0 Posts
    In a few simple words, how does the encoder need to encode ( knowing the defective cells ) so that the decoder can correctly decode ( not knowing the defective cells ) ?

  8. #7
    Member
    Join Date
    Nov 2013
    Location
    Kraków, Poland
    Posts
    767
    Thanks
    236
    Thanked 244 Times in 149 Posts
    This is the Kuznetsov-Tsybakov problem. You can see it as follows: we have some 'freedom bits' chosen to satisfy the constraint, and some complex bijection T:
    the encoder searches for 'freedom bits' such that
    T(encoded bits|freedom bits) satisfies the constraints; the decoder just inverts T and discards the 'freedom bits'.

    This would reach the theoretical limit, but requires exponential computational cost ... however, approximations turn out quite good, see the paper.
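A toy brute-force version of this scheme (my own sketch, with names chosen for illustration: T is just a keyed random permutation of 16-bit words, and the constraint is "at most 6 ones in the 16-bit codeword"):

```python
import random

N, M = 16, 10             # 16-bit codeword: 10 payload bits + 6 freedom bits
K = N - M
rng = random.Random(42)   # shared key/seed known to encoder and decoder
perm = list(range(1 << N))
rng.shuffle(perm)                      # T: a keyed bijection on 16-bit words
inv = [0] * (1 << N)
for i, p in enumerate(perm):
    inv[p] = i                         # T^-1

def satisfies(y, max_ones=6):
    """The constraint the visible codeword must meet."""
    return bin(y).count("1") <= max_ones

def encode(msg):
    """Search the freedom bits until T(freedom|msg) meets the constraint."""
    for f in range(1 << K):
        y = perm[(f << M) | msg]
        if satisfies(y):
            return y
    raise ValueError("no freedom setting met the constraint")

def decode(y):
    """Invert T and discard the freedom bits."""
    return inv[y] & ((1 << M) - 1)

msg = 0b1011001110
y = encode(msg)
assert satisfies(y) and decode(y) == msg
print(f"codeword {y:016b} ({bin(y).count('1')} ones) -> msg {decode(y):010b}")
```

The exponential cost shows up as the search over 2^K freedom settings; the approximations in the paper avoid exactly this search.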

  9. #8
    Member
    Join Date
    Apr 2012
    Location
    London
    Posts
    263
    Thanks
    13
    Thanked 0 Times in 0 Posts
    Supposing an input of 49 1s and 51 0s, with 29 1s and 21 0s in the 1st 50 bits & 20 1s and 30 0s in the 2nd 50 bits .... will a block cipher always produce white noise, e.g. 24 1s and 26 0s in the 1st 50 bits & e.g. 25 1s and 25 0s in the 2nd 50 bits ?

    Does a block cipher always produce exactly the same 49 vs 51 output symbol frequencies ?
    If so, would the input's existing 2-bit-unit symbol frequencies ( ie of 00s, 01s, 10s, 11s ) ALSO be exactly preserved in the block cipher's output, along with the 49-vs-51 single-bit symbol frequencies ?
    Last edited by LawCounsels; 22nd March 2020 at 18:53.

  10. #9
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,488
    Thanks
    26
    Thanked 130 Times in 100 Posts
    I think that the widely used encryption algorithms tend to make the output distribution nearly uniform, effectively prohibiting compression. They are also as unpredictable as possible, because predictable encryption algorithms are by definition weak. Therefore you cannot rely on any particular structure of an encrypted file.

  11. #10
    Member
    Join Date
    Apr 2012
    Location
    London
    Posts
    263
    Thanks
    13
    Thanked 0 Times in 0 Posts
    Do you have URLs to such block cipher .exe files ( simplest ones, N bits in, N bits out ) to quickly and easily verify that this is so ( that they produce white-noise binary output distributions ) for any skewed N-bit input ?

    It had seemed, in theory, that a block cipher could only maintain exactly the same binary output symbol frequencies as the binary input symbol frequencies . Suggest one such simple block cipher scheme with white-noise output symbol distributions

    It had also seemed that the original input's 2-bit-unit symbol frequencies are NOT exactly preserved, except insofar as they fall within the usual fluctuation ranges dictated by the exactly preserved output binary symbol distributions
    Last edited by LawCounsels; 23rd March 2020 at 02:06.

  12. #11
    Member
    Join Date
    Nov 2013
    Location
    Kraków, Poland
    Posts
    767
    Thanks
    236
    Thanked 244 Times in 149 Posts
    Yes, standard ciphers should produce a completely random-looking sequence of bits: i.i.d. Pr(0)=Pr(1)=1/2, carrying 1 bit/bit (rate=1).
    But if needed, their output can e.g. be transformed into a sequence of any chosen statistical model (using reversed entropy coding or Kuznetsov-Tsybakov) - without making it easier to decrypt, at the cost of a longer encoded sequence: the original length has to be divided by the rate.
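The length overhead is easy to quantify (a small illustration of my own, reusing the h(p) formula from earlier in the thread): to re-shape B uniformly random cipher bits into a sequence with Pr('1') = p, the encoding must grow to about B / h(p) bits.

```python
from math import log2

def h(p):
    """Binary Shannon entropy, the rate in hidden bits per constrained bit."""
    return -p * log2(p) - (1 - p) * log2(1 - p)

B = 1000  # cipher output bits to disguise
for p in (0.4, 0.3, 0.1):
    print(f"p={p}: need about {B / h(p):.0f} constrained bits")
```

The milder the skew, the smaller the expansion; at p = 0.5 the rate is 1 and no expansion is needed.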

  13. #12
    Member
    Join Date
    Apr 2012
    Location
    London
    Posts
    263
    Thanks
    13
    Thanked 0 Times in 0 Posts
    Yes

    But that is NOT applicable when it is strictly N bits input, N bits output

  14. #13
    Member
    Join Date
    Nov 2013
    Location
    Kraków, Poland
    Posts
    767
    Thanks
    236
    Thanked 244 Times in 149 Posts
    In N constrained bits you can hide N·rate bits of information, where the rate is the average Shannon entropy of these constraints.

    If you want to store N bits of information in N bits, then there is no room for constraints; you can only apply some 2^N -> 2^N bijection to them.

  15. #14
    Member
    Join Date
    Apr 2012
    Location
    London
    Posts
    263
    Thanks
    13
    Thanked 0 Times in 0 Posts
    Can there even be one simplest N-bits-to-N-bits block cipher ( .exe URL link ) where the output symbol frequencies do NOT exactly preserve the input symbol frequencies ?

  16. #15
    Member
    Join Date
    Nov 2013
    Location
    Kraków, Poland
    Posts
    767
    Thanks
    236
    Thanked 244 Times in 149 Posts
    An N-to-N-bit cipher is a bijection - for every input sequence there exists exactly one output sequence.

    Hence there exists e.g. an input leading to an all-zeros output, so YES - output frequencies can be very different from the input.

    However, these are relatively rare cases; on average, and typically ( https://en.wikipedia.org/wiki/Typical_set ), they have the same i.i.d. Pr(0)=Pr(1)=1/2 frequencies.
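This is easy to see with a toy cipher (my own sketch: a small hash-based Feistel network on 32-bit blocks, not a real cipher): because encryption is a bijection, we can *decrypt* the all-zero block to exhibit an input whose ciphertext is all zeros, while a heavily skewed input still encrypts to roughly balanced bits.

```python
import hashlib

def F(half, key, r):
    """Round function: hash of (key, round, half), truncated to 16 bits."""
    d = hashlib.sha256(f"{key}-{r}-{half}".encode()).digest()
    return int.from_bytes(d[:2], "big")

def encrypt(block, key, rounds=8):
    """Toy 32-bit Feistel cipher: a bijection on 32-bit blocks."""
    L, R = block >> 16, block & 0xFFFF
    for r in range(rounds):
        L, R = R, L ^ F(R, key, r)
    return (L << 16) | R

def decrypt(block, key, rounds=8):
    L, R = block >> 16, block & 0xFFFF
    for r in reversed(range(rounds)):
        L, R = R ^ F(L, key, r), L
    return (L << 16) | R

key = "secret"
# By bijectivity, the input that encrypts to the all-zero block exists:
x = decrypt(0, key)
assert encrypt(x, key) == 0
# A maximally skewed input (a single '1' bit) encrypts to ~balanced bits:
y = encrypt(1, key)
print(f"{bin(y).count('1')} ones out of 32")
```

So the all-zeros case above is real but has to be constructed deliberately; typical outputs look balanced.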

  17. #16
    Member
    Join Date
    Apr 2012
    Location
    London
    Posts
    263
    Thanks
    13
    Thanked 0 Times in 0 Posts
    Yes, very rare exceptions

    If the input symbol ratio is 30 vs 70 ( or 50 vs 50 ), then overall, in general,
    the output symbol ratio will be exactly the same 30 vs 70 ( or 50 vs 50 )

