
Thread: Concealing data in deflate streams

  1. #1
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,267
    Thanks
    200
    Thanked 985 Times in 511 Posts

    Concealing data in deflate streams

    See http://en.wikipedia.org/wiki/Steganography

    The availability of multiple ways to encode the same data in LZ streams
    allows for their application as steganography containers.
    And now that I made a working deflate recompressor, it also became easy
    to implement deflate-based steganography.
    I imagine something like this - you select a zip archive and a payload file,
    and some utility generates another .zip based on that; maybe with an option to
    only include enough files from source .zip to encode the payload.
    Now that second zip doesn't contain any suspicious records and can be extracted
    normally, but it's also possible to extract the payload from it using
    the same tool.
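
A minimal sketch of the idea above, not the actual recompressor: a toy LZ77 parser that, whenever several match distances give the same longest match, lets the payload bits pick which one to emit. The token format (a list of literals and `(distance, length)` pairs) and the function names `candidates`, `encode`, `decode` and `extract` are assumptions made for illustration; a real deflate stream adds Huffman coding and block structure on top. Only the window size, minimum and maximum match lengths are taken from deflate.

```python
def candidates(data, pos, window=32768, min_len=3, max_len=258):
    """Return (best_len, sorted distances that reach best_len) at pos."""
    best_len, dists = 0, []
    for start in range(max(0, pos - window), pos):
        n = 0
        while n < max_len and pos + n < len(data) and data[start + n] == data[pos + n]:
            n += 1
        if n > best_len:
            best_len, dists = n, [pos - start]
        elif n == best_len and n:
            dists.append(pos - start)
    if best_len < min_len:
        return 0, []
    return best_len, sorted(dists)


def encode(data, bits):
    """Greedy parse; payload bits choose among equally long matches."""
    tokens, pos, used = [], 0, 0
    while pos < len(data):
        length, dists = candidates(data, pos)
        if not dists:
            tokens.append(data[pos])          # literal byte
            pos += 1
            continue
        k = len(dists).bit_length() - 1       # whole bits this choice can carry
        idx = 0
        for _ in range(k):
            bit = bits[used] if used < len(bits) else 0   # pad with zeros
            idx = (idx << 1) | bit
            used += 1
        tokens.append((dists[idx], length))
        pos += length
    return tokens, min(used, len(bits))       # tokens, payload bits stored


def decode(tokens):
    """Ordinary LZ77 decoding; it never looks at the hidden bits."""
    out = bytearray()
    for t in tokens:
        if isinstance(t, int):
            out.append(t)
        else:
            dist, length = t
            for _ in range(length):           # byte by byte, so overlaps work
                out.append(out[-dist])
    return bytes(out)


def extract(tokens):
    """Replay the parse on the decoded data and read back the choices."""
    data, pos, bits = decode(tokens), 0, []
    for t in tokens:
        if isinstance(t, int):
            pos += 1
            continue
        dist, length = t
        _, dists = candidates(data, pos)
        k = len(dists).bit_length() - 1
        idx = dists.index(dist)
        bits.extend((idx >> s) & 1 for s in range(k - 1, -1, -1))
        pos += length
    return bits
```

Because only the distance changes and never the match length, the token count is the same whatever the payload is. The extractor returns the zero padding as well, so the caller has to know the payload length (or store it inside the payload).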

    As to applications though, somehow I can only think about stuff like
    posting a (legal) trial version of a program in a zip with embedded keygen.
    Or hiding trojans from AV scanners.

    Any more positive ideas?
    Does it make any sense for me to write it?

  2. The Following User Says Thank You to Shelwien For This Useful Post:

    porneL (20th July 2013)

  3. #2
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,611
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Smart idea. Initially I liked it a lot, but now I have doubts. I'm afraid that when utilized to its fullest, it would be very easy to detect. After all I don't think that any encoder does many oddball choices. If you limited it to behave mostly predictably, how much capacity would it get?
    I think that analog sources where the highest-frequency coefficients are random anyway are better suited for such use.
    ADDED:
    > Or hiding trojans from AV scanners.
    There are much simpler ways to do it and for execution you need to extract it anyway.
    Last edited by m^2; 6th November 2011 at 14:52.

  4. #3
    Administrator Shelwien's Avatar
    > After all I don't think that any encoder does many oddball choices.

    Even plain collection of matches in all zlib modes at each position would provide some storage space
    without compromising much compression.
    I'd estimate the maximum capacity at 15-20% (if we'd use all available choices), and obviously for storage models
    with lower capacity the detection is also harder. Also to avoid detection it's likely better to use complex models
    with optimal parsing as a base. I think even with 1% payload it would be difficult to distinguish
    between the steganography container and the real kzip/7z output.
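
As a rough illustration of the point that zlib's own modes already disagree on how to encode the same input, the sketch below compresses one buffer at levels 1 to 9 as raw deflate and counts the distinct streams. The helper names `raw_deflate` and `distinct_encodings` are made up here. The resulting bit count only covers a choice of one level for the whole stream, so it is a very crude lower bound and says nothing about the 15-20% estimate, which assumes a choice at every position.

```python
import math
import zlib


def raw_deflate(data, level):
    """Compress to a raw deflate stream (no zlib/gzip header) at a given level."""
    comp = zlib.compressobj(level, zlib.DEFLATED, -15)
    return comp.compress(data) + comp.flush()


def distinct_encodings(data):
    """Return the set of different raw deflate streams zlib produces for data."""
    streams = {raw_deflate(data, level) for level in range(1, 10)}
    # Every stream must still decode to exactly the same bytes.
    assert all(zlib.decompress(s, -15) == data for s in streams)
    return streams


if __name__ == "__main__":
    sample = b"the quick brown fox jumps over the lazy dog, " * 200
    streams = distinct_encodings(sample)
    print(len(streams), "distinct streams,",
          math.log2(len(streams)), "bits from picking a whole-stream level")
```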

    Though of course the payload has to be encrypted and a password would be required for extraction,
    and with a wrong password even the original utility won't be able to detect the containers.
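
A small sketch of why a wrong password gives no signal, under the assumption that the payload is simply XORed with a password-derived keystream and carries no magic header or checksum. The construction below (PBKDF2 plus SHAKE-256 from Python's `hashlib`) and the name `keystream_xor` are stand-ins chosen because they are in the standard library; this is not the cipher the utility uses and not a vetted design.

```python
import hashlib


def keystream_xor(data: bytes, password: str, salt: bytes) -> bytes:
    """XOR data with a keystream derived from password; same call decrypts."""
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    stream = hashlib.shake_256(key).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


if __name__ == "__main__":
    salt = b"example-salt"                      # illustrative fixed value
    hidden = keystream_xor(b"secret payload", "correct horse", salt)
    print(keystream_xor(hidden, "correct horse", salt))   # original bytes
    print(keystream_xor(hidden, "wrong guess", salt))     # noise-like bytes
```

With nothing recognizable in the plaintext framing, bits read from an ordinary archive and bits read with a wrong password both come out looking like noise, which is what keeps the tool itself from telling containers apart.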

    > There are much simpler ways to do it and for execution you need to extract it anyway.

    Not really. The decoder won't contain any suspicious code and AV won't be able to test the actual payload.
    In fact it can be a modified SFX and would actually extract the archive... and its main body along the way.
    Also there won't be any suspicious random/compressed data block anywhere - just an innocent SFX and
    an archive in known format which can be tested completely.

    > I think that analog sources where highest frequency coefficients are random anyway are the better suited for such use.

    Yes, and there I know how to make payload detection completely impossible.
    But zips are much more popular than uncompressed wavs/images/videos, so...
    Also for wavs and bmps some implementations already exist, thus making it less interesting.

  5. #4
    Member m^2's Avatar
    Originally posted by Shelwien:
    > > After all I don't think that any encoder does many oddball choices.
    >
    > Even plain collection of matches in all zlib modes at each position would provide some storage space
    > without compromising much compression.
    > I'd estimate the maximum capacity at 15-20% (if we'd use all available choices), and obviously for storage models
    > with lower capacity the detection is also harder. Also to avoid detection it's likely better to use complex models
    > with optimal parsing as a base. I think even with 1% payload it would be difficult to distinguish
    > between the steganography container and the real kzip/7z output.

    With images, 12.5% is very secure and 25% is fine, so it's not that great. Nevertheless there might be some use. Losslessly compressed multimedia files are rather rare and usually fairly small, at least for home usage. Gzip files of hundreds of megabytes are nothing special.

    Originally posted by Shelwien:
    > Though of course the payload has to be encrypted and a password would be required for extraction,
    > and with a wrong password even the original utility won't be able to detect the containers.

    As usual.

    Originally posted by Shelwien:
    > > There are much simpler ways to do it and for execution you need to extract it anyway.
    >
    > Not really. The decoder won't contain any suspicious code and AV won't be able to test the actual payload.

    A 10-liner that decrypts an AES file is not suspicious either.

  6. #5
    Administrator Shelwien's Avatar
    http://nishi.dreamhosters.com/u/stegzip_v0.rar
    It is slow mostly because it uses a dumb window scan instead of the usual LZ matchfinder; I'll fix that later.

    It's basically already usable:
    Code:
    stegzip c raw-input payload-input raw-output
    stegzip d raw-output payload-output
    It just works with raw deflate streams instead of zips
    and needs some speed optimization.
    The test script in the demo tries to put files into streams and then extract them,
    so the .bin files left after the test are the extracted payloads.
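
Since the tool takes raw deflate streams rather than zips, one possible way to get such a stream out of an existing archive is sketched below with Python's `zipfile`. The function name `raw_deflate_from_zip` is hypothetical and this is not part of stegzip; the sketch assumes an unencrypted, deflate-compressed member and ignores Zip64 archives.

```python
import io
import struct
import zipfile
import zlib


def raw_deflate_from_zip(zip_bytes: bytes, member: str) -> bytes:
    """Return the raw deflate stream stored for one member of a zip archive."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        info = zf.getinfo(member)
    if info.compress_type != zipfile.ZIP_DEFLATED:
        raise ValueError("member is not deflate-compressed")
    # Local file header: 30 fixed bytes, with the file name length at
    # offset 26 and the extra field length at offset 28 (little-endian).
    name_len, extra_len = struct.unpack_from("<HH", zip_bytes, info.header_offset + 26)
    start = info.header_offset + 30 + name_len + extra_len
    return zip_bytes[start:start + info.compress_size]


if __name__ == "__main__":
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("hello.txt", b"hello hello hello hello\n" * 50)
    raw = raw_deflate_from_zip(buf.getvalue(), "hello.txt")
    # A negative wbits value tells zlib to expect a headerless deflate stream.
    print(len(raw), len(zlib.decompress(raw, -15)))
```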

    Code:
                              +book1            +wcc386
    stream      original      size    load      size    load
    book1.raw     318230    327837   16992    327827   16993
    wcc386.raw    321431    332148   13865    332038   13878
    book1.raw               +3.02%   5.34%    +3.02%   5.34%
    wcc386.raw              +3.33%   4.31%    +3.30%   4.32%
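
The percentage rows can be reproduced from the byte counts, assuming that both the growth and the load are measured against the original stream size (that reading matches every figure in the table). The helper name `growth_and_load` is made up for this check.

```python
def growth_and_load(original: int, container: int, payload: int):
    """Return (size growth %, payload load %), both relative to original."""
    growth = (container - original) / original * 100
    load = payload / original * 100
    return round(growth, 2), round(load, 2)


if __name__ == "__main__":
    rows = [
        ("book1.raw  +book1", 318230, 327837, 16992),
        ("book1.raw  +wcc386", 318230, 327827, 16993),
        ("wcc386.raw +book1", 321431, 332148, 13865),
        ("wcc386.raw +wcc386", 321431, 332038, 13878),
    ]
    for name, original, container, payload in rows:
        print(name, growth_and_load(original, container, payload))
```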
