Thread: NTFS compression

  1. #1
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,982
    Thanks
    377
    Thanked 351 Times in 139 Posts

    NTFS compression

    I just modified my LZSS to be compatible with NTFS file compression.

    OK, here are the results, i.e. what happens if we add optimal parsing to NTFS compression:

    Photoshop.exe (19,533,824 bytes)
    NTFS: 13,291,520 bytes
    LZSS: 11,844,303 bytes

    TraktorDJStudio3.exe (29,124,024 bytes)
    NTFS: 12,709,888 bytes
    LZSS: 10,838,525 bytes

    UT3.exe (28,064,848 bytes)
    NTFS: 16,531,456 bytes
    LZSS: 14,505,960 bytes

    MPTRACK.EXE (1,159,172 bytes)
    NTFS: 790,528 bytes
    LZSS: 705,706 bytes

    Doom3.exe (5,427,200 bytes)
    NTFS: 3,264,512 bytes
    LZSS: 2,869,937 bytes

    Reaktor.exe (14,446,592 bytes)
    NTFS: 5,996,544 bytes
    LZSS: 4,852,622 bytes

    test.exe (7,919,616 bytes)
    NTFS: 2,621,440 bytes
    LZSS: 2,008,090 bytes

    world95.txt (2,988,578 bytes)
    NTFS: 2,031,616 bytes
    LZSS: 1,813,929 bytes

    fp.log (20,617,071 bytes)
    NTFS: 5,230,592 bytes
    LZSS: 4,475,472 bytes

    How much space could be freed...


  2. #2
    Member
    Join Date
    Jun 2008
    Location
    G
    Posts
    372
    Thanks
    26
    Thanked 22 Times in 15 Posts
    Maybe you can add your LZSS compressor to ZFS compression?

  3. #3
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,611
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Cool. Very cool. Does Windows let programs work low enough to make a tool that can really apply this?

  4. #4
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,982
    Thanks
    377
    Thanked 351 Times in 139 Posts
    Quote Originally Posted by m^2 View Post
    Cool. Very cool. Does Windows let programs work low enough to make a tool that can really apply this?
    Not sure. However, I guess it is possible to get inside the NTFS driver and add an optimized compression.

  5. #5
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,611
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Quote Originally Posted by encode View Post
    Not sure. However, I guess it is possible to get inside the NTFS driver and add an optimized compression.
    Practically, no. Vista x64 and Vista SP1 require all drivers to be signed.
    That costs a few hundred euros a year, and only if you can convince Verisign that you're a reliable developer and won't use the key to sign crapware.

  6. #6
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,982
    Thanks
    377
    Thanked 351 Times in 139 Posts
    Actually, NTFS compression is nothing special and has some weaknesses.
    OK, so what is NTFS compression?

    It is LZSS with a 4 KB window; 4 KB is one cluster. Token flags (literal/match) are stored in tag bytes: one byte holds eight flags (0 = literal, 1 = match). Each literal is stored as one byte. A match is stored as a two-byte LZ code. The number of bits used for the offset and for the match length depends on the current position within the block (cluster). For example, near the beginning of a block a far match is impossible, so the unused offset bits are reused for extra match length. The minimum match length is three bytes.
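
    To make the token layout concrete, here is a minimal decoder sketch for the token stream of a single block. Several details are assumptions that the post does not state and that come from my reading of public LZNT1 descriptions: flags are consumed from the least significant bit, the two-byte code is little-endian, the offset sits in the high bits, and offset and length are stored as offset-1 and length-3. Block headers are left out entirely, and `split_bits` and `decode_block` are names made up for this example.

```python
def split_bits(pos):
    """Return (length_mask, offset_shift) for a match starting at output position pos.

    Assumed rule: 4 offset bits and 12 length bits at the start of a block;
    one bit moves from length to offset each time the reachable distance doubles.
    """
    length_mask, offset_shift = 0xFFF, 12
    i = pos - 1
    while i >= 0x10:
        length_mask >>= 1
        offset_shift -= 1
        i >>= 1
    return length_mask, offset_shift


def decode_block(stream):
    """Decode the flag/literal/match token stream of one block (no header handling)."""
    out = bytearray()
    i = 0
    while i < len(stream):
        flags = stream[i]          # one tag byte describes the next eight tokens
        i += 1
        for bit in range(8):
            if i >= len(stream):
                break
            if (flags >> bit) & 1:  # 1 = match, stored as a two-byte code
                if i + 2 > len(stream):
                    raise ValueError("truncated match code")
                token = stream[i] | (stream[i + 1] << 8)
                i += 2
                length_mask, offset_shift = split_bits(len(out))
                length = (token & length_mask) + 3   # minimum match is 3 bytes
                offset = (token >> offset_shift) + 1
                if offset > len(out):
                    raise ValueError("match points before start of block")
                for _ in range(length):              # byte-wise copy allows overlap
                    out.append(out[-offset])
            else:                   # 0 = literal, stored as one byte
                out.append(stream[i])
                i += 1
    return bytes(out)


# Three literals "abc", then one match (offset 3, length 6) at position 3.
sample = bytes([0x08, 0x61, 0x62, 0x63, 0x03, 0x20])
print(decode_block(sample))
```

    Under these assumptions a match at position 3 gets 12 length bits and 4 offset bits, while a match near the end of a 4 KB block gets 4 length bits and 12 offset bits.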

    For an optimized LZSS it might be better to have MINMATCH=2, even though the LZ code itself takes two bytes: storing two literals takes 1+8+1+8 = 18 bits, while a two-byte match takes 1+16 = 17 bits. With an unoptimized LZSS this may not matter much, but with an optimizer it can add up to a big difference in the end.
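
    The bit arithmetic can be checked with a few lines. This is only a sketch of the cost model described here (1 flag bit per token, 8 bits per literal, 16 bits per match code); the function names are made up for the example.

```python
FLAG_BITS = 1      # one tag bit per token
LITERAL_BITS = 8   # a literal is one byte
MATCH_BITS = 16    # a match is a two-byte LZ code


def literals_cost(n):
    """Bits needed to store n bytes as literals."""
    return n * (FLAG_BITS + LITERAL_BITS)


def match_cost():
    """Bits needed to store one match, whatever its length."""
    return FLAG_BITS + MATCH_BITS


def saving(n):
    """Bits saved by coding n bytes as one match instead of n literals."""
    return literals_cost(n) - match_cost()


for n in (1, 2, 3):
    print(n, literals_cost(n), match_cost(), saving(n))
```

    A two-byte match saves a single bit and a three-byte match saves ten, so a MINMATCH=2 variant can only pay off in small increments, and only in a format that can actually encode a length of two.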


  7. #7
    Programmer toffer's Avatar
    Join Date
    May 2008
    Location
    Erfurt, Germany
    Posts
    587
    Thanks
    0
    Thanked 0 Times in 0 Posts
    All you need to do is to minimize the coding cost using dynamic programming (Bellman):

    http://en.wikipedia.org/wiki/Dynamic_programming
    http://en.wikipedia.org/wiki/Bellman_equation

    This is actually the background of optimal parsing. One should change the metric from "select the path which is optimal w.r.t. the number of matched bytes" to "... w.r.t. coding cost". If I'm not wrong, it doesn't make any difference here, since the coding cost doesn't change dynamically (as it would with an adaptive model) and selecting one-byte matches is always worse than just storing literals.
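
    A minimal sketch of such a cost-minimizing parse, assuming the fixed costs from post #6 (9 bits per literal, 17 bits per match). The match finder is brute force, and the fixed length cap of 18 is a simplification of mine (the real cap depends on the position within the block); all names are made up for the example, and this is not encode's actual optimizer.

```python
LITERAL_COST = 9   # 1 flag bit + 8 bits
MATCH_COST = 17    # 1 flag bit + 16 bits


def longest_matches(data, window=4096, max_len=18):
    """Brute-force longest match length at every position (fine for small inputs)."""
    n = len(data)
    best = [0] * n
    for i in range(n):
        for j in range(max(0, i - window), i):
            length = 0
            while length < max_len and i + length < n and data[j + length] == data[i + length]:
                length += 1
            best[i] = max(best[i], length)
    return best


def optimal_parse(data, min_match=3):
    """Backward dynamic programming: cost[i] is the cheapest coding of data[i:]."""
    n = len(data)
    longest = longest_matches(data)
    cost = [0] * (n + 1)
    step = [1] * n                       # 1 = literal, otherwise a match length
    for i in range(n - 1, -1, -1):
        cost[i] = LITERAL_COST + cost[i + 1]
        # Any shorter prefix of the longest match is also a valid match.
        for length in range(min_match, longest[i] + 1):
            candidate = MATCH_COST + cost[i + length]
            if candidate < cost[i]:
                cost[i] = candidate
                step[i] = length
    parse, i = [], 0
    while i < n:                         # follow the chosen steps from the start
        parse.append(step[i])
        i += step[i]
    return cost[0], parse


def greedy_cost(data, min_match=3):
    """Baseline: always take the longest match when one is available."""
    longest = longest_matches(data)
    bits, i = 0, 0
    while i < len(data):
        if longest[i] >= min_match:
            bits += MATCH_COST
            i += longest[i]
        else:
            bits += LITERAL_COST
            i += 1
    return bits


print(optimal_parse(b"abcabcabc"), greedy_cost(b"abcabcabc"))
```

    Passing `min_match=2` lets the same routine estimate what a hypothetical two-byte minimum would buy, keeping in mind that the existing NTFS code cannot represent such a match.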

  8. #8
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    3,982
    Thanks
    377
    Thanked 351 Times in 139 Posts
    With LZSS (with fixed-length codes) the problem is already solved...

  9. #9
    Member
    Join Date
    May 2008
    Location
    Kuwait
    Posts
    334
    Thanks
    36
    Thanked 36 Times in 21 Posts
    Timing was not shown, and I know it's difficult to measure, but I think there is a hidden setting in the NTFS engine (as MS did with CHM) that provides higher compression.

  10. #10
    Member
    Join Date
    May 2008
    Location
    HK
    Posts
    160
    Thanks
    4
    Thanked 25 Times in 15 Posts
    Sorry for bumping an old thread.

    The NTFS driver calls RtlCompressBuffer() for compression, which uses LZNT1 compression.
    Other people have implemented the same compression with higher speed. (I wonder if we can patch ntfs.sys to use this?)
    https://github.com/coderforlife/ms-c...master/lznt1.h
    https://github.com/coderforlife/ms-c...ster/lznt1.cpp
