
Thread: disasm-based executable's filter

  1. #1
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,593
    Thanks
    801
    Thanked 698 Times in 378 Posts

    disasm-based executable's filter

    https://sourceforge.net/tracker/?fun...49&atid=373088 :

    I've written a (relatively) small preprocessor/filter for compiled x86 code
    that is completely reversible and typically increases compression ratio for
    executable files by about 10% compared to the default filter (BCJ) used by
    7Zip/LZMA for .EXE files. I guess this might be interesting for NSIS (in
    terms of reducing download sizes), but I have no clue about NSIS internals,
    so I'm not in the position to submit a full patch that adds this
    functionality. Adding it as a feature request seemed like a reasonable
    compromise. I've put the code online here:
    http://www.farbrausch.de/~fg/code/disfilter/ (written using VC++, but it
    should be trivial to get working with other compilers). The actual
    algorithm is in dis.cpp, the rest is mainly a small (Windows) demo app.

    I've placed the code in the public domain; no strings attached. If you're
    interested but have questions about the code or integration problems, feel
    free to contact me (my email address is mentioned in readme.txt).
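    For readers unfamiliar with what a call-address filter such as BCJ does, here is a minimal sketch of the basic idea: the 32-bit relative operand of each E8 (CALL rel32) opcode is rewritten as an absolute target, so that repeated calls to the same function produce identical bytes. This is only an illustration under simplifying assumptions, not 7-Zip's BCJ code and not the disfilter algorithm. The name e8_filter is made up, and the sketch has none of the heuristics a real filter uses to skip E8 bytes that are not calls.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative only: convert E8 operands between relative and absolute form.
// encode == true  : relative displacement -> absolute target offset
// encode == false : absolute target offset -> relative displacement
void e8_filter(std::vector<std::uint8_t>& buf, bool encode)
{
    // Offsets are treated as 32-bit addresses, so the buffer must fit.
    assert(buf.size() <= 0xFFFFFFFFu);

    std::size_t i = 0;
    while (i + 5 <= buf.size()) {
        if (buf[i] != 0xE8) { ++i; continue; }

        // Read the little-endian 32-bit operand that follows the opcode.
        std::uint32_t v = std::uint32_t(buf[i + 1])
                        | std::uint32_t(buf[i + 2]) << 8
                        | std::uint32_t(buf[i + 3]) << 16
                        | std::uint32_t(buf[i + 4]) << 24;

        // A CALL displacement is relative to the next instruction (i + 5).
        const std::uint32_t next = static_cast<std::uint32_t>(i + 5);
        v = encode ? v + next : v - next;   // wraps modulo 2^32 in both directions

        buf[i + 1] = static_cast<std::uint8_t>(v);
        buf[i + 2] = static_cast<std::uint8_t>(v >> 8);
        buf[i + 3] = static_cast<std::uint8_t>(v >> 16);
        buf[i + 4] = static_cast<std::uint8_t>(v >> 24);

        // Skip the operand so that encoder and decoder visit the same opcodes.
        i += 5;
    }
}
```

    Because only operand bytes are modified and they are always skipped, calling e8_filter(buf, true) followed by e8_filter(buf, false) restores the input exactly, which is the "completely reversible" property mentioned above.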

  2. #2
    Programmer Bulat Ziganshin's Avatar
    as the source file says, this is the filter from kkrunchy, published by its author

    firefox.exe: 7,620,696 bytes
    all 3 filters, further compressed with lzma:max:
    none: 3,004,746 +5.1%
    bcj: 2,858,179 --- baseline
    bcj2: 2,782,313 -2.7%
    dis: 2,591,514 -9.3%
    durilca'light: 3,066,324 -> 2,509,209, i.e. its disasm filter improved compression by 18.2%

    skype.exe 19,490,344 bytes
    none: 8,060,598 +2.6%
    bcj/bcj2: 7,856,207 --- baseline
    dis: 8,279,426 +5.4%
    durilca'light: 9,114,835 -> 8,105,689, i.e. its disasm filter improved compression by 11.1%

    probably skype is too complex for this over-smart algorithm
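
    The percentages in these lists are size changes relative to the bcj baseline (or, for durilca'light, relative to its own result without the filter). A small sketch of that arithmetic, with a made-up function name:

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Size change of `size` relative to `baseline`, in percent, rounded to one
// decimal place. Negative values mean the output got smaller.
double percent_vs_baseline(std::uint64_t size, std::uint64_t baseline)
{
    assert(baseline > 0);
    const double change =
        (static_cast<double>(size) / static_cast<double>(baseline) - 1.0) * 100.0;
    return std::round(change * 10.0) / 10.0;
}
```

    For example, percent_vs_baseline(2591514, 2858179) gives -9.3 for dis on firefox.exe, and percent_vs_baseline(8279426, 7856207) gives 5.4 for dis on skype.exe.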

    ps: http://freearc.org/download/testing/dispack.exe
    Last edited by Bulat Ziganshin; 13th February 2010 at 15:49.

  3. #3
    Member
    Join Date
    May 2008
    Location
    Earth
    Posts
    115
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Quote: Originally posted by Bulat Ziganshin
    as source file says, it's filter from kkrunchy, published by the author

    skype.exe 19,490,344 bytes
    none: 8,060,598 +2.6%
    bcj/bcj2: 7,856,207 --- baseline
    dis: 8,279,426 +5.4%
    durilca'light: 9,114,835 -> 8,105,689, i.e. its disasm filter improved compression by 11.1%

    probably skype is too complex for this over-smart algorithm

    ps: http://freearc.org/download/testing/dispack.exe

    probably skype.exe is so protected that some assumptions made by this algo are false

  4. #4
    Programmer Bulat Ziganshin's Avatar
    it seems that freearc.exe is protected too: http://forum.ru-board.com/topic.cgi?...&start=1100#14

    i rather think that these executables use some atypical coding techniques. at least freearc is written in haskell, and its compiler generates code that is rather unlike the output of C compilers

  5. #5
    Member
    Join Date
    Jun 2008
    Location
    Germany
    Posts
    369
    Thanks
    5
    Thanked 8 Times in 4 Posts
    it seems that freearc.exe is protected too

  6. #6
    Member
    Join Date
    May 2007
    Location
    Poland
    Posts
    91
    Thanks
    10
    Thanked 4 Times in 4 Posts
    Bulat, do you plan to incorporate these filters into freearc? Is it possible to use the durilca'light exe filter?

  7. #7
    The Founder encode's Avatar
    Join Date
    May 2006
    Location
    Moscow, Russia
    Posts
    4,023
    Thanks
    415
    Thanked 416 Times in 158 Posts
    Quote: Originally posted by Bulat Ziganshin
    skype.exe 19,490,344 bytes
    none: 8,060,598 +2.6%
    bcj/bcj2: 7,856,207 --- baseline
    dis: 8,279,426 +5.4%
    durilca'light: 9,114,835 -> 8,105,689, i.e. its disasm filter improved compression by 11.1%

    probably skype is too complex for this over-smart algorithm

    Quite probably Skype contains lots of graphics/image resources - it's too big for a code-only EXE. Try testing on EXEs that contain a really large amount of code - photoshop.exe, UT3.exe (the Unreal Tournament 3 game executable)...

  8. #8
    Programmer Bulat Ziganshin's Avatar
    dispack processes only the code section. but it seems that even this section contains something unusual in skype (and in freearc itself)

  9. #9
    Member evg's Avatar
    Join Date
    May 2009
    Location
    Austria
    Posts
    23
    Thanks
    0
    Thanked 0 Times in 0 Posts

    quick comparison

    used mencoder executable as it is quite large and contains mostly code

    Code:
    12231468 mencoder.exe.bcj2
    12222615 mencoder.exe.bcj
    12222608 mencoder.exe.paq8px
    12222483 mencoder.exe
    12067731 mencoder.exe.dis
     5229585 mencoder.exe.bcj.quad
     5173892 mencoder.exe.paq8px.quad
     5144346 mencoder.exe.bcj2.quad
     5013292 mencoder.exe.bcj.balz
     4980072 mencoder.exe.quad
     4937866 mencoder.exe.paq8px.balz
     4880998 mencoder.exe.bcj.lzpm
     4846274 mencoder.exe.bcj2.balz
     4819259 mencoder.exe.paq8px.lzpm
     4783326 mencoder.exe.balz
     4733565 mencoder.exe.bcj2.lzpm
     4703066 mencoder.exe.dis.quad
     4685316 mencoder.exe.lzpm
     4437036 mencoder.exe.dis.balz
     4435922 mencoder.exe.bcj.lzma2
     4435237 mencoder.exe.bcj.lzma1
     4389096 mencoder.exe.lzma2
     4388421 mencoder.exe.lzma1
     4359169 mencoder.exe.paq8px.rzm
     4352200 mencoder.exe.dis.lzpm
     4267609 mencoder.exe.paq8px.lzma2
     4266954 mencoder.exe.paq8px.lzma1
     4265914 mencoder.exe.bcj2.lzma2
     4265260 mencoder.exe.bcj2.lzma1
     4264786 mencoder.exe.bcj.mcomp2-f3x
     4217834 mencoder.exe.bcj.rzm
     4198041 mencoder.exe.mcomp2-f3x
     4113134 mencoder.exe.bcj2.rzm
     4069037 mencoder.exe.bcj2.mcomp2-f3x
     4057161 mencoder.exe.rzm
     4049583 mencoder.exe.paq8px.mcomp2-f3x
     3934852 mencoder.exe.dis.lzma2
     3934236 mencoder.exe.dis.lzma1
     3853420 mencoder.exe.dis.rzm
     3760805 mencoder.exe.dis.mcomp2-f3x
    
    used:
     quad 1.12 -x
     balz 1.15 ex
     lzpm 0.16 ex
     xz   0.49b --lzma1,2=dict=16M,nice=273,depth=10000 -F raw --suffix=.lzma1,2
     rzm  0.07h
     mcomp2 2.3 -mf3x -M256m
    best regards
    evg

  10. #10
    Programmer Bulat Ziganshin's Avatar
    Now i'm adding DisPack to FreeArc. Of course, there is the problem of detecting executable code. For now i split the data into 16 KB blocks and apply the following filter to each block:

    Code:
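    // Assumed definitions; they are not shown in the post.
    typedef unsigned char BYTE;
    enum EXETYPE { DATA, CODE };

    // Classify one block as x86 code or data by counting E8 (CALL rel32)
    // opcodes and checking the top byte, p[4], of each little-endian operand.
    EXETYPE detect (BYTE *buf, int len)
    {
      double e8=0, exe=0, obj=0;
      for (BYTE *p=buf; p+4<buf+len; p++)
      {
        if (*p == 0xE8)
        {
          e8++;
          if (p[4]==0xFF)   // top byte 0xFF: small negative displacement
            exe++;
          if (p[4]==0)      // top byte 0x00: small non-negative displacement
            obj++;
        }
      }
      // CODE if E8 is frequent enough and enough operands look like near calls.
      // With e8==0 the first test fails, so the divisions by e8 are not evaluated.
      return (e8/len >= 0.002  &&  (exe+obj)/e8 >= 0.20  &&  exe/e8 >= 0.01) ?  CODE : DATA;
    }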
    i've attached the code and the executable

    are there better ways to detect x86 code? in particular, how can i find the exact boundaries of the code when they are not aligned to 16 KB chunks? right now i'm thinking about searching the boundary chunks for the first/last E8 instruction with an argument of the form 0x00... or 0xFF...
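
    The boundary search described above could be sketched roughly as follows. This is only one reading of the idea, not code from FreeArc: the helper names (plausible_call, first_call, last_call_end) are hypothetical, and the first or last plausible call only approximates where code begins or ends, since a code region normally extends somewhat beyond its outermost CALL instructions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True if p points at an E8 (CALL rel32) whose operand has a top byte of
// 0x00 or 0xFF, i.e. a call with a small forward or backward displacement.
// The caller must guarantee that p[0..4] is readable.
static bool plausible_call(const std::uint8_t* p)
{
    return p[0] == 0xE8 && (p[4] == 0x00 || p[4] == 0xFF);
}

// Offset of the first plausible call in buf[0..len), or len if there is none.
// Intended for the chunk just before the first chunk classified as code.
std::size_t first_call(const std::uint8_t* buf, std::size_t len)
{
    assert(buf != nullptr || len == 0);
    for (std::size_t i = 0; i + 5 <= len; ++i)
        if (plausible_call(buf + i))
            return i;
    return len;
}

// Offset just past the last plausible call in buf[0..len), or 0 if none.
// Intended for the chunk just after the last chunk classified as code.
std::size_t last_call_end(const std::uint8_t* buf, std::size_t len)
{
    assert(buf != nullptr || len == 0);
    for (std::size_t i = len; i >= 5; --i)
        if (plausible_call(buf + i - 5))
            return i;
    return 0;
}
```

    With these helpers, the code region could be extended backwards to first_call() within the preceding data chunk and forwards to last_call_end() within the following one, instead of cutting at the 16 KB chunk borders.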
    Last edited by Bulat Ziganshin; 22nd March 2010 at 14:19.
