
Thread: Delta: binary tables preprocessor

  1. #1
    Programmer Bulat Ziganshin
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,507
    Thanks
    742
    Thanked 665 Times in 359 Posts
    i'm publishing the latest algorithm i've developed for FreeArc 0.40:

    Delta: binary tables preprocessor v1.0 (c) Bulat.Ziganshin@gmail.com 2008-03-13

    This algorithm preprocesses data to improve its subsequent compression. It detects tables
    of binary records and 1) subtracts successive values in columns, 2) reorders columns, trying
    to maximize the results of further compression.
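
    To illustrate step 1), here is a minimal Python sketch of column-wise subtraction for a table
    whose record width is already known. It is not the code from delta10.zip: the names
    delta_encode and delta_decode are made up for this example, and it subtracts every column
    byte by byte, whereas the real algorithm finds the tables itself and decides per column
    whether to subtract. Column reordering is not shown.

```python
import struct

def delta_encode(data: bytes, width: int) -> bytes:
    """Replace each byte with its difference (mod 256) from the byte
    in the same column of the previous record."""
    out = bytearray(data)
    for i in range(width, len(data)):
        out[i] = (data[i] - data[i - width]) & 0xFF
    return bytes(out)

def delta_decode(data: bytes, width: int) -> bytes:
    """Inverse of delta_encode: add back the already restored previous record."""
    out = bytearray(data)
    for i in range(width, len(out)):
        out[i] = (out[i] + out[i - width]) & 0xFF
    return bytes(out)

# A table of eight little-endian 32-bit values growing by 3 (record width 4).
table = b"".join(struct.pack("<I", 1000 + 3 * i) for i in range(8))
encoded = delta_encode(table, 4)
restored = delta_decode(encoded, 4)
```

    In this example every record after the first turns into the same four bytes (03 00 00 00),
    which is much easier for a later compressor to model than the original counters.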

    The algorithm includes 3 phases:

    1) Preliminary table detection. It finds 6+ repetitions of the same byte at the same distance,
    i.e. anything like a...a...a...a...a...a where '.' denotes any byte except for 'a'.
    This is done in delta_compress

    2) Candidates detected in the first phase are then checked by FAST_CHECK_FOR_DATA_TABLE, which looks
    for a monotonic sequence of bytes at a fixed distance. Most candidates found in the first phase are
    filtered out here

    3) Remaining candidates are tested by slow_check_for_data_table(), which finds the exact table boundaries
    and detects which columns should and which shouldn't be subtracted. The table is finally processed
    only if it is large enough
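
    The first phase can be sketched roughly like this in Python. This is only an illustration of
    the a...a...a...a...a...a pattern, not the delta_compress code: is_candidate, find_candidates,
    the 64-byte limit on the distance and the exclusion of distance 1 are assumptions of this sketch.

```python
def is_candidate(data: bytes, start: int, width: int, min_repeats: int = 6) -> bool:
    """True if data[start] recurs min_repeats times at step `width`,
    with no other occurrence of that byte in between."""
    a = data[start:start + 1]
    pos = start
    for _ in range(min_repeats - 1):
        # The next occurrence of `a` after pos must be exactly `width` bytes away.
        if data.find(a, pos + 1, pos + width + 1) != pos + width:
            return False
        pos += width
    return True

def find_candidates(data: bytes, max_width: int = 64, min_repeats: int = 6):
    """Return (offset, width) pairs that look like the start of a table column."""
    hits = []
    for start in range(len(data)):
        # Distance to the next occurrence of the same byte gives the candidate width.
        nxt = data.find(data[start:start + 1], start + 1, start + 1 + max_width)
        width = nxt - start
        if nxt >= 0 and width >= 2 and is_candidate(data, start, width, min_repeats):
            hits.append((start, width))
    return hits

# Three bytes of header followed by ten 4-byte records that all start with byte 7.
sample = b"xyz" + b"".join(bytes([7, i, i + 100, i + 50]) for i in range(10, 20))
candidates = find_candidates(sample)
```

    Here the first candidate is reported at offset 3 with width 4. A real implementation would
    pass such candidates on to the stricter checks of phases 2 and 3.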

    The algorithm processes 20 mb/sec on a 1GHz CPU, but i'm sure the speed can be increased 3-fold

    http://www.haskell.org/bz/delta10.zip

    The http://www.haskell.org/bz page now includes all the algorithms i've developed for FreeArc so far

  2. #2
    Member
    Join Date
    Sep 2007
    Location
    Denmark
    Posts
    873
    Thanks
    49
    Thanked 106 Times in 84 Posts
    Nice, i will definitely try this.
    i guess this should work before REP but after precomp ?

    ECM > Precomp > Delta >> rep >> RZM/CCM ?

  3. #3
    Member
    my first little test: Alone in the Dark 4, CD 1 (iso)


    precomp -> rep -> rzm = 315.200.204
    precomp -> delta* -> rep -> rzm = 315.265.533

    *
    Tables 7308 * 525 = 3843124 (3744624) bytes (10031649/52132569 probes) 7.5 skipbits
    Compression: 1069 mb, 15.283 seconds, speed 69.924 mb/sec

  4. #4
    Programmer Bulat Ziganshin
    Quote Originally Posted by SvenBent
    i guess this should work before REP but after precomp ?
    after precomp, before or after rep - it doesn't matter

    i wonder why you don't use freearc to simplify your work. f.e., here is how to use fa to run the above-mentioned compression sequence:

    arc a archive file -m=precomp+delta+rep+rzm

    delta/rep are built into fa
    precomp is already configured in arc.ini
    and i've posted the lines to add rzm support

    you can also add the -t option to test decompression with a CRC32 checksum. it may be worth reading the FA docs

  5. #5
    Member
    Quote Originally Posted by Bulat Ziganshin
    i wonder why you don't use freearc to simplify your work.
    i would, but it's more fun for me to make multithreaded batches.
    besides, i'm doing a lot of brute force in my batch files:
    comparing output with and without e.g. precomp, delta and rep to make sure the end file is as small as possible.

    besides, i still cannot get the FA gui to work under XP 64
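
    A brute-force comparison like the one described above could be sketched in Python as follows.
    The external tools (precomp, delta, rep, rzm) are replaced here by stand-ins: a trivial
    bytewise delta at two hypothetical widths as the optional stages, and lzma from the standard
    library as the final compressor. best_chain and the stage names are made up for this example.

```python
import itertools
import lzma
import struct

def bytewise_delta(width):
    """Return a stand-in stage that subtracts the byte `width` positions earlier (mod 256)."""
    def stage(data: bytes) -> bytes:
        out = bytearray(data)
        for i in range(width, len(data)):
            out[i] = (data[i] - data[i - width]) & 0xFF
        return bytes(out)
    return stage

def best_chain(data, stages, final):
    """Try every subset of the optional stages (keeping their order),
    compress each result with `final`, and return (size, stage_names) of the smallest."""
    results = []
    for r in range(len(stages) + 1):
        for combo in itertools.combinations(stages, r):
            buf = data
            for _name, fn in combo:
                buf = fn(buf)
            results.append((len(final(buf)), tuple(name for name, _fn in combo)))
    return min(results)

stages = [("delta4", bytewise_delta(4)), ("delta2", bytewise_delta(2))]
sample = b"".join(struct.pack("<I", 1000 + 7 * i) for i in range(5000))
size, chain = best_chain(sample, stages, lzma.compress)
print(size, chain)
```

    Since the empty chain is always among the tried combinations, the reported size can never be
    worse than running the final compressor alone.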

  6. #6
    Member
    What kind of files are Delta meant for?

    i have tried delta on several different kinds of iso files which had first been precomped, but they all grew in size when i introduced Delta into the compression mix.

    Right now I'm about to test it on some .nrg files which contain PCM data (audio tracks)

  7. #7
    Programmer Bulat Ziganshin
    Quote Originally Posted by SvenBent
    What kind of files are Delta meant for?
    it improves compression of executables and databases when used together with lzma. maybe on iso files it has the side effect of decreasing the compression ratio of MM data?

  8. #8
    Member
    hehe

    and it just improved my compression on the NRG with PCM data

    RZM = 337.587.110 bytes
    DEL > RZM = 324.613.378 bytes

    so it looks like it is good for PCM data? or it might just be a coincidence.

    i will find another nrg with audio tracks to try it out
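
    One way to see why PCM might benefit: 16-bit stereo audio is effectively a table of 4-byte
    records whose neighbouring values are close to each other. The following Python sketch only
    illustrates that effect. It uses a synthetic random walk instead of real audio, zlib instead
    of RZM, and a plain bytewise delta at a fixed width of 4 instead of the actual Delta filter.

```python
import random
import struct
import zlib

def delta(data: bytes, width: int) -> bytes:
    """Subtract from each byte the byte `width` positions earlier (mod 256)."""
    out = bytearray(data)
    for i in range(width, len(data)):
        out[i] = (data[i] - data[i - width]) & 0xFF
    return bytes(out)

def clamp16(value: int) -> int:
    """Keep a sample inside the signed 16-bit range."""
    return max(-32768, min(32767, value))

# Synthetic "audio": two slowly drifting channels, interleaved as 16-bit stereo frames.
rng = random.Random(0)
left = right = 0
frames = []
for _ in range(20000):
    left = clamp16(left + rng.randint(-3, 3))
    right = clamp16(right + rng.randint(-3, 3))
    frames.append(struct.pack("<hh", left, right))
pcm = b"".join(frames)

plain = len(zlib.compress(pcm, 9))
with_delta = len(zlib.compress(delta(pcm, 4), 9))
print(plain, with_delta)
```

    On this synthetic input the delta-filtered stream compresses to fewer bytes than the raw one,
    because the differences between neighbouring samples take only a handful of distinct values.
    Real recordings are noisier, so the gain there may be smaller.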

  9. #9
    Programmer Bulat Ziganshin
    Quote Originally Posted by SvenBent
    so it looks like it is good for PCM data?
    in theory, it should be. in practice, using specialized MM compressors should be much better. try compressing this data with the tta method:

    arc ... -mtta

  10. #10
    Guest
    Bulat, at which FA compression options are the delta/rep preprocessors turned on/off by default? or are they always on by default (except for the fastest method)?

    can tta process raw audio data?

  11. #11
    Programmer Bulat Ziganshin
    Quote Originally Posted by Zonder
    can tta process raw audio data?
    yes, that's exactly its purpose

    Quote Originally Posted by Zonder
    Bulat, at which FA compression options are the delta/rep preprocessors turned on/off by default?
    use the -di+$ option to see it. in short, delta is used for binary data, rep for binary data in symmetric modes (-m3..-m9/-mx)

  12. #12
    Member
    Seems CCM(x) doesn't like delta but RZM does

    Org = 601 MB (630.408.442 bytes)

    CCMX 6 = 282 MB (296.109.161 bytes)
    DEL + CCMX 6 = 284 MB (298.355.666 bytes)

    RZM = 321 MB (337.587.110 bytes)
    DEL + RZM = 309 MB (324.613.378 bytes)

    REP + RZM = 321 MB (337.189.036 bytes)
    DEL + REP + RZM = 309 MB (324.184.644 bytes)

  13. #13
    Programmer Bulat Ziganshin
    >arc a a Skype.exe -mccmx
    Compressed 1 file, 19.490.344 => 7.622.934 bytes. Ratio 39.1%
    Compression time 87.43 secs, speed 223 kb/s. Total 89.04 secs

    >arc a a Skype.exe -mdelta+ccmx
    Compressed 1 file, 19.490.344 => 7.553.694 bytes. Ratio 38.7%
    Compression time 83.26 secs, speed 234 kb/s. Total 84.01 secs


    i've worked on delta targeting maximal compression together with lzma. this means that with other compressors it may select non-optimal encodings, f.e. encode rather small tables, which improves compression with lzma but decreases it with ccmx. so on some tests this leads to compression improvements, on others to decreased compression. rzm, being closer to lzma, is probably more "compatible" with the current delta settings.

    in particular, ccm algorithms, like paq, are probably able to encode binary tables with fixed width, but unable to subtract successive values in columns of these tables

  14. #14
    Programmer Bulat Ziganshin
    and one more test - on an Access database, where Delta is especially great:

    >arc a a -mccmx BAZA.MDB
    Compressed 1 file, 50.237.440 => 7.223.347 bytes. Ratio 14.3%
    Compression time 189.86 secs, speed 265 kb/s. Total 193.03 secs

    >arc a a -mdelta+ccmx BAZA.MDB
    Compressed 1 file, 50.237.440 => 6.846.051 bytes. Ratio 13.6%
    Compression time 187.66 secs, speed 268 kb/s. Total 189.53 secs

  15. #15
    Member
    Increased speed and increased compression strength... nice
