
Thread: Hook - Free, closed source file compressor

  1. #1
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    A new thread for the closed source versions!

    Quote Originally Posted by Nania Francesco Antonio
    HOOK v0.9 (C) 2007, free closed-source demo archiver/compressor
    * New changes to the system (LZP mode):
    - Memory requirement: 60 MB + X MB ADMC buffer
    (older version 0.8e: 160 MB + X MB ADMC buffer);
    - Much faster version:
    20-30% in single-file compression;
    300-400% in multiple-file compression (future GUI version)!
    - Processor requirement: SSE instructions.
    Download Hook v0.9

  2. #2
    Tester
    Nania Francesco's Avatar
    Join Date
    May 2008
    Location
    Italy
    Posts
    1,565
    Thanks
    223
    Thanked 146 Times in 83 Posts
    Thanks, LovePimple. The program has become fast enough to afford a GUI for compression! Once it is tuned for all file formats, it could give WinRAR, at least, a run for its money!

  3. #3
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    I look forward to future versions of HOOK!

    When are you planning to write the first GUI version?

  4. #4
    Tester
    Nania Francesco's Avatar
    Join Date
    May 2008
    Location
    Italy
    Posts
    1,565
    Thanks
    223
    Thanked 146 Times in 83 Posts
    It is in the testing phase. I hope to finish version 1.0 within two months at most, naturally with a GUI, low memory consumption and, I hope, more speed!

  5. #5
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    Quote Originally Posted by Nania Francesco Antonio
    It is in the testing phase. I hope to finish version 1.0 within two months at most, naturally with a GUI, low memory consumption and, I hope, more speed!
    Excellent! I look forward to the first public release!

  6. #6
    Tester
    Nania Francesco's Avatar
    Join Date
    May 2008
    Location
    Italy
    Posts
    1,565
    Thanks
    223
    Thanked 146 Times in 83 Posts
    HOOK v0.9
    Tested on Pentium IV 3 GHz HT (2 GB memory, DDR 333 MHz)

    File           Size (bytes)   Settings      Time (sec)
    A10.jpg             833.072   (2 8 0)       1.00
    AcroRd32.exe      1.518.772   (64 1 1 3)    3.42
    english.dic         562.266   (16 2 1 1)    2.08
    FlashMX.pdf       3.784.488   (32 5 1 2)    6.13
    FP.LOG              512.759   (256 1 1 22)  7.84
    MSO97.DLL         1.914.193   (128 1 1 3)   4.58
    ohs.doc             836.276   (16 2 1 4)    1.95
    rafale.bmp          833.234   (64 3 0)      3.78
    vcfiu.hlp           716.751   (64 1 1 6)    2.36
    world95.tx          515.447   (128 1 1 12)  2.69

    Total = 12.027.258 bytes

  7. #7
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    Quick test with ENWIK8...

    Test Machine: AMD Sempron 2400+


    Hook v0.8e

    Settings: Memory=1700 MBytes Limit=3 LZ enabled step 6

    Compressed Size = 22,039,930 bytes

    Compression Time = 00:03:56.781


    Hook v0.9

    Settings: Memory=1800 MBytes Limit=3 LZ enabled step 6

    Compressed Size = 22,077,878 bytes

    Compression Time = 00:03:34.562


    Hook v0.9

    Settings: Memory=1800 MBytes Limit=2 LZ enabled step 6

    Compressed Size = 21,969,337 bytes

    Compression Time = 00:03:28.609



    * Note the difference in size between mine and Matt's compressed files even though the exact same settings were used for both tests. I assume that this is caused by the different hardware?


    Matt Mahoney's test results:

    Hook v0.8e
    1700 3 1 6 > 22,039,935 bytes

    Hook v0.9
    1800 2 1 6 > 21,969,342 bytes
    1800 3 1 6 > 22,077,883 bytes


    My test results:

    Hook v0.8e
    1700 3 1 6 > 22,039,930 bytes

    Hook v0.9
    1800 2 1 6 > 21,969,337 bytes
    1800 3 1 6 > 22,077,878 bytes

  8. #8
    Member
    Join Date
    Apr 2007
    Posts
    16
    Thanks
    0
    Thanked 0 Times in 0 Posts
    i bet you can't decompress the file matt has compressed, and vice versa. we have a problem here. the cause is probably that different cpus produce small deltas in floating-point arithmetic. hook is very sensitive to even the smallest delta.

    another issue is the compiler. if i take hook 0.8e, compile it with my compiler with no optimization flags at all, and compress a file first with nania's exe and then with mine, the output differs as well.

    but before we speculate too much, we should investigate the code again. maybe some variable isn't initialized correctly and can therefore take different values that influence the outcome.
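
    One way such small deltas can arise, independent of Hook's actual code (which is closed source and not shown in this thread), is that floating-point addition is not associative: a build that evaluates the same sum in a different order can round differently. A minimal C sketch with hypothetical helper names (`sum_left`, `sum_right`, `check_order_matters`), offered only as an illustration of the effect:

```c
#include <assert.h>

/* Hypothetical helpers for illustration; none of this is taken from Hook. */

/* Adds left to right: (a + b) + c. */
float sum_left(float a, float b, float c)
{
    float t = a + b;
    return t + c;
}

/* Adds right to left: a + (b + c). */
float sum_right(float a, float b, float c)
{
    float t = b + c;
    return a + t;
}

/* Self-check: for these inputs the two orders give different results,
   because 1.0f is far below the rounding step of 1e30f and is lost
   when it is added to -1e30f first. */
void check_order_matters(void)
{
    assert(sum_left(1e30f, -1e30f, 1.0f) == 1.0f);
    assert(sum_right(1e30f, -1e30f, 1.0f) == 0.0f);
}
```

    A model that feeds such values into its next prediction would drift apart between two builds after the first differing rounding, which is why an uninitialized variable is not the only possible explanation.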

  9. #9
    Tester
    Nania Francesco's Avatar
    Join Date
    May 2008
    Location
    Italy
    Posts
    1,565
    Thanks
    223
    Thanked 146 Times in 83 Posts
    As Eugene has already explained well, hook09, like the previous versions, is very sensitive to the processor because it is based on floating-point calculations. Naturally, I am working on making the ADMC system use cells that hold "integer" data. Unfortunately, this could cause some loss of data. However, I hope to manage it!

  10. #10
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    Quote Originally Posted by eugene
    i bet you can't decompress the file matt has compressed, and vice versa. we have a problem here. the cause is probably that different cpus produce small deltas in floating-point arithmetic. hook is very sensitive to even the smallest delta.
    IMHO future versions of HOOK (and FreeHook) should be modified so that they always produce exactly the same output on all current PC hardware.

    I don't know if anyone else will agree with this?

  11. #11
    Member
    Join Date
    Dec 2006
    Posts
    611
    Thanks
    0
    Thanked 1 Time in 1 Post
    Quote Originally Posted by LovePimple
    I don't know if anyone else will agree with this?
    Definitely! Everyone should be able to decompress a given archive, otherwise the archiver comes too close to being unusable.

  12. #12
    Tester
    Nania Francesco's Avatar
    Join Date
    May 2008
    Location
    Italy
    Posts
    1,565
    Thanks
    223
    Thanked 146 Times in 83 Posts
    Unfortunately, the engineers who created the various processors did not worry about compatibility from floating point to integer and vice versa! Starting with the next version, I commit to changing the predictor from float to integer, also in collaboration with Eugene. It is not impossible!

  13. #13
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    Quote Originally Posted by Nania Francesco Antonio
    Unfortunately, the engineers who created the various processors did not worry about compatibility from floating point to integer and vice versa! Starting with the next version, I commit to changing the predictor from float to integer, also in collaboration with Eugene. It is not impossible!
    Excellent!

  14. #14
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,257
    Thanks
    307
    Thanked 797 Times in 489 Posts
    Quote Originally Posted by LovePimple
    * Note the difference in size between mine and Matt's compressed files even though the exact same settings were used for both tests. I assume that this is caused by the different hardware?


    Matt Mahoney's test results:

    Hook v0.8e
    1700 3 1 6 > 22,039,935 bytes

    Hook v0.9
    1800 2 1 6 > 21,969,342 bytes
    1800 3 1 6 > 22,077,883 bytes


    My test results:

    Hook v0.8e
    1700 3 1 6 > 22,039,930 bytes

    Hook v0.9
    1800 2 1 6 > 21,969,337 bytes
    1800 3 1 6 > 22,077,878 bytes
    This is probably because I am testing in a different directory and hook stores the file name. I tested:

    hook c 1800 3 1 6

    esenwik8 x8

    so my results are 5 bytes larger to store the path name. My mistake. I will test future versions in the current directory to save 5 bytes.

  15. #15
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,257
    Thanks
    307
    Thanked 797 Times in 489 Posts
    Quote Originally Posted by Matt Mahoney
    hook c 1800 3 1 6
    esenwik8 x8
    Somehow my post is being mangled. It should say

    hook c 1800 3 1 6 (backslash) res (backslash) enwik8 x8

    but I guess that "(backslash) r" gets converted to a carriage return.

  16. #16
    Member
    Join Date
    Apr 2007
    Posts
    16
    Thanks
    0
    Thanked 0 Times in 0 Posts
    nania: it would rock if we manage to convert the engine from float to integer calculations... for most applications that is possible and a straightforward process... how it affects compression efficiency we can only see by doing it, but it's worth the case study!

    otherwise we could always tell the compiler to use 100% IEEE floating-point math, so it would be 100% compatible! we would only lose speed...

    but that's better than leaving the compressor in an unusable state!!

    we already have another big drawback... the symmetrical nature... if we crunch with 1 GB of mem, the person decrunching needs the same amount of memory... this sucks, but we can't bypass it because of the nature of this algorithm. no way, i guess.

    that's why all the lz packers are more popular... because you can use lots of mem and lots of cpu to pack, and the unpacker is slim and fast...

  17. #17
    Member
    Join Date
    Apr 2007
    Posts
    16
    Thanks
    0
    Thanked 0 Times in 0 Posts
    @matt: yes, it's the 5 bytes in that case, phew! so the cpus are actually doing sse the same way whether intel or amd or whatever... (i'm using amd, for example)

    about the filename, every char is copied 1 to 1 into the archive... prefixed with the "len" instead of an appended
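
    The 5-byte difference is consistent with a length-prefixed name field. A minimal sketch, assuming a 1-byte length prefix (the real width of Hook's "len" field is not stated in this thread) and a hypothetical function name `put_name`:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch of a length-prefixed file name field.
   ASSUMPTION: the length is stored in a single byte; Hook's real
   header layout is not given in the thread.
   Writes [len][name bytes] to out and returns the number of bytes
   written, or 0 if the name is too long or the buffer too small. */
size_t put_name(unsigned char *out, size_t cap, const char *name)
{
    assert(out != NULL && name != NULL);
    size_t len = strlen(name);
    if (len > 255 || cap < len + 1)
        return 0;
    out[0] = (unsigned char)len;  /* length prefix, no terminator */
    memcpy(out + 1, name, len);   /* characters copied 1 to 1 */
    return len + 1;
}
```

    Under that assumption, storing `enwik8` takes 7 bytes and storing it with the 5-character directory prefix from Matt's command line takes 12, which accounts for the 5 extra bytes in his archives.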

  18. #18
    Member
    Join Date
    Apr 2007
    Posts
    16
    Thanks
    0
    Thanked 0 Times in 0 Posts
    uff.

    the ? is "backslash 0"
    the 6 is "backslash 6"

  19. #19
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,257
    Thanks
    307
    Thanked 797 Times in 489 Posts
    Quote Originally Posted by eugene
    nania: it would rock if we manage to convert the engine from float to integer calculations... for most applications that is possible and a straightforward process... how it affects compression efficiency we can only see by doing it, but it's worth the case study!

    otherwise we could always tell the compiler to use 100% IEEE floating-point math, so it would be 100% compatible! we would only lose speed...
    paq8l has a DMC coder that uses integer math. It also saves memory. A node takes 12 bytes instead of 16 (or 24 on a 64-bit processor, since pointers are 8 bytes). In paq8l I used array indexes instead of pointers so a 64 bit implementation is still 12 bytes.

    Actually my model uses 11 bytes because I use 1 byte to record a bit history, and I use bitfields to pack the two counts into 12 bit fixed point integers with 8 bits of precision (range about 0.004 to 16). That seems to be all you need. You can clamp the counts at 15. I don't know how well that works by itself because in paq8l there are 2 probability outputs. One is the usual count1/(count0+count1) and the other is the bit history mapped to a probability using an adaptive table. Then these two predictions are combined with all the other models. The pair of counts works best for uniform data like enwik9 and the bit history works best for a mix of data types (it is more adaptive).
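
    The packing described above can be sketched as follows. This is not paq8l's code: the names (`Counts`, `predict1`, `update`), the 12-bit output scale and the update rule (add 1.0, then clamp at 15) are assumptions chosen to match the description, and the bit history byte is only reserved, not updated.

```c
#include <assert.h>

enum {
    FRAC_BITS = 8,               /* 8 fractional bits: step 1/256, about 0.004 */
    ONE       = 1 << FRAC_BITS,  /* fixed-point 1.0 */
    COUNT_MAX = 15 * ONE         /* counts are clamped at 15.0 */
};

/* Two 12-bit fixed-point counts plus one byte of bit history: 32 bits. */
typedef struct {
    unsigned int n0      : 12;
    unsigned int n1      : 12;
    unsigned int history : 8;   /* reserved here; not maintained in this sketch */
} Counts;

/* P(next bit = 1) = n1 / (n0 + n1), returned on a 0..4095 scale. */
unsigned int predict1(Counts c)
{
    unsigned int total = (unsigned int)c.n0 + (unsigned int)c.n1;
    assert(total > 0);
    unsigned int p = ((unsigned int)c.n1 * 4096u) / total;
    return p > 4095u ? 4095u : p;
}

/* Add 1.0 to the count of the observed bit, clamping at COUNT_MAX.
   The sum is formed in a full unsigned int so it cannot wrap at 12 bits. */
void update(Counts *c, int bit)
{
    unsigned int n = (unsigned int)(bit ? c->n1 : c->n0) + (unsigned int)ONE;
    if (n > (unsigned int)COUNT_MAX)
        n = (unsigned int)COUNT_MAX;
    if (bit)
        c->n1 = n;
    else
        c->n0 = n;
}
```

    Because every step is integer arithmetic, the same inputs give the same counts and predictions on any processor, which is the portability property discussed in this thread.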

    About floating-point math: I think the difference between x87 and SSE is that x87 uses 80-bit temporaries and SSE doesn't. Also, SSE has a faster mode where denormalized numbers are treated as 0. This only affects very small numbers (~10^-38). SSE could do 4 floating-point operations at once, but C/C++ compilers will only generate code that uses the xmm registers as scalars, which uses only 1/4 of the register. If you want to use the full power of SSE you have to use assembler and figure out which operations can run in parallel. I don't see where you could do this with DMC. In paq7/8 the mixer (a neural network) uses MMX to do 4 multiply-accumulates at once, but most of the other code is not parallelizable.
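
    To make the denormal point concrete, here is a small software emulation of a flush-to-zero mode for single precision. It is only a sketch: the function names are hypothetical, real hardware does this through a CPU control setting rather than in code, and the sign of the resulting zero is ignored.

```c
#include <assert.h>
#include <float.h>

/* Returns 1 if x is nonzero but smaller in magnitude than FLT_MIN,
   the smallest normalized float (about 1.18e-38). */
int is_denormal(float x)
{
    return (x > 0.0f && x < FLT_MIN) || (x < 0.0f && x > -FLT_MIN);
}

/* Emulates flush-to-zero: denormal inputs become 0, everything else
   passes through unchanged. */
float flush_to_zero(float x)
{
    float y = is_denormal(x) ? 0.0f : x;
    assert(!is_denormal(y));
    return y;
}
```

    Values at or above `FLT_MIN` are untouched, so two builds would only disagree if the model ever produced values in that tiny range.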

    Anyway, unless you use compiler options that say otherwise, the compiler generates x87 code, so it doesn't matter what you run it on.

    It's true you need as much memory to decompress as to compress, but all PPM and CM compressors have this problem too, and BWT needs almost as much.

  20. #20
    Member
    Join Date
    Apr 2007
    Posts
    16
    Thanks
    0
    Thanked 0 Times in 0 Posts
    @matt: ack to all points. nearly... is ppm really always symmetrical?

    and yes, if dmc could be parallelized in the usual way, someone (or me) would have done so.. i like assembler

  21. #21
    Moderator

    Join Date
    May 2008
    Location
    Tristan da Cunha
    Posts
    2,034
    Thanks
    0
    Thanked 4 Times in 4 Posts
    Future versions of HOOK and FreeHook should be truly awesome!

  22. #22
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,257
    Thanks
    307
    Thanked 797 Times in 489 Posts
    Quote Originally Posted by eugene
    is ppm really always symmetrical?
    Take a look at decompression times at http://cs.fit.edu/~mmahoney/compression/text.html

    PPM, CM, DMC, LZW and LZP all have to do modeling in the decompressor. For fast decompression you want LZ77 (LZX, LZMA, ROLZ), or BWT as your next choice.

    BTW, barf has a really fast LZ77 decompressor, even though I wrote it as a joke. It's real simple. Bytes 0-31 code for a literal and 32-255 code for a match of length 2. Compression isn't great but it allows for a bit of recursion. Maybe I should benchmark it.

  23. #23
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,257
    Thanks
    307
    Thanked 797 Times in 489 Posts
    Quote Originally Posted by Matt Mahoney
    BTW, barf has a really fast LZ77 decompressor, even though I wrote it as a joke. It's real simple. Bytes 0-31 code for a literal and 32-255 code for a match of length 2. Compression isn't great but it allows for a bit of recursion. Maybe I should benchmark it.
    Aw, what the hell,
    http://cs.fit.edu/~mmahoney/compression/text.html#7594
    quicklz is still faster.

  24. #24
    Member
    Join Date
    Apr 2007
    Posts
    16
    Thanks
    0
    Thanked 0 Times in 0 Posts
    matt: seems i would be duplicating work by porting to fixed-point math, as dmc in paq is already well done... though there it's carried out with less memory and without the lz step, but that's what we would expect from the dmc model in paq.

  25. #25
    Tester
    Nania Francesco's Avatar
    Join Date
    May 2008
    Location
    Italy
    Posts
    1,565
    Thanks
    223
    Thanked 146 Times in 83 Posts
    Sorry for the delay, but unfortunately, for work reasons, I have no free time until tomorrow! I really hope that an integer engine can be built together with Eugene and Matt Mahoney! In the tests I have run, speed increases by almost 100%!

  26. #26
    Member
    Join Date
    Apr 2007
    Posts
    16
    Thanks
    0
    Thanked 0 Times in 0 Posts
    speed increases are good, but what about compression ratio and memory usage?

    with fixed-point integer math we lose precision i guess, so there should be a difference!

    i'm quite busy, i'll report here when i've finished the first working version of what i have in mind.

    i can't state a deadline right now because i know the next days will be very busy ones... i have a very demanding life right now. i just want to say: don't expect it tomorrow, but i'll present something here shortly

    nania: you can always mail me of course and i'll look into any issue...

    (we can beat barf already! haha)

  27. #27
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,257
    Thanks
    307
    Thanked 797 Times in 489 Posts
    I used integers in paq8l DMCModel not for speed but for portability and to save memory. I know the archive will be exactly the same on any processor. Integer division is actually slower than floating point division. There is not much effect on compression ratio. I could not improve on compression over using just 12 bit integers.

  28. #28
    Tester
    Nania Francesco's Avatar
    Join Date
    May 2008
    Location
    Italy
    Posts
    1,565
    Thanks
    223
    Thanked 146 Times in 83 Posts
    For Matt and Eugene! Matt's DMC system, which I have modified quite a bit and adapted to ADMC, works perfectly, but it is not tuned yet! The solution is a little hard to find! However, as soon as it is improved I will send you a source of freehook with the same integer engine that I will use in the next version! It seems to save memory!
    typedef struct emubrain {struct emubrain *ahead[2]; unsigned long cxt[2];} branch;
    Here are the results (Pentium IV 3 GHz HT):
    FP.LOG -> 128 1 1 12 532,227 sec. 5.92 (3382Kb/s)
    ENWIK8 -> 1200 3 1 5 22.931.713 sec. 107.80 (905Kb/s)
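
    For comparison, here is the posted pointer-based cell next to an index-based one of the kind Matt describes for paq8l. The second struct (`branch_idx`) and the helper `cells_in_budget` are hypothetical; the thread does not show the 12-byte layout Hook ended up with.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pointer-based cell, as posted in the thread. Its size depends on
   the platform's pointer and unsigned long widths. */
typedef struct emubrain {
    struct emubrain *ahead[2];
    unsigned long cxt[2];
} branch;

/* HYPOTHETICAL index-based cell: successors are 32-bit indexes into a
   cell array and both counts are packed into one 32-bit word, so the
   cell stays 12 bytes on 32-bit and 64-bit builds alike. */
typedef struct {
    uint32_t ahead[2];
    uint32_t cxt;
} branch_idx;

/* Number of whole cells that fit in a memory budget. */
size_t cells_in_budget(size_t budget_bytes, size_t cell_bytes)
{
    assert(cell_bytes > 0);
    return budget_bytes / cell_bytes;
}
```

    With the same budget, a 12-byte cell allows a third more cells than a 16-byte one (for example, 100 instead of 75 cells in 1200 bytes), which is one plausible source of the memory savings reported here.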

  29. #29
    Tester
    Nania Francesco's Avatar
    Join Date
    May 2008
    Location
    Italy
    Posts
    1,565
    Thanks
    223
    Thanked 146 Times in 83 Posts
    RESULTS ON
    Enwik8 ->22.448.453

  30. #30
    Tester
    Nania Francesco's Avatar
    Join Date
    May 2008
    Location
    Italy
    Posts
    1,565
    Thanks
    223
    Thanked 146 Times in 83 Posts
    HOOK (C) 2007
    free closed-source demo archiver/compressor:
    v0.9b *
    - Full CPU compatibility;
    - Much faster compression/decompression, +20-30% (in ADMC mode and ADMC+LZP mode);
    - Very low memory use, ADMC cells of 12 bytes, 20-60% less;
