
Thread: another (too) fast compressor

  1. #1
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    863
    Thanks
    460
    Thanked 257 Times in 105 Posts

    LZ4 : a (too) fast compressor

    Well, the LZP2 release is not so distant, but there is already something new on the table...
    Wanting to compare LZ77 and LZP, I just modified LZP2 to look into a hash table instead of a context table.

    And that's nearly all. Although the logic of LZP and that of LZ77 are quite different, from a code perspective they are strikingly similar.
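
    To illustrate how close the two lookups can be in code, here is a minimal sketch. It is not the actual LZP2 or LZ4 source: the names hash4, lz77_candidate and lzp_candidate, the table size and the hash constant are all assumptions made for the example. The only difference between the two functions is which 4 bytes feed the hash: the bytes at the current position (LZ77 style) or the bytes just before it (LZP style context).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HASH_BITS 12
#define HASH_SIZE (1u << HASH_BITS)

/* Hash 4 bytes starting at p. Multiplicative hash; the constant is an
   arbitrary choice for this sketch. */
static uint32_t hash4(const uint8_t *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);                 /* alignment-safe 32-bit read */
    return (v * 2654435761u) >> (32 - HASH_BITS);
}

/* LZ77 style: the table is indexed by the 4 bytes AT the current position.
   Returns the previous position stored in that slot (0 if the slot was
   never written) and records pos in its place.
   Requires pos + 4 <= buffer length. */
static size_t lz77_candidate(uint32_t *table, const uint8_t *base, size_t pos)
{
    uint32_t h = hash4(base + pos);
    size_t cand = table[h];
    table[h] = (uint32_t)pos;
    return cand;
}

/* LZP style: the table is indexed by the 4 bytes BEFORE the current position
   (the context), so the slot predicts where the upcoming data was last seen.
   Requires pos >= 4. */
static size_t lzp_candidate(uint32_t *table, const uint8_t *base, size_t pos)
{
    uint32_t h = hash4(base + pos - 4);
    size_t cand = table[h];
    table[h] = (uint32_t)pos;
    return cand;
}
```

    In both cases the caller still has to compare the bytes at the candidate with the bytes at pos, since a slot may be empty or hold a hash collision. An LZ77 encoder then has to transmit the offset, whereas an LZP decoder can recompute the same prediction by itself.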

    This produces LZ4, yet another fast compressor, but with a much better compression ratio on text than LZP2, thanks to the superior hit rate of LZ77. It is not as fast as LZP2, but the trade-off is not too bad: a 20% speed decrease for a 20% ratio increase on text is a good deal.

    What really strikes me, however, is its decompression speed. It is far beyond what I expected, reaching 1 GB/s on virtual HDD images. Enwik9 is decoded in 2 seconds. That is seriously fast, and it is the reason why I'm posting this release earlier than expected.

    Well, if you want to give it a try, it is hosted here:
    http://pc-compression.dnsalias.com//...ws-t95.htm#144

    Edit: maybe a competitor to LZSS for decoding speed (next release)?

    Regards
    Last edited by Cyan; 28th October 2011 at 18:51. Reason: changed title

  2. #2
    Member Vacon's Avatar
    Join Date
    May 2008
    Location
    Germany
    Posts
    523
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Hello everyone,

    So far I have only "tested" the initial screen.
    Three questions though:
    - what is the difference between "-u" and "-d"? Both are defined as "decode compressed file"?
    - which license will you take?
    - Archiver or compressor?

    Best regards

  3. #3
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    863
    Thanks
    460
    Thanked 257 Times in 105 Posts
    Quote Originally Posted by Vacon View Post
    - what is the difference between "-u" and "-d"? Both are defined as "decode compressed file"?
    Hi Vacon

    -u and -d are identical. Maybe I should make that clearer.
    It is because some people think of "-d" as "decode", others of "-u" as "uncompress".

    Quote Originally Posted by Vacon View Post
    - which license will you take?
    - Archiver or compressor?
    This is a pure single-file command-line compressor.
    No container.
    And by the way, if anyone knows of an interesting generic GUI that could be adapted, with container possibilities if that's possible, i'm really interested ...


    Regarding licenses, I don't know enough about this, so it seems a little too soon to settle.
    Let's say it is free to use and distribute.
    Is there a need for a document / readme / disclaimer?
    Last edited by Cyan; 18th October 2009 at 08:35.

  4. #4
    Member Vacon's Avatar
    Join Date
    May 2008
    Location
    Germany
    Posts
    523
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Hello everyone,

    -u and -d are identical. Maybe I should make that clearer.
    It is because some people think of "-d" as "decode", others of "-u" as "uncompress".
    yes, you should...

    This is a pure single-file command-line compressor.
    No container.
    And by the way, if anyone knows of an interesting generic GUI that could be adapted, with container possibilities if that's possible, i'm really interested ...
    Generic...? What about PeaZip? DoubleCommander? Or CoffeeArc?

    Regarding licenses, I don't know enough about this, so it seems a little too soon to settle.
    Let's say it is free to use and distribute.
    Is there a need for a document / readme / disclaimer?
    Not so urgent, but it could be of interest. Nothing gets "borrowed" as quickly as a good idea for software...

    Best regards!

  5. #5
    Member Fu Siyuan's Avatar
    Join Date
    Apr 2009
    Location
    Mountain View, CA, US
    Posts
    176
    Thanks
    10
    Thanked 17 Times in 2 Posts
    And by the way, if anyone knows of an interesting generic GUI that could be adapted, with container possibilities if that's possible, i'm really interested ...
    I'm really interested too! Has anyone ever looked at how WinUHA works? It calls the original UHARC.exe and sends parameters to it while processing files. It can also show the progress. So I wonder how this could be implemented if I wanted a similar shell for other command-line compressors.

  6. #6
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,507
    Thanks
    742
    Thanked 665 Times in 359 Posts
    btw, I plan to implement a stdio mode in FreeArc, i.e. fa and the compressor work in parallel, and fa shows a progress indicator based on how much data has already been read by the compressor and how much data it has written
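
    The arithmetic behind such an indicator could look like the sketch below. This is only an illustration, not FreeArc code: progress_percent is a hypothetical helper, and it assumes the front-end knows the total input size and can count the bytes it has fed to the compressor.

```c
#include <stdint.h>

/* Percentage of the input consumed so far, clamped to the range 0..100.
   A total of 0 (empty or unknown input) reports 0.
   Assumes bytes_read * 100 fits in 64 bits, i.e. inputs well below 2^57 bytes. */
static unsigned progress_percent(uint64_t bytes_read, uint64_t total_in)
{
    if (total_in == 0)
        return 0;
    if (bytes_read >= total_in)
        return 100;
    return (unsigned)((bytes_read * 100u) / total_in);
}
```

    The indicator can only be as fine-grained as the compressor's reads: a compressor that pulls 256 MB at a time from a 1 GB input would make the display jump in steps of 25%.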

    Also, it's possible to make compression DLLs using CLS technology; look in the CLS directory for the simple_codec.cpp example.

    I plan to improve CLS and rewrite all internal algorithms of FA using it, so they can be moved to external DLLs and used in other programs, upgraded, downloaded from the internet and so on.

  7. #7
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    863
    Thanks
    460
    Thanked 257 Times in 105 Posts
    Quote Originally Posted by Vacon View Post
    Generic...? What about PeaZip? DoubleCommander? Or CoffeeArc?
    Yes, interesting.

    CoffeeArc has been presented in this forum, and is used for LZP2.
    It is a pretty nice idea, easy to set up (just edit an XML file, and there you go).
    I would say it goes in the right direction.

    However, it is no longer maintained, and the last version leaves, well, quite some room for improvement, especially in the file manager part.
    And I don't like the dependency on Java.


    Now, regarding PeaZip: yes, at first glance it looks like the right thing,
    but I have not found any way to edit options, names, etc.
    It seems everything is in the code, which is not good for third-party customization.
    Then you could say "take the sources and modify them". Well, sure. Take your feet, go around the mountain... It obviously takes much, much longer.

    Double commander... well, i don't know this one.... I haven't seen any editing option either...


    So in a word, I would like the easy XML integration of CoffeeArc, with the file manager of PeaZip...


    Note that Bulat's idea seems very interesting too, and given his experience and track record, one can expect a very nice execution.

  8. #8
    Member Vacon's Avatar
    Join Date
    May 2008
    Location
    Germany
    Posts
    523
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Hello everyone,

    Double commander... well, i don't know this one.... I haven't seen any editing option either...
    Well, to be honest: DoubleCommander is a TotalCommander clone that aims to be multi-platform. So all plugins made for TC (should) work with DC (fingers crossed...). At least Bulat's FreeArc add-on (shipped with FreeArc in the folder "Addons -> TotalCommander MultiArc plugin") makes some nice little archives when used with DC.
    If you have some spare time to look at it, go to "Configure" -> "Options" -> "Plugins" (middle of left side) -> "Archiver-Plugins" (on top of right side)

    Best regards!

  9. #9
    Member Skymmer's Avatar
    Join Date
    Mar 2009
    Location
    Russia
    Posts
    681
    Thanks
    38
    Thanked 168 Times in 84 Posts
    Quote Originally Posted by Cyan View Post
    And by the way, if anyone knows of an interesting generic GUI that could be adapted, with container possibilities if that's possible, i'm really interested ...
    GTK+ maybe?
    Or, simpler, FroG

    Quote Originally Posted by Fu Siyuan View Post
    I'm really interested too! Has anyone ever looked at how WinUHA works? It calls the original UHARC.exe and sends parameters to it while processing files. It can also show the progress. So I wonder how this could be implemented if I wanted a similar shell for other command-line compressors.
    It's very widely known, and you know it too. It's called Borland Delphi (6.0\7.0)

  10. #10
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    863
    Thanks
    460
    Thanked 257 Times in 105 Posts
    That little frog seems interesting
    i shall have a look at it....

  11. #11
    Member Fu Siyuan's Avatar
    Join Date
    Apr 2009
    Location
    Mountain View, CA, US
    Posts
    176
    Thanks
    10
    Thanked 17 Times in 2 Posts
    It's very widely known, and you know it too. It's called Borland Delphi (6.0\7.0)
    Yeah, I know it. Actually I only know it. But what about the implementation details? How do you communicate with the console program, and especially receive progress information?

    btw, I plan to implement a stdio mode in FreeArc, i.e. fa and the compressor work in parallel, and fa shows a progress indicator based on how much data has already been read by the compressor and how much data it has written
    What if the compressor reads blocks of hundreds of megabytes at a time?

  12. #12
    Member Skymmer's Avatar
    Join Date
    Mar 2009
    Location
    Russia
    Posts
    681
    Thanks
    38
    Thanked 168 Times in 84 Posts
    Quote Originally Posted by Fu Siyuan View Post
    Yeah, I know it. Actually I only know it. But what about the implementation details? How do you communicate with the console program, and especially receive progress information?
    Maybe this source will help you.
    http://www.aboutmyip.com/files/DeltaCopySrc.zip
    As I understand it, the DeltaCopy tool also captures progress from the console and reflects it in its GUI. I could be wrong though.

  13. #13
    Programmer giorgiotani's Avatar
    Join Date
    May 2008
    Location
    Italy
    Posts
    166
    Thanks
    3
    Thanked 2 Times in 2 Posts
    Hi, even if its mechanism is not comparable in flexibility to CoffeeArc's, PeaZip supports using custom executables for compression and extraction.
    For example, you can set the front-end to compress with an arbitrary program by selecting "Custom" in the format combobox; in the "Advanced" tab you can set the executable's name and the desired file extension, enter a string of parameters, and choose the order in which the parameter string, input and output are combined on the command line.
    The last used custom executables are remembered by the application.
    For extraction, the "Advanced" tab lets you enable handling of custom file types and define the custom extractor's syntax in the same way.
    For both compression and extraction, the command line can be further refined in the "Console" tab.
    This mechanism was mainly implemented to help users experiment with new command-line compressors (the benefit is mainly a GUI to set input and output), rather than as a plugin model for integrating custom executables.

    A quicker, less orthodox solution, if the executable implements a syntax compatible with that of another format (e.g. paq, which has a simple but well-designed syntax), is to replace the old executable with the new one, which PeaZip will then use in its place.

  14. #14
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    863
    Thanks
    460
    Thanked 257 Times in 105 Posts
    Thanks Giorgiotani. This is a nice hint.

    Sounds *very* interesting.
    I will surely have a look at it.

    Could be exactly what i'm looking for ...

  15. #15
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    863
    Thanks
    460
    Thanked 257 Times in 105 Posts

    Alas,
    PeaZip does not offer container capability in "custom" mode,
    which is a pity, because the TAR code is obviously present,
    but it is intentionally greyed out when selecting "custom compressor".

    Without this container capability, file-only compressors are much too restricted to use PeaZip as a front-end...

  16. #16
    Programmer giorgiotani's Avatar
    Join Date
    May 2008
    Location
    Italy
    Posts
    166
    Thanks
    3
    Thanked 2 Times in 2 Posts
    I understand the problem. The "tar before" option was disabled for the custom format because it would not allow editing the resulting command in the console (the job would be performed in two stages, from two command lines).
    That is not a problem for custom archivers, but it is a limitation for custom single-file compressors, which can only be used as such rather than being transparently extended into archive+compression utilities.
    I may change this in a future update to extend single-pass archiving to custom single-file compressors; until then the only way is two passes: first create the tar, then run the custom single-file compressor on it.

  17. #17
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    863
    Thanks
    460
    Thanked 257 Times in 105 Posts
    Thanks for the information, giorgiotani

  18. #18
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,611
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Cyan, do you intend to open source LZ4HC eventually?

  19. #19
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    863
    Thanks
    460
    Thanked 257 Times in 105 Posts
    Well, I've been asked that question quite a few times, so I think the answer is yes.

    The main difference is going to be the license: since LZ4HC uses MMC as its core search algorithm, and MMC is GPL, LZ4HC will be GPL too.
    Its output will, however, remain 100% compatible with the regular "fast" LZ4, which is BSD.
    So one can build an "offline" compressor using the GPL'd LZ4HC, and ship the BSD LZ4 decoder in the final commercial code.

    Regards

  20. #20
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,611
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Thanks for the answer.

  21. #21
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    863
    Thanks
    460
    Thanked 257 Times in 105 Posts
    Just FYI,
    LZ4-HC is now Open Sourced.
    It is currently hosted within the MMC google code website, since it makes heavy use of this search algorithm :
    http://code.google.com/p/mmc/

    LZ4-HC and LZ4 are fully compatible with each other.
    The HC version (High Compression) compresses better, by about 20%, but is also much slower. It seems more appropriate for offline compression, when compression time does not matter.

    Obviously, a file/stream created by LZ4-HC can be decoded by LZ4, since the decoder function is exactly the same.

    Regards

  22. #22
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,611
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Two problems.
    First, it doesn't compile with gcc on Windows: your macros make it use Visual Studio types, which don't work there. After forcing it to use the stdint ones, everything is fine.
    Second, when linking lz4 and lz4hc together I get two copies of lz4_uncompress, which I don't know how to deal with.

  23. #23
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    863
    Thanks
    460
    Thanked 257 Times in 105 Posts
    Good points, m^2 !

    Merging LZ4 & LZ4HC together is solved in r8.
    LZ4HC is now strictly dedicated to the LZ4_compressHC() function,
    avoiding duplicated definition.

    Regarding types: yes, I was aware of the gcc compilation issue under Windows, but was unable to understand the cause, especially since gcc compilation under Linux is fine.
    Your explanation does indeed make sense.

    That would mean that the following line, which tests for a Win32 target:
    #if defined(_MSC_VER) || defined(_WIN32) || defined(__WIN32__)

    is not good enough. It should also test for the Visual compiler.
    Is there a #define for that?

    Rgds

  24. #24
    Member m^2's Avatar
    Join Date
    Sep 2008
    Location
    Ślůnsk, PL
    Posts
    1,611
    Thanks
    30
    Thanked 65 Times in 47 Posts
    Well, _MSC_VER is for that. So why not simply #if defined(_MSC_VER) ?
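
    A minimal sketch of what that change amounts to, assuming type names U8/U16/U32/U64 (hypothetical here, not taken from the LZ4 sources): only _MSC_VER selects the Visual Studio specific types, so every other compiler, including gcc/MinGW on Windows, falls back to <stdint.h>.

```c
#if defined(_MSC_VER)
/* Visual Studio: use the compiler-specific sized integer types. */
typedef unsigned __int8  U8;
typedef unsigned __int16 U16;
typedef unsigned __int32 U32;
typedef unsigned __int64 U64;
#else
/* Any other compiler (gcc, MinGW, clang, ...): use the C99 fixed-width types. */
#include <stdint.h>
typedef uint8_t  U8;
typedef uint16_t U16;
typedef uint32_t U32;
typedef uint64_t U64;
#endif
```

    Testing _WIN32 here would conflate the target OS with the compiler, which is exactly what trips up gcc on Windows.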

  25. #25
    Member
    Join Date
    Sep 2008
    Location
    France
    Posts
    863
    Thanks
    460
    Thanked 257 Times in 105 Posts

    Ok, granted.
    The latest LZ4HC release avoids the compilation issue with GCC under Windows (notably MinGW), thanks to the modification proposed by m^2.

    While at it, I've also added the ability to select Fast Compression (-c0) or High Compression (-c1) from the command line.

    There is also a Pipe mode available, which can chain the output of a previous command (typically "tar") directly to the input of LZ4HC, and the other way round. It works well under Linux.

    However, pipe mode in Windows seems limited to Text mode. There is probably something specific to do under this OS.
    Last edited by Cyan; 3rd September 2011 at 20:12.

  26. #26
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,507
    Thanks
    742
    Thanked 665 Times in 359 Posts
    Quote Originally Posted by Cyan View Post
    However, pipe mode in Windows seems limited to Text mode. There is probably something specific to do under this OS.
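    #ifdef FREEARC_WIN
    /* Windows: switch the stream's file descriptor from text to binary mode */
    #define set_binary_mode(file) setmode(fileno(file),O_BINARY)
    #else
    /* elsewhere: nothing to do */
    #define set_binary_mode(file)
    #endif

    set_binary_mode (fin);
    set_binary_mode (fout);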

  27. #27
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,366
    Thanks
    212
    Thanked 1,018 Times in 540 Posts
    To be more specific:
    Code:
    #ifndef __GNUC__
    #include <io.h>
    #include <fcntl.h>
    #endif
    
    
    #ifndef __GNUC__
      setmode( fileno(stdin), O_BINARY );
      setmode( fileno(stdout), O_BINARY );
    #endif

  28. #28
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,507
    Thanks
    742
    Thanked 665 Times in 359 Posts
    it's windows-specific rather than gcc-specific

  29. #29
    Member
    Join Date
    Jun 2009
    Location
    Kraków, Poland
    Posts
    1,474
    Thanks
    26
    Thanked 121 Times in 95 Posts
    IIRC, Unixes do not differentiate between text files and binary files. No conversions are done, so different modes aren't needed.

  30. #30
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,507
    Thanks
    742
    Thanked 665 Times in 359 Posts
    To be exact, DOS/Windows uses a different line-ending character sequence, and "text mode" just emulates Unix behavior. Of course, that's not required on Unix itself.
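
    Putting the replies above together, a pipe-friendly filter could be sketched as follows. This is an illustration rather than LZ4 or FreeArc code: SET_BINARY_MODE and copy_stream are names made up for the example, and the switch is keyed on _WIN32 (the OS) rather than on the compiler. _setmode, _fileno and _O_BINARY are the Microsoft C runtime spellings of the calls shown earlier.

```c
#include <stdio.h>

#if defined(_WIN32)
#include <io.h>      /* _setmode, _fileno */
#include <fcntl.h>   /* _O_BINARY */
/* Windows: disable CR/LF translation and Ctrl-Z handling on the stream. */
#define SET_BINARY_MODE(f) _setmode(_fileno(f), _O_BINARY)
#else
/* Unix-like systems make no text/binary distinction: nothing to do. */
#define SET_BINARY_MODE(f) ((void)0)
#endif

/* Copy every byte from in to out.
   Returns the number of bytes copied, or -1 on a read or write error. */
static long long copy_stream(FILE *in, FILE *out)
{
    unsigned char buf[4096];
    long long total = 0;
    size_t n;

    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n)
            return -1;
        total += (long long)n;
    }
    return ferror(in) ? -1 : total;
}
```

    A real pipe mode would call SET_BINARY_MODE(stdin) and SET_BINARY_MODE(stdout) once at startup, before any data is read or written, and then run the compressor or decompressor in place of copy_stream(stdin, stdout).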
