
Thread: Urban Compressor?

  1. #1
    Member
    Join Date
    May 2012
    Location
    United States
    Posts
    331
    Thanks
    191
    Thanked 55 Times in 39 Posts

    Urban Compressor?

    Hi everyone,

I am working on creating a benchmark of old DOS archivers from the early 1990s.

I am trying to get a compiled version of the old Urban archiver from 1991. All I can find is the source, and I know nothing about compiling.

Can anyone make a DOS executable for me?

    Here is the source: ftp://ftp.sac.sk/pub/sac/pack/ddjcompr.zip

Thanks, everyone. I will be posting my benchmark here soon. Nothing scientific, just a simple comparison.

  2. #2
    Member
    Join Date
    May 2012
    Location
    UK
    Posts
    7
    Thanks
    0
    Thanked 0 Times in 0 Posts
Hi again, I look forward to reading your benchmark. As for a compiled version of that program, I have just compiled it here:
http://www.mediafire.com/?wv8i83n234qzzx4
It was easy to compile (with MinGW), but it took me a few minutes to figure out the commands to compress files (I then found out that "enter.asc" tells you the command for compressing).
So usage for compression is
ddj <DOCUMENT.TXT >OUTPUT.DDJ
and decompression is
unddj <OUTPUT.DDJ >DOCUMENT.TXT

The < and > are important; it won't work without them. Any other problems, let me know.

  3. #3
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,257
    Thanks
    307
    Thanked 795 Times in 488 Posts
Here is a 32-bit Windows compile. I had to change main.c so that input and output redirection work in binary mode. Modified source is included. If you compile for Linux, use the original code. You only need the files in the subdirectory urban to compile. I just used "make ddj" and "make unddj" with MinGW g++ 4.6.1.

    To compress: ddj < input > output
    To decompress: unddj < input > output

    Also, some tests on Silesia. Compression is a little worse than zip. http://mattmahoney.net/dc/silesia.html

    Edit: Also LTCB. As far as I can tell from the uncommented source code, it is an order 2 context model with bitwise arithmetic coding.
    Attached Files
    Last edited by Matt Mahoney; 17th June 2012 at 23:58.

  4. #4
    Member
    Join Date
    May 2012
    Location
    United States
    Posts
    331
    Thanks
    191
    Thanked 55 Times in 39 Posts
    Quote Originally Posted by post #2
    Hi again, i look forward to reading your benchmark. And for the compiled version of that program i have just compiled it here
    http://www.mediafire.com/?wv8i83n234qzzx4
    ...
    so usage for compression is
    ddj <DOCUMENT.TXT >OUTPUT.DDJ
    and decompression is
    unddj <OUTPUT.DDJ >DOCUMENT.TXT
    This one did not work for me. Is it compiled as a 16-bit DOS executable? I tried the syntax you gave me, and it would only output a 200-byte file.

    Quote Originally Posted by Matt Mahoney View Post
    Here is a 32 bit Windows compile. I had to change main.c so that input and output redirection work in binary mode. Modified source is included. ...

    To compress: ddj < input > output
    To decompress: unddj < input > output
    This one works! Is it too much to ask to compile it as a 16-bit DOS executable? My benchmark is of 16-bit DOS archivers of the early 1990s, and 32-bit wasn't in use yet in 1991 when Urban Compressor came out.

    Thanks to both of you for the quick replies!

  5. #5
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,257
    Thanks
    307
    Thanked 795 Times in 488 Posts
    MinGW does not compile to DOS AFAIK. And actually, the program is written for Unix. It would not work in DOS because (it appears) it uses more than 640K of memory. Compiling with dmc -mc (Digital Mars, 16-bit, small code, large data) gives an error that the array h is larger than 64K.

    I didn't test the other compile, but with no source changes it doesn't work in Windows. I had to modify the source for binary I/O like this in main.c:

    Code:
    #include <fcntl.h>  // O_BINARY
    #include <io.h>     // setmode(), with MinGW
    int main() {
      setmode(0, O_BINARY);  // stdin in binary mode
      setmode(1, O_BINARY);  // stdout in binary mode
      // ... rest of main() unchanged
    This would not work in Linux. For the LTCB test, I ran under Linux with no source changes.

  6. #6
    Member
    Join Date
    May 2012
    Location
    United States
    Posts
    331
    Thanks
    191
    Thanked 55 Times in 39 Posts
    Thanks Matt.

  7. #7
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,257
    Thanks
    307
    Thanked 795 Times in 488 Posts
    I looked at the code some more. The program is an order-2 indirect context model with bitwise arithmetic coding. It maps a context hash (2 whole bytes plus previously coded bits) to a hash table of size 710123 containing a pair of bit counts and a second hash for detecting collisions so counts can be reset to 0.

    The counts have range 0..8 and are both halved if either exceeds 8. The pair of counts (and byte count mod 3, which probably doesn't help except for maybe .bmp files) is then mapped to a second pair of counts in the range 0..60000, which are both halved if the sum exceeds 60000. The initial mapping is (n0,n1) -> (n0,n1) unless one of the counts is 0, in which case it is (0,n1) -> (1,1+2^n1). The bit prediction is n1/(n0+n1).

    This is the earliest case of an indirect context model that I know of, from 1991. I used it in PAQ over 10 years later, not aware of this work. I also found that detecting hash collisions is much more important for indirect models than for direct context models.

  8. #8
    Member kampaster's Avatar
    Join Date
    Apr 2010
    Location
    ->
    Posts
    55
    Thanks
    4
    Thanked 6 Times in 6 Posts
    comp1, My collection:
    Attached Files

  9. #9
    Member
    Join Date
    May 2012
    Location
    United States
    Posts
    331
    Thanks
    191
    Thanked 55 Times in 39 Posts
    Quote Originally Posted by kampaster View Post
    comp1, My collection:
    Wow! Great collection! At first glance, I'm sure there are some archivers in your collection that I don't have and can add to the list for the benchmark.

    Thanks!

  10. #10
    Member kampaster's Avatar
    Join Date
    Apr 2010
    Location
    ->
    Posts
    55
    Thanks
    4
    Thanked 6 Times in 6 Posts
    Is it necessary?
    Attached Files

  11. #11
    Member
    Join Date
    May 2012
    Location
    United States
    Posts
    331
    Thanks
    191
    Thanked 55 Times in 39 Posts
    Ok everyone,

    I found the compiler needed to build these old archivers into 16-bit DOS executables.

    I am trying to decide whether my benchmark should cover pre-1990s archivers or include some from the 1990s as well. It cannot cover all 16-bit DOS archivers, because the obvious winners would be those created in the late '90s and early '00s, which use large amounts of memory (and that defeats the purpose, since most people did not have 64 MB+ of RAM back when MS-DOS was dominant).

    So I am open to suggestions as to which 16-bit archivers I should use.

    Thanks everyone.

  12. #12
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,257
    Thanks
    307
    Thanked 795 Times in 488 Posts
    You could benchmark on very small files. Then memory won't make much difference.

  13. #13
    Member
    Join Date
    Aug 2013
    Location
    Stockholm
    Posts
    1
    Thanks
    0
    Thanked 1 Time in 1 Post
    There was a bug that caused it to fail when the input was all zeroes. Fixed. It might compress a little better now.
    https://github.com/Koistinen/Urban

  14. Thanks:

    Surfer (24th August 2013)
