Thread: Good Practical Strategy to Compress Multi-Media BIGFILES

  1. #1
    Member
    Join Date
    Jan 2008
    Posts
    15
    Thanks
    0
    Thanked 0 Times in 0 Posts
    I have recently begun to dump and backup my original PSP UMD games as ISO-files. I want to store them in a compressed format to save space but also would like the decompression time to be practical.

    For this purpose I have played around with 7zip 4.57 and other packers such as CCMx 1.30 (and UHARC06, LPAQ8, and FreeArc). My source files were all single ISO-files and contain inhomogeneous multi-media/game data (incl. ATRAC3 audio and other types like graphic and map data).

    General observation: In many cases I got very good results with 7zip LZMA Ultra Solid, 64MB dictionary, word size 273. However, in some cases UHARC and especially CCMx and LPAQ compressed much better (~20% smaller files). FreeArc (-mx switch) usually gave about the same results as 7zip (sometimes slightly better).
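
    For anyone who wants to reproduce settings of this kind without the 7zip GUI, here is a minimal sketch using Python's standard lzma module. It writes an .xz container rather than a .7z archive, and it assumes that liblzma's nice_len corresponds to what 7zip calls word size. The function names are made up for illustration.

```python
import lzma

def lzma_ultra_filters(dict_size=64 * 1024 * 1024, nice_len=273):
    """Filter chain roughly mirroring 'Ultra, 64MB dictionary, word size 273'."""
    return [{
        "id": lzma.FILTER_LZMA2,
        "preset": 9 | lzma.PRESET_EXTREME,  # start from the strongest preset
        "dict_size": dict_size,             # dictionary size in bytes
        "nice_len": nice_len,               # assumed equivalent of 7zip's word size
    }]

def compress_file(src_path, dst_path, filters=None, chunk_size=1 << 20):
    """Stream src_path into an .xz file at dst_path."""
    filters = filters or lzma_ultra_filters()
    with open(src_path, "rb") as src, \
         lzma.open(dst_path, "wb", format=lzma.FORMAT_XZ, filters=filters) as dst:
        while chunk := src.read(chunk_size):
            dst.write(chunk)
```

    Note that a 64MB dictionary needs several hundred MB of RAM while compressing, so a smaller dict_size is more comfortable for quick experiments.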

    In my test case a lot may depend on the detection and segmentation of the different file types within the ISO and the application of appropriate compression filters to cope with them. I tried Precomp but for almost all of my test files it could not do anything (the ISOs do not contain a lot of zLib/Deflate/GIF files).

    Some questions:

    1) Is there a way to pre-process (e.g., segment) them for better compression? I tried the durilca02 -t1 -l switches but didn't know how to handle the segmented files then and how to merge them back to the original file.

    2) Is there a way to analyse an ISO-file and "predict" which compressor/compression method might work better?

    Any suggestions?
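
    Regarding question 2, one rough way to look inside an image is a per-block byte entropy profile. This is only a sketch of the idea, not a predictor for any specific compressor: blocks close to 8 bits per byte are probably already compressed (audio, packed graphics), while low-entropy blocks should respond well to LZMA or a CM compressor. The function names and the 1MB block size are my own choices.

```python
import math
from collections import Counter

def byte_entropy(block):
    """Shannon entropy of a bytes object in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    total = len(block)
    return -sum(n / total * math.log2(n / total)
                for n in Counter(block).values())

def entropy_profile(path, block_size=1 << 20):
    """Yield (offset, entropy) for each block of the file at path."""
    with open(path, "rb") as f:
        offset = 0
        while block := f.read(block_size):
            yield offset, byte_entropy(block)
            offset += len(block)
```

    Order-0 entropy is a crude measure, so two blocks with the same value can still compress very differently; it mainly helps to spot the regions where no compressor will gain much.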

  2. #2
    Member
    Join Date
    Jun 2008
    Location
    G
    Posts
    372
    Thanks
    26
    Thanked 22 Times in 15 Posts
    Maybe an ISO format with built-in LZMA (or some other) compression is the solution to your problem.

  3. #3
    Member
    Join Date
    Jan 2008
    Posts
    15
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Not sure whether I understood that correctly... Does such a format exist?

  4. #4
    Member
    Join Date
    Jun 2008
    Location
    G
    Posts
    372
    Thanks
    26
    Thanked 22 Times in 15 Posts
    Yeah, PowerISO has built-in compression in its own ISO format.
    But it isn't really good; I think it's a zlib / deflate algorithm.

  5. #5
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,255
    Thanks
    306
    Thanked 779 Times in 486 Posts
    I think there is a program called ECM that preprocesses ISO files to make them more compressible. It removes the error correcting information that looks like random data to most compressors. I haven't tried it, but it looks like it helps a lot on the CD image benchmark http://squeezechart.freehost.ag/main.html

  6. #6
    Member
    Join Date
    May 2008
    Location
    Kuwait
    Posts
    335
    Thanks
    36
    Thanked 36 Times in 21 Posts
    ECM removes the error-correcting data from raw images like BIN, but not from ISO, where the overhead is rather small (I tested it myself). I made a BIN file and an ISO file from the same CD. When I ran ECM on the BIN file, it shrank to about the size of the ISO. That's why I skip ECM for normal images, which are already close to ISO size. I know NRG is one of them, but I don't know about others.
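
    The size relation can be checked with a little arithmetic. The sketch below assumes a plain CD-ROM Mode 1 image, where each raw sector is 2352 bytes (sync, header, 2048 bytes of user data, EDC and ECC), while an ISO holds only the 2048 user bytes. It only computes sizes; it is not what ECM itself does internally, and the function names are made up.

```python
RAW_SECTOR = 2352   # bytes per raw Mode 1 sector in a BIN image
USER_DATA = 2048    # bytes of user data per sector, as stored in an ISO

def iso_size_from_raw(raw_size):
    """Size of the user data contained in a raw image of raw_size bytes."""
    if raw_size % RAW_SECTOR:
        raise ValueError("raw image size is not a multiple of 2352 bytes")
    return raw_size // RAW_SECTOR * USER_DATA

def raw_overhead_fraction():
    """Fraction of a raw sector that is not user data."""
    return 1 - USER_DATA / RAW_SECTOR
```

    So roughly 13% of a raw Mode 1 BIN is sync, header and error-correction data, which is consistent with a BIN shrinking to about ISO size once that part is removed.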

  7. #7
    Member Vacon's Avatar
    Join Date
    May 2008
    Location
    Germany
    Posts
    523
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Hello everyone,

    ecm was once used to improve 7-zip's compression for (edit 2 ->) CD-images.
    There has been a plug-in for 7-zip, but it's quite a long time ago (early 4.xx or late 3.xx versions... I'll search and give link)
    Edit: here is the link to discussions-thread: http://sourceforge.net/forum/message.php?msg_id=3435505
    Sometimes it gave a nice gain
    Edit 3: here is the homepage for most recent version: http://ajax16384.narod.ru./

    Best regards!

  8. #8
    Member
    Join Date
    Jan 2008
    Posts
    15
    Thanks
    0
    Thanked 0 Times in 0 Posts
    Thanks for the feedback so far!

    Quote Originally Posted by thometal
    Yeah, PowerISO has built-in compression in its own ISO format.
    But it isn't really good; I think it's a zlib / deflate algorithm.
    Ok, got it. But this option would be much worse than 7zip LZMA.

    Quote Originally Posted by maadjordan
    ECM removes the error-correcting data from raw images like BIN, but not from ISO, where the overhead is rather small.
    Yes, that's true. For ISOs the gain is negligible (in my tests less than 100KB). Thanks anyhow for pointing me to the 7zip ECM plugin, Vacon. This may work well on BIN files.

    Does anybody know how to work with the files you get when using the -t1 -l switches in durilca02? It segments a file into different chunks file000, file001, ... but what's the reasoning behind that? I guess it organises the bigfile into smaller files of the same "media type", but how do I know which one represents which type? And how can I merge them back into the original file?

    Any experience you can share?

  9. #9
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,426
    Thanks
    223
    Thanked 1,053 Times in 565 Posts
    > Does anybody of you know how to work with files you get when using the -t1 -l
    > switches in durilca02?

    You're not supposed to mess with that, as -l is a hidden debug option.
    The only actual use I had for it was bruteforcing the options strings
    for Bermans' comparison. Eg. this one is from executable test:
    -t1(9,(1-4(1)),(2-3),(3-3),(7-4(1)),(13-2(5)), (15-4(1)),(16-4(1)),(17-4(1)),(20-2(10)),(23-2(31)), (28-2(9)),(31-4(1)),(33-4(1)),(34-4(1)))
    I mean that splitting the segments into separate files lets you
    try all the possible methods on each segment separately.

    > It segments a file in different chunks file000, file001,
    > ... but what's the reasoning behind that?

    AFAIR, similarity of order-1 statistics, or something like that.
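
    To illustrate what segmentation by order-1 statistics could look like, here is a minimal sketch. It is not the algorithm used by durilca or seg_file: the block size, the L1 distance and the threshold are assumptions picked for illustration, as are the function names.

```python
from collections import Counter

def order1_counts(block):
    """Count (previous byte, current byte) pairs, i.e. order-1 statistics."""
    return Counter(zip(block, block[1:]))

def order1_distance(a, b):
    """L1 distance between the pair distributions of two blocks (0.0 to 2.0)."""
    ca, cb = order1_counts(a), order1_counts(b)
    na, nb = sum(ca.values()) or 1, sum(cb.values()) or 1
    return sum(abs(ca[k] / na - cb[k] / nb) for k in ca.keys() | cb.keys())

def find_boundaries(data, block_size=4096, threshold=1.0):
    """Return offsets where adjacent blocks differ by more than threshold."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    return [i * block_size for i in range(1, len(blocks))
            if order1_distance(blocks[i - 1], blocks[i]) > threshold]
```

    A real segmenter would refine the boundary positions and merge similar non-adjacent segments; this version only flags block edges where the statistics change sharply.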

    > I guess it is a kind of organisation
    > of the bigfile in smaller files of the same "media type" but how do I know which
    > one represent which type.

    Basically, by looking at it.

    > And how can I merge them back to the original file?

    There's no sense in doing that.

    > Any experience you can share?

    http://www.compression.ru/ds/seg_file.rar
    http://www.compression.ru/ds/disasm32.rar

  10. #10
    Member
    Join Date
    Jan 2007
    Location
    Moscow
    Posts
    239
    Thanks
    0
    Thanked 3 Times in 1 Post
    Quote Originally Posted by MrC
    but how do I know which one represent which type
    You shouldn't need to know its type. Your compressor will do the job.

  11. #11
    Member
    Join Date
    Jan 2008
    Posts
    15
    Thanks
    0
    Thanked 0 Times in 0 Posts
    First of all, thanks Shelwien for the explanation. I am not an expert in compression algorithms, but I will try my best...
    Quote Originally Posted by Shelwien
    > And how can I merge them back to the original file?

    There's no sense in doing that.
    Quote Originally Posted by nimdamsk
    You shouldn't need to know its type. Your compressor will do the job.
    At least for me there was a sense as my (stupid?) plan was to use different compressors on the segmented files. In order to rebuild the original file I must know how to do it based on the segmented files durilca has produced.

    Aside from this, I did some more testing with my ISOs. Another observation is that when I extract an ISO (especially one that contains a lot of files) and then compress the extracted files with 7zip, I usually get better results (~5-10% better ratio). This might be because 7zip can then sort the files by their extensions prior to compression. However, I don't want to extract the ISO.
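
    The sorting effect can be reproduced outside of 7zip. Below is a minimal sketch that orders an extracted directory by extension and writes the files in that order into an uncompressed tar, which can then be fed to any compressor. Function names are made up for illustration, and the tar should be written outside the extracted directory.

```python
import os
import tarfile

def files_sorted_by_extension(root):
    """Return relative file paths under root, ordered by extension, then path."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            paths.append(os.path.relpath(os.path.join(dirpath, name), root))
    return sorted(paths,
                  key=lambda p: (os.path.splitext(p)[1].lower(), p.lower()))

def write_sorted_tar(root, tar_path):
    """Store the files under root in a plain tar, grouped by extension."""
    with tarfile.open(tar_path, "w") as tar:
        for rel in files_sorted_by_extension(root):
            tar.add(os.path.join(root, rel), arcname=rel)
```

    Grouping similar files next to each other helps a solid compressor because matches for the current file are more likely to be found in the data just before it.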

    Is it possible to sort the files within the ISO? Or - even better - can I get 7zip to do the file sorting within the ISO prior to compression? Any plugins/tools available you are aware of?

  12. #12
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,426
    Thanks
    223
    Thanked 1,053 Times in 565 Posts
    > At least for me there was a sense as my (stupid?) plan was to use different
    > compressors on the segmented files. In order to rebuild the original file I must
    > know how to do it based on the segmented files durilca has produced.

    That's ok.
    Understanding that, I gave you a link to seg_file (with source) which is used
    in durilca for file segmentation.

    Of course, there's another way: durilca -t1 -l stores the location/size info
    in the .000 file, so you can use it to reconstruct the file.
    But that index (.000) uses some variable-length encoding scheme, which has
    to be reversed first. So I think that it's less bothersome to simply make
    something for yourself using the seg_file source.
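
    As a rough idea of what such a self-made tool needs to do, here is a minimal sketch that cuts a file at given offsets, records the location/size of each part in a plain JSON index, and concatenates the parts back. It does not read or write durilca's .000 format; the index layout, file names and function names are assumptions for illustration.

```python
import json
import os

def split_file(path, boundaries, out_dir):
    """Cut the file at the given sorted offsets and write an index of the parts."""
    os.makedirs(out_dir, exist_ok=True)
    size = os.path.getsize(path)
    edges = [0] + [b for b in boundaries if 0 < b < size] + [size]
    index = []
    with open(path, "rb") as src:
        for n, (start, end) in enumerate(zip(edges, edges[1:])):
            name = f"seg{n:03d}.bin"
            with open(os.path.join(out_dir, name), "wb") as dst:
                dst.write(src.read(end - start))
            index.append({"name": name, "offset": start, "size": end - start})
    with open(os.path.join(out_dir, "index.json"), "w") as f:
        json.dump(index, f)
    return index

def merge_segments(out_dir, dst_path):
    """Rebuild the original file by concatenating segments in offset order."""
    with open(os.path.join(out_dir, "index.json")) as f:
        index = json.load(f)
    with open(dst_path, "wb") as dst:
        for entry in sorted(index, key=lambda e: e["offset"]):
            with open(os.path.join(out_dir, entry["name"]), "rb") as seg:
                dst.write(seg.read())
```

    Each segment can then be packed with whichever compressor suits it, as long as the index travels with the archive. Segments are read into memory whole here, which is fine for a sketch but not for very large parts.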

    > Is it possible to sort the files within the ISO? Or - even better - can I get
    > 7zip to do the file sorting within the ISO prior to compression? Any
    > plugins/tools available you are aware of?

    It should be possible (and simple enough) to "unpack" the .iso keeping the
    additional information for lossless reconstruction. But I don't know of any
    freely available tools for that, though some GUI archivers might include
    something of that kind.
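
    A minimal sketch of that idea, under the assumption that the file extents (name, offset, size) inside the image are already known; parsing the ISO directory structure to obtain them is not shown. The files are pulled out, everything else is kept as a residual, and the image is rebuilt byte for byte from both. Function names are made up, and the image is handled in memory for brevity.

```python
def unpack_image(image, extents):
    """Split image bytes into named file payloads plus the leftover bytes.

    extents is a list of non-overlapping (name, offset, size) tuples.
    """
    files, residual, pos = {}, bytearray(), 0
    for name, offset, size in sorted(extents, key=lambda e: e[1]):
        residual += image[pos:offset]          # bytes between files
        files[name] = image[offset:offset + size]
        pos = offset + size
    residual += image[pos:]                    # bytes after the last file
    return files, bytes(residual)

def repack_image(files, residual, extents):
    """Rebuild the original image from the payloads, residual and extents."""
    out, rpos, pos = bytearray(), 0, 0
    for name, offset, size in sorted(extents, key=lambda e: e[1]):
        gap = offset - pos
        out += residual[rpos:rpos + gap]
        rpos += gap
        out += files[name]
        pos = offset + size
    out += residual[rpos:]
    return bytes(out)
```

    The extracted files can then be sorted and compressed freely, while the residual and the extent list are the additional information that makes the reconstruction lossless.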
