Thread: Method to properly compress using UltraArc by Razor

  1. #1
    Member NoeticRon
    Join Date
    Jan 2021
    Location
    Nowhere
    Posts
    4
    Thanks
    2
    Thanked 0 Times in 0 Posts

    [Attached image: Screenshot (46).png]

    I have a ~91 GB .bin file from a game, and scanning it for zlib streams showed a compression ratio of 63%.

    I have no coding knowledge, so I would like to know which Mask settings I should use to compress this file with UltraArc.

  2. #2
    Programmer schnaader's Avatar
    Join Date
    May 2008
    Location
    Hessen, Germany
    Posts
    630
    Thanks
    288
    Thanked 252 Times in 128 Posts
    Be careful when interpreting the numbers here. The compression ratio is 63%, but it applies only to the zlib-compressed parts of the file that the scan found (surprisingly few: only 15 MB of 95 GB, so only around 0.02% of the file). Even worse, these are most likely false positives, so UltraArc might not improve compression at all compared to other compressors like 7-Zip.

    The main advantage of recompression tools like UltraArc is that these already compressed parts (compressed "streams") can be recompressed better than with the original algorithm (zlib in this case). As a rule of thumb for zlib, the gain is about 50% of the original compressed size (more or less, it depends heavily on the data), so this would reduce your file by only an estimated additional 7.5 MB, which is not significant even if the rest of the 95 GB file compressed really well.
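
    As a rough sketch of the arithmetic above (the function name and the fixed 50% gain are illustrative assumptions, not something UltraArc reports), the numbers from the first scan can be plugged in like this:

```python
def estimated_saving_mb(stream_mb: float, gain: float = 0.5) -> float:
    """Estimate the extra saving from recompressing already-compressed streams.

    `gain` is the rule-of-thumb fraction of the compressed stream size that
    recompression saves; real values depend heavily on the data.
    """
    return stream_mb * gain


file_mb = 95_000   # ~95 GB file, expressed in decimal MB
stream_mb = 15     # zlib streams found by the first scan

# Share of the file that the detected streams cover, in percent
share = stream_mb / file_mb * 100
print(f"streams cover {share:.3f}% of the file")
print(f"estimated extra saving: {estimated_saving_mb(stream_mb):.1f} MB")
```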

    Anyway, you can still do two things:
    - Split the file into parts to speed up analysis. 95 GB is quite big, and even at the 138 MB/s scanning speed, a full scan takes a lot of time. Most likely, UltraArc doesn't depend on an "intact" file, so it can handle e.g. 100 MB pieces. This way you can try other settings more quickly and concentrate on the parts where many streams are detected.
    - Check other settings. "Headerless" in particular might lead to more zlib streams being detected and improve the results. I'm not sure about "Force detection", but this would be another option to try enabling.
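
    One possible way to do the splitting, as a minimal Python sketch: the function name, the output naming scheme and the 100 MB default are assumptions made for illustration, and any file splitter would do the same job. Note that a stream straddling a piece boundary is cut in two, so detection near the edges of a piece may differ from a scan of the whole file.

```python
from pathlib import Path


def split_file(src: str, out_dir: str, piece_size: int = 100 * 1000 * 1000,
               buf_size: int = 1 << 20) -> list:
    """Copy `src` into consecutive pieces of at most `piece_size` bytes."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    pieces = []
    index = 0
    with open(src, "rb") as fin:
        while True:
            first = fin.read(min(buf_size, piece_size))
            if not first:
                break  # end of input reached, so no empty piece is written
            piece = out / f"{Path(src).name}.{index:04d}"
            with open(piece, "wb") as fout:
                fout.write(first)
                remaining = piece_size - len(first)
                # Copy in small buffers so a 100 MB piece is never held in memory
                while remaining > 0:
                    chunk = fin.read(min(buf_size, remaining))
                    if not chunk:
                        break
                    fout.write(chunk)
                    remaining -= len(chunk)
            pieces.append(piece)
            index += 1
    return pieces


# Example call (hypothetical file name):
# split_file("game.bin", "pieces")
```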

    Also note that game data is often encrypted or stored in other formats (Oodle, video codecs, Ogg, compressed textures, ...). In these cases, none of the existing "easy" recompression solutions like UltraArc work (i.e., they give no benefit over usual compression methods like 7-Zip).

    For reference: r/compression thread. I saw your post there, but I hadn't got around to answering yet and thought an answer here might help as well. Also, someone there already suggested 7-zip, lzma, precomp, zpaq and cmix, so I focused on the UltraArc part here.

    I can confirm the precomp suggestion (disclaimer: I'm the author); it handles many more of the stream types found in game binaries (PNG, JPG, MP3, ...). It has no (official) GUI though, so you have to call it from the command line. The same recommendation applies here: try some smaller parts of the file first to get a feeling for whether it's worth processing the full 95 GB with it. Also, I'd suggest the following command line calls (for a file called game.dat) to get a feeling for the effectiveness:
    - "precomp -t+ game.dat" - compresses the file using lzma2 only - this is almost identical to using 7-Zip to compress the file and should be considered the "baseline"
    - "precomp -intense game.dat" - compresses the file using recompression - "intense" should be similar to what "Headerless" does for UltraArc.
    - optional: "precomp -cn -intense game.dat" - would be similar to what UltraArc seems to do. Gives a larger file with processed streams that can be compressed (potentially) better afterwards. Useful for compressing with something different than lzma2.
    - afterwards: "precomp -r game.pcf" - restores "game.dat"
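
    The calls above are typed into a command prompt. For anyone who prefers to script them, here is a minimal Python sketch; it assumes a precomp executable is reachable on the PATH, the mode names are made up for this example, and the switches are copied from the list above without any further knowledge of precomp's options.

```python
import shutil
import subprocess

# Switches copied from the suggestions above; nothing else about precomp is assumed.
MODES = {
    "baseline": ["-t+"],                 # lzma2 only, the "baseline"
    "intense": ["-intense"],             # recompression
    "intense_raw": ["-cn", "-intense"],  # processed streams, meant for another compressor
    "restore": ["-r"],                   # restores the original from the .pcf file
}


def build_command(mode: str, path: str, exe: str = "precomp") -> list:
    """Return the argument list for one of the suggested calls."""
    return [exe, *MODES[mode], path]


def run_mode(mode: str, path: str, exe: str = "precomp"):
    """Run one call if the executable is on PATH, otherwise only print it."""
    cmd = build_command(mode, path, exe)
    if shutil.which(exe) is None:
        print("not found, would run:", " ".join(cmd))
        return None
    return subprocess.run(cmd, check=False).returncode


print(" ".join(build_command("intense", "game.dat")))
```

    Calling run_mode("baseline", "sample.dat") and run_mode("intense", "sample.dat") on a small test piece, and comparing the resulting file sizes, follows the comparison described above.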

    If precomp detects many streams and the "-intense" result is significantly smaller than the "-t+" result on many of the 100 MB test samples, you can add SREP to the mix for improved content deduplication and run everything on the big 95 GB file.
    Last edited by schnaader; 7th January 2021 at 17:59.
    http://schnaader.info
    Damn kids. They're all alike.

  3. Thanks (2):

    Bulat Ziganshin (12th January 2021),NoeticRon (8th January 2021)

  4. #3
    Member NoeticRon
    Quote (originally posted by schnaader):
        Anyway, you can still do two things:
        - Split the file into parts to speed up analysis. 95 GB is quite big, and even at the 138 MB/s scanning speed, a full scan takes a lot of time. Most likely, UltraArc doesn't depend on an "intact" file, so it can handle e.g. 100 MB pieces. This way you can try other settings more quickly and concentrate on the parts where many streams are detected.
        - Check other settings. "Headerless" in particular might lead to more zlib streams being detected and improve the results. I'm not sure about "Force detection", but this would be another option to try enabling.

        Also note that game data is often encrypted or stored in other formats (Oodle, video codecs, Ogg, compressed textures, ...). In these cases, none of the existing "easy" recompression solutions like UltraArc work (i.e., they give no benefit over usual compression methods like 7-Zip).

    Following up on your doubts about the zlib scan, I used a better scanner that looks for several stream types that may have been used to compress the game. Here are the results:

    zlib (discrete levels) :- https://ibb.co/2SQmw36
    LZO :- https://ibb.co/3RzKVsF
    ZSTD :- https://ibb.co/SXgf0h0
    Crilayla :- https://ibb.co/YbDTDN2
    WAV :- https://ibb.co/vP8Wyw6
    BINK :- https://ibb.co/pzhqF8N
    VP6 :- https://ibb.co/x3CNrvFv
    LZ4 :- Stuck at 1.70GB detection

    I believe the zlib detection was much better here (but I'm not sure, since I don't know much about all this).

    Also, the pre-installed game folder looks like this :- https://ibb.co/jL2Bgw4
    Should I install the game and compress the raw data obtained after installation, or can UltraArc handle the badly compressed 91 GB .bin file on its own?

    I was planning to use the following FreeArc settings under UltraArc, but I was not sure about the correct Mask settings, since I do not have a good knowledge of command lines. Could you please have a look at this attached image as well :- https://ibb.co/C1gGmsf

    Quote (originally posted by schnaader):
        Also, I'd suggest the following command line calls (for a file called game.dat) to get a feeling for the effectiveness:
        - "precomp -t+ game.dat" - compresses the file using lzma2 only - this is almost identical to using 7-Zip to compress the file and should be considered the "baseline"
        - "precomp -intense game.dat" - compresses the file using recompression - "intense" should be similar to what "Headerless" does for UltraArc.
        - optional: "precomp -cn -intense game.dat" - would be similar to what UltraArc seems to do. Gives a larger file with processed streams that can be compressed (potentially) better afterwards. Useful for compressing with something different than lzma2.
        - afterwards: "precomp -r game.pcf" - restores "game.dat"

    To be perfectly honest, I couldn't understand where to enter these command lines, so I hope you forgive me for not having basic knowledge of this compression stuff.

    Thank you for your immense help thus far.

  5. #4
    Programmer schnaader's Avatar
    Quote (originally posted by NoeticRon):
        Following up on your doubts about the zlib scan, I used a better scanner that looks for several stream types that may have been used to compress the game. Here are the results:

        zlib (discrete levels) :- https://ibb.co/2SQmw36

    That's much better indeed: 75.3 GB of compressed streams decompressed to 145 GB, so the 50% rule of thumb would give (95 - 75.3) + 75.3 * 0.5 = 57.35 GB.
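
    The same rule-of-thumb expression as a small Python sketch (the function name and the default 50% gain are illustrative assumptions); with the scan figures quoted here it evaluates to roughly 57.35 GB:

```python
def estimate_recompressed_gb(total_gb: float, stream_gb: float,
                             gain: float = 0.5) -> float:
    """Rule-of-thumb size after recompressing the detected streams.

    The part of the file outside the streams is counted at full size, and the
    streams shrink by `gain` (a rough guess that depends heavily on the data).
    """
    untouched_gb = total_gb - stream_gb
    return untouched_gb + stream_gb * (1 - gain)


estimate = estimate_recompressed_gb(total_gb=95, stream_gb=75.3)
print(f"estimated size: {estimate:.2f} GB")
```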

    Suggestion from here: check whether the tool you used in the first post gives similar numbers with the "Headerless" option. Processing the installed data could be slightly beneficial, but UltraArc should be able to work with the setup .bin as well, because there clearly are zlib streams inside. After that, I'd advise trying different settings yourself with the tools you found, preferably on other, smaller files, to get a feeling for which settings are useful and how they influence speed and compression ratio. Other people can give you hints, but it's much more effective in the long run to try things yourself and learn.

    Compression tends to give minor ratio improvements at the cost of huge drops in speed, e.g. a slow zpaq variant might give a 10% better ratio, but might also take 10 hours instead of 1 hour to decompress. Memory, disk speed and multi-threading also influence the results. So there's no "optimal way" to compress data; it all depends on your use case, the speed/size target you aim for and the hardware you use.
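
    A small way to see this trade-off for yourself, sketched with Python's standard zlib and lzma modules instead of the tools discussed in this thread. The sample data is synthetic and made up for this example, so the printed ratios and timings are only illustrative and will differ on real game data and on other hardware.

```python
import lzma
import time
import zlib


def measure(name, compress, data):
    """Return (name, compressed/original size ratio, seconds taken)."""
    start = time.perf_counter()
    packed = compress(data)
    elapsed = time.perf_counter() - start
    return name, len(packed) / len(data), elapsed


# Synthetic, mildly repetitive sample data (hypothetical content)
sample = b"".join(
    b"car_%05d;track_%03d;lap_time=%d\n" % (i, i % 97, i * 7919 % 100000)
    for i in range(50_000)
)

candidates = [
    ("zlib level 1", lambda d: zlib.compress(d, 1)),
    ("zlib level 9", lambda d: zlib.compress(d, 9)),
    ("lzma preset 9", lambda d: lzma.compress(d, preset=9)),
]

results = [measure(name, func, sample) for name, func in candidates]
for name, ratio, seconds in results:
    print(f"{name:14s} ratio={ratio:.3f} time={seconds:.3f}s")
```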

    Further research using the icon from your screenshot shows that the game is Forza Horizon 4 Standard Edition. There is a lossless repack of the Ultimate Edition available with a size of 40-50 GB, depending on the installed (DLC?) content. Obviously, I didn't validate that statement, as I don't own the game and don't want to support piracy, but it is roughly in line with the rule of thumb calculation above, so this is the compressed size to expect when using zlib recompression.

    [Attached image: forza_horizon_4_standard_edition.png]

  6. Thanks:

    NoeticRon (8th January 2021)

  7. #5
    Member
    Join Date
    Aug 2020
    Location
    taiwan
    Posts
    26
    Thanks
    21
    Thanked 1 Time in 1 Post
    @schnaader

    I don't recommend using srep393a, because this version has a memory leak bug.
    I recommend using srep392 or srep393 instead.

    Last edited by Lithium Flower; 9th January 2021 at 12:40.

  8. #6
    Member NoeticRon
    Quote (originally posted by schnaader):
        Suggestion from here: check whether the tool you used in the first post gives similar numbers with the "Headerless" option.

    I have to hand it to you, your know-how on this is astounding. Trying the "Headerless" option did give me more accurate results :- https://ibb.co/377psfv (75.1 GB decompressed to 145 GB)

    Quote (originally posted by schnaader):
        I'd advise trying different settings yourself with the tools you found, preferably on other, smaller files, to get a feeling for which settings are useful and how they influence speed and compression ratio.

    I took your advice and fiddled around with the compression algos under FreeArc. I took Need for Speed: The Run, with a smaller size of 15.3 GB, and obtained the following results :-

    1. First compression was done by using Precomp, Srep, LOLZ and MMC giving a final size of 4.23 GB (72.35% decrease).
    2. Second compression was done by using Precomp, Srep and LOLZ giving a final size of 5.61 GB (63.3% decrease).
    3. Third compression is being carried out by using Precomp, Srep, LOLZ, MMC and MSC.
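
    For reference, the percentages above follow from the sizes like this (a small sketch; the helper name is made up for this example):

```python
def percent_decrease(original_gb: float, compressed_gb: float) -> float:
    """Size reduction relative to the original, in percent."""
    return (1 - compressed_gb / original_gb) * 100


original_gb = 15.3
runs = [
    ("Precomp + Srep + LOLZ + MMC", 4.23),
    ("Precomp + Srep + LOLZ", 5.61),
]
for label, size_gb in runs:
    print(f"{label}: {percent_decrease(original_gb, size_gb):.2f}% smaller")
```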

    I have a question regarding Precomp: what do the different versions P.ZT and P.XT mean?

    Also, I couldn't find any proper information about the different algos, but from what I figured, LOLZ, ZSTD and LZMA2 are final compressors, and LOLZ trumps the lot with faster decompression. But what are MMC and MSC about?

    Quote (originally posted by schnaader):
        Further research using the icon from your screenshot shows that the game is Forza Horizon 4 Standard Edition.

    Hands down, you have great observation skills. That is indeed Forza Horizon 4, which I got for free during the Xbox Game Pass promotional offer.

    Sorry for the barrage of questions. I truly appreciate you taking time out of your schedule to reply to these threads.

  9. #7
    Programmer schnaader's Avatar
    Quote (originally posted by NoeticRon):
        I have a question regarding Precomp: what do the different versions P.ZT and P.XT mean?

    From the abbreviations, I'd guess these are the variants used in ZTool and XTool from the UltraArc author. Since XTool is the successor of ZTool, P.XT most likely gives better results. Note that "precomp" here refers to the zlib/zstd/lz4 "precompression" methods from UltraArc. These share the name and the main idea with my tool precomp, but the implementation is different, so I don't know many details about them and can only guess. For zlib, recompression works well, as I've done it myself, but I don't know anything about the zstd and lz4 recompression variants.

    Quote (originally posted by NoeticRon):
        But what are MMC and MSC about?

    I couldn't find any details about MMC in UltraArc either. My guess would be that it's either related to MMC (Morphing Match Chain) or an abbreviation for "MultiMedia Content".

    MSC seems to be a media stream compressor specialized in some game resource formats (.dds/.dxt = compressed textures, images, .wav/.mp3 = sounds/music).

    Quote (originally posted by NoeticRon):
        Hands down, you have great observation skills. That is indeed Forza Horizon 4, which I got for free during the Xbox Game Pass promotional offer.

    Thanks, and welcome to the club of curious "game compressors" ("nice game, but I wonder how small I can get it?"). I do this myself frequently; it's very interesting to analyse game data to see the different strategies and file formats used.

  10. #8
    Member NoeticRon
    Quote (originally posted by schnaader):
        "nice game, but I wonder how small I can get it?"

    ^^It really was a factor in my decision to compress it, along with the fact that I was running out of hoarding space. I have learnt a lot about the algos from you and the fileforums site... thanks a lot for being patient with a newbie in the scene. I look forward to our paths crossing again on some compression forum in the future.

    Stay safe and Godspeed!
    Last edited by NoeticRon; 9th January 2021 at 23:25. Reason: Forgot tag
