
Thread: Compressed Archive: Identifying compression method and Decompressing?

  1. #1
    Member UnknownToaster's Avatar
    Join Date
    Sep 2014
    Location
    Chicago
    Posts
    12
    Thanks
    2
    Thanked 0 Times in 0 Posts

    Compressed Archive: Identifying compression method and Decompressing?

    Hey everyone!
    I have this file here, I was hoping that someone much more knowledgeable in this stuff might be able to help me out.
    https://dl.dropboxusercontent.com/u/...tart0.kiloPack

    It's a rather small file. Luckily, someone was incredibly nice and wrote an extractor for this type of file, but I still need to get around the compression before I can do anything with it.
    It's a file archive for the game The Saboteur, in the same format that's used for the major data archives. If I can get through the compression with your help, it would do wonders for building a modding community for this game that I love dearly.

    The main thing I was looking for was hopefully someone might be able to identify the compression method, which would be a massive help to me! Further if there's an easy/known way of decompressing the method that would be great too.


    I was also hoping that, if someone has some extra time, they could help turn this into a bit of a learning experience. What would be the next steps for me if we found out that this was not a common compression method but an unidentified one with no documentation? Would it still be possible or reasonable to try to open it? And how common is this type of thing with modern game archives?


    Anyhow, Thank you for reading this!
    I sincerely appreciate any help I can get.
    Thank you,
    Dan

  2. #2
    Member
    Join Date
    Aug 2014
    Location
    Argentina
    Posts
    536
    Thanks
    238
    Thanked 90 Times in 70 Posts
    Hi, and welcome!
    Possible? Of course. Reasonable? It depends on how much time you want to spend, and what your skills are. Google this: reverse engineering.
    It is recommended to start by understanding what compression really does and how known algorithms and implementations work, and finally to write your own.
    Here's something really good: http://mattmahoney.net/dc/dce.html (Data Compression Explained).

    I'll take a look at the sample later.

  3. Thanks:

    UnknownToaster (12th September 2014)

  4. #3
    Member
    Join Date
    Aug 2014
    Location
    Argentina
    Posts
    536
    Thanks
    238
    Thanked 90 Times in 70 Posts
    Plus: many "game archives" are renamed zip files, or custom file types using deflate, lzma, or just plain copy (stored data).
    So you could start by trying to open it with 7-zip. Then there's precomp, reflate, lzmarec, lzmadump, etc.

  5. #4
    Member UnknownToaster's Avatar
    Join Date
    Sep 2014
    Location
    Chicago
    Posts
    12
    Thanks
    2
    Thanked 0 Times in 0 Posts
    Hey! Thanks for reading and helping out.
    I've taken look at that link and I definitely appreciate that. Obviously it's a lot of information so I'm taking it in bit by bit.
    I'm definitely glad I have a nice resource to read from, especially all in one place and I appreciate that!

    As for whether it's a zip, at this point I feel fairly satisfied that it's not an exact carbon copy of a zip file. I've tried lots of programs to open it, and have had a lot of people take a look at it to see if they could figure it out.
    At one point someone told me it might be zlib, another told me it was definitely not zlib. Regardless I've also tried all the zlib decompressors I could get my hands on with no luck. I've tried at a few other communities to see if someone could identify it without much luck.

    Whether it's a very simple format with some basic compression, that's still a possibility, but I don't think it's just a zip file. Have you seen anything in the file that might allude to it being such?
    I've tried looking at the other programs you've mentioned, precomp to start with, but I've been having a hard time finding 'reflate' and was going to soon go to the rest.


    I wanted to comment because I had something that MIGHT be of some sort of assistance?
    So I've got the source code that Will Kirkby wrote to extract the format and I was hoping that if by chance it looked like it was extracting a zip file that might be of use, or you might be able to identify something a bit more complicated?
    Either way, I hope this might be useful:
    https://dl.dropboxusercontent.com/u/...urExtractor.cs

    I'm going to continue on looking for those programs and see if I can get a success, if I do I'll come right back and edit the thread to let you know.
    Again, I appreciate the assistance and any help I can get is a big favor.
    Thank you!

  6. #5
    Member biject.bwts's Avatar
    Join Date
    Jun 2008
    Location
    texas
    Posts
    449
    Thanks
    23
    Thanked 14 Times in 10 Posts


    You had better hope it was not made with one of my compressors, since most of them would uncompress it, and the resulting file, when compressed back, would give the starting file. That round trip is what usually happens if you find the correct method. However, that in itself is not a guarantee, so good luck. Also, some compressors embed things such as time stamps, so when you uncompress and compress back, the result might not match the starting file, let alone if an encryption key is required.
    However, most decompressors will barf and exit when given a file that could not have been produced by their matching compressor!

  7. #6
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,257
    Thanks
    307
    Thanked 796 Times in 488 Posts
    Some tests with precomp -intense show some zlib streams.

    Code:
    C:\tmp>precomp -cn -intense -oStart0.kiloPack-cn-intense.pcf Start0.kiloPack
    
    Precomp v0.4.3 - ALPHA version - USE FOR TESTING ONLY
    Free for non-commercial use - Copyright 2006-2012 by Christian Schneider
    
    Input file: Start0.kiloPack
    Output file: Start0.kiloPack-cn-intense.pcf
    
    Using packjpg25.dll for JPG recompression.
    --> packJPG library v2.5a (12/12/2011) by Matthias Stirner / Se <--
    More about PackJPG here: http://www.elektronik.htw-aalen.de/packjpg
    
    100.00% - New size: 54757466 instead of 28739584
    
    Done.
    Time: 1 minute(s), 10 second(s)
    
    Recompressed streams: 1632/1647
    zLib streams (intense mode): 1632/1647
    
    You can speed up Precomp for THIS FILE with these parameters:
    -zl18,88,98 -d0
    No streams are found without -intense. Option -cn says don't compress after expanding. (Default is to compress with bzip2). To check that these were not false zlib detections I found that it does improve compression:
    Code:
    28,739,584 Start0.kiloPack
    25,432,637 Start0.kiloPack-m5.zpaq
    25,385,457 Start0.kiloPack-8.paq8pxd13
    54,757,466 Start0.kiloPack-cn-intense.pcf
    17,998,515 Start0.kiloPack-cn-intense.pcf-m5.zpaq
    There is a little compression without precomp due to a little bit of uncompressed data which you can see in the output of fv. Also, strings -12 reveals about a thousand readable text strings:

    Code:
    C:\tmp>strings -12 Start0.kiloPack |more
    ragdoll_b_lowres
     !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|
    PGA_DynDebris_Bird_Body
    PGA_DynDebris_Bird_Feather
    PGA_DynDebris_Bird_Wing
    CH_CF_Casual_LOD_01
    CH_CM_Casual_LOD_01
    PGAContraL_Cigarette
    GB_WP_Mackeral
    PGAContraL_Cigarette_Nofire
    PGAContraL_Zippo
    PGlobalA_WhiskeyFlask
    OccMed_Debris_Mtl_Tny_A
    OccMed_Debris_Mtl_Tny_B
    OccMed_Debris_Mtl_Tny_C
    OccMed_Debris_Mtl_Tny_D
    OccMed_Debris_Mtl_Tny_E
    OccMed_Debris_Mtl_Sml_A
    OccMed_Debris_Mtl_Sml_B
    OccMed_Debris_Mtl_Sml_C
    OccMed_Debris_Mtl_Sml_D
    OccMed_Debris_Mtl_Sml_E
    OccMed_Debris_Mtl_Med_A
    OccMed_Debris_Mtl_Med_B
    OccMed_Debris_Mtl_Med_C
    OccMed_Debris_Mtl_Med_D
    OccMed_Debris_Mtl_Med_E
    PGA_DynDebris_Wood_MAX_A
    PGA_DynDebris_Wood_MAX_B
    PGA_DynDebris_Wood_MAX_C
    PGA_DynDebris_Wood_MAX_D
    PGA_DynDebris_Wood_MAX_E
    OccMed_Debris_Mtl_Lrg_A
    OccMed_Debris_Mtl_Lrg_B
    OccMed_Debris_Mtl_Lrg_C
    OccMed_Debris_Mtl_Lrg_D
    ...
    Attached thumbnail: fv.png (149.3 KB)

  8. Thanks:

    UnknownToaster (13th September 2014)

  9. #7
    Member
    Join Date
    Aug 2014
    Location
    Argentina
    Posts
    536
    Thanks
    238
    Thanked 90 Times in 70 Posts
    Well, if someone has already parsed it and written an unpacker, the best you can do at this point is probably to contact the actual author of the decompressor.
    BTW, it seems this guy has enormous experience writing unpackers. Just look at the number of different archive types his program can decompress.
    In the source, at first sight, there is code to deal with even exotic algorithms.

  10. #8
    Member
    Join Date
    Aug 2014
    Location
    Argentina
    Posts
    536
    Thanks
    238
    Thanked 90 Times in 70 Posts
    >>Matt says: Some tests with precomp -intense shows some zlib streams.
    ...
    You can speed up Precomp for THIS FILE with these parameters:
    -zl18,88,98 -d0

    -d0 also implies there are no PNG images or other file types recognizable by precomp inside those unpacked streams...

  11. #9
    Member UnknownToaster's Avatar
    Join Date
    Sep 2014
    Location
    Chicago
    Posts
    12
    Thanks
    2
    Thanked 0 Times in 0 Posts
    Matt Mahoney: Really useful information! Thank you so much for taking your time out to help me! It's really great information by itself, but even further it helped connect some ideas I had before with the new information. Thanks again, I really appreciate it.

    I was wondering what you might recommend for decompressing the files then, I've tried decompressing the extracted files, but all the tools I've tried using have had an output file of 0 kb. I don't quite understand, it's similar to my experience in the past. I used to just assume it wasn't zlib because I had no success..
    Any advice would be super helpful!

    Gonzalo: I had a little bit of a hard time understanding. I should contact the author of which decompressor? If we're talking about the person in charge of compressing the file in the first place, I've had no such luck. If we're talking about the decompressors I've tried in the past that haven't given me the right results, that is a good idea. If anyone would be able to figure it out, it should be them for sure.

    As for Aluigi, I have talked to him before, and he doesn't seem very interested in it, which is perfectly alright. He's given me some good advice which I appreciate. Though it is a good idea to see if he might be able to help more now that I have more information to give for sure.

  12. #10
    Member
    Join Date
    Aug 2014
    Location
    Argentina
    Posts
    536
    Thanks
    238
    Thanked 90 Times in 70 Posts
    Oh, I see I misunderstood you. The decompressors didn't work, so it's useless to waste time on them.
    Maybe our next try should be jumping into Aluigi's source code to see what's inside.

  13. #11
    Expert
    Matt Mahoney's Avatar
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,257
    Thanks
    307
    Thanked 796 Times in 488 Posts
    I tried the obvious step of renaming it with a .zip extension and trying unzip but that doesn't work. You might need to write some code to find the start of the zlib streams and use zlib to decompress.
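
    A minimal Python sketch of that approach, assuming the streams carry the standard two-byte zlib header. The function name `find_zlib_streams` and the `min_output` threshold are made up for illustration, and a brute-force scan like this can still report false positives:

```python
import zlib

def find_zlib_streams(data, min_output=32):
    """Yield (offset, compressed_length, decompressed_bytes) for each
    complete zlib stream found by brute-force scanning."""
    view = memoryview(data)  # slicing a memoryview avoids copying the tail
    n = len(data)
    pos = 0
    while pos < n - 1:
        # Candidate header: 0x78 (deflate, 32K window) followed by a flag
        # byte that makes the 16-bit header value a multiple of 31.
        if data[pos] == 0x78 and ((data[pos] << 8) | data[pos + 1]) % 31 == 0:
            d = zlib.decompressobj()
            try:
                out = d.decompress(view[pos:])
            except zlib.error:
                pos += 1
                continue
            # d.eof is True only if the stream ended cleanly (checksum included).
            if d.eof and len(out) >= min_output:
                used = n - pos - len(d.unused_data)
                yield pos, used, out
                pos += used
                continue
        pos += 1
```

    Each hit gives the offset where a stream starts, how many compressed bytes it spans, and the decompressed data, so the streams can be dumped to separate files for inspection.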

  14. #12
    Member
    Join Date
    Aug 2014
    Location
    Argentina
    Posts
    536
    Thanks
    238
    Thanked 90 Times in 70 Posts
    @Matt: Of course, it has neither the usual PK header nor any file metadata. But it does indeed use zlib to compress. The good news is that it's not encrypted. BTW... you're right: the simplest solution is usually the right one. Well, sorry, but not this time.

    2all: Tomorrow (Sunday where I am) I'll take a deeper look, trying to find some known headers or magic numbers inside the PCF. Maybe the Unix <file> command could shed some light on this mess. I tried Pontello's "trid", but it says it doesn't know anything about the file. No luck either with the <seg> tool or other similar tools.
    But, in some ridiculous way, I have the naive feeling that tomorrow we could figure out what's inside...


    Edit: I was forgetting reflate... In the end, as I recall, its output is an uncompressed shar archive with all streams decompressed, plus some diff data to losslessly reconstruct the original.
    Last edited by Gonzalo; 14th September 2014 at 04:39.

  15. #13
    Member
    Join Date
    Jun 2014
    Location
    Micama
    Posts
    6
    Thanks
    0
    Thanked 0 Times in 0 Posts
    I noticed blocks beginning with ALBS at these locations:

    00000800 ALBS 1
    01028800 ALBS 2
    01198800 ALBS 3
    011E0800 ALBS 4
    012EF800 ALBS 5
    01306800 ALBS 6
    01433000 ALBS 7
    014CD800 ALBS 8
    0178A800 ALBS 9
    0185A000 ALBS10
    01938000 ALBS11
    01A1B800 ALBS12
    01B68800 END OF FILE

    The first 0x800 bytes seem to have this layout:

    0000 dword maybe file magic "00PM"
    0004 dword num of ALBS blocks 0x0C =12

    0008 20 Bytes describing ALBS block #1
    001c 20 Bytes describing ALBS block #2
    ...
    00E4 20 Bytes describing ALBS block #12
    00F8..0157 0x60 bytes, look random
    0158..07ff padding 0xcb


    Each 20-byte block descriptor consists of:
    0000 qword 64bit could be a checksum, looks random
    0008 dword 32bit block length
    000c qword 64bit file offset where ALBS block begins
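
    The layout above can be sketched as a small Python parser. This assumes little-endian fields, which is a guess that fits the offsets listed; the names `parse_kilopack_header` and `BlockEntry` are made up for illustration, and the meaning of the first qword is still unknown:

```python
import struct
from collections import namedtuple

# One 20-byte descriptor: qword (maybe a checksum), dword length, qword offset.
BlockEntry = namedtuple("BlockEntry", "checksum length offset")

def parse_kilopack_header(data):
    """Parse the magic, block count, and block descriptors from the
    first 0x800 bytes of the archive (little-endian is an assumption)."""
    magic, count = struct.unpack_from("<4sI", data, 0)
    entries = []
    for i in range(count):
        # "<" disables alignment padding, so "<QIQ" is exactly 20 bytes.
        checksum, length, offset = struct.unpack_from("<QIQ", data, 8 + 20 * i)
        entries.append(BlockEntry(checksum, length, offset))
    return magic, entries
```

    With 12 entries the descriptors end at 0x00F8, which matches the layout listed above.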

  16. #14
    Member
    Join Date
    Aug 2014
    Location
    Argentina
    Posts
    536
    Thanks
    238
    Thanked 90 Times in 70 Posts
    Yes, and someone noticed it before us, and even made an 'extractor' that dumps those "ALBS" sections.

    Plus, each packet, as we can call them, seems to contain different files inside.

    Again, someone noticed it before. This time it is the famous Luigi Auriemma. His "quickbms" program, using the "saboteur_kilopack.bms" script, was able to unpack no fewer than 1325 files organized in 12 folders, exactly the number of packets.
    The point where I'm stuck is finding out whether they are a custom image format, or what...

    PS @batelnik: Does "Micama" mean what I think??? Remember, I'm a Latino... And it's raining at my home. So I think sooner or later I'm going to live in micama too.

  17. #15
    Member
    Join Date
    Aug 2014
    Location
    Argentina
    Posts
    536
    Thanks
    238
    Thanked 90 Times in 70 Posts
    Edit: I was not able to find high-entropy files, nor wave-like ones, so I guess there are no audio files.
    Actually, there are some tiny, poorly compressible files. Most of them have a four-byte header: <CFX.>, <43 46 58 08> in hex.

    Code:
    Microsoft Windows XP [Versión 5.1.2600]
    (C) Copyright 1985-2001 Microsoft Corp.
    
    D:\KiloPack\OUT>for %c in (0 1 2 3 4 5 6 7 8 9 10 11) do hcheck -quiet %c\*
    
    D:\KiloPack\OUT>hcheck -quiet 0\*
    
    --- hcheck v2.0 by Matthias Stirner ---
    
    
    --> final results ( 760 file( s ) ) ( 0. order model ) <--
    
    total bytes read         :  22709001 byte
    possible comp. size      :  17244128 byte ( 0. order )
    
    min decision content     :  2.000 bit ( 0\000001a7 )
    max decision content     :  8.000 bit ( 0\0000012a )
    minimum cond. entropy    :  0.861 bit ( 0\PowerLine_2 )
    maximum cond. entropy    :  7.800 bit ( 0\DET_Mud01_AB_636 )
    min possible comp. size  :  97.49 percent ( 0\DET_Mud01_AB_636 )
    max possible comp. size  :  10.76 percent ( 0\PowerLine_2 )
    average pos. comp. size  :  40.86 percent
    
    
    
    D:\KiloPack\OUT>hcheck -quiet 1\*
    
    --- hcheck v2.0 by Matthias Stirner ---
    
    
    --> final results ( 95 file( s ) ) ( 0. order model ) <--
    
    total bytes read         :  1474257 byte
    possible comp. size      :  1010626 byte ( 0. order )
    
    min decision content     :  2.000 bit ( 1\00000037 )
    max decision content     :  8.000 bit ( 1\CH_AC_Eyes_Brown_AB_62 )
    minimum cond. entropy    :  1.186 bit ( 1\00000037 )
    maximum cond. entropy    :  7.343 bit ( 1\CH_CM_High_Men_HD_64 )
    min possible comp. size  :  91.78 percent ( 1\CH_CM_High_Men_HD_64 )
    max possible comp. size  :  14.83 percent ( 1\00000037 )
    average pos. comp. size  :  51.45 percent
    
    
    
    D:\KiloPack\OUT>hcheck -quiet 2\*
    
    --- hcheck v2.0 by Matthias Stirner ---
    
    
    --> final results ( 7 file( s ) ) ( 0. order model ) <--
    
    total bytes read         :  1991644 byte
    possible comp. size      :  557213 byte ( 0. order )
    
    min decision content     :  7.098 bit ( 2\difficulty_pointer_AB_1 )
    max decision content     :  8.000 bit ( 2\00000000 )
    minimum cond. entropy    :  1.639 bit ( 2\symbol_faded_AB_5 )
    maximum cond. entropy    :  7.997 bit ( 2\00000000 )
    min possible comp. size  :  99.96 percent ( 2\00000000 )
    max possible comp. size  :  20.49 percent ( 2\symbol_faded_AB_5 )
    average pos. comp. size  :  46.31 percent
    
    
    
    D:\KiloPack\OUT>hcheck -quiet 3\*
    
    --- hcheck v2.0 by Matthias Stirner ---
    
    
    --> final results ( 27 file( s ) ) ( 0. order model ) <--
    
    total bytes read         :  2863744 byte
    possible comp. size      :  1523843 byte ( 0. order )
    
    min decision content     :  5.907 bit ( 3\decal_diamond_AB_7 )
    max decision content     :  8.000 bit ( 3\00000000 )
    minimum cond. entropy    :  1.417 bit ( 3\settings_decals_linesOnly_AB_19 )
    maximum cond. entropy    :  7.998 bit ( 3\00000000 )
    min possible comp. size  :  99.97 percent ( 3\00000000 )
    max possible comp. size  :  17.71 percent ( 3\settings_decals_linesOnly_AB_19 )
    average pos. comp. size  :  45.68 percent
    
    
    
    D:\KiloPack\OUT>hcheck -quiet 4\*
    
    --- hcheck v2.0 by Matthias Stirner ---
    
    
    --> final results ( 7 file( s ) ) ( 0. order model ) <--
    
    total bytes read         :  115469 byte
    possible comp. size      :  83430 byte ( 0. order )
    
    min decision content     :  2.322 bit ( 4\00000002 )
    max decision content     :  8.000 bit ( 4\CH_AL_SeanDevlin_01_HAT_1 )
    minimum cond. entropy    :  1.311 bit ( 4\00000002 )
    maximum cond. entropy    :  6.436 bit ( 4\CH_MB_SeanDevlinn_hat_D_4 )
    min possible comp. size  :  80.45 percent ( 4\CH_MB_SeanDevlinn_hat_D_4 )
    max possible comp. size  :  16.39 percent ( 4\00000002 )
    average pos. comp. size  :  58.23 percent
    
    
    
    D:\KiloPack\OUT>hcheck -quiet 5\*
    
    --- hcheck v2.0 by Matthias Stirner ---
    
    
    --> final results ( 25 file( s ) ) ( 0. order model ) <--
    
    total bytes read         :  3051932 byte
    possible comp. size      :  1708331 byte ( 0. order )
    
    min decision content     :  7.468 bit ( 5\bg_hwtf4_wedge_3 )
    max decision content     :  8.000 bit ( 5\00000000 )
    minimum cond. entropy    :  1.764 bit ( 5\bomber_single_AB_7 )
    maximum cond. entropy    :  7.998 bit ( 5\00000000 )
    min possible comp. size  :  99.97 percent ( 5\00000000 )
    max possible comp. size  :  22.06 percent ( 5\bomber_single_AB_7 )
    average pos. comp. size  :  56.31 percent
    
    
    
    D:\KiloPack\OUT>hcheck -quiet 6\*
    
    --- hcheck v2.0 by Matthias Stirner ---
    
    
    --> final results ( 71 file( s ) ) ( 0. order model ) <--
    
    total bytes read         :  1838424 byte
    possible comp. size      :  884296 byte ( 0. order )
    
    min decision content     :  4.700 bit ( 6\shr_fanfare_ring_11 )
    max decision content     :  8.000 bit ( 6\00000000 )
    minimum cond. entropy    :  1.310 bit ( 6\shr_ammo_underscoreink_AB_3 )
    maximum cond. entropy    :  7.988 bit ( 6\00000000 )
    min possible comp. size  :  99.85 percent ( 6\00000000 )
    max possible comp. size  :  16.38 percent ( 6\shr_ammo_underscoreink_AB_3 )
    average pos. comp. size  :  45.10 percent
    
    
    
    D:\KiloPack\OUT>hcheck -quiet 7\*
    
    --- hcheck v2.0 by Matthias Stirner ---
    
    
    --> final results ( 45 file( s ) ) ( 0. order model ) <--
    
    total bytes read         :  3147212 byte
    possible comp. size      :  2208743 byte ( 0. order )
    
    min decision content     :  2.322 bit ( 7\00000008 )
    max decision content     :  8.000 bit ( 7\CH_AL_SeanDevlin_01_FM_1 )
    minimum cond. entropy    :  1.311 bit ( 7\00000008 )
    maximum cond. entropy    :  7.124 bit ( 7\CH_MB_SeanDevlinn_Head_D_18 )
    min possible comp. size  :  89.05 percent ( 7\CH_MB_SeanDevlinn_Head_D_18 )
    max possible comp. size  :  16.39 percent ( 7\00000008 )
    average pos. comp. size  :  57.94 percent
    
    
    
    D:\KiloPack\OUT>hcheck -quiet 8\*
    
    --- hcheck v2.0 by Matthias Stirner ---
    
    
    --> final results ( 114 file( s ) ) ( 0. order model ) <--
    
    total bytes read         :  532645 byte
    possible comp. size      :  517303 byte ( 0. order )
    
    min decision content     :  3.459 bit ( 8\00000070 )
    max decision content     :  8.000 bit ( 8\00000000 )
    minimum cond. entropy    :  3.325 bit ( 8\00000070 )
    maximum cond. entropy    :  7.966 bit ( 8\00000035 )
    min possible comp. size  :  99.58 percent ( 8\00000035 )
    max possible comp. size  :  41.56 percent ( 8\00000070 )
    average pos. comp. size  :  73.50 percent
    
    
    
    D:\KiloPack\OUT>hcheck -quiet 9\*
    
    --- hcheck v2.0 by Matthias Stirner ---
    
    
    --> final results ( 55 file( s ) ) ( 0. order model ) <--
    
    total bytes read         :  842328 byte
    possible comp. size      :  596590 byte ( 0. order )
    
    min decision content     :  2.322 bit ( 9\00000010 )
    max decision content     :  8.000 bit ( 9\CH_AC_Eyes_Brown_AB_36 )
    minimum cond. entropy    :  1.311 bit ( 9\00000010 )
    maximum cond. entropy    :  7.251 bit ( 9\CH_CM_VestWool_UB_37 )
    min possible comp. size  :  90.63 percent ( 9\CH_CM_VestWool_UB_37 )
    max possible comp. size  :  16.39 percent ( 9\00000010 )
    average pos. comp. size  :  52.82 percent
    
    
    
    D:\KiloPack\OUT>hcheck -quiet 10\*
    
    --- hcheck v2.0 by Matthias Stirner ---
    
    
    --> final results ( 36 file( s ) ) ( 0. order model ) <--
    
    total bytes read         :  926464 byte
    possible comp. size      :  693122 byte ( 0. order )
    
    min decision content     :  2.322 bit ( 10\0000000a )
    max decision content     :  8.000 bit ( 10\CH_AC_Eyes_Brown_AB_24 )
    minimum cond. entropy    :  1.311 bit ( 10\0000000a )
    maximum cond. entropy    :  7.087 bit ( 10\CH_NZ_SS_Infantry_HD_22 )
    min possible comp. size  :  88.58 percent ( 10\CH_NZ_SS_Infantry_HD_22 )
    max possible comp. size  :  16.39 percent ( 10\0000000a )
    average pos. comp. size  :  56.73 percent
    
    
    
    D:\KiloPack\OUT>hcheck -quiet 11\*
    
    --- hcheck v2.0 by Matthias Stirner ---
    
    
    --> final results ( 82 file( s ) ) ( 0. order model ) <--
    
    total bytes read         :  1394276 byte
    possible comp. size      :  1046762 byte ( 0. order )
    
    min decision content     :  2.000 bit ( 11\0000001d )
    max decision content     :  8.000 bit ( 11\CH_AC_Eyelashes_AB_55 )
    minimum cond. entropy    :  0.993 bit ( 11\00000030 )
    maximum cond. entropy    :  7.136 bit ( 11\CH_CF_Hat_HC_NM_73 )
    min possible comp. size  :  89.20 percent ( 11\CH_CF_Hat_HC_NM_73 )
    max possible comp. size  :  12.42 percent ( 11\00000030 )
    average pos. comp. size  :  50.96 percent
    All files decompressed here
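
    For anyone without hcheck, an order-0 entropy figure can be approximated in a few lines of Python. This is only a sketch of the plain Shannon formula over byte frequencies, not necessarily what hcheck computes for each of its columns, and the function name `order0_entropy` is made up for illustration:

```python
import math
from collections import Counter

def order0_entropy(data):
    """Shannon entropy of the byte distribution, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # iterating bytes yields integer byte values
    return -sum(c / total * math.log2(c / total) for c in counts.values())
```

    Values near 8 bits per byte suggest data that is already compressed or encrypted; values well below that suggest something still compressible.
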
    Last edited by Gonzalo; 15th September 2014 at 00:49.

  18. #16
    Member
    Join Date
    Jun 2014
    Location
    Micama
    Posts
    6
    Thanks
    0
    Thanked 0 Times in 0 Posts
    CFX files

    0000 "CFX." file magic
    0004 dword looks like uncompressed file size
    0008 78 da = zlib stream
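
    If that reading is right, unpacking a CFX file is short in Python. This is a sketch under the assumptions above: the size field is taken to be little-endian, and the function name `unpack_cfx` is made up for illustration:

```python
import struct
import zlib

def unpack_cfx(data):
    """Return (decompressed_payload, size_field) for a CFX file.
    The caller can compare len(payload) with size_field to test the
    guess that the dword at offset 4 is the uncompressed size."""
    if data[:3] != b"CFX":
        raise ValueError("not a CFX file")
    (size_field,) = struct.unpack_from("<I", data, 4)
    payload = zlib.decompress(data[8:])  # zlib stream starts at offset 8
    return payload, size_field
```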

    google says zlib magics are:
    78 01 - No Compression/low
    78 9C - Default Compression
    78 DA - Best Compression
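
    Those pairs follow from the zlib header format (RFC 1950): the low nibble of the first byte is the method (8 = deflate), the high nibble encodes the window size, the top two bits of the second byte are a compression-level hint, and the two bytes read as a 16-bit number must be a multiple of 31. A small Python sketch, with the function name `describe_zlib_header` made up for illustration:

```python
def describe_zlib_header(b0, b1):
    """Return (window_bits, level_hint, has_preset_dict) for a valid
    zlib header byte pair, or None if the pair is not a valid header."""
    if ((b0 << 8) | b1) % 31 != 0 or (b0 & 0x0F) != 8:
        return None
    window_bits = (b0 >> 4) + 8            # 0x78 -> 15, i.e. a 32K window
    level_hint = ("fastest", "fast", "default", "maximum")[b1 >> 6]
    has_preset_dict = bool(b1 & 0x20)
    return window_bits, level_hint, has_preset_dict
```

    The level bits are only a hint written by the compressor; the decompressor does not need them.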

    Gonzalo: yes. Micama. Micasa. Essucasa.
    Last edited by batelnik; 15th September 2014 at 02:44.

  19. #17
    Member UnknownToaster's Avatar
    Join Date
    Sep 2014
    Location
    Chicago
    Posts
    12
    Thanks
    2
    Thanked 0 Times in 0 Posts
    Hey guys, again, I can't thank you enough for looking into this.
    Yes Gonzalo, I did end up contacting Aluigi if he might be able to help now that I had a bit more information about the files.
    He was happy to take a look later in the afternoon if I posted a thread on a forum about it.

    That exact script that you linked he made just a few days ago!

    Anyhow, the QuickBMS script he made to do it extracts most of the kilopacks but seems to have compatibility with the megapacks except for one file.
    It seems he said the files in the kilopacks are 'resource' files and not real files.

    This makes sense with what I've seen, it seems there's a lot of files that seem to point to things as opposed to being genuine assets, otherwise there would be hundreds of repetitive files, and they're all rather small.
    It's interesting though, I tried seeing if I could repack some of these files in place of others, but the game would instantly crash while loading.

    I posted on the thread to see if Aluigi might be able to look into a sample of the .megapack files which likely have genuine assets, but it's quite a bit larger..
    If anyone's at all interested in looking at one of the actual .megapacks, here's a link:
    https://dl.dropboxusercontent.com/u/...Mega2.megapack

    I'm definitely quite impressed with the work over here. I know about the ALBS, but what is this CFX file? Where can those be found?
    I know the megapacks for the DLC have another format as well that's not ALBS, and I was hoping Aluigi might be able to crack into that.

    Have you guys been able to figure out a bit more about the individual files in there, I can't be 100% sure my guess is completely accurate.
    Further, now that I have at least the kilopacks opened, they might not need to be cracked into too much. Any information is very useful, but I'd hate to bother you all and go to extremes with what I'm asking.

    Again, I do really appreciate your assistance!

  20. #18
    Member
    Join Date
    Aug 2014
    Location
    Argentina
    Posts
    536
    Thanks
    238
    Thanked 90 Times in 70 Posts
    .

  21. #19
    Member
    Join Date
    Aug 2014
    Location
    Argentina
    Posts
    536
    Thanks
    238
    Thanked 90 Times in 70 Posts
    Don't worry. It's always a pleasure, and we are learning along the way too.

    But... a few words of warning: this forum is about data compression and related topics, NOT about cracks, warez, or any practice that could harm others' rights or work. It's about building software, not destroying it.
    So, if you intend to take these files and modify them, or even share them on the net, perhaps you should ask yourself whether you are allowed to do that... And I'm wondering if I should help you with that.

    Don't take this as aggression, just friendly talk about a matter that matters.
    And remember, we are always open to challenges and trying to help out.

    Here is the official position of the site.

  22. #20
    Member UnknownToaster's Avatar
    Join Date
    Sep 2014
    Location
    Chicago
    Posts
    12
    Thanks
    2
    Thanked 0 Times in 0 Posts
    If you're asking if I intend or plan to try to crack or remove any parts of the software in order to help in illegal sharing of files for my or any others benefit, this I'm absolutely not interested in.
    I'm simply looking through the files to try to better understand how the game works and get to know how it all functions. The Saboteur is my favorite game and I find it to be a great subject to explore.

    I have no interest in harming or 'destroying' anything. I'm not entirely sure what you're getting at though, and I don't mean this in a negative way at all either. I understand warez/piracy, and I have no interest in that.
    But what else counts as 'destruction' in this context? I mean, I'm just not entirely sure what all is considered bad in this context now, or what I might be infringing.

    If you'd like to clarify further I can more easily tell what I'm trying to do compared to what the community finds acceptable, but I can definitely say I'm not trying to get around any restrictions or otherwise for me or anyone else.

  23. #21
    Member
    Join Date
    Aug 2014
    Location
    Argentina
    Posts
    536
    Thanks
    238
    Thanked 90 Times in 70 Posts
    Of course. As I told you, don't worry. It's just that most times a person tries to modify the internals of a game, it's because he wants to bypass a limitation imposed by the creators within their own rights. This has been seen thousands of times around the web.
    But now that I know this is not what you intend to do, I can help with a clear conscience.



    Edit: this post is in reply to #17. UnknownToaster wrote it just before this one, but as a reply to batelnik's one.
    Last edited by Gonzalo; 16th September 2014 at 01:18.

  24. #22
    Member UnknownToaster's Avatar
    Join Date
    Sep 2014
    Location
    Chicago
    Posts
    12
    Thanks
    2
    Thanked 0 Times in 0 Posts
    Well that's good, I just wanted to be sure I wasn't going to get on the wrong side of anyone here before I continued with anything.

    So anyhow, I'm still not sure where this 'CFX' file comes in, is this one of the smaller 'resource files'?



    I mentioned I posted on the thread in hopes for Aluigi to possibly modify the script he wrote in hopes to be able to extract the .Megapack, which currently extracts 0 files when used on the megapacks.
    I don't know if Aluigi is going to look into it or not as of yet, he very well might get to it, but right now it's more of just waiting to see what happens.

    In case he doesn't (the .megapack files are rather large compared to the .kilopacks, which can currently be opened, so he might not even bother downloading one), I decided it might be wise to see if I can find out any information about the script and the differences between the megapack and the kilopack. Obviously there must be an important difference if one is extracting tons of folders and files and the other is extracting nothing at all. Of course, I expect the actual content within the archive to also be substantially different, but if I find out where the issues are stemming from, everyone's a step closer to fixing them.

    I know the script has to dig through the 00PM, which seems to be the outer archive; extracting that yields bins which seem to be identified as these 'ALBS' blocks?
    Extracting those seems to finally get to the folders and resource files inside, from what I've gathered. So I went and extracted the 00PM for the megapack into the bins with the ALBS header.
    I also extracted the type of bins from the kilopack and compared both of the headers in a hex editor.

    The megapack had: ALBS´...1IEH
    The kilopack had: ALBS....Âc=E

    Among the files from the kilopack, the second part seemed fairly inconsistent, with all sorts of different character combinations.
    The megapack, on the other hand, seemed to consistently contain this "1IEH", which might be the next block we'd need to get through?
    Obviously this isn't my expertise at all, but I'm trying to see what I might be able to learn.

    Here's one of the problematic areas: the kilopack is a very small file that's easily sharable, but the megapacks are what contain the majority of the game content. It's not easy to just share them, and I believe that might be one of the issues deterring Aluigi from taking a look into it. There are 3 megapacks that seem to split the content, so here's the smallest one, which still has some size to it unfortunately.
    https://dl.dropboxusercontent.com/u/...Mega2.megapack
    If you'd like to take a look at this, I would very much appreciate it. Of course I completely understand if it's a bit much.

    Anyhow, thanks for all the support so far! I'm hoping I can learn a lot from this.

    Thanks,
    Dan



    Edit: I had an idea. Since I had to extract the archive to get the bins with the ALBS header, it might be enough just to link one of the bins?
    https://dl.dropboxusercontent.com/u/...00889FFD3D.bin
    Last edited by UnknownToaster; 16th September 2014 at 21:46.

  25. #23
    Member
    Join Date
    Aug 2014
    Location
    Argentina
    Posts
    536
    Thanks
    238
    Thanked 90 Times in 70 Posts
    Yes. There are those ALBS sections... 1004 to be exact.

    Also, 328 bytes after ALBS comes an "AHSM" header, and 16 bytes later a file name. AHSM occurs 16234 times, which makes me think these could be chunks into which the main file was divided to guarantee fast random access to the archive. This could explain why there are between 39915 and 40581 raw zlib streams.
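
    Counts and distances like these can be reproduced with a few lines of Python. This is only a sketch: the helper names are made up, and the sample blob is synthetic, built on the assumption that "328 bytes after ALBS" is measured from the start of the ALBS marker and "16 bytes later" from the start of AHSM.

```python
def find_all(data, marker):
    """Return every offset at which marker occurs in data."""
    offsets = []
    pos = data.find(marker)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(marker, pos + 1)
    return offsets


def gaps_to_next(starts, followers):
    """For each start offset, distance to the first follower offset after it."""
    gaps = []
    for s in starts:
        later = [f for f in followers if f > s]
        gaps.append(later[0] - s if later else None)
    return gaps


# Synthetic stand-in: one ALBS section with an AHSM marker 328 bytes in,
# and a made-up file name 16 bytes after the AHSM marker.
blob = b"ALBS" + bytes(324) + b"AHSM" + bytes(12) + b"example_name\x00"
albs = find_all(blob, b"ALBS")
ahsm = find_all(blob, b"AHSM")
print(len(albs), len(ahsm), gaps_to_next(albs, ahsm))
```

    On a real megapack the data would come from reading the file in binary mode; for files too large for memory the same search can be done over a memory-mapped file.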

    Anyway, I couldn't find any known header or magic number in the uncompressed file(s). It seems to me like the Saboteur authors made a custom image format. The data does not look random, so it is probably not encrypted. I'm running out of ideas... Any gentle advice?

    For example: how do I search for BMP-like images that have no header, or an unknown one?

  26. #24
    Member UnknownToaster's Avatar
    Join Date
    Sep 2014
    Location
    Chicago
    Posts
    12
    Thanks
    2
    Thanked 0 Times in 0 Posts
    Yes, I noticed this AHSM too, I had assumed ALBS, 1IEH, and AHSM were three different things. But it's just ALBS and AHSM? Further the 1IEH means 1004, or there's 1004 sections of 1IEH?

    So it seems a block that isn't in the kilopacks has been identified.. Do you know how it might be gotten into?
    Have you noticed any other blocks that seem to appear? This definitely seems to be an issue in extracting the megapack with the current script.
    There's even another hurdle in the DLC megapack that we haven't gone into yet: a different chunk sits where the ALBS usually is. I was hoping we might learn more about the kilopack and main megapacks first, unless you think it may be useful.

    I feel if enough information is gained about the chunk it will be much easier to persuade Aluigi to take time to update his BMS script.
    Otherwise, I'm going to try to learn more about the BMS script and QuickBMS in general in hopes I can be of any use in that.


    Well, I know for a fact that the game uses some .dds and .dxt files.
    The kilopack looks like it might contain a few different types of files. Or do they all seem to be the same?
    Either way, some of these don't look like they'd be a .dds, especially since the .dds files in the game's main folder have DDS as the header and have pretty recognizable formats.

    But I was able to find a file from the archive that has a similar pattern, though it still looks partially compressed?
    https://dl.dropboxusercontent.com/u/...Lt_Details_664
    This is the file that's from the kilopack.
    Here's the dds from the game's main directory
    https://dl.dropboxusercontent.com/u/...ng_text_GR.dds

    Might this be of use at all?

  27. #25
    Member
    Join Date
    Aug 2014
    Location
    Argentina
    Posts
    536
    Thanks
    238
    Thanked 90 Times in 70 Posts
    Confirmed: I dumped the zlib streams in compressed form, and their sizes are all round numbers: exactly 2, 3, 4 KB and so on.
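
    A check like this can be sketched in Python with the standard zlib module. This is a rough illustration under stated assumptions, not the exact procedure used here: it assumes the streams are raw deflate (no zlib header), and the function names and sample data are made up. It tries to inflate from a given offset and reports how many compressed bytes the stream actually used, which can then be tested for being a whole number of KB.

```python
import zlib


def probe_raw_deflate(data, offset):
    """Try to inflate a raw deflate stream starting at offset.

    Returns (compressed_size, uncompressed_size), or None if the bytes
    there do not form a complete raw deflate stream.
    """
    d = zlib.decompressobj(-15)  # negative wbits: no zlib header or trailer
    tail = data[offset:]
    try:
        out = d.decompress(tail)
    except zlib.error:
        return None
    if not d.eof:  # stream was cut short
        return None
    # Bytes left over after the end of the stream land in unused_data.
    return len(tail) - len(d.unused_data), len(out)


def is_whole_kb(size):
    """True when size is an exact multiple of 1024 bytes."""
    return size % 1024 == 0


# Synthetic demo: junk, then one raw deflate stream, then padding.
payload = b"texture data " * 200
c = zlib.compressobj(9, zlib.DEFLATED, -15)
raw = c.compress(payload) + c.flush()
blob = b"\xff" * 16 + raw + b"PADDING"

print(probe_raw_deflate(blob, 0))   # junk, not a deflate stream
print(probe_raw_deflate(blob, 16))  # the embedded stream
```

    Scanning every offset of a large archive this way is slow and gives false positives on very short streams, so in practice it helps to probe only at offsets suggested by the AHSM markers.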

  28. #26
    Member UnknownToaster's Avatar
    Join Date
    Sep 2014
    Location
    Chicago
    Posts
    12
    Thanks
    2
    Thanked 0 Times in 0 Posts
    Hey! That sounds like good news but I'm not entirely sure I understand.
    Are you confirming that the files extracted from the kilopack are in fact compressed, such as the file that may be a DDS?

    Or are you saying that you were able to extract the AHSM and that they were also zlib compressed?

  29. #27
    Member
    Join Date
    Aug 2014
    Location
    Argentina
    Posts
    536
    Thanks
    238
    Thanked 90 Times in 70 Posts
    Yes, I suggest you learn about BMS and related tools, because I'm doing the research by hand with known tools like a hex editor and deflate handlers. I have a theory which I hope I can prove by actually extracting meaningful data. In the next few days I'll be a little busy, but maybe at the weekend we can exchange results if you want to...
    As far as Luigi is concerned, I agree that he might be more inclined to help us if we give him more substantial data to work with. So let's see about that later.
    BTW... maybe DDS is the start of the thread, the tip of the iceberg. You should have mentioned it BEFORE!!!

  30. #28
    Member
    Join Date
    Aug 2014
    Location
    Argentina
    Posts
    536
    Thanks
    238
    Thanked 90 Times in 70 Posts
    Hey! That sounds like good news but I'm not entirely sure I understand.
    Are you confirming that the files extracted from the kilopack are in fact compressed, such as the file that may be a DDS?

    Or are you saying that you were able to extract the AHSM and that they were also zlib compressed?
    Well, the complete megapack file is compressed with zlib. This is easy to verify by doing this:
    Code:
    precomp -cn -intense FILE
    Wait for a while and look at the console output.
    But what I think I've found is the way the archive is internally organized. All these files extracted by QuickBMS from the .kilopack are nothing but pieces of a bigger, meaningful file (twelve in the given example). Megapacks are pretty similar in structure, maybe different only in header and size.

    But it's just ALBS and AHSM? Further the 1IEH means 1004, or there's 1004 sections of 1IEH?
    1004 is a decimal number. There are one thousand and four identical strings scattered throughout the entire length of the file.

    So it seems a block that isn't in the kilopacks has been identified..
    Actually, that "block" type was always in the .kilopack. We just didn't realize.
    As far as I can see, "ALBS" sections are the start of the archived files, and the "AHSM" sections are chunks of deflate data into which the main file (AKA the "ALBS" section) is divided for the purpose of comfortable handling by the game's engine.
    By decompressing all these chunks in order, we should be able to reconstruct the wanted resource.
    But it's just my guess. I'm open to any opinion or, better, any proven fact I've overlooked.
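
    The reconstruction idea can be sketched like this in Python. Everything here is an assumption for illustration: the function names are made up, the chunks are taken to be independent raw deflate streams, and how the chunk boundaries are found inside a real ALBS section is still unknown, so the demo builds its own chunks from sample data.

```python
import zlib


def deflate_raw(data):
    """Demo helper only: compress data into a raw deflate stream."""
    c = zlib.compressobj(9, zlib.DEFLATED, -15)
    return c.compress(data) + c.flush()


def inflate_chunks(chunks):
    """Inflate each raw deflate chunk and join the results in order."""
    parts = []
    for index, chunk in enumerate(chunks):
        d = zlib.decompressobj(-15)  # negative wbits: raw deflate
        try:
            parts.append(d.decompress(chunk) + d.flush())
        except zlib.error as exc:
            raise ValueError("chunk %d is not raw deflate: %s" % (index, exc))
    return b"".join(parts)


# Synthetic demo: cut one "resource" into pieces, compress each piece
# separately, then rebuild it by inflating the pieces in order.
original = b"pretend this is one texture or mesh resource " * 50
pieces = [original[i:i + 700] for i in range(0, len(original), 700)]
chunks = [deflate_raw(p) for p in pieces]
rebuilt = inflate_chunks(chunks)
print(len(chunks), rebuilt == original)
```

    If the rebuilt data for a real section starts with a recognizable header (DDS, for example), that would be good evidence for the theory; if not, the chunk order or boundaries are probably wrong.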
    Last edited by Gonzalo; 17th September 2014 at 03:53.

  31. #29
    Member UnknownToaster's Avatar
    Join Date
    Sep 2014
    Location
    Chicago
    Posts
    12
    Thanks
    2
    Thanked 0 Times in 0 Posts
    I can see what I can do about learning QuickBMS, but I don't have a substantial amount of programming experience or much knowledge of the formats. Luckily, I do have access to a large number of scripts to use as examples, thanks to Aluigi's website.

    His page on QuickBMS does insist that a huge amount of background knowledge of programming isn't necessary, and that what's more important is an understanding of the formats. I've been looking into how to really research and learn about these files. The QuickBMS page also says the important pieces of information that must be gained are:

    "- filename
    - offset
    - size
    - optional compressed size if the file is compressed"
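
    To make those four values concrete, here is a minimal Python sketch of what an extractor does with them. The class and function names are made up, the sample archive is synthetic, and raw deflate is assumed as the compression only because that is what has turned up elsewhere in this thread.

```python
import zlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    filename: str
    offset: int                            # where the data starts in the archive
    size: int                              # size of the file once extracted
    compressed_size: Optional[int] = None  # set only if the data is compressed


def extract(archive, entry):
    """Return the bytes of one entry, inflating them if needed."""
    if entry.compressed_size is None:
        return archive[entry.offset:entry.offset + entry.size]
    packed = archive[entry.offset:entry.offset + entry.compressed_size]
    data = zlib.decompressobj(-15).decompress(packed)  # assumed raw deflate
    if len(data) != entry.size:
        raise ValueError("size mismatch for " + entry.filename)
    return data


# Synthetic archive: 8 bytes of header, one stored file, one compressed file.
stored = b"plain bytes"
payload = b"compressible " * 40
c = zlib.compressobj(9, zlib.DEFLATED, -15)
packed = c.compress(payload) + c.flush()
archive = bytes(8) + stored + packed

entries = [
    Entry("a.txt", 8, len(stored)),
    Entry("b.bin", 8 + len(stored), len(payload), len(packed)),
]
for e in entries:
    print(e.filename, len(extract(archive, e)))
```

    In a QuickBMS script the same values are what the Log (offset and size) and CLog (offset, compressed size and size) commands take, so the real work is finding where the archive stores them.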

    This may come in handy. I am trying to learn QuickBMS, but it seems the more pertinent issue is learning how to study and research these files.
    If I can find out that information about them, I may very well be able to turn it into a QuickBMS script.



    Further, I do believe your theory may be correct. When I first looked at the extracted files, it seemed like the only logical theory, since a lot of the files looked like segments with unintelligible starts and ends, they were incredibly small, and, importantly, the same file names kept showing up across folders. I had assumed they were just pieces of files, but I dropped the theory when Aluigi said that they were resource files. I looked through one of the only megapacks I could successfully extract, and it looked like there was a folder for each NPC in the game with a small file that chose which hands, shirt, undershirt, pants, shoes, head, even mouth, etc. to use. It seemed to make sense that these might be small files that simply told the game which piece to use (to access the 'real' data in the megapacks).

    That pointer-file idea is not supported, though, by the fact that when I opened these files they seemed to contain what looked to me like important data, not something as simple as a pointer file meant to be tossed in and thrown away. There's also the fact that when I tried to 'replace' one of these files to see how the game would react, the game simply crashed. If the file I replaced was only a fragment of a file, and what I replaced it with was also only a fragment, then of course it would crash.

    I think there's valid reason to suspect they might come together to be full files, especially if this AHSM block has been in here and gone unnoticed.




    Further, if this theory means I have to find a way to implement the AHSM handling in the QuickBMS script, that may be well beyond my abilities, though of course I can try. On the other hand, if the theory can be backed up further, Aluigi may be willing to do it himself.

    Let me know what you think, I'll keep looking into this and the QuickBMS.

  32. #30
    Member
    Join Date
    Aug 2014
    Location
    Argentina
    Posts
    536
    Thanks
    238
    Thanked 90 Times in 70 Posts
    Hi again!
    Finally I've managed to extract the data as I intended. But I'm not sure it should be extracted that way. Many files start with the same first several bytes, but I'm not able to find a single known header.

    Besides that, I've found names in the kilopack file (KP from now on), including names ending in .TGA. Perhaps this is one of those cases where a rather small file is an index containing pointers to the actual data in another, bigger one...
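
    Finding names like these does not need a special tool. Here is a minimal Python sketch of a strings-style scan; the function names and the sample bytes are made up, and it only looks for plain ASCII runs.

```python
import re


def ascii_strings(data, min_len=4):
    """Return runs of printable ASCII characters at least min_len long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def names_with_extension(data, extension):
    """Keep only the strings that end with the given extension (any case)."""
    ext = extension.lower()
    return [s for s in ascii_strings(data) if s.lower().endswith(ext)]


# Synthetic sample: binary junk around one made-up texture name.
blob = b"\x00\x01textures/wall_01.TGA\x00\xff\x02ab\x00ALBS\x00"
print(ascii_strings(blob))
print(names_with_extension(blob, ".tga"))
```

    Recording the offset of each match as well (m.start()) would show whether the names sit next to values that look like offsets or sizes, which is what an index file would contain.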

    In case anyone is interested, once the uploads finish, the links to the unpacked data for both archives will be here. I also include the SysInternals "strings" command's output for each 'extracted' file.

    Edit:
    https://drive.google.com/file/d/0BzZ...it?usp=sharing
    https://drive.google.com/file/d/0BzZ...it?usp=sharing

    I could definitely use the whole game folder, if you're willing to upload it, to look for correlations, because right now it feels like we're trying to assemble a puzzle with only two of the 72 pieces.
    Last edited by Gonzalo; 22nd September 2014 at 05:17.
