Page 9 of 13
Results 241 to 270 of 361

Thread: EMMA - Context Mixing Compressor

  1. #241
    Member
    Join Date
    Feb 2016
    Location
    Luxembourg
    Posts
    521
    Thanks
    198
    Thanked 745 Times in 302 Posts
    Yes, I know Stefan, thank you, I read the documentation, that is exactly how EMMA detects endianness on DICOM files. The only problems were with ACR-NEMA 1.0 and 2.0 files, on those I check for the (0008,0010) tag [RecognitionCode] and its content, and in the case of MR and some other files I found, where even that is absent, I search for a valid tag from the Image Presentation Group, in either endianness and with explicit or implicit VR, and if found try to process the tags from then on. The links you provided were quite helpful.
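A minimal sketch of the fallback probing described above, not EMMA's actual parser. It assumes the Rows element (0028,0010), which holds a single 16-bit value, is the tag being searched for; that choice and the names `TagEncoding` and `probeRowsTag` are hypothetical and only serve the illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical result type: which encoding (if any) matched at the probed offset.
enum class TagEncoding { None, LittleImplicit, LittleExplicit, BigImplicit, BigExplicit };

static uint16_t rd16(const uint8_t* p, bool big) {
    return big ? uint16_t((p[0] << 8) | p[1]) : uint16_t(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t* p, bool big) {
    return big ? (uint32_t(p[0]) << 24) | (uint32_t(p[1]) << 16) | (uint32_t(p[2]) << 8) | p[3]
               : uint32_t(p[0]) | (uint32_t(p[1]) << 8) | (uint32_t(p[2]) << 16) | (uint32_t(p[3]) << 24);
}

// Probe for the Rows element (0028,0010) at p, trying both byte orders and
// both VR styles. Needs at least 8 readable bytes.
TagEncoding probeRowsTag(const uint8_t* p, size_t avail) {
    if (avail < 8) return TagEncoding::None;
    for (int big = 0; big < 2; ++big) {
        const bool be = (big != 0);
        if (rd16(p, be) != 0x0028 || rd16(p + 2, be) != 0x0010) continue;
        // Explicit VR: the two ASCII letters "US", then a 16-bit length of 2.
        if (p[4] == 'U' && p[5] == 'S' && rd16(p + 6, be) == 2)
            return be ? TagEncoding::BigExplicit : TagEncoding::LittleExplicit;
        // Implicit VR: a 32-bit length of 2 follows the tag directly.
        if (rd32(p + 4, be) == 2)
            return be ? TagEncoding::BigImplicit : TagEncoding::LittleImplicit;
    }
    return TagEncoding::None;
}
```

A real parser would slide this probe over the file and then validate the tags that follow before trusting the match.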

    I just haven't been able to identify what format the "x-ray" file from the Silesia Corpus is in. Matt lists it as DICOM in his benchmark page, but that's wrong, it uses a simple fixed length header format.
    Do you have any idea what format it could be?

    Best regards

  2. #242
    Member
    Join Date
    Aug 2016
    Location
    USA
    Posts
    42
    Thanks
    11
    Thanked 17 Times in 12 Posts
    Then you're on top of things. The x-ray file is definitely not DICOM; I have no idea what the FS_A.3197.img string implies, since there's a lot of software that uses .img as a file extension for images. If I had to guess, it could be Mayo/Analyze format - that uses a simple header and has used .img traditionally. Some image processing software uses that format even though it's much more oriented towards 3D images. https://en.wikipedia.org/wiki/Analyz...ging_software) has a link to a PDF describing the header. The header size does not seem to quite match, but Analyze allows weird offsets and padding so who knows...

  3. #243
    Member
    Join Date
    Feb 2016
    Location
    Luxembourg
    Posts
    521
    Thanks
    198
    Thanked 745 Times in 302 Posts
    That was also my best guess, Analyze 7.5, but there are too many discrepancies. I could write a simple parser based on what can be interpreted from examining the file, but then it would probably only work on that file alone, and I don't think it's fair to optimize a general purpose compressor for a single file of a benchmark.
    At least the DICOM parser is useful, even if it's not a common image format. And since I already had the generic image model for the RAW photos and PSD images, it was just a matter of putting it to use.

    Thank you for all your help, best regards

  4. #244
    Member
    Join Date
    Feb 2016
    Location
    Luxembourg
    Posts
    521
    Thanks
    198
    Thanked 745 Times in 302 Posts
    EMMA v0.1.17, attachment in first post.

    Code:
    Changes:
    - Improved the parser for DICOM/ACR NEMA 1.0 & 2.0 images
    - New endianness transform for the generic image model
    - New preliminary model for Panasonic .RW2 raw photos (requested by Stephan Busch)
    I've created a new model for raw images for Panasonic cameras using the RW2 format.
    It's still very simple but it is already showing some promising results. I'll keep improving it.
    I've also included the new endianness transform, which allows EMMA to convert little-endian pixel values to big-endian, which helps with the generic model.
    For future reference, with the new transform, forcing the use of the generic image model for x-ray from the Silesia Corpus would give a result of 3.504.941 bytes.
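As a rough illustration of what such an endianness transform amounts to for 16-bit samples, here is a sketch; it is not EMMA's code, and the function name is made up. It simply swaps the two bytes of each sample, so applying it a second time restores the original data. Why this helps the generic model is not stated above; one plausible reason (an assumption) is that the high byte then arrives first and can serve as context for the low byte.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Swap the two bytes of every 16-bit sample in place.
// A trailing odd byte, if any, is left untouched.
// The transform is its own inverse.
void swapSampleBytes16(std::vector<uint8_t>& pixels) {
    for (size_t i = 0; i + 1 < pixels.size(); i += 2) {
        std::swap(pixels[i], pixels[i + 1]);
    }
}
```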

    Results for the new endianness transform:
    Code:
    File: mr, from Silesia Corpus, 9.970.564 bytes
    1.951.715 bytes, EMMA 0.1.16 x86, Preset "Images (Slow)"
    1.913.215 bytes, EMMA 0.1.17 x86, Preset "Images (Slow)"
    
    File: PANASONIC_LUMIX_L1.raw, from https://www.rawsamples.ch, 15.041.536 bytes
    7.276.519 bytes, EMMA 0.1.16 x86, Preset "Images (Slow)"
    7.248.864 bytes, EMMA 0.1.17 x86, Preset "Images (Slow)"
    Results for the new RW2 image model:
    Code:
    File: panasonic_lumix_g5_15.rw2, from SqueezeChart, 19.788.288 bytes
    14.154.906 bytes, EMMA 0.1.17 x86, Preset "Images (Slow)"
    
    File: panasonic_lumix_dmc_gh3_10.rw2, from SqueezeChart, 19.826.176 bytes
    12.873.576 bytes, EMMA 0.1.17 x86, Preset "Images (Slow)"

  5. The Following 2 Users Say Thank You to mpais For This Useful Post:

    Darek (18th September 2016),Stephan Busch (18th September 2016)

  6. #245
    Member
    Join Date
    Mar 2016
    Location
    Croatia
    Posts
    183
    Thanks
    76
    Thanked 12 Times in 11 Posts
    I have thousands of GH2 rw2 pictures, let me know if you need some samples.

  7. #246
    Member
    Join Date
    Dec 2008
    Location
    Poland, Warsaw
    Posts
    940
    Thanks
    558
    Thanked 380 Times in 284 Posts
    Quote Originally Posted by mpais View Post
    EMMA v0.1.17, attachment in first post.
    For future reference, with the new transform, forcing the use of the generic image model for x-ray from the Silesia Corpus would give a result of 3.504.941 bytes.
    For my testbed there are no changes. Regarding the generic image model: how do you force EMMA to use it? Could you add a switch to force this model (on/off)?

    Darek

  8. #247
    Member
    Join Date
    Feb 2016
    Location
    Luxembourg
    Posts
    521
    Thanks
    198
    Thanked 745 Times in 302 Posts
    @dado023
    Could you test on some of your photos, to see if they are detected?
    I have several .RW2 files to test, but only one .RAW using the same format as RW2 (PANASONIC_DMCL10.raw from rawsamples.ch), so I'm not sure if the parser will correctly detect those. I'll have to search for some more.

    @Darek
    I simply have some debug code in the PNM parser to force a detection at the correct offset.

    Even if I wanted to give you the option to force the use of the model, you wouldn't be able to decompress the file.

    As you may recall, EMMA is a streaming compressor, meaning that it compresses data as it sees it, and with the lowest possible latency (for a CM compression engine that is still quite high, obviously).

    That means it doesn't do a first pass through the data to detect what types of models to use, and, crucially to the question at hand, it doesn't save any information in the output file related to the segmentation in blocks for different models, as PAQ variants do. It relies solely on its parsers to signal a detection, both when compressing and decompressing. So even when you're decompressing a file, EMMA is still using its parsers, just like when the file was first compressed. So if you were to "override" the parsers by forcing a detection, EMMA would have no way to know that when decompressing, and the output file would be corrupted.

    If you remember, I once told you that I couldn't just use the TIFF parser in paq8pxd, and that is part of the reason why. The parsers in EMMA must be able to correctly detect data when compressing (by analysing the next few bytes), when decompressing (by analysing the previous few bytes) and even have to deal with skips in positions (for instance, the TIFF parser has read the first IFD and knows that the next one is at offset X, and before it sees it, the JPEG parser detects an embedded image, so the JPEG model is called to compress it. All parsers are put on hold, and when we're finished with the JPEG, we resume parsing, and the TIFF parser has to check if we're past offset X). Add to this the fact that the transforms also work like this, and you'll see that keeping everything in sync is crucial, so any option that could wreak havoc with all this is simply too much work to be worth it.
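The synchronisation argument can be shown with a toy example. This is only a sketch under invented assumptions: the "format" (marker bytes `I`, `M`, then a length byte), the `ToyParser` type and the XOR "models" are all hypothetical stand-ins and have nothing to do with EMMA's real parsers or models. The point is that both sides feed the same already-known bytes to the same parser, so no segmentation information has to be stored.

```cpp
#include <cstdint>
#include <vector>

// Toy parser: after the bytes 'I','M' the next byte is a length N, and the
// N bytes after that count as "image" data.
struct ToyParser {
    int state = 0;      // 0: idle, 1: saw 'I', 2: saw "IM" (length byte is next)
    int remaining = 0;  // image bytes still expected
    // Model for the NEXT byte: 1 = image model, 0 = default model.
    int model() const { return remaining > 0 ? 1 : 0; }
    // Feed one byte that has already been coded (or decoded).
    void update(uint8_t b) {
        if (remaining > 0) { --remaining; return; }
        if (state == 2) { remaining = b; state = 0; return; }
        if (state == 1 && b == 'M') state = 2;
        else state = (b == 'I') ? 1 : 0;
    }
};

// Stand-in for "coding with model m": XOR with a per-model key.
static const uint8_t kKey[2] = {0x00, 0x5A};

// forcedModel >= 0 imitates an encoder-only override of the parser.
std::vector<uint8_t> encode(const std::vector<uint8_t>& in, int forcedModel = -1) {
    ToyParser parser;
    std::vector<uint8_t> out;
    for (uint8_t b : in) {
        const int m = forcedModel >= 0 ? forcedModel : parser.model();
        out.push_back(static_cast<uint8_t>(b ^ kKey[m]));
        parser.update(b);
    }
    return out;
}

// The decoder has no side information: it re-runs the parser on decoded bytes.
std::vector<uint8_t> decode(const std::vector<uint8_t>& in) {
    ToyParser parser;
    std::vector<uint8_t> out;
    for (uint8_t c : in) {
        const uint8_t b = static_cast<uint8_t>(c ^ kKey[parser.model()]);
        out.push_back(b);
        parser.update(b);
    }
    return out;
}
```

With the parser in charge on both sides, `decode(encode(data))` returns the input. Calling `encode(data, 1)` forces a model the decoder cannot know about, and the decoded output no longer matches.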

    Best regards
    Last edited by mpais; 19th September 2016 at 21:35.

  9. The Following User Says Thank You to mpais For This Useful Post:

    Darek (19th September 2016)

  10. #248
    Member
    Join Date
    Dec 2008
    Location
    Poland, Warsaw
    Posts
    940
    Thanks
    558
    Thanked 380 Times in 284 Posts
    Many thanks for the detailed explanation! I didn't realise that EMMA uses parsers during decompression in that way.

    Best Regards,
    Darek

  11. #249
    Member
    Join Date
    Jul 2014
    Location
    Mars
    Posts
    173
    Thanks
    121
    Thanked 11 Times in 10 Posts
    I wonder if it would be useful to use https://github.com/JBontes/FastCode or ASMLib in EMMA.
    Last edited by necros; 25th September 2016 at 02:58.

  12. #250
    Member
    Join Date
    Feb 2016
    Location
    Luxembourg
    Posts
    521
    Thanks
    198
    Thanked 745 Times in 302 Posts
    Thank you necros, it seems interesting, I'll look into it.
    Best regards

  13. #251
    Member
    Join Date
    Feb 2016
    Location
    Luxembourg
    Posts
    521
    Thanks
    198
    Thanked 745 Times in 302 Posts
    EMMA v0.1.18, attachment in first post.

    Code:
    Changes:
    - Improved model for Panasonic RW2 raw photos
    - Slightly improved record model
    - Small speed optimizations
    - Use UTF8 to store filenames, check for filesizes over 4GiB
    Small improvement for RW2 raw photos. Unfortunately, Panasonic uses such a dumb way to store the raw data that I can only guess it was meant as an obfuscation attempt to prevent someone from reverse-engineering it. A dedicated compressor for them would probably get better ratios while being orders of magnitude faster.

    Results:
    Code:
    File: panasonic_lumix_g5_15.rw2, from SqueezeChart, 19.788.288 bytes
    14.154.906 bytes, EMMA 0.1.17 x86, Preset "Images (Slow)"
    13.834.542 bytes, EMMA 0.1.18 x86, Preset "Images (Slow)"
    
    File: panasonic_lumix_dmc_gh3_10.rw2, from SqueezeChart, 19.826.176 bytes
    12.873.576 bytes, EMMA 0.1.17 x86, Preset "Images (Slow)"
    12.591.935 bytes, EMMA 0.1.18 x86, Preset "Images (Slow)"

  14. #252
    Member
    Join Date
    Oct 2016
    Location
    Slovakia
    Posts
    22
    Thanks
    37
    Thanked 3 Times in 3 Posts
    @mpais
    I just got a new Panasonic GM1 camera, so I can help with providing sample RW2 images / testing. Only 8 GB of RAM on my PC, though.

    Roman

  15. #253
    Member
    Join Date
    Feb 2016
    Location
    Luxembourg
    Posts
    521
    Thanks
    198
    Thanked 745 Times in 302 Posts
    Quote Originally Posted by Hacker View Post
    I just got a new Panasonic GM1 camera, so I can help with providing sample RW2 images / testing. Only 8 GB of RAM on my PC, though.

    Roman
    Thank you Roman, could you test with some images to see if they are properly detected? You just have to choose one of the "Image" presets. If the image is detected, you will see it as "Raw Images" in the parsing info when encoding.

    Best regards

  16. #254
    Member
    Join Date
    Oct 2016
    Location
    Slovakia
    Posts
    22
    Thanks
    37
    Thanked 3 Times in 3 Posts
    @mpais

    Quote Originally Posted by mpais View Post
    Thank you Roman, could you test with some images to see if they are properly detected? You just have to choose one of the "Image" presets. If the image is detected, you will see it as "Raw Images" in the parsing info when encoding.
    Yes it does, one RAW and two embedded JPGs.

  17. The Following User Says Thank You to Hacker For This Useful Post:

    mpais (6th October 2016)

  18. #255
    Member
    Join Date
    Jul 2014
    Location
    Mars
    Posts
    173
    Thanks
    121
    Thanked 11 Times in 10 Posts
    Why no source code?

  19. #256
    Member
    Join Date
    Feb 2016
    Location
    Luxembourg
    Posts
    521
    Thanks
    198
    Thanked 745 Times in 302 Posts
    The code is sub-par, too messy, it would take a significant effort to clean it up, and I'd rather spend that time improving or creating new models. As it is now, I only have a few hours for coding on the weekends, so I want to make the most of them. And I honestly don't think it would make any difference to EMMA's development, there just doesn't seem to be much interest in extreme compression nowadays. CMIX, MCM, paq8px/paq8pxd are all open source, and yet you only see their respective authors making improvements to them.

    Best regards

  20. The Following User Says Thank You to mpais For This Useful Post:

    xinix (12th October 2016)

  21. #257
    Member
    Join Date
    Feb 2016
    Location
    Luxembourg
    Posts
    521
    Thanks
    198
    Thanked 745 Times in 302 Posts
    EMMA v0.1.19, attachment in first post.

    Code:
    Changes:
    - Improved the TIFF parser, now supports:
        - Fujifilm .RAF raw images (with the help of a dedicated parser, still not full support)
        - Kodak .KDC raw images
        - Mamiya .MEF raw images
        - Leaf/Aptus/Mamiya .MOS raw images, uncompressed only
        - Pentax .PEF raw images, uncompressed only
    - Slightly improved JPEG model
    EMMA now supports 10 raw image formats, from 9 manufacturers. These latest additions all use the generic image model.
    I can also use this model for old Konica-Minolta .MRW raw images, but it seems other formats would need special models (like a lossless JPEG model).

    Results:
    Code:
    File: fujifilm_xf1_08.raf, from SqueezeChart, 19.768.256 bytes
    14.142.245 bytes, EMMA 0.1.18 x86, Preset "Images (Slow)"
    12.338.561 bytes, EMMA 0.1.19 x86, Preset "Images (Slow)"
    
    File: fujifilm_finepix_x100_11.raf, from SqueezeChart, 19.901.392 bytes
    11.725.197 bytes, EMMA 0.1.18 x86, Preset "Images (Slow)"
     9.410.512 bytes, EMMA 0.1.19 x86, Preset "Images (Slow)"
    
    File: fujifilm_x_e1_20.raf, from SqueezeChart, 26.146.816 bytes
    12.019.312 bytes, EMMA 0.1.18 x86, Preset "Images (Slow)"
    10.635.336 bytes, EMMA 0.1.19 x86, Preset "Images (Slow)"
    
    File: kodak_easyshare_z990_01.kdc, 22.388.380 bytes
    13.930.711 bytes, EMMA 0.1.18 x86, Preset "Images (Slow)"
    13.111.149 bytes, EMMA 0.1.19 x86, Preset "Images (Slow)"
    
    File: kodak_easyshare_z990_17.kdc, 22.312.111 bytes
    16.010.468 bytes, EMMA 0.1.18 x86, Preset "Images (Slow)"
    14.518.564 bytes, EMMA 0.1.19 x86, Preset "Images (Slow)"
    
    File: RAW_KODAK_EASYSHARE_Z1015-IS.KDC, from rawsamples.ch, 19.869.916 bytes
    14.417.952 bytes, EMMA 0.1.18 x86, Preset "Images (Slow)"
    12.932.008 bytes, EMMA 0.1.19 x86, Preset "Images (Slow)"
    
    File: RAW_MAMIYA_ZD.MEF, from rawsamples.ch, 36.575.308 bytes
    16.980.428 bytes, EMMA 0.1.18 x86, Preset "Images (Slow)"
    14.203.794 bytes, EMMA 0.1.19 x86, Preset "Images (Slow)"
    
    File: RAW_LEAF_APTUS_22.MOS, from rawsamples.ch, 43.383.462 bytes
    25.954.658 bytes, EMMA 0.1.18 x86, Preset "Images (Slow)"
    23.297.664 bytes, EMMA 0.1.19 x86, Preset "Images (Slow)"
    
    File: RAW_PENTAX_STARISTD_SRGB.PEF, from rawsamples.ch, 13.402.105 bytes
    5.575.594 bytes, EMMA 0.1.18 x86, Preset "Images (Slow)"
    4.999.288 bytes, EMMA 0.1.19 x86, Preset "Images (Slow)"
    
    File: RAW_PENTAX_K100.PEF, from rawsamples.ch, 11.199.172 bytes
    7.201.224 bytes, EMMA 0.1.18 x86, Preset "Images (Slow)"
    6.387.473 bytes, EMMA 0.1.19 x86, Preset "Images (Slow)"
    
    File: RAW_PENTAX_STAR_DL2.PEF, from rawsamples.ch, 10.756.156 bytes
    6.588.566 bytes, EMMA 0.1.18 x86, Preset "Images (Slow)"
    5.942.906 bytes, EMMA 0.1.19 x86, Preset "Images (Slow)"

  22. The Following 4 Users Say Thank You to mpais For This Useful Post:

    Hacker (18th October 2016),load (16th October 2016),Mike (16th October 2016),Stephan Busch (17th October 2016)

  23. #258
    Member
    Join Date
    Feb 2016
    Location
    Luxembourg
    Posts
    521
    Thanks
    198
    Thanked 745 Times in 302 Posts
    EMMA v0.1.20

    Code:
    Changes:
    - Improved the JPEG parser
    - Completely rewrote the 8-bit grayscale image model
    I've finally had some time to improve EMMA, so I decided to start by rewriting the 8-bit grayscale model.
    I still need to improve the mixing and refinement stages, but that will have to wait.

    So, now for some results. To start, I'll use the 8bpp grayscale images testset from imagecompression.info.
    All compressors were set for maximum compression.
    For paq8px_v75, I used option "-8".
    For EMMA I used the "Images (Slow)" preset.
    For MRP 0.5 I used option "-o" (so the default maximum 100 iterations, plus up to another 100 iterations for experimental predictor optimization)
    For BMF 2.01 I tried options "-s" and "-s -q9" and chose the best result
    GraLIC doesn't have any options (that I'm aware of)

    Attachment 4792

    I decided to exclude the results for 2 files (artificial.pgm and zone_plate.pgm) from the totals and averages since those aren't photographic images and heavily skewed the results in EMMA's favor.

    I then tested with the 8bpp grayscale images testset from SqueezeChart, which is composed of 4 medical images and 1 photo (sigma8).

    Attachment 4793

    Here we see that EMMA still can't compete with MRP when it comes to medical images, though in the only photo present it did manage the best result.

    I'll see if I can further improve the model in the future, since these tests show there is clearly room for improvement, but for now I'll try plugging it into the 24bpp image model and see what kind of results it provides.

    Best regards

  24. The Following 3 Users Say Thank You to mpais For This Useful Post:

    Hacker (20th January 2017),Mike (17th January 2017),schnaader (17th January 2017)

  25. #259
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,269
    Thanks
    200
    Thanked 985 Times in 511 Posts
    Btw, what's wrong with mod_ppmd?
    Or was there no improvement from it?

  26. #260
    Programmer schnaader's Avatar
    Join Date
    May 2008
    Location
    Hessen, Germany
    Posts
    555
    Thanks
    208
    Thanked 193 Times in 90 Posts
    I was curious how FLIF performs, here are the results:

    Code:
    FLIF 0.2.0 x64 [22 Sept 2016] from GitHub release section
    ---
                        -e          -e -N (non interlaced)  EMMA 0.1.20 x86
    big_building        16.826.233  16.561.534              15.455.114
    big_tree            12.535.456  12.472.192              11.805.757
    bridge               5.683.792   5.701.144               5.331.410
    cathedral            2.594.962   2.557.988               2.401.353
    deer                 6.010.200   6.102.572               5.611.263
    fireworks            1.251.854   1.227.838               1.150.430
    flower_foveon          808.403     800.734                 739.962
    hdr                  1.622.856   1.598.580               1.473.555
    leaves_iso_200       2.769.085   2.683.417               2.465.614
    leaves_iso_1600      3.314.779   3.269.780               3.076.029
    nightshot_iso_100    1.807.788   1.791.173               1.669.225
    nightshot_iso_1600   3.548.165   3.543.715               3.361.677
    spider_web           2.354.061   2.332.989               2.090.452
    ---
    total               61.127.634  60.643.656              56.631.841
    ---
    artificial             484.243     398.073                 273.686
    zone_plate           2.953.759   3.317.375                 162.434
    http://schnaader.info
    Damn kids. They're all alike.

  27. The Following 2 Users Say Thank You to schnaader For This Useful Post:

    Hacker (20th January 2017),mpais (17th January 2017)

  28. #261
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,269
    Thanks
    200
    Thanked 985 Times in 511 Posts
    There are also these options:
    Code:
        v_printf(1,"   -E, --effort=N              0=fast/poor compression, 100=slowest/best? (default: -E60)\n");
        v_printf(2,"   -Y, --no-ycocg              disable YCoCg transform (use plain RGB instead)\n");

  29. #262
    Member
    Join Date
    Feb 2016
    Location
    Luxembourg
    Posts
    521
    Thanks
    198
    Thanked 745 Times in 302 Posts
    @Shelwien

    I never tried mod_ppmd in EMMA. I may think about it when I get around to trying to improve text compression, since I saw it helped a bit with cmix. I'll have to see how much work it is to port it to Delphi. What I did try related to PPM in general was to take the idea of information inheritance and use it in my data structure for ludicrous mode; on text files it gave a small but consistent gain.

    As for FLIF, YCoCg at least won't make any difference, since these are grayscale images. I expected FLIF to do much better on the 2 computer-generated images. I've never looked into it, but it is described as a universal codec, suitable for both photographic and non-photographic content. Could there be some option that helps with that?

    Best regards

  30. #263
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,269
    Thanks
    200
    Thanked 985 Times in 511 Posts
    > I get around to trying to improve text compression

    ppmd is not really for text compression; it's just that a byte model is significantly different from paq's bit models,
    and thus can win sometimes.

    > I saw it helped a bit with cmix

    cmix already had a few integrated PPM models, so the gain was very small there
    (Though I also had a suspicion that it simply doesn't get enough weight in the mix, because of n-ary mix and multiple paq models in it).
    But there's much more effect in other cases, eg. http://encode.su/threads/1464-Paq8px...ll=1#post49050

    > I'll have to see how much work it is to port it to Delphi.

    Completely translating it to Delphi doesn't seem realistic to me, but I can make a DLL of it if you'd use that, or maybe just an .obj (for {$L ...}).
    The main function just gets a bit and outputs a probability.

  31. #264
    Member
    Join Date
    Feb 2016
    Location
    Luxembourg
    Posts
    521
    Thanks
    198
    Thanked 745 Times in 302 Posts
    A DLL would probably be the best option, since it can be dynamically loaded if present in the same directory. That way, if it isn't used, it wouldn't have to be included in the total decompressor size for the LTCB or similar benchmarks. But it would need an option to select the memory used. Not knowing much about it, do you think it could give good results using just, let's say, 128MB of memory (in the x86 version of EMMA)? I once made a PPM compressor to test some data structures (mainly different types of trees) for keeping the stats, and I remember that even with several pruning strategies, it only got reasonable results using lots of memory, and I'd like to keep memory usage for EMMA in check.

    My next to-do items are improving my color image model (24bpp) and also my color palette image model (8bpp), but if including mod_ppmd is reasonably straightforward I'll try including it in the next version.

  32. #265
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,269
    Thanks
    200
    Thanked 985 Times in 511 Posts
    http://nishi.dreamhosters.com/u/mod_ppmd_v3_dll.rar


    // Allocate a mod_ppmd instance
    // returns a pointer to instance or 0 on error
    void* __stdcall ppmd_Alloc( void );

    // Free memory used by ppmd instance
    void __stdcall ppmd_Free( void* p );

    // Initialize a mod_ppmd instance
    // p = pointer from ppmd_Alloc
    // ppmd_order = ppmd model order (2..256), normal is something like 12
    // ppmd_memory = ppmd model memory size, in megabytes; (1..4000?)
    // ppmd_restore = action type on memory overflow: 0=reset statistics, 1=reduce the tree
    int __stdcall ppmd_Init( void* p, unsigned ppmd_order, unsigned ppmd_memory, unsigned ppmd_restore );

    // Free the memory allocated by ppmd_Init
    void __stdcall ppmd_Quit( void* p );

    // Get the current model size, in MBs
    int __stdcall ppmd_GetUsedMemory( void* p );

    // get the probability of next_bit==0, SCALE is the fixed-point multiplier for probability
    // output: zero's probability value in range 1..SCALE-1, SCALE/2=0.5
    int __stdcall ppmd_Predict( void* p, int SCALE );

    // pass the actual bit value to the model after (de)coding
    void __stdcall ppmd_Setbit( void* p, int bit );

  33. The Following 5 Users Say Thank You to Shelwien For This Useful Post:

    Bulat Ziganshin (18th January 2017),Mauro Vezzosi (17th January 2017),Mike (18th January 2017),mpais (17th January 2017),xinix (18th January 2017)

  34. #266
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,269
    Thanks
    200
    Thanked 985 Times in 511 Posts
    > do you think it could give good results using just, let's say, 128MB of memory (in the x86 version of EMMA)?

    That depends on file size actually... at order-12 it uses 14M for book1.

    Also note that you can actually use multiple instances of ppmd model here,
    with different parameters.

    > I remember that even with several pruning strategies,
    > it only got reasonable results using lots of memory,

    ppmd is pretty reasonable in that sense, and has multiple pruning strategies too,
    though I think r2 is not implemented in this version.

    And here it won't be the main model anyway, so circumstances are considerably different.
    I mean, for mixing with paq it could be better to use settings which would be inefficient
    for a standalone ppmd coder, like high model order with frequent memory resets, for example.

    > but if including mod_ppmd is reasonably straightforward I'll try including it in the next version.

    I hope that it is simple enough.
    Pay attention to the probability-of-zero thing though, I prefer this, while
    paq uses probability-of-one, so mod_ppmd output has to be inverted.
    Just in case, here's the paq wrapper for it:


    static ppmd_Model ppmd_12_256_1;

    void ppmdModel( Mixer& m ) {
      static int init_flag = 1;
      if( init_flag ) {
        // one-time initialization on the first call
        ppmd_12_256_1.Init(12,256,1,0);
        // ppmd_6_32_1.Init(6,32,1,0);
      }

      // mod_ppmd returns P(bit==0); paq mixes P(bit==1), hence 4096-p
      m.add( stretch(4096-ppmd_12_256_1.ppmd_Predict(4096,y)) );
      // m.add( stretch(4096-ppmd_6_32_1.ppmd_Predict(4096,y)) );

      init_flag=0;
    }
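To make the inversion step concrete, here is a small floating-point sketch. It is not the paq code: paq's `stretch` is an integer function working on 12-bit probabilities, while this version computes the same logistic quantity, ln(p/(1-p)), in doubles. The function names are made up for the illustration.

```cpp
#include <cmath>

// Convert P(bit==0), scaled to 1..scale-1, into P(bit==1) on the same scale.
int invertProbability(int p0, int scale) {
    return scale - p0;
}

// Floating-point stand-in for stretch(): ln(p / (1 - p)),
// with p given as a fraction of scale (0 < p < scale).
double stretchReal(int p, int scale) {
    const double x = static_cast<double>(p) / scale;
    return std::log(x / (1.0 - x));
}
```

For example, a mod_ppmd output of 1024 at SCALE=4096 (P(0)=0.25) becomes 3072 (P(1)=0.75) before stretching, and a probability of exactly one half stretches to 0.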


    Also, here's the processing loop from the pmd.cpp example:

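    rc_Init();
    for( ofs=0; ofs<_filesize; ofs++ ) {
      c = 0;
      if( ProcMode==0 ) c=get();             // encoding: read the next input byte

      for( i=8,j=0; i!=0; i-- ) {
        bit = ProcMode ? 0 : (c>>(i-1))&1;   // encoding: take bits MSB first
        p = ppmd_Predict(PM,SCALE);          // probability that the next bit is 0
        rc_BProcess( p, bit );               // range coder codes (or decodes) the bit
        j += j+bit;                          // j = 2*j + bit, rebuilds the byte
        ppmd_Setbit(PM,bit);                 // update the model with the actual bit
      }
      c = j;

      if( ProcMode==1 ) put(c);              // decoding: write the reconstructed byte
    }
    rc_Quit();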
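The bit handling in that loop can be checked in isolation. The following sketch only mirrors the two expressions `(c>>(i-1))&1` and `j += j+bit`; the helper names are invented, and the model and range coder are left out.

```cpp
#include <cstdint>
#include <vector>

// Split a byte into bits, most significant bit first,
// the same order the loop produces with (c>>(i-1))&1.
std::vector<int> byteToBits(uint8_t c) {
    std::vector<int> bits;
    for (int i = 8; i != 0; --i) {
        bits.push_back((c >> (i - 1)) & 1);
    }
    return bits;
}

// Rebuild the byte the way the loop does: j += j + bit is j = 2*j + bit.
uint8_t bitsToByte(const std::vector<int>& bits) {
    int j = 0;
    for (int bit : bits) {
        j += j + bit;
    }
    return static_cast<uint8_t>(j);
}
```

Because the decoder sees the bits in the same order, accumulating them with `j = 2*j + bit` yields the original byte after eight steps.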

  35. The Following 6 Users Say Thank You to Shelwien For This Useful Post:

    Bulat Ziganshin (18th January 2017),Mauro Vezzosi (18th January 2017),Mike (18th January 2017),mpais (17th January 2017),RamiroCruzo (18th January 2017),xinix (18th January 2017)

  36. #267
    Member
    Join Date
    Feb 2016
    Location
    Luxembourg
    Posts
    521
    Thanks
    198
    Thanked 745 Times in 302 Posts
    Thank you Shelwien, as soon as I have some time for coding I'll try it out and report the results.
    Could you also provide some info regarding its license?

  37. #268
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,269
    Thanks
    200
    Thanked 985 Times in 511 Posts
    Public domain, because that's the original ppmd license.

    Quote Originally Posted by ppmd documentation
    3. LEGAL stuff.

    You can not misattribute authorship on algorithm or code sources, You can
    not patent algorithm or its parts, all other things are allowed and
    welcomed.
    Dmitry Subbotin and Dmitry Shkarin have authorship rights on code sources.
    Dmitry Subbotin owns authorship rights on his variation of rangecoder
    algorithm and I own authorship rights on my variation of PPM algorithm.
    This variation is named PPMII (PPM with Information Inheritance). If You
    use PPMd, our authorship must be mentioned somewhere in documentation on
    your program.
    PPMonstr program is distributed for experiments and noncommercial use
    only.
    [...]
    Apr 7, 2000 var.F:
    Michael Schindler`s rangecoder implementation was replaced with
    'carryless rangecoder' by Dmitry Subbotin. Now, PPMd is pure public
    domain program;
    [...]
    AUTHOR SHALL NOT BE LIABLE FOR ANY DIRECT, INDIRECT, SPECIAL, INCIDENTAL,
    OR CONSEQUENTIAL DAMAGES ARISING OUT OF ANY USE OF THIS SOFTWARE. YOU USE
    THIS PROGRAM AT YOUR OWN RISK.
    I'd have to add though, that there's no need to mention Subbotin because his rangecoder
    is not used here in any way, but mod_ppmd is based on ppmd_sh, which is in turn
    a heavily refactored version of ppmd vJr1 made by me.

    Also please don't use mod_ppmd as a standalone coder - the actual coder is
    https://github.com/Shelwien/ppmd_sh/releases (ppmd_sh v9)
    http://compression.ru/ds/ppmdj1.rar (ppmd vJr1)

    mod_ppmd is just a slow wrapper for using ppmd prediction in bitwise coders.

    P.S. Btw, I tried compiling 32-bit modppmd.dll with VS/size_opt and got 13824 bytes, 7592 in .7z. But it's 1.5x slower.

  38. The Following 3 Users Say Thank You to Shelwien For This Useful Post:

    Bulat Ziganshin (18th January 2017),RamiroCruzo (18th January 2017),xinix (18th January 2017)

  39. #269
    Member
    Join Date
    Feb 2016
    Location
    Luxembourg
    Posts
    521
    Thanks
    198
    Thanked 745 Times in 302 Posts
    I've done some quick tests, using 128MB, order 12, and the option to reduce the tree when the memory is full. Results:

    Code:
     book1
    178.340 bytes, EMMA 0.1.20 x86
    177.154 bytes, EMMA 0.1.20 x86 + mod_ppmd
    
     book2
    115.514 bytes, EMMA 0.1.20 x86
    114.976 bytes, EMMA 0.1.20 x86 + mod_ppmd
    
     world95.txt
    343.561 bytes, EMMA 0.1.20 x86
    341.172 bytes, EMMA 0.1.20 x86 + mod_ppmd
    
     dickens
    1.910.432 bytes, EMMA 0.1.20 x86
    1.898.916 bytes, EMMA 0.1.20 x86 + mod_ppmd
    
     enwik6
    201.419 bytes, EMMA 0.1.20 x86
    200.192 bytes, EMMA 0.1.20 x86 + mod_ppmd
    I'll try other configurations to see if results improve. I'll probably allow selection of the options in EMMA, that way anyone can try tweaking it.
    Best regards

  40. #270
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,269
    Thanks
    200
    Thanked 985 Times in 511 Posts
    That sounds right. Please check decoding though :)
    Code:
    book1       178340   177154  0.665%   1186
    book2       115514   114976  0.465%    538
    world95     343561   341172  0.695%   2389
    dickens    1910432  1898916  0.602%  11516
    enwik6      201419   200192  0.609%   1227
    enwik8drt 16855079 16753949* 0.600% 101130
    
    * (1-x/16855079)*100=0.6
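For reference, the percentage column and the starred projection in the table can be reproduced with a few lines. This is just a sketch of the arithmetic, assuming the projection applies a flat 0.6% saving to the original size; the function names are invented.

```cpp
#include <cmath>
#include <cstdint>

// Saving in percent when going from `before` bytes to `after` bytes.
double savingPercent(int64_t before, int64_t after) {
    return (1.0 - static_cast<double>(after) / static_cast<double>(before)) * 100.0;
}

// Size expected if `before` bytes shrink by `percent` percent,
// rounded to the nearest byte.
int64_t projectedSize(int64_t before, double percent) {
    return std::llround(static_cast<double>(before) * (1.0 - percent / 100.0));
}
```

With the book1 figures, `savingPercent(178340, 177154)` is about 0.665, and `projectedSize(16855079, 0.6)` gives 16753949, the starred enwik8drt value.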
    cmix seems to only have this atm:
      AddByteModel(new PPMD(16, 1680, manager_.bit_context_));

    and paq8pxd18:
      ppmd_12_256_1.Init( 12+(x.clevel>8),(210<<(x.clevel>8))<<(x.clevel>13),1,0);
      ppmd_6_32_1.Init( 3<<(x.clevel>8),16<<(x.clevel>8),1,0);
