Thread: PackRAW

  1. #1
    Member
    Join Date
    Feb 2016
    Location
    Luxembourg
    Posts
    523
    Thanks
    198
    Thanked 750 Times in 304 Posts

    PackRAW

    I've decided to create a simple utility to compress raw images from digital cameras, as discussed previously on the EMMA thread.

    I'm calling it PackRAW since that pretty much describes what it does (similar to PackJPG, PackMP3, etc.) and I honestly couldn't come up with a better name,
    so feel free to suggest another one. It is not an archiver: it compresses a single file at a time and doesn't store any filenames, dates, attributes, etc.
    Currently it only compresses Sony .ARW raw photos, though it already detects Panasonic .RW2 raw photos, which will be handled by a future model. It is based on EMMA, so the
    parsing was already reasonably stable, but if you have an image that it doesn't recognize and that you think should be considered valid, feel free to
    send it to me and I'll try to see what's going on with it.

    It is designed to trade compression ratio for speed, so don't expect it to beat EMMA. It is, however, about 25x faster for ARW encoding in my tests.
    It compresses the SqueezeChart Camera Raw ARW testset from 291.5MB down to 196.8MB, with an average processing time per image of 3.2s (on an i7 5820k@4.4GHz).
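
For readers who want the testset figures as a ratio, here is a small worked calculation. It uses only the two sizes quoted above; the function name is made up for illustration.

```python
def compression_summary(original_mb: float, packed_mb: float) -> str:
    """Summarise a size reduction as a ratio and as absolute savings."""
    ratio = packed_mb / original_mb   # fraction of the original size that remains
    saved = original_mb - packed_mb   # absolute savings in MB
    return f"{ratio:.1%} of original size, {saved:.1f} MB saved ({1 - ratio:.1%})"


print(compression_summary(291.5, 196.8))
# prints: 67.5% of original size, 94.7 MB saved (32.5%)
```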

    As always, best regards

    UPDATE: you can always download the latest version at https://goo.gl/w8WQW7
    Last edited by Bulat Ziganshin; 5th June 2017 at 22:10.

  2. Thanks (6):

    Bulat Ziganshin (28th May 2017),Gonzalo (1st June 2017),Mike (28th May 2017),schnaader (29th May 2017),Shelwien (29th May 2017),Stephan Busch (28th May 2017)

  3. #2
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,507
    Thanks
    742
    Thanked 665 Times in 359 Posts
    It compresses the SqueezeChart Camera Raw ARW testset from 291.5MB down to 196.8MB, with an average processing time per image of 3.2s (on an i7 5820k@4.4Ghz).
    sorry, how much is that in MB/s?

  4. #3
    Member
    Join Date
    Feb 2016
    Location
    Luxembourg
    Posts
    523
    Thanks
    198
    Thanked 750 Times in 304 Posts
    5 ± 0.2 MB/s with an i7 5820k@4.4GHz, DDR4-2400, 480GB SSD

    I thought about making it block based so I could multithread it, but it seems more advantageous to compress several files at once, since I'm guessing most people who
    care about using it probably have hundreds of such raw files.
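
The "several files at once" approach can be sketched roughly as follows. This is only an illustration of file-level parallelism: `zlib` stands in for the real compressor (it is not PackRAW's model), and the `.packed` suffix and both function names are made up for the example.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def pack_file(path: Path) -> tuple[Path, int, int]:
    """Compress one file with a stand-in codec (zlib, NOT PackRAW's model)."""
    data = path.read_bytes()
    packed = zlib.compress(data, 6)
    out = path.with_name(path.name + ".packed")  # hypothetical output name
    out.write_bytes(packed)
    return out, len(data), len(packed)


def pack_many(paths, workers: int = 4):
    """Compress several files concurrently, one file per worker at a time."""
    # Each file is an independent job, so no block splitting is needed.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(pack_file, paths))
```

Threads are enough in this sketch because CPython's zlib releases the GIL while compressing; a pure-Python model would need processes instead.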

  5. Thanks (2):

    Bulat Ziganshin (29th May 2017),Stephan Busch (28th May 2017)

  6. #4
    Programmer schnaader's Avatar
    Join Date
    May 2008
    Location
    Hessen, Germany
    Posts
    566
    Thanks
    216
    Thanked 200 Times in 93 Posts
    Nice tool, good job!

    On my machine (i5 520M, 2 x 2.4GHz, no SSD), speed is about 2.3 MB/s, so it seems to scale with CPU without any other big bottlenecks - though when processing 2 files at once, speed goes down a bit to (2 x) 1.9 MB/s.

    A minor thing I noticed: when unpacking, existing files are overwritten without confirmation, and most people won't like this. Usually a tool asks for confirmation or renames the output file in that case. Personally, I like the solution the paq family uses: if the file exists, unpack into memory, compare the result to the existing file, and display "(not) identical".
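
A minimal sketch of that compare-instead-of-overwrite behaviour, assuming the decoder can produce the whole output in memory. `zlib.decompress` is only a stand-in for the real decoder, and the function name and return strings are made up for illustration.

```python
import zlib
from pathlib import Path


def unpack_or_verify(packed: bytes, out_path: Path) -> str:
    """Extract to out_path, or verify against it if it already exists."""
    data = zlib.decompress(packed)  # stand-in for the real decoder
    if out_path.exists():
        # Never overwrite: only report whether the contents match.
        same = out_path.read_bytes() == data
        return "identical" if same else "not identical"
    out_path.write_bytes(data)
    return "extracted"
```

A side benefit of this design is that running the unpacker next to the originals doubles as a round-trip check.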

    Also, you might consider adding JPG recompression for the embedded thumbnails. As far as I know, nobody has yet tried using a downscaled version of the original image to improve JPG thumbnail compression (although the idea comes up often), so this might be a good time to test it.

    Could you elaborate on the open source situation? My favorite solution would be an open source version with a permissive license like BSD, Apache, MIT hosted on GitHub. On the other hand, I'm aware that the legal situation might be tricky because of the proprietary formats and possibly involved reverse engineering.
    http://schnaader.info
    Damn kids. They're all alike.

  7. #5
    Member
    Join Date
    Feb 2016
    Location
    Luxembourg
    Posts
    523
    Thanks
    198
    Thanked 750 Times in 304 Posts
    Quote Originally Posted by schnaader View Post
    Nice tool, good job!

    On my machine (i5 520M, 2 x 2.4GHz, no SSD), speed is about 2.3 MB/s, so it seems to scale with CPU without any other big bottlenecks - though when processing 2 files at once, speed goes down a bit to (2 x) 1.9 MB/s.
    I can try to make it faster, at the expense of even more compression ratio. I'm guessing the drop you saw is also related to cache sizes: it uses about 5.5MB of memory in total, though only about 4MB of that is for the model itself. I'll also try to keep future models at about the same order of magnitude in speed, though the RW2 format is somewhat more obfuscated (thus requiring more processing). I don't really know much about digital photography, so I don't know whether these formats are widely used, but at least a few users here had requested something like this.
    If the compression performance for ARW images is deemed acceptable, I'll "freeze" the model, so that files created with these initial releases may still be unpacked by future versions.

    A minor thing I noticed: when unpacking, existing files are overwritten without confirmation, and most people won't like this. Usually a tool asks for confirmation or renames the output file in that case. Personally, I like the solution the paq family uses: if the file exists, unpack into memory, compare the result to the existing file, and display "(not) identical".
    Sure, I'll add an overwrite confirmation request.

    Also, you might consider adding JPG recompression for the embedded thumbnails. As far as I know, nobody tried to use a downscaled version of the original image to improve the JPG thumbnail compression so far (although the idea comes up often), this might be a good time to test this.
    It wouldn't be worth the effort. With EMMA you can compress these raw photos and many other formats with much stronger models, and its JPEG model is also quite good, yet using it usually gives just a further 200KB reduction in file size. So even if using the raw data to enhance the JPEG model gave a 10% relative improvement, it still wouldn't be worth the added complexity and speed degradation. It is something I've considered for EMMA, but since it is a streaming compressor, and the thumbnails usually precede the raw data, it can't be done there either.

    Could you elaborate on the open source situation? My favorite solution would be an open source version with a permissive license like BSD, Apache, MIT hosted on GitHub. On the other hand, I'm aware that the legal situation might be tricky because of the proprietary formats and possibly involved reverse engineering.
    Well, the info I gathered about these formats came almost exclusively from checking the dcraw source code (an incredible tool, by the way), so I don't really know what the legal situation for something like this is. There was once a commercial raw compressor, Rawzor, but I don't know if they had requested legal advice/permission from the manufacturers. In any case, I don't have any commercial aspirations for it, it's just something to give back to the community.

  8. #6
    Programmer Bulat Ziganshin's Avatar
    Join Date
    Mar 2007
    Location
    Uzbekistan
    Posts
    4,507
    Thanks
    742
    Thanked 665 Times in 359 Posts
    Quote Originally Posted by mpais View Post
    5MB/s±0.2 with an i7 5820k@4.4Ghz, DDR4-2400, 480GB SSD
    thank you. i just wanted to get an overall understanding of how fast this sort of thing is

    I thought about making it block based so I could multithread it, but it seems more advantageous to compress several files at once, since I'm guessing most people who care about using it probably have hundreds of such raw files.
    i think the same

    i also join the request to make it OSS and publish on github

    On my machine (i5 520M, 2 x 2.4GHz, no SSD), speed is about 2.3 MB/s, so it seems to scale with CPU without any other big bottlenecks - though when processing 2 files at once, speed goes down a bit to (2 x) 1.9 MB/s.
    your cpu has turbo speed of 2.9 ghz, so probably it runs at higher frequency when only one core is in work

  9. #7
    Tester
    Stephan Busch's Avatar
    Join Date
    May 2008
    Location
    Bremen, Germany
    Posts
    876
    Thanks
    472
    Thanked 175 Times in 85 Posts
    On my system PackRAW runs at the same speed as 7-Zip (1700 kbps) with LZMA:lc8:lp4:pb4:mt2 while offering about 10% better compression.
    Thank you very much, Marcio. PackRAW is very, very useful to me.
    Rawzor is outdated and, despite many promises from its author, I doubt that there will be a new version,
    which raises the question of which tool to use for compressing camera raw formats that Rawzor does not support (including dng, x3f, rw2 and Fuji raf).

    I would like to have support for uncompressed Sony .arw as well - compression can be turned off on Sony a7 and later
    and those .arw seem to be compressed better by using delta:4. If you plan to add support for them,
    PackRAW will be the one and only solution for efficiently compressing .arw format.
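
For reference, a byte-wise delta filter with distance 4 can be sketched as below. This is a generic illustration of the technique, not 7-Zip's or PackRAW's code. A plausible reason distance 4 helps (an assumption, not something verified here) is that with 16-bit samples in a Bayer layout, the previous sample of the same colour sits 2 samples = 4 bytes earlier.

```python
def delta_encode(data: bytes, dist: int = 4) -> bytes:
    """Replace each byte with its difference from the byte `dist` positions earlier."""
    out = bytearray(data)  # the first `dist` bytes are kept as-is
    for i in range(dist, len(data)):
        out[i] = (data[i] - data[i - dist]) & 0xFF
    return bytes(out)


def delta_decode(data: bytes, dist: int = 4) -> bytes:
    """Invert delta_encode by accumulating the differences in place."""
    out = bytearray(data)
    for i in range(dist, len(out)):
        # out[i - dist] has already been restored at this point.
        out[i] = (out[i] + out[i - dist]) & 0xFF
    return bytes(out)


print(list(delta_encode(bytes([1, 2, 3, 4, 5, 6, 7, 8]))))
# prints: [1, 2, 3, 4, 4, 4, 4, 4]
```

The filter itself saves nothing; it only turns slowly varying samples into small, repetitive values that a general-purpose compressor such as LZMA can model more easily.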

    If dcraw is not enough, there are also decoders for almost all camera formats (including dng, x3f, rw2 and Fuji raf) in Klaus Post RawSpeed:
    https://github.com/klauspost/rawspee...velop/RawSpeed

  10. Thanks (2):

    Bulat Ziganshin (31st May 2017),Shelwien (31st May 2017)

  11. #8
    Member
    Join Date
    Feb 2016
    Location
    Luxembourg
    Posts
    523
    Thanks
    198
    Thanked 750 Times in 304 Posts
    Quote Originally Posted by Stephan Busch View Post
    On my system PackRAW runs at the same speed as 7-Zip (1700 kbps) with LZMA:lc8:lp4:pb4:mt2 while offering about 10% better compression.
    Thank you very much, Marcio. PackRAW is very, very useful to me.
    Rawzor is outdated and, despite many promises from its author, I doubt that there will be a new version,
    which raises the question of which tool to use for compressing camera raw formats that Rawzor does not support (including dng, x3f, rw2 and Fuji raf).
    I've already made a few minor changes to the ARW model: total memory usage is down to 3.6MB (from 5.5MB), so that should help speed on CPUs with smaller
    caches. It has also improved compression on your testset by about 0.5% (1.1MB). What would be more important in your opinion, speed or compression ratio?
    RW2 and RAF I plan on supporting, since they are already supported by EMMA, but as I've said, I don't really know what formats are more popular. I know
    that some already use lossless JPEG-LS compression, so a fast packer with good compression ratio is out of the question.

    I would like to have support for uncompressed Sony .arw as well - compression can be turned off on Sony a7 and later
    and those .arw seem to be compressed better by using delta:4. If you plan to add support for them,
    PackRAW will be the one and only solution for efficiently compressing .arw format.
    Well, if they truly are uncompressed, my generic model should be able to compress them, I'd just have to know how to parse them.

    If dcraw is not enough, there are also decoders for almost all camera formats (including dng, x3f, rw2 and Fuji raf) in Klaus Post RawSpeed:
    https://github.com/klauspost/rawspee...velop/RawSpeed
    I'm aware, I've also checked it out when developing the parsers for EMMA.
    The next model will be for RW2, since some users here have requested it.
    After that I'll try to make a simplified version of my generic image model, which will hopefully allow PackRAW to support as many formats as EMMA.

  12. Thanks:

    Stephan Busch (31st May 2017)

  13. #9
    Tester
    Stephan Busch's Avatar
    Join Date
    May 2008
    Location
    Bremen, Germany
    Posts
    876
    Thanks
    472
    Thanked 175 Times in 85 Posts
    When it comes to camera raw formats, the most popular formats are:

    *.arw (Sony) - either compressed or uncompressed
    *.cr2 (Canon) - always compressed (as far as I know)
    *.dng (Pentax, Ricoh, Leica..) - either compressed or uncompressed
    *.nef (Nikon) - either compressed or uncompressed
    *.orf (Olympus) - either compressed or uncompressed

    also popular are:

    *.raf (Fuji) - always uncompressed, sometimes a delta filter helps
    *.rw2 (Panasonic) - interestingly encoded, since lzma:lc8:lp4:pb4 also helps
    *.srw (Samsung) - always uncompressed, sometimes a delta filter helps
    *.x3f (Polaroid, Sigma) - either compressed or uncompressed

    seldom used formats are:

    *.3fr/.fff (Hasselblad) - always compressed
    *.ari (Arri)
    *.eip (PhaseOne)

    formats not used anymore are:

    *.bay (Casio)
    *.dcs .dcr .drf .k25 .kdc (Kodak)
    *.erf (Epson)
    *.mdc (Minolta, Agfa)
    *.mef (Mamiya)
    *.mos (Leaf)
    *.mrw (Minolta, Konica Minolta)
    *.pef .ptx (Pentax)
    *.srf .sr2 (Sony)

    According to Klaus Post:
    DNG is simple Lossless JPEG. It is the same as my algorithm (predict from pixel of the left), predict leftmost pixel downwards.

    Each residual value is encoded with a Huffman tree, which outputs a value from 1 to 16. This number of bits is then read. This immediate value is converted to a signed value (similar to zigzag), which is added to the pixel on the left. To accommodate the CFA layout, it predicts from 2 pixels to the left/up.

    This is simple, but as you mention pretty efficient. Canon/Nikon/Pentax use the same algorithm.

    Some cameras have variations, but most are rather simple. Sigma for instance has a bit every 16 pixels to select left or upward predictors and adjust bit length every 4 pixels.
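
The scheme described in the quote can be sketched as follows. This is a simplified illustration, not code from RawSpeed, dcraw or PackRAW: the Huffman stage is skipped, so each residual is already given as a (bit length, raw bits) pair; the sign conversion follows the usual JPEG convention; a bit length of 0 is allowed to mean "no change"; and the fallback predictor `default` is an assumption.

```python
def extend(bits: int, n: int) -> int:
    """Convert an n-bit unsigned value to a signed residual (JPEG convention)."""
    if n == 0:
        return 0
    if bits < (1 << (n - 1)):        # leading bit 0 means a negative value
        return bits - (1 << n) + 1
    return bits


def encode_residual(diff: int) -> tuple[int, int]:
    """Inverse of extend: split a signed residual into (bit length, raw bits)."""
    n = abs(diff).bit_length()
    bits = diff if diff >= 0 else diff + (1 << n) - 1
    return n, bits


def _predict(row, x, above2, default):
    # Same-colour neighbour in a CFA row is 2 pixels to the left; the first two
    # pixels fall back to the row two lines up, or to `default` (assumption).
    if x >= 2:
        return row[x - 2]
    return above2[x] if above2 is not None else default


def encode_row(row, above2=None, default=0):
    return [encode_residual(px - _predict(row, x, above2, default))
            for x, px in enumerate(row)]


def decode_row(residuals, above2=None, default=0):
    row = []
    for x, (n, bits) in enumerate(residuals):
        row.append(_predict(row, x, above2, default) + extend(bits, n))
    return row
```

Because neighbouring same-colour pixels are usually close in value, most residuals need only a few bits, which is where the compression comes from.
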
    In my opinion, the most widespread raw format is .dng, followed by .cr2, .nef, .arw and .orf. Those will address and attract the most users.
    I would also vote for not compressing embedded JPEGs - if you decide later to offer e.g. a viewing plugin (XnView, IrfanView), those embedded JPEGs can easily be used for previews.

  14. Thanks (2):

    Bulat Ziganshin (31st May 2017),mpais (31st May 2017)

  15. #10
    Member
    Join Date
    Feb 2016
    Location
    Luxembourg
    Posts
    523
    Thanks
    198
    Thanked 750 Times in 304 Posts
    PackRAW 0.0.2, available at https://goo.gl/w8WQW7

    Code:
    Changes:
    - Tweaked the ARW model:
        - smaller memory footprint (total memory usage is now about 3.6MB, down from 5.5MB)
        - slightly better compression, about 0.5% improvement (1.1MB) on the SqueezeChart testset
    - New model for Panasonic RW2 photos
    - Confirmation request before overwriting files
    - Bug fixes
    Panasonic raw photos (.RW2 and some .RAW) can now be packed. The model gets about 6% worse compression than the one in EMMA,
    but is 20x faster in my tests (about 5MB/s, just slightly slower than the ARW model). Total memory usage is only 1.6MB.
    Last edited by mpais; 4th June 2017 at 00:02.

  16. Thanks (4):

    Bulat Ziganshin (3rd June 2017),load (3rd June 2017),schnaader (4th June 2017),Stephan Busch (3rd June 2017)

  17. #11
    Member
    Join Date
    Feb 2016
    Location
    Luxembourg
    Posts
    523
    Thanks
    198
    Thanked 750 Times in 304 Posts
    PackRAW 0.0.3, available at https://goo.gl/w8WQW7

    Code:
    Changes:
    - Rewritten the compressed ARW model:
        - smaller memory footprint (total memory usage is now about 2.7MB)
        - slightly better compression, about 0.5% improvement (1MB) on the SqueezeChart testset
        - 40% faster on average than v0.0.2
    - New model for uncompressed raw images, available for:
        - Sony .ARW
        - Panasonic/Leica .RAW
        - Olympus .ORF (uncompressed, non-interlaced)
        - Pentax .PEF (uncompressed)
        - Mamiya .MEF
        - Kodak .KDC
        - Leaf Aptus .MOS (uncompressed)
        - Epson .ERF
        - TIFF (uncompressed, 8bpc or higher, up to 4 channels, single page)
    - Original filename is now stored in the packed file
    Could a mod please update the first post with the download link?

    Best regards

  18. Thanks (3):

    Bulat Ziganshin (5th June 2017),Mike (5th June 2017),Stephan Busch (5th June 2017)

  19. #12
    Member
    Join Date
    Feb 2016
    Location
    Luxembourg
    Posts
    523
    Thanks
    198
    Thanked 750 Times in 304 Posts
    PackRAW 0.0.4, available at https://goo.gl/w8WQW7

    Code:
    Changes:
    - Improved detection
    - Parser for Fujifilm RAF images
    - Will now compress all detected images in a file
    - Improved the speed of all the models
    
    Supported formats:
    - Sony .SR2 and .ARW (v2.x only)
    - Panasonic/Leica .RW2 and .RAW
    - Olympus .ORF (uncompressed)
    - Fujifilm .RAF (uncompressed)
    - Adobe .DNG (uncompressed)
    - Pentax .PEF (uncompressed)
    - Mamiya .MEF
    - Kodak .KDC
    - Leaf Aptus .MOS (uncompressed)
    - Epson .ERF
    - TIFF (uncompressed, 8bpc or higher, up to 4 channels, single/multi-page)
    PackRAW can now detect even more formats, and compress them faster. On my test system (i7 5820k@4.4GHz) it compresses
    RW2 files at about 10MB/s, compressed ARW files at about 13MB/s and uncompressed images at about 15MB/s.
    It will also compress all the images it finds in the file (such as embedded uncompressed thumbnails), up to 256 of them (an arbitrary limit that can be changed if needed).

    I'm reasonably happy with the current status, and I'm thinking of freezing the models and the bitstream format. Does anyone have any suggestions
    for improvements that may require major changes?

  20. Thanks (2):

    Bulat Ziganshin (11th June 2017),Stephan Busch (11th June 2017)

  21. #13
    Tester
    Stephan Busch's Avatar
    Join Date
    May 2008
    Location
    Bremen, Germany
    Posts
    876
    Thanks
    472
    Thanked 175 Times in 85 Posts
    Wow, that's a huge speed increase compared to the previous version. Excellent work.
    This version is now faster than Rawzor and comes close to Rawzor compression;
    on Panasonic .rw2 and Panasonic .raw it outperforms Rawzor compression;
    on many formats like Sony .arw and Adobe .DNG it outperforms all archivers/lossless compressors I know (except Emma).

    I would suggest:

    * support of wildcards
    * progress indicator
    * display more info about input image (maybe bit depth, ratio)
    * alphabetically sort the list of supported formats
    * skip the first thumbnail and also the metadata on compression (if a plugin for e.g. XnView is planned)
    Last edited by Stephan Busch; 11th June 2017 at 21:35.

  22. #14
    Member
    Join Date
    Feb 2016
    Location
    Luxembourg
    Posts
    523
    Thanks
    198
    Thanked 750 Times in 304 Posts
    PackRAW only compresses the raw pixel data, all other data (JPEG thumbnails, metadata, etc) is simply stored.
    I'm actually considering a final change to the header, to include an offset for a JPEG thumbnail. This way any
    viewer can simply read this field to know where it can find the JPEG preview. Since my parsers already find most
    embedded JPEGs and simply ignore them, it shouldn't be too hard to choose the largest to use for this.

    I'll add wildcard support only once it reaches a more stable development phase, since wildcards would be more useful
    together with multithreading, to allow it to compress all the photos in a directory using N CPU cores. And it's hard enough
    debugging as it is; throwing threads into the mix this early would make it a nightmare.

    The rest are just cosmetic changes, I'll see about adding them to the next release.
    Thanks again for your help Stephan.

  23. Thanks:

    Stephan Busch (12th June 2017)

  24. #15
    Member
    Join Date
    Feb 2016
    Location
    Luxembourg
    Posts
    523
    Thanks
    198
    Thanked 750 Times in 304 Posts
    PackRAW 0.1, available at https://goo.gl/w8WQW7

    Code:
    Changes:
    - Improved detection and added support for more formats
    - Tweaked a few settings in the uncompressed model, should get slightly better compression on average
    - PRW files now include a 4-byte offset to a JPEG thumbnail, if available. [found at offset 3, little-endian]
    
    Supported formats:
    - Adobe .DNG (uncompressed)
    - ARRI .ARI
    - Canon .CRW (no header, uncompressed)
    - Epson .ERF
    - GitUp Git2 .RAW
    - Hasselblad .3FR (uncompressed)
    - Kodak .KDC/.RAW (uncompressed)
    - Leaf Aptus .MOS (uncompressed)
    - Mamiya .MEF
    - Minolta .MRW
    - Nikon .NRW/.NEF (uncompressed)
    - Olympus .ORF (uncompressed)
    - Panasonic/Leica .RW2 and .RAW
    - Pentax .PEF/.RAW (uncompressed)
    - Samsung .SRW (uncompressed)
    - SJCAM M20/SJ5000x Elite .RAW
    - Sony .SR2/.ARW (v2.x only)
    - TIFF (uncompressed, 8bpc or higher, up to 4 channels, single page)
    - Xiaomi Yi .RAW
    This release is mostly about finalizing the bitstream format and bug fixing. It has improved support for many
    cameras that require special parameters. I'd like to publicly thank Stephan Busch for all the help with testing
    and finding problematic samples.
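
For viewer authors, reading the thumbnail field mentioned in the changelog could look roughly like this. Only the field position (offset 3), width (4 bytes) and byte order (little-endian) come from the changelog; treating the value as an unsigned absolute file offset, treating 0 as "no thumbnail", and checking for the JPEG SOI marker are assumptions, and both function names are made up.

```python
import struct
from typing import Optional


def read_thumbnail_offset(prw: bytes) -> int:
    """Read the 4-byte little-endian thumbnail offset stored at byte 3."""
    if len(prw) < 7:
        raise ValueError("file too short to hold the offset field")
    (offset,) = struct.unpack_from("<I", prw, 3)
    return offset


def find_thumbnail(prw: bytes) -> Optional[int]:
    """Return the thumbnail position, or None if it looks absent or invalid."""
    offset = read_thumbnail_offset(prw)
    # Assumption: 0 means "no thumbnail available".
    if offset == 0 or offset + 2 > len(prw):
        return None
    # Sanity check: a JPEG stream starts with the SOI marker FF D8.
    if prw[offset:offset + 2] != b"\xff\xd8":
        return None
    return offset
```

A viewer would then hand the bytes starting at that position to its JPEG decoder.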

  25. Thanks (5):

    Bulat Ziganshin (26th June 2017),load (26th June 2017),Mike (28th June 2017),schnaader (25th June 2017),Stephan Busch (25th June 2017)

  26. #16
    Member
    Join Date
    Feb 2016
    Location
    Luxembourg
    Posts
    523
    Thanks
    198
    Thanked 750 Times in 304 Posts
    PackRAW 0.2, available at https://goo.gl/w8WQW7

    Code:
    Changes:
    - New model for 24bpp uncompressed images
    - Fixed a bug in the raw model that could cause data corruption when decompressing
    
    Supported formats:
    - Adobe .DNG (uncompressed)
    - ARRI .ARI
    - Canon .CRW (no header, uncompressed)
    - Epson .ERF
    - Fujifilm .RAF (uncompressed)
    - GitUp Git2 .RAW
    - Hasselblad .3FR (uncompressed)
    - Kodak .KDC/.RAW (uncompressed)
    - Leaf Aptus .MOS (uncompressed)
    - Mamiya .MEF
    - Minolta .MRW
    - Nikon .NRW/.NEF (uncompressed)
    - Olympus .ORF (uncompressed)
    - Panasonic/Leica .RW2/.RWL and .RAW
    - Pentax .PEF/.RAW (uncompressed)
    - Samsung .SRW (uncompressed)
    - SJCAM M20/SJ5000x Elite .RAW
    - Sony .SR2/.ARW (v2.x only)
    - TIFF (uncompressed, 8bpc or higher, up to 4 channels, single/multi-page)
    - Xiaomi Yi .RAW
    I've created a model for RGB 24bpp images, since some raw formats use those for the embedded thumbnails.
    In those files, compression should now be better and faster.

    I've also found a bug in the decompression code of the raw model for v0.1, which caused it to sometimes be unable
    to correctly decode a file. This is now fixed, and v0.2 will happily decompress those files created with v0.1.

    I really need more samples from unsupported cameras, so if anyone has uncompressed raws that aren't supported
    that they'd like to share, I'd really appreciate it.

    Best regards

  27. Thanks:

    Stephan Busch (9th July 2017)

  28. #17
    Member
    Join Date
    Mar 2016
    Location
    Croatia
    Posts
    183
    Thanks
    77
    Thanked 12 Times in 11 Posts
    You can PM me your email so I can send you a few Panasonic GH2 raw images, in case you want to test.

  29. #18
    Member
    Join Date
    Feb 2016
    Location
    Luxembourg
    Posts
    523
    Thanks
    198
    Thanked 750 Times in 304 Posts
    Thank you dado023, I already have samples for that camera (from photographyblog.com), it should be detected correctly.
    Do you have any photos with it that PackRAW fails to compress?

  30. #19
    Member
    Join Date
    Mar 2016
    Location
    Croatia
    Posts
    183
    Thanks
    77
    Thanked 12 Times in 11 Posts
    To be honest, I don't use PackRAW, but I am glad such a tool is being developed, so I wanted to contribute by providing test samples.

  31. #20
    Member
    Join Date
    Feb 2016
    Location
    Luxembourg
    Posts
    523
    Thanks
    198
    Thanked 750 Times in 304 Posts

    PackRAW 0.3

    A new version is available at https://goo.gl/w8WQW7

    I've improved the parsing to find more embedded JPEG thumbnails, and have created thumbnail viewing plugins for XnView(MP), both for x86 and x64.

    Best regards

  32. Thanks (3):

    maadjordan (27th August 2017),Stephan Busch (27th August 2017),zubzer0 (27th August 2017)

  33. #21
    Member
    Join Date
    May 2008
    Location
    Kuwait
    Posts
    334
    Thanks
    36
    Thanked 36 Times in 21 Posts
    I am getting a zero-size file. Can you check your upload?

  34. #22
    Member
    Join Date
    Feb 2016
    Location
    Luxembourg
    Posts
    523
    Thanks
    198
    Thanked 750 Times in 304 Posts
    @maadjordan
    Done, thanks for the warning.
