Thread: improving brotli

  1. #1
    Programmer
    Join Date
    May 2008
    Location
    PL
    Posts
    309
    Thanks
    68
    Thanked 173 Times in 64 Posts

    improving brotli

    I compressed 474 HTML files (1-300 KB each; 14,583,273 bytes in total) separately with the following compressors (total compressed size in bytes):
    2365252 durilca -m256 -t2 -o16
    2452696 paq8pxd15 -s5
    2663859 lpaq8 -5
    2834461 ppmd -o16 -m256
    2852153 brotli 11
    3245649 lzma -d22
    3610971 gzip -9
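
    A minimal sketch of this kind of measurement, assuming each file is compressed on its own and the compressed sizes are summed. The codecs here are Python standard-library stand-ins (zlib and lzma), not the exact tools and settings listed above, and the names `total_compressed_size`, `CODECS`, and `benchmark` are hypothetical.

```python
import lzma
import zlib
from pathlib import Path


def total_compressed_size(paths, compress):
    """Sum the compressed sizes of files that are compressed one at a time."""
    return sum(len(compress(Path(p).read_bytes())) for p in paths)


# Hypothetical codec table: stdlib stand-ins, not the tools from the list above.
CODECS = {
    "zlib level 9": lambda data: zlib.compress(data, 9),
    "lzma preset 9": lambda data: lzma.compress(data, preset=9),
}


def benchmark(directory, pattern="*.html"):
    """Return {codec name: total compressed bytes} for matching files."""
    paths = sorted(Path(directory).glob(pattern))
    return {name: total_compressed_size(paths, fn) for name, fn in CODECS.items()}
```

    Compressing files separately means no codec can reuse context from one file in the next, which is why a preprocessor with a built-in dictionary can help on small HTML files.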

    The first idea for improving brotli is a preprocessor called Lossless Static HTML Transform (LSHT; based on XWRT), which is run before each compressor. LSHT is described at http://www.informatica.si/index.php/...e/view/253/250. LSHT improves the results of all tested compressors except durilca. Brotli and LZMA are now almost equal:
    2433836 LSHT3 + durilca -m256 -t2 -o16
    2261884 LSHT3 + paq8pxd15 -s5
    2444510 LSHT3 + lpaq8 -5
    2545032 LSHT0 + ppmd -o16 -m256
    2743616 LSHT0 + brotli 11
    2751982 LSHT0 + lzma -d22
    2936360 LSHT0 + gzip -9
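
    To make the comparison easier to read, the relative change can be computed from the two lists above. This is only a sketch of the arithmetic; the byte totals are copied from the post, while `RESULTS` and `relative_change` are hypothetical names.

```python
# Totals in bytes, copied from the lists above: (without LSHT, with LSHT).
RESULTS = {
    "durilca": (2365252, 2433836),
    "paq8pxd15": (2452696, 2261884),
    "lpaq8": (2663859, 2444510),
    "ppmd": (2834461, 2545032),
    "brotli 11": (2852153, 2743616),
    "lzma": (3245649, 2751982),
    "gzip -9": (3610971, 2936360),
}


def relative_change(before, after):
    """Percentage change in size; negative means the output got smaller."""
    return 100.0 * (after - before) / before


if __name__ == "__main__":
    for name, (before, after) in RESULTS.items():
        print(f"{name:10s} {relative_change(before, after):+6.2f}%")
```

    durilca is the only entry with a positive change, which matches the statement that LSHT helps every tested compressor except durilca.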

    The second idea for improving brotli is Visually Lossless Static HTML Transform (VLSHT), described at http://link.springer.com/chapter/10....642-04409-0_23. VLSHT is lossy but visually lossless, which means that the HTML document layout is modified, but the document displayed in a browser looks exactly like the original. As expected, the results are better than with LSHT:
    2324611 VLSHT3 + durilca -m256 -o16
    2172780 VLSHT3 + paq8pxd15 -s5
    2329893 VLSHT3 + lpaq8 -5
    2426316 VLSHT0 + ppmd -o16 -m256
    2601133 VLSHT0 + brotli 11
    2608640 VLSHT0 + lzma -d22
    2777325 VLSHT0 + gzip -9

  2. #2
    Expert
    Matt Mahoney
    Join Date
    May 2008
    Location
    Melbourne, Florida, USA
    Posts
    3,257
    Thanks
    307
    Thanked 797 Times in 489 Posts
    Interesting. Is this published software? How does it compare with XWRT?

  3. #3
    Programmer
    AFAIR, LSHT is basically XWRT with added support for HTML end tag omission. I added this technique to XWRT 3.5, which is unpublished because I have not finished testing it. Version 3.4 already does better than brotli without XWRT:
    2852153 brotli 11
    2772903 xwrt_v3.4 -0 -f10000 +d -c -w + brotli 11
    2743616 LSHT0 + brotli 11
    Last edited by inikep; 14th November 2015 at 18:54.
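
    As an illustration of what end tag omission means, here is a toy sketch, not the XWRT or LSHT implementation. It drops `</li>` only where the HTML syntax allows the end tag to be left out: when the next markup is another `<li>` start tag or the end tag of the enclosing list. The function name is hypothetical, and a regex like this ignores comments, scripts, and `<pre>` content, so it is not safe for arbitrary pages. It is also one-way: a lossless transform would additionally have to record enough information to put the tags back.

```python
import re

# Matches </li> when the next markup (after optional whitespace) is another
# <li ...> start tag or the end tag of the enclosing <ul>/<ol>.
_OPTIONAL_LI_END = re.compile(
    r"</li>(?=\s*(?:<li[\s>]|</ul>|</ol>))", re.IGNORECASE
)


def omit_li_end_tags(html):
    """Remove </li> end tags that the HTML syntax allows to be omitted."""
    return _OPTIONAL_LI_END.sub("", html)


if __name__ == "__main__":
    print(omit_li_end_tags("<ul><li>a</li><li>b</li></ul>"))
```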

  4. #4
    Member
    Join Date
    Jun 2015
    Location
    Switzerland
    Posts
    926
    Thanks
    256
    Thanked 331 Times in 203 Posts
    What is the impact of these transforms on the decoding time?

  5. #5
    Programmer
    Quote: Originally posted by Jyrki Alakuijala
    What is the impact of these transforms on the decoding time?
    I will do in-memory experiments and I'll let you know.

  6. #6
    Programmer
    In my experiments decompression speed is 60-90 MB/s, but this is 8-year-old unoptimized code with many unnecessary options, so I think it can reach 200 MB/s.
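
    For reference, an in-memory throughput measurement of this kind might look like the sketch below. It is an assumption about the general method, not the code used for the numbers above: it times a decode function on a buffer already held in memory and reports the best of several runs in MB/s of decoded output. `decode_throughput_mb_s` is a hypothetical name, and zlib is used only as a stand-in decoder.

```python
import time
import zlib


def decode_throughput_mb_s(decode, compressed, repeats=5):
    """Best-of-N decoding speed in MB/s, measured on the decoded size."""
    best = float("inf")
    size = 0
    for _ in range(repeats):
        start = time.perf_counter()
        out = decode(compressed)
        elapsed = time.perf_counter() - start
        size = len(out)
        best = min(best, elapsed)
    # Guard against a zero reading from a coarse timer.
    return size / max(best, 1e-9) / 1e6


if __name__ == "__main__":
    payload = zlib.compress(b"<p>hello world</p>" * 200_000, 9)
    print(f"{decode_throughput_mb_s(zlib.decompress, payload):.1f} MB/s")
```

    Taking the best of several runs reduces noise from caches and scheduling; a real comparison would time the inverse transform and the decompressor both separately and together.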

  7. #7
    Programmer
    Quote: Originally posted by Matt Mahoney
    Is this published software?
    Unfinished XWRT 3.5 with the "+h" option for HTML optimization is here:
    https://github.com/inikep/XWRT/tree/XWRT3.5
    Last edited by inikep; 17th August 2016 at 21:21.
