Some top-of-the-class open source algorithms and programs that can be reused:
Archive parsers and extractors:
*libzpaq
*7-Zip: code for about two dozen different archive types.
*The Unarchiver: handles "more formats than I can remember" and "stuff I don't even know what it is", in its author's words.
*QuickBMS: supports tons of file formats, archives, encryptions, compressions (over 500), obfuscations and other algorithms. Currently over 2100 plugins to open different archives, mostly from games but also normal packers like balz.
Specific compressors and recompressors:
Uncompressed audio:
*TTA: very fast while maintaining good compression.
*OptimFROG: stronger but slower.
*WavPack: the one used in zipx.
*FLAC: "the fastest and most widely supported lossless audio codec" according to its authors.
*ALAC.
JPG images:
*Lepton: fastest, weakest.
*packJPG: medium speed and strength, no support for arithmetic-coded JPEGs.
*Paq model: strongest, slowest, no support for progressive JPEGs.
MP3 audio:
*packMP3: ~15% savings.
MP2 audio:
*unpackMP2+grzip:m3 (as in fazip): ~19% savings, 2-3x faster than packMP3.
Deflate, bzip2 and LZW (GIF):
*precomp
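To make the recompression idea concrete, here is a minimal sketch in Python, assuming a plain zlib stream. It is not precomp's actual code. It only shows the principle: decompress the stream, then look for compressor settings that reproduce the original bytes exactly, so the raw data plus the settings can be stored and handed to a stronger codec. The function name `find_zlib_level` is made up for this example, and only the compression level is searched.

```python
import zlib

def find_zlib_level(stream: bytes):
    """Return (raw_data, level) if some zlib level reproduces `stream` exactly.

    Returns None when no level from 1 to 9 gives a byte-identical stream,
    in which case the stream would have to be stored untouched.
    """
    raw = zlib.decompress(stream)
    for level in range(1, 10):
        if zlib.compress(raw, level) == stream:
            return raw, level
    return None

# Demo: a stream produced by zlib itself can be reproduced bit for bit.
original = zlib.compress(b"some repetitive payload, " * 200, 6)
found = find_zlib_level(original)
if found is not None:
    raw, level = found
    print(f"stream of {len(original)} bytes expands to {len(raw)} bytes, level {level}")
```

Streams written by a different deflate implementation will usually fail this exact-match test, which is why real recompressors need more than a level search.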
zlib:
*Anti-z
Microsoft algorithms:
*wimlib (no working recompressor implemented yet, but the code is ready to use)
General purpose codecs:
Asymmetric:
*LZMA - Deprecated in favour of LZMA2
*LZMA2
*Radyx: LZMA2 with a more parallelizable match finder. It can fit a larger dictionary in the same RAM, so it helps with large archives on ~2-4 GB machines.
*CSArc: faster than LZMA2, still good compression and good filters too.
*BSC: A little stronger/slower than LZMA.
*ZSTD: very efficient on fast compression.
*LZO: hellishly fast compression.
*GLZA: good on text, not so much on binary.
Symmetric:
*MCM: fast CM (context mixing).
*GRZip: BWT-based.
*PPMd: good and fast on text, not so much on binary.
*paq* family: best ratios, worst speed.
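For a quick feel of the ratio differences between codec families, here is a small sketch using only what ships with Python: zlib (deflate), bz2 (BWT) and lzma (the default xz container uses LZMA2). None of the other codecs listed above are in the standard library, so this is only a rough stand-in, and the sample data is made up.

```python
import bz2
import lzma
import zlib

def compare_codecs(data: bytes) -> dict:
    """Return the compressed size in bytes for each standard-library codec."""
    return {
        "zlib (deflate)": len(zlib.compress(data, 9)),
        "bz2 (BWT)": len(bz2.compress(data, 9)),
        "xz (LZMA2)": len(lzma.compress(data, preset=9)),
    }

sample = b"the quick brown fox jumps over the lazy dog. " * 2000
for name, size in compare_codecs(sample).items():
    print(f"{name}: {len(sample)} -> {size} bytes")
```

Results on real files vary a lot with the data type, so any comparison should be run on the kind of content you actually plan to pack.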
Filters:
Dedupe:
*Per file: as in WIM or squashfs files.
*Bulat's rep: Very fast and efficient; memory hungry.
*zpaq's hash based: works best at large distances and can be reused in an incremental run.
*rzip
*zstd new implementation
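The hash based approach can be sketched in a few lines of Python. This is a simplification and not zpaq's code: it cuts the input into fixed-size chunks and keys them by SHA-1, while real tools typically pick chunk boundaries from the content. The names `dedupe` and `restore` are invented for the example. Passing the same `store` to a later call shows why the index can be reused in an incremental run.

```python
import hashlib

def dedupe(data: bytes, store=None, chunk_size: int = 4096):
    """Split data into chunks, keeping each distinct chunk only once.

    `store` maps SHA-1 digest -> chunk bytes and may come from an earlier run.
    Returns (store, recipe), where recipe is the ordered list of digests.
    """
    if store is None:
        store = {}
    recipe = []
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        digest = hashlib.sha1(chunk).digest()
        store.setdefault(digest, chunk)  # only the first copy is kept
        recipe.append(digest)
    return store, recipe

def restore(store, recipe) -> bytes:
    """Rebuild the original data from the chunk store and the recipe."""
    return b"".join(store[d] for d in recipe)

first = b"A" * 4096 + b"B" * 4096 + b"A" * 4096
store, recipe = dedupe(first)
print(len(recipe), "chunks referenced,", len(store), "stored")
```

Fixed-size chunks stop matching as soon as a single byte is inserted, which is the main reason real implementations use content-defined boundaries.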
Executables:
*BCJ2
*E8E9
*Dispack
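As an illustration of what an E8E9 filter does, here is a minimal Python sketch. It is not taken from any of the projects above. It rewrites the 32-bit relative operand after each x86 CALL (0xE8) or JMP (0xE9) opcode into an absolute target, so repeated calls to the same function turn into identical byte strings that a compressor can match. Real filters add heuristics to avoid touching bytes that are not actually instructions; this sketch converts every E8/E9 byte it scans.

```python
import struct

def _e8e9(buf: bytes, encode: bool) -> bytes:
    out = bytearray(buf)
    i = 0
    while i + 5 <= len(out):
        if out[i] in (0xE8, 0xE9):
            value = struct.unpack_from("<I", out, i + 1)[0]
            next_instr = i + 5  # relative offsets count from the next instruction
            if encode:
                value = (value + next_instr) & 0xFFFFFFFF  # relative -> absolute
            else:
                value = (value - next_instr) & 0xFFFFFFFF  # absolute -> relative
            struct.pack_into("<I", out, i + 1, value)
            i += 5  # skip the operand so encoder and decoder scan identically
        else:
            i += 1
    return bytes(out)

def e8e9_encode(buf: bytes) -> bytes:
    return _e8e9(buf, encode=True)

def e8e9_decode(buf: bytes) -> bytes:
    return _e8e9(buf, encode=False)

# Two CALLs to the same target (0x16) from different positions.
code = b"\x90\xe8\x10\x00\x00\x00\x90\x90\xe8\x09\x00\x00\x00"
filtered = e8e9_encode(code)
print(filtered[2:6] == filtered[9:13])
```

Because both directions skip the four operand bytes after a hit, the decoder visits exactly the same positions as the encoder and the transform is lossless.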
Delta:
*Bulat's
*Igor's
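For reference, the basic idea behind a delta filter can be sketched as follows. This is a plain fixed-stride byte delta written for illustration, not either of the implementations listed above: each byte is replaced by its difference from the byte `stride` positions earlier, which turns slowly changing data (audio samples, pixel channels, tables of numbers) into small, repetitive values. The `stride` parameter is assumed to be known in advance.

```python
def delta_encode(data: bytes, stride: int = 1) -> bytes:
    """Replace each byte by its difference (mod 256) from the byte `stride` back."""
    out = bytearray(data)
    for i in range(stride, len(data)):
        out[i] = (data[i] - data[i - stride]) & 0xFF
    return bytes(out)

def delta_decode(data: bytes, stride: int = 1) -> bytes:
    """Undo delta_encode by accumulating the differences in place."""
    out = bytearray(data)
    for i in range(stride, len(out)):
        out[i] = (out[i] + out[i - stride]) & 0xFF
    return bytes(out)

# Two interleaved channels that each grow by 1 per sample.
samples = bytes([1, 100, 2, 101, 3, 102])
print(list(delta_encode(samples, stride=2)))
```

With the right stride the output is mostly a run of identical small values, which the following codec handles far better than the raw ramp.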
Text:
*XWRT
*FA's lzp and dict
Data rippers (used to identify, for example, a JPG image embedded in an unknown container and process it with the corresponding algorithm):
*paq8px detection code for uncompressed audio and bitmaps, exe code, gif, jpeg and zlib
*precomp detection code for gif, jpeg, mp3, pdf bitmaps, deflate and bz2 streams
*extrJPG (from the author of packJPG)
*Dragon UnPACKer / Hyper Ripper: 23 formats supported: AVI, BIK, BMP, DDS, EMF, GIF, IFF, JPEG, MIDI, MOV, MPEG Audio, OGG, PNG, TGA, VOC, WAV, WMF, XM and a few more that are prone to false positives. Pretty slow if the container is unknown.
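To show what a ripper does at its simplest, here is a naive Python sketch that scans a blob for JPEG start-of-image (FF D8 FF) and end-of-image (FF D9) markers. It is not the detection code of any tool listed above, and the function name `find_jpeg_candidates` is made up. A pure signature scan like this is exactly the kind of thing that produces false positives; real detectors go on to parse the segment structure before trusting a hit.

```python
def find_jpeg_candidates(data: bytes):
    """Return (start, end) offsets of byte ranges that look like JPEG streams."""
    soi = b"\xff\xd8\xff"  # start-of-image marker plus the first byte of the next marker
    eoi = b"\xff\xd9"      # end-of-image marker
    hits = []
    pos = 0
    while True:
        start = data.find(soi, pos)
        if start == -1:
            break
        end = data.find(eoi, start + len(soi))
        if end == -1:
            break
        hits.append((start, end + len(eoi)))
        pos = end + len(eoi)
    return hits

# A fake container with a JPEG-like region between a header and a trailer.
container = b"HDR\x00" + b"\xff\xd8\xff\xe0fakejpegbody\xff\xd9" + b"trailer"
for start, end in find_jpeg_candidates(container):
    print(f"candidate at {start}..{end}, {end - start} bytes")
```

Each candidate range would then be validated and, if it holds up, handed to the matching recompressor while the rest of the container goes to a general purpose codec.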
Those are just a few. Feel free to add your thoughts.