I don't remember if I posted this before, but a lot of threads were deleted anyway:
https://github.com/weissenberger/gpuhd
https://www.researchgate.net/publica...coding_on_GPUs
It's interesting that they compare it to the Zstd CPU-based Huffman decoder.
The need for CUDA seems like a serious constraint in the middle of AMD's new dawn... I wonder what it buys you over using OpenCL or whatever the Windows equivalent is (DirectCompute?)