This is not exactly a data compression topic, but there are related problems here, such as splitting lossy video compression of a single file into a few separate streams, such that any subset of them allows decoding the video, and the more streams we receive, the better the quality.

Probably many of you have met fountain codes, also referred to as rateless erasure codes: encoding a message as some number of packets, such that any sufficiently large subset of them allows decoding the message (a smaller subset is usually useless). Older approaches were usually based on the Reed-Solomon idea: the coefficients of a degree-d polynomial can be inferred from its values at any d+1 points. In 2002, Luby introduced the more efficient LT codes, which build packets as XORs of subsets of data blocks, chosen pseudorandomly with an optimized probability distribution over subset sizes.
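To make the LT idea concrete, here is a toy sketch (my own illustration, not Luby's actual construction): for simplicity the subset size is drawn uniformly, whereas real LT codes use the robust soliton distribution, which is essential for their efficiency. Decoding uses the standard peeling approach: repeatedly find a packet with exactly one unknown block, recover it, and substitute it into the rest.

```python
import random

def lt_encode(blocks, n_packets, seed=0):
    """Encode data blocks into LT-style packets: each packet is the XOR of a
    pseudorandomly chosen subset of blocks.  Toy version: the subset size
    (degree) is uniform; real LT codes use the robust soliton distribution."""
    k = len(blocks)
    rng = random.Random(seed)
    packets = []
    for _ in range(n_packets):
        degree = rng.randint(1, k)
        subset = rng.sample(range(k), degree)
        value = 0
        for i in subset:
            value ^= blocks[i]
        packets.append((frozenset(subset), value))
    return packets

def lt_decode(packets, k):
    """Peeling decoder: repeatedly find a packet covering exactly one unknown
    block, recover that block, and XOR it out of the remaining packets."""
    known = {}
    packets = [(set(s), v) for s, v in packets]
    progress = True
    while progress and len(known) < k:
        progress = False
        for subset, value in packets:
            unknown = subset - known.keys()
            if len(unknown) == 1:
                (i,) = unknown
                v = value
                for j in subset - {i}:
                    v ^= known[j]
                known[i] = v
                progress = True
    if len(known) < k:
        return None  # not enough (or not enough useful) packets received
    return [known[i] for i in range(k)]
```

Note that a received subset that is too small simply leaves the peeling stuck, which is the "smaller subset is usually useless" behavior above.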

They allow us to avoid the costly problem of retransmission: we just wait until we obtain a sufficiently large set of packets. However, there is usually another issue: noise damaging our packets (making them mostly useless for pure fountain codes). We can protect against it by adding forward error correction (FEC): attaching some redundancy to each packet, which allows it to survive some noise level. However, the actual noise pattern is stochastic, so sometimes we use more redundancy than necessary, and sometimes not enough, losing the whole packet.
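To see this over-/under-provisioning numerically, here is a sketch (the block length n = 1000, correction capability t = 60, and noise level eps = 0.05 are made-up numbers) computing the chance that a packet's noise exceeds what its FEC can correct:

```python
from math import comb

def p_uncorrectable(n, t, eps):
    """Probability that more than t of n bits are flipped by BSC(eps),
    i.e. the noise exceeds the FEC's correction capability t."""
    return 1.0 - sum(comb(n, k) * eps**k * (1 - eps)**(n - k)
                     for k in range(t + 1))

# With eps = 0.05 we expect ~50 flips per 1000 bits; redundancy sized for
# t = 60 flips is usually more than needed, yet the packet is still
# occasionally lost anyway:
loss = p_uncorrectable(1000, 60, 0.05)
```

So for a fixed t we pay for t-flip protection on every packet, while a few percent of packets still exceed it and are discarded entirely.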

This suboptimality can be addressed by extending the concept of rateless erasure codes to also tolerate some noise in the received packets. Assuming a binary symmetric channel (BSC), in which each bit is flipped (0 <-> 1) independently with probability epsilon, the theoretical informational content of such a damaged packet is 1 - h(epsilon) bits/position, where h(p) = -p lg(p) - (1-p) lg(1-p) is the Shannon entropy. So theoretically we could perform a joint correction/reconstruction of such packets whenever the sum of "1 - h(epsilon)" over the received packets exceeds the message size.
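For intuition, a quick numeric check of this bound (the packet and message sizes here are made-up example numbers):

```python
from math import ceil, log2

def h(p):
    """Binary Shannon entropy h(p) = -p lg(p) - (1-p) lg(1-p), in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

eps = 0.05              # BSC bit-flip probability (example value)
per_bit = 1 - h(eps)    # informational content of each received bit, ~0.714

packet_bits = 1000      # example packet size
message_bits = 10000    # example message size

# Theoretical minimum number of damaged packets whose total information
# (1 - h(eps) bits per position) exceeds the message size:
min_packets = ceil(message_bits / (packet_bits * per_bit))
print(min_packets)  # 15 noisy packets, vs. 10 noiseless ones
```

That is, at this noise level the joint bound costs only ~40% more packets, instead of discarding every packet whose noise happens to exceed a fixed FEC budget.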

I have just implemented such joint reconstruction codes (JRC), reconstructing the message from all available scraps of information, using the correction trees approach (like sequential decoding for convolutional codes, but with a larger, 64-bit state): https://github.com/JarekDuda/JointReconstructionCodes
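The repository's decoder is more refined, but the basic shape of sequential decoding with a large mixed state can be sketched as follows (a toy rate-1/2 code of my own; the state-update function and the plain Hamming penalty are simplifications for illustration, not the actual correction trees algorithm):

```python
import heapq
import itertools

M = 6364136223846793005   # 64-bit LCG multiplier, used here as a cheap state mixer
MASK = (1 << 64) - 1

def step(state, bit):
    """Advance the 64-bit encoder state by one message bit."""
    return (state * M + (2 * bit + 1)) & MASK

def parity(state):
    """One redundancy bit extracted from the state (its top bit)."""
    return (state >> 63) & 1

def encode(bits):
    """Rate-1/2 encoder: after each message bit, append one state-derived
    redundancy bit, so the transmitted stream is twice the message length."""
    state, out = 0, []
    for b in bits:
        state = step(state, b)
        out += [b, parity(state)]
    return out

def decode(received, n):
    """Sequential (stack-algorithm) decoding: explore the prefix tree of
    message hypotheses, always expanding the hypothesis with the fewest
    disagreements with the received stream so far."""
    counter = itertools.count()          # tie-breaker for the heap
    heap = [(0, next(counter), 0, 0, [])]  # (penalty, tie, pos, state, bits)
    while heap:
        pen, _, pos, state, bits = heapq.heappop(heap)
        if pos == n:
            return bits                  # cheapest full-length hypothesis
        for b in (0, 1):
            s = step(state, b)
            p = pen + (b != received[2 * pos]) \
                    + (parity(s) != received[2 * pos + 1])
            heapq.heappush(heap, (p, next(counter), pos + 1, s, bits + [b]))
    return None
```

A wrong branch quickly accumulates penalty because its state, and hence its redundancy bits, diverges from the true one, so the search stays close to the correct path; practical sequential decoders replace the plain mismatch count with a Fano-style metric.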

It is in an early experimental stage now, but I would be grateful for any comments.

The reconstruction is much more demanding than for pure fountain codes (without FEC), so using a large number of packets this way could be impractical; hence it might be beneficial to concatenate it with some fountain code.

Have you encountered something like this before?

What real-life applications can you think of?