1. Grouped LZW, Vol. 2

Hi all,
​in this episode I am glad to tell about an attempt of my own to dispense
the problem previously encountered in Grouped (ROLZ) LZW:
the fixed size of the groups (dictionaries attached to context).
A way to proceed is illustrated by Nakano, Yahagi, Okada but I started
from a different consideration.
Every time a symbol occurs in a text, it gains an increasingly number of
children and the chance for it to reappear is more and more smaller,
while an entropy stage which takes its output assigns shorter codes. To
conciliate this two opposite I settled down a schema where symbols
belong to a set of lists entitled for the number of children, and each
list is organized as LRU.
A symbol now will be emitted by its list and rank inside it,
respectively via the Witten Cleary arithmetic coder and the Elias delta.
I choosed an AC because it is the solely that can mimic closely the
fatherhood distribution among symbols, but this constraint put me in front
of the necessity to interleave its sequence.
After a complicated period I realized that the solution must be based on
two facts: (i) the encoder has to be ahead of two symbols beacause the
decoder needs to start with 16 bit; (ii) the variables high and low (which
define the focus interval) are in lockstep between the just mentioned two
sides. The rest, you can see it through the source below.
Actually at the moment the compressor is terrible, both in terms of
speed and ratio, but I made it public as the interleaver could be of
some interest.
I have in mind to improve the performance of the compressor imposing on
it a more canonical context apparatus, that should curtail the lists at
the expense of memory consumption.
I hope to be back soon,
greetings, Marco Borsari

2. ​I finished to elaborate Lens with a standard context machinery,
now every node in the trie has a presence in all the recency
lists associated to the characters of an order 1 set: in case of an
actually encountered context the node is placed at the head of the list,
for other ones at the tail.
This is necessary to keep the consistency of the specific system of the
Lens series (see above) to transmit a symbol.
I conserved a single AC statistic for the fatherhood instead to reply it
for every context because in the mean the differences should be zero.
I have been forced to decouple the LRU for the full dictionary from the
list for zero children because it would be difficult to make a choice
regarding the various rank of the symbols in respect to their contexts.
Unfortunately, though the compression ratio is better than that of
Lens3, it remains far worse than simply emitting the appropriate bit
index as in classical LZW.
So I must admit this is a sterile path and I stop it here.

Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts
•