# Thread: Generalized Unary Coding

1. ## Generalized Unary Coding

I was looking into unary coding while working on a compression project for artificial neurons, partially spurred on by the neat research into birdsong/HVC.

On my journey I found that the Wikipedia page for unary coding has a dedicated subsection for "Generalized unary coding" by Subhash Kak.

https://en.wikipedia.org/wiki/Unary_coding

I read the description but came away with no concrete understanding of its nature or utility. It reads like an arbitrary fixed length encoding, and even requires "markers" for "higher integers." Am I reading this correctly and its hogwash, or is there some deeper usefulness that I'm missing here?

2. "Generalized unary coding requires that the range of numbers to be represented to be pre-specified because this range determines the number of bits that are needed."
It appears useless for compression, maybe it was designed for some other application in mind. Since this system requires a limit on the numbers in the included 0-15 example, we could have just encoded 4-bit wide binary codes instead. Regular unary and universal codes don't impose a limit on integer size, while this does. I'm also struggling to grasp why this is considered a variant of unary, let alone an improvement.

3. Originally Posted by Lucas
It appears useless for compression, maybe it was designed for some other application in mind. Since this system requires a limit on the numbers in the included 0-15 example, we could have just encoded 4-bit wide binary codes instead. Regular unary and universal codes don't impose a limit on integer size, while this does. I'm also struggling to grasp why this is considered a variant of unary, let alone an improvement.
I think we share the same struggle about it being "considered a variant of unary"; especially after I just gave reading his paper a go. ("Generalizing Unary Coding", DOI: 10.1007/s00034-015-0120-7)

He mentions the possible uses in data compression related to neural networks - with the idea that a "spatial form" could represent an array of neurons.

What I'm started to realize though, after looking through the Wikipedia page history, is this might be a case of author self-promotion: there seems to be a lot of edits over time adding his journal articles into the reference material for unary and neural network related pages.

I'm honestly having trouble parsing this sentence from his paper:

The usefulness of unary coding in certain situations arises out of the fact that theHamming distance between numbers increases linearly with respect to the differencebetween them.

4. After a bit more reading it doesn’t appear to be a compression method at all, we are taking it a bit too literally. It appears to be a transform into the unary domain with finite width codes. Seems kinda dumb at first until you consider optimizations, lookup tables can be used to generalize the unary domain of integers using this method.

It appears they made this to get into that domain for biological nueral networks (songbirds). It’s not directly applicable to compression, but is essentially a preprocessor for changing domains, so it can be thought of as a model which can be used for compression, rather than a direct compression method itself.

It would be nice if they explained the unary domain better though. As to why it is better for training biological NN’s, I’m not sure why...

5. Originally Posted by Lucas
It appears useless for compression, maybe it was designed for some other application in mind. Since this system requires a limit on the numbers in the included 0-15 example, we could have just encoded 4-bit wide binary codes instead. Regular unary and universal codes don't impose a limit on integer size, while this does. I'm also struggling to grasp why this is considered a variant of unary, let alone an improvement.

Right, it is better to just write it as a binary number, since the range or length in bits is needed anyway.

My variable-length code is more compact i think. Only increases in bitlength (by 1) after all possible values are used up for that bitsize. This is Rice coding generalized, i think.

Code:
```/*
Filename:   ucodes.h (universal codes.)
Written by: Gerald Tamayo, 2009
*/
#include <stdio.h>
#include <stdlib.h>

#if !defined(_UCODES_)
#define _UCODES_

/* Unary Codes. */
#define put_unary(n) put_golomb((n),0)
#define get_unary()  get_golomb(0)

/* Exponential Golomb coding */
#define put_xgolomb(n) put_vlcode((n), 0)
#define get_xgolomb()  get_vlcode(0)

/* Elias-Gamma coding.
Note: don't pass a zero (0) to the encoding function: only n > 0 */
#define put_elias_gamma(n) put_xgolomb((n)-1)
#define get_elias_gamma()  get_xgolomb()

/* Golomb Codes. */
void put_golomb( int n, int mfold );
int  get_golomb( int mfold );

void put_vlcode( int n, int len );
int  get_vlcode( int len );

#endif```

I just use my put_vlcode() function for short codes. Last time i checked, this is related to Rice codes.

Code:
```/*
Filename:    ucodes.c (universal codes.)
Written by:  Gerald Tamayo, 2009
*/
#include <stdio.h>
#include <stdlib.h>
#include "gtbitio.h"
#include "ucodes.h"

/* Golomb Codes.

We divide integer n by (1<<mfold), write the result as a
unary code, and then output the remainder as a binary number,
the bitlength of which is exactly the length of the unary_code-1.

In the implementation below, mfold is an exponent of two:
mfold = {0, 1, 2, ...} and (1<<mfold) is thus a power of two.
Each 1 bit of the unary code signifies a (1<<mfold) *part* of
integer n. In *exponential* Golomb coding, each 1 bit signifies
succeeding powers of 2. (We allow a length/mfold of 0 to encode
n as a plain unary code.)
*/
void put_golomb( int n, int mfold )
{
int i = n >> mfold;

while ( i-- ) {
put_ONE();
}
put_ZERO();
if ( mfold )
put_nbits( n%(1<<mfold), mfold );
}

int get_golomb( int mfold )
{
int n = 0;

while ( get_bit() ) n++;
n <<= mfold;
if ( mfold )
n += get_nbits(mfold);

return n;
}

/* The following variable-length encoding function can write
Elias-Gamma codes and Exponential-Golomb codes according
to the *len* parameter, which can be 0 to encode integer 0
as just 1 bit. */
void put_vlcode( int n, int len )
{
while ( n >= (1<<len) ){
put_ONE();
n -= (1<<len++);
}
put_ZERO();
if ( len ) put_nbits( n, len );
}

int get_vlcode( int len )
{
int n = 0;

while ( get_bit() ){
n += (1<<len++);
}
if ( len ) n += get_nbits(len);
return n;
}```

Treat the number as a binary number (not decimal number) and transform it into a variable-length code.

#### Posting Permissions

• You may not post new threads
• You may not post replies
• You may not post attachments
• You may not edit your posts
•