# Thread: New layer 0 - compression algorithm

1. ## New layer 0 - compression algorithm

Hello everybody,

I want to ask you if you consider usefull this new method of compression like an first layer of sets of agtorithms for a compression program.
This method is a little similar to an arithmetic compression but use instead of big precision numbers, some fractions.
Taking into consideration the Stern-Brocot tree than can generate all the Farey fractions, this can be seen like a binary tree.
For all the fractions from left to consider 0 and from right 1.
From the Stren-Brocot tree we can observe that for example the fraction 1/100 have around 100 bits from the binary tree cosidered.
So if we want to compress a file, to code first it's binary representation using the Stern-Brocot codes.
For example instead 100 bits from the file to memo only a fraction - numitor and denomitor.
The only question now is when to stop reading from the file. In this case there can be taken into consideration more technics - first mach, best mach and so on.
It is usefull to take for example the frequency from the dictionary and to read from the Stern-Brocot tree the length of the code produced for each frequency of chars / total number of chars.
With these length to read from the file exactly this numbers of bits, in order, from the length numbers, generated from the dictionary.
In this way the compression rate will be very good.
Only after this transform to apply all layers from a compression program.

Thank you,
best regards,

Hello Igor and everybody,
I tried to write some c-language codes to show the compression with the Stern-Brocot tree compression.

### PRG_1

/************************************************** **********************************
@ Program that generates all the fractions from a level "h", of the Stern-Brocot Tree
@ and tests if any binary codes of a given length can be memo using instead of it, the
@ fraction from the Stern-Brocot tree if it is visited like the code says
@
@ Output the maximum legth of the binary representation of any fraction of the SB tree
@ for a given level
@
@ For a detailed description of the classic algorithm, please visit the site:
@ www.cut-the-knot.org/blue/b-tree.shtml
************************************************** **********************************/

/************************************************** **********************************
INCLUDES
************************************************** **********************************/
# include "stdio.h"
# include "malloc.h"
# include "math.h"
# include "conio.h"
/************************************************** **********************************
TYPES
&
VARIABLES
************************************************** **********************************/
typedef unsigned long int UL_int;
typedef unsigned char ui8_t;
UL_int h;
struct nod{ UL_int num,den; struct nod * s, * d;};
typedef struct nod NOD;
/* Output file wich stores the optimized positions */
FILE * pf;
/* Variables that store the number of nodes */
UL_int N, c_init;

int min = 30000;
//Maxim number that be memo using "h" bits
UL_int MAX;
//Maxim of the length of codes from the tree till level "h"
UL_int maxim = 0;
/************************************************** **********************************
FUNCTIONS
************************************************** **********************************/
/*
*/
/* Functions for comparing the binary representations */
/*
*/
void BIN_comp(UL_int a, UL_int b)
{
UL_int nr_cif_a, nr_cif_b;

nr_cif_a = (UL_int) (log(a)/log(2.0)) + 1;
if(maxim < nr_cif_a)
maxim = nr_cif_a;

nr_cif_b = (UL_int) (log(b)/log(2.0)) + 1;
if(maxim < nr_cif_b)
maxim = nr_cif_b;
}
/*
*/
/* Function for the generation of the Stern-Brocot tree with "h" levels */
/*
*/
NOD * SB_tree(NOD * r, UL_int x, UL_int y, UL_int c)
{
NOD * p, * q;

if((r->den >= MAX-1)||(r->num >= MAX-1))
{
if(min>c)
min=c;
}
if(c==h-1)
{
/* Print all the fractions from a read level */
//printf("%lu/%lu ",r->num,r->den);
BIN_comp(r->num,r->den);
return r;
}
else
{
p =(NOD *)malloc(sizeof(NOD));
p->s = NULL;
p->d = NULL;
p->num = x;
p->den = x+y;
r->s = SB_tree(p,x,x+y,c+1);
q =(NOD *)malloc(sizeof(NOD));
q->s = NULL;
q->d = NULL;
q->num = y;
q->den = x+y;
r->d = SB_tree(q,y,x+y,c+1);
}
}
/************************************************** **********************************
MAIN ()
************************************************** **********************************/
void main(void)
{
pf=fopen("out.txt","w+");
printf("Enter the level in the tree: ");
scanf("%lu",&h);

MAX = pow(2,h-2);
printf("\nMAX that could be stored for NOMITOR and DENOMITOR is: %lu\n",MAX);

/* The root is treated separatly */
/* Generation of the STERN-BROCOT tree fractions */

//This is the level of the current fraction in the tree
c_init = 1;

/* Print the number of optimised positions */
//N = (UL_int) pow(2,h);
//printf("\nThe number of optimised positions is: %lu / %lu", c_opt, N);
//printf("\nPercent = %.2f %% optimised",(c_opt*100/(float)N));

printf("END...\n");

if(min==30000)
{
printf("All the possible binary codes of LENGTH of %d",h);
printf(" can be optimised with Stern-Brocot - Farey(n) fractions\n");
printf("MAX_LENGTH of codes = %lu",maxim);
}
else
printf("\nLEVEL minim = %d",min);

getche();

fclose(pf);
}
/************************************************** **********************************
END
************************************************** **********************************/

#####################################
### PRG_2
This program read an text 0/1 file not binary, only for example and generates and number the fractions from SB and the maximum length of the codes read from the file.

# include <stdio.h>
#include <values.h>
# include <math.h>

FILE * pf;
char c;

unsigned long MAX;
unsigned long a,b,la,lb,ra,rb;
unsigned long contor;
unsigned long length;
unsigned long MAX_length;

int last_enter;

void main(void)
{
pf=fopen("IN.txt","r");

a = 1;
b = 2;
contor = 0;
length = 0;

MAX_length = 0;

while (!feof (pf))
{
last_enter = 0;

MAX = 16384 ;

fscanf(pf,"%c",&c);

if (c=='0')
{
la = a;
lb = a+b;

a = la;
b = lb;
}
else if(c=='1')
{
ra = b;
rb = a+b;

a = ra;
b = rb;
}

length ++;

if((a>MAX)||(b>MAX))
{
if(length > MAX_length)
MAX_length = length;

a = 1;
b = 2;

last_enter = 1;
contor ++;

length = 0;
}
}//while

//For the last fraction
if(last_enter==0)
contor++;

printf("\nContor = %lu",contor);
printf("\nMAX length of code read from file= %lu",MAX_length);

fclose(pf);

}

#### PRG_3
This program generates an text 0/1 file of a given length.

# include <stdio.h>
# include <stdlib.h>
# include <values.h>

FILE * pf;

unsigned long i,val,LENGTH;

void main(void)
{
pf=fopen("IN.txt","w");

randomize();

printf("Enter the length of the file: ");
scanf("%lu",&LENGTH);

for(i=0;i<LENGTH;i++)
{
val = random(MAXINT);
if((val % 2) ==0)
fprintf(pf,"0");
else
fprintf(pf,"1");
}
fclose(pf);
}

Thank you,
best regards,

2. Benchmark it and we can talk.

FYI you can't encode symbols more efficient than arithmetic coding. By definition, encoder must not take into account previous symbols. Only model takes care of recalculating probabilities.

3. on 7-zip forum, he proposed to use this before lzma in order to increase compression

4. I do not really understand.
To help an LZ-class compression algorithm, what is usually done is to reorder input in a more-compressible format.
Sounds like BWT.

But here, the output from the function seems to be something entirely different. This is not re ordering, rather producing another input.
Without example, i don't get it how it can help LZ match sequences.

5. oh, man. it's just one more perpetum mobile. and he believes that by adding this bit-messing before any compression algo, compression ratio will increase

6. ## RLE with fractions and than rar, smaller than rar?

Hello all, I want to ask you if you consider ok the algorithm combining the RLE algo with some fraction algorithm?
I tried a second version of algorithm but at this moment it contains a lot of garbage code because I was inspired from some heap trees; It can be found on:
http://bocut-appl.110mb.com/index.ph...mpress-ver-2-0

Is it RLE a good compression that let's say deserve the "effort" of combining it with an other algorithm?

The new algo was inspired by the extended Stern-Brocot tree from the Farey sequences