# Thread: Need help with an algorithm for generating a list of integer sets

1. ## Need help with an algorithm for generating a list of integer sets

(English is not my native language, and I don't have a university degree, so please be a little patient if I make some obvious mistakes. In the real world I am just a programmer; I never studied at a university.)

While fooling around with self-written Turing tarpits, I noticed that when you randomly generate valid programs, certain output patterns emerge more often than others.
The most common patterns were:
- 0
- 1
- 0000… (infinite length)
- 1111… (infinite length)
- 00
- 11
- 01
- 10
- 010101… (infinite length)
- 101010… (infinite length)
And so on.

I foolishly tried to construct a language that would create a unique pattern for every unique program. Unsurprisingly, I could not construct a language with that ability.
While searching the internet for whether such a language exists, I found the binary lambda calculus (BLC). The current Wikipedia page no longer mentions it, but the version I read said BLC was used for research on minimum description length (MDL) and Kolmogorov complexity when terms are encoded using de Bruijn index notation. But I was not happy with the results. Not only does BLC generate non-unique programs, the de Bruijn index notation does not seem to match the idea of MDL: it generates a lot of 'programs' that are longer than their output.

I wasn't able to find better languages than BLC for approaching Kolmogorov complexity. (I hope they do not exist, or otherwise my work is for nothing.)
Anyway, after a while I constructed a set of, I think they are called, axioms. They probably already exist, but I'll write them down anyway.
1. Any piece of information can be expressed by a string of bits.
2. Any outcome of an algorithm can be expressed by a string of bits.
3. So using 1 and 2, information and algorithms are related. (Not surprising when you read Solomonoff's theory and its relation to Kolmogorov complexity.)
4. Most constructible binary strings of a certain size can be considered random, because there is no known program that is smaller than the string itself.
5. It is hard to deliberately construct a binary string of a certain size for which the smallest program that can generate it is as large as the string itself. (4 and 5 together form a nice paradox.)
6. Almost all strings that are found in the real world are constructed by smaller programs.
7. The perfect language is a language that has the following properties:
a. Every unique program generates unique output.
b. The description length is smaller than or equal to the message length.
c. The programs of the language are indexed/ordered by the complexity of their outcome. In other words, it is not a simple counter (with a simple counter you can generate any outcome just by counting through all the options).
8. It is useful to try to find the perfect language because:
a. Following axiom 6, most real-world strings have a smaller Kolmogorov complexity.
b. The best approximation of the perfect language is the closest we can get to the Kolmogorov complexity.
c. The best approximation of the perfect language is the best approximation of (universal) artificial intelligence (following Hutter's description of universal artificial intelligence).
9. With these axioms in the back of my head I deduced (and I hope I am right on this) that any piece of information or algorithm output can be expressed as a Boolean truth table with number of inputs = ceiling(log2(length of the information)). The undefined parts of the truth table can be marked as 'don't cares'.
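As a small illustration of axiom 9 (a sketch; the function name and the use of `None` for don't cares are my own choices):

```python
import math

def string_to_truth_table(bits: str):
    """Map a bit string onto the output column of a truth table
    with ceiling(log2(length)) inputs (axiom 9).

    Rows beyond the end of the string are 'don't care', marked None."""
    n_inputs = math.ceil(math.log2(len(bits))) if len(bits) > 1 else 0
    rows = 2 ** n_inputs
    table = [None] * rows                 # None == don't care
    for i, b in enumerate(bits):
        table[i] = int(b)
    return n_inputs, table

# A 5-bit string needs ceiling(log2(5)) = 3 inputs, i.e. 8 rows,
# of which the last 3 are don't cares.
n, table = string_to_truth_table("01101")
```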

The next step I took was to find a way to generate every possible truth table for i inputs. For 0 to 2 inputs the Boolean expressions were predictable, but with 3 or more inputs the Boolean expressions needed to iterate through all possible outcomes became unpredictable. In other words, I could not construct a linear algorithm that iterated through all the possible outcomes without repetition.

I was happy with the results of the experiments. Using Boolean expressions, the simplest expressions indeed generated the simplest truth tables. I don't have any scientific proof, but instinctively and visually it looked pretty good. It also showed that I do not have a formal definition of complexity. Is (A and B) more or less complex than (A or B)? And is (A and B and C) more or less complex than (A and D)?

I also noticed something funny about the Boolean expressions needed to construct every possible outcome: a lot of truth tables were just rotated versions of other truth tables. That gave me the idea to make the truth tables one step more abstract: truth rings. A truth ring is just a truth table whose beginning and end are connected; information is read by choosing a starting point on the ring and rotating the ring until you have read the information you want. (Using the mental image of a Turing machine tape: the tape has a certain length and its end is connected to its beginning, so when you reach the end of the tape you start again at the beginning.)
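That observation suggests collapsing all rotations of a truth ring to one canonical representative. A naive O(n²) sketch (the function name is mine; Booth's least-rotation algorithm could do the same in O(n)):

```python
def canonical_rotation(ring: str) -> str:
    """Return the lexicographically smallest rotation of a truth ring,
    so that all rotated versions of a table map to one representative."""
    return min(ring[i:] + ring[:i] for i in range(len(ring)))

# 1110110 is a rotation of 0110111, so both reduce to the same ring.
assert canonical_rotation("1110110") == canonical_rotation("0110111")
```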

The truth rings can be abstracted one step further to pattern rings, to avoid inverted versions of the truth rings.
Examples:
- Pattern ring 1,2,1,3 can be translated to the two truth rings 0110111 and 1001000.
- Pattern ring 2,3,1,1 can be translated to the truth rings 0011101 and 1100010.

Information of length n consisting of the string 01010101… can be reduced to the truth ring '01', which can be reduced to the pattern ring 1,1. To retrieve the information from a pattern ring, the starting point, the length, and the 'polarity' of the information must also be saved. In this case there is no need to store both the starting point and the polarity; one is sufficient. I don't have hard proof for this, but I am fairly sure that every string of information can be saved as a pattern ring with starting point and polarity without breaking axiom 7b. The length is another discussion, but one can say that for normal information stored in a computer, the length is also stored separately.
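A minimal encode/decode sketch of this idea (my own function names; the pattern ring is stored together with the starting point and the polarity, i.e. the bit value of the first run):

```python
def ring_to_pattern(ring: str):
    """Cyclic run lengths of a truth ring, plus the starting point and
    polarity (bit value of the first run) needed to reconstruct it."""
    n = len(ring)
    # Rotate so the ring starts on a run boundary (cyclic comparison).
    start = next((i for i in range(n) if ring[i] != ring[i - 1]), 0)
    r = ring[start:] + ring[:start]
    runs, i = [], 0
    while i < n:
        j = i
        while j < n and r[j] == r[i]:
            j += 1
        runs.append(j - i)
        i = j
    return runs, start, int(r[0])

def pattern_to_ring(runs, start, polarity):
    """Inverse: rebuild the truth ring from pattern ring + metadata."""
    bits, b = "", polarity
    for length in runs:
        bits += str(b) * length
        b ^= 1                      # runs alternate between 0 and 1
    return bits[-start:] + bits[:-start] if start else bits

runs, start, pol = ring_to_pattern("0110111")   # pattern ring 1,2,1,3
assert pattern_to_ring(runs, start, pol) == "0110111"
```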

Axiom 10: The output of the perfect language consists of pattern rings ordered by increasing (yet undefined) complexity. To avoid repetition, pattern rings must be 'prime': pattern rings that can be constructed by simply repeating a pattern ring of lower complexity are not allowed in the perfect language.

Creating a program that generates prime pattern rings is not difficult.
The results:
For 0 inputs: 0 (special case, when the information just consists of N zeros or ones)
For 1 input:
- 1,1
For 2 inputs:
- 2,2
- 3,1
For 3 inputs:
- 4,4
- 5,3
- 6,2
- 7,1
- 3,1,2,2
- 3,2,1,2
- 3,2,2,1
- 3,3,1,1
- 4,1,1,2
- 4,1,2,1
- 4,2,1,1
- 5,1,1,1
- 2,1,1,2,1,1
- 2,1,2,1,1,1
- 2,2,1,1,1,1
- 3,1,1,1,1,1
(For 4 bits information the pattern rings are in the attachment.)

In other words, you need just 20 prime pattern rings to construct all 256 different truth tables for 3 inputs. If the perfect language starts counting at ring 0 and loops through every ring with every possible rotation and polarity, you get 256 different programs, ordered by complexity, for 8 bits of information, i.e. every small algorithm whose result has 8 bits of complexity.
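These rings can be reproduced with a small brute-force sketch (my own naming): enumerate all truth rings of a given length, keep one canonical representative per rotation/polarity class, drop the repeated (non-prime) ones, and read off the run lengths.

```python
from itertools import product

def rotations(s):
    return [s[i:] + s[:i] for i in range(len(s))]

def complement(s):
    return s.translate(str.maketrans("01", "10"))

def is_prime_ring(s):
    # Prime: the truth ring is not a smaller ring simply repeated.
    n = len(s)
    return all(s != s[:d] * (n // d) for d in range(1, n) if n % d == 0)

def prime_pattern_rings(n):
    """One prime pattern ring per class of n-bit truth rings that are
    equivalent under rotation and polarity (inversion)."""
    rings = []
    for bits in product("01", repeat=n):
        s = "".join(bits)
        # Canonical form: smallest among all rotations of s and of its inverse.
        if s != min(rotations(s) + rotations(complement(s))):
            continue
        if "0" not in s or "1" not in s:      # constant ring: the special case
            continue
        if not is_prime_ring(s):
            continue
        # Cyclic run lengths; s is canonical, so it starts on a run boundary.
        runs, i = [], 0
        while i < n:
            j = i
            while j < n and s[j] == s[i]:
                j += 1
            runs.append(j - i)
            i = j
        rings.append(runs)
    return rings

assert len(prime_pattern_rings(2)) == 1       # 1,1
assert len(prime_pattern_rings(4)) == 2       # 3,1 and 2,2
assert len(prime_pattern_rings(8)) == 16      # the 16 rings for 3 inputs
```

The brute force is exponential in the ring length, which is exactly why a direct enumeration algorithm would be interesting.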

But did we gain anything? For 8 bits this does not seem to be the case, and it does not look impressive at all. But with axiom 6 in mind, I expect that as the amount of information increases, the delta between the index in the perfect language and the index of the information itself (obtained by simple counting) will on average increase. In other words, for big pieces of real-world information, the expected average program index in the perfect language grows interestingly more slowly.

But now I am stuck on generating the prime pattern rings. How do we iterate through the rings in linear time? And is it possible to find a polynomial-time algorithm that converts any piece of information or algorithm to a prime pattern ring? Or even better: is it possible to find an algorithm that finds not only the prime pattern ring but also its index in the perfect language?

Let's first start with finding an algorithm to generate prime pattern rings. Because it is a ring, the starting point does not matter. Also, the complexity order is still not defined very well, so if you find an algorithm that generates prime pattern rings in another sequence, it may be that your algorithm embodies a better definition of complexity. I also think the smallest possible program that generates prime pattern rings is the one closest to the correct definition of complexity!

2. ## Thanks:

Shelwien (16th November 2016)

3. It's an interesting topic. Matt Mahoney at one point also tried to generate random Turing machines - http://mattmahoney.net/dc/uiq/
And his "zpaq language" should be relevant as well.

But I think that we need to start from high-level abstractions here, because for some types of data we actually know how it is generated (executable files, for example).
So I'd imagine something like writing a few lossless file-format decompilers which can improve compression,
then trying to generalize the approach and find how these decompilers can be deconstructed into basic components.
And only once we have the relevant components, we can try enumerating their combinations, or some such.

I don't think that anything useful can be done by starting from low-level designs like turing machines - simply
because we _know_ how to construct at least some of available file formats, and operations involved there
are complex enough that their implementations appearing at random is not really possible.
But I think that program generation may be still feasible if we'd use relevant higher-level components.

As to binary tables, it's an even lower-level representation than Turing machines - there are no loops or memory,
so at best you can expect to break some hashing/encryption with that approach, but it's less related to compression and more to SAT.

4. http://www.cube20.org/

5. Shelwien,

Originally Posted by Shelwien
It's an interesting topic. Matt Mahoney at one point also tried to generate random Turing machines - http://mattmahoney.net/dc/uiq/
And also his "zpaq language" should be relevant as well.
But I think that we need to start from high-level abstractions here, because for some types of data we actually know how it is generated (executable files, for example).
So I'd imagine something like writing a few lossless file-format decompilers which can improve compression,
then trying to generalize the approach and find how these decompilers can be deconstructed into basic components.
And only once we have the relevant components, we can try enumerating their combinations, or some such.
It's a successful practical approach in reality, that's true. But it is not useful when working toward the legendary lower bound, the Kolmogorov complexity. You are basically programming pre-available knowledge about your subject and (paq-)mixing it with probability estimation. Yes, you get very well-working compression algorithms, but in all fairness you should add that knowledge to the estimated complexity of the program/data. For small (<10k) files it shows immediately that this approach is not the way to go.

(By the way: UIQ is more or less a fancy pseudorandom number generator to generate artificial data for testing general purpose file compression. It does not try to compress or predict anything.)

But I am not trying to be practical here…

Originally Posted by Shelwien
I don't think that anything useful can be done by starting from low-level designs like turing machines – simply because we _know_ how to construct at least some of available file formats, and operations involved there are complex enough that their implementations appearing at random is not really possible.
But I think that program generation may be still feasible if we'd use relevant higher-level components.
I agree. The reality is that iterating through every possible Turing machine, or any other Turing-complete language, will not produce useful results in practice. That's the reason they call those machines Turing tarpits: "Beware of the Turing tar-pit in which everything is possible but nothing of interest is easy."

Even if I am able to construct the 'perfect language', iterating through all the possible options is going to be hard in 2016 for anything above 40 bits of complexity. While it is theoretically possible to generate every piece of information, it won't be easy. And that's an understatement.
But imagine that I am at least able to make a runnable/executable way of iterating through the programs of the perfect language! That would be a very big improvement, and with quantum computing on the horizon: who knows what is possible when those machines become available?

Originally Posted by Shelwien
As to binary tables, its an even lower level representation than turing machines - there're no loops or memory, so at best you can expect to break some hashing/encryption with that approach, but its less related to compression and more to SAT.
I do not agree with that. Every problem with a deterministic output can be expressed using truth tables; it's not limited to hashing and/or encryption.
Imagine that you can map every truth table to a program with a certain index. Then you can construct a Hilbert space in which a certain problem with a deterministic output is a point in space (index mapped to point).

If you have a non-deterministic problem, you can construct it as an N number of P problems. Take the famous traveling salesman problem: if you generate every possible output for every possible input, for every number of locations, you get an enormous truth table for each number of cities. For example, 4 cities with floating-point distances (32 bits × 4 cities) plus the city output order (2 bits × 4 cities) generate a 136-bit truth table. Those truth tables can (in theory) be mapped to an index in the enumeration of programs of the perfect language. The indices can be mapped to points in the Hilbert space. From those points you can construct lines, planes, and spaces of the desired dimensions. So you can define any problem as a geometry in the Hilbert space constructed from the indices of the programs generated by the perfect language.

I am aware of the SAT problem, but I don't see the relation to the things I wrote before.
Originally Posted by Sportman
http://www.cube20.org/
Although interesting, I don't think the Rubik's cube problem is related to this problem, other than that it might belong to the same complexity class.

> and you (paq)mix it with probability estimation.

Yes, and I see it as a good thing - because otherwise we'd have to use bruteforce
search to find working implementations of these same components in a low-level language.

I was talking about a different kind of components though - even to find
a working pi generator by enumerating programs, you'd have to start with
something higher-level than bit operations (or even cpu instruction set).

> Yes, you get very well working compression algorithms but in all fairness
> you should add those knowledge to the estimated complexity of the
> program/data.

Sure, but why do you think that turing machines or boolean logic are "more natural"
than long arithmetics or hashmaps?

> For small (<10k) files it shows immediately that this approach is not the way to go.

What kind of files do you want to "compress" this way, though?
For example, splitting enwik8 to 16-byte chunks for kolmogorov compression
certainly won't win you the prize - simply because these chunks are not independent.

> By the way: UIQ is more or less a fancy pseudorandom number generator to
> generate artificial data for testing general purpose file compression.
> It does not try to compress or predict anything.

Yes, but I think we can use it to find programs that generate given short strings
of data just as well as your implementation?

> But imagine that I am able to at least make a runnable/executable approach
> of iterating through the programs from the perfect language!

I'm also interested in something like that, but my concept of "perfect language"
is significantly different - it needs higher-level components and has to be
compatible with heuristic optimization methods (like genetic algorithms).
That is, for now I focus more on optimizing compression method parameters
than algorithms themselves, but the distinction is very fuzzy - for example,
we can implement two versions of some function and then choose one based
on parameter value, which is controlled by optimizer.
And that's actually the point where something tricky can be done -
we can call both versions and then mix their results using a weight parameter,
instead of choosing one with a flag parameter.
Turns out that this "fuzzy" approach works much better with optimization,
because there's much less probability of getting stuck in local minima.

> That would be a very big improvement and with quantum computing on the
> horizon: who knows what is possible when those get available?

Afaik quantum computing is simply computing that uses elementary particles
to store/transmit/transform information. So it's massively parallel and
has a high probability of errors, which requires special algorithms to compensate.
I hope that you don't believe in popular entanglement/teleportation magic hype.

Wolframalpha says that ~2^104 hydrogen atoms fit into 1m^3, so I'd estimate
the best quantum PC to be 2^100 times faster than current PC.
Which only lets you bruteforce programs with extra 100 bits of code - not
much I think.

> Every problem with a deterministic output can be expressed using truth tables.

Sure, there's even a program that does that - http://www.cprover.org/cbmc/

But with such representation you won't be able to match longer loops, which
frequently appear both in data generated by "normal" programs, and in
mathematics (eg. in form of sum operators).

> I am aware of the SAT problem but I don’t see the relation with the things I wrote before?

I think either of these problems can be translated into another.
For example, we can write a generalized program with N instructions
(implemented as switches on parameter values), translate it with cbmc,
then with any SAT-solver find parameter values which would give us expected output.
Or fail to find. Then we can increase N.
