Hello to all members!
I'm working on a small packer for exe intros for the demoscene, using neural networks. The code is quite small (asm code < 320 bytes) and its efficiency is comparable to the crinkler packer (though often slightly worse, by around 1%, at 4k-8k sizes). The code will be released as open source once I have something fully working. However, I'm facing some problems that I want to share with you:
1) The first problem relates to the technique currently used in crinkler, an exe linker-packer for small intros under 4k (it can still work well under 16k). Suppose we have N models, and I want to use a subset of K of these models (more efficient) and assign them static weights for a small amount of data to compress. (Compared to a full neural network, static weights are quite efficient under 16k of data, at least when the type of data does not change too much.)
n0i and n1i are the counters for bit 0 and bit 1, respectively, for model i (the counters are stored in a hash for each model).
The final prediction is simply: p(1) = sum_i ( n1i * 2^wi ) / sum_i ( ( n0i + n1i ) * 2^wi ), where i runs over the K selected models.
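To make the formula concrete, here is a minimal Python sketch of the mixing step (the real code is asm). The function name `mixed_p1` and the fallback to 0.5 when all counters are zero are assumptions made for illustration only, not what crinkler or my packer necessarily does.

```python
def mixed_p1(counts, weights):
    """Return p(1) for one bit from per-model counters and static weights.

    counts  -- list of (n0, n1) pairs, one per selected model
    weights -- list of integer weights wi in 0..7, one per selected model
    """
    # Multiplying by 2^wi is a left shift by wi.
    num = sum(n1 << w for (n0, n1), w in zip(counts, weights))
    den = sum((n0 + n1) << w for (n0, n1), w in zip(counts, weights))
    # Assumption: with no statistics at all, fall back to p(1) = 0.5.
    return num / den if den else 0.5


if __name__ == "__main__":
    # Two models: counters (n0=1, n1=3) with w=0 and (n0=2, n1=2) with w=1.
    print(mixed_p1([(1, 3), (2, 2)], [0, 1]))
```

In this example the numerator is 3*1 + 2*2 = 7 and the denominator is 4*1 + 4*2 = 12, so p(1) = 7/12.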
My questions are :
- How can we determine the best subset of K models out of the N models?
- How can we determine the best weights for the K selected models? (the range for wi is 0-7)
I tried several empirical methods, using errors, prefiltering with a neural network, etc., with no success. What do you think? Is there a good pragmatic paper on this subject? (I'm a programmer, not a statistician, so anything that isn't boiled down to a simplified algorithm is quite hard for me to translate into code.)
(Not to mention that the parameters for the models, which in this case are only one byte per model, count towards the final compressed size.)
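For reference, the quantity any selection method would have to minimise can be sketched like this in Python: the ideal coded size in bits plus one parameter byte per selected model. This is only a sketch under assumptions; the names `total_cost_bits` and `samples`, the layout of the precomputed counters, and the clamping value `eps` are hypothetical and not taken from crinkler or from my packer.

```python
import math


def mixed_p1(counts, weights):
    """p(1) from (n0, n1) counters and static weights wi in 0..7."""
    num = sum(n1 << w for (n0, n1), w in zip(counts, weights))
    den = sum((n0 + n1) << w for (n0, n1), w in zip(counts, weights))
    return num / den if den else 0.5  # assumption: 0.5 with no statistics


def total_cost_bits(samples, subset, weights, eps=1.0 / 4096):
    """Estimated size in bits for one candidate (subset, weights).

    samples -- list of (bit, counts), where counts[m] is the (n0, n1) pair
               of model m (out of all N) seen when that bit is coded
    subset  -- indices of the K selected models
    weights -- one weight per selected model, same order as subset
    eps     -- assumed clamp so that a probability never reaches 0 or 1
    """
    cost = 8.0 * len(subset)  # one parameter byte per selected model
    for bit, counts in samples:
        p1 = mixed_p1([counts[m] for m in subset], weights)
        p1 = min(max(p1, eps), 1.0 - eps)
        cost += -math.log2(p1 if bit else 1.0 - p1)
    return cost


if __name__ == "__main__":
    samples = [(1, [(1, 3), (2, 2)]), (0, [(1, 3), (2, 2)])]
    print(total_cost_bits(samples, [0], [0]))
    print(total_cost_bits(samples, [0, 1], [0, 1]))
```

Comparing this value across candidates shows whether adding a model saves more bits than the extra byte it costs.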
2) Same question for neural networks. I have N models and I want to select the best subset. The only difference is that I don't have to manage the weights. I'm currently using a dichotomy algorithm with sorting on weights (only summing weights > 0) and it works quite well, but I'm not sure it's the best approach.