I'd like to ask whether hexadecimal interpretation is collision-free.
Last edited by CompressMaster; 4th June 2019 at 21:56. Reason: typo
Depends on interpretation.
A nibble value (4 bits) can be bijectively mapped to a single hex digit [0-9A-F].
So a byte (two nibbles) can be mapped to two hex digits.
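To make the bijection concrete, here is a minimal Python sketch that checks it exhaustively for all 256 byte values. The helper names byte_to_hex and hex_to_byte are made up for this illustration, and fixing the form to exactly two uppercase digits is an assumption of the sketch.

```python
# Map every byte value to a fixed two-digit uppercase hex form and back.
# byte_to_hex / hex_to_byte are hypothetical helper names for this sketch.

def byte_to_hex(b: int) -> str:
    # High nibble becomes the first digit, low nibble the second.
    return format(b, "02X")

def hex_to_byte(s: str) -> int:
    return int(s, 16)

table = {b: byte_to_hex(b) for b in range(256)}

# No two byte values share a hex string (no collisions)...
all_distinct = len(set(table.values())) == 256
# ...and every hex string decodes back to the byte it came from.
round_trips = all(hex_to_byte(h) == b for b, h in table.items())

print(all_distinct, round_trips)
```

As long as the form is fixed (always two digits, one letter case), each byte has exactly one hex spelling and each spelling has exactly one byte.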
But some kinds of hex syntax can be redundant.
For example, hex constants in C/C++ (0x...) allow any number of leading zeroes, support both [a-f] and [A-F] for 10..15,
and there are also optional type suffixes.
This kind of syntax is redundant: it provides multiple ways to encode the same binary data.
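A small Python sketch of that redundancy. Python's int() accepts the same 0x prefix, leading zeroes and both letter cases, so it stands in here for the C/C++ rules; the C type suffixes have no Python equivalent and are not shown. The list of spellings is just an example.

```python
# Several different hex spellings that all decode to the same value.
spellings = ["0xA", "0xa", "0x0A", "0x000a"]

# Decoding is many-to-one: four strings, one value.
values = {int(s, 16) for s in spellings}
print(values)

# Going the other way needs a convention. Picking one canonical form
# (here: two uppercase digits, no prefix) removes the redundancy.
canonical = {format(v, "02X") for v in values}
print(canonical)
```

Note that the redundancy only means several strings map to one value. It never means one string maps to several values, so decoding stays unambiguous.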
CompressMaster (10th June 2019)
I mean something like this: suppose I have 257 1-byte text files, each containing a single character, drawn from all the Latin and non-Latin (like "ф") characters in charmap. There are more than 257 such characters, so a collision must happen in principle. So how does the binary compiler know that "A" rather than "ф" is encoded under the same hex value? And I'm afraid that some characters are even encoded with more than two nibbles (3 or 4), so a collision still seems possible: how does the compiler know to encode "AD" instead of "4D8A"?
So, not every file can be expressed in hexadecimal interpretation and compiled back to binary?
For larger files (10 bytes or so), the probability of a collision is very remote, but it's still possible in general.
1) You can't have 257 different 1-byte files, since 1 byte has only 256 different values.
2) Currently there's no direct mapping of binary data (byte values) to characters.
Some codes have a special meaning (end-of-line etc.), and some others might be ignored or replaced by a text editor.
https://en.wikipedia.org/wiki/Character_encoding
3) A single character in text is not always encoded with one data byte.
In your example with "A"/"ф", "A" = 41 and "ф" = D1 84 (UTF-8).
4) You can disassemble any file to asm db 0x?? hex syntax and assemble it back losslessly.
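Points 3 and 4 can be tried out with a short Python sketch. It first prints the UTF-8 bytes of the two characters, then dumps arbitrary bytes to db-style lines and parses them back. The functions to_db_lines and from_db_lines are hypothetical helpers that only imitate the "db 0x??" syntax; no real assembler is involved, and the 8-values-per-line layout is an arbitrary choice.

```python
# Point 3: one character is not always one byte.
a_bytes = "A".encode("utf-8")
f_bytes = "ф".encode("utf-8")
print(a_bytes.hex(), f_bytes.hex())  # 41 d184

# Point 4: any byte sequence survives a trip through hex text.
def to_db_lines(data: bytes, per_line: int = 8) -> str:
    """Render bytes as 'db 0x41, 0x42, ...' lines (imitation of asm syntax)."""
    lines = []
    for i in range(0, len(data), per_line):
        chunk = data[i:i + per_line]
        lines.append("db " + ", ".join(f"0x{b:02X}" for b in chunk))
    return "\n".join(lines)

def from_db_lines(text: str) -> bytes:
    """Parse the lines produced by to_db_lines back into bytes."""
    out = bytearray()
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("db"):
            continue  # skip blank or unrelated lines
        for token in line[2:].split(","):
            out.append(int(token.strip(), 16))
    return bytes(out)

data = bytes(range(256))            # every possible byte value once
restored = from_db_lines(to_db_lines(data))
print(restored == data)             # True
```

The hex text describes bytes, not characters, which is why "A" and "ф" never compete for the same value: "ф" simply takes up two bytes in the file.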
CompressMaster (10th June 2019)
CompressMaster (11th June 2019)
Thanks Shelwien! Now it works as expected. Previously I had used a binary-to-hexadecimal (and vice versa) CMD converter, but it did not work properly because I had specified the input in hexadecimal instead of binary. My mistake, sorry.