
Thread: Hexadecimal collisions

  1. #1
    Member CompressMaster's Avatar
    Join Date
    Jun 2018
    Location
    Lovinobana, Slovakia
    Posts
    199
    Thanks
    58
    Thanked 15 Times in 15 Posts

    Hexadecimal collisions

    I'd like to ask whether hexadecimal interpretation is collision-free.
    Last edited by CompressMaster; 4th June 2019 at 20:56. Reason: typo

  2. #2
    Administrator Shelwien's Avatar
    Join Date
    May 2008
    Location
    Kharkov, Ukraine
    Posts
    3,982
    Thanks
    298
    Thanked 1,309 Times in 745 Posts
    Depends on interpretation.
    A nibble value (4 bits) can be bijectively mapped to a single hex digit [0-9A-F].
    So a byte (two nibbles) can be mapped to two hex digits.

    But some kinds of hex syntax can be redundant.
    For example, hex constants in C/C++ (0x...) allow any number of leading zeroes, accept both [a-f] and [A-F] for 10..15,
    and also have optional type suffixes.
    This kind of syntax is redundant: it provides multiple ways to encode the same binary data.
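
A minimal Python sketch of both points, using only the standard library. The helper names `to_hex` and `from_hex` are made up for illustration. It assumes the fixed-width convention of exactly two hex digits per byte, and it uses Python's `int(..., 16)` as a stand-in for the C/C++ constant syntax, which it only resembles (type suffixes are not covered).

```python
def to_hex(data: bytes) -> str:
    """Encode each byte as exactly two uppercase hex digits."""
    return data.hex().upper()


def from_hex(text: str) -> bytes:
    """Decode a string of two-digit hex pairs back into bytes."""
    return bytes.fromhex(text)


# All 256 byte values give 256 distinct two-digit strings, so nothing collides.
codes = {to_hex(bytes([value])) for value in range(256)}
assert len(codes) == 256

# Round trip over every possible byte value.
sample = bytes(range(256))
assert from_hex(to_hex(sample)) == sample

# Redundant syntax: several spellings, one value.
# Leading zeroes and letter case change the text but not the number.
spellings = ["0x41", "0X41", "0x0041", "0x00000041"]
values = {int(spelling, 16) for spelling in spellings}
assert len(values) == 1
```

The fixed-width form is collision-free in both directions. The constant-style form still decodes unambiguously, but several different texts decode to the same value, which is the redundancy described above.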

  3. Thanks:

    CompressMaster (10th June 2019)

  4. #3
    Member CompressMaster's Avatar
    I mean something like this: suppose I have 257 one-byte text files, each holding a single character, taken from all the Latin and non-Latin (like "ф") characters in charmap. There are more than 257 such characters, so in principle a collision must happen. How does a binary compiler know that a byte means "A" rather than "ф" if both are encoded under the same hex value? I'm also afraid that some characters are encoded with more than two nibbles (3 or 4), so a collision seems possible, even if the probability is remote: how does the compiler know to encode "AD" instead of "4D8A"?

    So not every file can be expressed in hexadecimal and compiled back to binary?

    For larger files (10 bytes or so), the probability of a collision is very remote, but it is still possible in general.

  5. #4
    Administrator Shelwien's Avatar
    1) You can't have 257 different 1-byte files, since 1 byte has only 256 different values.

    2) Currently there is no direct mapping of binary data (byte values) to characters.
    Some codes have special meaning (end-of-line etc.), and some others might be ignored or replaced by a text editor.
    https://en.wikipedia.org/wiki/Character_encoding

    3) A single character in text is not always encoded with one data byte.
    In your example with "A"/"ф", "A" = 41 and "ф" = D1 84 (UTF-8).

    4) You can disassemble any file to asm db 0x?? hex syntax and assemble it back losslessly.
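
Points 3 and 4 can be sketched in Python. This is not a real assembler or disassembler: `to_db_lines` and `from_db_lines` are hypothetical helpers that only write and read a `db 0x??` style listing, and the eight-bytes-per-line layout is an arbitrary choice.

```python
def to_db_lines(data: bytes, per_line: int = 8) -> str:
    """Write bytes as 'db 0x??, 0x??, ...' lines, two hex digits per byte."""
    lines = []
    for start in range(0, len(data), per_line):
        chunk = data[start:start + per_line]
        lines.append("db " + ", ".join(f"0x{byte:02X}" for byte in chunk))
    return "\n".join(lines)


def from_db_lines(text: str) -> bytes:
    """Read a listing produced by to_db_lines back into the original bytes."""
    out = bytearray()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if not line.startswith("db "):
            raise ValueError(f"unexpected line: {line!r}")
        for token in line[3:].split(","):
            out.append(int(token.strip(), 16))
    return bytes(out)


# Point 3: one character is not always one byte.
latin = "A".encode("utf-8")      # a single byte, hex 41
cyrillic = "ф".encode("utf-8")   # two bytes, hex D1 84

# Point 4: the byte-level listing round-trips without loss.
payload = latin + cyrillic
listing = to_db_lines(payload)
restored = from_db_lines(listing)
assert restored == payload
```

The listing works on bytes, not characters, so it never needs to know which character encoding the file uses. That is why the round trip is lossless for any file.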

  6. Thanks:

    CompressMaster (10th June 2019)

  7. #5
    Member CompressMaster's Avatar
    Quote Originally Posted by Shelwien View Post
    4) You can disassemble any file to asm db 0x?? hex syntax and assemble it back losslessly.
    Is there a command-line (CMD) tool for that?

  8. #6
    Administrator Shelwien's Avatar

  9. Thanks:

    CompressMaster (11th June 2019)

  10. #7
    Member CompressMaster's Avatar
    Thanks Shelwien! Now it works as expected. I had previously used a command-line binary-to-hexadecimal (and vice versa) converter, but it did not work properly because I gave it input in hexadecimal instead of binary. My mistake, sorry.
