CATEGORY: MITRE CWE-194: Unexpected Sign Extension

REFERENCE: https://cwe.mitre.org/data/definitions/194.html

EXPLANATION:

The developer casts the pointer to the binary data to a 'const char' pointer in order to process the data byte-wise within the body of the 'for' loop. With every iteration, pointer 'p' is dereferenced and passed to 'snprintf'. However, 'snprintf' is a variadic function, and for all variadic functions, arguments smaller than 'int' are automatically promoted to 'int' before they are passed to the called function. Thus, 'snprintf' receives an 'int', not a 'char', which means that 'snprintf' actually converts a (typically) 4-byte integer to a 2-digit hexadecimal string representation. This leads to erroneous behavior on platforms where 'char' is signed, because the value of bit 7 (0 or 1) is propagated up to bit 31 (so-called sign extension).

CONSEQUENCES:

The '%X' format specifier interprets the promoted (potentially signed) argument as an 'unsigned int' and converts it to its hexadecimal representation, starting at the most significant byte and proceeding down to the least significant byte. Leading all-zero nibbles are dropped from the output:

    00 00 00 4B is converted to string "4B"
    00 00 00 07 is converted to string "7"

Since the format specifier that is passed to 'snprintf' also comes with a minimum field width ('%02X'), the resulting string is at least two characters long, zero-padded if necessary:

    00 00 00 4B is converted to "4B"
    00 00 00 07 is converted to "07"

The same rules apply to bytes where bit 7 is set. The only difference is that, due to sign extension, the upper bits become all ones. Here's an example for a byte with the bit pattern 1111.0011 (0xF3). Because of promotion and sign extension, bits 8 to 31 are all set to one:

    FF FF FF F3 is converted to "FFFFFFF3"

But because the second argument to 'snprintf' is set to three (the maximum length of the resulting string plus one extra character for the terminating '\0'), only the first two characters, taken from the most significant byte, are kept:

    FF FF FF F3 is converted to "FF"

and the remaining six characters "FFFFF3" are dropped. Thus, on a platform where 'char' is signed (which is quite typical), all bytes whose bit 7 is set are output as "FF".

BUGFIX:

The most straightforward fix is to explicitly declare 'p' as a pointer to 'unsigned char':

    const unsigned char* p = (const unsigned char*) data;

NOTES:

The test data of the unit test doesn't include a case where bit 7 of a byte is set, which is unfortunate:

    const unsigned char TEST_DATA[] = { 0x00, 0x11, 0x22, 0x33, 0x1F };

Had the test data contained a value of e.g. '0xF3', the error would have shown up immediately, at least on platforms where 'char' is signed. Note, however, that a test data value of '0xFF' would not have exposed the bug; the resulting string would have been correct ("FF"), albeit for the wrong reasons.
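To make the failure mode easy to reproduce, here is a minimal, self-contained sketch of the buggy loop as the EXPLANATION describes it. The original routine isn't shown above, so the function name 'to_hex_buggy', the buffer sizes, and the test values are illustrative assumptions, not the original code:

    #include <cstddef>
    #include <cstdio>
    #include <cstring>

    // Hypothetical reconstruction of the buggy routine described above;
    // names and sizes are made up for illustration.
    void to_hex_buggy(const void* data, size_t len, char* out)
    {
        const char* p = (const char*) data;   // BUG: 'char' may be signed
        out[0] = '\0';
        for (size_t i = 0; i < len; ++i) {
            char tmp[3];
            // p[i] is promoted to 'int'; on signed-char platforms 0xF3
            // becomes 0xFFFFFFF3, and the size limit of 3 truncates the
            // eight-character result "FFFFFFF3" to "FF".
            snprintf(tmp, sizeof(tmp), "%02X", p[i]);
            strcat(out, tmp);
        }
    }

    int main()
    {
        const unsigned char data[] = { 0x4B, 0x07, 0xF3 };
        char out[16];
        to_hex_buggy(data, sizeof(data), out);
        printf("%s\n", out);   // "4B07FF" where 'char' is signed, not "4B07F3"
        return 0;
    }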
REMARKS:

The way this routine converts bytes to hex strings does work (provided it's implemented correctly), but it's quite inefficient. On the one hand, there are 'len' calls to 'snprintf'; on the other hand, there are also 'len' calls to 'strcat', which consumes more and more execution time as the combined string gets longer with every iteration. A more efficient (and more flexible) real-world implementation could be this:

    #include <algorithm>
    #include <cstddef>
    #include <cstdint>
    #include <sstream>
    #include <string>

    std::string to_hexstring(const void* data, size_t len, size_t maxLen = 32)
    {
        // Converts a single 4-bit value (0x0..0xF) to its hex character.
        auto nibble_to_hexchar = [](uint8_t nibble) -> char {
            return (nibble < 0x0A) ? ('0' + nibble) : ('A' + nibble - 0x0A);
        };
        const unsigned char* p = static_cast<const unsigned char*>(data);
        std::ostringstream ss;
        // Print at most 'maxLen' bytes; summarize the rest.
        size_t bytesToPrint = std::min(len, maxLen);
        for (size_t i = 0; i < bytesToPrint; ++i) {
            ss << nibble_to_hexchar(p[i] >> 4) << nibble_to_hexchar(p[i] & 0x0F);
        }
        if (bytesToPrint < len) {
            ss << ".. (" << (len - bytesToPrint) << " bytes more)";
        }
        return ss.str();
    }
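A quick usage sketch against this implementation (the trailing 0xF3 byte is appended here to the TEST_DATA from the NOTES section to cover the sign-extension case; it is not part of the original test data):

    #include <cassert>

    int main()
    {
        const unsigned char TEST_DATA[] = { 0x00, 0x11, 0x22, 0x33, 0x1F, 0xF3 };
        assert(to_hexstring(TEST_DATA, sizeof(TEST_DATA)) == "001122331FF3");
        // With maxLen = 2, only the first two bytes are printed and the
        // remaining four are summarized.
        assert(to_hexstring(TEST_DATA, sizeof(TEST_DATA), 2) == "0011.. (4 bytes more)");
        return 0;
    }

Because 'p' is an 'unsigned char' pointer and each byte is split into nibbles before conversion, no value ever passes through a signed 'char', so the sign-extension problem cannot recur regardless of the platform's 'char' signedness.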