Pointers in C, Part III: The Strict Aliasing Rule

“Know the rules well, so you can break them effectively.”
— Dalai Lama XIV

One of the lesser-known secrets of the C programming language is the so-called “strict aliasing rule”. This is a shame, because failing to adhere to it takes you (along with your code) straight into the realm of undefined behavior. As no one in their right mind wants to go there, let’s shed some light on it!

POINTER ALIASING DEFINED

First of all, we have to clarify what “aliasing” really means, or rather aliasing of pointers. Take a look at this example:
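A minimal sketch of such a situation, assuming an ‘int’ object named ‘value’ (the initial value and the helper ‘check_aliased’ are assumptions for illustration):

```c
#include <assert.h>

int value = 0;
int *p1 = &value;   /* p1 points to value */
int *p2 = &value;   /* p2 points to the very same object */

/* Both pointers hold the same address, so they alias each other. */
void check_aliased(void)
{
    assert(p1 == p2);
}
```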

Here, ‘p1’ and ‘p2’ are aliased to the same object ‘value’; that is, they point to the same object. If you update ‘value’ through ‘p1’:

a read through ‘p2’ will reflect this change:
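Put together, the write through ‘p1’ and the read through ‘p2’ might look like the sketch below. The function name ‘update_and_read’ and the value 42 are assumptions for illustration.

```c
#include <assert.h>

/* Writes through one alias and reads the result back through the other. */
int update_and_read(void)
{
    int value = 0;
    int *p1 = &value;
    int *p2 = &value;

    *p1 = 42;             /* update value through p1 */
    assert(*p2 == 42);    /* the read through p2 sees the new value */
    return *p2;
}
```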

Because of the possibility of aliasing, a C compiler is prevented from applying certain optimizations. Consider:
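A sketch of such a function, reconstructed from the description that follows (two ‘int’ pointers, and a result that must be 1 when they alias). The exact body and the demo helper are assumptions.

```c
#include <assert.h>

int silly(int *x, int *y)
{
    *x = 0;
    *y = 1;
    return *x;   /* looks like it must be 0, but x and y may alias */
}

/* With two distinct objects the result is indeed 0. */
void silly_distinct_demo(void)
{
    int a = 5;
    int b = 7;
    assert(silly(&a, &b) == 0);
    assert(b == 1);
}
```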

You might think that any decent compiler would generate simplified code equivalent to this:
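The hoped-for simplification could be sketched like this, where the name ‘silly_simplified’ is an assumption. The final load of ‘*x’ is replaced by the constant 0, which is only correct if ‘x’ and ‘y’ never alias.

```c
#include <assert.h>

int silly_simplified(int *x, int *y)
{
    *x = 0;
    *y = 1;
    return 0;    /* no load from memory: assumes *y = 1 did not change *x */
}

void simplified_demo(void)
{
    int a = 5;
    int b = 7;
    assert(silly_simplified(&a, &b) == 0);
    assert(a == 0);
    assert(b == 1);
}
```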

It’s not a matter of decency — the compiler just can’t do this optimization! Here’s the assembly output that clearly shows that the return value is loaded from memory:

The optimization is not possible because the caller could call ‘silly’ like so:
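A sketch of such a call, assuming the ‘silly’ function takes two ‘int’ pointers as described (its body and the demo helper are assumptions):

```c
#include <assert.h>

int silly(int *x, int *y)
{
    *x = 0;
    *y = 1;
    return *x;
}

void aliased_call_demo(void)
{
    int value = 5;
    int result = silly(&value, &value);   /* x and y both point to value */

    assert(result == 1);   /* the store through y is visible through x */
    assert(value == 1);
}
```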

In this case, ‘x’ and ‘y’ are aliased to the same ‘value’, which means ‘silly’ must return 1 not 0. Consequently, ‘*x’ must be read from memory, every time. Period.

ROOM FOR IMPROVEMENT

If you think about it, even though it may happen, pointer aliasing won’t happen very often in practice. Why waste so much potential for optimization for the uncommon case? Most likely, the folks from the C standards committee had the same line of thinking. They introduced rules that state when pointer aliasing must not happen. Enter the strict aliasing rule.

To facilitate compiler optimization, the strict aliasing rule demands that (in simple words) pointers to incompatible types never alias. Pointers to compatible types (like the two ‘int’ pointers ‘x’ and ‘y’ in ‘silly’) are assumed to (potentially) alias. Let’s make the pointer types incompatible (‘short*’ vs. ‘int*’):
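A sketch of the modified function, assuming the same body as before with only the first parameter type changed:

```c
#include <assert.h>

int silly(short *x, int *y)
{
    *x = 0;
    *y = 1;
    return *x;   /* the compiler may assume *y = 1 did not touch *x */
}

void distinct_objects_demo(void)
{
    short s = 5;
    int i = 7;
    assert(silly(&s, &i) == 0);
    assert(i == 1);
}
```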

As you can see, this time no load from memory is performed — 0 is returned instead. The optimization is possible because the compiler assumes that aliasing is not allowed in this case.

VIOLATIONS

But what happens if pointers to incompatible types nevertheless alias? After all, this can happen quite easily. Maybe not in the ‘silly’ example, but in real-world production code:
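A sketch of what such code might look like. Only the names ‘convert’, ‘struct measurements_t’ and ‘uint8_t’ come from the text; the struct members and the function signature are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct measurements_t {          /* member names are hypothetical */
    uint16_t sensor_id;
    uint16_t sample_count;
    uint32_t timestamp;
};

void convert(const uint8_t *buffer, struct measurements_t *out)
{
    const struct measurements_t *m = (const struct measurements_t *)buffer;
    *out = *m;   /* reads the bytes through a struct lvalue: the violation */
}

void convert_demo(void)
{
    struct measurements_t source = { 1, 2, 3 };
    struct measurements_t copy;

    /* Well-defined here only because the bytes really belong to a
       struct measurements_t object; a plain byte buffer would not be. */
    convert((const uint8_t *)&source, &copy);
    assert(copy.timestamp == 3);
}
```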

In an attempt to convert data stored in a buffer (maybe read over a network connection) into a high-level structure, a pointer to ‘struct measurements_t’ is aliased with a pointer to ‘uint8_t’. Since the two types are incompatible (pointer to struct vs. pointer to ‘uint8_t’), this code violates the strict aliasing rule. Experienced C developers most likely recognized immediately that this code yields undefined behavior, but they would probably have attributed it to struct padding and alignment issues. The real reason, as we know by now, is a violation of the strict aliasing rule.

THE FINE PRINT

So what exactly is the strict aliasing rule and what does “type compatibility” mean? Here’s an excerpt from the ISO C99 standard, section 6.5:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:

  • a type compatible with the effective type of the object,
  • a qualified version of a type compatible with the effective type of the object,
  • a type that is the signed or unsigned type corresponding to the effective type of the object,
  • a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
  • an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
  • a character type.


Such Standardeese is often hard to digest, so let me try to clarify it a bit. Aliased pointer access is fine if:

1. The pointed-at types are identical. Note that typedefs are just type aliases and don’t introduce new types:

2. The pointed-at types are identical apart from the “signed-ness” (e. g. ‘int’ vs. ‘unsigned int’).
3. The pointed-at types are identical apart from qualification (e. g. ‘const int’ vs. ‘int’).
4. The rule “an aggregate or union type that includes one of the aforementioned types among its members” is highly confusing and probably doesn’t mean much. Check this out for details.
5. The pointed-at types are different, but the type through which the access is made is a character type:
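The permitted cases 1, 2, 3 and 5 can be sketched in one small function. The typedef name ‘my_int_t’ and the helper name are assumptions, and the byte check assumes an ‘int’ without padding bits.

```c
#include <assert.h>
#include <stddef.h>

typedef int my_int_t;   /* hypothetical typedef: just another name for int */

void allowed_aliases_demo(void)
{
    int value = 1;

    my_int_t *same = &value;                           /* 1. identical types */
    unsigned int *u = (unsigned int *)&value;          /* 2. signedness only */
    const int *c = &value;                             /* 3. qualification only */
    unsigned char *bytes = (unsigned char *)&value;    /* 5. character type */
    unsigned int sum = 0;
    size_t i;

    *same = 2;
    assert(*u == 2u);
    assert(*c == 2);

    /* Inspecting the object byte by byte is always allowed. */
    for (i = 0; i < sizeof value; ++i) {
        sum += bytes[i];
    }
    assert(sum == 2u);
}
```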

Conversely, aliased pointer access is not defined if the pointed-at types are fundamentally different. Note that this includes pointers to structs that are identically defined but have different tag names:
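A sketch of the struct case, with hypothetical tag and member names. The offending call is shown only as a comment so that the snippet itself stays well-defined.

```c
#include <assert.h>

struct point_a { int x; int y; };   /* hypothetical tag names */
struct point_b { int x; int y; };   /* same members, different tag */

int get_x(const struct point_a *p)
{
    return p->x;
}

void tags_demo(void)
{
    struct point_a a = { 1, 2 };
    struct point_b b = { 3, 4 };

    assert(get_x(&a) == 1);

    /* get_x((const struct point_a *)&b) would access a struct point_b
       object through a struct point_a lvalue: undefined behavior. */
    assert(b.x == 3);
}
```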

CONCLUSION

The strict aliasing rule was introduced to give the compiler vendors some leeway regarding optimizations. By default, the compiler assumes that pointers to (loosely speaking) incompatible types never alias. As a consequence, you, the programmer, have to make sure that this rule is obeyed.

Here’s some disquieting news: a lot of existing code isn’t conforming to the strict aliasing rule, but the code works (or appears to work) fine anyway. As an example, the ‘convert’ function above, which aliases a struct to an array of bytes, might work fine on an Intel x86-based platform, which supports unaligned memory access. However, if you use ‘convert’ on an ARM-based platform that doesn’t support unaligned access, you may get a “bus error” exception that will crash your system. In other cases, nonconforming code just works by coincidence, with a particular compiler, or a particular compiler version at a particular optimization level.

To me, knowing about the strict aliasing rule is as important for every systems developer as knowing about the other systems programming “secrets” like alignment, struct padding, and endianness.

FURTHER READING

I’m indebted to the authors of the following two blog posts. From the first post, I shamelessly stole the idea for the ‘silly’ function, which the author originally called ‘foo’, which I found silly :-) The second post gives a much more detailed coverage of the strict aliasing rule and also discusses the interesting technique of “casting through a union”.

A GCC Compiler Mistake

“Most of the evil in this world is done by people with good intentions.”
— T.S. Eliot

Errors, defects, bugs, blunders: when we talk about software-related errors, we often use terms loosely and synonymously, but there are differences. For instance, in his book “The Design of Everyday Things”, Donald A. Norman makes a clear distinction between “mistakes” and “slips”:

“Errors come in several forms. Two fundamental categories are slips and mistakes. Slips result from automatic behavior, when subconscious actions that are intended to satisfy our goals get waylaid en route. Mistakes result from conscious deliberations.”

In short: mistakes are the result of faulty ideas whereas slips are errors made when implementing an idea. Usually, slips are not just easy to make, but also easy to fix. Fixing mistakes is typically much harder.

One of the easiest slips to make in C/C++ is to inadvertently do a boolean test on an assignment expression:
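A sketch of the slip, with hypothetical names. The function is meant to compare ‘a’ and ‘b’, but the single ‘=’ turns the test into an assignment.

```c
#include <assert.h>

int slip(int a, int b)
{
    if (a = b) {     /* meant: a == b */
        return 1;
    }
    return 0;
}

void slip_demo(void)
{
    assert(slip(1, 2) == 1);   /* reports "equal" although 1 != 2 */
    assert(slip(0, 0) == 0);   /* reports "not equal" although 0 == 0 */
}
```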

which is equivalent to:
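Spelled out as two separate steps, the same logic might look like this (the function name is an assumption):

```c
#include <assert.h>

int slip_expanded(int a, int b)
{
    a = b;            /* the assignment happens first */
    if (a != 0) {     /* then the assigned value is tested */
        return 1;
    }
    return 0;
}

void slip_expanded_demo(void)
{
    assert(slip_expanded(1, 2) == 1);
    assert(slip_expanded(0, 0) == 0);
}
```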

While in some rare cases this is exactly what the developer had in mind, in 99% of all cases it’s not. Hence, boolean-testing assignments is explicitly banned by many C/C++ coding standards and frowned-upon by most developers.

But what’s all the fuss about, you might ask. If an unlucky developer forgets to type the second ‘=’, any decent 21st century compiler surely generates a warning, doesn’t it? Well, the answer is, as we shall see, both yes and no.

If you compile the example above with GCC (I’ve tried version 5.4.0) using options ‘-W -Wall’, you do get a warning:

warning: suggest parentheses around assignment used as truth value

GCC’s reasoning is this: if developers really wanted to truth test the assignment (there are still people out there who do, as strange as this may sound), they need to put an extra pair of parentheses around the assignment, to show their intent:
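A sketch of the parenthesized form, using hypothetical names:

```c
#include <assert.h>

int assign_and_test(int a, int b)
{
    if ((a = b)) {   /* extra parentheses signal a deliberate assignment */
        return 1;
    }
    return 0;
}

void assign_and_test_demo(void)
{
    assert(assign_and_test(0, 5) == 1);
    assert(assign_and_test(5, 0) == 0);
}
```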

Requiring an extra set of parentheses seems to be a neat idea, but it’s the devil in disguise. For one thing, it reminds me of Sledge Hammer saying “Trust me, I know what I’m doing” (which was usually followed by disaster), for another, it doesn’t work reliably. In order to explain, I first need to put the same slip in a slightly more complicated expression:

In this case, you don’t just get a warning; your compiler will refuse to compile this code. Why? According to C’s precedence rules, the assignment operator has lower priority than the ‘&&’ operator, which means that the code is equivalent to
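Both the expression as written and the way the compiler groups it can be sketched as follows. The variable names follow the text; the macro ‘SHOW_COMPILE_ERROR’ is a hypothetical switch that keeps the rejected lines out of a normal build.

```c
#include <assert.h>

int both_equal(int a, int b, int c, int d)
{
#ifdef SHOW_COMPILE_ERROR            /* hypothetical switch, off by default */
    if (a == b && c = d) {           /* as written: one '=' is missing */
        return 1;
    }
    if (((a == b) && c) = d) {       /* how the compiler groups it */
        return 1;
    }
    return 0;
#else
    return a == b && c == d;         /* what was intended */
#endif
}

void both_equal_demo(void)
{
    assert(both_equal(1, 1, 2, 2) == 1);
    assert(both_equal(1, 1, 2, 3) == 0);
}
```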

The C language standard says that the result of an ‘&&’ expression is a so-called “rvalue” and an rvalue is more or less read-only. Thus, assigning ‘d’ to it is just not possible and GCC is right when it barks:

error: lvalue required as left operand of assignment

A slip that doesn’t compile is a kind slip, you might think, but read on. We only got lucky by accident, so to speak.

Many coding standards, like MISRA, for instance, require that you put parentheses around subexpressions to clearly show what precedence you have in mind, instead of relying on obscure operator precedence rules. Hence, instead of

you have to write
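The two spellings side by side, as a sketch with hypothetical function names. Both compute the same result; the second makes the intended grouping explicit.

```c
#include <assert.h>

int relies_on_precedence(int a, int b, int c, int d)
{
    return a == b && c == d;
}

int explicit_grouping(int a, int b, int c, int d)
{
    return (a == b) && (c == d);
}

void grouping_demo(void)
{
    assert(relies_on_precedence(1, 1, 2, 2) == explicit_grouping(1, 1, 2, 2));
    assert(relies_on_precedence(1, 2, 2, 2) == explicit_grouping(1, 2, 2, 2));
}
```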

MISRA exists to make coding errors unlikely, but if a MISRA-abiding developer forgets the second ‘=’, he’s out of luck, at least if he’s using GCC:
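A sketch of the parenthesized slip. Passing ‘c’ by pointer is an assumption made here so that the unintended assignment becomes visible to the caller.

```c
#include <assert.h>

int misra_slip(int a, int b, int *c, int d)
{
    if ((a == b) && (*c = d)) {   /* meant: *c == d */
        return 1;
    }
    return 0;
}

void misra_slip_demo(void)
{
    int c = 5;

    assert(misra_slip(1, 1, &c, 7) == 1);   /* "equal" although 5 != 7 */
    assert(c == 7);                         /* and c was silently overwritten */
}
```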

Now the devil reveals himself: since the parentheses are properly placed, there is no attempt to assign to an rvalue, so there won’t be a compile-time error and because of GCC’s “parentheses feature” mentioned above, GCC doesn’t issue a warning, either.

Early in my career as a software developer, I read the aforementioned book “The Design of Everyday Things” and I believe it left a mark on me. One of the book’s unforgettable key lessons is this:

“When you have trouble with things — whether it’s figuring out whether to push or pull a door or the arbitrary vagaries of the modern computer and electronics industry — it’s not your fault. Don’t blame yourself: blame the designer.”

GCC’s “extra parentheses” feature is far from neat design — it’s rather bad design that doesn’t work in all contexts and gives developers a false sense of security. It was deliberately put in and correctly implemented but the idea was wrong from the outset. Thus, it’s not a slip, but obviously a mistake.

Dangerously Confusing Interfaces IV: The Perils of C’s “safe” String Functions

“It ain’t what you don’t know that gets you into trouble. It’s what you know for sure that just ain’t so.”
–Mark Twain

Buffer overflows are among the most frequent causes of security flaws in software. They typically arise in situations such as when a programmer is 100% certain that the buffer to hold a user’s name is big enough — until a guy from India logs in. Thus, well-behaved developers always use the bounded-length versions of string functions. Alas, they come with differing, dangerously confusing interfaces.

THE GOOD

Let’s start with ‘fgets’:
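A sketch of such a call, assuming a 30-byte ‘user_name’ buffer as described below. The helper names are hypothetical, and the demo reads from a temporary file instead of a terminal; with a terminal you would pass ‘stdin’.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define USER_NAME_SIZE 30

/* Returns 1 on success, 0 on end-of-file or error. */
int read_user_name(FILE *in, char *user_name)
{
    return fgets(user_name, USER_NAME_SIZE, in) != NULL;
}

void read_user_name_demo(void)
{
    char user_name[USER_NAME_SIZE];
    FILE *in = tmpfile();

    if (in == NULL) {
        return;                    /* no temporary file available */
    }
    fputs("Villupuram Chinnaih Pillai Ganesan\n", in);
    rewind(in);
    if (read_user_name(in, user_name)) {
        /* The input is cut off after 29 characters plus the terminator. */
        assert(strlen(user_name) == USER_NAME_SIZE - 1);
    }
    fclose(in);
}
```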

No matter what users type into their terminals, ‘fgets’ will ensure that ‘user_name’ is a well-formed, zero-terminated string of at most 29 characters (one character is needed for the ‘\0’ terminator). The same goes for the ‘snprintf’ function. After executing the following code
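A sketch that produces the result described next, assuming a 4-byte ‘buffer’ and a longer source string (both are assumptions):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

void snprintf_demo(void)
{
    char buffer[4];

    /* At most sizeof(buffer) - 1 characters are stored, then '\0'. */
    snprintf(buffer, sizeof(buffer), "%s", "The quick brown fox");

    assert(strcmp(buffer, "The") == 0);
}
```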

‘buffer’ will contain the string “The”, again, properly zero-terminated.

Both functions follow the same, easy-to-grasp pattern: you pass a pointer to a target buffer as well as the buffer’s total size and get back a terminated string that doesn’t overflow the buffer. Awesome!

THE BAD

In order to copy strings safely, developers often reach for ‘strncpy’ to guard themselves against dreaded buffer overruns:
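A sketch of this tempting usage, assuming a 10-byte ‘buffer’ (the size is an assumption) and a user name that is longer than the buffer:

```c
#include <assert.h>
#include <string.h>

void naive_copy_demo(void)
{
    const char *user_name = "Villupuram Chinnaih Pillai Ganesan";
    char buffer[10];

    strncpy(buffer, user_name, sizeof(buffer));   /* looks safe */

    /* All 10 bytes hold name characters: there is no '\0' anywhere,
       so buffer must not be used as a C string. */
    assert(memchr(buffer, '\0', sizeof(buffer)) == NULL);
}
```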

Unfortunately, this is not how ‘strncpy’ works! We assumed that it followed the pattern established by ‘fgets’ and ‘snprintf’ but that’s not the case. Even if ‘strncpy’ promises that it never overflows the target buffer, it doesn’t necessarily zero-terminate it. What it does is copy up to ‘sizeof(buffer)’ bytes from ‘user_name’ to ‘buffer’ but if no ‘\0’ is among the bytes that are copied (i. e. ‘user_name’ comprises ‘sizeof(buffer)’ or more characters), ‘strncpy’ leaves you with an unterminated string! A traditional approach to solve this shortcoming is to enforce zero-termination by putting a ‘\0’ character as the last element of the target buffer after the call to ‘strncpy’:
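A sketch of this traditional fix, again assuming a 10-byte ‘buffer’:

```c
#include <assert.h>
#include <string.h>

void terminated_copy_demo(void)
{
    const char *user_name = "Villupuram Chinnaih Pillai Ganesan";
    char buffer[10];

    strncpy(buffer, user_name, sizeof(buffer));
    buffer[sizeof(buffer) - 1] = '\0';   /* enforce zero-termination */

    /* Nine characters survive, followed by the terminator. */
    assert(strcmp(buffer, "Villupura") == 0);
}
```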

Using ‘strncpy’ without such explicit string termination is almost always an error — a rather insidious one, as your code will work most of the time but not when the buffer is completely filled (i. e. your Indian colleague “Villupuram Chinnaih Pillai Ganesan” logs on).

Boy, oh boy is this inconsistent! ‘fgets’ and ‘snprintf’ give you guaranteed zero-termination but ‘strncpy’ doesn’t. A clear violation of the principle of least surprise. Apparently, ‘strncpy’ fixes one safety problem and at the same time lays the foundation for another one.

THE UGLY

Can it get worse? You bet! How do you think ‘strncat’, the bounded-length string concatenation function, behaves? Ponder this code:
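A sketch of the tempting call, with hypothetical helper names. The demo only uses input short enough to fit, because with longer input this call could write past the end of the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Passes the total buffer size as the third argument, which looks
   analogous to fgets and snprintf. */
void naive_concat(char *buffer, size_t buffer_size, const char *string2)
{
    strncat(buffer, string2, buffer_size);
}

void naive_concat_demo(void)
{
    char buffer[16] = "abc";

    naive_concat(buffer, sizeof(buffer), "def");
    assert(strcmp(buffer, "abcdef") == 0);   /* fine only because it fits */
}
```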

But this is wrong, of course: the third argument to ‘strncat’ (let’s call this argument ‘n’) is not the size of the target buffer. It is the maximum number of characters to copy from the source string (‘string2’) to the destination buffer (‘buffer’). If the length of the source string is greater than or equal to ‘n’, ‘strncat’ copies ‘n’ characters plus a ‘\0’ to terminate the target string. Confused? Don’t worry, here’s how you would use it to avoid concatenation buffer overruns:
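A sketch of the bounded call, with hypothetical helper names. It assumes that ‘buffer’ already holds a terminated string that is shorter than the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

void bounded_concat(char *buffer, size_t buffer_size, const char *string2)
{
    /* Remaining room: total size minus current length minus the '\0'. */
    strncat(buffer, string2, buffer_size - strlen(buffer) - 1);
}

void bounded_concat_demo(void)
{
    char buffer[8] = "abc";

    bounded_concat(buffer, sizeof(buffer), "defghijk");
    assert(strcmp(buffer, "abcdefg") == 0);   /* 7 characters plus '\0' */
}
```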

Yuck! What’s the likelihood that people remember this correctly?

THE REMEDY

Even if the different interfaces and behaviors of the bounded-length string functions in the C API make sense for certain use cases (or made sense at some point in time), the upshot is that they confuse programmers and potentially lead to new security holes when in fact they were supposed to plug them. What’s a poor C coder supposed to do?

As always, you can roll your own versions of bounded/safe string functions or use my safe version of ‘strcpy’. If you rather prefer something from the standard library, I’d suggest that you use ‘snprintf’ as a replacement for both, ‘strncpy’ and ‘strncat’:
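One possible shape of such replacements, as a sketch. The helper names ‘safe_copy’ and ‘safe_concat’ are hypothetical, and the concatenation writes at the current end of the string so that source and destination never overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Replacement for strncpy: always terminated, never overflows. */
void safe_copy(char *dest, size_t dest_size, const char *src)
{
    snprintf(dest, dest_size, "%s", src);
}

/* Replacement for strncat: appends src to the string already in dest. */
void safe_concat(char *dest, size_t dest_size, const char *src)
{
    size_t used = strlen(dest);

    if (used < dest_size) {
        snprintf(dest + used, dest_size - used, "%s", src);
    }
}

void safe_string_demo(void)
{
    char buffer[8];

    safe_copy(buffer, sizeof(buffer), "abc");
    safe_concat(buffer, sizeof(buffer), "defghijk");
    assert(strcmp(buffer, "abcdefg") == 0);
}
```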

Looks like ‘snprintf’ is the Swiss Army knife of safe string processing, doesn’t it? The moral is this: use whatever you’re comfortable with, but refrain from using ‘strncpy’ or ‘strncat’ directly.

More dangerously confusing interfaces…

Playgrounds Revamped

“Play is the highest form of research.”
— Albert Einstein

Many years ago, I wrote about the importance of having playgrounds, that is, easy-to-access try-out areas for carrying out programming-related experiments with the overall goal of exploring and learning.

Recently, I’ve reworked my C++ playground and uploaded it to GitHub. Compared to my previous C++ playground, the new one comes with the following major advantages:

  1. Shared access to playgrounds from multiple computers — since it is based on a Git repository.
  2. Every experiment has its own subdirectory — the top-level playground directory stays clean and clearly arranged.
  3. Unit test support through Google Test — running ‘make’ not just builds the experiment but also executes contained unit tests.

Once cloned and installed, you can start a new experiment like this:

‘pg-setup’ will create a directory called ‘init_within_loop_body’ along with a ‘Makefile’ and a ‘init_within_loop_body.cpp’ source file. Plus, if you have defined your ‘EDITOR’ environment variable properly, it will open ‘init_within_loop_body.cpp’ in your favorite editor for you. All that’s left to do is add your experiment’s code to the testcase template:

Now, just type/execute ‘make’ (either from within your editor or from the command-line) and your code will be compiled and run:

Pointers in C, Part II: CV-Qualifiers

“A teacher is never a giver of truth; he is a guide, a pointer to the truth that each student must find for himself.”
— Bruce Lee

In part I of this series, I explained what pointers are in general, how they are similar to arrays, and — more importantly — where, when, and why they are different to arrays. Today, I’ll shed some light on the so-called ‘cv qualifiers’ which are frequently encountered in pointer contexts.

CV-QUALIFIER BASICS

CV-qualifiers allow you to supplement a type declaration with the keywords ‘const’ or ‘volatile’ in order to give a type (or rather an object of a certain type) special treatment. Take ‘const’, for instance:
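A minimal sketch, using a hypothetical constant named ‘ANSWER’. The rejected assignment is shown as a comment so that the snippet still compiles.

```c
#include <assert.h>

const int ANSWER = 42;   /* hypothetical name and value */

int read_answer(void)
{
    /* ANSWER = 43;   rejected by the compiler: ANSWER is read-only */
    return ANSWER;
}

void const_demo(void)
{
    assert(read_answer() == 42);
}
```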

‘const’ is a guarantee that a value isn’t (inadvertently) changed by a developer. On top of that, it gives the compiler some leeway to perform certain optimizations, like placing ‘const’ objects in ROM/non-volatile memory instead of (expensive) RAM, or even not storing the object at all and instead inlining the literal value whenever it’s needed.

‘volatile’, on the other hand, prevents optimizations. It’s a hint to the compiler that the value of an object can change in ways not known by the compiler and thus the value must never be cached in a processor register (or inlined) but instead always loaded from memory. Apart from this ‘don’t optimize’ behavior, there’s little that ‘volatile’ guarantees. In particular, and contrary to common belief, it’s no cure for typical race condition problems. It’s mostly used in signal handlers and to access memory-mapped hardware devices.

Even if it sounds silly at first, it’s possible to combine ‘const’ and ‘volatile’. The following code declares a constant that shall not be inlined/optimized:
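A sketch of such a declaration. The name ‘MAX_SENSORS’ is taken from the text below, while the value 8 and the accessor function are assumptions.

```c
#include <assert.h>

const volatile int MAX_SENSORS = 8;   /* read-only for the code, never cached */

int max_sensors(void)
{
    /* MAX_SENSORS = 9;   rejected: const forbids writes from the code */
    return MAX_SENSORS;   /* volatile forces a real load on every call */
}

void const_volatile_demo(void)
{
    assert(max_sensors() == 8);
}
```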

Using both ‘const’ and ‘volatile’ together makes sense when you want to ensure that developers can’t change the value of a constant and at the same time retain the possibility to update the value through some other means, later. In such a setting, you would place ‘MAX_SENSORS’ in a dedicated non-volatile memory section (i.e. flash or EEPROM) that is independent of the code, e.g. a section that only hosts configuration values*. By combining ‘const’ and ‘volatile’ you ensure that the latest configuration values are used and that these configuration values cannot be altered by the programmer (i.e. from within the software).

To sum it up, ‘const’ means “not modifiable by the programmer” whereas ‘volatile’ denotes “modifiable in unforeseeable ways”.

CV-QUALIFIERS COMBINED WITH POINTERS

Like I stated in the intro, cv-qualifiers often appear in pointer declarations. However, this poses a problem because we have to differentiate between cv-qualifying the pointer and cv-qualifying the pointed-to object. There are “pointers to ‘const’” and “‘const’ pointers”, two terms that are often confused. Here’s code involving a pointer to a constant value:
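A sketch with hypothetical variable names. The rejected statement is kept as a comment.

```c
#include <assert.h>

void pointer_to_const_demo(void)
{
    int value = 1;
    int other = 2;
    const int *pc = &value;   /* pointer to const int */

    /* *pc = 3;   rejected: no writes through a pointer to const */
    value = 3;                /* the object itself is still mutable */
    assert(*pc == 3);

    pc = &other;              /* fine: the pointer itself may change */
    assert(*pc == 2);
}
```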

Since the pointer is declared as pointing to ‘const’, no changes through this pointer are possible, even if it points to a mutable object in reality.

Constant pointers, on the other hand, behave differently. Have a look at this example:
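A sketch of a constant pointer, again with hypothetical names:

```c
#include <assert.h>

void const_pointer_demo(void)
{
    int value = 1;
    int other = 2;
    int *const cp = &value;   /* const pointer to (non-const) int */

    *cp = 3;                  /* fine: the pointed-to value may change */
    assert(value == 3);

    /* cp = &other;   rejected: the pointer itself is read-only */
    assert(other == 2);
}
```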

The takeaway is this: if the ‘const’ keyword appears to the left of the ‘*’, the pointed-to value is ‘const’ and hence we are dealing with a pointer to ‘const’; if the ‘const’ keyword is to the right of the ‘*’, the pointer itself is ‘const’. Of course, it’s possible to have the ‘const’ qualifier on both sides at the same time:
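A sketch with ‘const’ on both sides of the ‘*’ (names are hypothetical):

```c
#include <assert.h>

void const_pointer_to_const_demo(void)
{
    int value = 1;
    int other = 2;
    const int *const cpc = &value;   /* const pointer to const int */

    /* *cpc = 3;      rejected: the pointed-to value is read-only */
    /* cpc = &other;  rejected: the pointer is read-only, too */
    assert(*cpc == 1);
    assert(other == 2);
}
```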

The same goes for multi-level pointers:
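One declaration that matches the description given next, as a sketch. The intermediate variables are hypothetical and only exist to give ‘v’ something valid to point to.

```c
#include <assert.h>

void multi_level_demo(void)
{
    const int x = 5;
    const int *inner = &x;               /* pointer to const int */
    const int **const middle = &inner;   /* const pointer to the above */
    const int **const *v = &middle;      /* regular pointer to the above */

    assert(***v == 5);
}
```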

Here, ‘v’ is a regular (non-‘const’) pointer to ‘const’ pointer to a pointer to a ‘const’ integer.

Yuck! Sometimes, I really wish the inventors of C had used ‘<-‘ instead of ‘*’ for pointer declarations — the resulting code would have been easier on the eyes! Consider:

versus

So

would read from right to left as “v is a POINTER TO const POINTER TO const int”. Life would be so much simpler… but let’s face reality and stop day-dreaming!

Everything I said about ‘const’ equally applies to pointers to ‘volatile’ and ‘volatile’ pointers: pointers to ‘volatile’ ensure that the pointed-to value is always loaded from memory whenever a pointer is dereferenced; with ‘volatile’ pointers, the pointer itself is always loaded from memory (and never kept in registers).

Things really get complicated when there is a free mix of ‘volatile’ and ‘const’ keywords with pointers involving more than two levels of indirection:

Let’s better not go there! If you are in multi-level pointer trouble, remember that there’s a little tool called ‘cdecl’ which I showcased in the previous episode. But now let’s move on to the topic of how and when cv-qualified pointers can be assigned to each other.

ASSIGNMENT COMPATIBILITY I

Pointers are assignable if the pointer on the left hand side of the ‘=’ sign is not more capable than the pointer on the right hand side. In other words: you can assign a less constrained pointer to a more constrained pointer, but not vice versa. If you could, the promise made by the constrained pointer would be broken:

If the previous statement was legal, a programmer could suddenly get write access to a read-only variable:
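Both directions can be sketched as follows, with hypothetical names. The statements that must be rejected are kept as comments.

```c
#include <assert.h>

void assignment_demo(void)
{
    const int VALUE = 42;
    int counter = 0;

    int *p = &counter;
    const int *pc = p;        /* fine: less constrained to more constrained */
    assert(*pc == 0);

    pc = &VALUE;
    /* p = pc;      rejected: would drop the const promise */
    /* *p = 43;     and this would then write to a read-only object */
    assert(*pc == 42);
}
```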

Again, the same restrictions hold for pointers to ‘volatile’. In general, pointers to cv-qualified objects are more constrained than their non-qualified counterparts and hence may not appear on the right hand side of an assignment whose left hand side is a pointer to a non-qualified object. By the same token, this is not legal:

ASSIGNMENT COMPATIBILITY II

The rule which requires that the right hand side must not be more constrained than the left hand side might lead you to the conclusion that the following code is perfectly kosher:
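A sketch of the single-level and the two-level case side by side (names are hypothetical; the rejected statement is kept as a comment):

```c
#include <assert.h>

void double_pointer_demo(void)
{
    int value = 1;
    int *p = &value;
    int **pp = &p;

    const int *pc = p;          /* accepted: int * to const int * */
    /* const int **ppc = pp;       rejected: int ** to const int ** */

    assert(*pc == 1);
    assert(**pp == 1);
}
```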

However, it’s not, and for good reason, as I will explain shortly. But it’s far from obvious and it’s a conundrum to most — even seasoned — C developers. Why is it possible to assign a pointer to non-const to a pointer to ‘const’:

but not a pointer to a pointer to non-const to a pointer to a pointer to ‘const’?

Here is why. Imagine this example:

Graphically, our situation is this. ‘ppc’ points to ‘p’ which in turn points to some random memory location, as it hasn’t been initialized yet:

Now, when we dereference ‘ppc’ one time, we get to our pointer ‘p’. Let’s point it to ‘VALUE’:

It shouldn’t surprise you that this assignment is valid: the right hand side (pointer to const int) is not more constrained than the left hand side (also pointer to const int). The resulting picture is this:

Everything looks safe. If we attempt to update ‘VALUE’, we won’t succeed:

But we are far from safe. Remember that we also (indirectly) updated ‘p’, which was declared as pointing to a non-const int? The compiler would happily accept the following assignment:

which leads to undefined behavior, as the C language standard calls it.
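The whole chain of steps can be sketched in one place. The names ‘VALUE’, ‘p’ and ‘ppc’ follow the text; the macro ‘ASSUME_ASSIGNMENT_IS_LEGAL’ is a hypothetical switch that keeps the rejected code out of a normal build.

```c
#include <assert.h>

static const int VALUE = 42;

#ifdef ASSUME_ASSIGNMENT_IS_LEGAL     /* hypothetical switch, off by default */
void backdoor(void)
{
    int *p;
    const int **ppc = &p;   /* the assignment the compiler rejects */

    *ppc = &VALUE;          /* valid on its own; p now points to VALUE */
    /* **ppc = 43;             rejected: write through a pointer to const */
    *p = 43;                /* accepted, yet it writes to a const object */
}
#endif

/* Without the backdoor, VALUE keeps its promise. */
void value_unchanged_demo(void)
{
    assert(VALUE == 42);
}
```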

This example should have convinced you that it’s a good thing that the compiler rejects the assignment from ‘int**’ to ‘const int**’: it would open up a backdoor for granting write access to more constrained objects. Finding the corresponding words in the C language standard is not so easy, however, and requires some digging. If you feel “qualified” enough (sorry for the pun), look at chapter “6.5.16.1 Simple assignment”, which states the rules of object assignability. You probably also need to have a look at “6.7.5.1 Pointer declarators” which details pointer type compatibility as well as “6.7.3 Type qualifiers” which specifies compatibility of qualified types. Putting this all into a cohesive picture is left as an exercise to the diligent reader.

________________________________
*) Separating code from configuration values is generally a good idea in embedded context as it allows you to replace either of them independently.

The Right to Choose

Shame on me! For the first time in a decade, I forgot to celebrate Towel Day. The fact that my towel proved so incredibly useful a couple of days earlier makes me feel even guiltier.

Douglas Adams is admired by most of his fans because he was not just tremendously funny, but also a shrewd freethinker who loved to show people how limited and simple-minded their views and beliefs are (anyone remember the Total Perspective Vortex?). But actually, he himself had his eyes opened one day by another person: in an interview with BBC, Douglas Adams disclosed that one of the most influential, eye-opening books he had read was “The Blind Watchmaker” by Richard Dawkins.

In his book, Richard Dawkins demonstrates that by Darwin’s Theory of Evolution and natural selection enormous complexity can arise out of even the simplest building blocks and that there really is no need for a Creator. Douglas Adams and Richard Dawkins were so like-minded that it’s no wonder they later became close friends.

At the end of one of his talks, Richard Dawkins presents a slide titled “The Illogic of Default”, which demonstrates the ill-reasoning of many Creationists:

1. We have theory A and theory B
2. Theory A is supported by loads of evidence
3. Theory B is supported by no evidence at all
4. I can’t understand how theory A explains X
5. Therefore theory B must be right

This exactly describes what happened to me one day when a Muslim taxi driver tried to “prove” to me that there indeed must be a god. “See”, he said, “the theory of evolution just can’t be right. If we humans are really descendants of animals like apes and dogs, why are there still apes and dogs around?”.

I refrained from arguing with my taxi driver, even though it was obvious that he didn’t understand the Theory of Evolution at all. How could I resist the temptation?

I absolutely admire Richard Dawkins for his wit and I recommend that you watch all of his YouTube talks and videos, but there is a problem with atheists like him: sometimes, they are just as annoying as followers of any other conviction when they try to foist their views on others. Even if we know that Creationists’ arguments are utterly wrong from a logical and scientific point of view, it doesn’t help: it’s a documented fact that a significant part of humankind has a strong desire for spirituality. And, as we all know, whenever strong desires are involved, appealing to logic doesn’t work: drug addicts do know that drugs ultimately kill them but they nevertheless don’t quit.

Likewise, deep in their hearts, many believers in god assume that there is no god, at least not one like the one that is depicted in the holy scriptures, but they need a god for their mental well-being, anyway. Trying to prove to them that there is no Creator is a) futile and b) hurts them — not physically but emotionally. This is why I didn’t argue with the taxi driver. I strive to follow the Golden Rule, which — incidentally — also appears in many holy scriptures: always treat others the way you want to be treated.

So while I and many other people prefer to have their eyes opened, others don’t. Everybody should be free to choose the color of the pill they take.