« Posts under Code

Pointers in C: Part VII, Being Relaxed About The Strict Aliasing Rule

“I am free, no matter what rules surround me. If I find them tolerable, I tolerate them; if I find them too obnoxious, I break them. I am free because I know that I alone am morally responsible for everything I do.”
― Robert A. Heinlein

The largely unknown “Strict Aliasing Rule” (SAR) has the potential to send tears to the eyes of even the most seasoned C/C++ developers. Why? Because of it, a lot of the code they have written over the years belongs to the realm of “undefined behavior”.

Despite its name, the term “undefined behavior” itself is well-defined by the C language standard: it’s “behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements. Which means anything can happen: your program could randomly crash or even send suggestive emails to your boss.


Let’s start with the code snippet that I used in my original post on SAR:

Here, data that has been received into a buffer (‘data’) is converted into a high-level data structure (‘measurements’). From the compiler’s point of view, what ‘data’ refers to is just a single ‘uint8_t’ but we access it through a pointer to type ‘struct measurements_t’. What we’ve got here is a clear violation of SAR, which entails undefined behavior.


“But, Ralf”, you might respond, “this can’t be true. I write code like this every day and it works flawlessly, even in safety-critical systems like medical devices!”

This doesn’t surprise me in the least. “Undefined behavior” can — get this — also mean “works flawlessly”. But there are no guarantees, whatsoever. It might work on one platform, with a particular compiler or compiler version, but might fail on another platform, or with a different compiler version. Hence, to err on the truly safe side (which you should, especially if you work on safety-critical systems), you should use truly safe alternatives.

One obvious and time-proven approach is to do such so-called type punning through unions. It works by storing data via a member of one type and reading it via another member of a different type:

The receiving function would store byte-wise into the ‘receive_buffer.data’ array, while high-level functions would use the ‘receive_buffer.measurements’ member. This will work reliably in any version of C, but it might fail in C++.

Bulletproof type-punning, one that works in both, C and C++, uses ‘memcpy’. ‘memcpy’!? ‘memcpy’, that’s right:

Believe it or not, there’s a high probability that your compiler will optimize-out the call to ‘memcpy’. I’ve observed this, among others, with ‘gcc’ and ‘clang’, but I’ve also seen compilers always call ‘memcpy’, even for the smallest amounts of data copied, regardless of the optimization level used (Texas Instruments ARM C/C++ compiler 19.6, for instance). Nevertheless, this is my go-to type-punning technique these days, unless performance is paramount. (You first have to prove that your code really impacts overall performance by profiling. Otherwise, your optimizations are like buying Dwayne Johnson an expensive hair brush — it doesn’t really harm, but it’s not of much use, either.)


Sometimes, you have to use SAR-breaking casts, if only to maintain social peace in your team. So how likely is it that your compiler will do something obscene?

VERY unlikely, at least in this example. Let me explain.

First of all, compiler vendors know that most developers either haven’t heard about SAR or at least don’t give a foo about it. Therefore, they usually don’t aggressively optimize such instances. This is particularly true for compilers that are part of toolchains used in deeply (bare-metal) embedded systems. However, ‘gcc’ as well as ‘clang’, which are used in all kinds of systems, take advantage of SAR from optimization level 2 on. (You can explicitly disable SAR-related optimizations regardless of the optimization level by passing the ‘-fno-strict-aliasing’ option.)

Second, what ‘convert’ is doing is pretty much well-behaved. Sure, it aliases the ‘data’ and ‘measurements’ pointers, but it never accesses them concurrently. Once the ‘measurements’ pointer has been created, the ‘data’ pointer is not used anymore. If the caller (or the whole call-chain) are equally well-behaved, I don’t see a problem (don’t trust me!).

Third, there’s no aliased read/write access. Even if ‘data’ and ‘measurements’ were used concurrently, it wouldn’t be a problem, as long as both are only used for reading data (don’t trust me on this one, either!). By contrast, this I consider harmful:

To the compiler ‘data’ and ‘measurements’ are two totally unrelated pointers to unrelated memory areas. The original value of ‘data[0]’ might be cached in a register and not refetched from memory, hence the ‘assert’ might fail. In general, this is what will most likely happen when SAR is violated in contexts where it does matter: instead of suggestive emails being sent to your boss, you are much more likely got get stale values (which of course could lead to crashes later on).


Let’s get real about SAR. Here are some relaxed, pragmatic rules on how to deal with the Strict Aliasing Rule:

0. Fully understand SAR
1. Try hard to adhere to SAR
2. Type-pun using ‘memcpy’
3. If you can’t, disable SAR-related compiler optimizations
4. If you can’t, avoid concurrent, aliased read/write access

But don’t assume that just because you didn’t get a ticket for speeding in the past, you will never ever get a ticket for speeding. What you’re doing is against the law. If you get busted someday, don’t whine and don’t complain I didn’t warn you. Rather own your failures and move on.

Breakin’ rocks in the hot sun
I fought the law and the law won
I fought the law and the law won
I fought the law and the law won
I fought the law and the law won

(with apologies to Sonny Curtis)

Bug Hunting Adventures #14: Bitmap [BM]adness (Solution)

It’s a given fact of life that something that’s deemed totally safe in one environment may be totally unsafe in another. Every German who has ever used an American sauna knows what I’m talking about.

Similar (but far less embarrassing!) traps lurk in situations where you reuse perfectly working C++ code in a C environment. Some time ago, I integrated a little home-grown C++ library into a plain C project. However, instead of the expected, proven functionality I got plenty of core dumps. After some assembly-level debugging, I came to the conclusion that I had found a compiler bug. Code along these lines

was compiled to this:

Why the heck did the compiler insert an offset of 4 instead of 1?

The answer to this question, which is also the answer to our bug hunting adventure, can be found here.

Bug Hunting Adventures #14: Bitmap [BM]adness

“What’s the meaning of goodness if there isn’t a little badness to overcome?”
― Anne Revere

The code below is part of a C graphics processing library, which parses data in the venerable bitmap (BMP) file format. A bitmap file consists of a two parts: a header and the pixel data block. More specifically, a bitmap file is laid-out like this:

Offset Size Content
0 1 Character ‘B’
1 1 Character ‘M’
2 4 Size of the bitmap file
6 4 Reserved
10 4 Offset to the first byte of the pixel data (ofs)
14 n Info block
ofs m Pixel data

All multi-byte integer values (like the bitmap file size and the offset to the pixel data) are stored in little-endian format.

The function ‘bmp_pixel_data’ takes a pointer to a bitmap file data and returns a pointer to the bitmap’s pixel data area within the bitmap. The size of the pixel data area is returned via the ‘size’ out parameter. In case the provided bitmap file data is malformed, a NULL pointer is returned and the ‘size’ out parameter is set to zero.

As always, the code compiles cleanly without warnings (at ‘-W -Wall’), but when the function ‘bmp_pixel_data’ was put to use, it failed miserably. Where did the programmer goof?


Simplified: The Weasel That Teaches Evolution

“Weaseling out of things is important to learn. It’s what separates us from the animals… except the weasel.”
— Matt Groening

It’s Towel Day again! What a great opportunity to reflect upon Life, the Universe, and Everything. In this installment of the “Simplified” series, I want to tackle nothing less than Darwin’s theory of evolution. Why? Because I seriously believe that the world would be a better place if people finally understood it.

One common misconception, for example, is that some falsely believe that evolution occurs in a linear fashion, from single-celled organisms to homo sapiens. Such individuals, like the taxi driver that I mentioned in a previous post, think that they can smash the theory of evolution by posing this cunning question:

“If we humans are really decedents of animals like apes and dogs, why are there still apes and dogs around?”

Just in case this reasoning sounds conclusive to you as well: In reality, there is no single line of evolution, it’s rather a tree with many, many branches. On these branches, the universe performs experiments and decides which species survive, through natural selection. Consequently, humans did not evolve from apes, they are rather cousins whose common ancestor was neither ape nor human.

Another fallacy is Fred Hoyle’s famous Boeing 747 analogy. Hoyle was a brilliant scientist, no doubt, yet he refused to accept that complex life-forms can emerge by chance:

“A junkyard contains all the bits and pieces of a Boeing-747, dismembered and in disarray. A whirlwind happens to blow through the yard. What is the chance that after its passage a fully assembled 747, ready to fly, will be found standing there?”

A variant of this argument is the infinite monkey theorem: a monkey typing random letters at a typewriter will never be able to produce Shakespeare’s works.

In 1986, evolutionary biologist Richard Dawkins set out to dispel such misunderstandings by implementing a computer program that works—in Dawkins’ own words—like this:

“It […] begins by choosing a random sequence of 28 letters, […] it duplicates it repeatedly, but with a certain chance of random error – ‘mutation’ – in the copying. The computer examines the mutant nonsense phrases, the ‘progeny’ of the original phrase, and chooses the one which, however slightly, most resembles the target phrase, METHINKS IT IS LIKE A WEASEL.”

Here’s a more detailed explanation of his weasel program: it starts with a string of 28 random letters. Next, it creates N offspring strings by copying the original 28 random letters N times. When being copied, the chance of a letter being changed into another (random) letter is P. Now that we have a set of N new strings, we compare each string letter by letter against “METHINKS IT IS LIKE A WEASEL” (a quote from Shakespeare’s Hamlet, by the way) and pick the one with the most character matches (the highest match score). This one is deemed the “fittest” and kept as the survivor of the first generation; the other N-1 strings are discarded. We repeat the whole process by creating again N offspring from the survivor, then pick a new survivor and so on until the survivor finally matches “METHINKS IT IS LIKE A WEASEL”.

Let’s choose N to be 100 (one hundred offspring per generation) and P to be 5%, as in Dawkins original experiment. How many generations would it take until we finally reach “METHINKS IT IS LIKE A WEASEL”? Just guess a number, I’ll wait here…

Answer: about 50 generations! To me, this number was so ridiculously small that I decided to implement the weasel program myself. Please check it out here. Through command-line parameters, you can change the values of N and P, as well as the target phrase. Toying around with this program is so eye-opening, you suddenly realize that evolution is possible, that complexity can emerge by mutation and survival of the fittest.

Interestingly, if you set P to a high value (say 20%), you are likely to never arrive at the target phrase. A high value of P simulates an early universe, or any environment with a high natural radioactivity level. In one experiment where I set P to 20%, it took 800 generations, while a P value of 1% reached the target phrase after only 90 generations.

I also played with much longer target phrases with a length of more than 150 letters. What I’ve found is that the longer the phrase gets, the more detrimental a high mutation probability becomes. If you think about it, that’s not surprising, as the likelihood that already matching letters are replaced again with non-matching letters increases. Conclusion: the evolution of complex life-forms requires an environment that provides the right mutation probability—neither too low, nor too high. But while a low mutation probability can always be compensated by time (evolution will just progress slower), a high mutation probability stifles evolution entirely.

Anyway, while the weasel program is a blatant simplification of real life, I think it’s a great tool for demonstrating that random variation combined with non-random cumulative selection can produce complex structures. This is what evolution is all about; not monkeys hacking away at keyboards and winds blowing through airplane junkyards.

Share and Enjoy!

Coding Horrors: Sometimes They Come Back

“Within its gates I heard the sound
Of winds in cypress caverns caught
Of huddling trees that moaned, and sought
To whisper what their roots had found.”
― George Sterling, A Dream of Fear

This month’s post is a guest post by Giles Payne, a teammate from a project at Giesecke & Devrient Mobile Security almost two decades ago. At the time, we were trying to teach good coding practices by distributing so-called “Coding Horrors”, bad code examples, that showed what not to do. The other day, Giles sent me such a “Coding Horror” that he came across recently. I couldn’t help but talk him into blogging about it. Giles, take it away!

Thank you, Ralf! Let’s dive right into the code — for fun please take a minute to try and work out what this little piece of Java is doing.

(5 minute pause for head scratching.)

Were you able to work it out? No, I didn’t think so. So what is going on?

This code comes from a kind of simple browser-like application that needs to handle a stack of displayed pages. In some cases it will manage a history of the pages, in other cases it doesn’t care about the history — hence the “USE_PAGE_HISTORY” descriptor.

But what exactly is it doing? Well, what this particular code does is change the meaning of the equals comparator for PageIds to always return true when USE_PAGE_HISTORY is false. (Face palm.)

In an attempt to pinpoint exactly what is so horrendous about this I’ve written up a few basic “programming principles” that I feel have been grievously violated here.

Principle #1: Don’t be language snob.

If you’re programming Java, then program Java, if you’re programming Visual Basic, then program Visual Basic. The person that wrote this code was obviously heavily into functional programming languages. So totally into functional programming languages, that the thought of writing code in a non-functional programming language like Java drove him or her to distraction — or more exactly to use the “Functional Java” library. “Functional Java” is basically an attempt to turn Java into a functional programming language, which it doesn’t really manage to do. What it does do is turn your code into cryptic, impenetrable gobbledygook like this.

Principle #2: Don’t bury important logic deep in non-obvious abstractions.

The usePageHistory moniker is pretty easy to understand. If I see code like

it’s pretty easy to follow. If I had lots of places with the same if-else logic, I might separate the logic into two classes like PageStackBasic and PageStackWithHistory. Hiding the logic like whether you’re using history or not inside the equals comparison operator is a recipe for disaster because nobody is ever going to think of looking there (see next principle).

Principle #3: Don’t override standard operators in non-standard ways.

Some languages like C++ let you override arithmetic/boolean operators — others don’t. There is a lot of debate among language designers about whether allowing operator overriding is a good thing or a bad thing. On the plus side there are some very cool things you can do, for example if you have a class representing matrices then you can override the * operator to do matrix multiplication. On the minus side, they let you write stuff like this. While this code probably works now — it’s the kind of code that probably will stop working one day and when it does, it will take out an entire country’s air-traffic control system*. While it probably seemed like a clever or elegant piece of code to the person that wrote it — it’s a maintenance programmer’s nightmare.

Principle #3 is actually picked up on in the highly amusing “How to write unmaintainable code“. It’s basically a back-to-front version of “Coding Horrors” where bad coding practices are encouraged as means to achieving a job for live. (Oh my gosh, this is a Fake Surgeon’s gold mine! — Ralf’s note.)


*) Relax — this code is not part of an air-traffic control system. My point was simply that when bad code like this blows up, the impact usually tends to be massive.

Bug Hunting Adventures #13: Prime Sums (Solution)

The challenge suffers from what I call a “chain of blunders”, where one blunder leads to another. Here are the exact details, in the traditional format.

The first who got close to the true nature of this bug was reader Shlomo who commented directly on the post, but I held back his comment in order not to spoil the fun for others. (Unfortunately, I couldn’t tell him, because he used a bogus email address—boo!). Christian Hujer, hacker extraordinaire, gave the most precise and extensive account on LinkedIn. While many found the blunder in the Makefile (Joe Nelson was the first), it was apparently such a good smokescreen that many people didn’t look any further. To me, the root blunder that started the chain of blunders is in the C language itself, which should have never allowed implicit zero-initialization of constants in the first place (which was corrected in C++).

Some believed that the preincrement of the loop-counter was the culprit as it would skip the first prime, but that’s not the case. The expression after the second semicolon gets evaluated always at the end of the loop body:

is equivalent to

Substitute ++i or i++ for <e> — there’s no difference!

On a general note, guys, please register by entering your email address in the top right corner to ensure that you will get automatic notifications for new posts as soon as they’re published. I also (usually) announce new posts on LinkedIn, but mostly hours if not days later. Nevertheless, connecting with me on LinkedIn is always a good idea and highly encouraged. Your subscriptions, likes, praise, and criticism keep me motivated to carry on, so don’t hold back!