Pointers in C: Part VII, Being Relaxed About The Strict Aliasing Rule

“I am free, no matter what rules surround me. If I find them tolerable, I tolerate them; if I find them too obnoxious, I break them. I am free because I know that I alone am morally responsible for everything I do.”
― Robert A. Heinlein

The largely unknown “Strict Aliasing Rule” (SAR) has the potential to send tears to the eyes of even the most seasoned C/C++ developers. Why? Because of it, a lot of the code they have written over the years belongs to the realm of “undefined behavior”.

Despite its name, the term “undefined behavior” itself is well-defined by the C language standard: it’s “behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements. Which means anything can happen: your program could randomly crash or even send suggestive emails to your boss.

THE PROBLEM

Let’s start with the code snippet that I used in my original post on SAR:

Here, data that has been received into a buffer (‘data’) is converted into a high-level data structure (‘measurements’). From the compiler’s point of view, what ‘data’ refers to is just a single ‘uint8_t’ but we access it through a pointer to type ‘struct measurements_t’. What we’ve got here is a clear violation of SAR, which entails undefined behavior.

SAFE ALTERNATIVES

“But, Ralf”, you might respond, “this can’t be true. I write code like this every day and it works flawlessly, even in safety-critical systems like medical devices!”

This doesn’t surprise me in the least. “Undefined behavior” can — get this — also mean “works flawlessly”. But there are no guarantees, whatsoever. It might work on one platform, with a particular compiler or compiler version, but might fail on another platform, or with a different compiler version. Hence, to err on the truly safe side (which you should, especially if you work on safety-critical systems), you should use truly safe alternatives.

One obvious and time-proven approach is to do such so-called type punning through unions. It works by storing data via a member of one type and reading it via another member of a different type:

The receiving function would store byte-wise into the ‘receive_buffer.data’ array, while high-level functions would use the ‘receive_buffer.measurements’ member. This will work reliably in any version of C, but it might fail in C++.

Bulletproof type-punning, one that works in both, C and C++, uses ‘memcpy’. ‘memcpy’!? ‘memcpy’, that’s right:

Believe it or not, there’s a high probability that your compiler will optimize-out the call to ‘memcpy’. I’ve observed this, among others, with ‘gcc’ and ‘clang’, but I’ve also seen compilers always call ‘memcpy’, even for the smallest amounts of data copied, regardless of the optimization level used (Texas Instruments ARM C/C++ compiler 19.6, for instance). Nevertheless, this is my go-to type-punning technique these days, unless performance is paramount. (You first have to prove that your code really impacts overall performance by profiling. Otherwise, your optimizations are like buying Dwayne Johnson an expensive hair brush — it doesn’t really harm, but it’s not of much use, either.)

BUT I REEELLY, REEELLY MUST USE CASTS

Sometimes, you have to use SAR-breaking casts, if only to maintain social peace in your team. So how likely is it that your compiler will do something obscene?

VERY unlikely, at least in this example. Let me explain.

First of all, compiler vendors know that most developers either haven’t heard about SAR or at least don’t give a foo about it. Therefore, they usually don’t aggressively optimize such instances. This is particularly true for compilers that are part of toolchains used in deeply (bare-metal) embedded systems. However, ‘gcc’ as well as ‘clang’, which are used in all kinds of systems, take advantage of SAR from optimization level 2 on. (You can explicitly disable SAR-related optimizations regardless of the optimization level by passing the ‘-fno-strict-aliasing’ option.)

Second, what ‘convert’ is doing is pretty much well-behaved. Sure, it aliases the ‘data’ and ‘measurements’ pointers, but it never accesses them concurrently. Once the ‘measurements’ pointer has been created, the ‘data’ pointer is not used anymore. If the caller (or the whole call-chain) are equally well-behaved, I don’t see a problem (don’t trust me!).

Third, there’s no aliased read/write access. Even if ‘data’ and ‘measurements’ were used concurrently, it wouldn’t be a problem, as long as both are only used for reading data (don’t trust me on this one, either!). By contrast, this I consider harmful:

To the compiler ‘data’ and ‘measurements’ are two totally unrelated pointers to unrelated memory areas. The original value of ‘data[0]’ might be cached in a register and not refetched from memory, hence the ‘assert’ might fail. In general, this is what will most likely happen when SAR is violated in contexts where it does matter: instead of suggestive emails being sent to your boss, you are much more likely got get stale values (which of course could lead to crashes later on).

NO PUN INTENDED

Let’s get real about SAR. Here are some relaxed, pragmatic rules on how to deal with the Strict Aliasing Rule:

0. Fully understand SAR
1. Try hard to adhere to SAR
2. Type-pun using ‘memcpy’
3. If you can’t, disable SAR-related compiler optimizations
4. If you can’t, avoid concurrent, aliased read/write access

But don’t assume that just because you didn’t get a ticket for speeding in the past, you will never ever get a ticket for speeding. What you’re doing is against the law. If you get busted someday, don’t whine and don’t complain I didn’t warn you. Rather own your failures and move on.

Breakin’ rocks in the hot sun
I fought the law and the law won
I fought the law and the law won
I USED SOME CASTS FOR TYPE PUN
I fought the law and the law won
I fought the law and the law won

(with apologies to Sonny Curtis)

Bug Hunting Adventures #14: Bitmap [BM]adness (Solution)

It’s a given fact of life that something that’s deemed totally safe in one environment may be totally unsafe in another. Every German who has ever used an American sauna knows what I’m talking about.

Similar (but far less embarrassing!) traps lurk in situations where you reuse perfectly working C++ code in a C environment. Some time ago, I integrated a little home-grown C++ library into a plain C project. However, instead of the expected, proven functionality I got plenty of core dumps. After some assembly-level debugging, I came to the conclusion that I had found a compiler bug. Code along these lines

was compiled to this:

Why the heck did the compiler insert an offset of 4 instead of 1?

The answer to this question, which is also the answer to our bug hunting adventure, can be found here.

Bug Hunting Adventures #14: Bitmap [BM]adness

“What’s the meaning of goodness if there isn’t a little badness to overcome?”
― Anne Revere

The code below is part of a C graphics processing library, which parses data in the venerable bitmap (BMP) file format. A bitmap file consists of a two parts: a header and the pixel data block. More specifically, a bitmap file is laid-out like this:

Offset Size Content
0 1 Character ‘B’
1 1 Character ‘M’
2 4 Size of the bitmap file
6 4 Reserved
10 4 Offset to the first byte of the pixel data (ofs)
14 n Info block
ofs m Pixel data

All multi-byte integer values (like the bitmap file size and the offset to the pixel data) are stored in little-endian format.

The function ‘bmp_pixel_data’ takes a pointer to a bitmap file data and returns a pointer to the bitmap’s pixel data area within the bitmap. The size of the pixel data area is returned via the ‘size’ out parameter. In case the provided bitmap file data is malformed, a NULL pointer is returned and the ‘size’ out parameter is set to zero.

As always, the code compiles cleanly without warnings (at ‘-W -Wall’), but when the function ‘bmp_pixel_data’ was put to use, it failed miserably. Where did the programmer goof?

Solution

Simplified: The Weasel That Teaches Evolution

“Weaseling out of things is important to learn. It’s what separates us from the animals… except the weasel.”
— Matt Groening

It’s Towel Day again! What a great opportunity to reflect upon Life, the Universe, and Everything. In this installment of the “Simplified” series, I want to tackle nothing less than Darwin’s theory of evolution. Why? Because I seriously believe that the world would be a better place if people finally understood it.

One common misconception, for example, is that some falsely believe that evolution occurs in a linear fashion, from single-celled organisms to homo sapiens. Such individuals, like the taxi driver that I mentioned in a previous post, think that they can smash the theory of evolution by posing this cunning question:

“If we humans are really decedents of animals like apes and dogs, why are there still apes and dogs around?”

Just in case this reasoning sounds conclusive to you as well: In reality, there is no single line of evolution, it’s rather a tree with many, many branches. On these branches, the universe performs experiments and decides which species survive, through natural selection. Consequently, humans did not evolve from apes, they are rather cousins whose common ancestor was neither ape nor human.

Another fallacy is Fred Hoyle’s famous Boeing 747 analogy. Hoyle was a brilliant scientist, no doubt, yet he refused to accept that complex life-forms can emerge by chance:

“A junkyard contains all the bits and pieces of a Boeing-747, dismembered and in disarray. A whirlwind happens to blow through the yard. What is the chance that after its passage a fully assembled 747, ready to fly, will be found standing there?”

A variant of this argument is the infinite monkey theorem: a monkey typing random letters at a typewriter will never be able to produce Shakespeare’s works.

In 1986, evolutionary biologist Richard Dawkins set out to dispel such misunderstandings by implementing a computer program that works—in Dawkins’ own words—like this:

“It […] begins by choosing a random sequence of 28 letters, […] it duplicates it repeatedly, but with a certain chance of random error – ‘mutation’ – in the copying. The computer examines the mutant nonsense phrases, the ‘progeny’ of the original phrase, and chooses the one which, however slightly, most resembles the target phrase, METHINKS IT IS LIKE A WEASEL.”

Here’s a more detailed explanation of his weasel program: it starts with a string of 28 random letters. Next, it creates N offspring strings by copying the original 28 random letters N times. When being copied, the chance of a letter being changed into another (random) letter is P. Now that we have a set of N new strings, we compare each string letter by letter against “METHINKS IT IS LIKE A WEASEL” (a quote from Shakespeare’s Hamlet, by the way) and pick the one with the most character matches (the highest match score). This one is deemed the “fittest” and kept as the survivor of the first generation; the other N-1 strings are discarded. We repeat the whole process by creating again N offspring from the survivor, then pick a new survivor and so on until the survivor finally matches “METHINKS IT IS LIKE A WEASEL”.

Let’s choose N to be 100 (one hundred offspring per generation) and P to be 5%, as in Dawkins original experiment. How many generations would it take until we finally reach “METHINKS IT IS LIKE A WEASEL”? Just guess a number, I’ll wait here…

Answer: about 50 generations! To me, this number was so ridiculously small that I decided to implement the weasel program myself. Please check it out here. Through command-line parameters, you can change the values of N and P, as well as the target phrase. Toying around with this program is so eye-opening, you suddenly realize that evolution is possible, that complexity can emerge by mutation and survival of the fittest.

Interestingly, if you set P to a high value (say 20%), you are likely to never arrive at the target phrase. A high value of P simulates an early universe, or any environment with a high natural radioactivity level. In one experiment where I set P to 20%, it took 800 generations, while a P value of 1% reached the target phrase after only 90 generations.

I also played with much longer target phrases with a length of more than 150 letters. What I’ve found is that the longer the phrase gets, the more detrimental a high mutation probability becomes. If you think about it, that’s not surprising, as the likelihood that already matching letters are replaced again with non-matching letters increases. Conclusion: the evolution of complex life-forms requires an environment that provides the right mutation probability—neither too low, nor too high. But while a low mutation probability can always be compensated by time (evolution will just progress slower), a high mutation probability stifles evolution entirely.

Anyway, while the weasel program is a blatant simplification of real life, I think it’s a great tool for demonstrating that random variation combined with non-random cumulative selection can produce complex structures. This is what evolution is all about; not monkeys hacking away at keyboards and winds blowing through airplane junkyards.

Share and Enjoy!

Coding Horrors: Sometimes They Come Back

“Within its gates I heard the sound
Of winds in cypress caverns caught
Of huddling trees that moaned, and sought
To whisper what their roots had found.”
― George Sterling, A Dream of Fear

This month’s post is a guest post by Giles Payne, a teammate from a project at Giesecke & Devrient Mobile Security almost two decades ago. At the time, we were trying to teach good coding practices by distributing so-called “Coding Horrors”, bad code examples, that showed what not to do. The other day, Giles sent me such a “Coding Horror” that he came across recently. I couldn’t help but talk him into blogging about it. Giles, take it away!

Thank you, Ralf! Let’s dive right into the code — for fun please take a minute to try and work out what this little piece of Java is doing.

(5 minute pause for head scratching.)

Were you able to work it out? No, I didn’t think so. So what is going on?

This code comes from a kind of simple browser-like application that needs to handle a stack of displayed pages. In some cases it will manage a history of the pages, in other cases it doesn’t care about the history — hence the “USE_PAGE_HISTORY” descriptor.

But what exactly is it doing? Well, what this particular code does is change the meaning of the equals comparator for PageIds to always return true when USE_PAGE_HISTORY is false. (Face palm.)

In an attempt to pinpoint exactly what is so horrendous about this I’ve written up a few basic “programming principles” that I feel have been grievously violated here.

Principle #1: Don’t be language snob.

If you’re programming Java, then program Java, if you’re programming Visual Basic, then program Visual Basic. The person that wrote this code was obviously heavily into functional programming languages. So totally into functional programming languages, that the thought of writing code in a non-functional programming language like Java drove him or her to distraction — or more exactly to use the “Functional Java” library. “Functional Java” is basically an attempt to turn Java into a functional programming language, which it doesn’t really manage to do. What it does do is turn your code into cryptic, impenetrable gobbledygook like this.

Principle #2: Don’t bury important logic deep in non-obvious abstractions.

The usePageHistory moniker is pretty easy to understand. If I see code like

it’s pretty easy to follow. If I had lots of places with the same if-else logic, I might separate the logic into two classes like PageStackBasic and PageStackWithHistory. Hiding the logic like whether you’re using history or not inside the equals comparison operator is a recipe for disaster because nobody is ever going to think of looking there (see next principle).

Principle #3: Don’t override standard operators in non-standard ways.

Some languages like C++ let you override arithmetic/boolean operators — others don’t. There is a lot of debate among language designers about whether allowing operator overriding is a good thing or a bad thing. On the plus side there are some very cool things you can do, for example if you have a class representing matrices then you can override the * operator to do matrix multiplication. On the minus side, they let you write stuff like this. While this code probably works now — it’s the kind of code that probably will stop working one day and when it does, it will take out an entire country’s air-traffic control system*. While it probably seemed like a clever or elegant piece of code to the person that wrote it — it’s a maintenance programmer’s nightmare.

Principle #3 is actually picked up on in the highly amusing “How to write unmaintainable code“. It’s basically a back-to-front version of “Coding Horrors” where bad coding practices are encouraged as means to achieving a job for live. (Oh my gosh, this is a Fake Surgeon’s gold mine! — Ralf’s note.)

________________________________

*) Relax — this code is not part of an air-traffic control system. My point was simply that when bad code like this blows up, the impact usually tends to be massive.

PPSD: The Fake Surgeon

“I just trust my intuition taking into account the psychology of things. Therefore, I am not persuaded by facts, but by behaviors.”
― Maria Karvouni

In his classic book on software engineering, The Mythical Man-Month, Fred Brooks presents a radically different approach to organizing development teams, which he coined the “Surgical Team.” His idea is based upon three key insights:

1. Great software developers are easily 10 times more productive than average developers.
2. Conceptual integrity is the key to the success of every software product, especially large software products.
3. The bigger the team, the more communication paths are required, which is detrimental to conceptual integrity.

Consequently, according to Brooks, a software team should consist of a single super-programmer (the “surgeon”) accompanied by nine “assistants,” give or take. The surgeon is not only the boss of the team, but he also does the main work; that is, thinking and programming, all for the sake of minimal communication and conceptual integrity. The assistants have clearly defined roles, all aimed at making the surgeon’s life easier and boosting his productivity: for example, they support him as backup programmer, editor, tester, and secretary.

To grade A developers, this approach looks quite appealing, as they finally get the resources, recognition, and status that they crave so much, while at the same time still being allowed to do what they love to do most: programming.

Alas, for various reasons the whole idea was stillborn (boo-hoo!), but that’s not what I want to focus on today. For the rest of our discussion, you just have to bear in mind these facts about Brooks’ analogy:

1. The surgeon is a talented and experienced developer.
2. The surgeon is the boss and has the company’s official mandate.
3. The assistants respect the surgeon and fully accept their roles as subordinates.

A Fake Surgeon, by contrast, is a wannabe surgeon—a surgeon that has no official mandate and his imagined assistants neither accept him nor their ascribed roles.

Early in my career, I was part of a fake surgical team. The Fake Surgeon was very prolific, worked ridiculous hours, and never went for lunch. Why? Because he wanted to make himself indispensable by keeping a massive head start on the rest of the team.

Sometimes, he would leave the office late at night for only a couple of hours, to take a quick nap and a shower. When the other team members arrived in the morning, they couldn’t believe their eyes: so much had changed overnight! They really didn’t know how or where to contribute. Instead of filling them in, the Fake Surgeon’s solution was to assign his “assistants” only non-critical, menial work because, according to him, he was way too busy to explain what he did during last night’s blaze of glory. This tactic ensured that he remained the keeper of the crown jewels. Managers, sales, customers—they only talked to him, because he was the authoritative source of everything, thus isolating the rest of the team even more.

He put up such an act, always looking so busy and stressed, that you didn’t dare disturb him. If you finally got up the courage, he’d only give you short, superficial, sometimes even misleading replies.

Reading his code was far from easy—it left you puzzled and gave rise to many questions; questions, that you were too afraid to ask. Obviously, mysterious, unreadable code was yet another element of his cunning plan: job security by code obscurity.

APPEARANCE

Regarding the choice of clothing, a Fake Surgeon is pretty average, at least clothes do not define him. By and large, he doesn’t spend much time on looks; instead, he invests his time speeding ahead of everyone else. It’s highly likely that he appears rushed, always hustling about the office, never socializing over coffee, just swiftly refilling his mug before running back to his keyboard.

PERSONALITY TRAITS

Most likely, the reasons for a Fake Surgeon’s behavior are deeply rooted in his low self-esteem and fragile ego. Deep down inside, he’s aware of his limited skills and is in constant fear of losing his job. To compensate for his insecurity, he tries to establish a regime of control, keep as much of his knowledge to himself, and make sure that his enigmatic code can only be understood by him and nobody else. A self-confident programmer, by contrast, knows that he’s indispensable because he shares information and writes clean code that can be extended by others.

Of course, such a pathological type of developer can’t thrive all by himself. Every Fake Surgeon forms an unhealthy symbiosis with an extremely weak manager who condones such egoistic behavior. In most instances, the Fake Surgeon’s boss doesn’t see through the Fake Surgeon’s ulterior motives and might even praise the Fake Surgeon for his unusual commitment. After all, the Fake Surgeon takes care of everything and thus gives the boss plenty of time to work on his own career.

What ends up happening, in almost all cases, is that the Fake Surgeon sooner or later takes advantage of his distinguished position and extorts the company by asking for a sizable pay raise. Usually, the company gives in, must give in, because they can’t afford to lose such an important person. What these companies fail to recognize is that they already lost him, anyway. Fake Surgeons are sly and anticipate that once they’ve made such a move, their days might be numbered. Hence, they use this promotion just for negotiation purposes and try to land an even better paid job with a competing company. A company, where the Fake Surgeon can rightly claim, “In my previous job, I did everything by myself” and where nobody knows his selfish, dirty little secret. Depending on whether his need for security and power are ultimately met at the new job, he might settle down and mellow out; if not, he might establish yet another cycle of abuse.

RATING

According to the Q²S² framework, a Fake Surgeon’s rating is 5/2/1/1.

TOOLING

A Fake Surgeon likes tools, especially tools that give him a strategic advantage over others. Most likely, some of the tools that he (or the team) employs were created by himself, such that he’s the only person who can operate and extend them. He also fancies rarely used libraries and programming languages and exerts quite a significant amount of effort in mastering them. To a Fake Surgeon, that’s time wisely invested, as it allows him to race ahead on steroids while the rest of the team is limping along.

CONCLUSION

Information hiding is a sound and proven software design principle but it doesn’t carry over to developers working on a team. A Fake Surgeon is the opposite of a team player—he’s a ball hog. He’s a charlatan that dupes everyone into believing that he’s a real surgeon. All he has in mind, however, is advancing his career at the expense of others, often putting the fate of whole companies at risk. He’s such a nuisance to work with that often valuable team members (who never got a chance to prove themselves) leave the toxic environment he created.

A strong manager is required to prevent a company from being taken hostage by such an individual. A strong manager wouldn’t fall prey to a Fake Surgeon’s ensnarement. Instead, he will make knowledge sharing a priority and ensure that everyone leaves the office on time. Managers, beware red flags: If somebody works excessive overtime, tell this person immediately that they should rather use their time to enable others to contribute. This way, the person’s time will yield a high return on investment and reduce the overall risk for the project. If he doesn’t understand, ask him to seek his luck elsewhere. Don’t shed any tears over this person: the only thing that you lost is a problem.