« Posts under PC-Lint related

Introduction to PC-Lint

I’ve uploaded my presentation “An Introduction to PC-Lint” to Slideshare. Share and Enjoy!

The True Value of PC-Lint

nobugs

“An ounce of prevention is worth a pound of cure.”

— Benjamin Franklin

When you ask folks who sell static analysis tools why you should buy their expensive products, they all will tell you “to find bugs, of course!”. Very likely, they will show you a diagram that displays the exponential cost growth of fixing a bug, depending on the stage where it is detected. Also very likely, they will brag that — compared to PC-Lint — they have a very low “false positive” rate.

Fair enough, but the story doesn’t end here. Take this code, for instance:

The code, as it stands, is 100% error-free. Yet, PC-Lint will not like it for several good reasons. For instance:

  1. There are no virtual functions in Base, so what’s the point deriving from it? Was it an oversight?
  2. There is no default constructor, and hence you cannot put objects of type Base in standard library containers
  3. The Base constructor is not explicit, so plain integers might get silently converted into Base objects
  4. The add method could be declared const; adding const-correctness later is usually difficult
  5. The ADD macro is not parenthesized and neither are its parameters; this gives rise to all sorts of operator precedence problems
  6. Base’s destructor is not virtual, which means that the behavior is undefined when somebody deletes Derived objects through a pointer to Base
  7. The iostream header file is not used and hence not needed; removing the #include statement improves clarity and compilation times

So there are no bugs. But are these issues flagged by PC-Lint really false positives?

Too me, the reported warnings are a sign of the poor quality of this code; this code is full of latent bugs, just waiting to become alive in the future or in somebody else’s hands. Shady code like this increases technical debt, makes maintenance harder, riskier, and more expensive.

I want such issues reported and resolved before I check in my code, actually, before I even execute it. Right after a successful compile I pay attention to PC-Lint’s feedback and resolve real bugs, latent bugs and any bad coding practices. I don’t want to get a ticket three weeks later from a software QA guy, after I’m done with debugging and when the mental distance has become large. So large ¬†that it would take a lot of effort to recall what I was thinking at the time of writing. I want quick and easy desktop checking such that my bad code is never seen by anyone but me.

Finding bugs and code smells at the earliest possible time and thus keeping maintenance cost lost; not just focusing on finding bugs, but preventing bugs — today and tomorrow — in the first place. That’s the true value of PC-Lint.

 

Bitten by the Python

pythonIt’s amazing how one can get used to something nice, up to a point where you don’t recognize it anymore. Sometimes, you don’t recognize it even when its not there anymore — at least initially.

After a night of Python coding, I was back at my embedded C++ project, when I wrote code similar to this:

The expression that I used to check whether the item index is in range is valid in both, Python and C++. Alas, only in Python it does what it should; in C++ it’s a clear bug. Obviously, the if statement should have looked like this:

Duh! Of course the comparison operator doesn’t work like this in Java/C#/C/C++, but while Java and C# compilers reject this code, WindRiver’s Diab C/C++ compiler happily accepts it.

I wasn’t so much annoyed about the fact that the Diab compiler accepted it, since it is syntactically correct according to the rules of C and C++; what upset me was the “happily” — it didn’t produce any warning about my nonsense!

There is, as I found out later, a warning level that would have reported my mistake, but this warning level was not enabled in our makefile, probably because it would have generated many false positives.

Even though I always pay attention to compiler warnings before flashing my code to the target, I had to find this bug the hard way: by using a debugger. I guess that I wasted more than one hour in total to track it down.

What I expect from compiler vendors is that (a) by default, warnings are on and that (b) this default warning level includes all warnings that are easy to detect and are almost always bugs or latent bugs, including (but not limited to):

– Use of unitinitalized variables
– Test of assignment, eg. if (a = b)
– Expressions with no side-effects, eg. a == b;
– Returning addresses of automatic variables
– Forgetting to return a value from a non-void function
– My stupid comparison mistake

I don’t expect all C/C++ compilers to be as thorough as PC-Lint. Ironically, most of today’s compilers are already able to report the aforementioned issues (and many more), but they do this only at higher warning levels that nobody usually enables.

Dangerously Confusing Interfaces II

confused.jpgLast week was a sad week for me. A bug in my code made it into the final version that was shipped to an important customer.

When something like this happens, it is almost always due to fact that there is a higher-level “bug” in the software development process, but I don’t want to go there. Instead, I want to focus on the technicalities. Once more, I got bitten by another instance of a dangerously confusing interface. Let me explain.

I’ve always had a liking for interfaces that are self-evident; that is, one knows immediately what’s going on by just looking at how the interface is used — without having to consult the documentation. Let me give you a counter example, an interface that is far from what I desire:

The interface to ‘sort_temperatures’ is not at all self-explanatory. What is clear is that it sorts ‘values_count’ values, which are obviously temperature values, but what the heck does the ‘true’ argument stand for? In order to find out, you have to look at the declaration of ‘sort_temperatures’ and/or its API documentation:

Now it is clear what the boolean parameter is for, but at the cost of having to take a detour. Maybe you got so distracted by this detour that you forgot what you originally were about to do.

Programmers often make this mistake when designing interfaces. They use boolean parameters when they have two mutual exclusive modes of operation. This is easy for the implementer of the routine, but confusing to the ones who have to use it. Contrast this with this alternative:

By replacing the boolean parameter with symbolic constants you not only make the code more readable, you also open it up for future extension: adding more modes becomes straightforward.

Now have a look at this code and try to guess what it does:

That’s fairly simple: ‘calc_hash’ calculates a SHA-256 checksum over ‘mydata’ (which is ‘mydata_len’ bytes in size) and stores it into the provided buffer ‘hash’. The length of the hash is stored to ‘hash_len’ via call-by-reference.

But is this code correct? The answer is — you guessed it — no. If you can’t spot the bug, you are in good company. You can’t see it with just the information given. ‘calc_hash’ interface is not self-evident.

I wrote this code more or less one year ago. It contains a bug that remained dormant until the product was in the hands of our customer. And it is there because of a silly interface.

The last (pointer) parameter ‘hash_len’ actually serves as both, an input AND output parameter. When you call ‘calc_hash’ it is expected that ‘*hash_len’ contains the size of the provided ‘hash’ buffer; on return ‘*hash_len’ will contain the actual number of bytes used by the hash algorithm; that is,the size (or length) of the SHA-256 checksum stored in ‘hash’. The whole idea behind this is that ‘calc_hash’ (or rather its author) wants to offer protection against buffer overruns — for cases where the provided ‘hash’buffer is not large enough to accommodate the checksum.

So the problem here is that ‘hash_len’ (being a stack variable) is not properly initialized to ‘HASH_SHA256_LEN’; it’s value is more or less arbitrary. If it is by chance greater or equal to 32 (the value of the ‘HASH_SHA256_LEN’ symbolic constant) everything is fine and the checksum is correctly calculated. If it is not, ‘calc_hash’ returns ‘false’ and an error is reported.

For as long as a year — by sheer coincidence — ‘*hashLen’ was never below 32 (which is not that unlikely, given that ‘size_t’ can accommodate values ranging from 0 to 4,294,967,296); but in the hands of the customer — and very much in line with Murphy’s Law — it happened.

OK, accuse me of not having initialized ‘*hashLen’ properly, accuse me of not having read the API documentation carefully. Maybe I did read the API documentation and then I was interrupted. I don’t know. But what I know for sure is that this bug would have never happened if the interface had been clearer.

The first problem with ‘calc_hash’ is that ‘hash_len’ is an IN/OUT parameter, which is unusual. I’m not aware of any function in the C/C++ standard library (or the POSIX libraries) which makes use of IN/OUT parameters. Since the input value (not just the output value) is passed by reference, neither the compiler nor static analysis tools like PC-Lint are able to detect its uninitialized state. One obvious improvement is to pass the length of the buffer by value:

Granted, there is now one more argument to pass (‘hash_buf_len’), but if the unlucky programmer ever forgets to initialize it, the compiler will issue a warning.

But let’s not stop here. I’d like to pose the following question: what good is the hash buffer length check, anyway?

In my view, it is not at all necessary. The length of a particular checksum is constant and known a priori — that’s why the hash buffer is allocated statically like this:

What additional benefit does a developer get by providing the same information again to the ‘calc_hash’? Isn’t this redundancy that just begs for consistency errors?

And what use is the output value that tells how long the checksum is? Again, this should, no it MUST be known a priori, there should never be a mismatch between what the caller of ‘calc_hash’ expects and what is returned. Of course, there can be error conditions but if ‘calc_hash’ fails, it should return ‘false’ and not a length different to what the caller expects.

Note that ‘calc_hash’ is not at all comparable to functions in the C API that add an additional output buffer length parameter like ‘strncpy’ or ‘snprintf’. These functions carry the length parameter for completely legitimate safety reasons because the total length of the output is usually not known a priori (for instance, as some input may stem from a human user and one has little control over how many characters (s)he will enter).

Based on these arguments, I dismiss the ‘hash_len’ parameter altogether and propose the following simplified interface:

and would use it like this:

Easy to read, hard to get wrong. It is impossible to forget to initialize a variable that just isn’t there.

Using PC-Lint in a Linux Environment

PC-Lint is my favorite non-FLOSS tool. Not only does it find bugs and portability issues early in the development cycle: by using it regularly and listening to its words developers can significantly improve their C/C++ programming skills.

This post details how to run PC-Lint (which is normally intended for DOS/Windows environments) in Linux, saving developers from having to buy FlexeLint, the much more expensive Unix/Linux version.
»Read More

Habits for Safer Coding

bear.jpgIf you have been coding in C and C++ for some time, you know that it is easy to introduce bugs: type in a single wrong character and you wreak havoc. It is therefore a good idea to avoid using certain (dangerous) language features or at least using them in a way that is safer.

My favorite example is always putting braces around blocks, even if the block only consists of a single statement:

I’ve witnessed cases where really experienced programmers omitted these two “redundant” characters for brevity only to be later faced with this classic bug:

It is so easy to make this mistake — especially during merging — that MISRA (Guidelines for the Use of the C++ Language in Critical Systems) made curly braces around if/else/do/while blocks a “required rule”. Note that by following this rule, you are also protected from “dangling else” mistakes:

In general, I follow such “best practices” if they meet these criteria:

– They must solve a real problem
– They must not require any thinking

The “always-put-braces-around-blocks” rule does solve a real problem: I’m not aware of any compiler (not even Sun’s Java compiler) that warns you about the mistake shown above. (Of course, static code analysis tools (like PC-Lint) do check for these errors, and many, many more, but unfortunately only few developers use them.)

Does it require any thinking? I don’t think so. The only thing I have to remember is this: whenever I use ‘if’, ‘else’, ‘for’, ‘while’, ‘do’ I put curly braces around my code. That’s all!

Maybe you have come across this convention:

Advocates of this style reason as follows: in C/C++ you can easily — accidentally — use an assignment operator when you actually wanted to use an equality operator. Since you cannot assign to a constant, the compiler will refuse to compile your buggy code:

I must admit that I’ve never liked this convention. One reason is that it looks unnatural. Another is that it lacks symmetry; that is, it is not consistent with other comparison operators. What if you need to change the comparison some day to also include values greater than MAX_RETRIES?

Or what if MAX_RETRIES is changed from a compile-time constant to a “normal” variable whose value is obtained from a config file? Do you swap then? In fact, this convention doesn’t provide any help for mistakes in variable/variable comparisons, which occur also very frequently in day-to-day programming:

For my taste, the utility of this convention is rather limited and it requires too much thinking: “If I compare a compile-time constant (or a literal) against a variable using the equality operator, I put the constant first”. Isn’t this a clear hindsight rule? It’s a bit like attaching a tag saying “always lock the door” to your key.

Why not, instead, turn this into an habit: “If I do an equality comparison, I double check that there are two equal signs”. That’s not any more difficult to adopt, doesn’t impair readability and has the additional benefit of working in all cases — not just when comparing against constants.

I’m prepared to follow even conventions that look funny if they solve a real problem but I don’t think that this convention does: today, most compilers produce warnings when you make this kind of mistake (for GCC it’s -Wparenthesis, which is part of -Wall).

In C, there are numerous ways to make silly mistakes (in C++ there are many, many more). Instead of using questionable practices that address one tiny area, it is much better to enable all compiler warnings or — even better — use static code checkers like PC-Lint. In my opinion, that’s the most worthwhile habit for safer coding of all.