Introduction to PC-Lint

I’ve uploaded my presentation “An Introduction to PC-Lint” to Slideshare. Share and Enjoy!

Free-Riding the Team Train

I usually start my day with a thirty-minute run in the morning. Since I prefer running through the countryside, it is not unlikely that I meet people walking their dogs. Sometimes, for reasons I cannot fathom, these folks walk their dogs on the wrong side of the road (or rather country lane).

But let’s be fair. Even in Germany there is no law that requires pedestrians to walk on a certain side of a country lane; but what drives me nuts is that when they see me coming toward them, most of them don’t dodge, let alone change to the other side of the lane; they just stay where they are. Almost all of them expect me to move, to go around the wrong-way dog walkers.

Sometimes, just for the fun of it, I refuse to obey and stop right in front of them, which completely baffles them. Then, we stand there, deadlocked for a couple of seconds until I finally decide to give in.

I believe that what the dog walker subconsciously assumes is this: “the other one is faster, so he should move; it’s less effort for him than it is for me”. Even if this might be true in some cases, it is by no means generally true. Just because one is going faster doesn’t imply that doing something extra is easier for this person than for anyone else.

Probably, this is just all too human behavior: to get by with spending as little effort as possible. Especially members of sufficiently advanced societies expect that the strong protect and support (or at least are considerate of) the weak: “Those who can should support those who can’t”.

While I fully subscribe to this principle, at least in general, I have observed that this argument is often misinterpreted (misused) by slackers to mean “those who can should support those who don’t want to”, an attitude that is totally unacceptable to me, whether in a society, in a company, or, least of all, in a team.

As another case in point, in 1913, French agricultural engineer Maximilien Ringelmann discovered that people working collaboratively on a given task (like pulling a rope) exert a lot less effort than when acting alone. The bigger the group, the bigger the tendency to hide behind others, to free-ride, to prefer taking over giving. Today, social psychologists call this phenomenon “social loafing” and it not only takes place in societies, but also within companies and teams.

In a company, the “Ringelmann effect” does not only impact day-to-day project work. For instance, if you call for a meeting, the more people you invite, the fewer people will come prepared and even fewer will make substantial contributions. Some “strategists” even exploit this fact by inviting a large party to a meeting where only a few protagonists actually make decisions. The large size of the group then gives an illusion of quality and broad acceptance of the decision, when in fact most people were just daydreaming. If there is one lesson to be (re-)learned from the Ringelmann effect it is this: “less is more”.

Here is my definition of teamwork: a group of self-motivated, self-directed individuals who share a common vision work together to achieve a common goal. If somebody needs support from a team member, then it is never because they don’t want to do a task, but only because they are temporarily unable to do the task themselves. Further, it is the responsibility of the supported person to keep the support to an absolute minimum and to learn and grow from it, not only to become independent of others but to be able to help others who are in need some day. The ultimate state of any team member is neither dependence, nor independence, but rather interdependence.

Making up for slackers, on the other hand, is not only a waste of time: sooner or later it drains the motivation and morale of even the most self-motivated people and encourages them to loaf as well. While I see a lot of value in so-called “B players”, I have zero tolerance for free-riders.

Surprised Again, In and Out!

“No tears in the writer, no tears in the reader.
No surprise in the writer, no surprise in the reader.”
– Robert Frost

My first real programming language was Pascal, actually Turbo Pascal.

Turbo Pascal was not just Pascal — it was Pascal on steroids: small, fast, and yet affordable to everyone, including students and hobbyists. While the version numbers increased from 1.0 to 6.0 I wrote dozens of little tools, games and applications. For a long time, I had seen no need to switch to a crude, low-level language like C; Pascal had so many advanced features like a strong type system and sets.

One of the features I really loved was the ‘in’ operator, which allowed testing for set membership:

type
    days = (Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday);
var
    allday : set of days;
    workday : set of Monday .. Friday;
    today : days;
begin
    workday := [Monday .. Friday]; { fill the set before testing membership }
    ...
    if today in workday then
        writeln('I have to go to work today...')
end
...

Having an ‘in’ operator at your disposal means that there is no need to write lengthy if-then-else or switch-case abominations — you can express your intentions with ease and elegance.

I got so accustomed to Pascal’s sets and especially the ‘in’ operator that I missed it terribly in almost every programming language that I learned in the years to follow, including C, C++, Java, even Perl. I was really pleased to see that my next programming language — Python — had it.

But what was my point, again? Ah, yes…

The folks at E.S.R.Labs do an internal coding competition every once in a while. The current assignment has to do with finding the best strategies for knocking out your opponent in a Tron-like game. The game framework itself is based on Node.js/JavaScript but since I’m much more familiar with Python, I decided to do my algorithm prototyping in Python and later port everything back to JavaScript.

Actually, the transformation from Python to JavaScript was a lot easier than I thought; still, my JavaScript version didn’t work as expected. After some (read: way too much) debugging, I found the problem; later, I decided that I had to write a blog post about it, as part of my therapy to get over my trauma.

Let’s first have a look at a snippet of the original Python code. I used the ‘in’ operator to check whether the value of ‘field’ at position ‘x’ and ‘y’ was either ‘FIELD_FREE’ or ‘FIELD_OTHER’:

# Go down.
x = me_x + 1
y = me_y
if x < len(field) and field[x][y] in [FIELD_FREE, FIELD_OTHER]:
    to_visit.append(x)
    to_visit.append(y)
    if first_round:
...

Now for the JavaScript version. As you will surely agree, it is almost identical:

// Go down.
x = me_x + 1;
y = me_y;
if (x < field.length && field[x][y] in [Field.FREE, Field.OTHER]) {
    to_visit.push(x);
    to_visit.push(y);
    if (first_round) {
...

But as we all know, looks can be deceiving. With regard to the ‘in’ operator, the designers of the JavaScript language grossly violated the “Principle of Least Surprise”: Everyone (not just people with a Pascal background) would assume that the ‘in’ operator tests whether a particular value is contained in a given set. But not so in JavaScript, not at all! In JavaScript, the ‘in’ operator does what you would expect least.

In JavaScript, the ‘in’ operator tests whether its left-hand operand is a property name of its right-hand operand; for an array, that means a valid index. That is, the expression

field[x][y] in [Field.FREE, Field.OTHER]

would evaluate to true if the value at position (x,y) is an integer value in range [0..1], because only 0 and 1 are valid indices into a set (actually an array) containing just two elements. If the value at position (x,y) equaled the value of ‘Field.FREE’ or ‘Field.OTHER’ the result would be false, unless — by sheer coincidence and hard luck — the symbolic constants ‘Field.FREE’ and ‘Field.OTHER’ happen to have an integral value of either 0 or 1. Which they had in my case, at least in part:

var Field = {
    FREE: 0,
    YOU: 1,
    OTHER: 2,
    WALL: 3
};

Hence, the JavaScript version didn’t fail loudly and obviously, but in very subtle ways.
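
To see exactly which cells go wrong, here is a small Python sketch that mimics what the JavaScript expression evaluates. The helper `js_in_array` is my own stand-in, not a real API, and it only covers integer values (in real JavaScript, property names such as "length" would also match). The constants are the ones from the `Field` object above.

```python
# Field constants copied from the post.
FIELD_FREE, FIELD_YOU, FIELD_OTHER, FIELD_WALL = 0, 1, 2, 3


def js_in_array(value, array):
    """Mimic JavaScript's `value in array` for integer values:
    true if `value` is a valid index, regardless of the array's contents."""
    return 0 <= value < len(array)


def compare(allowed):
    """Return (cell value, Python membership, emulated JavaScript result) rows."""
    return [(cell, cell in allowed, js_in_array(cell, allowed))
            for cell in (FIELD_FREE, FIELD_YOU, FIELD_OTHER, FIELD_WALL)]


rows = compare([FIELD_FREE, FIELD_OTHER])
for cell, py_result, js_result in rows:
    print(cell, py_result, js_result)
# 0 True True
# 1 False True
# 2 True False
# 3 False False
```

So FREE and WALL behave as intended, while YOU is wrongly accepted and OTHER is wrongly rejected. In JavaScript itself, a membership test on an array can be written as `[Field.FREE, Field.OTHER].indexOf(field[x][y]) !== -1`, or with `Array.prototype.includes` in ES2016 and later.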

The main reason why it took me so long to track this bug down was that the last thing I questioned was the behavior of the ‘in’ operator. When I finally proved by debugging that ‘in’ worked differently from my expectations, my initial thought was that there must have been a bug in my version of Node.js. But deep in my heart I knew that this was extremely unlikely. So I did a search on the Internet and found out the shocking truth.

Why on earth did the designers of JavaScript implement the ‘in’ operator the way they did? It works kind of as expected with dictionaries (a.k.a. associative arrays), but using a dictionary would have been an ugly and unnatural choice in my case. For regular arrays, the ‘in’ operator is worse than useless: it lures programmers into using it but then doesn’t live up to their expectations.

Then, I would rather have no ‘in’ operator at all.

Password Sins of Omission — Will They Ever Learn?

“Where have all the passwords gone, long time passing?
Where have all the passwords gone, long time ago?
Where have all the passwords gone?
bad admins stole them everyone.
Oh, when will they ever learn?
Oh, when will they ever learn?”

(With apologies to Peter, Paul, and Mary)

It’s been quite a while but yesterday it happened again: after I had registered with a new electricity provider I was sent an email showing my password, right next to the words “Your Password”. There was also a link that I was supposed to click to verify my personal data, including name, address, date of birth and a lot more things that I certainly don’t want to share with the world.

What amazes me most about this incident is that even today some companies are completely oblivious to how easy it is to automatically scan emails for keywords on their way from sender to receiver.

But even if the probability of someone sending around your credentials in plaintext is a lot lower today than ten years ago, the chances of being faced with poor password systems are still quite high. Here is my personal list of the grossest password-related blunders.

1. Storing user passwords. This is probably the greatest of all password sins, especially since it may spawn many other password-related follies. Don’t store passwords, not even encrypted. Instead, give each password its own random salt and hash it with a deliberately slow password-hashing function (bcrypt, scrypt, Argon2, or PBKDF2, for instance) before you save it. This way, no rogue admin can run away with your users’ passwords and no crazy customer service team is able to send around mails disclosing credentials.

2. Gratuitously limiting the password character set. I’ve come across password systems that only accept numbers (i.e. PINs), others only a mix of characters and numbers, some would accept special characters like %, &, and ? but not the dollar sign. I don’t see any reason why it’s not possible to use the whole Unicode character set; at the very least, all printable ASCII characters should be considered valid. If I wanted to attack a site the very first thing I would do would be to set up an account and find out about their password character limitations in order to prune my search space.

3. Gratuitously limiting the password length. Some folks don’t seem to realize that with passwords, size matters: longer is stronger. An easy-to-type passphrase like “theluckgreensunsdontkiss” is much more secure than the complex, ugly, it’s-so-hard-to-remember-that-I-need-to-write-it-down “B!A7n4$”. While there is good reason to set a lower bound (I personally would require a minimum of eight characters) it is outright dangerous (read: stupid) to place an upper bound on the length. Be careful: If a site demands a password length of 6 to 8 characters, it is a clear sign that they take security lightly. Maybe they even store your passwords and shorter passwords consume less space on server hard disks. Yikes!

4. Not telling users about limitations. Limiting the password length and character set is by itself a horrible practice, only made worse by not telling users what the limits are beforehand. It is quite annoying if you attempt to set a new password only to be rejected with an error message saying that your password is not in line with the password policy. This has happened to me many times and in one case there was a link to the policy that was broken.

5. No “show password” option. I vividly remember the day when I had to choose a password for a smartcard-based access system at a company I worked for. At the company’s registration office was a terminal that asked me to select a new password for my card. Since there were quite a few (gratuitous) limitations (at least one number, a single capital letter, no number at the beginning, no underscore, just to name a few, and they were not displayed beforehand, of course), I had to try several times, making minor variations to my favorite password until it was finally accepted. Later, when I had to use it for the first time, I couldn’t remember it anymore. I’m not sure, but I guess that if I had been able to see my weird password in plaintext after I set it, my chances of remembering it would have greatly increased. Normally, I know if somebody is staring at my screen (and this is rarely the case), so why not add a “show password” check box that I can tick? This would also help me in cases where caps lock was on or when I inadvertently switched to a foreign keyboard layout.

6. Setting constraints on the login name. Why does the login have to be an email address (which an attacker is likely to know or which might change in the future), or a number, or an eight-character word in all uppercase? One of my web hosters forces me to sign in with the 12-digit customer ID that they assigned to me years ago; I always have to look it up and I always curse the web hoster. Users should have absolute freedom regarding their choice of login name, not only when they create the account but also at any point later in time.

7. Not encrypting passwords before sending them to the server. There are still some sites (and email providers) out there that ask for your credentials without establishing a secure channel first. What that means is that your login name and password are transferred in plaintext to the server and every random wiretapper is able to use them to place adverts for all kinds of shady products in your name. I don’t know how many bloggers are out there using WordPress, but it must be hundreds of thousands; and almost all of them log in through a login page that doesn’t use any form of encryption. Even if the NSA is able to read your SSL-protected traffic anyway, it doesn’t mean that you want everybody else to read it, too.

8. Not (or wrongly) using a brute-force attack countermeasure. To prevent brute-force attacks a good password system should track the number of failed login attempts. My bank does this but in a rather stupid way: after three unsuccessful tries, your account will be locked and you have to personally go to your local branch office to request a password reset. My suggestion: from three unsuccessful attempts on, have the user wait 10 minutes for every further attempt; additionally present a Captcha to defend against automated denial-of-service attacks.
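
The waiting-time suggestion from the last item could look roughly like the following Python sketch. The class name `LoginThrottle`, its methods, and the in-memory dictionaries are assumptions made for illustration; a real system would persist this state and add the Captcha, which is left out here.

```python
import time

MAX_FREE_ATTEMPTS = 3       # failures tolerated before throttling starts
LOCKOUT_SECONDS = 10 * 60   # wait imposed after each further failure


class LoginThrottle:
    """Track failed logins per account and impose a delay instead of a permanent lock."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._failures = {}       # account name -> consecutive failed attempts
        self._blocked_until = {}  # account name -> time when the next attempt is allowed

    def seconds_to_wait(self, account):
        """Return how long the account must still wait (0.0 if it may try now)."""
        return max(0.0, self._blocked_until.get(account, 0.0) - self._clock())

    def record_failure(self, account):
        """Count a failed attempt; from the third failure on, start a waiting period."""
        count = self._failures.get(account, 0) + 1
        self._failures[account] = count
        if count >= MAX_FREE_ATTEMPTS:
            self._blocked_until[account] = self._clock() + LOCKOUT_SECONDS

    def record_success(self, account):
        """A successful login clears the failure history."""
        self._failures.pop(account, None)
        self._blocked_until.pop(account, None)
```

The login handler would call `seconds_to_wait` before checking the password and refuse the attempt while the result is greater than zero. Unlike my bank's approach, the account recovers by itself and nobody has to visit a branch office.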

So here you have it and this is just my personal tip of the iceberg. In order to get real security, password systems need to be both secure from a technical point of view and user-friendly; at least, they shouldn’t haphazardly limit freedom. The best password policy is worthless (actually dangerous) if it requires users to invent next-to-impossible-to-remember passwords that will end up written down on sticky notes.

Since there are so many sites out there that carelessly deal with passwords and personal user data, I think it is about time for a law that requires companies to follow certain minimum standards. We just can’t rely on good will and common sense. We have done this for quite some time now and it demonstrably hasn’t worked out.
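
On the technical side, the salt-and-hash advice from the first item in the list might be sketched like this in Python, using only the standard library. PBKDF2 is just one reasonable choice of slow hash, and the iteration count and the function names `hash_password` and `verify_password` are assumptions for illustration, not a recommendation for any particular site.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune it for your own hardware
SALT_BYTES = 16


def hash_password(password, salt=None):
    """Return (salt, digest); only these two values are stored, never the password."""
    if salt is None:
        salt = os.urandom(SALT_BYTES)  # a fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Re-hash the candidate with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)
```

Because only the salt and the digest are kept, there is nothing a customer service team could mail around, and the password itself can be of any length and contain any characters.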

Circular Adventures V: The Christmas Edition

“And the Grinch, with his Grinch-feet ice cold in the snow,
stood puzzling and puzzling, how could it be so?
It came without ribbons. It came without tags.
It came without packages, boxes or bags.
And he puzzled and puzzled ’till his puzzler was sore.
Then the Grinch thought of something he hadn’t before.
What if Christmas, he thought, doesn’t come from a store?
What if Christmas, perhaps, means a little bit more?”

– Dr. Seuss
“How the Grinch Stole Christmas”

Oh well, oh well, it’s Christmas time again. Year after year, in this dark season, people contemplate the circle of life, watch reruns on TV, and rush to shopping malls to buy overpriced items for their loved ones; it is not unlikely that, as soon as the shops are open again, these loved ones rush to return all the stuff that they don’t really need and buy something cool instead. You can also be sure that the level of madness increases every year, due to a well-known effect that economists call “inflation”.

Since so many things recur around Christmas, this is the perfect time for me to share more “circular” thoughts with you.

But first some background on the origins of Christmas. The reason why people celebrate Christmas on December 25 has little to do with the birth of Jesus, but rather with an ancient Roman cult called “Sol Invictus” (which means something like “unconquerable or invincible sun”). When the Julian calendar system was introduced around 45 BC, the winter solstice occurred on December 25 (as opposed to today, where it happens on either December 21 or 22); this day was celebrated as the birth of the sun: on December 25, the sun would come back and rise to its full power over the next months.

It is believed that early Christians also took part in these festivities and celebrated together with the pagans by kindling lights. When the “Sol Invictus” cult was finally replaced by Christianity around AD 300, Christians decided to keep that special day but celebrate the birth of Jesus Christ, instead. (Today, it is assumed that Jesus Christ was born somewhere around March, 4 BC.)

But now back to more technical stuff. You might have observed that the weekday of a particular date within a year gets advanced by one weekday in the following year. That is, if the first of August is a Tuesday this year it will be a Wednesday next year. But how come?

Being a veteran circular adventurer by now, you should be able to come up with an answer yourself; at least you should try. Don’t cheat. Don’t read on. Think about it.

OK, here you go. A regular year (no leap year) has 365 days and a week comprises seven days. 365 mod 7 is 1, which means that after 52 seven-day weeks, there is still this one day left in the old year to be filled with a weekday. You can think of it like this: the old year nibbles away one weekday from the new year and thus weekdays in the new year are “rotated left” by one. If a leap day (February 29) falls between the two dates, the “rotate value” is two, since 366 mod 7 equals 2.
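
If you don’t trust the arithmetic, here is a small Python sketch that checks it against the calendar. The function names and the sample years are arbitrary choices of mine; `date.weekday()` from the standard library counts Monday as 0.

```python
from datetime import date

DAYS_PER_WEEK = 7


def weekday_shift(days_in_year):
    """Number of weekdays a date moves forward after a year of the given length."""
    return days_in_year % DAYS_PER_WEEK


def observed_shift(month, day, year):
    """Weekday shift of a date between `year` and `year + 1`, taken from the calendar."""
    before = date(year, month, day).weekday()
    after = date(year + 1, month, day).weekday()
    return (after - before) % DAYS_PER_WEEK


print(weekday_shift(365), weekday_shift(366))  # 1 2
print(observed_shift(8, 1, 2022))  # 1: no February 29 between the two dates
print(observed_shift(8, 1, 2023))  # 2: February 29, 2024 lies in between
```

The subtraction in `observed_shift` is done modulo 7 as well, because the weekday numbers themselves wrap around from Sunday (6) back to Monday (0).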

Regardless of your convictions, regardless of how strange they might look to adherents of other convictions, regardless of whether or what you celebrate, I wish all of you, dear readers of my blog, a Great Time and the best for the New Year.

More circular adventures…

Code Kata 3: The Perfect Match

It was certainly a mix of emotions that I experienced when my then 13-year-old daughter came by to talk about programming: a weird mix of pride, curiosity and suspicion. Normally, she didn’t care at all about technical stuff and even less about what her father did for a living. So what the heck was she up to?

She told me about a cool “game” that she and her friends had been playing: calculating the match factor for two persons. Obviously, she wanted to try out many different person/person combinations and she had figured that this cried for some kind of automation.

I thought this was a nice opportunity to demonstrate that programming can be both useful and fun, so we sat down and she explained how the algorithm worked.

1. Write down the names of the two persons, in capital letters, e.g.

H A R R Y   S A L L Y

2. Start at the first character, cross out all occurrences of this character and write down how many occurrences there are:

H A R R Y   S A L L Y
1

(since there is only one ‘H’)

3. Repeat this with the next character

H A R R Y   S A L L Y
1 2

(since there are two ‘A’s)

4. Repeat with the next (not yet crossed-out) character until all characters are crossed-out

1 2 2 2 1 2

(H:1, A:2, R:2, Y:2, S:1, L:2)

5. Now add up the number of occurrences. Add the first digit and the last digit and write down the sum; next the second and the last-but-one. Repeat until finished:

3 3 4

(1 + 2 = 3, 2 + 1 = 3, 2 + 2 = 4)

6. If the result (interpreted as a single value) is greater than 100, repeat step 5 until it is less than or equal to 100.

7 3

(3 + 4 = 7, 3 + 0 = 3)

7. The result is the match value (73%) for the given two persons.

I implemented the first version in Perl but since I had so much fun, I decided to reimplement it in C++. Being the shell freak that I am, I chose this user interface:

$ perfectmatch harry sally
73

OK, so now it is your turn. Implement this algorithm in a programming language of your liking.

You’ll get the same advice that I gave to my daughter then: Don’t be sad if you don’t get the results that you wish for. Even Romeo and Juliet have a match factor of only 56% and yet they are together in eternity.
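
If you want to compare notes after you have tried it yourself, here is one possible solution sketched in Python. It is not the Perl or C++ version mentioned above. Two details are my own assumptions because the rules don’t cover them: a sum or count of 10 or more is split into its digits, and names without any letters yield 0.

```python
def fold(digits):
    """Step 5: add first and last, second and last-but-one, and so on.
    A leftover middle digit is carried over unchanged."""
    sums = []
    left, right = 0, len(digits) - 1
    while left < right:
        sums.append(digits[left] + digits[right])
        left += 1
        right -= 1
    if left == right:
        sums.append(digits[left])
    return sums


def to_digits(numbers):
    """Split every number into its decimal digits (assumption: 12 becomes 1, 2)."""
    return [int(ch) for number in numbers for ch in str(number)]


def to_value(digits):
    """Interpret a list of digits as a single number."""
    return int("".join(str(d) for d in digits))


def match_factor(first, second):
    """Return the match value (in percent) for two names."""
    letters = [ch for ch in (first + second).upper() if ch.isalpha()]
    if not letters:
        return 0  # assumption: nothing to count, no match
    counts = {}  # dicts keep insertion order, i.e. order of first occurrence
    for ch in letters:
        counts[ch] = counts.get(ch, 0) + 1
    digits = to_digits(fold(to_digits(counts.values())))
    while to_value(digits) > 100:
        digits = to_digits(fold(digits))
    return to_value(digits)


print(match_factor("harry", "sally"))   # 73
print(match_factor("romeo", "juliet"))  # 56
```

The dictionary does the “crossing out”: each letter is counted once, in the order in which it first appears.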

Epilogue

Almost three years have passed since that day and this was the only case where my daughter showed some interest in programming. This leads me to think that she found out, already at such an early age, that no computer will ever solve the really important problems of human beings.

Being Tolerant Towards NULLs

KING LEAR: “What can you say to draw a third [of the kingdom] more opulent than your sisters? Speak.”
CORDELIA: “Nothing, my lord.”
KING LEAR: “Nothing!”
CORDELIA: “Nothing.”
KING LEAR: “Nothing will come of nothing; speak again.”

Around the year 2000, when the mindless “outsourcing-to-India-will-solve-all-our-problems” hype was near its peak, I saw an interview with an Indian minister on TV. When the reporter asked why there were so many talented software engineers in India, the minister replied: “Well, the number zero was invented in India, and programming is all about ones and zeros…”.

Now, this could have been a good joke, but trust me: that man was dead serious about his statement. My first reaction was anger, my second compassion: this poor guy clearly didn’t know what he was talking about. At least on the matter of programming, he was the human equivalent of a null device.

Some months ago, I worked on a Java program. In one of the dialogs the user was asked to specify a path to a file in an edit field; the default value of this edit field came from a property file which I used to store previous path entries made by the user:

// Get last entered filename from property file.
String propFileName = myProperties.getProperty(PROP_KEY_FILENAME);
String fieldFileName;

// If property exists, use it.
if (propFileName != null) {
    fieldFileName = propFileName;
// Otherwise, default filename is empty string.
} else {
    fieldFileName = "";
}

// Put filename into text field.
myTextField.setText(fieldFileName);

Even though the code above can be shortened by using the ternary operator, I thought that this ‘null’ handling business unnecessarily cluttered up my code. All I wanted was just this: “if there is a value in the property file, use it; otherwise leave the edit field as it is (i.e. empty)”. I wanted to write something that was easy on the eye, like this:

myTextField.setText(myProperties.getProperty(PROP_KEY_FILENAME));

But had I done so, I would have gotten a dreaded NullPointerException in cases where there was no filename property in the property file. Darn! I really wish that sometimes APIs could cope with NULL values; that is, silently ignore them.

Since the early days of databases we have used NULL values to signal the absence of a real value. In SQL, it is well defined how NULL values are interpreted in the context of SQL operators (for instance, if you perform a logical OR operation like TRUE OR NULL, you will get TRUE as a result). In contemporary programming languages, NULL values are often treated like orphans.

In C, for instance, using NULL values usually puts you in the realm of undefined behavior:

int* dest = NULL;
memcpy(dest, src, 42);

Depending on the platform you are on, this might either do nothing or lead to a core dump; you cannot tell for sure. Java is stricter in this respect: if you pass a NULL reference where an object is required, the VM will complain by throwing a NullPointerException:

byte[] dest = null;
System.arraycopy(src, srcOfs, dest, destOfs, 42); // throws NullPointerException.

But wouldn’t it be equally valid to expect that nothing happens in these cases? If you copy 42 to nowhere (or from nowhere), why shouldn’t this be valid? In this case, NULL would behave like a black hole: it sucks up everything and there is no way to get anything out of it.

In OOP, people often apply the Null Object Pattern to get no-op behavior. But wouldn’t it be much nicer if methods automatically did nothing when invoked on NULL references?

FileLogger* logger = NULL; // Do not log by default.
if (LoggingGloballyEnabled()) {
    logger = new FileLogger("logs/mylog.log");
}
...
logger->log("Hello there!"); // If logger is NULL, ignore.
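
For comparison, this is roughly what the Null Object Pattern looks like for the logger case, sketched in Python. The class and function names are made up for this illustration; the point is that callers never have to check for “no logger” because the stand-in object silently swallows every call.

```python
class FileLogger:
    """Appends each message as a line to a log file."""

    def __init__(self, path):
        self._path = path

    def log(self, message):
        with open(self._path, "a", encoding="utf-8") as log_file:
            log_file.write(message + "\n")


class NullLogger:
    """Null object: same interface as FileLogger, but every call is a no-op."""

    def log(self, message):
        pass


def make_logger(logging_enabled, path="mylog.log"):
    """Return a real logger if logging is enabled, otherwise the null object."""
    return FileLogger(path) if logging_enabled else NullLogger()


logger = make_logger(logging_enabled=False)
logger.log("Hello there!")  # silently ignored, no None check needed
```

It works, but it takes an extra class per interface, which is exactly the kind of boilerplate that a NULL-tolerant language would make unnecessary.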

Now that I’ve thought about it for a while, what I really want is that APIs and/or higher-level languages behave like /dev/null in Unix: if there is nothing, do nothing. I believe this is just another variant of the Rule of Silence:

dd if=/dev/null of=blah.txt bs=1 count=42 # Copy 42 bytes out of null device -> nothing is copied (blah.txt ends up empty).
dd if=file.txt of=/dev/null bs=1 count=42 # Copy 42 bytes from file.txt to null device -> the bytes vanish.
dolphin > /dev/null # Ignore output of dolphin command.

Contrary to what most people believe, doing ‘nothing’ is often harder than one might expect. Here are some examples:

  • Setting/printing a NULL string: don’t output anything
  • Writing to a file through a file pointer that is NULL: don’t write, don’t complain.
  • Writing NULL to a properly opened file: don’t write anything, leave file pointer unchanged.
  • Opening a file when filename is NULL: don’t open it, just return NULL.
  • Using memcpy where dest is NULL: don’t copy, just return NULL.
  • Using memcmp where either src or dest is NULL: return -1 (or +1).
  • Reading 42 bytes from a socket where dest array is NULL: receive 42 bytes and discard them.
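
A couple of the items in this list can be sketched as thin NULL-tolerant wrappers in Python, with None playing the role of NULL. The helper names `write_text` and `copy_bytes` are hypothetical, and the return values (0 and None) are my own choice for “nothing happened”.

```python
import io


def write_text(stream, text):
    """Write `text` to `stream`; a None stream or None text is silently ignored.
    Returns the number of characters written."""
    if stream is None or text is None:
        return 0
    return stream.write(text)


def copy_bytes(dest, src, count):
    """Copy up to `count` bytes from `src` into the bytearray `dest`.
    Returns `dest`, or None if either buffer is None (nothing is copied)."""
    if dest is None or src is None:
        return None
    n = min(count, len(src), len(dest))
    dest[:n] = src[:n]
    return dest


buffer = io.StringIO()
write_text(None, "dropped")   # no stream: don't write, don't complain
write_text(buffer, None)      # nothing to write: leave the stream untouched
write_text(buffer, "kept")
print(buffer.getvalue())      # kept
print(copy_bytes(None, b"abc", 3))  # None
```

Each wrapper costs one extra comparison per call, which hints at why a low-level language would rather not pay for it everywhere.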

Even though I’m convinced that NULL-tolerant APIs would often simplify a programmer’s life, for a low-level programming language like C, performance is everything and hence being NULL-tolerant is probably not an option. But higher-level languages (and dynamic languages and high-level APIs implemented in lower-level languages) could certainly benefit.

The “Software-Project-as-a-Ladder” Metaphor

The world of software development is a world full of metaphors. Since its early days, software development has been compared to fields like writing, construction, and even rugby.

Metaphors are important for understanding unfamiliar topics and, yes, even in the 21st century, software development — with its own set of peculiarities — is still unfamiliar to many, including people who have worked in the software industry for decades.

Today, I would like to contribute yet another analogy by comparing a software project to a ladder.

A classic wooden ladder is made of two rails and multiple rungs, mounted together with wood glue.

THE RAILS

No matter what kind of ladder you build — short or long — you need two firm rails that act as a framework for the rungs.

In software development, the first rail corresponds to the people and the second to the process.

It is fairly obvious (and much has been written elsewhere about the fact!) that you need the right people in order to succeed. Right means highly motivated, proactive developers with the required skill set for the job at hand.

But you also need the right process that makes sure that developers are going in the right direction. The process doesn’t need to be heavy and bureaucratic (it shouldn’t be, in fact), but it must ensure that work progresses steadily towards the final product and that no important aspect of the product is overlooked. In short, a systematic process is essential to remove the “chance factor” from a software project.

THE RUNGS

Once you have the rails you can add rungs. Rungs will ultimately determine how long your ladder will be. Every rung gets you higher and closer to your goal.

In software development, rungs correspond to practices. Practices like continuous integration, unit testing, refactoring, static analysis, just to name a few. While the process has long-term, far-reaching influence on a software project, practices are micro-processes, steps that are taken by developers on a daily or hourly basis. Processes are strategic, practices are tactical. Each rung on a ladder only has a small impact on the overall length of the ladder, but it is the sum of all rungs that makes up a ladder.

THE GLUE

To get a sturdy wooden ladder, you have to put wood glue inside the notches. This way, you ensure that the rungs will stay in place, even under load.
Even if you have the best-quality rails and rungs, your ladder will sooner or later fall apart without glue.

In a software development project, glue corresponds to the environment that an organization provides to the team: quality of offices and technical equipment, but also issues like work appreciation, salary, and overtime — in short: how people are treated. These factors influence developer motivation and productivity and whether they will stay with the project and the company. In a bad environment sustained quality work is impossible.

I know that every analogy breaks down at some point but what I like about the “Ladder” metaphor is that it doesn’t favor any particular element. For a successful project, one that delivers a great product on time, everything needs to be right: people, process, practices, and environment.

The True Value of PC-Lint

“An ounce of prevention is worth a pound of cure.”

– Benjamin Franklin

When you ask folks who sell static analysis tools why you should buy their expensive products, they all will tell you “to find bugs, of course!”. Very likely, they will show you a diagram that displays the exponential cost growth of fixing a bug, depending on the stage where it is detected. Also very likely, they will brag that — compared to PC-Lint — they have a very low “false positive” rate.

Fair enough, but the story doesn’t end here. Take this code, for instance:

#include <iostream>
#define ADD(x, y) x + y
class Base {
public:
    Base(int i) : _i(i) { ... }
    int add(int a, int b) { return ADD(a, b); }
    ~Base() { ... }
private:
    int _i;
};
class Derived : public Base {
    void increment() { ... }
};

The code, as it stands, is 100% error-free. Yet, PC-Lint will not like it for several good reasons. For instance:

  1. There are no virtual functions in Base, so what’s the point deriving from it? Was it an oversight?
  2. There is no default constructor, and hence you cannot put objects of type Base in standard library containers
  3. The Base constructor is not explicit, so plain integers might get silently converted into Base objects
  4. The add method could be declared const; adding const-correctness later is usually difficult
  5. The ADD macro is not parenthesized and neither are its parameters; this gives rise to all sorts of operator precedence problems
  6. Base’s destructor is not virtual, which means that the behavior is undefined when somebody deletes Derived objects through a pointer to Base
  7. The iostream header file is not used and hence not needed; removing the #include statement improves clarity and compilation times
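To make two of these points concrete, here is a minimal sketch of how items 3 and 5 could bite once somebody starts using the code. The class is a stripped-down stand-in for Base with the elided bodies left empty; the accessor value() and the functions consume(), scaled_sum(), converted(), and check_latent_issues() are hypothetical names that exist only for this illustration.

```cpp
#include <cassert>

#define ADD(x, y) x + y          // the unparenthesized macro from the listing

// Stand-in for Base with the elided bodies left empty.
class Base {
public:
    Base(int i) : _i(i) {}       // not explicit, as in the listing
    int value() const { return _i; }   // hypothetical accessor, demo only
private:
    int _i;
};

// Hypothetical client function that takes a Base by value.
int consume(Base b) { return b.value(); }

// Item 5: ADD(1, 2) * 3 expands to 1 + 2 * 3.
int scaled_sum() { return ADD(1, 2) * 3; }

// Item 3: the int argument is silently converted into a Base.
int converted() { return consume(42); }

// Small self-check that documents the surprising results.
void check_latent_issues() {
    assert(scaled_sum() == 7);   // not 9, as the call site suggests
    assert(converted() == 42);   // compiles although no Base is in sight
}
```

Neither of these is a bug in the class itself; both only surface at the call site, which is why they are easy to miss in a review of Base alone.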

So there are no bugs. But are these issues flagged by PC-Lint really false positives?

To me, the reported warnings are a sign of the poor quality of this code; this code is full of latent bugs, just waiting to come alive in the future or in somebody else’s hands. Shady code like this increases technical debt and makes maintenance harder, riskier, and more expensive.
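For comparison, this is roughly what the class might look like once the seven points from the list are addressed. It is only a sketch under a few assumptions of mine: that inheritance was really intended, that 0 is a sensible default for _i, and that the elided bodies can stay empty. The function check_revised() is a hypothetical helper added purely to exercise the result.

```cpp
#include <cassert>

// Parenthesized macro (item 5); an inline function would arguably be better.
#define ADD(x, y) ((x) + (y))

// Note: the unused <iostream> include is gone (item 7).
class Base {
public:
    Base() : _i(0) {}                 // default constructor (item 2), 0 is assumed
    explicit Base(int i) : _i(i) {}   // no silent conversion from int (item 3)
    virtual ~Base() {}                // virtual destructor (items 1 and 6)
    int add(int a, int b) const { return ADD(a, b); }   // const-correct (item 4)
private:
    int _i;
};

class Derived : public Base {
    void increment() {}               // body elided in the original
};

// Hypothetical helper that exercises the revised class.
void check_revised() {
    const Base b;                     // add() can be called on a const object
    assert(b.add(1, 2) * 3 == 9);     // the macro now behaves like a function
    Base* p = new Derived;
    delete p;                         // well-defined thanks to the virtual destructor
}
```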

I want such issues reported and resolved before I check in my code; actually, before I even execute it. Right after a successful compile, I pay attention to PC-Lint’s feedback and resolve real bugs, latent bugs, and any bad coding practices. I don’t want to get a ticket three weeks later from a software QA guy, after I’m done with debugging and when the mental distance has become so large that it would take a lot of effort to recall what I was thinking at the time of writing. I want quick and easy desktop checking such that my bad code is never seen by anyone but me.

Finding bugs and code smells at the earliest possible time and thus keeping maintenance costs low; not just finding bugs, but preventing them in the first place, today and tomorrow. That’s the true value of PC-Lint.

The Principle of Least Surprise

“He reached out and pressed an invitingly large red button on a nearby panel. The panel lit up with the words PLEASE DO NOT PRESS THIS BUTTON AGAIN. He shook himself.”

“The Hitchhiker’s Guide to the Galaxy”

I’ve written elsewhere that to me, the source code itself constitutes the design of a software product. All other forms of (design) documentation fall behind, sooner or later. If you have other documentation in parallel — however legitimate your reasons may be — you have to pay the price that all violations of the DRY principle incur.

If the code is the documentation of the design, it should be easy to read. Good identifier names, short routines that do just one thing, no super-clever hacks and so on. Most importantly, it should be free of surprises such that it can be read (and skimmed) without major effort. Enter the Principle of Least Surprise (PoLS).

A for-loop like this is a surprise:

for (int i = 0; i <= len; i++) ...

Why? If a programmer sees a start index of 0 (s)he assumes an iteration over a half-open range; that is, the upper bound is excluded. Even if it is OK in this case and not a bug (who is to tell without browsing a lot of other code?) it results in quite some head-scratching. Contrast this with this rewrite:

const int SPARE_ELEMENT_COUNT = 1;
...
int gross_len = len + SPARE_ELEMENT_COUNT;
for (int i = 0; i < gross_len; i++) ...

Even without comments, everyone gets it: “Do something ‘gross_len’ times”.
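As a quick sanity check, the following sketch counts the iterations of both loop forms to show that the rewrite changes only how the intent is expressed, not how often the body runs. The function names count_closed(), count_half_open(), and check_equivalence() are hypothetical and not taken from any real code base.

```cpp
#include <cassert>

const int SPARE_ELEMENT_COUNT = 1;

// The surprising form: start index 0 combined with an inclusive upper bound.
int count_closed(int len) {
    int iterations = 0;
    for (int i = 0; i <= len; i++) ++iterations;
    return iterations;
}

// The idiomatic form: half-open range over an explicitly named length.
int count_half_open(int len) {
    int gross_len = len + SPARE_ELEMENT_COUNT;
    int iterations = 0;
    for (int i = 0; i < gross_len; i++) ++iterations;
    return iterations;
}

// Both forms execute the loop body len + 1 times.
void check_equivalence() {
    assert(count_closed(4) == 5);
    assert(count_half_open(4) == 5);
}
```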

Recently, I came across code that initialized a state machine:

state = STATE_READY | STATE_PERMANENT | STATE_PRIO_HIGH;

I was hunting down a bug. Since no events had occurred, I expected the state machine to be still in state ‘permanent’ but the behavior of the component made me believe that it wasn’t. So I loaded the code into a debugger and set a write breakpoint on variable ‘state’ to find out which code (actually who) reset the ‘permanent’ flag. But apart from the initialization of ‘state’ I didn’t get a hit. After walking around for some time (it is always a good idea to get away from the computer when solving difficult problems) I had another desperate idea: maybe ‘state’ was never initialized to ‘permanent’. I opened up the header file that defined the flags and what I stared at with horror was this:

static const int STATE_READY      = 0x01;
static const int STATE_PRIO_HIGH  = 0x02;
// static const int STATE_PERMANENT  = 0x04;
// Our platform doesn't support 'persistent' --> ignore!
static const int STATE_PERMANENT  = 0x00;

I guess I must have looked like Steve McConnell’s “Coding Horror” guy.

Now, I consider it OK if a platform doesn’t support all features; but this way of dealing with it is probably the worst. And it is a clear violation of PoLS: when I see code that looks like it sets a flag, I expect that it sets the darn flag. Period.
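One less surprising way to handle an unsupported feature might be to leave the flag’s value alone and to mask out unsupported flags at a single, clearly named place. The sketch below is only an illustration of that idea, not what the original code base did; PLATFORM_SUPPORTED_STATES, supported_states(), initial_state(), and check_initial_state() are hypothetical names of mine.

```cpp
#include <cassert>

static const int STATE_READY     = 0x01;
static const int STATE_PRIO_HIGH = 0x02;
static const int STATE_PERMANENT = 0x04;   // keeps its real value

// Hypothetical: the flags this platform can actually honor.
static const int PLATFORM_SUPPORTED_STATES = STATE_READY | STATE_PRIO_HIGH;

// Hypothetical helper: drops unsupported flags in one visible place.
int supported_states(int requested) {
    return requested & PLATFORM_SUPPORTED_STATES;
}

// The initialization now says openly that some flags may be filtered out.
int initial_state() {
    return supported_states(STATE_READY | STATE_PERMANENT | STATE_PRIO_HIGH);
}

void check_initial_state() {
    assert(initial_state() == (STATE_READY | STATE_PRIO_HIGH));
    assert((initial_state() & STATE_PERMANENT) == 0);
}
```

The resulting state is the same as before, but a reader of the initialization sees the call to supported_states() and knows right away that not every requested flag is guaranteed to survive.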

So the general advice for adhering to PoLS is “Write WYSIWYG Code”. Code that uses well-known idioms consistently, code that can be grasped without debugging and jumping back and forth between files and declarations. Put all cards on the table; say what you mean and mean what you say.