The Art of Writing Efficient Software

I’ve just uploaded my talk “The Art of Writing Efficient Software” to slideshare.
Share and Enjoy!

Bitten by the Python

pythonIt’s amazing how one can get used to something nice, up to a point where you don’t recognize it anymore. Sometimes, you don’t recognize it even when its not there anymore — at least initially.

After a night of Python coding, I was back at my embedded C++ project, when I wrote code similar to this:

...
// If item is in range.
int item = ItemStore.getCurrentItem();
if (0 <= item < ItemStore.getItemCount()) {
    ...
}

The expression that I used to check whether the item index is in range is valid in both, Python and C++. Alas, only in Python it does what it should; in C++ it's a clear bug. Obviously, the if statement should have looked like this:

if (0 <= item && item < ItemStore.getItemCount()) {
    ...
}

Duh! Of course the comparison operator doesn't work like this in Java/C#/C/C++, but while Java and C# compilers reject this code, WindRiver's Diab C/C++ compiler happily accepts it.

I wasn't so much annoyed about the fact that the Diab compiler accepted it, since it is syntactically correct according to the rules of C and C++; what upset me was the "happily" -- it didn't produce any warning about my nonsense!

There is, as I found out later, a warning level that would have reported my mistake, but this warning level was not enabled in our makefile, probably because it would have generated many false positives.

Even though I always pay attention to compiler warnings before flashing my code to the target, I had to find this bug the hard way: by using a debugger. I guess that I wasted more than one hour in total to track it down.

What I expect from compiler vendors is that (a) by default, warnings are on and that (b) this default warning level includes all warnings that are easy to detect and are almost always bugs or latent bugs, including (but not limited to):

- Use of unitinitalized variables
- Test of assignment, eg. if (a = b)
- Expressions with no side-effects, eg. a == b;
- Returning addresses of automatic variables
- Forgetting to return a value from a non-void function
- My stupid comparison mistake

I don't expect all C/C++ compilers to be as thorough as PC-Lint. Ironically, most of today's compilers are already able to report the aforementioned issues (and many more), but they do this only at higher warning levels that nobody usually enables.

The Online Programmer

“You should not have a favorite weapon, or any other exaggerated preference for that matter. To become overly attached to one weapon is as bad as not knowing it sufficiently well.”

– Miyamoto Musashi
“The Book of Five Rings”

Recently, my principles were put to the test when I was asked to integrate my existing Java application with another application that was implemented in C#/.NET — a task that made me shudder. I had fully absorbed Richard Stallman‘s philosophy, I was (and still am) a great fan of free and open software.

C#, I thought (and still think), was a nice language, but not really that much different from Java, it’s main purpose just being yet another attempt from Microsoft to lock in people into their technology and operating system (I have to clarify here that I’m not a Microsoft hater; Microsoft has done incredible services to the computing world. What I don’t like is that their operating systems are for users and not for programmers. With Mac OS X, Apple has proved that it is possible make an operating system that is friendly to users and developers at the same time; yet I avoid Apple for other reasons, but I’m digressing…)

Someone, by the name of Bryan Adams, once said “Ain’t no use in complainin’, when you got a job to do” and he was definitely right: a job is a job and life isn’t always a bowl of coffee beans. So I started porting my code to C#, I even did it in Visual Studio (Folks, since I do most of my typing in Vim, you cannot imagine what a torture that was; I constantly had to leave the home row of my keyboard, grab the mouse, go back to the keyboard, again and again. Luckily, I found a nice Vim plugin for Visual Studio that relieved most of my pains. Derek Wyatt has made some terrific screen casts about Vim; if you don’t know the benefits of Vim, please, please watch them!). Contrary to my initial belief, the job didn’t turn out to be that difficult and disgusting, mainly due to the friendly help of two guys: one was a colleague who is an experienced C#/.NET developer and the other was Mr Google himself.

The way I program these days is different to the way I programmed ten years ago. In the old days, I tried hard to learn every aspect of a programming language and most of what the libraries had to offer.

But this doesn’t work anymore, at least for me. Programming language elements and class libraries are so similar that I cannot keep them in separate compartments of my brains. But this is not a problem, at least if I have an Internet connection.

I don’t even bother to consult books, not even the HTML API documentation. Instead, I ask Mr Google things like this:

“ArrayList in C#”
“socket python”
“asynchronous IO in Java”
“XPath Java”
“random numbers C++”

In return, I get links to Q&A sites, tutorials, and online API documentation. Within seconds, I get my problem solved, sometimes even examples that I can use in my own code.

The advantage is that I’m now able to program in many languages, which is great fun and gives me a competitive edge over developers who only call themselves “C++” or “Java” experts. On top of that, I get many new insights by working with different languages; insights that turn out to be useful in completely other contexts.

I hope the Internet never goes down, though…

The Return of the Pizza Delivery Guy

Once again, while reviewing some source code, I’ve come across two new solutions to the Pizza Box Problem. For those who don’t remember, the Pizza Box Problem is about finding an answer to the question how many pizza bags you need to ship a certain quantity of pizza boxes. This problem pops up frequently in systems programming, albeit in different guises. The solution I gave then was:

pizza_bags = (pizza_boxes + N - 1) / N;

where N is the number of pizza boxes that fit into a pizza bag and ‘/’ is the integer division operator that rounds towards zero.

Here is the first alternative I’ve seen. Someone wants to chop-up a larger message into smaller packets and in order to find out how many packets are to be sent:

static const uint32_t MAX_PACKET_SIZE = 6000;
uint32_t packets = size / MAX_PACKET_SIZE;

if (size % MAX_PACKET_SIZE != 0) {
    ++packets;
}

What this code does is add one to the result of the division, except for cases where the remainder of the division is zero. This code seems to work fine, but is it also efficient?

At first glance, you might think that it is way less efficient than my official solution on the grounds that it uses the modulo operator and this requires yet another expensive integer division operation.

As usual, the real answer is: “it depends”. Some architectures (like Intel x86) have DIV instructions that return the division result and the remainder at the same time, while others do not. If the former is the case, you get the modulo operation for free. If not, you will have to pay the price. For instance, the PowerPC architecture requires you to calculate the remainder in three steps “by hand”:

divd RT,RA,RB  # RT = quotient
mulld RT,RT,RB # RT = quotient*divisor
subf RT,RT,RA  # RT = remainder

Another downside is the fact that the code is not branch-free. Depending on the platform you are on, you might get a penalty if the branch is taken (“pipeline flush”). Yet, the likelihood that the branch is taken is actually low, as it happens only if size is not evenly divisible by MAX_PACKET_SIZE.

By and large, I don’t like this approach. It requires more typing and is likely to have an efficiency issue.

But I’ve come across another, more promising solution:

packets = 1 + ((size - 1) / MAX_PACKET_SIZE);

Compared to my original solution, this version has a clear advantage: it doesn’t suffer from overflow issues. Consider the original pizza box formula:

pizza_bags = (pizza_boxes + N - 1) / N;

Depending on the value of size and N it is possible that their sum overflows the value range of their underlying types, which would result in a wrong value of pizza_bags.

On the other hand, the original solution has one distinct advantage over the other two solutions: it handles the case where size (or the number of pizza boxes) is zero correctly.

Only the original version will yield a result of zero in this case; the others will compute one, which is — of course — wrong. Any pizza delivery guy knows that if you have no pizzas you don’t need no bags at all!

So depending on the constraints of your context, you now have two efficient possibilities to choose from. Regardless of your decision, I advise you to use explicit assertions to state and check your assumptions. Either:

assert(pizza_boxes + N - 1 >= pizza_boxes);
pizza_bags = (pizza_boxes + N - 1) / N;

or

assert(pizza_boxes >= 1);
pizza_bags = 1 + ((pizza_boxes - 1) / N);

Circular Adventures IV: A “Complementary” Gift


“Th’ hast spoken right, ’tis true.
The wheel is come full circle, I am here.”

– Shakespear, King Lear Act 5, scene 3, 171–175

In the previous part of this series, I’ve shown how to calculate the distance between two indices into a circular buffer using the mathematical mod operator. Today, I will give you a solution that doesn’t need the mod operator and is even more efficient.

We have always viewed our N element ring buffer like this:

0 1 2 3 4 5 6 7 8 9
      ^     ^ 
      a     b

There are N = 10 elements, indices start at 0 and end at N – 1. But let’s pretend for a moment that we had signed indices instead; indices that can become negative:

-5 -4 -3 -2 -1  0  1  2  3  4
          ^        ^ 
          a        b

Like in the unsigned system, we assume that the borders are joined, that is, the successor of 4 is -5 and the predecessor of -5 is 4.

The arithmetic distance between a and b is, like before, b – a:

    1 - -2 = 1 + 2 = 3

What is special about this new system is that if the distance gets larger than 4, it suddenly becomes negative. Remember that the successor of 4 is -5. So in this case

-5 -4 -3 -2 -1  0  1  2  3  4
          ^                 ^ 
          a                 b

b – a is +6 which is, taken our special wrap-around mechanism into account, -4. This means that not b is ahead of a, but a is ahead of b by 4!

In fact, the wrap-around value of the arithmetic distance is exactly the circular distance that we calculated in part III using a different approach:

b-a    leader  a < b   cd(a, b)
-------------------------------
 0       -     false      0
 1       b     true       1
 2       b     true       2
 3       b     true       3
 4       b     true       4
 5       -     -          -
 6       a     false     -4
 7       a     false     -3
 8       a     false     -2
 9       a     false     -1
-1       a     false     -1
-2       a     false     -2
-3       a     false     -3
-4       a     false     -4
-5       -     -          -
-6       b     true       4
-7       b     true       3
-8       b     true       2
-9       b     true       1
-------------------------------

Note that our invariant which we established in the previous installment still applies: the true circular distance between two indices must be half the size of the ring buffer (N) or less. That’s why the circular distance for cases where b – a equals +/-5 is marked as undefined in the table.

In order to calculate the circular distance, we need to take the following steps:

1. Convert the positive indices a and b from an unsigned number system (0 .. N) into indices of a signed number system (-N/2 – 1 .. +N/2) via a ‘magic’ wrap-around function.

2. Calculate the arithmetic difference: d = magic(b) – magic(a)

3. Calculate the circular distance by applying the magic function to the arithmetic distance: cd = magic(d)

Or, in pseudo code:

circular_distance(a, b, N):
    return magic(magic(b) - magic(a))

You probably wonder how magic() could be implemented, but by now, you should know that the mod operator is the answer to almost all circular problems. So one way to implement magic() is this:

magic(x, N):
   return (x - N/2) mod N

Viewed from a different angle, our new signed system is really just the old unsigned system translated by -N/2.

But what have we gained compared to our implementation of circular_distance from part III? We now have to invoke the mod operator three times (due to three calls to magic()) compared to one time. And we still have the problem that the mathematical mod operator is not available on all platforms/languages. Unless we find an efficient alternative implementation of magic(), we are off worse.

If you take a closer look at the new signed system (values that go from -N/2 – 1 to +N/2) you might notice that it bears a strong resemblance to C’s integer types. Consider the following value ranges on a typical 32-bit system:

Type            From         To
-------------------------------------
signed char     -128         +127
signed short    -32768       +32767
signed int      -2^31 - 1    +2^31
signed long     -2^31 - 1    +2^31
-------------------------------------

Apparently, in the C programming language, the type ranges also go from -N/2 – 1 to +N/2! But what’s even better is that if you exceed the upper/lower boundaries of a signed C integer type, wrapping will be done just like we want it:

signed short x = 32767;
++x;
assert(x == -32768);
--x;
assert(x == 32767);

Strictly speaking, according to the C/C++ standards, what happens on signed integer over-/underflow is not defined. But if your system uses two’s complement to represent negative numbers (and I’ve never come across a system that does not) you are fine.

Leaving this academic portability issue for C/C++ aside, in C/C++/C# and Java, magic() corresponds to a simple cast to a signed type, provided our ring buffer has a size that is supported by the value range of an integer type. If, for example, N is 256, circular_distance can be implemented like this:

#define CIRCULAR_DISTANCE_8_BIT(a, b) ( (int8_t) ( (int8_t)(b) - (int8_t)(a) ) )

We can leave out the explicit casts if we use a function, since the parameters and return value will be ‘cast’ implicitly:

inline int8_t circular_distance_8_bit(int8_t a, int8_t b) {
    return b - a;
}

As an example, consider a 32-bit timer that keeps track of the system time. A timer interrupt service routine will update the timer value regularly behind the scenes. Based on what we have learned, you can easily measure the time between two events, even if the timer should overflow in between:

volatile uint32_t g_systemTime;
...
uint32 start = g_systemTime;

// Do some lengthy operation.

int32_t delta = circular_distance_32_bit(start, g_systemTime);

Most systems software (including embedded systems) is implemented in C, mainly because efficiency is at a premium. Due to the fact that C doesn’t come with a mathematical mod operator, it is sometimes sensible to “force” the size of timers/counters/ring buffers to the upper bound of an unsigned integer type. With this being the case, the aforementioned ‘cast-to-signed’ trick can be applied and calculating circular distances becomes both, easy and computationally cheap.

Circular Adventures I: The Modulo Operation
Circular Adventures II: A Ring within a Ring
Circular Adventures III: A Distance Apart

Masterpieces

I’ve already written about the importance of daily practicing for software developers. One essential habit is Playgrounding, where developers practice in easily accessible try-out areas.

Playgrounds are great for tactical programming, when you want to explore a language feature or a certain technique. In order to try out strategic — or larger — ideas you need something else, something I call Masterpieces.

I borrowed the term Masterpiece from the guild system that was (and still is in many countries) responsible for the education of professional craftsmen. Journeymen who want to become masters not only have to take theoretical lessons and write exams; at the end they have to present a so-called ‘masterpiece’ which is meant to demonstrate that they possess the required skills of the trade.

In my opinion, developers should work on Masterpieces, too.

By my definition, a Masterpiece is a project that serves to improve development skills, provides real-world utility to others, and demonstrates the skill and maturity levels to potential clients and employers.

The last point should not be underestimated. As a hiring manager, would you rather hire developers who can demonstrate their abilities (and stamina) to create real-world software products or people who just claim how good they are? (Usually, for legal reasons, it is next to impossible to show production code from a previous (commercial) product that candidates have worked on, but with Masterpieces it is easy.)

Masterpieces don’t have to be true masterpieces according to the other generally accepted definition of the term: “works of art of outstanding quality and/or novelty”. If they are, that’s terrific, but the main focus of a Masterpiece is on skill-improvement and demonstrating the “can-and-willing-to-do” attitude of a developer. Nevertheless, it goes without saying that Masterpieces should be of decent quality. For instance, source code should be laid-out nicely, have good identifier names and there should be documentation, at least a README file, that shows how to use it. If there are automated unit tests, so much the better!

Here is an example of a very small Masterpiece. Some time ago, while trying to improve my Python skills, I wrote a tiny tool called “NoComment!” — a Python script that scans C/C++/Java source code and counts the total number of commented and uncommented lines. It also calculates a comment-density metric. Call it trivial, but it does have real-world utility. You can use it to easily find out if there is code in your project with too little documentation. Even though it is extremely simply, I added a README file, some regression tests and put it up on Sourceforge.

But just showing that you are a great coder is not enough these days — social and especially communication skills are at least as important.

That’s why Masterpieces are not limited to implementing software. You can improve, share, and demonstrate your skills by hosting blogs, giving lectures at conferences, publishing tutorials, podcasts, screencasts, and by contributing to Q&A sites like stackoverflow.com (who wouldn’t want to hire Jon Skeet for his next C# project?).

Masterpieces are a clear win-win habit: you can improve and demonstrate your skills, while others benefit from your knowledge and experience. Just like in the guild system, you cannot become or be a master without them.

Circular Adventures III: A Distance Apart

“Distance has the same effect on the mind as on the eye.”
― Samuel Johnson, The History of Rasselas, Prince of Abissinia

In previous episodes of this series, I’ve only focused on circular problems involving a single index into a ring buffer. Today, I want to extend the scope to use-cases involving two indices.

At the end of part two of this series, I posted a problem where I wanted to keep track of the time when certain events occurred:

+---------------------------------+-----------+
| First type of event             | 353011    |
+---------------------------------+-----------+
| Second type of event            | 1643      |
+---------------------------------+-----------+
| Third type of event             | 4         |
+---------------------------------+-----------+
| ...                             | ...       |
+---------------------------------+-----------+
:                                 :           :

The first column of this table represents the event type, the second a time-stamp showing when the event took place. The time-stamp comes from a 32-bit timer which can wrap-around.

Our task is to find out in which order these events occurred, which basically boils down to finding a method that allows us to compare two time-stamps, taking wrap-around into account. Once we have such a method, we are able to sort the event list.

Viewed from a another angle, comparing two values of a wrap-around timer is exactly the same as comparing two indices into a ring buffer:

0 1 2 3 4 5 6 7 8 9 (Fig1)
      ^     ^ 
      a     b

In this ring buffer of 10 elements, is a ahead of b or is it the other way around? Which one is larger and which one is smaller? Since we don’t know how many times a or b have wrapped around we are not able to give a meaningful answer to this question. So let’s agree on a rule, an invariant that at any time holds for our indices: a and b may wrap around (as many times as they like) but their real, absolute distance is always less than half the size of the buffer (i. e. N div 2). This effectively means that one index can never get ahead of the other by half the size of the buffer or more.

Under this assumption, it is clear that in Fig1 b is ahead of a and the true distance is 6 – 3 = 3. Because of our invariant, a cannot be ahead of b. Why? If a were ahead of b it would have wrapped around and the real, absolute distance between a and b would then be 7, thus violating our rule which requires that the real distance is less than 5.

Contrast this with the next example:

0 1 2 3 4 5 6 7 8 9 (Fig2)
  ^           ^ 
  a           b

a must have wrapped around and it must be ahead of b by 4. Otherwise (if b were ahead of a), the distance would be 6 and this — according to our “less than half the buffer size” rule — would be illegal.

The next example is ill-formed: regardless of how you view it, the distance is always 5 so the invariant is violated and we cannot tell which index is ahead of which:

0 1 2 3 4 5 6 7 8 9 (Fig3)
  ^         ^ 
  a         b

Just in case you wonder if there is a way to do without our cumbersome invariant: We could in fact utilize the full ring buffer range and not only less than the half, but this would require another invariant: one that requires that index a is always behind index b and there is no overtaking taking place. Employing this invariant works in producer/consumer scenarios but turns out to be a lot less useful in the most typical circular use cases. It certainly would be of no help in our event-recording example above, as we cannot assume which event comes before or after another.

Based on our invariant, it was quite easy for us (as human beings) to determine which index is ahead of which. But how can we determine it in software? Simply calculating the linear distance b – a is not sufficient, since we are operating in a circle and not along a one-dimensional axis. In Fig2, calculating b – a gives 6, but the real distance is -4, since a is ahead of b by 4. If we calculate a – b, we get -6 still not -4. How can we get from linear differences to circular differences?

Ah, you must have guessed it! Here it goes again, the mod operator, our universal weapon for fighting circular problems!

Let me explain how the mod operator helps by using one of the most powerful analysis tools ever invented — a table:

b-a  (b-a) mod 10  leader  a < b    Note
-------------------------------------------------
 0         0         -     false    Distance is 0
 1         1         b     true
 2         2         b     true
 3         3         b     true
 4         4         b     true
 5         5         -     -        Illegal, Fig3
 6         6         a     false
 7         7         a     false
 8         8         a     false
 9         9         a     false
-1         9         a     false
-2         8         a     false
-3         7         a     false
-4         6         a     false
 5         5         -     -        Illegal, Fig3
-6         4         b     true
-7         3         b     true
-8         2         b     true
-9         1         b     true
-------------------------------------------------

In the first column are all possible values for the linear distance of two indices; in the second column, this linear distance is taken modulo N, the size of the ring buffer (which happens to be 10 in our examples). In the third column, I put the leader, the index which is ahead of the other index. I did this manually, by drawing sketches and counting, just like I did in Fig2, when I claimed that a is ahead of b by 4. There are two discontinuities in the table: when the distance is exactly 5, that is, half the size of the ring buffer.

Ignoring these discontinuities and looking at column 2 and 3 we can observe that b is ahead of (or equal to) a if the linear distance taken modulo 10 is less than 5; otherwise a is ahead of b.

Based on this observation, we are now able to implement a less-than operator, which allows us to sort circular values. (As you may know, many sorting algorithms from standard libraries, like, for instance, the C++ standard library sort() algorithm, invoke the less operator ‘<’ on values of a given range to do its job.) Here is an implementation of a circular less function in pseudo-code:

# Returns true if a is less than b in a circle of size N,
# false otherwise.
circular_less(a, b, N):
    if ((b - a) mod N < N div 2)
        return false
    return true

Having had so much fun and success with the table, I added yet another column: the true circular distance, again based on sketching and counting. In Fig2, the true circular distance between a and b would be -4, since a is ahead of b. In Fig1, the circular distance is +3, since b is ahead of a:

b-a  (b-a) mod 10  leader  a < b   cd(a, b) 
-------------------------------------------
 0         0         -     false      0
 1         1         b     true       1
 2         2         b     true       2
 3         3         b     true       3
 4         4         b     true       4
 5         5         -     -          -
 6         6         a     false     -4
 7         7         a     false     -3
 8         8         a     false     -2
 9         9         a     false     -1
-1         9         a     false     -1
-2         8         a     false     -2
-3         7         a     false     -3
-4         6         a     false     -4
-5         5         -     -          -
-6         4         b     true       4
-7         3         b     true       3
-8         2         b     true       2
-9         1         b     true       1
-------------------------------------------

Comparing column 2 and 5 shows that the circular distance (cd) coincides with the linear distance mod 10 for values less than half the size of the circle. For other cases (again, ignoring the discontinuity cases) the circular distance can be obtained by subtracting 10 from the linear distance (mod 10). In pseudo-code:

circular_distance(a, b, N):
    dm = (a - b) mod N
    if (dm < N div 2)
        return dm
    return dm - N

Now this function is even more useful than circular_less. You can use it directly in sorting algorithms like qsort() from the C standard library, or, if you are using C++’s sort(), you can easily implement circular_less in terms of circular_distance:

circular_less(a, b, N):
    if (circular_distance(a, b, N) < 0)
        return true
    return false

But the real benefit of circular_distance over circular_less is that you are now able to do real circular distance calculations. In our event table example above, you can determine how many ticks are between event 1 and event 2. Or, as another example, if you have a free running 16-bit timer (a counter with N = 65536) that is updated every millisecond, you could use circular_distance like this:

start_time = get_timer_value()
... some lengthy calculation
end_time = get_timer_value()

cd = circular_distance(start_time, end_time, 65536)
log("Lengthy calculation took " + cd + " milliseconds")

Even at the risk of repeating myself, please be aware that the mathematical mod operator and Java’s/C’s % operator are not the same. You can, however, apply the trick that I showed in part I: use ANDing instead of the mod operator for cases where N is a base-2 number; 65536 is a base-2 number so in C you would implement circular_distance like this:

inline int circular_distance_16bit(int a, int b) {
    const int N = 65536;
    int dm = (a - b) & (N - 1);
    if (dm < N / 2) {
        return dm;
    }
    return dm - N;
}

In this installment, I’ve shown how to efficiently use the mod operator to solve circular problems involving more than one index. The solution is based on an invariant that states that the distance between two indices is always less than half the size of the value range. In the next — and final — episode, we will learn about another technique, one that is even more powerful than the mod operator: performing circular distance calculation in a two’s complement system.

Next: Circular Adventures IV: A "Complementary" Gift

Capital Offense: How to Handle Abbreviations in CamelCase

The CamelCase notation is one of the most popular word separation schemes. It used to be my favorite as well, but these days, I happen to prefer underscores, just like Kernighan and Ritchie did. But this is just my personal taste and it might be subject to change. I certainly don’t want to convince you to change your style. As always, it is more important to follow one style consistently than trying to find the “best” style.

Speaking of consistency, if there is one clear downside of CamelCase, it’s that it seduces people into being inconsistent — at least if abbreviations (or acronyms) are used. Take a look at some real-life examples from the Java API:

    HTTPException
    HttpURLConnection
    XMLFormatter
    XmlID
    XmlIDREF

But what an inconsistent mess this is! Followers of the CamelCase school of thought obviously have difficulty deciding how to handle abbreviations. In some cases, an abbreviation is written all-uppercase, in other cases only the initial character is capitalized. To avoid such mixes, some coding conventions, like Python’s PEP-8, try to cure this by giving explicit advice:

When using abbreviations in CapWords, capitalize all the letters of the abbreviation. Thus HTTPServerError is better than HttpServerError.

I totally disagree. While it works in simple cases, it leads to abominations when one abbreviation follows another:

    HTTPURLConnection
    XMLIDREF

The only convention that works reliably is this: “Treat abbreviations just like ordinary words”. That is, don’t use all-capital letters:

    HttpException
    HttpUrlConnection
    XmlFormatter
    XmlId
    XmlIdRef

This works also if identifiers have to start with a small letter (e. g. names of local variables):

    httpException
    httpUrlConnection
    xmlFormatter
    xmlId
    xmlIdRef

Software Project Democracy

“It has been said that democracy is the worst form of government except all the others that have been tried”
– Sir Winston Churchill

Considering all that fuss going on about agile processes these days, I sometimes wonder if this is yet another case of violating the first principle of the Agile Manifesto:

“Value individuals and interactions over processes and tools”

Given that there are so many books and consultants on the subject of Scrum/Kanban/Lean Product Development, isn’t it possible that people once again focus on processes instead of individuals?

While this hype is certainly annoying at times, the answer is a clear “No”. Agile processes care a lot about individuals and give democracy to the whole team. There is a clear concept of cross-functional, self-managing teams. Developers “pull” their work and decide (or at least largely influence) what and how to do it; work is not foisted upon them by a single dictator (aka. traditional project manager). In Scrum, the project manager role is filled by the product owner together with the rest of the team. This distributed style of project management is democratic, empowering, and very motivating for all members.

Yet, the most people-valuating “process” I’ve ever heard of comes from Tom DeMarco and Timothy Lister, mentioned in “Peopleware”:

1. Get the right people
2. Make them happy so they don’t want to leave
3. Turn them loose

To me, this is the best recipe for project success ever. If you’d follow this “process”, you wouldn’t need Scrum/Kanban/XP at all. If your team just consisted of smart and motivated people, everything else would follow. Why? Because as part of the team — and most importantly — you would also have the “right” leader, one who clearly says what to do and in what order. I’m not talking about a Gantt-chart/PowerPoint virtuoso, but a leader with the following properties and skills:

1. Charismatic, a born leader
2. Systematic, follows-up issues
3. Technically adept
4. Cares about the product and the team more than his/her personal career
5. Has the respect of upper management and the rest of the team
6. Shows that (s)he cares by inspecting and criticizing the main work product (the “Source Code”)

Alas, such gifted leaders are very hard to find.

Some claim that in real life benevolent dictatorship is by far the best form of government, much better than democracy, which is accompanied by frequent shady compromises. But also in real life, benevolent dictators are hard (if not impossible) to find and that’s why democracy was invented. So despite of all its shortcomings, democracy is the best government system in practice.

In some sense, you can view the appearance of agile methodologies as a result of our industry’s inability to find the “right” leaders. Yes, there are training and certification programmes for project managers, but in my honest opinion, they don’t help a lot; if anything, they focus on the “systematic” part, yielding people that spend a lot of time on upfront planning, estimation, and project reporting (“defending”) to upper management. They not only waste their own time but also the time and motivation of the rest of the team.

Hence, I think this strong shift from software management absolutism to software management democracy — from traditional project management to lean and agile management — it is not just hype. It is a natural and logical consequence. It is an uprising from the disappointed and suppressed “doers” who have had to suffer far too long from “traditional” project management practices.

Until our trade has developed a system for breeding benevolent leaders (like the guild system did hundreds of years ago with the apprentice/journeyman/master model), I vote for agile software development and the software process democracy that it gives me.

 

 

Dangerously Confusing Interfaces, Redux

confused.jpgLast week was a sad week for me. A bug in my code made it into the final version that was shipped to an important customer.

When something like this happens, it is almost always due to fact that there is a higher-level “bug” in the software development process, but I don’t want to go there. Instead, I want to focus on the technicalities. Once more, I got bitten by another instance of a dangerously confusing interface. Let me explain.

I’ve always had a liking for interfaces that are self-evident; that is, one knows immediately what’s going on by just looking at how the interface is used — without having to consult the documentation. Let me give you a counter example, an interface that is far from what I desire:

sort_temperatures(values, values_count, true);

The interface to ‘sort_temperatures’ is not at all self-explanatory. What is clear is that it sorts ‘values_count’ values, which are obviously temperature values, but what the heck does the ‘true’ argument stand for? In order to find out, you have to look at the declaration of ‘sort_temperatures’ and/or its API documentation:

/** Sorts an array of temperature values. 
 * @param values Begin of temperature values array. 
 * @param values_count Total number of temperature values. 
 * @param ascending If true, sort values in ascending order; 
 *        if false, sort in descending order.
 */
void sort_temperatures(double* values, size_t values_count, 
    bool ascending);

Now it is clear what the boolean parameter is for, but at the cost of having to take a detour. Maybe you got so distracted by this detour that you forgot what you originally were about to do.

Programmers often make this mistake when designing interfaces. They use boolean parameters when they have two mutual exclusive modes of operation. This is easy for the implementer of the routine, but confusing to the ones who have to use it. Contrast this with this alternative:

sort_temperatures(values, values_count, SORT_MODE_ASCENDING | 
    SORT_MODE_REMOVE_DUPLICATES);

By replacing the boolean parameter with symbolic constants you not only make the code more readable, you also open it up for future extension: adding more modes becomes straightforward.

Now have a look at this code and try to guess what it does:

uint8_t hash[HASH_SHA256_LEN];
size_t hash_len;
if (!calc_hash(HASH_SHA256,mydata, mydata_len, hash, &hash_len) || hash_len !=
    HASH_SHA256_LEN) {
    LOG << "Hash computation failed" << endl; return false; 
}

That’s fairly simple: ‘calc_hash’ calculates a SHA-256 checksum over ‘mydata’ (which is ‘mydata_len’ bytes in size) and stores it into the provided buffer ‘hash’. The length of the hash is stored to ‘hash_len’ via call-by-reference.

But is this code correct? The answer is — you guessed it — no. If you can’t spot the bug, you are in good company. You can’t see it with just the information given. ‘calc_hash’ interface is not self-evident.

I wrote this code more or less one year ago. It contains a bug that remained dormant until the product was in the hands of our customer. And it is there because of a silly interface.

The last (pointer) parameter ‘hash_len’ actually serves as both, an input AND output parameter. When you call ‘calc_hash’ it is expected that ‘*hash_len’ contains the size of the provided ‘hash’ buffer; on return ‘*hash_len’ will contain the actual number of bytes used by the hash algorithm; that is,the size (or length) of the SHA-256 checksum stored in ‘hash’. The whole idea behind this is that ‘calc_hash’ (or rather its author) wants to offer protection against buffer overruns — for cases where the provided ‘hash’buffer is not large enough to accommodate the checksum.

So the problem here is that ‘hash_len’ (being a stack variable) is not properly initialized to ‘HASH_SHA256_LEN’; it’s value is more or less arbitrary. If it is by chance greater or equal to 32 (the value of the ‘HASH_SHA256_LEN’ symbolic constant) everything is fine and the checksum is correctly calculated. If it is not, ‘calc_hash’ returns ‘false’ and an error is reported.

For as long as a year — by sheer coincidence — ‘*hashLen’ was never below 32 (which is not that unlikely, given that ‘size_t’ can accommodate values ranging from 0 to 4,294,967,296); but in the hands of the customer — and very much in line with Murphy’s Law — it happened.

OK, accuse me of not having initialized ‘*hashLen’ properly, accuse me of not having read the API documentation carefully. Maybe I did read the API documentation and then I was interrupted. I don’t know. But what I know for sure is that this bug would have never happened if the interface had been clearer.

The first problem with ‘calc_hash’ is that ‘hash_len’ is an IN/OUT parameter, which is unusual. I’m not aware of any function in the C/C++ standard library (or the POSIX libraries) which makes use of IN/OUT parameters. Since the input value (not just the output value) is passed by reference, neither the compiler nor static analysis tools like PC-Lint are able to detect its uninitialized state. One obvious improvement is to pass the length of the buffer by value:

bool calc_hash(HASH_TYPE type, const void* src, size_t src_len, void* hash,
    size_t hash_buf_len, size_t* hash_len);

Granted, there is now one more argument to pass (‘hash_buf_len’), but if the unlucky programmer ever forgets to initialize it, the compiler will issue a warning.

But let’s not stop here. I’d like to pose the following question: what good is the hash buffer length check, anyway?

In my view, it is not at all necessary. The length of a particular checksum is constant and known a priori — that’s why the hash buffer is allocated statically like this:

uint8_t hash[HASH_SHA256_LEN];

What additional benefit does a developer get by providing the same information again to the ‘calc_hash’? Isn’t this redundancy that just begs for consistency errors?

And what use is the output value that tells how long the checksum is? Again, this should, no it MUST be known a priori, there should never be a mismatch between what the caller of ‘calc_hash’ expects and what is returned. Of course, there can be error conditions but if ‘calc_hash’ fails, it should return ‘false’ and not a length different to what the caller expects.

Note that ‘calc_hash’ is not at all comparable to functions in the C API that add an additional output buffer length parameter like ‘strncpy’ or ‘snprintf’. These functions carry the length parameter for completely legitimate safety reasons because the total length of the output is usually not known a priori (for instance, as some input may stem from a human user and one has little control over how many characters (s)he will enter).

Based on these arguments, I dismiss the ‘hash_len’ parameter altogether and propose the following simplified interface:

bool calc_hash(HASH_TYPE type, const void* src, size_t src_len, void* hash);

and would use it like this:

uint8_t hash[HASH_SHA256_LEN];
if (!calc_hash(HASH_SHA256, mydata, mydata_len, hash)) {
    LOG << "Hash computation failed" << endl;
    return false;
}

Easy to read, hard to get wrong. It is impossible to forget to initialize a variable that just isn’t there.