Poor Man’s DIP

Sometimes a lower-layer component needs to invoke a service on a higher-layer component. Consider, for example, a timer component (T) that periodically calls a handler function in a user-interface component (U). Component T is probably part of the OS kernel and thus clearly “lower” than component U.

In this setting, there is an upward dependency from T to U; such upward dependencies are undesirable, at least if they are bound at compile time. Implemented naively, there is a hard-coded call to the UI component like this:

// file: OS/Timer.c

#include "UI/UIManager.h"  // Bad!
...
static void OSTimerNotify(void) {
    ...
    UIManagerNotify10msTimeout(); // Bad!
    ...
}

Dependency lines that point up in a component diagram are not just ugly: they denote that the lower-layer component cannot be independently reused and tested.

The classic dependency inversion principle (DIP) is usually applied to solve this problem: instead of having a hard-coded function call in the timer to the handling component, the timer calls back on a function pointer that is set to the timer-handling routine in the initialization code of the higher-layer component:

// file: UI/UIManager.c

#include "OS/Timer.h"

static void UIManager10msTimeout(void) {
    // Do something every 10 ms.
}
void UIManagerInit(void) {
    ...
    OSTimerSet10msCallback(&UIManager10msTimeout);
    ...
}

// file: OS/Timer.h

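typedef void (*TIMER_10MS_CALLBACK)(void);

// Registers the function that is called back every 10 ms.
void OSTimerSet10msCallback(TIMER_10MS_CALLBACK callback);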

// file: OS/Timer.c

#include <stddef.h>  // NULL
#include "OS/Timer.h"

static TIMER_10MS_CALLBACK s_10msCallback;

void OSTimerSet10msCallback(TIMER_10MS_CALLBACK callback) {
    s_10msCallback = callback;
}
static void OSTimerNotify(void) {
    ...
    if (s_10msCallback != NULL)
        (*s_10msCallback)();
    ...
}

Note that there is still a T to U dependency, but now this dependency is only present at run-time, which is OK, as this doesn’t hinder reuse and testability. The U to T compile-time dependency is quite natural and doesn’t violate any design principles. So, the undesirable compile-time dependency has been successfully inverted. The classic DIP recipe looks like this:

1. In T export a callback interface
2. In U implement the callback interface
3. In U (or some init/startup code) register the implementation with T
4. In T call back on the interface

When you are working in a constrained environment like embedded systems, you frequently cannot afford the memory and performance overhead that accompanies such late (run-time) binding, so you might try what I call the “Poor Man’s DIP”: simply export a “callback interface” as a function prototype and “implement” it by defining the function in the upper-layer component:

// file: UI/UIManager.c

#include "OS/Timer.h"

// Implementation of the call-back interface.
void OSTimerNotify10msTimeout(void) {
    // Do something every 10 ms.
}

// file: OS/Timer.h

// Call-back interface.
extern void OSTimerNotify10msTimeout(void);

// file: OS/Timer.c

#include "OS/Timer.h"

static void OSTimerNotify(void) {
    ...
    OSTimerNotify10msTimeout();
    ...
}

This pattern gives you most of the advantages of the classical (run-time bound) DIP but doesn’t incur any overhead. It can (and should) be applied whenever there is a dependency from a lower-layer component to an upper-layer component that doesn’t need to change at run-time but stays fixed throughout the lifetime of the application.

Breaking the Limits!

Have you ever heard about Cliff Young? In case you haven’t, here is his story…

For those who think that running a marathon is not challenging enough, there is the Sydney to Melbourne ultra-marathon: 875 kilometers across the hot Australian desert. This race is obviously only for the best of the top athletes and it usually takes them six to seven days to complete. Having gone through the pains of preparing and running a “normal” marathon myself I can only try to imagine what a race like that means.

Despite these apparent tortures, in 1983, a 61-year-old farmer decided to take part in this race, wearing an old army jacket and gumboots. Spectators were worried that this poor guy wouldn’t survive the first hours of the race but when he left the stadium they gave him a big cheer anyway.

The race started and as expected, Cliff lagged way behind. After 20 hours, the first runners arrived at the camp, where they took a bite, got a massage and slept for about two hours (they typically stopped for a four hour break for every 20 hours of running). When Cliff arrived at the camp three hours later, he didn’t stop — he just continued to run. When a reporter asked him whether he didn’t want to have a break like the other athletes, Cliff responded that he was here to run and not to sleep.

He ran and ran. On the third day he decided to “sleep” for 25 minutes. Believe it or not: he arrived at the finishing line in Melbourne almost two days ahead of number two.

Amazing story, isn’t it? There are various slightly different versions of this story out there, but what they all have in common is that they stop here. What usually isn’t told is that the next year, 17 out of the 18 runners beat Cliff’s record: all of a sudden, they had realized that it is possible to run the race without sleeping at all.

Cliff Young won the race because he challenged existing limits. Not really hard limits, like laws of nature, but arbitrary limits.

So the question is this: is something truly impossible or do we just think it is impossible because we haven’t tried before?

Regular Expressions — Sweetest Poison

It’s amazing how much time you can save by using regular expressions; it’s even more amazing how much time you can spend getting them to work correctly.

Because they are so powerful and easy to use, regexps can easily be misused, for instance by applying them to problems that are not “regular”, that is, where balancing is important:

    if (a > b) {
        ...
        if (x > y) {
            ...
        }
    }

Parsing problems like this are not suited for a regular expression matcher, as you need to retain state information and regexps simply cannot keep track of which blocks or braces are open or closed. In cases like this, what you really need is a parser. Period.

Alas, often people can’t be bothered writing a true parser, even if lex/yacc-like tools greatly simplify the work. And I’m guilty of this myself. Years ago I wrote a profiling tool for embedded systems. Since the embedded C code that I wanted to profile had to be instrumented (each function required enter/exit logging calls to get out the execution timing data) I needed to write a tool to do the job. I was not particularly interested in this job — hacking the actual performance analysis code was much more fun — so I decided, well, to go for a heuristic “parser” based on regexps.

In less than one hour I had cobbled together a little script that seemed to work fine. Over the next couple of months I had to spend endless hours fixing all the nasty corner cases; even today it doesn’t work in all circumstances! But I’ve learned my lesson: don’t use regexps when you need a true parser. Again, period.

But even if the problem is regular, people often define regexps sloppily. Look at the following example that checks if a .cfg file appears anywhere in a given string:

    while (<>) {
        print "Found: $&\n"
            if /\w:\\(\w+\\)*\w+\.cfg/;
    }

So let’s see what we’ve got here. We are obviously looking for a Windows-style absolute path: a single drive letter, followed by a colon and a backslash, followed by n optional directories (each of which followed by a backslash), followed by a mandatory filename that has a .cfg extension. Looks really neat…

These are the regexps people love to write and I don’t know how many times I’ve had to fix one because of this pathological “simplicity”. It might work today, but it is far from future-proof. Sooner or later the surrounding context will change and this regexp will match much more (or much less) than was intended.

Here are some of the major shortcomings:

- Using word characters \w is way too restrictive. According to the Windows long filename specification, a filename may contain any UTF-16 character, but for all practical purposes \w is really only a shortcut for [a-zA-Z0-9_]. If a filename contains a blank or umlaut, the expression won’t match anymore.

- Actually a corollary of the previous item: you cannot have relative parts within an absolute path, e.g. C:\files\services\base\..\items\main.cfg would not match because the \w character class does not allow for dots.

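- The regexp is not aligned on a word boundary, which means that names that merely start like a .cfg file, such as C:\config\user.cfgbak, will match, too. (Editor backup files like C:\config\user.cfg~ are harder to exclude: since ~ is not a word character, a word boundary follows the “g” anyway.)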

Often — but not always — using regexps means striking a careful balance between accuracy and convenience. It makes little sense to implement the complete Windows filename spec in a regexp. But investing a little energy to tighten them up usually pays off in spades. How about this?

    while (<>) {
        print "Found: $&\n"
            if /\b[a-zA-Z]:\\([^\\]+\\)*[^\\]+\.cfg\b/;
    }

At the cost of being only slightly more difficult to read, this solution is much more resilient to change due to the use of some good practices. First of all, it is easy to define a set of valid drive letters, so I used [a-zA-Z] instead of \w; second, the whole regexp is aligned on word boundaries, which means no more regexp over-/underruns; and third, by stating that everything between separators (backslashes, in this case) is a series of non-separators, we won’t run into “strange character” problems.
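To compare the two expressions side by side, here is a minimal sketch in Java, whose java.util.regex syntax behaves like Perl’s for the constructs used here. The class name CfgPathDemo, the helper firstMatch and the sample paths are made up for illustration; they are not part of the original script.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class CfgPathDemo {
    // The sloppy original: only word characters between separators.
    public static final Pattern LOOSE =
        Pattern.compile("\\w:\\\\(\\w+\\\\)*\\w+\\.cfg");

    // The tightened version: drive letter, non-separators between
    // backslashes, and word boundaries at both ends.
    public static final Pattern TIGHT =
        Pattern.compile("\\b[a-zA-Z]:\\\\([^\\\\]+\\\\)*[^\\\\]+\\.cfg\\b");

    // Returns the first match of pattern in text, or null if none.
    public static String firstMatch(Pattern pattern, String text) {
        Matcher m = pattern.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String[] samples = {
            "C:\\config\\user.cfg",                 // plain path
            "C:\\my files\\main.cfg",               // blank in a directory name
            "C:\\files\\base\\..\\items\\main.cfg", // relative part
            "C:\\config\\user.cfgbak",              // longer extension
        };
        for (String s : samples) {
            System.out.println(s
                + " -> loose: " + firstMatch(LOOSE, s)
                + ", tight: " + firstMatch(TIGHT, s));
        }
    }
}
```

With these samples, the loose pattern finds nothing in the second and third path and reports C:\config\user.cfg for the fourth, whereas the tight pattern matches the first three paths completely and rejects the fourth.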

Next time you write a regexp think this: “I know that by using regexps I’m saving hours of development time, so I can afford to spend another 10 minutes to make them more robust”.

The Programmer Who Wrote Golden Code

Once upon a time, in the land of Foo, there was a programmer who wrote awesome code: free of defects, easy to read and maintain and yet efficient beyond compare. Everyone admired him and his works.

The word spread and the emperor was excited when he heard about the skills of the programmer. The emperor ordered him to come to his palace to work for him as his chief court programmer. The programmer obeyed and continued to write outstanding software — much to the delight of the emperor.

After some months, however, the emperor wanted to find out what exactly made the programmer so great and whether it was possible to instill his spirit into the other court programmers. So he observed the programmer and collected countless metrics. He also had the programmer attend many meetings in order to define rules and procedures for all court programmers to follow.

One day, the emperor noticed with horror that the programmer didn’t write golden code anymore; and neither did the other court programmers.

(with apologies to Aesop.)

The Safety Net That Wasn’t

The other day, I wasted time debugging some Java code. When I say “wasted” I do not complain about debugging per se — debugging is part of my life as a developer. Time was wasted because debugging should not have been necessary in this case. Let me explain…

It just so happened that I called a method but violated a constraint on a parameter. Within the called method, the constraint was properly enforced via the use of an assertion, just like in this example:

    public int invertValues(int rows, int[] values) {
        assert rows <= MAX_ROWS;
        ...
    }

Normally, my violation of the method’s contract would have been immediately reported and I wouldn’t have had to debug this bug. Normally, yes, but not in this case, as I forgot to run my program with assertions enabled. So instead of

    java -ea MyProgram

I wrote what I had written thousands of times before:

    java MyProgram

Silly me, silly me, silly me! That’s what I thought initially. But then I was reminded of the words of Donald A. Norman. In his best-selling book “The Design of Everyday Things” he observes that users frequently — and falsely — blame themselves when they make a mistake, when in fact it is the failure of the designer to prevent such mistakes in the first place. Is it possible that Java’s assertion facility is ill-designed? After having thought about it for some time, I’m convinced it is.

Assertions first appeared in the C programming language and they came with two promises: first, assertions are enabled by default (that is, until you explicitly define NDEBUG) and second, they don’t incur any inefficiencies once turned off. These two properties are essential and Java’s implementation misses both of them.

The violation of the first principle means that you cannot trust your assertion safety net: It is just too easy for you, your teammates or your users to forget the ‘-ea’ command-line switch. If you don’t trust a feature, you don’t want to use it. What use is an anti-lock brake system that you have to enable manually every time you start your car?
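One defensive measure is to let the program itself check at startup whether assertions are actually on. The following is a minimal sketch of a commonly used idiom, not an official facility; the class and method names (AssertionGuard, assertionsEnabled, requireAssertions) are hypothetical.

```java
public final class AssertionGuard {
    // Returns true only if this class runs with assertions enabled.
    public static boolean assertionsEnabled() {
        boolean enabled = false;
        // Deliberate side effect: the assignment is executed only
        // when the JVM evaluates assert statements (e.g. with -ea).
        assert enabled = true;
        return enabled;
    }

    // Hypothetical helper: call it first thing in main() of a
    // development build to fail fast when -ea was forgotten.
    public static void requireAssertions() {
        if (!assertionsEnabled()) {
            throw new IllegalStateException(
                "Assertions are disabled; restart with 'java -ea'");
        }
    }

    public static void main(String[] args) {
        System.out.println("assertions enabled: " + assertionsEnabled());
    }
}
```

Since ‘-ea’ can be applied per class or per package, this check only reports the status of the AssertionGuard class itself; treat it as a hint rather than a guarantee.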

Efficiency has always been a major concern to developers. If you execute your Java code with assertions disabled (which is, as we know, unfortunately the default) you will most likely not notice any speed penalty. What you will notice, however, is the additional footprint for your assertions that will always travel with your Java program. There is no way to compile assertions out. Take a look at this C example:

    int binary_search(const int* values, size_t values_len, int search_value)
    {
    #ifndef NDEBUG
        // Ensure that given values are sorted.
        size_t i;
        for (i = 1; i < values_len; ++i) {
            assert(values[i] >= values[i - 1]);
        }
    #endif
        …
        // Actual implementation of binary search.
        …
    }

A prerequisite of any binary search implementation is that the input values are sorted, so why not assert it? Since we need to iterate over all elements, a simple assert expression is not sufficient. Contrary to Java, this is not a problem in C and C++: the code for the assert as well as the for-loop will be removed from the release build, thanks to the pre-processor.
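For comparison, here is how the same precondition might be expressed in Java. This is only a sketch with hypothetical names (SortedSearch, isSorted): the loop has to move into a helper method so that the check fits into a single assert expression. With assertions disabled the helper is never called, but unlike the C loop it is still compiled into the class file.

```java
public final class SortedSearch {
    // Debug-only helper: true if values are in ascending order.
    private static boolean isSorted(int[] values) {
        for (int i = 1; i < values.length; ++i) {
            if (values[i] < values[i - 1]) {
                return false;
            }
        }
        return true;
    }

    // Returns the index of searchValue in values, or -1 if absent.
    public static int binarySearch(int[] values, int searchValue) {
        // Ensure that given values are sorted (evaluated only with -ea).
        assert isSorted(values);

        int lo = 0;
        int hi = values.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2; // avoids int overflow
            if (values[mid] < searchValue) {
                lo = mid + 1;
            } else if (values[mid] > searchValue) {
                hi = mid - 1;
            } else {
                return mid;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] values = {1, 3, 5, 7};
        System.out.println(binarySearch(values, 5));
    }
}
```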

While assertions — especially non-trivial assertions that require supporting debug code — already waste memory, you can do worse if you use the kind of assertion that allows you to specify a string to be displayed when an assertion fails:

    ...
    assert factory != null : "Factory must exist at this point!";
    ...

This string is of little use. If a programmer ever sees it, (s)he will have to look at the surrounding code anyway (as provided by the filename, line number pairs in the stack trace), since it is unlikely that such an assertion message can provide enough context. But, hey, I wouldn’t really mind the string if it came at no cost, but in my view, wasting dozens of bytes in addition for the string is not justified. I prefer the traditional approach, that is, an explanation in the form of a comment:

    ...
    // Ensure that a factory exists.
    // If this assertion fails, it is highly likely that the
    // initialization order in Startup.init() is messed-up.
    // Double-check that Factory.init() is called right
    // at the beginning.
    assert factory != null;
    ...

Assertions are like built-in self-tests and one of the cheapest and most effective bug-prevention tools available; this fact has been confirmed once again in a recently published study by Microsoft Research. If developers cannot rely on them (because someone forgot to pass ‘-ea’ or inadvertently swallowed the assertion by catching ‘Throwable’ or ‘Error’ in surrounding code) or always have to worry about assertion code-bloat, they won’t use them. This is the true waste of Java assertions.

Personal Scrum

Even though I’ve never participated in a Scrum project, I’m a big Scrum fan. I’m convinced that a feedback-enabled, quantitative project management approach, one which puts the customer in the driver’s seat, is key to avoiding delays and frustration.

Especially the concept of time-boxing is very powerful: the Scrum team sets their own goals that they want to achieve within a given period of time. In Scrum, this period of time — or iteration — is called “sprint” and usually lasts two to four weeks. Because the sprint deadline is in the not-so-distant future, developers stay on track and the likelihood of procrastination and gold-plating is fairly low.

But there is even more time-boxing in Scrum: Every day at the “Daily Scrum Meeting” the team comes together and everyone tells what they have achieved and what they want to achieve until the next daily scrum. In practice, that’s another 24 hours (or 8 work-hours) time-box.

Still, getting things done is not easy. If you are like me you are distracted dozens of times every day. While hacking away, you are suddenly reminded of something else. Maybe it’s a phone call that you have to make. Or you want to check out the latest news on “Slashdot”. Maybe a colleague pops by to tell you about the weird compiler bug he just discovered in the GNU C++ compiler…

If you give in to these interruptions, you won’t get much done in a day. You won’t get into what psychologists call “flow”: a highly productive state where you are totally immersed in your work.

Is there a way to combat such distractions? There is, but let me first tell you what doesn’t work: quiet hours. Quiet hours are team-agreed fixed periods of time during which you are not interruptible, say, from 9.00 to 11.00 in the morning and from 14.00 to 16.00 in the afternoon. Every team member is expected to respect these hours. Sounds like a nice idea, but it fails miserably in practice. Especially in large projects, people depend on each other and productivity drops if developers are blocked because they cannot ask for help for two hours. Every team I belonged to that tried quiet hours abandoned them shortly after introducing them.

The solution is to make the period of highly focused work much shorter, say 25 minutes. If interruptions occur, you make a note of them in your backlog and carry on with your task. When the time expires, you take a quick break (usually 5 minutes), check your backlog and decide what to do next: either continue with your original task or handle one of your queued interrupts. In any case, you start another period of highly efficient 25 minutes and after 4 such iterations, you take a bigger break (15 - 30 minutes). That’s the Pomodoro technique in a nutshell.

Pomodoro (Italian for tomato) was invented by Francesco Cirillo, a student who had problems focusing on his studies. He wanted to find a method that allowed him to study effectively — even if only for 10 minutes — without distractions. He used a mechanical kitchen timer in the shape of a tomato to keep track of time, and hence named his technique after his kitchen timer. He experimented with different durations, but finally came to the conclusion that iterations of 25 minutes (so-called “Pomodoros”) work best.

I like to think of the Pomodoro technique as “Personal Scrum”. To me, a 25 minute time-box is just perfect. It’s enough time to get something done, yet short enough to ensure that important issues that crop up are not delayed for too long. In his freely available book, Francesco writes that while there are software Pomodoro timers available, a mechanical kitchen timer usually works best, and I definitely agree. The act of manually winding up the timer is a gesture of committing to a task and the ticking sound helps you stay focused, since you are constantly reminded of time. However, mechanical timers are a clear no-no if you share your office with others: the ticking and especially the ringing sound would be too annoying.

When I’m all by myself, I prefer a mechanical kitchen timer, but if I share a room with someone else, I prefer something softer. I’ve asked the folks at AudioSparx to implement a Pomodoro kitchen timer MP3 for me: 25 minutes of ticking, followed by a 10-second gentle ring (yes, you can download it; it’s USD 7.95 and no, I don’t get commission). I listen to it on my PC’s MP3 player wearing headphones, which has two additional benefits: first, headphones shut off office noise and second, they signal to others that I wish to be left alone, so they only interrupt me if it is really, really urgent.

“I have a deadline. I’m glad. I think that will help me get it done.”
–Michael Chabon

Get into ‘Insert’ Mode

Here I am, trying to write something. I’m sitting at my desk, staring at my screen and it looks like this:

[Image: empty screen]

It is empty. I just have no clue how to even start.

Are you familiar with such situations? Among writers, this is a well-known phenomenon and it’s called “writer’s block”. But similar things happen in all creative fields: sooner or later, people hit a massive roadblock and don’t know where to start. A painter sits in front of a blank canvas, an engineer in front of a blank piece of paper and a programmer in front of an empty editor buffer.

Is there any help? Sure. You can use a technique called “free writing”, which means you just write down whatever comes to your mind, regardless of how silly it looks. It’s important that you don’t judge what you write, you don’t pay attention to spelling or layout, your only job is to produce a constant stream of words, any words. This exercise will warm up your brain and hopefully remove the block. Applied to programming, you set up a project, you write a “main” routine (even if it only prints out “Hello, World, I don’t know how to implement this freaking application”) and a test driver that invokes it.

The next thing that you do is write a “shitty first draft”, as suggested by Anne Lamott. You probably know the old saying: the better is the enemy of the good. By looking for the perfect solution, we often end up achieving nothing because we cannot accept temporary uncertainty and ugliness. That’s really, really sad. Instead, write a first draft, even if it is a lousy one. Then, put it aside and let it mature, but make sure you revisit it regularly. You will be amazed at how new ideas and insights emerge. Experienced programmers are familiar with this idea, but they call it prototyping. They jot down code, they smear and sketch without paying attention to things like style and error-handling, often in a dynamic language like Perl or Python.

So if you have an idea that you think is worthwhile implementing, start it. Start somewhere — anywhere — even if the overall task seems huge. Get into ‘insert’ mode (if you are using the ‘vi’ editor, press the ‘I’ key). Remember the Chinese proverb: “The hardest part of a journey of a thousand miles is leaving your house”.

Greyface Management

“On arrival we will stay in dock for a seventy-two hour refit, and no one’s to leave the ship during that time. I repeat, all planet leave is cancelled. I’ve just had an unhappy love affair, so I don’t see why anybody else should have a good time. Message ends.”
(Prostetnic Vogon Jeltz, Hitchhiker’s Guide to the Galaxy)

Grey is not just a color; it’s an attitude. There is a management style that I refer to as “Greyface Management”. The term is loosely based on the “Curse of Greyface”, an important concept of Discordianism.

Greyface Management is characterized by a total absence of fun. Everything is prohibited: free speech, sarcasm and parties. And there is no praise for good work, either. Never. In fact, a Greyface Manager’s motto is: “Praise is the absence of punishment”. A Greyface Manager typically wears a grey suit (mentally, at least) and an annoyed look on his face — he is a humorless bureaucrat, akin to a member of the Vogon race.

The presence of Greyface Management is not just unpleasant — it is a sign of serious trouble. A manager who uses this kind of management style in a software shop openly confesses that he doesn’t have a clue about software development in general and “Peopleware” (that is, developers) in particular. Now, it is a well-known fact that most software managers can’t manage (a subject well-worth exploring; I will certainly revisit this topic in future posts) but many software managers are aware of their limitations and successfully use techniques such that productive work is still possible under their reign. A Greyface Manager, on the other hand, hasn’t reached that level of sophistication and uses the worst-possible approach: oppression.

Humor is very important for software developers, especially “creative” humor that requires “out-of-the-box” thinking; that’s the very reason why programmers usually love Monty Python and Dilbert. Sarcasm and inside jokes help keep the team knit together, so it’s not always a bad sign if developers make jokes about testers and sales people (and vice versa). And, dear Greyface Manager, what use are conforming “yes-sayers” that work to rule, anyway?