Monthly Archives: October 2013

Being Tolerant Towards NULLs


KING LEAR: “What can you say to draw a third [of the kingdom] more opulent than your sisters? Speak.”
CORDELIA: “Nothing, my lord.”
KING LEAR: “Nothing!”
CORDELIA: “Nothing.”
KING LEAR: “Nothing will come of nothing; speak again.”

Around the year 2000, when the mindless “outsourcing-to-India-will-solve-all-our-problems” hype was near its peak, I saw an interview with an Indian minister on TV. When the reporter asked why there were so many talented software engineers in India, the minister replied: “Well, the number zero was invented in India, and programming is all about ones and zeros…”.

Now, this could have been a good joke, but trust me: that man was dead serious about his statement. My first reaction was anger, my second compassion: this poor guy clearly didn’t know what he was talking about. At least on the matter of programming, he was the human equivalent of a null device.

Some months ago, I worked on a Java program. In one of the dialogs the user was asked to specify a path to a file in an edit field; the default value of this edit field came from a property file which I used to store previous path entries made by the user:
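A minimal sketch of the kind of code meant here. The `EditField` class is a hypothetical stand-in for the dialog's edit field, and the property key `filename` and the helper `restorePath` are assumptions made purely for illustration:

```java
import java.util.Objects;
import java.util.Properties;

// Hypothetical stand-in for a GUI edit field whose setter rejects null.
class EditField {
    private String text = "";

    void setText(String value) {
        // Mirrors an API that throws NullPointerException on null input.
        this.text = Objects.requireNonNull(value, "value");
    }

    String getText() {
        return text;
    }
}

class PathDialog {
    // Pre-fills the edit field only if a previous entry was stored.
    static void restorePath(Properties props, EditField field) {
        String filename = props.getProperty("filename"); // null if the key is absent
        if (filename != null) {
            field.setText(filename);
        }
    }
}
```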

Even though the code above can be shortened by using the ternary operator, I thought that this ‘null’ handling business unnecessarily cluttered up my code. All I wanted was just this: “if there is a value in the property file, use it; otherwise leave the edit field as it is (i.e. empty)”. I wanted to write something that was easy on the eye, like this:
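As a rough sketch of that wished-for one-liner: `StrictField` is a hypothetical edit field with a null-rejecting setter, and the property key `filename` is likewise an assumption for illustration.

```java
import java.util.Objects;
import java.util.Properties;

// Hypothetical edit field with a null-rejecting setter.
class StrictField {
    private String text = "";

    void setText(String value) {
        this.text = Objects.requireNonNull(value, "value");
    }

    String getText() {
        return text;
    }
}

class OneLiner {
    // No explicit null check at the call site: the result of
    // getProperty (possibly null) goes straight into the setter.
    static void restorePath(Properties props, StrictField field) {
        field.setText(props.getProperty("filename"));
    }
}
```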

But had I done so, I would have gotten a dreaded NullPointerException in cases where there was no filename property in the property file. Darn! I really wish that sometimes APIs could cope with NULL values; that is, silently ignore them.

Since the early days of databases we have used NULL values to signal the absence of a real value. In SQL, it is well defined how NULL values are interpreted in the context of SQL operators (for instance, if you perform a logical OR operation like TRUE OR NULL, you will get TRUE as a result). In contemporary programming languages, NULL values are often treated like orphans.
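To make the SQL rule concrete, here is a small sketch of three-valued OR in Java, with a `null` `Boolean` standing in for SQL's NULL. The class and method names are made up for this example:

```java
// Three-valued logic sketch: null plays the role of SQL's NULL (unknown).
class ThreeValued {
    static Boolean or(Boolean a, Boolean b) {
        if (Boolean.TRUE.equals(a) || Boolean.TRUE.equals(b)) {
            return Boolean.TRUE;   // TRUE OR anything is TRUE
        }
        if (a == null || b == null) {
            return null;           // FALSE OR NULL, NULL OR NULL: unknown
        }
        return Boolean.FALSE;      // FALSE OR FALSE
    }
}
```

The point is that every combination has a defined result, so the caller never needs a special case for the missing value.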

In C, for instance, using NULL values usually puts you in the realm of undefined behavior:

Depending on the platform you are on, this might either do nothing or lead to a core dump; you cannot tell for sure. Java is stricter in this respect: if you pass a NULL reference to the VM, it will complain by throwing a NullPointerException:
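One way to provoke this, chosen here purely for illustration, is `System.arraycopy` with a null destination. The wrapper class and method name are assumptions for the sketch:

```java
class ArrayCopyDemo {
    // Returns true if copying into a null destination throws NullPointerException.
    static boolean copyToNowhereThrows() {
        int[] src = {42};
        int[] dest = null;
        try {
            System.arraycopy(src, 0, dest, 0, src.length);
            return false; // not reached: arraycopy rejects a null dest
        } catch (NullPointerException e) {
            return true;
        }
    }
}
```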

But wouldn’t it be equally valid to expect that nothing happens in these cases? If you copy 42 to nowhere (or from nowhere), why shouldn’t this be valid? In this case, NULL would behave like a black hole: it sucks up everything and there is no way to get anything out of it.

In OOP, people often apply the Null Object Pattern to get no-op behavior. But wouldn’t it be much nicer if methods automatically did nothing when invoked on NULL references?
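For readers unfamiliar with the pattern, a minimal sketch follows. The `Logger`, `MemoryLogger`, `NullLogger` and `Worker` names are invented for this example:

```java
import java.util.ArrayList;
import java.util.List;

interface Logger {
    void log(String message);
}

// Real implementation: collects messages in memory.
class MemoryLogger implements Logger {
    final List<String> lines = new ArrayList<>();

    @Override
    public void log(String message) {
        lines.add(message);
    }
}

// Null object: same interface, every method is a no-op.
class NullLogger implements Logger {
    @Override
    public void log(String message) {
        // intentionally empty
    }
}

class Worker {
    private final Logger logger;

    Worker(Logger logger) {
        // A missing logger is replaced once, here, by the null object.
        this.logger = (logger != null) ? logger : new NullLogger();
    }

    int doWork() {
        logger.log("working"); // no null check needed at the call site
        return 42;
    }
}
```

The null check still exists, but it is written once in the constructor instead of at every call.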

Now that I’ve thought about it for a while, what I really want is that APIs and/or higher-level languages behave like /dev/null in Unix: if there is nothing, do nothing. I believe this is just another variant of the Rule of Silence: when a program has nothing surprising to say, it should say nothing.

Contrary to what most people believe, doing ‘nothing’ is often harder than one might expect. Here are some examples:

  • Setting/printing a NULL string: don’t output anything.
  • Writing to a file through a file pointer that is NULL: don’t write, don’t complain.
  • Writing NULL to a properly opened file: don’t write anything, leave file pointer unchanged.
  • Opening a file when filename is NULL: don’t open it, just return NULL.
  • Using memcpy where dest is NULL: don’t copy, just return NULL.
  • Using memcmp where either buffer argument is NULL: return -1 (or +1).
  • Reading 42 bytes from a socket where dest array is NULL: receive 42 bytes and discard them.
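Two of these rules might be sketched in Java as follows. The `Tolerant` class and its methods are hypothetical helpers, not part of the JDK, and the choice to return `dest` unchanged when only `src` is null is an assumption of this sketch:

```java
import java.io.PrintStream;

// Hypothetical null-tolerant helpers; none of these exist in the JDK.
class Tolerant {
    // Printing a null string, or printing through a null stream: do nothing.
    static void print(PrintStream out, String s) {
        if (out == null || s == null) {
            return;
        }
        out.print(s);
    }

    // Copying to or from nowhere: copy nothing and hand back dest
    // (which is null when dest itself is null).
    static byte[] copy(byte[] dest, byte[] src, int n) {
        if (dest == null || src == null) {
            return dest;
        }
        System.arraycopy(src, 0, dest, 0, n);
        return dest;
    }
}
```

By contrast, the standard `PrintStream.print(String)` writes the four characters `null` when handed a null string, which is neither silent nor an error.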

Even though I’m convinced that NULL-tolerant APIs would often simplify a programmer’s life, for a low-level programming language like C, performance is everything and hence being NULL-tolerant is probably not an option. But higher-level languages (and dynamic languages and high-level APIs implemented in lower-level languages) could certainly benefit.