Intended Use vs. Real Use

Often, things are invented to solve a particular problem, but then the invention is used for something completely different.

Take Post-it® Notes, for instance. In 1970, Spencer Silver at 3M research laboratories was looking for a very strong adhesive, but what he found was much weaker than what was already available at his company: It stuck to objects, but could easily be lifted off. Years later, a colleague of his, Arthur Fry, digged up Spencer’s weak adhesive — the rest is history.

Another example is the discovery of this blue little pill called Viagra®. Pfizer was looking for medications to treat heart diseases, but the desired effects of the drug were minimal. Instead, male subjects reported completely different effects — again, the rest is history.

In 1991, a team of developers at Sun were working on a new programming language called “Oak” — the goal was to create a language and execution platform for all kinds of embedded electronic devices. They changed the name to “Java” and it has become a big success: You can find it almost everywhere, except — big surprise — in embedded systems.

I would never have guessed how minute Java’s impact on embedded systems was until I read Michael Barr’s recent article, provokingly called “Real men program in C” where he presents survey result showing the usage statistics of various programming languages on embedded systems projects.

The 60-80% dominance of C didn’t surprise me — C is the lingua franca of systems programming: high-level enough to support most system-level programming abstractions, yet low-level enough to give you efficient access to hardware. If it is fine for the Linux kernel (which is around 10 million lines of uncommented source code, SLOC) it should be fine for your MP3 player as well.

Naturally, at least to me, C++ must be way behind C — Barr reports a 25% share. C++ is a powerful but difficult language. It is more or less built on top of C, so it is “backwards-efficient”. Alas, to master it, you need to read at least 10 books by Bjarne Stroustrup, Scott Myers, Herb Sutter et. al. and practice for five years — day and night. But the biggest problem with C++ is that it somehow encourages C++ experts to endlessly tinker with their code, using more and more advanced and difficult language features until nobody else understands the code anymore. (Even days after everything is already working they keep polishing — if people complain that they don’t understand their template meta-programming gibberisch, they turn away in disgust.)

But how come Java is only at 2%? Barr, who mentions Java only in his footnotes (maybe to stress the insignificance of Java even more) has this to say: “The use of Java has never been more than a blip in embedded software development, and peaked during the telecom bubble — in the same year as C++.”

Compared to C++, Java has even more weaknesses when it comes to embedded systems programming. First of all, there is no efficient access to hardware, so Java code is usually confined to upper layers of the system. Second, Java, being an interpreted language, cannot be as fast as compiled native code and JIT (just-in-time) compilation is only feasible on larger systems with enough memory and computational horsepower. As for footprint, it is often claimed that Java code is leaner than native code. Obviously, this is true, as the instruction set of the JVM is more “high-level” than the native instruction set of the target CPU. However, for small systems, the size of the VM and the Java runtime libraries have to be taken into account and this “overhead” will only amortize in larger systems. But two more properties of Java frequently annoy systems programmers: the fact that all memory allocation goes via the heap (i. e. you cannot efficiently pass objects via the stack) and the fact that the ‘byte’ data type is signed, which can be quite a nuisance if you want to work with unsigned 8-bit data (something that happens rather frequently in embedded systems). Finally, if C++ seduces programmers to over-engineer their code by using every obscure feature the language has to offer, Java seduces programmers to over-objectify their code — something that can lead to a lot of inefficiency by itself.

I don’t think that the embedded world is that black and white. I’m convinced that for small systems (up to 20 KSLOC) C is usually the best choice — maybe sprinkled with some assembly language in the device drivers and other performance-critical areas. Medium-sized systems can and large systems definitely will benefit from languages like C++ and Java, but only in upper layers like application/user interface frameworks and internal applications. Java clearly wins if external code (e. g. applets, plug-ins) will be installed after the system has been deployed. In such cases, Java has proven as a reliable, secure and portable framework for dynamically handling applications. For the rest, that is, the “core” or the “kernel” of a larger system, C is usually the best and most efficient choice.