Dear Compiler, What’s My Size?

Sometimes, when you work with classes or structs, your code is not only dependent on particular fields (attributes, member variables) but on all of them at the same time.

The copy constructor, assignment operator, and the equality operator are prominent examples: if you add a field to your class, these functions need to be maintained in parallel; that is, all of them potentially need to be updated to support this new field. And we all know that it is so easy to forget to update one of them…

I got bitten by this when I used a class that lacked an operator to stream out all of its members. Since I didn’t own the source code of the class (it was part of a library), I couldn’t modify it directly. (I actually could have modified it, but this would have been outright stupid). Instead, I decided to add the missing functionality in my own code. Here is a simplified example:

Here’s what I added in my code:

It wasn’t long until I got a new version of the library in which “Fruit” was extended by another attribute, “weight”. Since my code didn’t support this new attribute yet, it was broken. Even worse: by (unconsciously) ignoring the new attribute my code failed silently at run-time! Just in case you didn’t know: If there is one thing I absolutely can’t stand it’s if CODE FAILS SILENTLY AT RUN-TIME.

Something had to be done to prevent this from happening again. After a while I came up with the idea of checking the size of class Fruit at compile-time against a hard-coded value. Even though this wasn’t perfect (you can easily think of cases where you would get false positives) it gave me a good chance of catching similar modifications next time I rebuilt my own code:

(Note: static assertions are like plain assertions but checked by the compiler at compile-time, not at run-time. I’m using a C++ 2011 feature here, but you can implement static assertions yourself, if your compiler lacks support for them. Read more here.)

Now, if the compiler flagged a failed assertion I would review the public interface of class Fruit and look for any new attributes, update my code along with the reference value of the size and everything would be fine. But how would I get the reference size value (i. e. 24) in the first place?

Calculating the size of a class or struct by hand is not trivial. There can be hundreds of members, including union members, padding bytes, hidden pointers to vtables and the like. A viable solution is to add a temporary print statement that writes sizeof(Fruit) to stdout, but that would require linking and running the program. And what if you are working on an embedded system where there is no stdout or where it is impossible to inspect the value in a debugger?

No, these approaches are too slow and cumbersome. If the compiler knows the size of Fruit it should be able to tell me. But how?

This question haunted me for quite some time. I devised a cunning plan: why not add a statement containing a syntax error that somehow forced the compiler to spit out the size of my class. Among other things, I tried this:

This hack exploits the fact that in C/C++ it is illegal to declare an array of negative size. I had hoped to get a message like this:

Alas, all compilers that I checked-out just gave a worthless error message among these lines:

There was no clue as to what the actual size was. Too bad. The compiler knows the answer but it doesn’t want to tell me!

I was just about to give up when I suddenly had another idea: what the compiler would always tell me is the line on which an error occurred. Why not attempt to create many arrays, where the size is decremented, starting from sizeof(Fruit):

I would just include this header file and the compiler would compile line after line until it hits a syntax error due to a negative array size. When it does, it would report the corresponding line number which in turn corresponds to the size of Fruit. G-r-e-a-t!

Generating this include file (getsize.h) was straightforward: I cobbled together a simple Bash for-loop:

I used it like this:

Compiling this code yields the following error messages (the exact messages may vary, depending on the compiler that you use):

The first line that is reported as erroneous is the size of class Fruit, 24 in our example. Once you know the size you just need to update the reference value and comment out the two lines to have them ready when you need them again.

You might find all of this a bit hacky, but I happen to like it very much. It serves a good purpose and nicely solved my problem. On top of that, it gave me a really good time. What more can one expect?

[Update 2015-12-06: Bernd Hanebaum sent me a nice implementation of this idea that is based on template metaprogramming — many thanks, Bernd!]