10 Rules For Working Successfully With Pull Requests

“Don’t expect to build up the weak by pulling down the strong.”
— Calvin Coolidge
Software development has always been a rich source of religious wars: emacs vs. vi, tabs vs. spaces, brace placement. Where do I stand regarding “pull requests” vs. “trunk-based” development? Even though it looks like the really hip developers opt for “trunk-based” these days, I’m clearly on the side of “pull requests”. But only if certain rules are followed.

Rule 1: Keep pull requests short

Don’t work on a feature branch for days. That’s against the idea of continuous integration. Not only do you risk integration problems, people will also have a hard time reviewing your large changeset. Instead, split up the pull request into several small(er) pull requests, for instance: preparatory refactoring, main implementation, clean-up, additional unit tests to improve code coverage. A pull request comprising 50 changed lines is easy to review; one comprising 1000 changed lines is hard. What is easy to do gets done, what is hard to do usually gets delayed.

Rule 2: Constantly collect ideas on your way

Resist the temptation to cram more into your pull request than necessary. If you discover a badly named class or a way-too-long function, don’t clean it up right now. Make a note of it, add a TODO, or, even better, create a technical debt ticket. This keeps your pull requests short (see rule 1) and provides you and your teammates with work while waiting for feedback (rule 3). When implementing, review existing code and always be on the lookout for things to improve. Collect ideas for hard times like a squirrel collects nuts.

Rule 3: Be patient while waiting for feedback

Most developers, after having published their pull request, only have one thing on their mind: integrating it ASAP. If they haven’t got any feedback or approval within two hours, they start hassling teammates, complaining that they are blocked, and doing all sorts of whining. Such developers need to realize that while their changes mean all the world to them, they most likely don’t mean much to others. What developers should really have on their mind after publishing a pull request is this: a) are there any open pull requests from others that I could review? b) are there any technical debt tickets that I could tackle? If developers follow rule 1 and rule 2, there’s little chance that they ever run out of work.

Rule 4: Be kind and grateful when giving and receiving feedback

Nobody likes being criticized but every criticism should be viewed as an opportunity to learn for all involved parties. When you give criticism, be constructive and avoid cynical and arrogant remarks like “Only rookies would use a for loop here”. If you get harsh and unfair criticism, don’t take it personally and don’t add to the fight. Calm is contagious.

Rule 5: The author has the final judgment

I once was part of a team where people had a pull request battle over what’s the best way to initialize a string constant in C++. “Proof of efficiency” was shown by contestants in the form of disassembly for various platforms and they all showed different results. This ridiculous battle went on for days. Let’s not forget this: everybody is welcome to criticize the author’s code, but in the end, it’s the author’s decision what to accept and what not. The pull request doesn’t have to satisfy everyone and a software project is not a democracy. A good way to end a fight is this: “Thanks for your feedback, but I’ve decided to keep the code as it is”.

Rule 6: Advertise your pull request

If you want to make sure that nobody looks at your pull request, use a meaningless title full of spelling mistakes like “REwork for prformace”. If not, craft the title carefully to express its purpose and importance. I’m sure that the same pull request with this title will be reviewed much sooner: “Fix: Use a faster scanning algorithm to prevent nightly build timeouts”. The same goes for the description of the pull request. Give a crisp executive summary of the motivation and what you are trying to achieve. Don’t give long sermons, or readers will respond with TL;DR.

Rule 7: Review your own pull request

Before publishing your pull request, carefully review it yourself. It’s amazing what you can detect yourself, especially silly goof-ups and typos. Make sure that the pull request is of the highest quality, to the best of your knowledge. Don’t waste your reviewers’ precious time or they will soon start wasting yours by not looking at your pull requests anymore.

Rule 8: Publish only when you’re done

There’s nothing more annoying for a reviewer than having to constantly re-review because the author keeps on pushing changes. If you find out that your code needs rework, set your pull request to draft state. Publish it again once you’re done.

Rule 9: Proactively explain unusual changes

Experienced developers know that code should be written such that it is self-explanatory and that comments should only be used to document surprises. Still, certain changes to obvious, non-commented code might be surprising to reviewers. In such cases, add an explanatory review comment yourself so that reviewers don’t have to ask you about it.

Rule 10: Spread the word

If teammates unknowingly violate these rules, educate them about these rules. If they don’t care or continue to deliberately violate these rules, let them wait next time they want you to review their pull request. Show, don’t tell.

Bug Hunting Adventures #16: Lame Surveillance

“Under observation, we act less free, which means we effectively are less free.”
― Edward Snowden

Imagine a distributed surveillance system where recorded video files are uploaded to a central server at regular intervals.

Due to limitations of the transport protocol, video files must be split up into chunks and no chunk may exceed 1 GB (10^9 bytes). On top of that, in high-load scenarios, the server might shorten a chunk even more, in which case instead of N bytes only K bytes are transmitted. Naturally, the N-K bytes that were not transmitted need to be sent with the next chunk upload.

Everything worked fine; all unit and system tests passed. Once deployed, however, sysadmins from the central server team started lamenting that the video files were arriving at a glacial pace. What’s wrong with this code?


// Establishes a connection to a video upload server.
// Returns a handle to the server connection or -1 if no connection
// could be established.
int connectToServer(const char* url, int port);

// Uploads at most 'length' bytes of 'data' to an upload server connection.
// Returns the actual number of bytes uploaded (<= length) or a value <= 0 if
// the connection is broken.
int uploadToServer(int serverHandle, const char* data, int length);

// Waits for and retrieves the next stored video.
// Returns a pointer to a contiguous video blob or NULL if there is temporarily
// no video. The size of the video blob in bytes is stored in the 'length' out
// parameter.
const char* getNextVideo(int* length);

static const char* const serverUrl = "http://video.upload.server.com";
static const int serverPort = 1234;
static const int maxChunkSize = 10^9;

...

// Open connection to server.
int serverHandle = connectToServer(serverUrl, serverPort);
if (serverHandle >= 0) {
    bool continueToUpload = true;
    while (continueToUpload) {
        // Get next video.
        int remainingVideoLength = 0;
        const char* p = getNextVideo(&remainingVideoLength);
        // If there is a video.
        if (p != NULL) {
            // Upload video in chunks.
            while (remainingVideoLength > 0) {
                int bytesToUpload = remainingVideoLength;
                if (bytesToUpload > maxChunkSize) {
                    bytesToUpload = maxChunkSize;
                }
                int uploadedBytes = uploadToServer(serverHandle, p, bytesToUpload);
                if (uploadedBytes > 0) {
                    p += uploadedBytes;
                    remainingVideoLength -= uploadedBytes;
                } else {
                    fprintf(stderr, "Uploading failed, aborting");
                    continueToUpload = false;
                    break;
                }
            }
        }
    }
} else {
    fprintf(stderr, "Failed to connect to %s:%d", serverUrl, serverPort);
}

Solution

Do They Treat You Like A Superuser?

“A good workman is known by his tools”
— proverb
The process of getting admin rights as a corporate software developer is definitely on a spectrum. Over the last 20+ years, I’ve written code for more than ten companies and boy, do their policies differ!

In one case, I had full admin rights from day one. In more typical cases, I had to start a workflow to request admin rights, which would arrive within hours to days. In one extreme case, I had to do an online training about the dangers of working with admin rights before I could start the workflow. After I passed the exam and once my request was approved (7 days later), I would be granted admin rights only for a limited amount of time (180 days at most). Even worse: the online training course would need to be taken again as well!

Let’s meditate a little bit on this latter case. To me, it’s an utter catastrophe. As software developers, we constantly need to maintain and tweak our PC, our beloved toolbox. We need to install or upgrade development tools, device drivers and the like, sometimes just for the purpose of experimentation and learning. What if I wanted to switch to a newer version of g++ one day only to find out that my ‘sudo’ rights had expired? Sure, I could start the workflow again, wait a couple of days for approval, but why? Such processes are nothing but a nuisance that breaks developers’ flow and inspiration while not adding any real security.

A software developer is not a regular user; a software developer is a superuser, literally. If a company has to have its software developers take training courses to ensure that they don’t work in a root shell all day, those developers should not have been hired in the first place. Doesn’t it border on insulting if you learn in such a training that you should not open email attachments from unknown senders, especially while being logged in as root? You don’t say!

If a company doesn’t give you unlimited superuser rights within a couple of hours, you’re definitely not treated like a superuser. You’re rather treated like a regular office worker who has no clue about how computers work, let alone computer security.

It’s not just about wasted time. It’s about lack of empowerment and trust. But it’s mainly about a missing software culture: are you viewed as precious human capital that develops top-notch software products which will make the company thrive, or are you rather viewed as a schmuck who poses a severe risk to the company?

A company with good software culture understands the chief need of creative makers, which is working on interesting projects in a frictionless, libertarian environment where they can spend most of their time doing what they love most: crafting exciting software.

Restricting software developers in terms of admin rights is just one problem of companies lacking good software culture, but it’s symptomatic. While such shops might manage to lure in great creators, they will certainly not be able to retain them in the long run.

People Patterns In Software Development: The Non-Painting Painter And The Non-Painter

“If you put off everything till you’re sure of it, you’ll never get anything done.”
— Norman Vincent Peale

I once read about an individual who wanted to become a painter. But instead of painting, he spent a lot of time and money on finding the right canvases, brushes, and paints. He read dozens of books on painting, went to museums, studied the great masters but never got around to painting a picture.

Inspired by this story, I coined the phrase “Non-Painting Painter” to label individuals who behave in such a manner.

It’s highly likely that you will encounter non-painting painters in your career, so it’s worthwhile to shed more light on their attributes and behaviors, because it’s easy to misdiagnose them. It’s especially common to confuse them with a more extreme form, which I call the “Non-Painter”. Non-painting painters and non-painters have a lot in common (after all, they don’t paint), but while there is hope of a cure for the former, all is lost for the latter.

APPEARANCE

Neither non-painting painters nor non-painters can be recognized by how they look or sound. You recognize them by their personality traits.

PERSONALITY TRAITS

Both non-painting painters and non-painters procrastinate; they are on the constant lookout for reasons to evade their primary task. This doesn’t imply that they are lazy, far from it! They spend a lot of effort on side issues. What they don’t do is get the task done that they are supposed to do.

To them, there is always more research to be done, more code samples to study, more design documents to write, more meetings to call. They cannot start to code unless the coding standards have been approved by QA and, to prevent this from ever happening, they permanently submit change requests. Then, some day, they hear about a new build system, programming language, or software tool that needs to be evaluated because it would “save us from so many problems in the future”.

Another characteristic shared by the non-painting painter and the non-painter is that they are usually smart people with above-average social skills. They have a lot to say and seem to be well versed in many contemporary technologies (at least superficially) as well as everyday topics. They are a naive hiring manager’s dream come true.

To qualify as a non-painting painter, it’s imperative that such an individual still possess the willingness and ability to actually perform the task they so eagerly avoid. The chief reason why they don’t perform is rooted in fear: fear of making mistakes, fear of looking stupid in front of coworkers. But contrary to the non-painting painter’s belief, it’s no shame to have little experience, as long as there’s a strong desire to learn and improve. In his book “The Passionate Programmer”, Chad Fowler goes as far as giving this advice: “Always be the worst guy in every band you’re in. – so you can learn. The people around you affect your performance. Choose your crowd wisely.”

But let’s now turn to the pathological, incurable sibling of the non-painting painter: the non-painter. Non-painters also don’t produce what they’re supposed to; they also waste time on unimportant, minute details and side issues. But contrary to the non-painting painter, they have neither the intention nor the ability to get the job done. If that weren’t bad enough, in almost all cases they go to great lengths to prevent teammates from getting their jobs done as well, thus multiplying their own unproductivity. Why? Non-painters are in constant fear that well-performing teammates will steal the limelight from them and expose them for what they are: freeloading charlatans.

RATING

According to the Q²S² framework, the rating of a non-painting painter and a non-painter is the same: 2/1/4/2; the Q²S² framework is unfortunately not able to discern them.

POLAR OPPOSITE

The Codenator

CONCLUSION

A non-painting painter is still a painter, albeit an inhibited one. My recommendation is to give non-painting people the benefit of the doubt and not label them as non-painters by default. If they don’t have the courage to speak about their fears, a senior developer should approach them and provide support and guidance. Just like Attaboys, non-painting painters are often diamonds in the rough. If they are open about their inexperience and strive hard to improve, they can turn into a valuable asset. However, if their productivity doesn’t improve, relabeling them as non-painters is more than deserved, and non-painters have no place in the team.

Why We Count From Zero

“A zero itself is nothing, but without a zero you cannot count anything; therefore, a zero is something, yet zero.”
— Dalai Lama

If you do a Google search for why programmers typically start counting from zero, you’ll likely find two reasons. Today, I’m going to add another one. But let’s start with the usual explanations.

DIJKSTRA’S HALF-OPEN RANGES

On August 11, 1982, Edsger Dijkstra wrote a short paper on why numbering should start at zero. He first demonstrates that half-open ranges with an excluded upper bound are superior to other alternatives:


lower <= i < upper

Here’s a quick summary of his reasoning, in case you don’t want to read it yourself. The main advantages are that a) you can easily represent empty ranges (i.e. lower equals upper) and b) you can compute the number of elements in a range by subtracting the lower bound from the upper bound.
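To make these two advantages concrete, here is a minimal sketch in C. The helper names range_length and range_is_empty are made up for this illustration; they don’t come from Dijkstra’s note.

```c
#include <assert.h>
#include <stddef.h>

/* Number of elements in the half-open range lower <= i < upper.
   (Hypothetical helper, for illustration only.) */
size_t range_length(size_t lower, size_t upper) {
    assert(lower <= upper);  /* a well-formed range never has upper < lower */
    return upper - lower;    /* no "+ 1" correction needed */
}

/* A half-open range is empty exactly when both bounds coincide. */
int range_is_empty(size_t lower, size_t upper) {
    return range_length(lower, upper) == 0;
}
```

With this convention, range_length(3, 7) is 4 (the indices 3, 4, 5, 6), and a range such as 5 <= i < 5 is empty without needing an upper bound that is smaller than the lower bound.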

Based on ranges where the upper bound is excluded, Dijkstra goes on to show that for sequences of N elements, there are only two ways of indexing:


1) 1 <= i < N + 1
2) 0 <= i < N

Obviously, the latter is much more elegant and hence we see code like this everywhere:


for (int i = 0; i < N; ++i) { // upper bound N excluded, N-0 == N iterations.
    ...
}

BASE-RELATIVE ADDRESSING

Sequences of homogeneous data in contiguous memory are almost universally laid out like this:


+------+------+------+------+-----+------+
| elem | elem | elem | elem | ... | elem |
+------+------+------+------+-----+------+
^
|base address ---------> higher addresses|

A sequence starts with the first element at some base address, the second follows sizeof(elem) bytes after the first element and so on. Computing the start address of the n-th element can be done using this formula:


elemAddr = baseAddr + (n * sizeof(elem))

However, this formula only applies if you index your elements from 0 to N – 1. If instead you chose to number indices from 1 to N, the formula would need to be adapted:


elemAddr = baseAddr + ((n - 1) * sizeof(elem))

This alternative is not only less pleasant to look at, but because of the additional subtraction also harder for the CPU to compute. Consequently, the fathers of C employed the zero-based array access syntax that we are all so familiar with:


int x = elem[i];

which is really just a shorthand notation for


int x = *(elem + i);

If i is 0, you get the first element; if i is positive, the i-th successor of elem; and if i is negative, the |i|-th predecessor of elem (which is only valid if elem points into the middle of an array). The latter fact often surprises developers because they assume either that negative offsets are illegal in the first place or that they yield elements from the end of the array, like in Python.
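The address formula and the negative-offset case can be checked with a small sketch. The function names element_address and element_at are hypothetical, and the element type is assumed to be int for the sake of the example.

```c
#include <assert.h>
#include <stddef.h>

/* Start address of the element with zero-based index n, computed by hand:
   baseAddr + (n * sizeof(elem)). (Hypothetical helper, elements are ints.) */
const int *element_address(const int *base, size_t n) {
    return (const int *)((const char *)base + n * sizeof(int));
}

/* Reads the element 'offset' positions away from 'p'. The offset may be
   negative as long as the result still lies inside the same array. */
int element_at(const int *p, ptrdiff_t offset) {
    return *(p + offset);  /* identical to p[offset] */
}
```

Given an array int values[] = {10, 20, 30, 40, 50}, element_address(values, 3) compares equal to &values[3]. With a pointer to the middle element, &values[2], the call element_at(middle, -2) yields the first element (10) and element_at(middle, 2) the last one (50).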

Incidentally, there’s another secret to C array indexing: since the addition operation is commutative, you can equally well write


int x = i[elem];  // same as int x = *(i + elem)

As obvious as this is in hindsight, it’s not well known amongst C programmers and a good opportunity to show off at parties. However, I strongly advise against writing code like this for production use.

MODULAR ARITHMETIC

As you know, applying the mod operator like this


n mod N

yields values ranging from 0 to N – 1 (for non-negative n). This dovetails nicely with zero-based indices into sequences.

Take hash maps for example. To determine the position of an element in a hash map containing N slots, just apply a hash function and take the result modulo N to get the index. That’s it!
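As a rough sketch of this idea in C: the 32-bit FNV-1a hash below is merely an assumption chosen for brevity, any hash function would do, and the names fnv1a_hash and slot_index are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a hash of a NUL-terminated string. Chosen only because it is
   short; the text above does not prescribe any particular hash function. */
uint32_t fnv1a_hash(const char *key) {
    uint32_t hash = 2166136261u;      /* FNV offset basis */
    for (; *key != '\0'; ++key) {
        hash ^= (unsigned char)*key;
        hash *= 16777619u;            /* FNV prime */
    }
    return hash;
}

/* Maps a key to a slot index in the range 0 .. slot_count - 1. */
size_t slot_index(const char *key, size_t slot_count) {
    assert(slot_count > 0);
    return fnv1a_hash(key) % slot_count;
}
```

Because the remainder is always between 0 and slot_count – 1, the result can be used directly as an index into a zero-based slot array. With one-based slots, an extra “+ 1” would be needed on every lookup.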

Another use case is the ring buffer, one of my favorite containers: to advance an index into a ring buffer, just add the desired offset and apply the mod operator to get wrap-around — no need for extra if/else logic. Again, starting indices at 1 instead of 0 would entail extra additions and subtractions.
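The difference shows up clearly when both variants are written side by side. The following is a minimal sketch; the function names are hypothetical, and it assumes that index + offset does not overflow size_t.

```c
#include <assert.h>
#include <stddef.h>

/* Advances a zero-based ring buffer index by 'offset' slots, wrapping around
   at 'capacity'. Assumes index + offset does not overflow. */
size_t ring_advance(size_t index, size_t offset, size_t capacity) {
    assert(capacity > 0 && index < capacity);
    return (index + offset) % capacity;
}

/* The same operation for one-based indices (1 .. capacity): shift down to
   zero-based, wrap, and shift back up again. */
size_t ring_advance_one_based(size_t index, size_t offset, size_t capacity) {
    assert(capacity > 0 && index >= 1 && index <= capacity);
    return (index - 1 + offset) % capacity + 1;
}
```

For a buffer with 8 slots, ring_advance(6, 3, 8) wraps around to 1. The one-based counterpart, ring_advance_one_based(7, 3, 8), arrives at slot 2, but only by paying for an extra subtraction and addition on every step.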

There you have it — one more reason to start numbering from zero. (As if you still needed to be convinced…)

A Neverending Story

“Nothing is lost. Everything is transformed.”
― Michael Ende, The Neverending Story

I explained in this post that I don’t view technical debt as something that is bad per se. Rather, I believe that “good technical debt” — at times — should be employed for strategic reasons. As an example, when developing a feature, you might take shortcuts in order to unblock stalling teammates who need your changes in order to carry on with their own work.

Let me reiterate: to qualify as good technical debt, it must be

1. taken on consciously
2. managed
3. repaid timely

How can this be achieved in practice?

Once you’ve made a deliberate decision and considered the pros and cons, you implement your makeshift solution. But before you mark your task as done, you create another task which has the goal of removing the technical debt you just introduced. But where should you put this task?

Don’t hide it in the product backlog as it a) usually contains many issues already, b) is more concerned with externally observable features, and c) is actually owned by the product owner. If you did, technical debt issues would most likely be delayed (or rather ignored) in favor of “real” stories, which would grossly violate good technical debt requirement #3: “repaid timely”. Instead, put it under a story of the current sprint titled “Repay technical debt”.

An immediate advantage of this approach is that it makes your shortcuts visible to the whole team. Further, everybody, including you, can pick up this task and just start working on it. However, in typical cases, such clean-up tasks won’t be done in the current sprint. So what happens with the “repay” story and all attached tasks at the end of a sprint?

You move it to the next sprint, of course! Consider it a story that never ends, which nicely mirrors a well-known software engineering truth: the fight against software entropy goes on forever. Since the story doesn’t go away, it’s a great reminder for the whole team that repaying technical debt (or constantly improving internal software quality) is of super-high importance.

Having this story in place allows you to manage technical debt, as stipulated by good technical debt requirement #2. Are there too many technical debt tasks in this story? Maybe it’s time for a “technical debt sprint” where everyone focuses on getting the list shorter instead of adding new functionality. Are there no technical debt tasks at all? Maybe the team isn’t really tracking its technical debt. Another possibility is that the team doesn’t use “good technical debt” as a strategic tool and instead makes other people wait for its gold-plated implementation.

In my view, a neverending “Repay technical debt” story is a great tool. It’s a quality backlog maintained by developers which puts dirt right in front of everybody’s noses. I believe that this drastically mitigates the risk of creating a maintenance nightmare while still allowing for occasional shortcuts.

Approxion

Code – People – Everything