« Posts under Bug Hunting Adventures

Bug Hunting Adventures #12: String Limits

“The limits of my language mean the limits of my world.”
— Ludwig Wittgenstein

The aim of the routine below (‘reduce_string’) is to limit a given ‘string’ to at most ‘max_len’ characters. If the length of ‘string’ exceeds ‘max_len’, characters are removed from around the middle and filled with an ‘ellipsis’ string. Here are some examples that demonstrate what ‘reduce_string’ is supposed to do:

But as always in this series, a bug slipped in. Can you find it?


Bug Hunting Adventures #11: Bad Weather

“It is only in sorrow bad weather masters us;
in joy we face the storm and defy it”
— Amelia Barr

Imagine a weather monitoring system where environmental data is collected by various sensors and distributed via messages to other components for further processing.

In the code below, produce_env_measurement() represents a task that constantly produces messages containing various environmental measurements while another task (represented by process_env_measurement()) consumes them. To ensure data integrity, a Fletcher-16 checksum is appended to every message, but the application nevertheless doesn’t work reliably. Where’s the bug?


Bug Hunting Adventures #10: For Whom The Bell Tolls

“Then later that night when the ship’s bell rang
Could it be the north wind they’d been feelin’?”

“The Wreck Of The Edmund Fitzgerald”
— Gordon Lightfoot

At my home, I’m using a Raspberry Pi as a watchdog (aptly named “Brutus”) for all kinds of tasks: burglar detection, network intrusion detection, and server monitoring, just to name a few. Still, most of the time, my watchdog hangs around, idling away. That’s not the way I like it, so I’m constantly on the lookout for new jobs that I can assign to Brutus, small or big.

My current plan is to create a little ship’s bell app that emits pleasing bell sounds every 30 minutes, just like it has been done traditionally on all ships since the 16th century: double-strikes for full hours and an additional single-strike for half an hour. But unlike civil clocks, ship’s bells don’t have dedicated indications for every one of the 12 (or rather 24) hours in a day; instead, bell patterns repeat every four hours:

Bell pattern Time (a.m. and p.m.)
1 12:30 4:00 8:00
2 1:00 5:00 9:00
2 1 1:30 5:30 9:30
2 2 2:00 6:00 10:00
2 2 1 2:30 6:30 10:30
2 2 2 3:00 7:00 11:00
2 2 2 1 3:30 7:30 11:30
2 2 2 2 4:00 8:00 12:00

In this table, a “2” denotes a double-strike whereas a “1” signifies a single-strike of the bell.

The code below is a first draft of my ship’s bell app. It is running as a thread, sleeping most of the time (so you can still call Brutus a lazy dog). When it wakes up, it checks the current local time and determines how many strikes are to be done (‘compute_strikes’). Afterwards, the thread puts itself to rest again. However, I didn’t want to wake it up every second to check the wall time — that would be too inefficient. Instead, I base the sleep time on the temporal distance between now and the next half hour (‘compute_sleep_time’) and sleep for half of this time before checking again.

Alas, my initial implementation comes with a bug and the bell doesn’t work as it is supposed to. Can you spot it? (The bug is in the algorithm — it has nothing to do with any Python language quirks, of course.)

Ship’s Bell app code at GitHub.

Bug Hunting Adventures #9: A Random Piece of PI

According to an old saying, there’s more than one way to skin a cat. There are at least as many ways to compute the value of π. One of them uses the Monte Carlo method to approximate π’s value and it is the subject of today’s Bug Hunting epsisode.

We start with a so-called unit circle, a circle with radius 1 whose center is positioned at the origin in the Cartesian coordinate system. Next, we put a square around the unit circle whose sides have length 2 (the diameter of the unit circle):

drawingThere are two areas (literally!) of interest in this picture: the circle area Ac and the square area As:

Ac = πr² = π
As = (2r)² = 4

The ratio Ac/As is π/4

Why is this ratio important? Because we can use it to calculate the value of π:

π = 4 Ac/As

Now let’s do some random sampling. We take N random points whose x and y values are both in range [-1; +1] and tally the number of points that fall within the square (Ns) and the number of points that fall within the circle (Nc). Given enough points, the ratio Nc/Ns is a very good approximation for Ac/As and we hence can compute:

π ≈ 4 Nc/Ns

The C code below attempts to calculate π in this manner, but sports a blunder. What is the bug? Bonus question for the mathematically inclined: without executing the code, what value does it really compute (instead of π)?


Bug Hunting Adventures #8: Just Like a Rubber Ball…

Two incidents gave rise to this Bug Hunting episode: first, I’m currently beefing up my burglar alarm system with a Raspberry Pi and second, I’ve just come across the old song from the ’60s “Rubber Ball” by Bobby Vee while randomly surfing YouTube (“Bouncy, bouncy, bouncy, bouncy”).

The alarm system extension I have in mind is this: I want to be able to configure temporary inhibition of motion detection via a simple push-button: if you press it once, you suppress motion detection by one hour, if you press it n times, you suppress it by n hours, respectively. If you press the button after you haven’t pressed it for five seconds, any suppression is undone. There are status LEDs that give visual feedback such that you know what you’ve configured.

Thanks to the fine RPi.GPIO library, accessing GPIO ports is usually a piece of Pi. However, there is a fundamental problem when reading digital signals from switches and buttons called “switch bounce”, a phenomenon that has driven many brave engineers nuts.

If you flip a switch, there never is a nice transition from zero to one (or one to zero, depending on how you flipped the switch). Instead, the signal bounces between zero and one, in a rather indeterminate fashion for many milliseconds. The bouncing characteristics depend on various factors and can even vary for the same type of switch.

The takeaway is this: with switches (and buttons, of course), expect the unexpected! You must use some sort of debouncing mechanism, implemented in either hardware or software. For some background information check out this excellent article by embedded systems luminary and die-hard veteran Jack Ganssle.

But back to my alarm system. I skimmed the RPi.GPIO documentation and was happy to read that since version 0.5.7, RPi.GPIO has support for debouncing of digital input signals, so there was no need to debounce the push-button myself. Exuberantly, I sat down and wrote this test code:

To avoid arbitrary, floating input voltages, I setup the port such that it is pulled down (to ground) by default. As a consequence, reading that port when the button is not pressed yields a clean zero.

I didn’t want to poll the button port, as this would burn valuable CPU cycles, so I registered an event handler with the RPi.GPIO library that would be called back once a rising edge had been detected. I added a ‘bouncetime’ parameter of 200 milliseconds, to avoid multiple calls to my ‘button_pressed’ handler, in full accordance with the RPi.GPIO documentation:

To debounce using software, add the ‘bouncetime=’ parameter to a function where you specify a callback function. Bouncetime should be specified in milliseconds. For example:

In other words, after a rising edge has been detected due to a button press, I won’t be bothered with spurious rising edges for 200 ms. (200 ms is plenty of time — no switch I’ve ever seen bounced for more than 100 ms. Yet, 200 ms is short enough to handle cases where the user presses the button in rapid succession.)

I tested my button for a minute and it worked like a charm; then, I varied my press/release scheme and suddenly some spurious “Button pressed!” messages appeared. After some head scratching, I discovered the problem — can you spot it, too? (Hint: it’s a logic error, not something related to misuse of Python or the RPi.GPIO library.)

Epilogue: Once the problem was understood, the bug was fixed easily; getting rid of this darn, “bouncy, bouncy” earworm was much harder, though.


Bug Hunting Adventures #7: Random Programming

If you want to draw m random samples out of a set of n elements, Knuth’s algorithm ‘S’ (ref. The Art of Computer Programming, Vol 2, 3.4.2) comes in handy. It starts out by attempting to select the first element with a probability of m/n; subsequent elements are selected with a probability that depends on whether a previous element was drawn or not.

Let’s do an example. To select 2 random samples out of a set of 5 (say, the digits ‘0’, ‘1’, ‘2’, ‘3’, and ‘4’), we select ‘0’ with a probability of 2/5. If ‘0’ is not chosen, we attempt to select ‘1’ with a probability of 2/4 (since there are only 4 candidates left); otherwise (if ‘0’ has been selected) with a probability of 1/4, since one attempt to draw has already been used up. It’s easy to show the corresponding element selection probabilities in a binary tree:

In this tree, the nodes represent the likelihood that the element in the column on the right will be picked. Transitions from a node to the right mean that the element was picked, while transistions to the left denote that the corresponding element was not picked. In the ‘selected’ case, the numerator is decreased by one; the denominator is decreased in either case.

Here is an implementation of Knuth’s algorithm in C:

For the sake of demonstration, assume that bigrand() is a function that returns a random number that is “big enough”, meaning a value that is greater or equal to 0 and has an upper bound of at least n – 1 (a larger upper bound does not harm). Spend some time to understand how the ‘if’ statement implements the selection with the correct probablity.

So far, so good. But how would this clever algorithm look in another language? Let’s port it to Ruby!

Not being a Ruby programmer (but being an experienced Online Programmer), I had to ask Auntie Google many questions in order to cobble together this code:

This Ruby code looks remarkably similar to the C version. It doesn’t contain any syntax errors, but nevertheless doesn’t output what it is supposed to. Can you see why?

For those folks who don’t want to waste their time on Ruby, here is the Python version, sporting the same behavior and thus the same blunder: