Debugging

Pretty much any nontrivial piece of software will have bugs during development. Fixing bugs is thus an unavoidable part of programming, and it’s important that all programmers have some skill at the task. I recently made a video series about developing some poker-related software. The focus was on the problem domain, but much of the audience was new to programming, and I didn’t talk too much about what to do when things don’t go perfectly, i.e. when there are bugs. So, this post is a quick intro to debugging methodology in general, but I have my poker audience in mind.

So again, for my poker students, if some of your code doesn’t work while you’re going through the series, and you have to fix it, please don’t think of that as an unfortunate detour. On the contrary, it’s an important and valuable part of the process. If you just copy down everything I write, and it all works perfectly, you haven’t gotten your money’s worth :). When you write your own code, there won’t be an answer key available, and you’ll have to figure out problems on your own.

Generally, the debugging process starts when I notice the code is doing something that’s not what I expect. This can happen naturally while using the software or it can be the result of explicitly testing the code. I usually start out by looking for low-hanging fruit. Maybe I can think of a couple of things that could conceivably cause the unexpected behavior, so I go check on those first. If I don’t have any luck, however, I start a more systematic approach…

The approach basically involves coming up with a series of assumptions about how the code should work, verifying those assumptions, and digging into them when they’re violated. We’ll find a series of violated assumptions that form a sort of trail leading us to the root of the problem. The trail begins with whatever unfulfilled expectation led us to start debugging in the first place. (“My code gives the wrong answer.”) How do we find the next step?

So, we have a function that’s producing a wrong answer, and that function has some inputs. Google image search provides this helpful illustration, where I guess “I” is “Input”:

[Image: functions have inputs and outputs]

As far as I can recall, for all functions we write in our video series, the output follows deterministically from the input, and the functions are otherwise “pure” in the sense described here. If we run such a function a bunch of times with the same input, we’ll get the same output every time. Things become a bit trickier with code that uses random numbers, or reaches out to the user or over the network for data, or things like that. In our case, however, if the output is wrong, then either

one of the inputs is wrong, or
the logic in the function itself is wrong

Test each of those cases, starting with the inputs. For each input to the function, figure out what you think it should look like, and then check what it actually is when you run the program. A debugger can help to inspect the values of variables when a function runs, but print statements are a simple way to accomplish this as well. If one of the inputs is bad, move to the place where that input is being generated, i.e., to the function that called this one. The caller is itself a function that’s producing an unexpected result, and that result is what’s causing the problem in the current one. Repeat the process there.
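Here’s a minimal sketch of that input check in Python, using print statements. Everything in it is hypothetical and made up for illustration (the function names `pot_odds`, `pot_after_bet`, and `facing_bet`, and the numbers); none of it comes from the video series, and the bug is planted on purpose.

```python
# Hypothetical example: these names are NOT from the video series.

def pot_odds(pot_size, call_amount):
    """Fraction of the final pot we have to put in to call."""
    # Print-statement debugging: show exactly what this function received.
    print(f"pot_odds inputs: pot_size={pot_size!r}, call_amount={call_amount!r}")
    return call_amount / (pot_size + call_amount)


def pot_after_bet(starting_pot, bet):
    """Pot size we are facing after an opponent bets into it."""
    # Planted bug: the bet is never added to the pot.
    return starting_pot


def facing_bet(starting_pot, bet):
    """The caller: it builds the inputs that pot_odds receives."""
    pot_size = pot_after_bet(starting_pot, bet)
    return pot_odds(pot_size, bet)


# Expectation: facing a bet of 50 into a pot of 100, pot_size should be 150
# and the answer should be 50 / 200 = 0.25.
result = facing_bet(100, 50)
print(f"result: {result}")
```

Running this shows `pot_size=100` where I expected 150, and the result comes out to about 0.33 rather than 0.25. The logic in `pot_odds` looks fine, so the bad input sends us up the trail to the caller, `facing_bet`, and from there to `pot_after_bet`, where we repeat the same check.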

On the other hand, if all the inputs are correct, then it’s the current function itself that is at fault. Work through it line by line, figuring out the values of all variables involved and verifying that they are as you expect. When one is not, that’s the smoking gun that points to our root problem. Verifying individual functions line by line can be tricky. It helps a lot to follow good programming practice and break complicated tasks into small, easily analyzable functions. Facility with a debugger is useful as well.
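As a rough sketch of the line-by-line check, here’s a made-up Python function with a planted bug. The name `is_flush` and the two-character card format (rank, then suit, like `"Ah"`) are assumptions for this example only, not something from the video series.

```python
# Hypothetical example: is_flush and the card format are assumptions.

def is_flush(cards):
    """Return True if every card shares one suit. Cards look like 'Ah' or 'Td'."""
    # Planted bug: index 0 is the rank; the suit is at index 1.
    suits = [card[0] for card in cards]
    print(f"suits = {suits}")        # I expect ['h', 'h', 'h', 'h', 'h']

    distinct = set(suits)
    print(f"distinct = {distinct}")  # I expect {'h'}

    return len(distinct) == 1


hand = ["Ah", "Kh", "9h", "5h", "2h"]  # the input is correct: five hearts
answer = is_flush(hand)
print(f"is_flush -> {answer}")         # I expect True
```

The input is right, but the very first print shows `suits = ['A', 'K', '9', '5', '2']`, which is the list of ranks. That’s the first variable that doesn’t match my expectation, so the line that builds `suits` is the smoking gun. If you’d rather step through with a debugger than add prints, Python’s built-in `breakpoint()` call placed at the top of the function drops you into one.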

So that’s it. If we follow these steps, and our code satisfies our original assumption that the functions are pure, this process should eventually lead to the root cause of the unexpected behavior.

Now, if you can’t figure out an issue, and you need to ask for help, the sort of information gleaned in this process is exactly the sort that somebody will need to help debug your problem. So (dear poker students), please provide all of the following info (in a nicely formatted way) if you need debugging help:

1. What function is giving a wrong answer, what is that answer, and what should the answer be?
2. For all inputs to the function:
   a. what should the input be?
   b. what is the input when you actually run the code?
3. If all the inputs are correct, then presumably the problem is in the current function. Please provide your code for that function.
4. If one of the inputs was not correct, then move to the function that produced that input and go back to step 1.

glgl!