Aaron N. Tubbs bio photo

Aaron N. Tubbs

Dragon chaser.

Twitter Facebook Google+ LinkedIn Github

Don’t Lose All Hope

The next question I cover is going to be a lot more fun, I promise. This one is a bit drab.

Introduction

Before I get started, it is my belief that every programmer looking for a job should be able to speak at least a minimal dialect of C. I’ve mentioned this before, but Joel’s got it right on this one. I don’t care if they’re making Web 2.0 applications with the newest fancy tools on the block, they should be comfortable with a few basics. By this, I mean they should be able to write real code that has a chance of compiling in a C compiler. The basic toolkit requires a cursory understanding of the following:

  • Pass by value semantics and implications
  • malloc and free
  • Character buffers
  • Pointer and basic pointer math
  • for and while loops
  • Awareness of character, signed, unsigned, and floating point types
  • Familiarity with boolean operations

Basic understanding of bitwise operations is a huge bonus, but not critical.

A quick pass through the first few chapters of K&R should be sufficient. I’m not asking for anything too deep here.

The Question

Perhaps the most tired, worn-out, and over-used technical interview question ever is “Reverse a string in C.” One of the benefits is that it’s trivial to explain. Send the candidate home if they don’t understand the request in 30 seconds. Let’s pick it to death, now.

There are a few reasonable questions:

  1. Should I do it in place?
  2. Should I return the reversed string from the function?
  3. How much extra space do I get?
  4. Isn’t there some clever XOR trick for this?
  5. Uhhhh, strrev?
  6. Uhhhh, std::string(str.rbegin(), str.rend())?
  7. Uhhhh, std::reverse(str.begin(), str.end())?

It’s up to the interviewer to make a decision about whether they want the candidate to deal with issues of memory allocation, or if they want them to try to deal with an arbitrary space trade-off. The question is generally asked in a way that directs the candidate towards doing the reversal in-place, since that leads to the oh-so-clever optimization that lets the “clever” interviewer feel smug about themselves, when they should be spending their time doing something useful. More on that later. Some other variations include suggestion recursion instead of iteration, but I’m getting off-track. I’m going to focus on the in-place implementation for the remainder of this post.

Providing a declaration for the function should generally avoid most of these problems. Something nice and tricky, like:

void strrev(char* c);

The “Suboptimal” Solution

Questions about using library functions can be deflected with a quick lie or whatever. To encourage an iterative approach, respond to questions about space usage as not caring at first, and then enforcing further space constraints down the road. Or do the management thing, “how much space do you think you need?”

Whatever, let’s start with the worst case. We take the novice interview candidate, and set them to this task, and don’t give them any sort of space limits, and they’re comfortable with malloc and free, and in our hypothetically perfect world, they write something like this (I’m being lazy and using C99 here; I don’t care if a candidate is sloppy and doesn’t know what they’re using. Or if they write C that would only compile in C++, that’s fine in my book too. The appropriate trivia is easy to teach, if it comes down to it):

void strrev(char* c) {
  int len = 0;
  for (; c[len]; ++len);
  char* t = malloc(len * sizeof(char));
  for (int i = 0; i < len; ++i) {
    t[i] = c[len - i - 1];
  }
  for (int i = 0; i < len; ++i) {
    c[i] = t[i];
  }
  free(t);
}

If the candidate wants to use strlen and strncpy or memcpy, that’s fine. I’m not entirely convinced that it’s in the spirit of the problem, but as long as the string reversal itself is hand-coded, it’s not that far off the mark.

An astute candidate will point out that, in the presence of strlen, there’s surely something to reverse the string too. If you want to burn another three minutes (I hope it doesn’t take longer than this), make them write strlen if they want to use it and start getting smart with you. This question really isn’t worth this much wasted time, honestly. There may be a discussion about maybe making this quadratic by not caching the strlen and calling it in the loop, but whatever.

Does the candidate check inputs? Should they? What do the C runtime functions do? What behavior makes sense? Is it different if we’re writing library code versus hacking together an application?

In all likelihood the original code will have something wrong. Most likely opportunities are:

  • Trying to allocate memory on the stack
  • Not freeing memory
  • Off-by-one-issues
  • Forgetting to add space for nulls, depending on implementation
  • Not writing loops that terminate normally

Some trivial mistakes that really don’t matter, but interviewers can be a dick about, if they want to waste time for no reason at all:

  • Incorrect order to str functions
  • Not checking return of malloc
  • Not using a specific dialect of C that the interviewer likes
  • Not commenting the code

End of the day, we’re using linear storage to reverse a string, which is poopy. The interviewer has a decision to make. Is it better to have a candidate write code with a suboptimal solution, or to tease out an optimal solution and then write code? Obviously, this is up to the arbitrary preferences of the interviewer. When I’m being interviewed, I’d rather develop a novel approach and then code it up, if I can, but if I get stuck, it’s fun to start coding and see what I’ve overlooked. I don’t really think there’s a right answer. There’s a lot of value in seeing a poor solution coded well. If a candidate is comfortable writing code, we still learn a lot. One can watch their habits, such as how they set up their code. Do they start thinking about free as soon as they malloc? Do they get nervous and triple-check their loops to catch off-by-one errors? Or, do they shit memory down the toilet and reverse a string using factorial time with respect to length of the string? Anything is possible here, to be fair.

Thoughts on the Suboptimal Solution

I guess my point is that this is a poopy strrev by most reasonable standards, but tells you a great deal about how comfortable a candidate is writing code. A big part of a lot of jobs is writing shitty cookie-cutter code. Some people like to write shitty cookie-cutter code and then use what they learned in the process to write a “good” version. Some people learn by thinking, some people learn by doing. What I want to hammer home here is that there’s no right answer. Interviewing is an art, it’s adaptive, it’s creative, and there are no simple easy answers. The candidate gets a gold star from me if they show up, write the above solution, but warn me ahead of time that “this isn’t optimal, but I want to start by writing something that works and see if I can improve it.” A well-executed simple approach gets more points in my book than a flawed and poorly-executed novel approach. Cleverness only gets you so far in a practical profession, and being too clever is dangerous. Yes. I’m foreshadowing what comes later. Be patient, even though you know what’s coming.

I’ll go on the record to say that I think one will learn more about somebody’s ability to write code while implementing this solution than any of the solutions that follow from this point. Chew on that one for a bit.

Use Constant Space!

The next step is to use a single spare character and reverse via character swapping. This isn’t a hard leap from the previous solution if the candidate gets the idea of swaps and understands that the problem is not particularly complicated. Here’s my first try; like my first, it’s not any sort of reference implementation, but hopefully gets the jobs done:

void strrev(char* c) {
  int len = strlen(c);
  char t;
  for (int i = 0; i < len / 2; ++i) {
    char t = c[i];
    c[i] = c[len - i - 1];
    c[len - i - 1] = t;
  }
}

The sorts of problems that come up with this solution are of the same variety as the previous approach, though there are fewer mistakes to be made, and we shouldn’t need to deal with memory anymore, which is sort of a bummer. There is a slight bonus that there are decent odds the candidate will inadvertently reverse the string twice by going over the wrong bounds (and thus undoing their work). This is neat, but not as neat as dealing with memory leaks and more opportunities to screw up bounds. Still, the “great” thing about this interview question is that the better the solution gets, the less you learn about the candidate’s ability to code. This doesn’t strike me as an optimal characteristic of an interview question.

Use no space!

There are two ways I know of to do this algorithm completely in place, there may be more. I first ran into this problem as a candidate when interviewing with Amazon.com in college. In a one-hour interview slot, I managed to derive both of these approaches. I’m not a clever C hacker now, but was a much less capable programmer back then. The key part of my statement is that we wasted one hour to derive these, which is a colossal amount of time to figure out if I can solve a pointless puzzle. I’m getting ahead of myself.

The first approach is to exploit the knowledge that we’re working with characters, and we can cheat a bit with the math, since we know the printable character range falls within certain numeric bounds. This sort of hacking around gets more complicated if we’re supporting something other than plain old ASCII. This is an interesting discussion in and of itself, but we’ll ignore that for now. We use the same approach with the swap as above, but instead of using a temporary, we implement a swap that works like this:

void swap(char* a, char* b) {
  *b += *a;
  *a -= *b;
  *b += *a;
  *a = -*a;
}

And thus we write our solution as:

void strrev(char* c) {
  int len = strlen(c);
  for (int i = 0; i < len >> 2; ++i) {
    swap(&c[i], &c[len - i - 1]);
  }
}

Ignore function calling overhead and the fact that we’re creating storage with our function parameters; we can write this in place, exploit a macro/inline, whatever. Not important. At all. If the candidate brings these things up, great, it’s a nice 15-second chat, and they get brownie points.

The second approach gets all cute, and use XOR, which is a clever and succinct:

void swap(char* a, char* b) {
  *a ^= *b, *b ^= *a, *a ^= *b;
}

Or:

void swap(char* a, char* b) {
  *a ^= *b;
  *b ^= *a;
  *a ^= *b;
}

Delicious, right?

No. I’m here to set the record straight. This is fucking dumb. Wasting an hour to tease out this solution (or half an hour with a good candidate, I imagine, unlike myself) is … wasting an hour. I didn’t get the job with Amazon, and it wasn’t because of my performance with string reversal, but rather with an abysmal fit interview with this guy from Xerox Parc. Well, that and an inability at the time to appreciate the merits of detecting aliasing on a spinning disc via gray codes, but that’s the topic of longer digression some other day.

Maybe this sort of pointless optimization isn’t always pointless. Maybe I’m missing the point. If you’re hiring somebody to decipher some hideously obfuscated shit code, it might help if stuff like this springs to mind. Or, if you’re looking for somebody to obsessively waste every last minute of their workday trying to squeeze memory or performance out of their program, maybe this makes sense, right?

No. It doesn’t make sense. People that are actually good at this sort of shit and that will apply it in real code (heaven forbid) already knew this trick before you asked the question. They’ve probably read Hacker’s Delight, and can explain why compiled integer division code with a constant uses multiplication and a magic number instead of a division operator. They know why I used a shift in the above example, except instead of prematurely optimizing and being cute, the point out that the compiler is already going to do this, and that I’m being stupid and cute. And that it’s not endearing or interesting.

Many of you will persist in believing that there’s a point in getting to this silly optimization. You are wrong. Pushing somebody to use the tools they have and satisfy some arbitrary goal that requires the production of obfuscated code is not going to find the people you want. At least not when their tools are the ones you’re trying to make them use in this problem. This isn’t realistic, and it isn’t practical, and it’s just pointless.

Let’s Get Back to the Point

Cutting to the chase a bit, everybody has seen this problem. Or they should have. It’s listed on any site out there that collects technical interview questions. The upside is that if the candidate struggles on this problem, then they are both unqualified and not particularly good at doing their independent study.

What I’m getting at here is that this isn’t a good positive screening question. It’s a good heuristic to use to reject a candidate quickly and off-hand, but it does little to assess somebody afterward, especially if pushed towards a more optimal solution. It would be somewhat more useful if it wasn’t such a common question, but it is what it is.

An especially nasty characteristic of the nature of this problem is that the most interesting solution to see in code will be implemented by those most incapable of writing the code, and the most interesting solution theoretically will result in entirely dull code. I describe interestingness here not as cleverness, but as useful for evaluating a programmer’s ability to write code. I don’t like the way this continuum works, because it pretty much guarantees is that any local maxima is a minima in the orthogonal plane.

Pros

  • Has the possibility of allowing a candidate to write code with memory allocation, loops, bounds issues, pointers, and all sorts of other delicious stuff
  • Addresses the possibility of making space/clarity trade-off
  • Has a nice hook for talking about character encodings
  • Shouldn’t take more than fifteen minutes, unless you’re dumb and try to get to the oh-so-clever solution

Cons

  • Everybody has seen this problem
  • Space/clarity trade-off is not nearly as interesting as space/time trade-off
  • There is nothing difficult about reversing a string, per se
  • “Better” solutions result in less interesting code being written

TL;DR

My advice: don’t ever use this problem.