WHAT
front page
entry archive
my flickr photos
professional home
my twitter
my wishlist
my tumblog
my facebook
WHO
2010-03-01 : programming
Don’t Lose All Hope
The next question I cover is going to be a lot more fun, I promise. This one is a bit drab.
Introduction
Before I get started, it is my belief that every programmer looking for a job should be able to speak at least a minimal dialect of C. I’ve mentioned this before, but Joel’s got it right on this one. I don’t care if they’re making Web 2.0 applications with the newest fancy tools on the block, they should be comfortable with a few basics. By this, I mean they should be able to write real code that has a chance of compiling in a C compiler. The basic toolkit requires a cursory understanding of the following:
malloc and freefor and while loopsBasic understanding of bitwise operations is a huge bonus, but not critical.
A quick pass through the first few chapters of K&R should be sufficient. I’m not asking for anything too deep here.
The Question
Perhaps the most tired, worn-out, and over-used technical interview question ever is “Reverse a string in C.” One of the benefits is that it’s trivial to explain. Send the candidate home if they don’t understand the request in 30 seconds. Let’s pick it to death, now.
There are a few reasonable questions:
strrev?std::string(str.rbegin(), str.rend())?std::reverse(str.begin(), str.end())?It’s up to the interviewer to make a decision about whether they want the candidate to deal with issues of memory allocation, or if they want them to try to deal with an arbitrary space trade-off. The question is generally asked in a way that directs the candidate towards doing the reversal in-place, since that leads to the oh-so-clever optimization that lets the “clever” interviewer feel smug about themselves, when they should be spending their time doing something useful. More on that later. Some other variations include suggestion recursion instead of iteration, but I’m getting off-track. I’m going to focus on the in-place implementation for the remainder of this post.
Providing a declaration for the function should generally avoid most of these problems. Something nice and tricky, like:
void strrev(char* c);
The “Suboptimal” Solution
Questions about using library functions can be deflected with a quick lie or whatever. To encourage an iterative approach, respond to questions about space usage as not caring at first, and then enforcing further space constraints down the road. Or do the management thing, “how much space do you think you need?”
Whatever, let’s start with the worst case. We take the novice interview candidate, and set them to this task, and don’t give them any sort of space limits, and they’re comfortable with malloc and free, and in our hypothetically perfect world, they write something like this (I’m being lazy and using C99 here; I don’t care if a candidate is sloppy and doesn’t know what they’re using. Or if they write C that would only compile in C++, that’s fine in my book too. The appropriate trivia is easy to teach, if it comes down to it):
void strrev(char* c) {
int len = 0;
for (; c[len]; ++len);
char* t = malloc(len * sizeof(char));
for (int i = 0; i < len; ++i) {
t[i] = c[len - i - 1];
}
for (int i = 0; i < len; ++i) {
c[i] = t[i];
}
free(t);
}
If the candidate wants to use strlen and strncpy or memcpy, that’s fine. I’m not entirely convinced that it’s in the spirit of the problem, but as long as the string reversal itself is hand-coded, it’s not that far off the mark.
An astute candidate will point out that, in the presence of strlen, there’s surely something to reverse the string too. If you want to burn another three minutes (I hope it doesn’t take longer than this), make them write strlen if they want to use it and start getting smart with you. This question really isn’t worth this much wasted time, honestly. There may be a discussion about maybe making this quadratic by not caching the strlen and calling it in the loop, but whatever.
Does the candidate check inputs? Should they? What do the C runtime functions do? What behavior makes sense? Is it different if we’re writing library code versus hacking together an application?
In all likelihood the original code will have something wrong. Most likely opportunities are:
Some trivial mistakes that really don’t matter, but interviewers can be a dick about, if they want to waste time for no reason at all:
End of the day, we’re using linear storage to reverse a string, which is poopy. The interviewer has a decision to make. Is it better to have a candidate write code with a suboptimal solution, or to tease out an optimal solution and then write code? Obviously, this is up to the arbitrary preferences of the interviewer. When I’m being interviewed, I’d rather develop a novel approach and then code it up, if I can, but if I get stuck, it’s fun to start coding and see what I’ve overlooked. I don’t really think there’s a right answer. There’s a lot of value in seeing a poor solution coded well. If a candidate is comfortable writing code, we still learn a lot. One can watch their habits, such as how they set up their code. Do they start thinking about free as soon as they malloc? Do they get nervous and triple-check their loops to catch off-by-one errors? Or, do they shit memory down the toilet and reverse a string using factorial time with respect to length of the string? Anything is possible here, to be fair.
Thoughts on the Suboptimal Solution
I guess my point is that this is a poopy strrev by most reasonable standards, but tells you a great deal about how comfortable a candidate is writing code. A big part of a lot of jobs is writing shitty cookie-cutter code. Some people like to write shitty cookie-cutter code and then use what they learned in the process to write a “good” version. Some people learn by thinking, some people learn by doing. What I want to hammer home here is that there’s no right answer. Interviewing is an art, it’s adaptive, it’s creative, and there are no simple easy answers. The candidate gets a gold star from me if they show up, write the above solution, but warn me ahead of time that “this isn’t optimal, but I want to start by writing something that works and see if I can improve it.” A well-executed simple approach gets more points in my book than a flawed and poorly-executed novel approach. Cleverness only gets you so far in a practical profession, and being too clever is dangerous. Yes. I’m foreshadowing what comes later. Be patient, even though you know what’s coming.
I’ll go on the record to say that I think one will learn more about somebody’s ability to write code while implementing this solution than any of the solutions that follow from this point. Chew on that one for a bit.
Use Constant Space!
The next step is to use a single spare character and reverse via character swapping. This isn’t a hard leap from the previous solution if the candidate gets the idea of swaps and understands that the problem is not particularly complicated. Here’s my first try; like my first, it’s not any sort of reference implementation, but hopefully gets the jobs done:
void strrev(char* c) {
int len = strlen(c);
char t;
for (int i = 0; i < len / 2; ++i) {
char t = c[i];
c[i] = c[len - i - 1];
c[len - i - 1] = t;
}
}
The sorts of problems that come up with this solution are of the same variety as the previous approach, though there are fewer mistakes to be made, and we shouldn’t need to deal with memory anymore, which is sort of a bummer. There is a slight bonus that there are decent odds the candidate will inadvertently reverse the string twice by going over the wrong bounds (and thus undoing their work). This is neat, but not as neat as dealing with memory leaks and more opportunities to screw up bounds. Still, the “great” thing about this interview question is that the better the solution gets, the less you learn about the candidate’s ability to code. This doesn’t strike me as an optimal characteristic of an interview question.
Use no space!
There are two ways I know of to do this algorithm completely in place, there may be more. I first ran into this problem as a candidate when interviewing with Amazon.com in college. In a one-hour interview slot, I managed to derive both of these approaches. I’m not a clever C hacker now, but was a much less capable programmer back then. The key part of my statement is that we wasted one hour to derive these, which is a colossal amount of time to figure out if I can solve a pointless puzzle. I’m getting ahead of myself.
The first approach is to exploit the knowledge that we’re working with characters, and we can cheat a bit with the math, since we know the printable character range falls within certain numeric bounds. This sort of hacking around gets more complicated if we’re supporting something other than plain old ASCII. This is an interesting discussion in and of itself, but we’ll ignore that for now. We use the same approach with the swap as above, but instead of using a temporary, we implement a swap that works like this:
void swap(char* a, char* b) {
*b += *a;
*a -= *b;
*b += *a;
*a = -*a;
}
And thus we write our solution as:
void strrev(char* c) {
int len = strlen(c);
for (int i = 0; i < len >> 2; ++i) {
swap(&c[i], &c[len - i - 1]);
}
}
Ignore function calling overhead and the fact that we’re creating storage with our function parameters; we can write this in place, exploit a macro/inline, whatever. Not important. At all. If the candidate brings these things up, great, it’s a nice 15-second chat, and they get brownie points.
The second approach gets all cute, and use XOR, which is a clever and succinct:
void swap(char* a, char* b) {
*a ^= *b, *b ^= *a, *a ^= *b;
}
Or:
void swap(char* a, char* b) {
*a ^= *b;
*b ^= *a;
*a ^= *b;
}
Delicious, right?
No. I’m here to set the record straight. This is fucking dumb. Wasting an hour to tease out this solution (or half an hour with a good candidate, I imagine, unlike myself) is … wasting an hour. I didn’t get the job with Amazon, and it wasn’t because of my performance with string reversal, but rather with an abysmal fit interview with this guy from Xerox Parc. Well, that and an inability at the time to appreciate the merits of detecting aliasing on a spinning disc via gray codes, but that’s the topic of longer digression some other day.
Maybe this sort of pointless optimization isn’t always pointless. Maybe I’m missing the point. If you’re hiring somebody to decipher some hideously obfuscated shit code, it might help if stuff like this springs to mind. Or, if you’re looking for somebody to obsessively waste every last minute of their workday trying to squeeze memory or performance out of their program, maybe this makes sense, right?
No. It doesn’t make sense. People that are actually good at this sort of shit and that will apply it in real code (heaven forbid) already knew this trick before you asked the question. They’ve probably read Hacker’s Delight, and can explain why compiled integer division code with a constant uses multiplication and a magic number instead of a division operator. They know why I used a shift in the above example, except instead of prematurely optimizing and being cute, the point out that the compiler is already going to do this, and that I’m being stupid and cute. And that it’s not endearing or interesting.
Many of you will persist in believing that there’s a point in getting to this silly optimization. You are wrong. Pushing somebody to use the tools they have and satisfy some arbitrary goal that requires the production of obfuscated code is not going to find the people you want. At least not when their tools are the ones you’re trying to make them use in this problem. This isn’t realistic, and it isn’t practical, and it’s just pointless.
Let’s Get Back to the Point
Cutting to the chase a bit, everybody has seen this problem. Or they should have. It’s listed on any site out there that collects technical interview questions. The upside is that if the candidate struggles on this problem, then they are both unqualified and not particularly good at doing their independent study.
What I’m getting at here is that this isn’t a good positive screening question. It’s a good heuristic to use to reject a candidate quickly and off-hand, but it does little to assess somebody afterward, especially if pushed towards a more optimal solution. It would be somewhat more useful if it wasn’t such a common question, but it is what it is.
An especially nasty characteristic of the nature of this problem is that the most interesting solution to see in code will be implemented by those most incapable of writing the code, and the most interesting solution theoretically will result in entirely dull code. I describe interestingness here not as cleverness, but as useful for evaluating a programmer’s ability to write code. I don’t like the way this continuum works, because it pretty much guarantees is that any local maxima is a minima in the orthogonal plane.
Pros
Cons
TL;DR
My advice: don’t ever use this problem.
“I’m not a clever C hacker, and was a much less talented programmer back then.”
You’re just as talented as you were. You’re now more skilled.
— Bill 2 March 2010 #
There’s an in place solution that’s not overly clever. Use the string’s null terminator as the temporary character — just set it back to zero at the end.
I wonder what superficial limitation the Amazon interviewer would have imposed on the problem to steer it towards the xor approach. “Well, now we need to do it in place… just not like that!”
— Kaminski 2 March 2010 #
I’m a big fan of this question:
http://tinyurl.com/yfnlmpz
The commentary is awesome and sadly all too familiar.
— bennett 2 March 2010 #
I’m pretty sure the Amazon reviewer’s trick for forcing both the XOR and not to use the null terminator (good call, thanks) was to say that the string is binary and a length is provided now, rather than providing a null-terminated string.
— Aaron N. Tubbs 3 March 2010 #