Aaron N. Tubbs bio photo

Aaron N. Tubbs

Dragon chaser.

Twitter Facebook Google+ LinkedIn Github

Repeating Myself

If I hammer no other point home, it is of the utmost importance that, during the course of a candidate’s technical interviews, they write code. Real code that will interpret and compile if the kinks are worked out. This is not a matter of trivia and remembering syntax, it’s about proving the candidate is a programmer. The only way to verify the candidate is a programmer is to make them write real code. Not pseudocode. C is a great option, the language of the job is another good option. Preferably both. Make them write C.

Introduction

Alright.

A fairly popular selection of easy “write some code” problems comes from C’s string library. There are a few valid uses of these functions:

  1. Check that a candidate is comfortable writing code.
  2. Check that a candidate can find bugs in their code, especially off-by-one issues.
  3. Check that a candidate has comfort with pointers.

These are all great things, and these problems excel at burning through the above in short order. Here’s the invalid uses of these functions:

  1. Aggregate a bunch of them to try assemble an interview out of string.h questions.
  2. Relentlessly pursue the optimal implementation to strrev for an hour.
  3. Bludgeon the candidate with your knowledge of Rabin-Karp, Knuth-Morris-Pratt and Boyer-Moore solutions to strstr and then throw in their face that, after an hour, they failed to correctly implement either.

Obviously option #3 is perfectly understandable if you’re a sadist, and your purpose in interviewing is to both turn off the candidate and feel intellectually superior.

An Example

In one interview, I just had a few minutes left, and uncharacteristically had made it that far without the candidate writing any code. The candidate forgot a parameter to memcpy, called it memcopy, and swapped the order of the source and destination. The called strlen twice plus the copy, for three k-length passes, where k is bounded roughly by the length of the string. They allocated bytes for two nulls, not just one, during the realloc (never mind that that’s not the way strcat actually works), and inserted the second string a byte late. Considering the following is a reasonable (and probably buggy) implementation of strcat, you might think I was disgusted with the above result:

char* strcat(char* dest, const char* src) {
  char* t = dest;
  while (*t++); --t;
  while (*t++ = *src++);
  return dest;
}

Getting off track, a useful observation is that strlen is a C string function, so either hand that out for free, make the candidate implement it inline, or have them write a strlen real quick. Of course, if you’re going to make them write strlen, you may as well just make them write strlen, and not bother with whatever function you’re asking them to implement.

So, returning to the ostensible disaster above, the trick was I had about three minutes. My goal was to find out if the candidate could comfortably write code on the board on demand. They could. That’s all I cared about. If I had a few more minutes, we would have found out most of the issues. Good questions are “Is this perfect?” If for some reason they are foolish enough to believe it is, “So, what’s wrong with your code?” is the next choice. Even if there’s nothing wrong. Make them defend it.

In the end, we’re talking ten minutes, tops. Most of the functions in string.h can be reasonably approximated, debugged, and erased in about ten minutes. There are, unfortunately, some gotchas. There are traps in functions like strcmp, because the exact behavior of strcmp is not usually clear to the candidate. memmove is a little more complicated than memcpy. As I’ve discussed before, strrev has issues, though it’s not really part of string.h anyway. strtok is a much more serious problem, and now we have to have a discussion about reentrancy, which is good and bad, but we’re not going to do it all in ten minutes. And, well, strstr is a bitch.

strstr

strstr is a trap. The brute force solution is ugly, but easy to code. After that, the candidate either knows Rabin-Karp, Boyer-Moore, or Knuth-Morris-Pratt … or they don’t. Trying to tease out the solution and get it right is hard, and as many times as I’ve read KMP, I sure wouldn’t want to defend it on an interview. Anything other than the brute force solution is no longer a ten minute interview problem. That’s okay, but be willing to set aside the entire interview to implement strstr. Then, ask yourself, is it worth blowing 45 minutes to an hour on strstr? No. There are more interesting things to do. So, don’t ask a candidate to implement strstr. QED.

The other thing to keep in mind with all of this stuff is that these questions are asked frequently. Candidates probably have studied these things, have seen them previously, or whatnot. Therefore, they’re not really good to assess their knowledge or skill, they’re just good quick heuristics to check comfort with writing code and debugging by reading, anything more than that, and you’re wasting your time. Use these things to check if a person can write C, and then move on to the meat of your interview.