Aaron N. Tubbs

Dragon chaser.

I’ve wanted to hate Python for so long.

I’ve often read about how swell Python is, and I’ve worked through various quick intros/tutorials a few times to get a taste for the language. But I’d never really given it an honest-to-god chance.

I had a brief and illicit affair with Ruby, and it was sort of a fun time, but I quickly learned that the promise of Ruby (the expressive power and syntactic flexibility of Perl, the objectness and functionalness of Python) was countered by things that nobody really wants to admit. I keep hearing rumblings about performance/GC problems. I hear rumors that the cool language features are getting cut because they’re too slow. But, far more worrying, I’ve yet to meet a half-sane Ruby programmer.

Seriously. They’re all certifiable. Why the lucky stiff seems pretty normal once you start talking to some people who actually take Ruby seriously. That should scare you. It scares me.

So anyhow, usually after going through some nut’s blog post about python, I try to give it another chance, and I go through another quick taste-of-python journey, and come out with the following observations each time:

  1. Whitespace as syntax. Yeah, I know, this issue has been beaten to death. That does not make it suck any less.
  2. Fucked up scoping. No, I don’t mean whitespace is fucked up. I mean that the way scoping is decided, and which variables belong to which scopes in an object, are not at all intuitive.
  3. Every problem in Python has one solution, and it’s “import Solution. solutionFactory.solve(myProblem)”. It’s not as bad as Java, but avoiding the canon is dangerous, and it’s easy to fall into framework/pattern hell.

And then I give up and ignore it for another six months.

So, I had a project. I wanted a program for work (though this is not a work project) that would:

  1. Provide a web server with a short link service, though with the ability to store some additional metadata as well. Specifically, if I get a link to some online video, I’d like to know, at least in one-sentence form, what I’m visiting.
  2. Provide an IRC bot component to the same system. It would post links posted to the web site, capture links posted in IRC and put them in the database with metadata and user information automatically, and would give people an exceptionally difficult time if they tried to repost a link.

I decided I’d give the project a shot with Python. As a side goal, I wanted to play with SQLite.

Oh, and I wanted to make my life complicated. So I did the anti-UNIX thing and decided to make a big monolithic process, full of threads and glue, rather than making two little programs. Why? Because threads are fun. Right? To this point, I’ve avoided threads, thinking they generally cause more problems than they solve, and thinking they have little place in a system where process spawning is inexpensive. So, it seemed that as long as I was toying with a language that features whitespace as syntax, I may as well go all in and use threads too.

Everything was going great. Python and I were getting along. First BaseHTTPServer, then Cheetah. Pysqlite (and a gentle introduction to dbapi2 and PEP249). Throw in some Twisted Python. Everything’s happy. I have a swell web server. A database abstraction layer. An IRC bot, scraping URLs into a database.

Time to combine them. Inject some threads.

And that’s pretty much where everything crumbled apart. The Twisted reactor had a runtime exception indicating it couldn’t run in a thread. Since then, a quick google search found me a workaround, but the “gosh, this is trivial” feeling went away as soon as threads entered the picture.

At the time, however, I didn’t know this, and just went a slightly different route — I’d leave the IRC component in the main thread, and spawn the web server in another one.
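For the curious, the "web server in a second thread" arrangement can be sketched with nothing but the standard library. This is a minimal illustration, not the code from the project: it uses the Python 3 module name `http.server` (the successor to BaseHTTPServer), and `HelloHandler`, `start_web_server`, and `fetch_once` are hypothetical names invented for the example. The main thread here just makes one request, standing in for whatever the IRC component would be doing.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler standing in for the short-link pages."""

    def do_GET(self):
        body = b"hello from the web thread\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def start_web_server(port=0):
    """Start the HTTP server in a daemon thread and return the server."""
    server = HTTPServer(("127.0.0.1", port), HelloHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server


def fetch_once():
    """Main-thread work: hit the background server once, then stop it."""
    server = start_web_server()  # port 0 asks the OS for a free port
    port = server.server_address[1]
    try:
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", "/")
        response = conn.getresponse()
        result = (response.status, response.read())
        conn.close()
        return result
    finally:
        server.shutdown()      # stops serve_forever() in the other thread
        server.server_close()  # releases the listening socket
```

Calling `fetch_once()` starts the server, makes a single request from the main thread, and tears everything down again.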

So, that was going pretty swell.

Then I tried to pass my database connection across thread boundaries.

I think this is when I realized Python was not for me. Conceptually, I’ve got a handle to a database, where the database is not leveraging any sort of service to coordinate asynchronous access. This means that transactions are expensive, and I’d really like to do the bookkeeping process-side across threads with locking constructs, and just leave the connection open.

Yes, let’s digress for a moment. For the massive transactional load this application would see (on the order of a few dozen transactions a day), none of this really matters, and flock/fsync in sqlite are not going to cause any meaningful performance hits. Similarly, I’m not really getting any savings by using threads, as opposed to two processes here.

I know this. I made the project arbitrarily more difficult, specifically to learn a bit more than I would have otherwise, and to see how Python responds when I start trying to coax it into doing what I want, rather than doing what it wants.

Back to the story.

“Oh no you don’t!” says our protagonist; as soon as I try to pass a database handle across threads, I’m told I can’t pull that sort of shit.
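The refusal is easy to reproduce with the standard library's `sqlite3` module, the descendant of pysqlite. The sketch below assumes the module's default setting, `check_same_thread=True`; the function name is made up for the example, and the exact error text may differ between versions.

```python
import sqlite3
import threading


def use_connection_in_other_thread():
    """Create a connection in this thread and touch it from a worker thread.

    Returns the exception raised in the worker, or None if there was none.
    """
    conn = sqlite3.connect(":memory:")  # default: check_same_thread=True
    errors = []

    def worker():
        try:
            conn.execute("SELECT 1")
        except sqlite3.ProgrammingError as exc:
            # The module refuses to use the connection outside the
            # thread that created it.
            errors.append(exc)

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    conn.close()
    return errors[0] if errors else None
```

With the default settings the worker's query never reaches SQLite; the check happens in the Python binding, which raises `sqlite3.ProgrammingError`.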

Lest you think I haven’t done any of my homework, here’s a direct quote from the SQLite FAQ:

The restriction on moving database connections across threads was relaxed somewhat in version 3.3.1. With that and subsequent versions, it is safe to move a connection handle across threads as long as the connection is not holding any fcntl() locks. You can safely assume that no locks are being held if no transaction is pending and all statements have been finalized.
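One way to get the "keep one connection open and do the locking process-side" behaviour is sketched below, using the standard `sqlite3` module. It assumes you opt out of the thread check with `check_same_thread=False` and take responsibility for serializing access yourself. `SharedLinkStore` and its `links` table are hypothetical, invented for this illustration rather than taken from the project.

```python
import sqlite3
import threading


class SharedLinkStore:
    """Hypothetical wrapper: one connection, one lock, many threads."""

    def __init__(self, path=":memory:"):
        # check_same_thread=False disables the binding's thread check;
        # the lock below is what actually keeps access serialized.
        self._conn = sqlite3.connect(path, check_same_thread=False)
        self._lock = threading.Lock()
        with self._lock, self._conn:
            self._conn.execute(
                "CREATE TABLE IF NOT EXISTS links (url TEXT UNIQUE, note TEXT)"
            )

    def add(self, url, note):
        """Insert a link; return False if it was already posted."""
        with self._lock:
            try:
                # The connection as a context manager commits on success
                # and rolls back on error, so no transaction stays pending
                # when the lock is released.
                with self._conn:
                    self._conn.execute(
                        "INSERT INTO links (url, note) VALUES (?, ?)",
                        (url, note),
                    )
                return True
            except sqlite3.IntegrityError:
                return False

    def count(self):
        """Return the number of stored links."""
        with self._lock:
            row = self._conn.execute("SELECT COUNT(*) FROM links").fetchone()
            return row[0]
```

Every transaction starts and finishes while the lock is held, which lines up with the FAQ's condition that no transaction be pending when the handle moves to another thread. The `False` return from `add` is where a bot could give a reposter a hard time.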

Now, before I start down some righteous path, I want to make something clear. Despite the irritation, I think the fact that, out of the box, I had a rough time trying to get any of this stuff working in/across threads is actually a good thing. This headache is likely to cause enough pain that the novice programmer will not keep trying to do something they lack the experience to do safely.

I think it’s another nice language feature in Python.

Unfortunately, and this may be my downfall, I like languages that let me shoot myself in the foot.

Kids, it’s bad analogy time!

Some will tell you a Volvo is the safest car out there, because it has an excellent record for passive safety. You can do all sorts of stupid shit, but with all of the safety gadgets and engineering, you’ll come home in one piece every night.

There is some truth to this. Analogy digression time!

A car is just another tool. It always confused me that table saws came with blade guards, but I never saw an accomplished woodworker using one. Here’s the situation: There is a spinning carbide-tipped blade of death rotating in very close proximity to all sorts of important body parts. And, at the same time, a talented craftsman, whom I had assumed to this point was also intelligent, has all this protective equipment lying on the floor.

Idiots, you think? No. What the craftsman knows is that he gets more visual feedback when he can see the blade cut. He knows when he’s coming up against a knot, because he can see it get close. Without a guard pushing down on the board, he’s getting more sensory feedback in his arms when something isn’t right, or when the saw is about to kick. Without something in the way, if something does happen, it’s easier to get stuff out of the way, because the safety equipment isn’t fighting to try to keep everything in the worst possible place: where the problem is happening.

I am not a good woodworker, and I am clumsy with tools. Shapers, jointers, and table saws all scare me, and I would feel better with the guards in place.

What the good woodworkers appreciate is active safety. They get more feedback and can react faster to a problem when unencumbered by passive safety mechanisms.

Pop the analogy stack.

While clumsy at woodworking, I at least have some formal training in the proper driving of a car. And, in the case of a car, I’d rather drive a Ferrari than a Volvo. Here’s the short course: If a semi broadsides either one at 70 miles an hour, the occupants are dead. If either one plows into a granite wall at highway speeds, the occupants are dead. All is not, however, lost. Superior handling, acceleration, and deceleration mean that with skilled input from the driver, the Ferrari has more ability to actively avoid an accident.

Pop the stack again. We’re talking about Python again. In case it wasn’t clear: Python is a Volvo with a blade guard. It reeks of passive safety and hand-holding. For the novice programmer, I think this is an excellent thing, because it makes dangerous things difficult. When dangerous things are difficult, those without the experience to safely do them are less likely to attempt to do so. Take the path of least resistance: under the “there’s only one way to do it” philosophy, it will usually be the correct path.

It may be my downfall, but I think I prefer languages that let me shoot myself in the foot.

Case closed, right?

Not so fast. I have another bone to pick with Python. Larry and I have discussed this a bit, so I have the benefit of his insights here, which makes this point a little more focused than the rest:

In Python, unit testing is not optional.

In all languages, there is some balance of run-time and compile-time behavior (replace/combine compile with interpret as necessary). The syntax validation before runtime in Python is a joke; the ability to dynamically rebind nearly everything at runtime provides a certain guarantee: Nothing is safe, and nothing is guaranteed to actually work. The fact that a program runs in Python means nothing. At first I was horribly excited by the fact that, more often than not, when I wrote something in Python it just ran the first time. Later I realized this is as much of a bad sign as good. Beyond the bare grammar, Python has no idea whether the names and attributes a line uses actually exist until it executes that line.

Knowing this, one’s only option is to take every single bit of functionality in every single class, and unit test it to see if it actually runs. Otherwise, the only thing that can be guaranteed is that it probably won’t actually work at runtime. Unit testing may as well be a syntax feature just as much as whitespace and exception handling — something made implicitly mandatory by any nontrivial block of code.
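A small illustration of the complaint, with a hypothetical helper invented for the purpose: the function below contains a misspelled variable name on a branch that is rarely taken. The file is accepted without complaint and the common path works, because a name is only looked up when the line that uses it actually runs.

```python
def describe_link(url, note=None):
    """Hypothetical helper with a typo on the rarely taken branch."""
    if note is None:
        return url
    # Typo: 'nots' instead of 'note'. Nothing objects until this
    # line is actually executed.
    return "%s (%s)" % (url, nots)


def exercise_both_paths():
    """Call both branches and report what happened on each."""
    outcomes = []
    for args in [("http://example.com",), ("http://example.com", "a video")]:
        try:
            outcomes.append(describe_link(*args))
        except NameError as exc:
            outcomes.append(type(exc).__name__)
    return outcomes
```

The first call returns the URL; the second raises `NameError`. Only a test (or a user) that reaches the second branch will ever find out.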

Yuck. Again, this is great for writing safe code, especially when one is new to programming. I’m beginning to at least appreciate why Python is heralded as such a great learning language.

So that’s pretty much where I am right now. I’m not giving up on Python yet, and I want to get to the bits that excite people so much, but the journey thus far has been a bit rocky.