Friday, December 5, 2008

The safest way to sanitize input: avoid having to do it at all

Last weekend, I finished some work on a small development project for a new client. Thanks to that, I found myself in a frame of mind that gave me the urge to write more code. I turned that to good use by finally getting to work on the long neglected task of writing a new contact page back end for my professional Website. I had a contact page there, of course, but it was essentially an ugly hack of a contact page back end I had written in PHP for a completely different Website a few years ago. Worse, it bore no resemblance to the rest of the site, and I had not bothered to give it a navigation element (i.e., what users tend to call a “menu”).
The back end for this contact page would be written in Ruby, of course. I wrote out a rather pretty script, if I do say so myself, that made use of the TMail library, a tool that abstracts away a lot of the behind-the-scenes drudgery of specifying email headers and content and preparing it for transmission using SMTP. I wrote it such that it would work equally well from the browser and the command line. Then, I tested it.
It worked brilliantly on my laptop when I executed it from the shell. It worked brilliantly on the server when I executed it from the shell, too. It failed utterly when I tried entering the URL for it in the browser. I spent entirely too long beating my head against the intractable problem of getting it working from the browser. I also tried RubyMail, an alternative to TMail, and ran into the same problems. As it turns out, someone saw fit to change some configuration option on the Webserver so that installing and using gems — that’s the term for Ruby libraries and utilities packaged up for use through Ruby’s own software management system — no longer works. This is the sort of thing that makes me think I should get myself a virtual server account for my professional Website, but it’s difficult to justify spending more than required by a shared hosting account considering the rather minimal technical requirements I have for the site.
I won’t get into the sordid details of how exactly Ruby gems no longer install and work properly on my shared hosting account. After a day plus of finding out that no amount of finesse, dirty hackery, or pleading with the server on my hands and knees would do any good, I gave up. Ruby, like many high level dynamic languages that are either interpreted or JIT compiled, provides extremely easy to use functionality for accessing other programs through outside shell processes. For instance, if you want to access the mail command from within Ruby, you can just wrap the command in backticks or execute it by way of the Kernel#system or Kernel#exec method, each of which provides subtly (but importantly) different functionality.
It’s surprisingly easy to send a command to the shell via one of these methods. I chose backticks. I started out writing something like this for the actual mail sending code:`echo "#{body}" mail -s "#{subject}" #{recipient}`
In that one line of code, I had a way to cram the body of a message (stored in the body variable) into an email that would arrive in my inbox with a subject that had been stored in the subject variable. My email address (since this was for a contact page) was specified by the recipient variable, since the point was to create a way for people to send emails to me without having to make that email address available to the world at large (and to make contact page emails stand out in my inbox). It was all quick, easy, and neat. Now comes the messy part: because the body of the email contained user supplied input, I needed to sanitize it so malicious input wouldn’t result in a vulnerability in my code that, oh, deleted everything in my directory on the server, for instance.
I ran around in circles on that for a little while, soliciting some outside help with figuring out how to make sure all the holes were plugged in the output sanitization code (thanks Sterling and ruby-talk). It was a lot of work, but I was making progress. Then, of course, someone on ruby-talk (thanks, James Gray) pointed out the obvious — I shouldn’t send the content of the body variable to the shell at all. What I should do instead is open a pipe to the mail command as an IO stream and just write to it like any other file handle.
The new code, then, ended up looking something like this:open( %Q{ mail -s "#{subject}" #{recipient} }, 'w' ) do msg
msg << body
end
Perhaps even more important, in terms of handling user input, is the fact that the `subject` and `recipient` variables are set by me, and not by user input.
As a result of taking this approach with both the content and headers of emails handled by this contact page, the need for me to sanitize input before I sent it to the shell simply evaporated. I was no longer sending any user input to the shell with my own code, at all.
The Moral of the Story
I was reminded, in a forehead slapping moment, of one of the cardinal laws of input handling. The first thought that occurs to most people when they think about security and accepting arbitrary input from unknown users — such as on a Website contact form — is that all input must be sanitized. That’s really a secondary rule, though.
One might get a little closer, in concept, to the first law of sanitizing data by observing the rule that one should not reinvent wheels, when it comes to security especially, without very good reason. In other words, if you have to sanitize data, use someone else’s well tested code to do it if you can, because writing it from scratch will be initially prone to error. I would have done just that, if I found something in the core language or standard library for Ruby that would do the kind of input sanitizing I actually needed. Alas, what I need is not something like URL escape characters, which is in the standard library via the CGI module.
The actual first rule of thumb for input sanitizing is in some ways much more obvious than the preceding two guidelines, but simultaneously far less well observed. I, myself, forgot it until a helpful soul on the ruby-talk mailing list reminded me (in a roundabout way) that:
It’s safer to write code that doesn’t require input sanitizing than to try to sanitize it.
It’s not a lesson I’ll forget again, any time soon. I’m just glad I didn’t learn it the hard way — by writing broken input sanitizing code and suffering the consequences when I exposed it to the Internet.

No comments: