More on Output Filtering


Another one of the Justice League guys has posted about the value of output filtering. Now, the people at Cigital are uber smart - not because they're blogging about output filtering, but because they actually look at code from a very academic perspective. If you're not reading Justice League, you should be.

Now, I'm not sure I've ever said that you shouldn't do input validation. I hope I never have, and if I have, I apologize for leading you astray. However, the phrase I've been using in reporting for some time is "business-rule input validation, presentation-specific output filtering". A favorite line of a favorite movie of mine (PCU) is when Gutter is going to see George Clinton, and he's wearing a Funkadelic shirt:

Droz: What's this? You're wearing the shirt of the band you're going to see? Don't be that guy.

While I understand the desire to prepare data early (when it comes in), doing so means you've already manipulated it into a form that is wrong for some other context. For example, if you HTML-encode data when it comes in, then a month from now, when you're using it in a PDF, your customers will complain about all these &lt; sequences in it.
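To make the PDF complaint concrete, here is a minimal sketch in Python. The function names are hypothetical, and the plain-text renderer is only a stand-in for a PDF generator; the one real library call is the standard library's html.escape.

```python
import html


def store_encoded_on_input(raw: str) -> str:
    """Anti-pattern: HTML-encode the value as it arrives, before storing it."""
    return html.escape(raw)


def render_html(stored: str) -> str:
    """Output filtering: encode for HTML at the moment the HTML is produced."""
    return "<p>" + html.escape(stored) + "</p>"


def render_plain_text(stored: str) -> str:
    """Stand-in for a PDF or other non-HTML output: no HTML encoding applies."""
    return stored


raw = "a < b"

# Encoded on input: the entity is baked into the stored value and leaks
# into every non-HTML presentation.
print(render_plain_text(store_encoded_on_input(raw)))

# Stored as received, encoded per presentation layer: each output is right.
print(render_plain_text(raw))
print(render_html(raw))
```

The stored value stays exactly what the user typed, and each presentation layer applies its own encoding on the way out.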

And the very best thing about output filtering is that it can be a habit. Scott hit the nail on the head in his post - the way that you properly get input squared away varies all the time, but output filtering for each presentation layer is always the same. In HTML/XML, using .encodeAsHTML() or <c:out /> or Server.HTMLEncode becomes a habit, just as much as the style guides you expect your developers to adhere to. And it's something that can be checked with a couple of regexes, instead of a really expensive tool (not that those really expensive tools aren't good for other things).
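As a rough idea of what "a couple of regexes" could look like, here is a sketch in Python that flags ${...} template expressions that do not call .encodeAsHTML(). The template syntax, the function name, and the pattern are all assumptions for illustration. It is a crude check: it does not handle nested braces or expressions already wrapped in <c:out />.

```python
import re

# Matches ${...} where the expression does NOT contain .encodeAsHTML().
# Assumption: expressions never contain a literal "}" inside them.
UNENCODED_EXPR = re.compile(r"\$\{(?![^}]*\.encodeAsHTML\(\))[^}]*\}")


def find_unencoded_expressions(template_text: str) -> list[tuple[int, str]]:
    """Return (line_number, expression) for each expression lacking encoding."""
    findings = []
    for line_number, line in enumerate(template_text.splitlines(), start=1):
        for match in UNENCODED_EXPR.finditer(line):
            findings.append((line_number, match.group(0)))
    return findings


# Hypothetical template: the first expression is flagged, the second is not.
template = "<p>${user.name}</p>\n<p>${user.bio.encodeAsHTML()}</p>"
for line_number, expression in find_unencoded_expressions(template):
    print(f"line {line_number}: {expression} is not HTML-encoded")
```

Something like this could run as part of a build or code review, the same way a style check would.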