Dealing with SQL Injection Part I


It turns out the new cool way of spreading malware is by SQL Injection. SQL Injection is also my favorite way of getting almost every piece of data available in the application. But I'm a solutions guy, and this is a solutions blog, so let's learn how to deal with it.

As Jeremiah said in his excellent post on the subject, a way (not necessarily the best way) of finding injection points is by outside in penetration testing. Trying particular sets of metacharacters might yield a database error, and you can often tell if the application is giving the error or if the database is giving the error. There are a couple of flaws with this approach: 1) programmers know errors are bad, and often do what they can to hide those, so you might not get adequate enough information to formulate a good attack. 2) Black-box penetration testing will only find a subset of the problems. If a particular injection is only exploitable on Tuesday, for instance, if it's not exercised on a Tuesday, it's never found. (These are called "corner cases")

The best place to fix SQL Injection vulnerabilities is in the source code. And if you have access to the source code, the absolute best method of finding SQL Injection points is by source code analysis - preferably a combination using automated static analysis tools and manual analysis. The way static analysis tools work to find the injection points is by marking data that enters the application as tainted or untrusted. Expensive tools can trace the taint through the application, through API calls, assignments, etc. until the data goes into a function that's not trusted. SQL Injections happen when untrusted data is concatenated into a query to be used in a prepared statement or query. Some example API's include in .NET DbCommand.CommandText (and children), or JDBC Statement.execute* or Connection.prepareStatement.

Here's a quick example from a web application:

Connection conn = DBUtil.getConnection();
Statement stmt = conn.createStatement();
String sql = "SELECT id, surname, givenName FROM person WHERE surname = '"
  + request.getParameter("surname")
  + "'";
ResultSet rs = stmt.executeQuery(sql);

There are two things that need to be corrected in the code above:
  • Rather than using a concatenated query, we should use a prepared statement, parameterized query, bind variables statement, whatever you want to call it. This allows the database driver to properly escape and transmit data to the query. For repeated queries, it also substantially improves performance.
  • Some sort of input validation needs to take place on the request parameter "surname" to verify that it's a legal surname syntax according to our syntax.

Here's the improved code using the first item:

Connection conn = DBUtil.getConnection();
String sql = "SELECT id, surname, givenName FROM person WHERE surname = ?";
PreparedStatement stmt = conn.prepareStatement(sql);
stmt.setString(1, request.getParameter("surname"));
ResultSet rs = stmt.executeQuery();

It's important to note that using PreparedStatement doesn't immediately make the code immune. You have to use them properly by using bind variables. The syntax for using them differs from driver to driver and RDBMS to RDBMS, but question marks are pretty common. Some things can't be bound, however. Only constant values can be bound, so if you're depending on the user to provide table or column names, you need to make sure it is exactly one of the allowed values, or use a lookup method such as a map to map integers to the column names. Also, LIKE statements seem to cause people grief - you need to concatenate the percent signs to the bind variables like "WHERE foo LIKE '%' + ? '%'" or "WHERE foo LIKE CONCAT('%',?,'%')".

As above, the other thing that needs to happen is input validation. I picked surname on purpose because the most common SQL metacharacter first used for injection is the apostrophe. An apostrophe might also be allowable in a surname. So it becomes a perfect example of why prepared statements work so well - it doesn't mean that you don't have to use apostrophes. Here's an updated version that takes pretty complex US-ASCII surnames including apostrophes, but still gets those safely to the query later.

String search = request.getParameter("surname");
if (search == null || !search.matches("^([A-Za-z][a-z]*[' ])?[A-Z][a-z]+(-([A-Za-z][a-z]*[' ])?[A-Z][a-z]+)?$")) {
Connection conn = DBUtil.getConnection();
String sql = "SELECT id, surname, givenName FROM person WHERE surname = ?";
PreparedStatement stmt = conn.prepareStatement(sql);
stmt.setString(1, search);
ResultSet rs = stmt.executeQuery();

Also, it's worth noting that simply using stored procedures does not remove SQL Injection vulnerabilities. There are two ways that injection can still take place - either by the way the statement is called from the code (with an EXEC with user data concatenated into it), or if the stored procedure uses concatenation to build a dynamic query - all you've done is moved the injection attack into the stored procedure, effectively using the prepared statement to safely transport the attack there.