SQL injection is a common vector of attacking websites. With only a moment's mental lapse by a programmer, he can instantly give full control of his server's database to any random visitor with only a few second's work. The hacker can take all of the information from the database and then delete it, leaving the site owner in a very unfortunate position, especially if the data stolen was confidential.

Definitions and the Problem

When a programmer creates a website that interfaces with a database, he will often use "SQL" -- Structured Query Language -- to do so. SQL is a simple language with English-like syntax; the line "SELECT username, user_rank FROM users_table WHERE user_posts > 50" will, as it suggests, return a set of every user and his rank who has more than 50 posts in this particular database. The simple but powerful nature of SQL means it is used almost everywhere on the Web today.

Another feature of SQL that makes it so widely used is that it can be used inside of other common scripting languages -- such as VBScript and PHP -- easily. Programmers new to the web, or even people new to programming at all, can create a database-driven website in days. And therein lies the problem; many of the resources that educate these new programmers completely dismiss the issue of security, creating an army of unaware coders making thousands of websites without any of the necessary protections.

SQL Injection, put simply, is a malicious user changing the programmer's SQL statements to do things to the database that were never originally intended. Where the coder only wanted the user to login if he had a valid username and password, the hacker logs in to the admin account without a password; where the coder wanted time details to be displayed, the hacker displays credit card information; and, as previously mentioned, at almost any step the hacker can delete the entire contents of the database, possibly preventing the site from properly working in the process.

However, SQL Injection is, in most cases, simple to fix. If a coder is aware of the potential danger of SQL Injection while he is programming the website, he should find no trouble steeling his site against such attacks.

How it Works - By Example

The easiest way to explain SQL Injection is by showing an example. In this example, I will use HTML, the markup used to create static webpages; SQL; and VBScript, the simple scripting language used in Microsoft's Active Server Pages. I have chosen VBScript for two reasons; it is easy for even non-programmers to understand, and it is extremely prevalent among new web programmers, the type most likely to leave themselves open to SQL Injection.

So, let's say we want to make a page with two textboxes: one for username, and one for password. We want to check this information against a database to see if the username and password that the user gives us exists. If it does, we give them the confidential data; otherwise, we tell them that their information was not valid. For the HTML form, we may write:

<FORM action="login.asp" method="POST">
<input type="text" name="username_input">
<input type="text" name="password_input">
</FORM>

In our Active Server Page, "login.asp", we have our VBScript and SQL code that checks the database to see if our user exists in the database:

  stringUsername = Request.Form("username_input")
  stringPassword = Request.Form("password_input")

  stringSQL = "SELECT * FROM Users " & "WHERE Username='" & stringUsername & "' AND Password='" & stringPassword & "'"

  Set databaseRecordset = Server.CreateObject("ADODB.Recordset")
  databaseRecordset.Open stringSQL, DatabaseConnectionString

  If (databaseRecordset.EOF) Then
    Response.Write "Your credentials are incorrect."
  Else
    Response.Write "You are logged in as " & databaseRecordset("Username")
  End If

Don't worry about the details of this code; the important part is to see that the SQL string is comprised of both parts that we as the programmer write and parts that the user submits. For instance, if we submit a username of "admin" and a password of "dogsname", the resulting SQL string will be:

SELECT * FROM Users WHERE Username='admin' AND Password='dogsname'

Then, we return a "Recordset" -- a table of information -- from the database where the username is admin and the password is dogsname. If such a row exists in our database, the user has supplied the right information, and the Recordset will be set to that row. If the data supplied is not in the database, the Recordset will return empty. The Recordset will be at the End Of File, since it is empty; therefore, the (databaseRecordset.EOF) check will be true, and we tell the user that the credentials were incorrect.

A hacker who wants to try an SQL injection on this page may input "admin'--" into the username field. This will result in an SQL string of:

SELECT * FROM Users WHERE Username='admin'--' AND Password=''

However, the two dash characters ("--") act as a comment tag in SQL; the database will ignore everything after them. This means that all that the database sees is:

SELECT * FROM Users WHERE Username='admin'

The database will then return the "admin" row from the database, even though we didn't give it a password! Now, the website thinks we've logged into the system as the admin, and we can do anything that the website would normally have allowed only someone with proper credentials to do.

However, it is usually not this easy for a hacker to take advantage of a SQL Injection exploit. There are often several barriers impeding the hacker, even if the programmer doesn't know about it. For one, we may not know what the administrator account is called; for another, many databases do not use the comment tag. However, the system is still exploitable. if we submit a username and a password of "' OR 'a'='a" we will get the following SQL string:

SELECT * FROM Users WHERE Username='' OR 'a'='a' AND Password='' OR 'a'='a'

"a" is always equal to "a", and since we used the OR statement, this entire statement will be true for every field in our database. Therefore, we will be logged in as the first person in the database -- usually the administrator account!

How to Prevent It

Preventing SQL Injection is simple. All web programmers need to learn one catchphrase and always remember it. Never trust user-supplied data. This is vitally important, so allow me to repeat:

Never trust user-supplied data.

Okay, good. Now, we know not to trust the username and password field. So how do we prevent this user we're not trusting from changing our SQL statement? If you notice, the user is able to "break out" of the place we expect them to put things with the single-quote character ("'"). If they couldn't use single quotes, they wouldn't be able to inject anything into this statement, because whatever they input would be stuck inside the pair of single-quotes.

Thankfully, SQL has a feature that helps us here: if it sees two single-quotes in a row ("''"), it knows that the quote should be a character of the text field, not part of the structure of the statement. This is called escaping the single-quote. So, if we replace all single-quotes in our untrusted input with two single-quotes, our SQL Injection exploit is fixed!

  stringUsername = Request.Form("username_input").Replace("'","''")
  stringPassword = Request.Form("password_input").Replace("'","''")

Now, in this instance, our SQL string is safe. However, this does not solve every possible injection possibility; numbers in SQL, for example, are not surrounded by single-quotes, so an exploit on numerical fields could get past this prevention method. But as long as you remember to never trust user-supplied data, you can figure out a method to keep your SQL free from exploits. (In the case of numericals, for example, do not allow any characters in the variable.)

More Advanced SQL Injection

There are many, many more techniques for advanced SQL Injection. For instance, many programmers often forget about "hidden" input tags; these are pre-set values on forms that the user can't even see. Many programmers will think these values are safe; however, data does not have to be passed through the form that you made. A hacker can make his own version of the form where he can adjust the data in the "hidden" field and send it back to you; your program has no way of telling if the form sending it data was yours or someone else's. Thus, treat values from "hidden" input the same as you do all other user-supplied data: don't trust it.

As far as database manipulation goes, the hacker has many more tricks available to him. The semicolon in SQL seperates two statements, so if a malicious user sent a password of "';DROP TABLE Users;--" to the above form, the resulting string would be "SELECT * FROM Users WHERE Username='' AND Password='';DROP TABLE USERS--'". The first command would execute and return nothing, but the second command would execute as well, deleting our entire Users table!

Another more advanced trick is the use of "UNION SELECT". This allows a coder to combine different SELECT statements, possibly from different tables, into the same recordset. One use a hacker may have for this is on an online shopping site; if he can UNION SELECT credit card information into the list of available products, he can steal hundreds of identities of people who have shopped at this site.

Conclusion

No matter how advanced the attacker who uses SQL Injection may be, he can be easily defeated by any web programmer who remembers to never trust user-supplied data. However, until the masses of new web programmers learn that critical rule, easy opportunities for SQL Injection will leave hundreds of websites vulnerable and thousands of people's secure information at risk.