Storing passwords in SQL Server – things to know to keep the data secure

Problem

This is the first in a planned mini-series about common design anti-patterns in SQL Server. What do we mean by 'anti-pattern'? An anti-pattern is 'a common response to a recurring problem that is usually ineffective and risks being highly counterproductive.' (Wikipedia, 2016). This article considers the security implications of storing passwords in the database, examining how this is commonly implemented; by trying basic attack methods on these implementations we show how even salted password hashes can be broken; and offer insights on how to strengthen the security of stored passwords to counter the risk of breach.

Solution

When developers need to create an access control system, they often rely on a data storage medium such as SQL Server to store username and password information. Other metadata might also be stored such as the number of failed logins, the last successful login date, and this authentication information may be linked to customer records in other tables.

This presents an opportunity for the hacker. Imagine that someone was able to fetch the username and password for a customer of an e-commerce site. This someone may be able to:

Place fake orders for the customer
Fetch card or bank account details from the customer profile
View order history
Fetch addresses and security questions for the user
Use the information above to perpetrate attacks on other sites

It is immensely important that passwords are stored securely in SQL Server to ensure that even if the tables containing this information were infiltrated, the passwords themselves remain secure. For the scenarios in the following text, we will assume that a hacker has already gained access to the authentication table, so we can focus on data protection.

We will start by examining a basic example of an authentication table. If you would like to follow along, please consult the 'preparation.sql' script attached to create this table.

SELECT * FROM dbo.Auth_Basic

You will note that two critical columns are exposed here, the username and the password (in plaintext). This means there is no barrier to the hacker - as they have this information, they can simply select a victim and log in via the application interface. Some passwords are random and some are popular. Let's do a frequency analysis using a simple GROUP BY to illustrate this:

SELECT	passwd, COUNT(*) [occurrences] 
FROM	dbo.Auth_Basic 
GROUP	BY passwd 
ORDER	BY COUNT(*) DESC

SQL Server supports cryptographic hash functions. These are functions that take a string input and obfuscate it to a specific length and format using a one-way cryptographic function. These can be very useful - for our example, we could amend all the passwords to use one of these functions. SHA-256 is (at the time of writing) considered cryptographically secure. This article is written to be compatible with SQL Server 2008 and older and using SHA-256 is tricky in versions prior to 2012, so let's use MD5 for backward-compatibility. As part of our security hardening effort, we will create a new authentication table with hashed passwords from the old table using the HASHBYTES() function:

SELECT	UserName, HASHBYTES('MD5', CAST(passwd AS NVARCHAR(4000))) [HashedPasswd], UserID 
INTO	dbo.Auth_Hashed
FROM	dbo.Auth_Basic

SELECT * FROM dbo.Auth_Hashed

Now, the application developer can amend the application code to encrypt user input with the SHA-256 algorithm and compare the output hashes only, returning a simple TRUE/FALSE if user input matches the stored password.

However, we still have a problem. If the hacker has access to this table, he cannot see the plaintext passwords, but he is certainly able to perform the same frequency analysis as we did on the Basic_Auth table. This would show him which hashes were most popular. He could then search publicly-available databases or websites to find the appropriate plaintext, enabling him to guess the passwords that would affect hundreds of accounts.

For example:

SELECT	HashedPasswd, COUNT(*) [occurrences] 
FROM	dbo.Auth_Hashed 
GROUP	BY HashedPasswd
ORDER	BY COUNT(*) DESC

Number of occurrences for the hashed password

If I took the top hash by frequency, 0xC31AC605793F580B386C0FB53F1B9775 and remove the 0x to leave a hexadecimal hash string, I could plug this term into a website like pastebin.com to see if the hash has been cracked:

Convert the hexadecimal string to a clear text string

As we see, the plaintext equivalent is 12345. Now the hacker has access to 223 accounts with a simple query to determine the usernames using this password:

SELECT	UserName 
FROM	dbo.Auth_Hashed 
WHERE	HashedPasswd = 0xC31AC605793F580B386C0FB53F1B9775

User Names with a Hashed Password that matches 1 2 3 4 5

It is, of course possible to perform this attack in the opposite direction. Let's compute the hash for the password 'letmein' and see if it matches any of the password hashes in the table:

SELECT	UserName, HashedPasswd 
FROM	dbo.Auth_Hashed 
WHERE	HashedPasswd = HASHBYTES('MD5', CONVERT(NVARCHAR(4000),'letmein'))

User Name and Hashed Password with the value of letmein

To overcome this deficiency, it has become common practice to salt a password. What is salting? It refers to the practice of modifying a string in a deterministic way i.e. by concatenating another string to it, or inserting/replacing characters in the string according to rules. Salting a password is typically done in two ways - either a unique randomly-generated salt can be used alongside the username/password credentials, or a single salt can be used, defined in the application-side source code.

Let's take our original plaintext table and create another table, generating a random salt for each password along the way and hashing this, so we can examine if this method is secure. For simplicity, we'll generate a random integer in the range 0-99999 to act as the salt, although in reality salts can be much longer and more complex strings, and simply append this integer to the end of the string.

;WITH generateSalt AS (
	SELECT	UserName, passwd, CAST(ABS(CHECKSUM(NEWID())) % 100000 AS NVARCHAR(6)) [salt], UserID 
	FROM	dbo.Auth_Basic ) 
SELECT	UserName, HASHBYTES('MD5',CONVERT(NVARCHAR(4000),passwd + salt)) [HashedSaltedPwd], UserID 
INTO	dbo.Auth_Hashed_Salted 
FROM	generateSalt 

SELECT * FROM dbo.Auth_Hashed_Salted

This approach adds complexity for the hacker. Repeating his frequency analysis on this new table isn't going to work, since even those passwords that are identical will have different hashes. Let's try it now:

SELECT	HashedSaltedPwd, COUNT(*) 
FROM	dbo.Auth_Hashed_Salted 
GROUP	BY HashedSaltedPwd 
ORDER	BY COUNT(*) DESC

Number of occurences of the same password has dropped with the salt

We can see we have just one collision, where an identical salt has been generated for an identical password. All other hashed, salted passwords are now unique. So what can the hacker do to counter this? One approach is still possible, but more time-consuming - the hacker knows the range of salts is 0-99999 from the table data, and can speculate that a simple method like appending it to the password might be used. Let's take ten common passwords and compute the hashes for them with all salts ranging from 0-99999 appended, then join this result set on to our hashed, salted table to see if any passwords can be identified. We can do all of this in SQL. Let's first generate the hash list. 10 passwords x 100,000 salt values = 1,000,000 hashes to check and put this information into the table #computedHashes:

IF EXISTS ( SELECT name FROM tempdb.sys.tables WHERE name LIKE ('%computedHashes%') )
	DROP TABLE #computedHashes 
GO
CREATE TABLE #computedHashes ( 
	ptPwd VARCHAR(50), 
	salt INT, 
	hsPwd VARBINARY(MAX) )

;WITH top10pwds AS ( 
	SELECT 'password' [pwd] UNION SELECT '123456' UNION SELECT '12345678' UNION SELECT '1234' UNION SELECT 'qwerty' 
	UNION SELECT '12345' UNION SELECT 'dragon' UNION SELECT 'letmein' UNION SELECT 'abc123' UNION SELECT 'admin' ), 
-- on-the-fly numbers table
salts AS ( 
	SELECT TOP 100000 ROW_NUMBER() OVER ( ORDER BY ( SELECT NULL ) ) [num] FROM sys.objects o1 CROSS JOIN sys.objects o2 CROSS JOIN sys.objects o3 )

INSERT INTO #computedHashes 
	SELECT	top10pwds.pwd, salts.num [salt], HASHBYTES('MD5',CONVERT(NVARCHAR(4000),pwd + CAST(salts.num AS VARCHAR(6)))) [hash]
	FROM	top10pwds		
	CROSS	JOIN salts 
	ORDER	BY top10pwds.pwd, salts.num

Here's what the #computedHashes table contains:

SELECT TOP 1000 * FROM #computedHashes

Here is what the #computedHashes table contains

It is now trivial to join this information back to our dbo.Auth_Hashed_Salted table, selecting the username and ptPwd (plaintext password) where the calculated hash matches the hash we have stored:

SELECT	auth.Username, comp.ptPwd [Password], auth.UserID 
FROM	dbo.Auth_Hashed_Salted auth 
INNER	JOIN #computedHashes comp ON auth.HashedSaltedPwd = comp.hsPwd

Joining back to get the original user name, password and user id

We can see we have broken 2,081 user passwords from our data table of 10,001 in this manner.

IMPORTANT! Even if we are not using integer-based salts, we can simply select all distinct salts from the table and apply each salt to each password we are testing. For this reason, storing the salt in the table may not be a great idea. If we do not know the cryptographic function, we can try a range. If we do not know how the salt has been applied, we can guess at some common techniques - i.e. applying the salt to the beginning or end of a password. Nor are potential passwords limited to 10 - within a click or two of the right Google search, it is possible to download millions of password samples.

This is computationally expensive, but only the result of the Cartesian products of all of these factors:

Check operations = Number of passwords to check * number of different cryptographic schemes * number of distinct salts known * number of different salt application techniques

Instead, some developers choose to store the salt in the application-side code. This has one major benefit and one major drawback - the benefit is that the source code (or server-side code) would have to be compromised to access the salt value, which would probably present many other, more urgent problems if that occurred, and can be difficult to do. The drawback is that using this approach, typically a single salt value is used for all passwords. If this value is compromised, all passwords could be at risk of compromise.

Other theoretical attack vectors include packet sniffing. Is the application-side code using SSL to communicate with the database? If not, why not? SQL Server supports key/certificate-based authentication with client connections. Without this, a successful login attempt including username, user-entered password and salt could be sniffed en-route.

So what lessons can we learn from this demonstration?

Strong passwords make dictionary-like attacks less likely. Make sure your users adhere to password policies to create strong passwords. Enforce this at the client-side if necessary.
Consider the role of table-level security in SQL Server. Ensure your users only have the access they need.
Consider options like row-level security where available to you.
Consider storing your username and password information in separate tables or databases.
Consider the use of SSL for database traffic encryption during user authentication.
Never store plaintext passwords in the database.
Consider storing your salts separately from your passwords or obfuscate these salts through a hidden, reversible methodology (i.e. an application-side mathematical function).

Next Steps

Karwin, B. (2010) SQL Antipatterns, Publisher: Pragmatic Programmers, ISBN 978-1-93435-655-5
Sergey Gigoyan: Storing passwords in a secure way in a SQL Server database
Manvendra Singh: Identify blank and weak passwords for SQL Server logins
Brent Shaub: SQL Server stored procedure to generate random passwords

About the author

Dr. Derek Colley is a UK-based DBA and BI Developer with more than a decade of experience working with SQL Server, Oracle and MySQL.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips

Article Last Updated: 2016-04-25

Thursday, January 13, 2022 - 9:13:26 PM - Mahesh	Back To Top (89660)
This. Was nice explanation, thanks

Monday, June 15, 2020 - 11:27:47 AM - Stephen P Seymour	Back To Top (86007)
Thanks for the nice overiew.

Tuesday, April 26, 2016 - 9:35:12 AM - Marc Jellinek	Back To Top (41343)
Thanks for pointing out the anti-patterns related to storing usernames and passwords. My question is: how does one do this properly?

Monday, April 25, 2016 - 5:04:15 PM - Justin J Stark	Back To Top (41332)
Use BCrypt and do the hashing in server code and not in the database. In .NET, use BCrypt.NET (available in NuGet as BCrypt-Official) which is fool proof and gives you a nice 60 character string (which includes the algorithm, cost value, salt, and hash) to store in the database.

Storing passwords in SQL Server – things to know to keep the data secure

Problem

Solution

Next Steps

About the author

Comments For This Article