By: Jared Westover | Updated: 2023-12-07 | Comments | Related: > Constraints
Problem
Do your friends make incredible claims and then provide no proof to back them up? As a kid, I wanted people to like me and told some wild stories. One involved crossing paths with a real ninja. How can you disprove a claim so outlandish? The internet shrunk the likelihood of getting away with making stuff up. I'm glad it wasn't around in the '80s. Yet, people still make stuff up. A friend told me a developer asked them not to add check constraints because of the detrimental impact on performance. They asked me if it was true: do constraints tank performance? What does tank even mean? Unlike my ninja story, we can put this one to the test.
Solution
This article explores check constraints and when you want to use them. Also, what's the difference between a check and a unique constraint? The big question I'll answer is: Do check constraints tank your database's performance? By the end of this article, you'll know how check constraints affect performance. When one of your coworkers makes an outlandish claim, send them a link to this article.
Check Constraints Explored
A check constraint is a logic rule placed on one or more columns in a table. The official Microsoft definition is that they enforce domain integrity by limiting the accepted values by one or more columns. Think of them as the traffic cops of SQL Server. I couldn't find the official SQL Server release introducing them, which means they're older than me. Below is the code to add one.
ALTER TABLE dbo.Person WITH CHECK ADD CONSTRAINT CheckFirstName CHECK (FirstName LIKE '%[A-Z]%');
One way to think about them is that they perform a test on the values coming into a row. Is it true or false? If it's true, then insert or update the row. If it's false, then it will not pass. When the constraint returns a false, SQL throws an error message like the one below.
You can disable them if you want. Sometimes, people do this when bulk-loading data and accept that the data loaded doesn't follow the rules. I try not to do this. When loading data, first put it into a staging table. Then, you can force it to follow whatever rules the constraints enforce. Once you deactivate it and load data, SQL marks the constraint as untrustworthy. This action can impact SQL's ability to create the best plan. For example, imagine a column where you only allow four values: 1, 2, 3, and 4. You then have a query with a WHERE clause, looking for 5. Since you invalidate the trust, SQL scans the table or index. If we didn't invalidate the trust, SQL would likely perform a Constant Scan since it knew the value wasn't allowed based on the constraint's logic. Grant Fritchey wrote an article titled 'Constraints and Select Statements' that delves deeper into this topic.
Unique Constraint
A unique constraint is also a check since it only allows distinct values into a column. SQL does this by creating a unique index. I've used unique constraints on columns like email or employee ID where neither fit the primary key criteria. Adding one might help prevent your app from crashing if you only expect one record when returning an email or ID. You can add multiple columns to a unique index. When you know a column is unique, adding a unique index can give your queries a performance boost since SQL knows only one value exists.
ALTER TABLE dbo.Person ADD CONSTRAINT UX_Email UNIQUE (Email);
Real World Examples
I'm highlighting three real-world examples I've used and ones we can test. The ones listed below are simplistic. I've seen constraints where you're verifying a dozen different checks. But, for our test, I want to keep it simple. I'm always up for hearing about other people's experiences. In the comments, list constraints you've implemented, especially anything out of the norm.
Phone Number
Phone numbers in their current form have been around for decades. At least in the United States, you don't use letters. You see late-night infomercials where a shady character shouts, "Call 555-FREE now before time runs out." But the word FREE refers to numbers on the dial pad. If you have a field where an end user enters a phone number or you're loading data, you want it to follow a specific format. This format excludes letters or other special characters—numeric only. Below is an example of creating a check constraint for this.
ALTER TABLE dbo.Person WITH CHECK ADD CONSTRAINT CheckPhonenumber CHECK (DATALENGTH(PhoneNumber) = 10 AND PhoneNumber LIKE '%[0-9]%');
File Extension
If users upload files to your application and it saves them in a data lake or, the database, I hope there's an extension. An extension tells you what type of file the end user uploaded. For example, I want to know if people upload .exe files and even stop them. At the very least, I want every file uploaded to contain an extension. The extension data allows you to determine how many PDF or XLSX files you own. Below, we reject any file without an extension at the end.
ALTER TABLE dbo.Person WITH CHECK ADD CONSTRAINT FileExtension CHECK (CHARINDEX('.', filename) > 0 AND LEN(RIGHT(filename, CHARINDEX('.', REVERSE(filename1)) - 1 )) < 5 );
The example code above for checking the file extension is only a starting point. If you have a better one, please comment below so we can all benefit.
Salary Restrictions
Companies use Human Resource (HR) applications to keep track of employees. Part of the information gathered includes employee salaries. Generally, salaries range based on the role that someone fits into. For example, I would expect a developer's salary to range from $60K to $100K a year—despite what someone on LinkedIn says. For this reason, you might add a constraint to keep someone in HR from fat-fingering extra zeros. Now, being the recipient of this error, you might protest. But the company will catch on and want their money back. Below is an example of narrowing the salary range.
ALTER TABLE dbo.Person WITH CHECK ADD CONSTRAINT CheckSalary CHECK (Salary >= 60000 AND Salary <= 100000);
Building Our Dataset
We looked at three constraints, and now it's time to build a decent size dataset. For this one, we'll create one table with one million rows. If you want to create a larger dataset, feel free to do so. I include three columns where adding a check constraint makes sense, especially when you can't control the front end. Try to handle all validations on the front end, but when you can't, use SQL.
USE [master]; GO IF DATABASEPROPERTYEX('CheckConstraintsDemo', 'Version') IS NOT NULL BEGIN ALTER DATABASE CheckConstraintsDemo SET SINGLE_USER WITH ROLLBACK IMMEDIATE; DROP DATABASE CheckConstraintsDemo; END; GO CREATE DATABASE CheckConstraintsDemo ON PRIMARY ( NAME = N'CheckConstraintsDemo', FILENAME = N'C:\code\MSSQLTips\SQLFiles\CheckConstraintsDemo.mdf' ) LOG ON ( NAME = N'FragmentationImpact_log', FILENAME = N'C:\code\MSSQLTips\SQLFiles\CheckConstraintsDemo.ldf' ); GO ALTER DATABASE CheckConstraintsDemo SET RECOVERY SIMPLE; GO USE CheckConstraintsDemo; GO CREATE TABLE dbo.Person ( Id INT IDENTITY(1, 1) NOT NULL, FirstName VARCHAR(25) NOT NULL, LastName VARCHAR(50) NOT NULL, PhoneNumber VARCHAR(100) NOT NULL, Email VARCHAR(100) NOT NULL, BirthDate DATE NOT NULL, Salary DECIMAL(10, 2) NOT NULL, CreateDate DATETIME NOT NULL DEFAULT GETDATE(), ModifyDate DATETIME NULL, CONSTRAINT PK_Person_Id PRIMARY KEY CLUSTERED (Id) ); GO ;WITH PersonData AS ( SELECT TOP (1000000) SUBSTRING( 'ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZ', (ABS(CHECKSUM(NEWID())) % 26) + 1, 8 ) AS FirstName, SUBSTRING( 'ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZ', (ABS(CHECKSUM(NEWID())) % 52) + 1, 15 ) AS LastName, ABS(CHECKSUM(NEWID()) % 9999999999) + 1000000000 AS PhoneNumber, DATEADD(DAY, RAND(CHECKSUM(NEWID())) * (1 + 7300), '1980-01-01') AS Birthdate, ABS(CHECKSUM(NEWID()) % 100000) + 25000 AS Salary FROM sys.all_columns AS n1 CROSS JOIN sys.all_columns AS n2 ) INSERT INTO dbo.Person (FirstName, LastName, PhoneNumber, Email, BirthDate, Salary) SELECT p.FirstName, p.LastName, p.PhoneNumber, CONCAT(p.FirstName, '.', p.LastName, '@sqlsavior.com') AS Email, p.Birthdate, p.Salary FROM PersonData p; GO CHECKPOINT; GO
Performance Test
To test performance, I'll use Adam Machanic's SQLQueryStress. It's a free tool dating back to 2007 for generating load. If you've never used it, I wrote an article on getting started. For our test, I'll focus on inserts and updates. You could add deletes to the mix, but I doubt check constraints produce a massive performance impact, unlike foreign keys.
We must first establish a performance baseline without the check constraints in place. At this point, let's go ahead and create a database backup to restore after completing each round.
BACKUP DATABASE [CheckConstraintsDemo] TO DISK = N'C:\code\MSSQLTips\SQLFiles\CheckConstraintsDemo.bak' WITH COPY_ONLY, NOFORMAT, INIT, NAME = N'CheckConstraintsDemo-Full Database Backup', COMPRESSION, STATS = 10; GO
Without Check Constraints
Let's get an idea of the performance without our three check constraints. In SQLQueryStress, I'll set the number of iterations to 100 and the number of threads to 200. This setting produces 20,000 iterations. Also, I'll run this three times for a total of 60,000 iterations completed.
Inserts
The first test I'll perform is inserting data into our table. I'm performing one insert at a time, sometimes called trickle inserts. Additionally, I inserted multiple rows at once and didn't see a big difference in performance. The code below is what I used for testing. Also, I've included a screenshot showing the settings in SQLQueryStress.
Here is the code for populating the parameter substitution:
;WITH PersonData AS ( SELECT TOP (1000) SUBSTRING( 'ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZ', (ABS(CHECKSUM(NEWID())) % 26) + 1, 8 ) AS FirstName, SUBSTRING( 'ABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZABCDEFGHIJKLMNOPQRSTUVWXYZ', (ABS(CHECKSUM(NEWID())) % 52) + 1, 15 ) AS LastName, ABS(CHECKSUM(NEWID()) % 9999999999) + 1000000000 AS PhoneNumber, DATEADD(DAY, RAND(CHECKSUM(NEWID())) * (1 + 7300), '1980-01-01') AS Birthdate, ABS(CHECKSUM(NEWID()) % 100000) + 25000 AS Salary FROM sys.all_columns AS n1 ) SELECT p.FirstName AS FirstName, p.LastName AS LastName, p.PhoneNumber AS PhoneNumber, CONCAT(p.FirstName, '.', p.LastName, '@sqlsavior.com') AS Email, p.Birthdate AS Birthdate, p.Salary AS Salary FROM PersonData p;
Finally, the code below performs the insert.
INSERT INTO dbo.Person (FirstName, LastName, PhoneNumber, Email, BirthDate, Salary) SELECT @FirstName, @LastName, @PhoneNumber, @Email, @Birthdate, @Salary;
I'm using Extended Events (XE) to capture the performance markers. You can use SQLQueryStress, but I prefer to gather the stats using XE. To learn more about XE, search for any of Erin Stellato's free videos online. I'll include the results of my test in a table below.
Updates
Our second test, without check constraints, is performing updates. Like the inserts, we'll update one row at a time. When SQLQueryStress finishes, we'll perform 60,000 updates.
UPDATE dbo.Person SET BirthDate = DATEADD(DAY, 1, BirthDate), Salary = (Salary * 1.001), PhoneNumber = '5555555555' WHERE Id = @Id;
Below is the code I used for Parameter Substitution. It returns 200 random integers between one and a million.
SELECT TOP 200 (ABS(CHECKSUM(NEWID()) % 1000000) + 1) AS Id FROM sys.all_columns c1;
The following table outlines the performance of all executions for inserts and updates. I ran them multiple times until I got consistent results across executions. Since I used 200 threads, we can't divide the SUMs to get an average. However, since the SUMs are larger, it helps to determine if we experience a significant performance impact.
Action |
Rows |
CPU (AVG) |
CPU (SUM) |
DUR (AVG) |
DUR (SUM) |
---|---|---|---|---|---|
INSERT |
60000 |
0.138ms |
8280ms |
5.551ms |
333100ms |
UPDATE | 60000 | 0.074ms | 4453ms | 0.763ms | 45820ms |
Adding Check Constraints
It's time to add our three check constraints. Before we add them, I'll restore the database from the backup we made earlier. The constraints apply to the BirthDate, Salary, and PhoneNumber columns.
ALTER TABLE dbo.Person WITH CHECK ADD CONSTRAINT CheckBirthdate CHECK (BirthDate > DATEADD(YEAR, -105, GETDATE())); ALTER TABLE dbo.Person WITH CHECK ADD CONSTRAINT CheckSalary CHECK (Salary >= 25000 AND Salary <= 150000); ALTER TABLE dbo.Person WITH CHECK ADD CONSTRAINT CheckPhonenumber CHECK (DATALENGTH(PhoneNumber) = 10 AND PhoneNumber LIKE '%[0-9]%');
I'll repeat the steps from above and add the times to the table below. Also, I'll add the percentage increase from the first table.
Action | Rows | CPU (AVG) | % Diff | CPU (SUM) | % Diff | DUR (AVG) | % Diff | DUR (SUM) | % Diff |
---|---|---|---|---|---|---|---|---|---|
INSERT |
60000 |
0.149ms |
+7.9% | 8990ms | +8.5% |
6.860ms |
+23% | 411605ms | +23% |
UPDATE | 60000 | 0.089ms | +20% | 5364ms | +20% | 1.159ms | +51% | 69563ms | +51% |
Reviewing the Results
The results above were consistent across dozens of tests. Updates suffer the biggest impact. Before starting, I didn't expect a 51% increase in the duration. But would I describe an average of .076ms increasing to 1.15ms tanking performance? Not unless you need to keep the inserts under 1.0ms. Also, if you add a constraint that checks a dozen conditions, it will take more time.
Remember, adding a new column or foreign key causes performance to decrease when compared to a fresh table. Most databases grow, and with growth comes cost. If performance goes from 1ms to 1sec, you may have a problem. What's your experience with adding check constraints?
Key Points
- You add check constraints on one or more columns to enforce data rules. I advise handling this on the front end, but the back end works when you don't have control.
- You'll likely see a performance impact with any constraint, including foreign keys. However, if simple, it isn't significant. Also, what's the alternative?
- When you disable a check constraint and then re-enable it, SQL doesn't trust it anymore. But you can win back that trust.
Next Steps
- For another perspective on check constraints, check out Armando Prato's article, Enforcing business rules using SQL Server CHECK constraints.
- Have you disabled a constraint and want SQL Server to trust it again? Armando Prato wrote a short article titled, Avoid Untrusted Constraints in SQL Server.
- Ben Snaidero wrote SQL Server Performance Testing for Check Constraint vs Foreign Key. It compares the performance of check constraints and foreign keys.
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips
Article Last Updated: 2023-12-07