By: Ben Snaidero
Overview
When using the LIKE operator and having the first character in your search string a wildcard character, either % or _, the SQL Optimizer will be forced to do a table/index scan when executing your query.
Explanation
Before we get into the details of our explanation let's first create an index on the column that we are going to use in the WHERE clause of our query. Here is the code to create that index on the Child table.
CREATE NONCLUSTERED INDEX idxChild_VarcharDataColumn ON [dbo].[Child] ([VarcharDataColumn]) -- cleanup statements --DROP INDEX Child.idxChild_VarcharDataColumn
So why does it have to perform a table/index scan? Since all SQL Server indexes are stored in a B-Tree structure when we begin our search criteria with a wildcard character the optimizer is not able to use an index to perform a seek to find the data quickly. It either performs a scan of the table or a scan of an index if all the columns required for the query are part of the index. Now I understand that there are some cases where this would not be possible based on your requirements but the following example shows why you should try to avoid doing this whenever it's possible. Let's write a simple query that performs a search on the column we indexed above. Here is the code for this simple SQL statement.
SELECT * FROM [dbo].[Child] WHERE VarcharDataColumn LIKE '%EST5804%'
Looking at the explain plan for this query we can see that the index on the VarcharDataColumn that we created is ignored and a clustered index scan (essentially a table scan) has to be performed.
Now let's change the search string in this query to remove the wildcard so the string you are searching for begins with a valid character. Here is the updated SQL statement. Note: I picked the search criteria so that both queries return the same result set so that the results are not skewed by one query returning a larger result set.
SELECT * FROM [dbo].[Child] WHERE VarcharDataColumn LIKE 'TEST5804%'
Looking at the explain plan for this query we can see that the optimizer is now using the index we created and performs a seek rather than a scan.
Although we should be able to tell from just comparing the explain plans that the second query will perform better let's just confirm that it indeed uses less resources and executes faster than our initial query by looking at the SQL Profiler results. We can see from below that by removing the wildcard character from the start of the query we do in fact see quite a big improvement.
CPU | Reads | Writes | Duration | |
---|---|---|---|---|
Wildcard at Start | 328 | 7042 | 0 | 404 |
No Wildcard at Start | 0 | 670 | 0 | 64 |
Additional Information
- Investigate using Full Text Search as an alternative to "LIKE '%abc'"
Last Update: 2/17/2014