How to make SQL Server Wildcard Searches Faster

By:   |   Updated: 2018-09-13   |   Comments (11)   |   Related: > TSQL


Problem

In SQL Server, we often need to search character fields using the percentage (%) wildcard.  I see when I put the wildcard at the end of the search string (String%) it uses the index, but if I put it the front (%String) it does a scan of the index.  How can I make this search (%String) more efficient?

Solution

Having the wildcard (%) at the end of the string when searching on uncertain characters is not as challenging as having the wildcard at the beginning of the search string. For example, searching on a character field with 'abc%' is not an issue if the column has an index, but searching '%abc' is always slow because it does an index scan. However, there are many situations where we need to write queries with '%abc'.

In this tip, we will look at one way this can be resolves to make queries faster than ever before.

SQL Server Wildcard Searches Using %

For example, say you have a table named Employee and you want to find all the rows where the name starts with 'Aaron'. In this case, you would write a query as follows:

SELECT * 
FROM Employee 
WHERE name LIKE 'Aaron%'

Similarly, you can find all the employees with the name ending with certain characters, as follows:

SELECT * 
FROM Employee 
WHERE name LIKE '%Allen'

You can also find all the employees where the name has letters 'Co' anywhere in the name. The query would look like below:

SELECT * 
FROM Employee 
WHERE name LIKE '%Co%'

Assuming the Employee table has an index on the Name column, you will quickly see the last two queries are slower than the first one because they do an Index Scan and instead of an Index Seek.

The first query is faster because the WHERE condition is Sargable. However, the 2nd and 3rd queries are not Sargable, hence those queries could not leverage the index on Name. As a result, the 2nd and 3rd queries are slower than the 1st query.

How to make SQL Server Wildcard Searches Using % Faster

Let's create a sample data set using the AdventureWorks2016 database.  Let's call this new table customer.

USE AdventureWorks2016
GO

CREATE TABLE customer 
(
   customer_id int identity(1,1) primary key,
   cust_name varchar(100),
   cust_name_reverse varchar(100)
);

CREATE INDEX idx_cust_name ON customer(cust_name);
CREATE INDEX idx_cust_name_reverse ON customer(cust_name_reverse, cust_name);

The customer table has three columns including a primary key for the customer_id. I intentionally added the third column to contain the customer name in reverse order. We will need that column later.

The script below populates the table using the data from the [Person].[Person] table in the AdventureWorks2016 database.

INSERT INTO customer
SELECT FirstName + ' ' + LastName , REVERSE(FirstName + ' ' + LastName)
FROM [Person].[Person]

Note that the INSERT query uses the REVERSE function to generate the customer name in reverse order.

Also, note that we have indexes on both cust_name and cust_name_reverse columns.

Slower SQL Server Wildcard Search

Let's do a search to find out all the customers with their names ending with 'Abercrombie'. See the below query:

SELECT cust_name FROM customer 
WHERE cust_name LIKE '%Abercrombie'

Here are the results:

Query result

The execution plan does an Index Scan.

Non-sargable wildcard search

This table has 19,972 records and the query returns only three records. It does an Index Scan, meaning the entire index of cust_name had to be scanned to find just three records which is not good in terms of performance.

The query above cannot be further improved.

Faster SQL Server Wildcard Search

We could use an alternative method to get the same result more efficiently with an index seek. See the alternative query below:

SELECT cust_name FROM customer 
WHERE cust_name_reverse LIKE REVERSE('Abercrombie') + '%'

Here are the results:

Query result

The execution plan does an Index Seek.

Execution plan - Sargable wildcard search

This time, we used cust_name_reverse column which has the customer name in reverse order. Also, notice the percentage sign (%) is at the end of the search string. Basically, we have converted our query to make it Sargable as 'abc%'.

Comparing the SQL Server Wildcard Queries

Let's execute both queries at once and compare the execution plans and cost comparison.

We can see the first query that does an Index Scan is 96% of the batch and the second query that does an Index Seek is 4% of the batch.

Execution plan comparison - Non-sargable and sargable wildcard search

It's great! The alternative approach is a big improvement.

Summary

The drawback of this approach is that you need to have an additional column to store the reverse order of the string, but if you are designing the database system, you can add the additional column if it is something that will be used frequently with wildcard searches where "%" is at the beginning of the string. The performance gain with this approach could be bigger than the additional storage it uses. It's your choice!

Next Steps


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Susantha Bathige Susantha Bathige currently works at Pearson North America as a Production DBA. He has over ten years of experience in SQL Server as a Database Engineer, Developer, Analyst and Production DBA.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips


Article Last Updated: 2018-09-13

Comments For This Article




Monday, June 12, 2023 - 5:54:51 AM - Boris Back To Top (91273)
These two approaches are not equivalent because using OR does not select equivalent set of data, Data in column has to BEGIN or END with given strings so query does not really search in the middle in text. Example: Peter Novak, search parameter Nov does not search anything.

Monday, November 9, 2020 - 9:48:05 AM - Greg Robidoux Back To Top (87775)
Hi Josh, if you use '%Co%' this won't use the index, so it will need to scan through all of the data and will not be any faster.

-Greg

Sunday, November 8, 2020 - 11:16:31 AM - josh Back To Top (87773)
You mentioned searching: '%Co% , but I can't tell if you improved it ?

Thursday, January 17, 2019 - 12:01:46 PM - Susantha Bathige Back To Top (78803)

I do not think, it will faster than the single SELECT. But by doing a simple test, it is easy to find it. Thanks. 


Thursday, January 17, 2019 - 4:03:34 AM - flash Back To Top (78792)

Supposing I want to find %string% , is it still faster to perform two Selects ( one for string% and another string% for 'reversed' column)?


Saturday, September 29, 2018 - 4:55:01 AM - Hasitha Kanchana Back To Top (77769)

very useful trick! Thank you Susantha!


Monday, September 17, 2018 - 5:16:30 AM - Bassam Abdelaal Back To Top (77617)

 Man , this can never cross my wildest imagination !! very nice idea

thanks

 

 


Friday, September 14, 2018 - 10:43:25 AM - Susantha Back To Top (77566)

 Hi Thomas,

Great point!


Friday, September 14, 2018 - 2:42:53 AM - Thomas Franz Back To Top (77557)

 It would be better, to create a computed column for the reverse column (cust_name_reverse AS REVERSE(cust_name)). Since the cust_name_reverse will usually not be selected, there is no need to add the PERSISTED keyword to the column definition.

This way you don't have to worry about inconsitent data, because someone run somewhere an update without manually adjusting the cust_name_reverse and / or do not need triggers to prevent this. Computed columns can be indexed too (and will be persisted in the index).


Thursday, September 13, 2018 - 9:59:42 PM - Alex Diaz Back To Top (77554)

 Awesome! Thank you.


Thursday, September 13, 2018 - 10:37:26 AM - Erika Romero Back To Top (77540)

Great Tip. Thanks Susantha.















get free sql tips
agree to terms