By: Eric Blinn | Updated: 2024-08-29 | Comments | Related: > TSQL
Problem
I've seen the EXISTS keyword in Microsoft SQL Server T-SQL code and don't understand it well. What does it do? How do I use it? Are there best practices around SQL EXISTS?
This SQL tutorial will explain what the keyword EXISTS does and show several different use cases.
Solution
The EXISTS keyword is a Boolean function that returns either true or false. Since it is a function, it expects a parameter within a set of parentheses (…). The single parameter accepted by EXISTS is a SELECT statement. The function will return TRUE if the SELECT statement parameter returns at least 1 row and FALSE if exactly 0 rows are returned.
EXISTS is most commonly used as an argument in IF statements, WHILE loops, and WHERE clauses. While it can be used in JOIN predicates, this is exceedingly rare.
The demos in this tip utilize the WideWorldImporters sample SQL database, which can be downloaded for free from Github. All demos are shown using SQL Server Management Studio (SSMS) and SQL Server 2022, but the information in this tip is valid going back multiple versions of SQL Server.
Calling the EXISTS Function
Consider this SELECT statement. It should return at least four rows on most SQL Server installations and perhaps two rows on Azure DBaaS instances. It should never return 0 rows.
SELECT * FROM sys.databases
Since we know that statement contains several rows, what happens if we paste that query within the parameter section of an EXISTS function? Notice that since the EXISTS function naturally returns a Boolean value (true or false), there is no need for a comparison operator. There is no reason to include an "= 1" or "= TRUE" in the argument.
IF EXISTS(SELECT * FROM sys.databases) PRINT 'EXISTS evaluated to true' ELSE PRINT 'EXISTS evaluated to false'
We would expect the output to be something like this in every SQL query.
What if the SELECT statement returned exactly one row? This script should return the same as above since the EXISTS function considers one row or 1 million rows as TRUE. Only queries that produce 0 rows produce a FALSE.
IF EXISTS(SELECT * FROM sys.databases WHERE name = 'master') PRINT 'EXISTS evaluated to true' ELSE PRINT 'EXISTS evaluated to false'
This is an example of EXISTS with a query that returns 0 rows. It is a perfectly valid query that produces an empty result set. For the first time, we see the FALSE output.
IF EXISTS(SELECT * FROM sys.databases WHERE database_id = -1) PRINT 'EXISTS evaluated to true' ELSE PRINT 'EXISTS evaluated to false'
Now consider this query. On my system, this returns exactly one row and exactly one column that has a NULL value.
SELECT source_database_id FROM sys.databases WHERE name = 'master';
What will EXISTS do with this? EXISTS is not concerned with the number of columns or the value of any column. It is only interested in the existence or lack thereof of any rows. As such, this will also be evaluated to be true.
IF EXISTS(SELECT source_database_id FROM sys.databases WHERE name = 'master') PRINT 'EXISTS evaluated to true' ELSE PRINT 'EXISTS evaluated to false'
Next, we will look at a stored procedure. This may not evaluate on Azure DBaaS but should work on any full instance of SQL Server.
exec msdb.dbo.sp_help_job;
This returns a long list of SQL Server Agent jobs on my laptop.
What will EXISTS think of this? In short, it won't like this at all. Attempting to run this query will result in a syntax error. EXISTS expects a SELECT statement, not just a T-SQL statement, that could potentially return rows.
Finally, what about a variable? Once again, this produces a syntax error. You cannot simply send a variable to the EXISTS function.
I'm not sure what the use case would be here, but putting SELECT in front of the variable returns a single-row result set and causes this script to return TRUE.
DECLARE @MSSQLTips INT; IF EXISTS(SELECT @MSSQLTips) PRINT 'EXISTS evaluated to true' ELSE PRINT 'EXISTS evaluated to false'
EXISTS in WHILE Loops
Each of the prior examples shows EXISTS as part of an IF statement. It is also common to use them in WHILE loops. This method is an easy way to perform one task per row in a query without using a cursor. While this code is easy to follow and understand, it should not replace set-based coding. If the action to be completed per row can be done with an insert/update/delete statement, then that would be preferred.
This example calls a stored procedure, sp_send_dbmail, from MSDB once per row.
--!!!! Be careful running this script!!!! -- It does attempt to call sp_send_dbmail. -- --From MSSQLTips.com DECLARE @CustomerID INT; DECLARE @Email NVARCHAR(256); WHILE EXISTS(SELECT * FROM Sales.Customers WHERE IsStatementSent = 0) BEGIN SELECT TOP 1 @CustomerID = CustomerID , @Email = EmailAddress FROM Sales.Customers SC INNER JOIN [Application].People P -- SQL JOIN ON SC.PrimaryContactPersonID = P.PersonID WHERE IsStatementSent = 0; EXEC msdb.dbo.sp_send_dbmail @recipients = @Email, @subject = 'Please pay us'; UPDATE Sales.Customers -- UPDATE statement SET IsStatementSent = 1 WHERE CustomerID = @CustomerID; END;
EXISTS in a WHERE Clause
Using an EXISTS function call in a WHERE clause is probably the most common use case. The function will work exactly the same as in each earlier example, but there is one noticeable change. The subquery will almost always reference a column in a table that is otherwise out of the scope of the subquery. This can be difficult to understand, so we will jump right into an example.
SELECT * FROM Sales.Invoices SI WHERE EXISTS(SELECT * FROM Sales.Customers SC WHERE SC.CustomerCategoryID = 4 AND SC.CustomerID = SI.CustomerID);
In this example, the subquery has only one table listed in the FROM clause, Sales.Customers, aliased as SC. Yet the WHERE clause of that subquery references a column, SI.CustomerID, that could only come from the main query outside of the parentheses. This can initially seem counter-intuitive, but this is exactly how it should work.
While the optimizer treats it differently, you can think of it as calling that subquery repeatedly -- once for every row in the main query. An EXISTS function call in a WHERE clause that did not include a reference to the primary query would only have to run once and would either evaluate as true or false for the entire rowset of the primary query.
When to use EXISTS in WHERE Clauses
The example query above could easily have been written with an IN statement, as seen below. In fact, modern SQL Server engines will usually compile a query like this (with IN) and the one above (with EXISTS) exactly the same way. Deciding between the two is largely a style-based decision.
SELECT * FROM Sales.Invoices SI WHERE SI.CustomerID IN (SELECT SC.CustomerID FROM Sales.Customers SC WHERE SC.CustomerCategoryID = 4);
In that demo query, there is only one matching column between the primary and subqueries, so replacing EXISTS with an IN statement is pretty easy. However, if the comparison includes more than one shared column, the IN statement won't work, as seen in the following query.
SELECT * FROM Sales.Invoices SI WHERE EXISTS(SELECT * FROM Sales.Customers SC WHERE SC.CustomerCategoryID = 4 AND SC.CustomerID = SI.CustomerID AND SC.DeliveryMethodID = SI.DeliveryMethodID);
A query such as this is a great example where EXISTS makes sense.
Final Thoughts
EXISTS is a more advanced function that not every T-SQL writer uses. But knowing what it is and how it works is still an important step in the journey to becoming an expert in the field.
Next Steps
- SQL Server WHERE Clause Basics
- SQL Server IN Operator
- Performance Comparison of IN vs EXISTS
- Replace a WHILE Loop with Set Based Logic
- JOINs - SQL Joins Example, SQL LEFT JOIN Examples and SQL RIGHT JOIN Examples
- NULLs - What does SQL NULL mean and SQL WHERE IS NOT NULL
- SQL Server Uncorrelated and Correlated Subquery
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips
Article Last Updated: 2024-08-29