By: Jim Evans | Updated: 2019-10-23 | Comments (8) | Related: > TSQL
Problem
You want to compare the results from two similar SQL Server T-SQL queries to see if the results are identical. Possibly, you have a complex query that returns many rows that you want to modify to simplify or optimize or for any other reason. You want to ensure the new result set is identical to the original result set. This require more than just comparing row counts. Saving the data to a file and using a compare tool is one way. However, you can use the power of SQL Server to do the compare to let you know if the results including all column values, are identical or not.
Solution
This tip will use Microsoft's sample database WideWorldImporters. I will show you how to quickly check the results from 2 queries to see if they are identical.
First Query – Overly complex
I mocked up an overly complex query that returns 20,895 rows. This query has subqueries in the main SELECT and the WHERE clause. Though I made up this example it is common to come across overly complex queries like this.
USE WideWorldImporters; GO SELECT DISTINCT ISNULL((SELECT c.ColorName FROM [Warehouse].[Colors] c WHERE c.ColorID = si.ColorID),pt.PackageTypeName) AS 'ColorOrType', ISNULL(pt.PackageTypeName,(SELECT c.ColorName FROM [Warehouse].[Colors] c WHERE c.ColorID = si.ColorID)) AS 'TypeOrColor', si.StockItemID, si.StockItemName, pt.PackageTypeName, (SELECT c.ColorName FROM [Warehouse].[Colors] c WHERE c.ColorID = si.ColorID) AS ColorName, CASE WHEN p.SearchName LIKE '% H%' THEN SUBSTRING(p.SearchName,1,CHARINDEX(' ',p.SearchName)) ELSE 'XXXXXX' END AS 'FirstName', TRIM(SUBSTRING(TRIM(SUBSTRING(p.SearchName, CHARINDEX(' ',p.SearchName), 50)), CHARINDEX(' ',p.SearchName), 50) ) AS 'LastName', il.* FROM Warehouse.StockItems AS si LEFT JOIN [Warehouse].[PackageTypes] as pt ON pt.PackageTypeID = si.OuterPackageID LEFT JOIN [Sales].[InvoiceLines] as il ON il.StockItemID = si.StockItemID LEFT JOIN [Sales].[Invoices] as i ON i.InvoiceID = il.InvoiceID LEFT JOIN [Application].[People] p ON p.PersonID = i.SalespersonPersonID WHERE (pt.PackageTypeName NOT like '__ch' OR EXISTS (SELECT c.ColorName FROM [Warehouse].[Colors] c WHERE c.ColorID = si.ColorID AND c.ColorName NOT like '%K')) AND il.InvoiceID IN (SELECT i.InvoiceID From [Sales].[Invoices] i INNER JOIN [Application].[People] p ON p.PersonID = i.SalespersonPersonID WHERE p.IsSalesperson = 1 AND p.SearchName LIKE '% H[io]%')
GO
Second Query - Simplified
After inspecting the first query and working to simplify it by making the following changes:
- LEFT JOIN to [Warehouse].[Colors] used vs 4 Subquery
- Switching [Application].[People] p.SearchName to P.FullName and simplifying the Substring function and Case Statement
- Changed the [Sales].[Invoices] Subquery to INNER JOIN
- Changing LEFT JOINS to INNER JOINS and removing Select DISTINCT
This query also returns 20,895 rows! However, are the result sets identical?
USE WideWorldImporters; GO SELECT ISNULL(c.ColorName,pt.PackageTypeName) AS 'ColorOrType', ISNULL(pt.PackageTypeName,c.ColorName) AS 'TypeOrColor', si.StockItemID, si.StockItemName, pt.PackageTypeName, c.ColorName, CASE WHEN P.FullName LIKE '% H%' THEN SUBSTRING(P.FullName,1,CHARINDEX(' ',P.FullName)) ELSE 'XXXXXX' END AS 'FirstName', TRIM(SUBSTRING(p.FullName,CHARINDEX(' ',p.FullName), 50)) AS 'LastName', il.* FROM Warehouse.StockItems AS si LEFT JOIN [Warehouse].[Colors] as c on c.ColorID = si.ColorID INNER JOIN [Warehouse].[PackageTypes] as pt ON pt.PackageTypeID = si.OuterPackageID INNER JOIN [Sales].[InvoiceLines] as il ON il.StockItemID = si.StockItemID INNER JOIN [Sales].[Invoices] as i ON i.InvoiceID = il.InvoiceID INNER JOIN [Application].[People] p ON p.PersonID = i.SalespersonPersonID WHERE (pt.PackageTypeName <> 'Each' OR c.ColorName <> 'Black') AND p.IsSalesperson = 1 AND p.FullName LIKE '% H%'
GO
Comparing the Results of the Two Queries
The solution to this is very simple. Run both queries using a UNION to combine the results!
The UNION operator returns unique records. If the two results sets are identical the row count will remain the same as the original query. If any rows do not match the original query those rows will be included in the result set thus the total rows returned by the UNIONed queries will exceed the row count of the original.
In this case the result sets match, so I can feel confident that the new optimized simplified query is a better solution!
USE WideWorldImporters; GO --Example 1: Overly complex query that returns 20,895 rows SELECT DISTINCT ISNULL((SELECT c.ColorName FROM [Warehouse].[Colors] c WHERE c.ColorID = si.ColorID),pt.PackageTypeName) AS 'ColorOrType', ISNULL(pt.PackageTypeName,(SELECT c.ColorName FROM [Warehouse].[Colors] c WHERE c.ColorID = si.ColorID)) AS 'TypeOrColor', si.StockItemID, si.StockItemName, pt.PackageTypeName, (SELECT c.ColorName FROM [Warehouse].[Colors] c WHERE c.ColorID = si.ColorID) AS ColorName, CASE WHEN p.SearchName LIKE '% H%' THEN SUBSTRING(p.SearchName,1,CHARINDEX(' ',p.SearchName)) ELSE 'XXXXXX' END AS 'FirstName', TRIM(SUBSTRING(TRIM(SUBSTRING(p.SearchName, CHARINDEX(' ',p.SearchName), 50)), CHARINDEX(' ',p.SearchName), 50) ) AS 'LastName', il.* FROM Warehouse.StockItems AS si LEFT JOIN [Warehouse].[PackageTypes] as pt ON pt.PackageTypeID = si.OuterPackageID LEFT JOIN [Sales].[InvoiceLines] as il ON il.StockItemID = si.StockItemID LEFT JOIN [Sales].[Invoices] as i ON i.InvoiceID = il.InvoiceID LEFT JOIN [Application].[People] p ON p.PersonID = i.SalespersonPersonID WHERE (pt.PackageTypeName NOT like '__ch' OR EXISTS (SELECT c.ColorName FROM [Warehouse].[Colors] c WHERE c.ColorID = si.ColorID AND c.ColorName NOT like '%K')) AND il.InvoiceID IN (SELECT i.InvoiceID From [Sales].[Invoices] i INNER JOIN [Application].[People] p ON p.PersonID = i.SalespersonPersonID WHERE p.IsSalesperson = 1 AND p.SearchName LIKE '% H[io]%') UNION -- Use UNION to Ensure the result set are identical! --Example 2: Clean up Query SELECT ISNULL(c.ColorName,pt.PackageTypeName) AS 'ColorOrType', ISNULL(pt.PackageTypeName,c.ColorName) AS 'TypeOrColor', si.StockItemID, si.StockItemName, pt.PackageTypeName, c.ColorName, CASE WHEN P.FullName LIKE '% H%' THEN SUBSTRING(P.FullName,1,CHARINDEX(' ',P.FullName)) ELSE 'XXXXXX' END AS 'FirstName', TRIM(SUBSTRING(p.FullName,CHARINDEX(' ',p.FullName), 50)) AS 'LastName', il.* FROM Warehouse.StockItems AS si LEFT JOIN [Warehouse].[Colors] as c on c.ColorID = si.ColorID INNER JOIN [Warehouse].[PackageTypes] as pt ON pt.PackageTypeID = si.OuterPackageID INNER JOIN [Sales].[InvoiceLines] as il ON il.StockItemID = si.StockItemID INNER JOIN [Sales].[Invoices] as i ON i.InvoiceID = il.InvoiceID INNER JOIN [Application].[People] p ON p.PersonID = i.SalespersonPersonID WHERE (pt.PackageTypeName <> 'Each' OR c.ColorName <> 'Black') AND p.IsSalesperson = 1 AND p.FullName LIKE '% H%' GO
Conclusion – Reviewing the Results
Now you know the secret to quickly use the power of SQL Server to compare two query results to ensure they are identical.
I have often used this technique when optimizing queries and stored procedures. It is much easier and faster than extracting the result set to another tool. The SQL Engine is very powerful and can compare millions of rows with a UNION very quickly. Using the power of SQL Server is the way to go!
If the results set row counts match you are golden. However, if the result sets do not match in row counts it may be difficult to find the rows that do not match especially with large result sets. Some tips to find the rows that do not match include running both result sets with output to grid (in SQL Server Management Studio), trying different sort orders on rows such as dates, IDs, names, numbers like invoice numbers or order numbers. Inspect the 2 result sets and look at both ends of the result set looking for the unmatched rows. Also, you can comment out some of the columns from both queries until you do get the query results to match, then use process of elimination to figure out the columns that do not match.
What if the you want to compare the results of queries or complex stored procedure or those that do inserts or updates? In these cases, during testing add a Global Temporary Table (##temp) to hold the result sets by replacing the INSERT or UPDATE with a SELECT INTO ##temp statement or adding an Output Clause to the INSERT or UPDATE Statement. After executing both stored procs, compare the temporary tables with a UNION to ensure the record counts match the original query.
Note: In stored procedures, to compare the results, you must use global temporary tables because local temporary tables are dropped after a store procedure executes.
Next Steps
- To find the free WorldWideImports sample database do a web search on “WideWorldImporters download”
- Read more about temporary tables
- Learn about Local vs Global SQL Server Temporary Tables
- Read more on UNION vs. UNION ALL in SQL Server
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips
Article Last Updated: 2019-10-23