Removing Function Calls for Better Performance in SQL Server

By: Ben Snaidero | Updated: 2012-07-10 | Comments (11) | Related: > Performance Tuning

Problem

Everyone knows it's best not to use function calls in your WHERE clause as it will affect your performance (if you did not know this already, this article will convince you not to), but what about using them say in the SELECT list. This tip will look at how removing function calls from the SELECT list can dramatically improve performance, especially when returning larger data sets i.e. search query results.

Solution

Sample SQL Server Table and Functions

For this example we will setup two sample tables and two functions that access these tables. As for populating the tables you will have to use a tool like Visual Studio to populate them with some reasonably realistic data so as not to skew the results. In my case I populated each table with about 200,000 records. One thing to note with these examples is that there is a requirement for the functions to only return one record per sale, even if there is actually multiple buyers.

Here is the DDL code:

-- Table creation logic
CREATE TABLE [dbo].[CarSale](
 [CarSaleID] [int] IDENTITY(1,1) NOT NULL,
 [PurchaseDate] [smalldatetime] NOT NULL,
    CONSTRAINT [PK_CarSale] PRIMARY KEY CLUSTERED ([CarSaleID] ASC)
);

CREATE TABLE [dbo].[Buyer](
 [BuyerID] [int] IDENTITY(1,1) NOT NULL,
 [CarSaleID] [int] NOT NULL,
 [LastName] [varchar](50) NULL,
 [FirstName] [varchar](100) NULL,
 [CompanyName] [varchar](200) NULL,
 CONSTRAINT [PK_Buyer] PRIMARY KEY NONCLUSTERED ([BuyerID] ASC)
);

ALTER TABLE [dbo].[Buyer]  WITH CHECK ADD CONSTRAINT [FK_Buyer_CarSale] FOREIGN KEY([CarSaleID])
REFERENCES [dbo].[CarSale] ([CarSaleID]) ON UPDATE CASCADE ON DELETE CASCADE;

CREATE CLUSTERED INDEX [IX_Buyer_CarSalelID] ON [dbo].[Buyer]([CarSaleID] ASC);

-- Function creation logic
CREATE FUNCTION [dbo].[fnGetBuyerFirstName]
(@CarSaleID INT)
RETURNS VARCHAR (500)
AS
BEGIN
RETURN (SELECT Top 1 FirstName
FROM Buyer
WHERE CarSaleID= @CarSaleID
ORDER BY BuyerID)
END
GO

CREATE FUNCTION [dbo].[fnGetBuyerLastName]
(@CarSaleID INT)
RETURNS VARCHAR (500)
AS
BEGIN
RETURN (SELECT Top 1 coalesce(LastName,CompanyName)
FROM Buyer
WHERE CarSaleID= @CarSaleID
ORDER BY BuyerID)
END
GO

Original Query Example

Our original query is pretty straightforward. It does a simple SELECT and uses two functions to retrieve the first and last name from the other table.

Here is the code:

SELECT cs.PurchaseDate,
       dbo.fnGetBuyerFirstName(cs.CarSaleID),
       dbo.fnGetBuyerLastName(cs.CarSaleID)
FROM CarSale cs
ORDER BY CarSaleID;

Looking at the explain plan for our original query we can see that for each record returned by the query we are going to call these functions and basically do a lookup on the Buyer table twice (once for each function). This is not going to be very efficient if the CarSale table contains a large number of records.

Even if we restrict our query to a very specific WHERE clause and retrieve only one record and check the explain plan for this updated query, we still see that it's going to have to do two lookups on the Buyer table.

Below is the code for the updated query:

SELECT cs.PurchaseDate,
       dbo.fnGetBuyerFirstName(cs.CarSaleID),
       dbo.fnGetBuyerLastName(cs.CarSaleID)
FROM CarSale cs
WHERE CarSaleID=5
ORDER BY CarSaleID;

One thing to note here is that this is the best case scenario as this query is only going to return one record. Any query with a wider WHERE clause returning more data will become slower as the data set returned grows.

Query Update to Remove Function Calls Example

Now let's try removing the function calls and getting our query result using a table join and see what happens. Here are both new queries, the first without a WHERE clause and the second with one.

SELECT cs.PurchaseDate,FirstName,LastName
FROM CarSale cs 
INNER JOIN (SELECT CarSaleID,MIN(BuyerID) AS SingleBuyerID FROM Buyer GROUP BY CarSaleID) m2 
          ON cs.CarSaleID=m2.CarSaleID
INNER JOIN Buyer m ON m2.CarSaleID=cs.CarSaleID AND m2.SingleBuyerID=m.BuyerID
ORDER BY cs.CarSaleID;


SELECT cs.PurchaseDate,FirstName,LastName
FROM CarSale cs 
INNER JOIN (SELECT CarSaleID,MIN(BuyerID) AS SingleBuyerID FROM Buyer GROUP BY CarSaleID) m2 
          ON cs.CarSaleID=m2.CarSaleID
INNER JOIN Buyer m ON m2.CarSaleID=cs.CarSaleID AND m2.SingleBuyerID=m.BuyerID
WHERE cs.CarSaleID=5
ORDER BY cs.CarSaleID;

Looking at the explain plans for the query without the WHERE clause and the query with the WHERE clause we can see that in both cases the new explain plans should perform better since we no longer have to perform a lookup for each record returned by the query. This is now handled by the MERGE join.

To confirm this and see just how much improvement we get from removing the function calls I ran a SQL Profiler trace and ran each of these queries. Here are the results:

Query	WHERE Clause	CPU (ms)	Reads	Duration
Original	NO	10734	1239655	25879
Original	YES	0	9	0
No Function Call	NO	578	16337	2457
No Function Call	YES	0	11	0

Looking at the SQL Profiler trace results we can see that by removing the function calls we get a considerable benefit when the result set of the query is large in all resource areas, read, cpu and total duration. When just returning a single record though the original query actually performs slightly better. Again, this WHERE clause is a best case scenario, as the data set returned grows the query without function calls will start to outperform the one with them.

Final Update Adding Common Table Expression (CTE) Example

Because we have this requirement from the function for only returning a single buyer we can use a CTE to further optimize this query.

Here is the code both with and without the WHERE clause:

WITH summary AS (SELECT CarSaleID,            
                        BuyerID,           
                        FirstName,
                        LastName,
                        ROW_NUMBER() OVER(PARTITION BY CarSaleID ORDER BY CarSaleID) AS rk      
                 FROM Buyer) 
SELECT PurchaseDate,s.FirstName,s.LastName FROM CarSale cs INNER JOIN summary s  
ON s.CarSaleID=cs.CarSaleID WHERE s.rk = 1;


WITH summary AS (SELECT CarSaleID,            
                        BuyerID,           
                        FirstName,
                        LastName,
                        ROW_NUMBER() OVER(PARTITION BY CarSaleID ORDER BY CarSaleID) AS rk      
                 FROM Buyer) 
SELECT PurchaseDate,s.FirstName,s.LastName FROM CarSale cs INNER JOIN summary s  
ON s.CarSaleID=cs.CarSaleID WHERE s.rk = 1 AND cs.CarSaleID=5;

After generating the explain plans for both of these queries, no WHERE clause and WHERE clause, it's easy to see how much more streamlined these two queries are compared to the originals. As well, looking at the trace results below we can see that this version of the query outperforms all previous examples both with and without the WHERE clause.

Query	WHERE Clause	CPU (ms)	Reads	Writes	Duration
No Function Call add WITH statement	NO	266	15796	0	1931
No Function Call add WITH statement	YES	0	6	0	0

Summary

I would agree with anyone that wanted to argue the first example is much easier to read and understand, but given the performance improvements we get by removing these function calls I think it's pretty obvious that it's worth eliminating them and saving on valuable CPU and I/O cycles.

Next Steps

How to read SQL Server Explain Plans
Other common uses for Common Table Expressions (CTE)
More information on ROW_NUMBER() and RANK() functions

About the author

Ben Snaidero has been a SQL Server and Oracle DBA for over 10 years and focuses on performance tuning.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips

Article Last Updated: 2012-07-10

Tuesday, July 10, 2012 - 9:46:31 AM - Dom	Back To Top (18422)
Good article. But I think in the first code-listing, you meant to give a CREATE FUNCTION for fnGetBuyerFirstName. Instead, you just gave two copies of fnGetBuyerLastName.