SQL Local Variables and Performance Issues

By: Jared Westover   |   Updated: 2024-07-31   |   Comments (6)   |   Related: > Performance Tuning


Problem

Often, developers use local variables when writing ad hoc queries or stored procedures for many reasons. You might hear "never repeat code" or "avoid using magic numbers." While writing a lengthy stored procedure, I might include a few variables. However, did you know that local variables can hurt the performance of your queries? How can you keep local variables from negatively affecting performance? Keep reading to find those answers and more.

Solution

In this article, I'll examine the effect of local variables on your T-SQL queries. The primary questions I'll answer are: what's causing the performance issue, and how can you tell it's happening? We'll also examine a few ways to work around it. By the end of this article, you'll better understand how local variables affect your queries' performance. Share this article the next time someone notices a performance problem with local variables and is left scratching their head.

Exploring Local Variables

If you've been in the SQL game for any length of time, I'm sure you've used a variable. Microsoft defines a variable as an object that can hold a specific data value of a specific type. The code below declares a variable and populates it with a string value. The SELECT statement then returns the value.

-- mssqltips.com
DECLARE @message VARCHAR(100) = 'Hello! World';
SELECT @message;

-- Declaring the variable, then assigning it with SET
DECLARE @message2 NVARCHAR(100);
SET @message2 = 'Hello World!';
SELECT @message2;

Results:

----------------
Hello! World
(1 row affected)

----------------
Hello World!
(1 row affected)

Unlike a temporary table, a variable only exists until the batch completes. If you add a GO batch separator, the variable vanishes. The code below returns an error stating the variable doesn't exist; in SSMS, the second reference to the variable also gets a red squiggly underline, which is never a good sign of things to come.

-- mssqltips.com
DECLARE @message NVARCHAR(100) = 'Hello! World';
SELECT @message;
GO

SELECT @message;

Results:

Msg 137, Level 15, State 2, Line 7
Must declare the scalar variable "@message".
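
For contrast, here's a minimal sketch showing that a temporary table survives the batch boundary where the variable did not:

-- mssqltips.com
CREATE TABLE #message (msg NVARCHAR(100));
INSERT INTO #message VALUES (N'Hello! World');
GO

-- The temp table still exists after GO; the variable above did not.
SELECT msg FROM #message;
DROP TABLE #message;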

Also, a BEGIN/END block doesn't limit a variable's scope: a variable declared inside one still exists after the block ends. For example, when executing the code below, the second statement still returns the value.

-- mssqltips.com
BEGIN
 DECLARE @varMessage NVARCHAR(100) = 'Hello! World';
 SELECT @varMessage;
END

SELECT @varMessage;

I often add a variable in a stored procedure or ad hoc query when calling the GETDATE() function. Instead of calling GETDATE() multiple times in the query body, I declare a variable once, capture GETDATE() in it, and reference the variable from then on. One reason to do this is so the date and time stay consistent across every reference.

-- mssqltips.com
DECLARE @varDemoTable TABLE (Id INT, CreatedDate DATETIME);
DECLARE @varCurrentDate DATETIME = GETDATE();

INSERT INTO @varDemoTable (Id, CreatedDate)
VALUES
(1, @varCurrentDate),
(2, @varCurrentDate),
(3, @varCurrentDate);

SELECT Id,
       CreatedDate 
 FROM @varDemoTable;

Results:

Id          CreatedDate
----------- -----------------------
1           2024-06-21 03:18:29.980
2           2024-06-21 03:18:29.980
3           2024-06-21 03:18:29.980

The Downside

It's common for developers to use variables instead of literal values in a WHERE clause. Overall, this practice is a great idea. You start by declaring the variable and then populate it towards the top of the query. This method is handy if you reference the variable multiple times or if you populate the variable from a row value, as in the example below.

-- mssqltips.com
DECLARE @colors TABLE
(
    Id INT,
    Name VARCHAR(20)
);
INSERT INTO @colors
(
    Id,
    Name
)
VALUES
(1, 'Red'),
(2, 'Blue'),
(3, 'Green')

DECLARE @color VARCHAR(20) = (SELECT Name FROM @colors WHERE Id = 2);

SELECT Id
FROM @colors
WHERE Name = @color;
GO

Results:

Id
---
2

However, as we'll see in the following sections, this practice can cause performance problems. Now, I'll say upfront that the downside is related to a specific pattern and shouldn't cause anyone to write off the utility of variables.

Demo Dataset

To illustrate the negative impact of variables, we need to build a larger dataset. The code below creates a database called LocalVariablesDemo, along with two tables. The first, a Numbers table, is then used to populate the Colors table.

-- mssqltips.com
USE master;

IF DATABASEPROPERTYEX('LocalVariablesDemo', 'Version') IS NOT NULL
BEGIN
    ALTER DATABASE LocalVariablesDemo SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
    DROP DATABASE LocalVariablesDemo;
END;
GO

CREATE DATABASE LocalVariablesDemo;

ALTER DATABASE LocalVariablesDemo SET RECOVERY SIMPLE;
GO

USE LocalVariablesDemo;
GO

DECLARE @upperBound INT = 5000000;

;WITH cteN (Number)
AS (SELECT ROW_NUMBER() OVER (ORDER BY s1.[object_id])
    FROM sys.all_columns AS s1
        CROSS JOIN sys.all_columns AS s2
   )
SELECT [Number]
INTO dbo.Numbers
FROM cteN
WHERE [Number] <= @upperBound;
GO


CREATE TABLE dbo.Colors
(
    Id INT IDENTITY(1, 1),
    FavoriteColor VARCHAR(200) NOT NULL,
    SomeRandomNumber DECIMAL(10, 2) NOT NULL,
    Person VARCHAR(10) NOT NULL,
    IsActive BIT NOT NULL,
    CONSTRAINT PK_ColorValue
        PRIMARY KEY CLUSTERED (Id)
);

INSERT INTO dbo.Colors
(
    FavoriteColor,
    Person,
    SomeRandomNumber,
    IsActive
)
SELECT CASE
           WHEN n.Number % 2 = 0 THEN
               'Green'
           WHEN n.Number % 1 = 0 THEN
               'Blue'
       END AS FavoriteColor,
       CASE
           WHEN n.Number % 2 = 0 THEN
               'Henry'
           WHEN n.Number % 1 = 0 THEN
               'Liam'
       END AS Person,
       (ABS(CHECKSUM(NEWID()) % (100 - 1 + 1)) + 1) AS SomeRandomNumber,
       CASE
           WHEN n.Number % 100000 = 0 THEN
               0
           ELSE
               1
       END AS IsActive
FROM dbo.Numbers n;

CREATE NONCLUSTERED INDEX IX_IsActive
ON dbo.Colors (IsActive)
INCLUDE
(
    SomeRandomNumber,
    Person
);
GO

SELECT COUNT(1) AS ColorCount,
       FavoriteColor,
       Person,
       IsActive
FROM dbo.Colors
GROUP BY FavoriteColor,
         Person,
         IsActive;

Let's look at the spread of data in our table. Notice in the screenshot below that we only have 50 records where IsActive equals 0. Large skews toward specific values are where things fall apart.

[Screenshot: grouped counts from dbo.Colors, showing only 50 rows where IsActive equals 0]

Now, let's run the query below without using a local variable and with the actual execution plan enabled.

-- mssqltips.com
SELECT COUNT(1) AS ColorCount,
       Person,
       FavoriteColor
FROM dbo.Colors
WHERE IsActive = 0
GROUP BY FavoriteColor,
         Person;

[Screenshot: execution plan with an accurate 50-row estimate]

Given the query and indexes, I anticipated the plan above. Below are three points to highlight:

  1. SQL estimates 50 rows, which matches the actual count.
  2. SQL performs a standard small memory grant. I set the minimum memory per query to 1024KB.
  3. The query takes no time at all, according to the QueryTimeStats.

It's time for things to get interesting. I'll rerun the query, but this time with a local variable.

-- mssqltips.com
DECLARE @inactive BIT = 0

SELECT COUNT(1),
       Person,
       FavoriteColor
FROM dbo.Colors
WHERE IsActive = @inactive
GROUP BY FavoriteColor,
         Person;

[Screenshot: execution plan with an inaccurate 2.5 million-row estimate]

Several new things are happening in the plan:

  1. SQL estimates there are 2.5 million rows where IsActive equals 0.
  2. The CPU and elapsed time increased.
  3. The memory grant is 12MB versus 1024KB as in the first execution.
  4. Finally, the cost increased and caused parallelism. I set the cost threshold for parallelism to 20 on this server; both server settings mentioned in these lists appear in the sketch below.
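
For reference, here's how both server-level settings mentioned above can be set (a sketch; the values match my demo server, not a recommendation for your environment):

-- mssqltips.com
EXEC sys.sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sys.sp_configure 'min memory per query (KB)', 1024;
EXEC sys.sp_configure 'cost threshold for parallelism', 20;
RECONFIGURE;
GO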

Why does SQL estimate 2.5 million rows when we use a local variable? It's simple: SQL doesn't know the variable's value when compiling the execution plan, so it guesses. You can figure out the guess by looking at the statistics. Run the statement below.

-- mssqltips.com
DBCC SHOW_STATISTICS ('dbo.Colors', 'IX_IsActive');
GO

[Screenshot: DBCC SHOW_STATISTICS output for the IX_IsActive index]

From the screenshot above, SQL takes the total rows (5 million) multiplied by the All density value (0.5, since IsActive has only two distinct values), which turns out to be 2.5 million. This sort of guess falls apart when the data in your table is skewed toward one value, as in our example of only 50 out of 5 million records. Also, updating statistics will not fix this issue. If you don't believe me, run the command below to update all statistics on the Colors table. After updating stats, rerun the query with the local variable, and you'll notice no difference.

-- mssqltips.com
UPDATE STATISTICS dbo.Colors;
GO
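
To verify the estimate math yourself, here's a quick sketch of the calculation (it assumes the All density value of 0.5 shown by DBCC SHOW_STATISTICS for IsActive):

-- mssqltips.com
DECLARE @totalRows BIGINT = 5000000; -- rows in dbo.Colors
DECLARE @allDensity FLOAT = 0.5;     -- "All density" for IsActive
SELECT @totalRows * @allDensity AS EstimatedRows; -- returns 2500000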

Workarounds

If local variables cause you problems, a few solutions exist. The first is to skip using local variables. You may not have that option, but it would be the easiest way to avoid falling victim. It's like the old joke where a patient tells the doctor, "It hurts when I do this," and the doctor replies, "Then don't do that."
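
For our demo query, skipping the variable simply means inlining the literal, which lets the optimizer use the histogram instead of the density vector:

-- mssqltips.com
SELECT COUNT(1) AS ColorCount,
       Person,
       FavoriteColor
FROM dbo.Colors
WHERE IsActive = 0 -- literal value instead of a variable
GROUP BY FavoriteColor,
         Person;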

OPTION (RECOMPILE)

The second option is to recompile the execution plan each time the statement executes. Placing the option at the end of the statement recompiles only that single statement. For example, if you run 10 or 15 different statements in a query or stored procedure, you might only need to recompile one of them.

-- mssqltips.com
DECLARE @inactive BIT = 0;

SELECT COUNT(1),
       Person,
       FavoriteColor
FROM dbo.Colors
WHERE IsActive = @inactive
GROUP BY FavoriteColor,
         Person
OPTION (RECOMPILE);

[Screenshot: execution plan with an accurate 50-row estimate after OPTION (RECOMPILE)]

The plan looks much better now. From the screenshot above, SQL returns an accurate estimate of 50 rows. What else can you try?

Dynamic SQL

Another option worth mentioning is dynamic SQL. However, I find this option more trouble than it's worth; the code is difficult to write and maintain. Maybe you've had a better experience. Unlike a local variable, a parameter passed to sp_executesql is sniffed at compile time, so the optimizer can use the histogram. The example below produces an accurate estimate for the IsActive predicate using the system stored procedure sp_executesql.

-- mssqltips.com
DECLARE @inputVariable BIT = 0;
DECLARE @sqlStatement NVARCHAR(500) = N'';
DECLARE @parmDefinition NVARCHAR(500) = N'';
SET @sqlStatement
    = N'SELECT COUNT(1),
       Person,
       FavoriteColor
FROM dbo.Colors
WHERE IsActive = @isActive
GROUP BY FavoriteColor,
         Person';

SET @parmDefinition = N'@isActive BIT';

EXEC sys.sp_executesql @sqlStatement,
                       @parmDefinition,
                       @isActive = @inputVariable;
GO

However, when you change the input variable to 1, you fall victim to parameter sniffing: the cached plan compiled for 0 gets reused for a value that matches nearly every row. That's about as bad as the inaccurate density-based estimate.

[Screenshot: execution plan affected by parameter sniffing]
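
If you want to reproduce the problem, here's a minimal sketch: execute the same dynamic statement twice, first with 0 and then with 1, and compare the two plans.

-- mssqltips.com
DECLARE @sqlStatement NVARCHAR(500)
    = N'SELECT COUNT(1), Person, FavoriteColor
        FROM dbo.Colors
        WHERE IsActive = @isActive
        GROUP BY FavoriteColor, Person';
DECLARE @parmDefinition NVARCHAR(500) = N'@isActive BIT';

-- The first execution compiles and caches a plan for 0 (50 rows)...
EXEC sys.sp_executesql @sqlStatement, @parmDefinition, @isActive = 0;

-- ...and the second reuses that plan, even though 1 matches
-- nearly all 5 million rows.
EXEC sys.sp_executesql @sqlStatement, @parmDefinition, @isActive = 1;
GO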

Parameter sniffing is another reason I prefer the recompile option in situations like this.

Clean Up

Don't forget to run the script below to drop the demo database.

-- mssqltips.com
USE master;

IF DATABASEPROPERTYEX('LocalVariablesDemo', 'Version') IS NOT NULL
BEGIN
    ALTER DATABASE LocalVariablesDemo SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
    DROP DATABASE LocalVariablesDemo;
END;
GO

Wrapping Up

More options exist than the two outlined above. Erik Darling wrote an article titled "Yet Another Post About Local Variables In SQL Server," which covers a few more, and I highly recommend you check it out. I don't suggest avoiding local variables because of this issue, since it's specific to skewed data distributions (unless all your data is skewed; then I would avoid them). But it's nice to know you have options when it shows up in your environment. I prefer the recompile option or removing variables since the code changes are minimal and easy to maintain.
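
One of those extra options, sketched below, is the OPTIMIZE FOR hint, which also works with local variables. It tells the optimizer to compile the plan as if the variable held a specific value (or, with UNKNOWN, to use the density-based guess deliberately):

-- mssqltips.com
DECLARE @inactive BIT = 0;

SELECT COUNT(1),
       Person,
       FavoriteColor
FROM dbo.Colors
WHERE IsActive = @inactive
GROUP BY FavoriteColor,
         Person
OPTION (OPTIMIZE FOR (@inactive = 0)); -- compile as if @inactive were 0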

Key Points

  • Local variables are a great way to eliminate redundancy in code. For example, assign the current date and time to a local variable once at the beginning of your script and reference it throughout.
  • As with everything great in SQL Server, there tends to be a downside. Local variables are no different, but the above workarounds should help resolve the issue.
  • Try the other options and see if they work better than a recompile. Let us know in the comments if you have a better method.

About the author
Jared Westover is a passionate technology specialist at Crowe, helping to build data solutions. For the past two decades, he has focused on SQL Server, Azure, Power BI, and Oracle. Besides writing for MSSQLTips.com, he has published 12 Pluralsight courses about SQL Server and Reporting Services. Jared enjoys reading and spending time with his wife and three sons when he's not trying to make queries go faster.

This author pledges the content of this article is based on professional experience and not AI generated.




Comments For This Article




Wednesday, October 9, 2024 - 3:14:44 AM - Uwe Ricken Back To Top (92552)
Hello Olivier,

Your observations MUST have a different reason. The behavior is exactly as written by the author.
The reason for this behavior is that SQL Server cannot use the histogram of the statistics but must use the formula described by the author.

This behavior is independent of the data type.
I would check the coding.

Or maybe you can send a link to the execution plan of YOUR observed query.
Share it here:

https://www.brentozar.com/pastetheplan/

Monday, August 26, 2024 - 4:49:53 AM - Olivier Back To Top (92466)
Hello Jared,

Thanks for answering my comment.
I'll try to explain better what I meant. Here is some SQL:

/*
STEP 1, I reproduce the issue you described:
***************************************************************************************
*/
SET STATISTICS IO OFF
SET STATISTICS TIME OFF
GO

DROP TABLE TEST;
GO

WITH cteN (Number)
AS (SELECT TOP 50000000
        ROW_NUMBER() OVER (ORDER BY s1.[object_id])
    FROM sys.all_columns AS s1
        CROSS JOIN sys.all_columns AS s2)
SELECT CAST(CASE
                WHEN Number % 1000000 = 0 THEN 0
                ELSE Number
            END AS BIT) AS Number,
       N'some text some text some text some text some text some text some text some text' AS TXT
INTO TEST
FROM cteN
GO

CREATE INDEX ix1 ON TEST (Number)
GO

SET STATISTICS IO ON
SET STATISTICS TIME ON
GO

-- without local variable: "FAST" when the Number data type is bit or int
/*
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'TEST'. Scan count 1, logical reads 53, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
*/
SELECT count(1)
,txt
FROM TEST
WHERE number = 0 -- will use ix1
GROUP BY txt
GO

-- with local variable: "SLOW" when the Number data type is bit, "FAST" for int
/* when the Number data type is bit:
Table 'TEST'. Scan count 13, logical reads 1063830, physical reads 0, page server reads 0, read-ahead reads 113078, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
when the Number data type is int:
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'TEST'. Scan count 1, logical reads 53, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
*/
DECLARE @P1 BIT = 0
SELECT count(1)
,txt
FROM TEST
WHERE number = @P1 -- will NOT use ix1 when the Number data type is bit (table scan)
GROUP BY txt
GO

/*
STEP 2, I try with a Number of type int instead of bit
(and both SELECT queries are "fast",
even when using a local variable in the WHERE clause:
***************************************************************************************
(same SQL as above except that the the Number and the @P1 data type is int instead of bit)
*/

SET STATISTICS IO OFF
SET STATISTICS TIME OFF
GO

DROP TABLE TEST;
GO

WITH cteN (Number)
AS (SELECT TOP 50000000
        ROW_NUMBER() OVER (ORDER BY s1.[object_id])
    FROM sys.all_columns AS s1
        CROSS JOIN sys.all_columns AS s2)
SELECT CAST(CASE
                WHEN Number % 1000000 = 0 THEN 0
                ELSE Number
            END AS INT) AS Number,
       N'some text some text some text some text some text some text some text some text' AS TXT
INTO TEST
FROM cteN
GO

CREATE INDEX ix1 ON TEST (Number)
GO

SET STATISTICS IO ON
SET STATISTICS TIME ON
GO

-- without local variable: "FAST" when the Number data type is bit or int
/*
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'TEST'. Scan count 1, logical reads 53, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
*/
SELECT count(1)
,txt
FROM TEST
WHERE number = 0 -- will use ix1
GROUP BY txt
GO

-- with local variable: "SLOW" when the Number data type is bit, "FAST" for int
/*
when the Number data type is bit:
Table 'TEST'. Scan count 13, logical reads 1063830, physical reads 0, page server reads 0, read-ahead reads 113078, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.

when the Number data type is int:
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
Table 'TEST'. Scan count 1, logical reads 53, physical reads 0, page server reads 0, read-ahead reads 0, page server read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob page server reads 0, lob read-ahead reads 0, lob page server read-ahead reads 0.
*/
DECLARE @P1 INT = 0
SELECT count(1)
,txt
FROM TEST
WHERE number = @P1 -- will NOT use ix1 when the Number data type is bit (table scan)
GROUP BY txt
GO

Best regards,
Olivier GG.

Sunday, August 25, 2024 - 7:57:49 AM - Jared Westover Back To Top (92464)
@Olivier

Thank you for taking the time to read and comment. I swapped out the BIT in IsActive for an INT and experienced the same behavior as in the article. Could you post your code? I'm using SQL Server 2019 developer edition as well.

Sunday, August 25, 2024 - 7:47:49 AM - Jared Westover Back To Top (92463)
@Oleg

Thank you for your feedback! In the query shown below, I included the IsActive column to display alongside the other columns:

SELECT COUNT(1) AS ColorCount,
FavoriteColor,
Person,
IsActive
FROM dbo.Colors
GROUP BY FavoriteColor,
Person,
IsActive;

However, since IsActive isn't displayed (only filtered on) in the subsequent queries, I didn't include it in the GROUP BY. If I've misunderstood your comment, please let me know.

Friday, August 23, 2024 - 7:48:37 AM - Olivier Back To Top (92462)
Hello,

this is very interesting.
I have tried to use an integer instead of a bit, with SQL Server 2019 developer edition, and I don't have the same performance issue with integer.

Of course, I do not understand what the reason is...
Do you have an explanation for the different behaviors please?

Thank you.

Wednesday, July 31, 2024 - 3:56:27 AM - Oleg Back To Top (92426)
Thanks for the article.
One note: after "Let's look at the spread of data in our table. Notice in the screenshot below that we only have 50 records where IsActive equals 0. Large skews toward specific values are where things fall apart." the results should appear in a different order.













