SQL Server COUNT() Function Performance Comparison

By:   |   Updated: 2016-09-19   |   Comments (7)   |   Related: 1 | 2 | 3 | 4 | 5 | 6 | > Functions System


Problem

SQL Server Developers have a variety of innovative ways to use the COUNT function in SQL Server.  Often times the assumption is that one syntax provides better performance than the others. This tip will explain the differences between the following COUNT function varieties: COUNT(*) vs. COUNT(1) vs. COUNT(column_name) to determine if there is a performance difference.

Solution

There are more efficient ways than using the COUNT() function if the goal is just to retrieve the total row count from a table. One way is to retrieve the row count directly from SQL Server sys.partitions DMV. But most of the time, the COUNT function is still used when counting a subset of rows based on filter criteria specified with in the WHERE clause of a T-SQL statement.

In terms of behavior, COUNT(1) gets converted into COUNT(*) by SQL Server, so there is no difference between these. The 1 is a literal, so a COUNT('whatever') is treated as equivalent.

COUNT(column_name) behaves differently. If the column_name definition is NOT NULL, this gets converted to COUNT(*). If the column_name definition allows NULLs, then SQL Server needs to access the specific column to count the non-null values on the column.

COUNT(*) vs. COUNT(1) vs. COUNT(column_name) Performance Comparison

A test table is created in SQL Server 2016 Developer Edition on RTM CU1.

SET NOCOUNT ON;
CREATE TABLE #Count 
(ID INT IDENTITY(1,1) PRIMARY KEY CLUSTERED
, ColumnWithNulls CHAR(100) NULL DEFAULT 'Allow NULLs'
, ColumnWithNoNulls CHAR(100) NOT NULL DEFAULT 'Do not allow NULLs'
, LargeColumn CHAR(5000) NOT NULL DEFAULT 'Large column');
GO

CREATE NONCLUSTERED INDEX NCI_Count_ColumnWithNoNulls ON #Count (ColumnWithNoNulls)
GO

INSERT INTO #Count DEFAULT VALUES 
GO 10000

UPDATE #Count SET  ColumnWithNulls = NULL WHERE ID % 2 = 0
GO

Here are a few COUNT query runs with row counts.  Note: the above code creates 5000 records and 2500 of the records have a NULL value for column ColumnWithNulls.

Query

Row Count

SELECT COUNT(*) FROM #Count WHERE ID > 5000 5000
SELECT COUNT(1) FROM #Count WHERE ID > 5000 5000
SELECT COUNT(LargeColumn) FROM #Count WHERE ID > 5000 5000
SELECT COUNT(ColumnWithNulls) FROM #Count WHERE ID > 5000 2500

Interpretation of Results

SELECT COUNT(*)

Looking at the execution plan, the Aggregate operation AggType is countstar and the ScalarString is Count(*). SQL Server is actually pretty smart as it can choose an efficient route to get the record count. In this case, there is a clustered index and a non-clustered index which does not allow NULL. SQL Server uses index NCI_Count_ColumnWithNonNulls and performs a smaller index scan to get the table row count.

SQL Server Execution Plan for COUNT(*)

SELECT COUNT(1)

In this execution plan, COUNT(1) gets converted to use AggType countstar and ScalarString is Count(*).

SQL Server Execution Plan for COUNT(1)

SELECT COUNT(LargeColumn)

There is no index defined on column LargeColumn, but this column definition does not allow NULLs. When a count is performed on this column, SQL Server then uses COUNT(*) and has the flexibility to choose a path it deems efficient to return the row count.

In this case, SQL Server can still maintain a scan on the nonclustered index NCI_Count_ColumnWithNoNulls to get the table row count.

SQL Server Execution Plan for COUNT(LargeColumn)

SELECT COUNT(ColumnWithNulls)

There is no index defined on column ColumnWithNulls and the column definition allows NULLs. SQL Server performs a count on the column using the clustered index. This would make sense because SQL Server needs to exclude counting rows with NULLs, so it will need to access all rows in the column.

The COUNT function itself is converted to use AggType COUNT_BIG. Typically, you would want to create a nonclustered index here, so that SQL Server would be able to perform a scan on a smaller index.

SQL Server Execution Plan for COUNT(Column With Nulls)

Summary

Typically I would recommend using COUNT(*) instead of many other options as it is also what SQL Server interprets. There is no performance gain by specifying a literal in the COUNT function.

COUNT(column_name) is applicable when the row count should exclude null values from the table, but if the column_name definition does not allow NULLs, then it is equivalent to COUNT(*).

Next Steps


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Simon Liew Simon Liew is an independent SQL Server Consultant in Sydney, Australia. He is a Microsoft Certified Master for SQL Server 2008 and holds a Master’s Degree in Distributed Computing.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips


Article Last Updated: 2016-09-19

Comments For This Article




Wednesday, December 18, 2019 - 7:57:16 PM - Simon Liew Back To Top (83475)

Hi Jeff,

The tip is about COUNT(*) vs COUNT(1) vs COUNT(column_name) performance. Internally, SQL Server converts COUNT(1) and COUNT(column_name) into COUNT(*) when the column definition is NOT NULL.

I'm not sure if CPU, I/O stats, duration, reads and writes are necessary when its obvious SQL Server is using the same COUNT(*) function underneath despite the variance in code which is evident in the query plan.

But I take your point to provide measurement as proof to be clearer and thanks for that.

Just curious, the example that you've provided doesn't seem to have reference to COUNT() function?


Wednesday, December 11, 2019 - 10:39:54 AM - Jeff Moden Back To Top (83394)

I realize that I'm a bit late to this thread but you've not actually proven your conclusion of no performance advantage because you've not actually measured CPU usage, duration, reads, or writes.  There's no question that execution plans can aid you in figuring out what needs to be tuned but you MUST NOT EVER make a decision as to which code is better based soley on execution plans.

In fact, here's a simple example that has led thousands of people into thinking that using a Recursive CTE (rCTE) to produce incremental counts is a "Best Practice".  Turn on the Actual Execution plan, run it, and see.  Then, turn off the exectuion plan (it actually imparts some delays in running code) and see messages tab where you'll see proof that the exact opposite of what the execution plan seems to say is the actual truth.

DECLARE @BitBucket INT
;
-------------------------------------------------------------------------------
        RAISERROR('--===== Reursive CTE =====',0,0) WITH NOWAIT;
    SET STATISTICS TIME,IO ON;

   WITH cte AS
(
 SELECT N = 1
  UNION ALL
 SELECT N = N + 1
   FROM cte
  WHERE N < 1000000
)
 SELECT @BitBucket = N
   FROM cte
 OPTION (MAXRECURSION 0)
;
    SET STATISTICS TIME,IO OFF;
  PRINT REPLICATE('-',119)
;
-------------------------------------------------------------------------------
        RAISERROR('--===== Pseudo-Cursor =====',0,0) WITH NOWAIT;
    SET STATISTICS TIME,IO ON;
 SELECT TOP 1000000
        @BitBucket  = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
   FROM      sys.all_columns ac1
  CROSS JOIN sys.all_columns ac2
;
    SET STATISTICS TIME,IO OFF;
PRINT REPLICATE('-',119)
;

Friday, September 23, 2016 - 10:45:06 AM - Simon Liew Back To Top (43401)

Hi Sahir,

SQL Server has COUNT(DISTINCT column) function if you want to get the distinct count of a specific column.


Tuesday, September 20, 2016 - 7:59:15 AM - sahir Back To Top (43366)

What's the best way to get a count of distinct values in the column? 


Monday, September 19, 2016 - 6:30:28 PM - Simon Liew Back To Top (43357)

Hi Ryan,

COUNT(*) and COUNT(1) behaviour has been the same since SQL Server 2008 to the current SQL Server 2016. 

 


Monday, September 19, 2016 - 3:12:12 PM - Ryan Cerney Back To Top (43355)

Great article, thank you!  Quick question, does the way SQL handles the count(1) vs count(*) the same for all versions or is the engine slightly different in 2008R2 and newer, etc?   

 


Monday, September 19, 2016 - 5:25:25 AM - VJ Back To Top (43353)

 "COUNT(column_name) is applicable when the row count should exclude non-null values from the table"

Mayby quite opposite ...?

 















get free sql tips
agree to terms