Persisting SQL Server Index-Usage Statistics with MERGE

By: Marios Philippopoulos | Updated: 2015-06-02 | Comments (2) | Related: > Indexing

Problem

Index-usage statistics get re-initialized when the SQL Server instance in question gets restarted and, starting with SQL Server 2012, after an index rebuild. This can obscure the true impact of indexes in day-to-day operations and lead to false determinations of used vs. unused indexes. How can one persist this valuable data over a longer period of time and obtain a meaningful repository of information for performance tuning and troubleshooting?

Solution

In this tip I use the MERGE statement to maintain the incremental nature of index-usage statistics between SQL Server instance restarts and index rebuilds. This solution also accounts for newly added/deleted indexes. The end result is a reliable source of data that can serve as a tool for index analysis.

Setup

To demonstrate I use a copy of the AdventureWorks2012 database residing on a SQL Server 2012 instance. Two tables on a separate database, [IndexUsageStatsDB], are used to store the index-usage statistics: [IndexUsageStats] and [IndexUsageStatsSnap]. The [IndexUsageStats] table holds the long-term values persisted between instance restarts and index rebuilds, whereas the [IndexUsageStatsSnap] table records "snapshot" data obtained at the moment of collection. The [IndexUsageStatsSnap] table would need to get re-populated at scheduled intervals (eg. every hour) and is the basis for updating the [IndexUsageStats] table through a MERGE statement.

The two tables are listed below:

-----------
--Script 1:
-----------

USE [IndexUsageStatsDB];
GO

CREATE TABLE [dbo].[IndexUsageStats]
(
	[IndexUsageStatsID] INTEGER IDENTITY(1,1) NOT NULL,
	[dbid] SNALLINT NOT NULL,
	[table_name] SYSNAME NOT NULL,
	[index_id] INTEGER NULL,
	[index_name] NVARCHAR(128) NOT NULL,
	[user_scans] BIGINT NOT NULL,
	[user_seeks] BIGINT NOT NULL,
	[user_lookups] BIGINT NOT NULL,
	[user_updates] BIGINT NOT NULL,
	[last_user_scan] DATETIME NULL,
	[last_user_seek] DATETIME NULL,
	[last_user_lookup] DATETIME NULL,
	[last_user_update] DATETIME NULL,
Constraint [PK_IndexUsageStats] PRIMARY KEY ([IndexUsageStatsID]) ON [PRIMARY]
) 
ON [PRIMARY]
GO

CREATE TABLE [dbo].[IndexUsageStatsSnap]
(
	[IndexUsageStatsSnapID] INTEGER IDENTITY(1,1) NOT NULL,
	[dbid] SNALLINT NOT NULL,
	[table_name] SYSNAME NOT NULL,
	[index_id] INTEGER NULL,
	[index_name] NVARCHAR(128) NOT NULL,
	[user_scans] BIGINT NOT NULL,
	[user_seeks] BIGINT NOT NULL,
	[user_lookups] BIGINT NOT NULL,
	[user_updates] BIGINT NOT NULL,
	[last_user_scan] DATETIME NULL,
	[last_user_seek] DATETIME NULL,
	[last_user_lookup] DATETIME NULL,
	[last_user_update] DATETIME NULL,
Constraint [PK_IndexUsageStatsSnap] PRIMARY KEY ([IndexUsageStatsSnapID]) ON [PRIMARY]
) 
ON [PRIMARY]
GO

The following statement is used to re-populate the table [IndexUsageStatsSnap] at scheduled intervals:

-----------
--Script 2:
-----------

USE [AdventureWorks2012];
GO

TRUNCATE TABLE [IndexUsageStatsDB].[dbo].[IndexUsageStatsSnap];

INSERT INTO	
	[IndexUsageStatsDB].[dbo].[IndexUsageStatsSnap]
(
	[dbid]
,	[table_name]
,	[index_id]
,	[index_name]
,	[user_scans]
,	[user_seeks]
,	[user_lookups]
,	[user_updates]
,	[last_user_scan]
,	[last_user_seek]
,	[last_user_lookup]
,	[last_user_update]
)
SELECT 
	DB_ID() AS [dbid]
,	OBJECT_NAME(i.[object_id], DB_ID()) AS [table_name]
,	i.index_id
,	i.[name]
,	COALESCE(ddius.user_scans, 0)
,	COALESCE(ddius.user_seeks, 0)
,	COALESCE(ddius.user_lookups, 0)
,	COALESCE(ddius.user_updates, 0)
,	ddius.last_user_scan
,	ddius.last_user_seek
,	ddius.last_user_lookup
,	ddius.last_user_update
FROM 
	sys.indexes i
LEFT OUTER JOIN 
	sys.dm_db_index_usage_stats ddius
ON 
	ddius.index_id = i.index_id
	AND ddius.[object_id] = i.[object_id]
WHERE 
	(ddius.database_id = DB_ID() OR ddius.database_id IS NULL) -- this DB only
	AND OBJECTPROPERTY(i.[object_id], 'IsUserTable') = 1
	AND i.index_id > 0 	-- filter out heaps
ORDER BY
	OBJECT_NAME(i.[object_id], DB_ID())
,	i.index_id;
GO

The MERGE statement

As mentioned earlier, a MERGE statement is used to update the [IndexUsageStats] table on the basis of data from table [IndexUsageStatsSnap]. In the first part of this section I describe the logic (pseudo code) behind the MERGE statement. In the second part I show the actual T-SQL code.

Logic and Pseudo code

The terms used in the pseudo code below are defined as follows:

Counter1: counter value (number of scans/seeks/lookups/updates) of a specific index in the [IndexUsageStats] table.
Counter2: counter value (number of scans/seeks/lookups/updates) of a specific index in the [IndexUsageStatsSnap] table.
LastOccurence1: last time Counter1 changed in [IndexUsageStats] table (eg. last_user_seek value).
LastOccurence2: last time Counter2 changed in [IndexUsageStatsSnap] table (eg. last_user_seek value).

Here is the pseudo code and associated logic:

-----------
--Script 3:
-----------
If Index In Both [IndexUsageStats] And [IndexUsageStatsSnap] Tables
Then
	If 	LastOccurence2 More Recent Than LastOccurence1 
		And 
		(Counter1 < Counter2 OR LastOccurence1 IS NULL)
	Then 	Set Counter1 Equal To Counter2 
		And Set LastOccurence1 Equal To LastOccurence2
		(Logic: there was likely no instance restart/index rebuild since 
		the last time the [IndexUsageStats] and [IndexUsageStatsSnap] tables were updated,
		as evidenced by the higher value of Counter2.)
	
	If 	LastOccurence2 More Recent Than LastOccurence1 
		And
		Counter1 >= Counter2
	Then	Increment Counter1 By Counter2 
		And Set LastOccurence1 Equal To LastOccurence2
		(Logic: an instance restart/index rebuild since the last time
		the [IndexUsageStats and [IndexUsageStatsSnap] tables were updated must have 
		caused the database engine to re-initialize the index-usage statistics, 
		as evidenced by the lower value of Counter2.)
	
	If 	( (LastOccurence2 Equal To LastOccurence1) Or (LastOccurence2 IS NULL) )
	Then 	Do Nothing
		(Logic: index in question has not been used since the last time
		the [IndexUsageStats] and [IndexUsageStatsSnap] tables were updated. 
		No change to the Counter1 and LastOccurence1 values is required.)
			
Else
	If 	Index Only In [IndexUsageStatsSnap] Table
	Then 	Insert Record Into [IndexUsageStats] Table
		(Logic: newly created index; add to [IndexUsageStats] table.)
		
	If 	Index Only In [IndexUsageStats] Table
	Then 	Delete Record From [IndexUsageStats] Table
		(Logic: newly deleted index; remove from [IndexUsageStats] table.)

T-SQL Code

Here is the T-SQL code of the MERGE statement:

-----------
--Script 4:
-----------

USE [IndexUsageStatsDB];

MERGE [dbo].[IndexUsageStats] AS [is]
USING [dbo].[IndexUsageStatsSnap] AS [iss]
ON 
	[is].[dbid] = [iss].[dbid]
	AND [is].[table_name] = [iss].[table_name]
	AND [is].[index_id] = [iss].[index_id]
	AND [is].[index_name] = [iss].[index_name]
WHEN MATCHED 
	THEN UPDATE SET 
		[is].[user_scans] = 
			CASE WHEN (([iss].[last_user_scan] > [is].[last_user_scan] OR [is].[last_user_scan] IS NULL) AND [iss].[user_scans] > [is].[user_scans]) THEN [iss].[user_scans] 
				 WHEN ([iss].[last_user_scan] > [is].[last_user_scan] AND [iss].[user_scans] <= [is].[user_scans]) THEN [is].[user_scans] + [iss].[user_scans] 
				 ELSE [is].[user_scans] -- DO NOTHING
			END
	, 	[is].[user_seeks] = 
			CASE WHEN (([iss].[last_user_seek] > [is].[last_user_seek] OR [is].[last_user_seek] IS NULL) AND [iss].[user_seeks] > [is].[user_seeks]) THEN [iss].[user_seeks] 
				 WHEN ([iss].[last_user_seek] > [is].[last_user_seek] AND [iss].[user_seeks] <= [is].[user_seeks]) THEN [is].[user_seeks] + [iss].[user_seeks] 
				 ELSE [is].[user_seeks] -- DO NOTHING
			END
	, 	[is].[user_lookups] = 
			CASE WHEN (([iss].[last_user_lookup] > [is].[last_user_lookup] OR [is].[last_user_lookup] IS NULL) AND [iss].[user_lookups] > [is].[user_lookups]) THEN [iss].[user_lookups] 
				 WHEN ([iss].[last_user_lookup] > [is].[last_user_lookup] AND [iss].[user_lookups] <= [is].[user_lookups]) THEN [is].[user_lookups] + [iss].[user_lookups] 
				 ELSE [is].[user_lookups] -- DO NOTHING
			END
	, 	[is].[user_updates] = 
			CASE WHEN (([iss].[last_user_update] > [is].[last_user_update] OR [is].[last_user_update] IS NULL) AND [iss].[user_updates] > [is].[user_updates]) THEN [iss].[user_updates] 
				 WHEN ([iss].[last_user_update] > [is].[last_user_update] AND [iss].[user_updates] <= [is].[user_updates]) THEN [is].[user_updates] + [iss].[user_updates]
				 ELSE [is].[user_updates] -- DO NOTHING
			END
	, 	[is].[last_user_scan] = COALESCE([iss].[last_user_scan], [is].[last_user_scan])
	, 	[is].[last_user_seek] = COALESCE([iss].[last_user_seek], [is].[last_user_seek])
	, 	[is].[last_user_lookup] = COALESCE([iss].[last_user_lookup], [is].[last_user_lookup])
	, 	[is].[last_user_update] = COALESCE([iss].[last_user_update], [is].[last_user_update])
WHEN NOT MATCHED BY TARGET
	THEN INSERT 
	(
	[dbid]
	, [table_name]
	, [index_id]
	, [index_name]
	, [user_scans]
	, [user_seeks]
	, [user_lookups]
	, [user_updates]
	, [last_user_scan]
	, [last_user_seek]
	, [last_user_lookup]
	, [last_user_update]
	)
	VALUES
	(
	[iss].[dbid]
	,[iss].[table_name]
	,[iss].[index_id]
	,[iss].[index_name]
	,[iss].[user_scans]
	,[iss].[user_seeks]
	,[iss].[user_lookups]
	,[iss].[user_updates]
	,[iss].[last_user_scan]
	,[iss].[last_user_seek]
	,[iss].[last_user_lookup]
	,[iss].[last_user_update]
	)
WHEN NOT MATCHED BY SOURCE
	THEN DELETE;
GO

Procedure

Here are the two steps needed to maintain the data in table [IndexUsageStats]; this procedure would need to be repeated at regular intervals during the day, especially prior to scheduled maintenance work that includes SQL Server instance restarts or index rebuilds:

Repopulate table [IndexUsageStatsSnap] to capture a snapshot of the statistics at that moment in time - script 2.
Execute the MERGE statement to update the [IndexUsageStats] table with the latest data from [IndexUsageStatsSnap] - script 4.

Results

To demonstrate how this works I will run queries against the [AdventureWorks2012].[Sales].[SalesOrderDetail] table. I first restart my SQL Server instance, to reset all index usage statistics to 0, and then apply the following query to perform 10 seeks on index IX_SalesOrderDetail_ProductID and 10 lookups on index PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID:

-----------
--Script 5:
-----------

SELECT 
	[SalesOrderID]
,	[SalesOrderDetailID]
,	[CarrierTrackingNumber]
,	[OrderQty]
,	[UnitPrice]
,	[LineTotal]
FROM 
	[AdventureWorks2012].[Sales].[SalesOrderDetail]
WHERE
	[ProductID] = 897;
GO 10

Next, I populate table IndexUsageStatsSnap (script 2) and execute the MERGE statement (script 4) to update table IndexUsageStats; the result, for both tables, is this:

index_name	user_seeks	user_lookups	last_user_seek	last_user_lookup
PK_SalesOrderDetail_ SalesOrderID_SalesOrderDetailID	0	10	NULL	2015-04-29 06:54:51.740
AK_SalesOrderDetail_rowguid	0	0	NULL	NULL
IX_SalesOrderDetail_ProductID	10	0	2015-04-29 06:54:51.740	0

If I now rerun script 5 and repopulate table IndexUsageStatsSnap (script 2) the data in table IndexUsageStatsSnap will look like this:

index_name	user_seeks	user_lookups	last_user_seek	last_user_lookup
PK_SalesOrderDetail_ SalesOrderID_SalesOrderDetailID	0	20	NULL	2015-04-29 07:21:11.147
AK_SalesOrderDetail_rowguid	0	0	NULL	NULL
IX_SalesOrderDetail_ProductID	20	0	2015-04-29 07:21:11.147	0

According to the rules outlined in the above pseudo code (script 3) the data in table [IndexUsageStats] would need to be updated to match the values in [IndexUsageStatsSnap]. So both tables should be made to show 20 lookups for index [PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID] and 20 seeks for index [IX_SalesOrderDetail_ProductID]. True enough, after running MERGE (script 4) both tables now hold the following values:

index_name	user_seeks	user_lookups	last_user_seek	last_user_lookup
PK_SalesOrderDetail_ SalesOrderID_SalesOrderDetailID	0	20	NULL	2015-04-29 07:21:11.147
AK_SalesOrderDetail_rowguid	0	0	NULL	NULL
IX_SalesOrderDetail_ProductID	20	0	2015-04-29 07:21:11.147	0

Let me now rebuild index [IX_SalesOrderDetail_ProductID]:

-----------
--Script 6:
-----------
USE AdventureWorks2012;
GO
ALTER INDEX [IX_SalesOrderDetail_ProductID] ON [Sales].[SalesOrderDetail] REBUILD;
GO

After refreshing table [IndexUsageStatsSnap] (script 2) the data in that table will show zeroed values for index [IX_SalesOrderDetail_ProductID] (highlighted in yellow):

index_name	user_lookups	last_user_seek	last_user_lookup
PK_SalesOrderDetail_ SalesOrderID_SalesOrderDetailID	20	NULL	2015-04-29 07:21:11.147
AK_SalesOrderDetail_rowguid	0	NULL	NULL
IX_SalesOrderDetail_ProductID	0	NULL	0

This is expected behavior with index rebuilds in SQL Server 2012. Issuing the MERGE statement, I end up with the following data in table [IndexUsageStats]:

index_name	user_seeks	user_lookups	last_user_seek	last_user_lookup
PK_SalesOrderDetail_ SalesOrderID_SalesOrderDetailID	0	20	NULL	2015-04-29 07:21:11.147
AK_SalesOrderDetail_rowguid	0	0	NULL	NULL
IX_SalesOrderDetail_ProductID	20	0	2015-04-29 07:21:11.147	0

The data is unchanged, as expected. The data in table [IndexUsageStats] has been shielded from the earlier index rebuild.

Rerunning the query against the [SalesOrderDetail] table (script 5), then refreshing table [IndexUsageStatsSnap] (script 2) and lastly running the MERGE statement (script 4), I end up with the following values in table [IndexUsageStats]:

index_name	user_seeks	user_lookups	last_user_seek	last_user_lookup
PK_SalesOrderDetail_ SalesOrderID_SalesOrderDetailID	0	30	NULL	2015-04-30 14:52:20.713
AK_SalesOrderDetail_rowguid	0	0	NULL	NULL
IX_SalesOrderDetail_ProductID	30	0	2015-04-30 14:52:20.713	0

The counters for seeks and lookups have been incremented correctly and the times of the last seek and lookup have also been updated with the latest values.

Conclusion

In this tip I have shown how to use the MERGE statement to maintain index usage statistics information for a SQL Server instance even after the instance restarts and index rebuilds. The end result is a robust set of data that can be relied on to draw conclusions on used vs. unused indexes over long time spans. Using this methodology one can be certain that low/zero usage values for an index are truly an indication that this index is not needed and should probably be dropped, and not an artifact of frequent rebuilds of that index. New and dropped indexes are also dealt with in the same MERGE statement.

An implicit assumption (and limitation) of this approach is that higher values in the [IndexUsageStatsSnap] table, relative to the [IndexUsageStats] table, necessarily mean that no re-initialization of the index-usage statistics has taken place in the database engine since the last time the two tables were refreshed (see first If statement in pseudo code - script 3). This may lead to inaccuracies early on, when values in the 2 tables are of comparable magnitude. However, as the data in the [IndexUsageStats] table gradually spans a longer time range and individual data points increase in magnitude, this assumption will become increasingly more accurate.

Next Steps

Apply the solution outlined in this tip to your environment to obtain a reliable repository of index usage statistical information.
Review these related links:

About the author

Marios Philippopoulos has been a SQL Server DBA for over 10 years. He is based in the Toronto area, Canada.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips

Article Last Updated: 2015-06-02

Tuesday, June 2, 2015 - 9:56:48 AM - Ben Miller	Back To Top (37360)
Just remember that if you use NOT MATCHED BY SOURCE you could be forcing a CROSS JOIN to happen to your tables. Check out BOL.