By: Tim Ford | Updated: 2009-07-08 | Comments (16) | Related: 1 | 2 | 3 | > Indexing
Problem
In this, the third of the four part series on persisting the rowset results from the indexing Dynamic Management Views cached in temporary internal SQL Server structures we will explore what is required to store the missing index-related metadata. Part one of this series examined the process for persisting data for sys.dm_dm_index_usage_stats; the DMV pertaining to usage metrics for all the indexes on a Microsoft SQL Server instance. Part two centered on constructing a similar process for the sys.dm_db_index_operational_stats Dynamic Management Function. I suggest reviewing these precursor tips prior to tackling this one. As with the previous tips, the reason I perform these processes is to protect valuable index tuning information from being lost if the SQL Server instance is restarted. If that occurs, this information is lost; gone with it is important historical usage, structural, and operation information that can be used to tweak existing indexes or create new ones that improve performance.
Solution
As with the previous tips in this series we'll be working with the dedicated database and schema set aside for these structures. I'm including the code to create the database once again here, but it is identical to what was presented earlier. I use a dedicated schema for the processes that collect this information in my environment. If you prefer to simply use the dbo schema you may do so with a simple find and replace process in the code that follows.
Step One: Create the Administrative Database and Necessary Supporting Objects
USE [master]
GO
CREATE DATABASE [iDBA] ON PRIMARY
(
NAME = N'iDBA',
FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\iDBA.mdf' ,
SIZE = 10MB ,
FILEGROWTH = 5MB
)
LOG ON
(
NAME = N'iDBA_log',
FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\iDBA_log.ldf' ,
SIZE = 5120KB ,
FILEGROWTH = 5120KB
)
GO
USE [iDBA]
GO
CREATE SCHEMA [MetaBot] AUTHORIZATION [dbo]
GO
At this point, we need to create three tables to persist the data and a stored procedure to collect the rowsets from each Dynamic Management View. The code below creates tables for rowset data from:
- sys.dm_db_missing_index_details
- sys.dm_db_missing_index_group_stats
- sys.dm_db_missing_index_groups
- sys.dm_db_missing_index_columns
You may notice that there is no corresponding structure for persisting sys.dm_db_missing_index_columns. That is because this object is a Dynamic Management Function whose results are not affected by a service restart. It will use the information persisted in the four tables created here to return results. Let's take a look at the code to create the tables:
USE [iDBA];
--Create MetaBOT schema if it does not already exist:
CREATE SCHEMA [MetaBOT] AUTHORIZATION [dbo];
--Create repository table for sys.dm_db_missing_index_details:
CREATE TABLE [MetaBOT].[dm_db_missing_index_details](
[index_handle] [int] NOT NULL,
[database_id] [smallint] NOT NULL,
[object_id] [int] NOT NULL,
[equality_columns] [nvarchar](4000) NULL,
[inequality_columns] [nvarchar](4000) NULL,
[included_columns] [nvarchar](4000) NULL,
[statement] [nvarchar](4000) NULL,
[date_stamp] [datetime] NOT NULL
CONSTRAINT [PK_dm_db_missing_index_details] PRIMARY KEY CLUSTERED
(
[index_handle] ASC
)WITH
(
PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90
) ON [PRIMARY]
) ON [PRIMARY]
--Create repository table for sys.dm_db_missing_index_group_stats:
CREATE TABLE [MetaBOT].[dm_db_missing_index_group_stats](
[group_handle] [int] NOT NULL,
[unique_compiles] [bigint] NOT NULL,
[user_seeks] [bigint] NOT NULL,
[user_scans] [bigint] NOT NULL,
[last_user_seek] [datetime] NULL,
[last_user_scan] [datetime] NULL,
[avg_total_user_cost] [float] NULL,
[avg_user_impact] [float] NULL,
[system_seeks] [bigint] NOT NULL,
[system_scans] [bigint] NOT NULL,
[last_system_seek] [datetime] NULL,
[last_system_scan] [datetime] NULL,
[avg_total_system_cost] [float] NULL,
[avg_system_impact] [float] NULL,
[last_poll_unique_compiles] [bigint] NULL,
[last_poll_user_seeks] [bigint] NULL,
[last_poll_user_scans] [bigint] NULL,
[last_poll_system_seeks] [bigint] NULL,
[last_poll_system_scans] [bigint] NULL,
[date_stamp] [datetime] NOT NULL
CONSTRAINT [PK_dm_db_missing_index_group_stats] PRIMARY KEY CLUSTERED
(
[group_handle] ASC
)WITH
(PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90
) ON [PRIMARY]
) ON [PRIMARY]
--Create repository table for sys.dm_db_missing_index_groups:
CREATE TABLE [MetaBOT].[dm_db_missing_index_groups](
[index_group_handle] [int] NOT NULL,
[index_handle] [int] NOT NULL,
[date_stamp] [datetime] NOT NULL
CONSTRAINT [PK_dm_db_missing_index_groups] PRIMARY KEY CLUSTERED
(
[index_group_handle] ASC,
[index_handle] ASC
)WITH
(PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90
) ON [PRIMARY]
) ON [PRIMARY]
--Create repository table for sys.dm_db_missing_index_columns:
CREATE TABLE [MetaBOT].[dm_db_missing_index_columns]
(
[index_handle] INT NOT NULL, [column_id] INT NOT NULL,
[column_name] NVARCHAR(128) NOT NULL, [column_usage] VARCHAR(20) NOT NULL,
[date_stamp] datetime NOT NULL
CONSTRAINT [PK_dm_db_missing_index_columns] PRIMARY KEY CLUSTERED
(
[index_handle] ASC,
[column_id] ASC
)WITH
(
PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 85
) ON [PRIMARY]
) ON [PRIMARY]
As with the previously presented tips on the matter, the tables created here are quite similar in structure to the source DMVs. The exceptions are additions of columns starting with "last_poll_" for any columns that store running total values. These will aid in accounting for different methods of incrementing values depending upon whether a service restart has been encountered since the last time the data has been collected from the DMVs for physical storage. There is also a column named "date_stamp" to record the last time the data was pulled from the DMVs. The last deviation is the addition of the index_handle column in MetaBOT.dm_db_missing_index_columns. Since the source is a DMF that accepts index_handle as a parameter (from sys.dm_db_missing_index_details) and does not produce it as an retuned column, I did have to add the column to the persisted table, and then populate the value (as you will see in the stored procedure code) from a CROSS APPLY between sys.dm_db_missing_index_details and sys.dm_db_missing_index_columns.
At this point we're prepared to outline the code for collecting this information. The code will first insert new records into MetaBOT.dm_db_missing_index_groups, MetaBOT.dm_db_missing_index_details, and MetaBOT.dm_db_missing_index_columns. There is no need to update existing records in either of these tables, as they do not store any incremental/running total values. They simply store indentifying columns that reference running total columns in sys.dm_db_missing_index_group_stats. After these records are inserted, the last system restart date (as calculated by the creation date of tempdb, which is recreated upon each SQL Server service start) is compared to the last time the data was persisted from the missing index DMVs, as is available from the date_stamp column in MetaBOT.dm_db_missing_index_group_stats. Depending upon whether the last restart date took place before or after the last time data was collected, the procedure code will properly calculate and persist the running total values from the sys.dm_db_missing_index_group_stats DMV along with indentifying columns and a new (current) date_stamp value. Finally, any records that exist in the DMV, but are not yet recorded in the physical table, are inserted via the final DML INSERT statement.
--USE [iDBA];
CREATE PROCEDURE MetaBot.usp_persist_missing_index_DMV_data AS
DECLARE @last_service_start_date datetime
DECLARE @last_data_persist_date datetime
/*
Two of the three DMVs that comprise the suite of missing index DMVs do not
have running total columns. Indentifier columns compose all their columns.
Therefore there is only the need to insert new records that do not exist in the
persisted data tables; not perform updates or maintain running totals.
*/
--Insert new records into MetaBOT.dm_db_missing_index_groups
INSERT INTO [iDBA].[MetaBOT].[dm_db_missing_index_groups]
([index_group_handle], [index_handle], [date_stamp])
SELECT SDDMIG.[index_group_handle], SDDMIG.[index_handle], GETDATE()
FROM sys.[dm_db_missing_index_groups] SDDMIG
LEFT JOIN [iDBA].[MetaBOT].[dm_db_missing_index_groups] MDDMIG
ON SDDMIG.[index_group_handle] = MDDMIG.[index_group_handle]
AND SDDMIG.[index_handle] = MDDMIG.[index_handle]
WHERE MDDMIG.[index_group_handle] IS NULL
--Insert new records into MetaBOT.dm_db_missing_index_details
INSERT INTO [iDBA].[MetaBOT].[dm_db_missing_index_details]
(
[index_handle], [database_id], [object_id],
[equality_columns], [inequality_columns], [included_columns],
[statement], [date_stamp]
)
SELECT SDDMID.[index_handle], SDDMID.[database_id], SDDMID.[object_id],
SDDMID.[equality_columns], SDDMID.[inequality_columns], SDDMID.[included_columns],
SDDMID.[statement], GETDATE()
FROM sys.[dm_db_missing_index_details] SDDMID
LEFT JOIN [iDBA].[MetaBOT].[dm_db_missing_index_details] MDDMID
ON SDDMID.[index_handle] = MDDMID.[index_handle]
WHERE MDDMID.[index_handle] IS NULL
/*
The sys.dm_db_missing_index_columns DMF only provides normalized
column metadata from sys.dm_db_missing_index_details. I only insert new rows into
the table upon each cycle of persisting data from the DMOs.
*/
INSERT INTO [iDBA].[MetaBOT].[dm_db_missing_index_columns]
([index_handle] ,[column_id] ,[column_name], [column_usage], date_stamp)
SELECT SDDMID.[index_handle], SDDMIC.[column_id], SDDMIC.[column_name], SDDMIC.[column_usage], GETDATE()
FROM sys.[dm_db_missing_index_details] SDDMID
CROSS APPLY sys.[dm_db_missing_index_columns](SDDMID.index_handle) SDDMIC
LEFT JOIN [iDBA].[MetaBOT].[dm_db_missing_index_columns] MDDMIC
ON SDDMID.[index_handle] = MDDMIC.[index_handle]
AND SDDMIC.[column_id] = MDDMIC.[column_id]
WHERE MDDMIC.[index_handle] IS NULL
ORDER BY SDDMID.[index_handle]
--Determine last service restart date based upon tempdb creation date
SELECT @last_service_start_date =
SD.[create_date]
FROM sys.databases SD
WHERE SD.[name] = 'tempdb'
--Return the value for the last refresh date of the persisting table
SELECT @last_data_persist_date =
MAX(MDDMIGS.[date_stamp])
FROM [iDBA].[MetaBOT].[dm_db_missing_index_group_stats] MDDMIGS
--Take care of updated records first
IF @last_service_start_date < @last_data_persist_date
BEGIN
--Service restart date > last poll date
PRINT 'The latest persist date was ' +
CAST(@last_data_persist_date AS VARCHAR(50)) +
'; no restarts occurred since ' +
CAST(@last_service_start_date AS VARCHAR(50)) +
' (' + CAST(DATEDIFF(d, @last_service_start_date, @last_data_persist_date) AS VARCHAR(10)) +
' days ago.)'
UPDATE MDDMIGS
SET
MDDMIGS.unique_compiles = MDDMIGS.[unique_compiles] + (SDDMIGS.[unique_compiles] - MDDMIGS.[last_poll_unique_compiles]),
MDDMIGS.[user_seeks] = MDDMIGS.[user_seeks] + (SDDMIGS.[user_seeks] - MDDMIGS.[last_poll_user_seeks]),
MDDMIGS.[user_scans] = MDDMIGS.[user_scans] + (SDDMIGS.[user_scans] - MDDMIGS.[last_poll_user_scans]),
MDDMIGS.[last_user_seek] = SDDMIGS.[last_user_seek],
MDDMIGS.[last_user_scan] = SDDMIGS.[last_user_scan],
MDDMIGS.[avg_total_user_cost] = SDDMIGS.[avg_total_user_cost],
MDDMIGS.[avg_user_impact] = SDDMIGS.[avg_user_impact],
MDDMIGS.[system_seeks] = MDDMIGS.[system_seeks] + (SDDMIGS.[system_seeks] - MDDMIGS.[last_poll_system_seeks]),
MDDMIGS.[system_scans] = MDDMIGS.[system_scans] + (SDDMIGS.[system_scans] - MDDMIGS.[last_poll_system_scans]),
MDDMIGS.[last_system_seek] = SDDMIGS.[last_system_seek],
MDDMIGS.[last_system_scan] = SDDMIGS.[last_system_scan],
MDDMIGS.[avg_total_system_cost] = SDDMIGS.[avg_total_system_cost],
MDDMIGS.[avg_system_impact] = SDDMIGS.[avg_system_impact],
MDDMIGS.[last_poll_unique_compiles] = SDDMIGS.[unique_compiles],
MDDMIGS.[last_poll_user_seeks] = SDDMIGS.[user_seeks],
MDDMIGS.[last_poll_user_scans] = SDDMIGS.[user_scans],
MDDMIGS.[last_poll_system_seeks] = SDDMIGS.[system_seeks],
MDDMIGS.[last_poll_system_scans] = SDDMIGS.[system_scans],
MDDMIGS.date_stamp = GETDATE()
FROM iDBA.MetaBOT.[dm_db_missing_index_group_stats] MDDMIGS INNER JOIN sys.[dm_db_missing_index_group_stats] SDDMIGS
ON MDDMIGS.group_handle = SDDMIGS.group_handle
END
ELSE
BEGIN
--Service restart date > last poll date
PRINT 'Lastest service restart occurred on ' +
CAST(@last_service_start_date AS VARCHAR(50)) +
' which is after the latest persist date of ' +
CAST(@last_data_persist_date AS VARCHAR(50))
UPDATE MDDMIGS
SET
MDDMIGS.unique_compiles = MDDMIGS.[unique_compiles] + SDDMIGS.[unique_compiles],
MDDMIGS.[user_seeks] = MDDMIGS.[user_seeks] + SDDMIGS.[user_seeks],
MDDMIGS.[user_scans] = MDDMIGS.[user_scans] + SDDMIGS.[user_scans],
MDDMIGS.[last_user_seek] = SDDMIGS.[last_user_seek],
MDDMIGS.[last_user_scan] = SDDMIGS.[last_user_scan],
MDDMIGS.[avg_total_user_cost] = SDDMIGS.[avg_total_user_cost],
MDDMIGS.[avg_user_impact] = SDDMIGS.[avg_user_impact],
MDDMIGS.[system_seeks] = MDDMIGS.[system_seeks] + SDDMIGS.[system_seeks],
MDDMIGS.[system_scans] = MDDMIGS.[system_scans] + SDDMIGS.[system_scans],
MDDMIGS.[last_system_seek] = SDDMIGS.[last_system_seek],
MDDMIGS.[last_system_scan] = SDDMIGS.[last_system_scan],
MDDMIGS.[avg_total_system_cost] = SDDMIGS.[avg_total_system_cost],
MDDMIGS.[avg_system_impact] = SDDMIGS.[avg_system_impact],
MDDMIGS.[last_poll_unique_compiles] = SDDMIGS.[unique_compiles],
MDDMIGS.[last_poll_user_seeks] = SDDMIGS.[user_seeks],
MDDMIGS.[last_poll_user_scans] = SDDMIGS.[user_scans],
MDDMIGS.[last_poll_system_seeks] = SDDMIGS.[system_seeks],
MDDMIGS.[last_poll_system_scans] = SDDMIGS.[system_scans],
MDDMIGS.date_stamp = GETDATE()
FROM iDBA.MetaBOT.[dm_db_missing_index_group_stats] MDDMIGS INNER JOIN sys.[dm_db_missing_index_group_stats] SDDMIGS
ON MDDMIGS.group_handle = SDDMIGS.group_handle
END
--Take care of new records next
INSERT INTO [iDBA].[MetaBot].[dm_db_missing_index_group_stats]
(
[group_handle], [unique_compiles], [user_seeks],
[user_scans], [last_user_seek], [last_user_scan],
[avg_total_user_cost], [avg_user_impact], [system_seeks],
[system_scans], [last_system_seek], [last_system_scan],
[avg_total_system_cost], [avg_system_impact], [last_poll_unique_compiles],
[last_poll_user_seeks], [last_poll_user_scans], [last_poll_system_seeks],
[last_poll_system_scans], [date_stamp]
)
SELECT
SDDMIGS.[group_handle], SDDMIGS.[unique_compiles], SDDMIGS.[user_seeks],
SDDMIGS.[user_scans], SDDMIGS.[last_user_seek], SDDMIGS.[last_user_scan],
SDDMIGS.[avg_total_user_cost], SDDMIGS.[avg_user_impact], SDDMIGS.[system_seeks],
SDDMIGS.[system_scans], SDDMIGS.[last_system_seek], SDDMIGS.[last_system_scan],
SDDMIGS.[avg_total_system_cost], SDDMIGS.[avg_system_impact], SDDMIGS.[unique_compiles],
SDDMIGS.[user_seeks], SDDMIGS.[user_scans], SDDMIGS.[system_seeks],
SDDMIGS.[system_scans], GETDATE()
FROM [sys].[dm_db_missing_index_group_stats] SDDMIGS
LEFT JOIN iDBA.[MetaBOT] .[dm_db_missing_index_group_stats] MDDMIGS
ON SDDMIGS.[group_handle] = MDDMIGS.[group_handle]
WHERE MDDMIGS.[group_handle] IS NULL
As with the methods outlined previously, the calculations for updating running total values is as follows:
- If a restart of services has not occurred since the last collection of metadata the formula for updating the relevant column with new data is: TABLE.COL + (DMV.COL - TABLE.LAST_POLL_COL) Simply put, it's the currently stored value for the column (TABLE.COL) + the difference between the current value of the corresponding column in the DMV (DMV.COL) and the last value persisted from the DMV (TABLE.LAST_POLL_COL).
- If a restart of services has occurred since the last data collection the formula used will be: TABLE.COL + DMV.COL, since the values stored in the Dynamic Management View will have reset upon service restart.
Just as with sys.dm_db_index_usage_stats, and sys.dm_db_index_operational_stats, the final step is to create a SQL Server Agent Job for the execution of this stored procedure. In our environment I run this process six times daily, immediately after I've run the similar Agent jobs to store data from sys.dm_db_index_usage_stats, and sys.dm_db_index_operational_stats. It runs outside of the timeframe of scheduled index maintenance. I encourage you to test with schedules that fit your individual needs.
The final tip in this series will outline methods for using this information as opposed to relying on the Dynamic Management Objects after a system restart occurs. The need to persist this data will be quite clear once we have a chance to review the query results in this final tip in the series.
Next Steps
- Follow this series from the beginning. Here is the first part of the series, pertaining to sys.dm_db_index_usage_stats. Here is the second tip in the series.
- Review tips on utilizing the results from the missing index Dynamic Management Functions on MSSQLTips.com
- Check out our list of Dynamic Management Objects.
- Create the objects and structures from this tip in your environment.
- Download the scripts
- Read Part1, Part2
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips
Article Last Updated: 2009-07-08