By: Eli Leiba | Updated: 2011-04-11 | Comments (4) | Related: > Dynamic Management Views and Functions
Problem
I/O is one of the most time consuming activities in SQL Server. If you can reduce the I/O wait time then you can improve performance. This can be done with indexing and tuning queries, but I/O issues may also be at the file level. In this tip we look at how to identify specific files that consume the most I/O activity using a SQL Server DMV and creating snapshots for analysis.
Solution
The solution I suggest here is to collect I/O statistics for a certain time period and then compare the data snapshots to identify potential I/O bottlenecks. First we will collect a baseline and then once a day (or hour or any time period) collect the statistics again and check the difference between two different snapshots.
The I/O statistics will be collected from a query using sys.dm_io_virtual_file_stats and sys.master_files. The data that will be included consists of the following data columns (Table from BOL).
Column | Type | Description |
---|---|---|
num_of_reads | bigint | Number of reads issued on the file. |
num_of_bytes_read | bigint | Total number of bytes read on this file. |
io_stall_read_ms | bigint | Total time, in milliseconds, that the users waited for reads issued on the file. |
num_of_writes | bigint | Number of writes made on this file. |
num_of_bytes_written | bigint | Total number of bytes written to the file. |
io_stall_write_ms | bigint | Total time, in milliseconds, that users waited for writes to be completed on the file. |
io_stall | bigint | Total time, in milliseconds, that users waited for I/O to be completed on the file. |
size_on_disk_bytes | bigint | Number of bytes used on the disk for this file. For sparse files, this number is the actual number of bytes on the disk that are used for database snapshots. |
I/O Statistics collection process
First we must create tables to store the I/O statistics. The following code will create two tables. You can put this in the master database or whatever database you prefer.
-- create a table for snapshot sequence generation
CREATE TABLE io_snapshots
(snap_id INT IDENTITY NOT NULL,
snapshot_creation_date DATETIME NOT NULL )
GO
ALTER TABLE io_snapshots ADD CONSTRAINT PK_io_snapshots PRIMARY KEY (snap_id)
GO
-- create a table for the io statistics
CREATE TABLE io_snapshots_statistics
(snap_id INT NOT NULL,
[db_id] smallint NOT NULL,
[file_id] smallint NOT NULL,
database_name SYSNAME,
physical_file_name SYSNAME,
Diff_Number_of_reads bigint,
Diff_Bytes_Read bigint,
Diff_Read_stall_time_ms bigint,
Diff_Number_of_writes bigint,
Diff_Bytes_written bigint,
Diff_Write_stall_time_ms bigint,
Diff_Read_Write_stall_ms bigint,
size_on_disk_MB bigint)
GO
ALTER TABLE io_snapshots_statistics ADD CONSTRAINT PK_io_snapshots_statistics
PRIMARY KEY (snap_id,[db_id], [file_id])
GO
ALTER TABLE io_snapshots_statistics ADD CONSTRAINT FK_io_snapshots_statistics_io_snapshots
FOREIGN KEY (snap_id) REFERENCES io_snapshots (snap_id)
GO
Second we must create the following two stored procedures.
The first SP (usp_io_vf_stats_snap) inserts a new I/O snapshot of statistics each time it is run.
-- The First Procedure.
CREATE PROC [dbo].[usp_io_vf_stats_snap]
AS
BEGIN
SET NOCOUNT ON
INSERT INTO io_snapshots ( snapshot_creation_date) SELECT GETDATE()
INSERT INTO io_snapshots_statistics
(snap_id,
[db_id],
[file_id],
database_name ,
physical_file_name,
Diff_Number_of_reads,
Diff_Bytes_Read,
Diff_Read_stall_time_ms,
Diff_Number_of_writes,
Diff_Bytes_written,
Diff_Write_stall_time_ms,
Diff_Read_Write_stall_ms,
size_on_disk_MB)
SELECT
(SELECT MAX(snap_id) FROM io_snapshots),
db_files.database_id,
db_files.FILE_ID,
DB_NAME(db_files.database_id) AS Database_Name,
db_files.physical_name AS File_actual_name,
num_of_reads AS Number_of_reads,
num_of_bytes_read AS Bytes_Read,
io_stall_read_ms AS Read_time_stall_ms,
num_of_writes AS Number_of_writes,
num_of_bytes_written AS Bytes_written,
io_stall_write_ms AS Write_time_stall_ms,
io_stall AS Read_Write_stall_ms,
size_on_disk_bytes / POWER(1024,2) AS size_on_disk_MB
FROM
sys.dm_io_virtual_file_stats(NULL,NULL) dm_io_vf_stats ,
sys.master_files db_files
WHERE
db_files.database_id = dm_io_vf_stats.database_id
AND db_files.[file_id] = dm_io_vf_stats.[file_id];
SET NOCOUNT OFF
END
GO
The second SP (usp_compare_io_stats_snaps) shows the difference between two snapshots and can be run several ways:
- If you don't pass in any parameter values it will compare the last two snapshots that were created
- If you pass in only the starting snapshot ID then it will compare that snapshot with the last snapshot
- Lastly, you can pass in a starting snapshot ID and an ending snapshot ID to compare any two time periods
-- The Second Procedure.
CREATE PROC [dbo].[usp_compare_io_stats_snaps]
(@start_snap_ID INT = NULL,
@end_snap_ID INT = NULL)
AS
DECLARE @end_snp INT
DECLARE @start_snp INT
BEGIN
SET NOCOUNT ON
IF (@end_snap_ID IS NULL)
SELECT @end_snp = MAX(snap_id) FROM io_snapshots
ELSE SET @end_snp = @end_snap_ID
IF (@start_snap_ID IS NULL)
SELECT @start_snp = @end_snp -1
ELSE SET @start_snp = @start_snap_ID
SELECT
CONVERT(VARCHAR(12),S.snapshot_creation_date,101) AS snapshot_creation_date,
A.database_name,
A.physical_file_name,
A.size_on_disk_MB,
A.Diff_Number_of_reads - B.Diff_Number_of_reads AS Diff_Number_of_reads,
A.Diff_Bytes_read - B.Diff_Bytes_read AS Diff_Bytes_read,
A.Diff_Read_stall_time_ms - B.Diff_Read_stall_time_ms AS Diff_Read_stall_time_ms,
A.Diff_Number_of_writes - B.Diff_Number_of_writes AS Diff_Number_of_writes,
A.Diff_Bytes_written - B.Diff_Bytes_written AS Diff_Bytes_written,
A.Diff_Write_stall_time_ms - B.Diff_Write_stall_time_ms AS Diff_Write_stall_time_ms,
A.Diff_Read_Write_stall_ms - B.Diff_Read_Write_stall_ms AS Diff_Read_Write_stall_ms ,
DATEDIFF (hh,S1.snapshot_creation_date, S.snapshot_creation_date) AS Diff_time_hours
FROM
io_snapshots S ,
io_snapshots S1,
io_snapshots_statistics A ,
io_snapshots_statistics B
WHERE
S.snap_id = @end_snp AND
S.snap_id = A.snap_id AND
B.snap_id = @start_snp AND
A.[db_id] = B.[db_id] AND
A.[file_id] = B.[file_id] AND
S1.snap_id = @start_snp AND
S1.snap_id = B.snap_id
ORDER BY
A.database_name,
A.physical_file_name
SET NOCOUNT OFF
END
GO
After data has been collected using the first SP, the second SP would be used to compare the differences. The output would look something like the below images (the images were broken into two parts to make it easier to read).
We can see the time of the last snapshot, the database, the physical file, the differences between the stats that were collected and the difference in hours between the two snapshots.
Next Steps
- Create the two tables and stored procedures in the scripts above and begin collecting statistics.
- Create a SQL Agent job to run usp_io_vf_stats_snap procedure on a daily or hourly basis.
- After at least two statistics collections, you can execute usp_compare_io_stats_snaps
- Check the result to see where the I/O bottlenecks are and try to load balance the activity on the files by separating them into two or more files or moving files to different drives.
- Read these additional tips about Dynamic Management Views
- Read these additional tips about Peformance Tuning
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips
Article Last Updated: 2011-04-11