SQL Server 2014 Incremental Update Statistics Per Partition

By:   |   Updated: 2017-05-05   |   Comments   |   Related: > Query Optimization


Problem

The SQL Server Query Optimizer depends heavily on the statistics in generating the most efficient query plan execution. These statistics provide the query optimizer with the distribution of the columns values in the table and the number of rows, also called the cardinality that will result from the query.

Without these statistics, or with outdated statistics, the query will be slow, as the query optimizer uses inaccurate statistics to create the query execution plan, which is not the optimal plan to execute the query in this case. SQL Server usually does its job in keeping these statistics up-to-date. But as a database administrator, you should do your job, in some cases, by updating the statistics manually.

Updating statistics manually in large tables can be as a big challenge, as the entire table will be scanned, even if a small amount of data has changed recently. You will face the same issue with large partitioned tables, where the update statistics process will scan all the table partitions even if only one partition had changes since the last statistics update process. Is there a way to overcome this issue and update statistics for the changed partition only?

Solution

SQL Server 2014 introduced a new feature, Incremental Statistics, which helps in updating statistics for only the partition or the partitions that you choose. Instead of scanning the whole table to update the statistics, the selected partition will be scanned only for updating, reducing the time required to perform the update statistics operation, by updating only the modified partition. The other valuable point is that the percentage of data changes required to trigger the automatic update of statistics, 20% of rows changed, will be applied now at the partition level.

To understand updating the incremental statistics practically, we will prepare a test database with a partitioned table. Let us start with creating a new database MSSQLTips_PartDemo with four new filegroups in addition to the Primary default filegroup, and add four database data files in these filegroups using the below script:

USE [master]
GO
CREATE DATABASE [MSSQLTips_PartDemo]
GO
ALTER DATABASE [MSSQLTips_PartDemo] ADD FILEGROUP [FirstQ]
GO
ALTER DATABASE [MSSQLTips_PartDemo] ADD FILEGROUP [SecondQ]
GO
ALTER DATABASE [MSSQLTips_PartDemo] ADD FILEGROUP [ThirdQ]
GO
ALTER DATABASE [MSSQLTips_PartDemo] ADD FILEGROUP [FourthQ]
GO

USE [master]
GO
ALTER DATABASE [MSSQLTips_PartDemo] ADD FILE ( NAME = N'FiQ_2016', FILENAME = N'D:\Data\FiQ_2016.ndf' , SIZE = 4096KB , FILEGROWTH = 1024KB ) TO FILEGROUP [FirstQ]
GO
ALTER DATABASE [MSSQLTips_PartDemo] ADD FILE ( NAME = N'SeQ_2016', FILENAME = N'D:\Data\SeQ_2016.ndf' , SIZE = 4096KB , FILEGROWTH = 1024KB ) TO FILEGROUP [SecondQ]
GO
ALTER DATABASE [MSSQLTips_PartDemo] ADD FILE ( NAME = N'ThQ_2016', FILENAME = N'D:\Data\ThQ_2016.ndf' , SIZE = 4096KB , FILEGROWTH = 1024KB ) TO FILEGROUP [ThirdQ]
GO
ALTER DATABASE [MSSQLTips_PartDemo] ADD FILE ( NAME = N'FoQ_2016', FILENAME = N'D:\Data\FoQ_2016.ndf' , SIZE = 4096KB , FILEGROWTH = 1024KB ) TO FILEGROUP [FourthQ]
GO
 

Once the database is created with the new filegroups and data files, we need to prepare it to host the partitioned table. First, we will create the partitioned function PartitionMSSQLByQuarter that classifies the data depending on the four quarters of the year as follows:

USE [MSSQLTips_PartDemo]
GO

CREATE PARTITION FUNCTION PartitionMSSQLByQuarter(INT) AS
  RANGE RIGHT FOR VALUES
  (20161,20162,20163,20164)
GO

Moreover, we will create the partitioning scheme SchemeForPartitionByQuarter that specifies the filegroup that is assigned for each partition value stated in the partitioning function:

USE [MSSQLTips_PartDemo]
GO

CREATE PARTITION SCHEME SchemeForPartitionByQuarter AS
PARTITION PartitionMSSQLByQuarter  TO
  (
   [FirstQ],
   [SecondQ],
   [ThirdQ],
   [FourthQ],
   [PRIMARY] )
GO

We can now easily create the PartitionIncrStatDemo table that is partitioned based on the quarter computed value using the partitioning function and scheme created in the previous steps:

USE [MSSQLTips_PartDemo]
GO

CREATE TABLE [dbo].[PartitionIncrStatDemo](
[ID] [int] NULL,
[ActionGUID] [nvarchar](50) NULL,
[TS] [datetime] NULL,
[QuarterNum]  AS (datepart(year,[TS])*(10)+datepart(quarter,[TS])) PERSISTED
) ON [SchemeForPartitionByQuarter ] ([QuarterNum])
GO

A database level setting, Auto Create Incremental Statistics, can be used to enable the incremental statistics. With this option, any new auto created column statistics on a partitioned table will use the incremental statistics. You can set it to True from the Options page of the Database Properties window as shown below:

Auto Create Incremental Statistics in SQL Server

The same action can be performed using the below ALTER DATABASE T-SQL statement:

USE [master]
GO
ALTER DATABASE [MSSQLTips_PartDemo] SET AUTO_CREATE_STATISTICS ON(INCREMENTAL = ON)
GO

If the incremental statistics is not enabled at the database level, you need to enable it when defining the table’s clustered index by turning on the STATISTICS_INCREMENTAL as follows:

CREATE CLUSTERED INDEX IX_PartitionIncrStatDemo_ID
ON [PartitionIncrStatDemo] (ID) WITH (STATISTICS_INCREMENTAL=ON);
GO

The sys.stats DMV can be queried to check if the incremental statistics is enabled or not on a specific index. The is_incremental with value equal to 1 means that the incremental statistics is enabled on that table as shown below:

USE MSSQLTips_PartDemo
GO

SELECT
OBJECT_NAME(object_id) TableName
,name
,is_incremental
,stats_id
FROM sys.stats
WHERE name = 'IX_PartitionIncrStatDemo_ID'

The result in our case will show us that the incremental statistics is enabled on that table:

SQL Server is_incremental Flag

We will fill the PartitionIncrStatDemo table with 8000 records, 2000 records in each partition. Recall that the number beside the GO statement specifies the number of times this query will be executed:

USE [MSSQLTips_PartDemo]
GO
INSERT [dbo].[PartitionIncrStatDemo] ([ID], [ActionGUID], [TS])
VALUES (1, N'CF2FA597-7D2D-471E-8158-D51F3967B2DA', CAST(N'2016-11-11T00:00:00.000' AS DateTime))
GO 2000
INSERT [dbo].[PartitionIncrStatDemo] ([ID], [ActionGUID], [TS])
VALUES (2, N'4CFA77B9-17D6-4F20-B3FD-B7F60675C705', CAST(N'2016-06-15T00:00:00.000' AS DateTime))
GO 2000
INSERT [dbo].[PartitionIncrStatDemo] ([ID], [ActionGUID], [TS])
VALUES (3, N'5F89A428-E646-43C5-8AB1-22398B9E47F4', CAST(N'2016-01-23T00:00:00.000' AS DateTime))
GO 2000

INSERT [dbo].[PartitionIncrStatDemo] ([ID], [ActionGUID], [TS])
VALUES (4, N'D73375CE-3C93-4E36-AE17-0DD08CFF9CD2', CAST(N'2016-08-05T00:00:00.000' AS DateTime))
GO 2000

Querying the sys.partitions system object using the below script:

SELECT partition_number, rows
FROM sys.partitions
WHERE OBJECT_NAME(OBJECT_ID)='PartitionIncrStatDemo';
GO

The result will show us that each partition is filled with 2000 records, except the first one, which is the Primary filegroup:

Partitions Data Distribution

Let's run the below SELECT statement that reads about 6000 records in order to trigger the automatic update of the table statistics:

SELECT * FROM [PartitionIncrStatDemo] WHERE ID <> 1
GO

To check the statistics properties of the PartitionIncrStatDemo table, including the last statistics update time, the  sys.dm_db_stats_properties DMF can be queried using the following SELECT statement:

SELECT object_id, stats_id , last_updated , rows , rows_sampled , steps  FROM sys.dm_db_stats_properties(OBJECT_ID('[PartitionIncrStatDemo]'),1);
GO

The result will be like:

Last Updated SQL Server Statistics

As you can see, the sys.dm_db_stats_properties DMF shows us that the statistics were updated on that time, for the table that has 8000 rows. The strong question here, which partition of that table includes the updated statistics?

SQL Server 2014 introduced a new DMF sys.dm_db_incremental_stats_properties that shows us the incremental statistics properties. It will retrieve the same information retrieved from the sys.dm_db_stats_properties DMF, but for each partition in the partitioned table, by providing it with the same parameters; the object ID and the stats ID.

Let us trigger the update statistics again by deleting 1500 records from partition 3 and run another SELECT query:

DELETE TOP (1500)  FROM [PartitionIncrStatDemo] where ID  =2
GO
SELECT * FROM [PartitionIncrStatDemo] where ID  <> 4
GO

Querying the sys.dm_db_stats_properties DMF:

SELECT object_id, stats_id , last_updated , rows , rows_sampled , steps  FROM sys.dm_db_stats_properties(OBJECT_ID('[PartitionIncrStatDemo]'),1);
GO

You will see that the partitioned table’s statistics were updated on the below date:

Last Updated SQL Server Statistics

Querying the sys.dm_db_incremental_stats_properties DMF with the same parameters:

SELECT object_id, stats_id ,partition_number , last_updated , rows , rows_sampled , steps 
FROM sys.dm_db_incremental_stats_properties(OBJECT_ID('PartitionIncrStatDemo'),1);

The result will show us that the statistics were updated at that time, but specifically on partition number 3:

Last Updated SQL Server Statistics for Partition 3

The UPDATE STATISTICS … ON PARTITION command now allows you to update the statistics on a specific partition only as follows:

UPDATE STATISTICS dbo.PartitionIncrStatDemo(IX_PartitionIncrStatDemo_ID) WITH RESAMPLE ON PARTITIONS(3)
GO

Where RESAMPLE is mandatory, it reads the leaf-level statistics using the same sample rates, and merges the generated result back into the main statistics histogram.

Querying the sys.dm_db_incremental_stats_properties DMF again:

SELECT object_id, stats_id ,partition_number , last_updated , rows , rows_sampled , steps 
FROM sys.dm_db_incremental_stats_properties(OBJECT_ID('PartitionIncrStatDemo'),1);

You will see clearly that partition number 3 is the only partition with its statistics updated using the UPDATE STATISTICS … ON PARTITION command:

Last Updated SQL Server Statistics for Partition 3
Next Steps


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Ahmad Yaseen Ahmad Yaseen is a SQL Server DBA with a bachelor’s degree in computer engineering as well as .NET development experience.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips


Article Last Updated: 2017-05-05

Comments For This Article

















get free sql tips
agree to terms