By: Aaron Bertrand | Updated: 2020-05-26
Problem
In Part 1, I wrote about discovering more details about large SQL Server tables, and in Part 2, I showed a SQL Server stored procedure that makes it easy to assemble all of this information into a single result set. That procedure, however, assumes you already know which SQL Server table(s) to investigate. What if you only know that the files for a specific SQL Server database are growing or seem misconfigured? Wouldn't it be nice to find the tables in that database with the largest size on disk or the highest number of rows?
Solution
I looked at the solution from Part 2, and realized that it would take little more than a wrapper procedure to accomplish my goal. The concept is that you pass in the name of the database, the number of tables you want to return, and whether you want to rank by size or by row count. The procedure would then run a query against either sys.partitions or sys.dm_db_database_page_allocations to determine the top n tables, and then – for each table in the list – call the original procedure. This would produce n result sets, each showing how a table is distributed across the files and filegroups.
Another approach could be to change the original procedure to pivot each table and its indexes into a single result set, but that may be more cumbersome both to create and to consume.
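To make the ranking half of the idea concrete before diving into the dynamic SQL, here is a minimal, static sketch of the top-n query when ranking by rows. This is just an illustration of the concept; the wrapper procedure below builds the equivalent statement dynamically so it can switch between row count and page count:

```sql
-- minimal sketch: run in the target database; the wrapper builds this dynamically
DECLARE @TableCount tinyint = 5;

;WITH x AS
(
  SELECT TOP (@TableCount) object_id, c = SUM(rows)
  FROM sys.partitions
  WHERE OBJECTPROPERTY(object_id, N'IsMsShipped') = 0
  GROUP BY object_id
  ORDER BY c DESC
)
SELECT SchemaName = s.name, ObjectName = o.name, RowOrPageCount = x.c
FROM x
INNER JOIN sys.objects AS o ON x.object_id = o.object_id
INNER JOIN sys.schemas AS s ON o.schema_id = s.schema_id;
```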
Before I show the code in the wrapper procedure I created, let's create a simple database with two filegroups, three files, and two tables, just to keep the results explanatory but digestible. Here is the database:
CREATE DATABASE gronk
ON PRIMARY
(name = N'gronk_data1', filename = N'/var/opt/mssql/data/gronk1.mdf', size = 10 MB),
FILEGROUP SECONDARY
(name = N'gronk_data2a', filename = N'/var/opt/mssql/data/gronk2a.mdf', size = 10 MB),
(name = N'gronk_data2b', filename = N'/var/opt/mssql/data/gronk2b.mdf', size = 10 MB)
LOG ON
(name = N'gronk_log', filename = N'/var/opt/mssql/data/gronk_log.ldf', size = 10 MB);
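As a quick sanity check I've added here (not part of the original script), you can confirm the file and filegroup layout against the catalog views:

```sql
-- hypothetical sanity check: confirm the filegroup/file layout just created
USE gronk;
GO
SELECT [filegroup] = fg.name,
       [file]      = f.name,
       f.physical_name,
       size_mb     = f.size * 8 / 1024   -- size is stored in 8 KB pages
FROM sys.database_files AS f
LEFT JOIN sys.filegroups AS fg
  ON f.data_space_id = fg.data_space_id; -- the log file belongs to no filegroup
```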
Then two simple tables: one with more rows but a small row size, the other with fewer rows but more data per row:
USE gronk;
GO

-- in the first table, we'll put a lot of rows, but small row size
CREATE TABLE dbo.tblPrimary
(
  id int NOT NULL,
  CONSTRAINT pk_tblPrimary PRIMARY KEY (id),
  INDEX ix_tblPrimary (id DESC)
) ON [PRIMARY];
GO

INSERT dbo.tblPrimary(id) SELECT [object_id] FROM sys.all_objects;

-- in the second table, we'll put fewer rows, but larger row size
CREATE TABLE dbo.tblSecondary
(
  id int NOT NULL,
  filler char(4000) NOT NULL DEFAULT '',
  CONSTRAINT pk_tblSecondary PRIMARY KEY (id)
) ON [SECONDARY];

CREATE INDEX ix_tblSecondary ON dbo.tblSecondary(id DESC)
  INCLUDE (filler) ON [SECONDARY];
GO

INSERT dbo.tblSecondary(id) SELECT TOP (1000) [object_id] FROM sys.all_objects;
GO
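If you want to confirm the two tables ended up skewed as intended (a quick check I've added here, not in the original), sys.dm_db_partition_stats reports both row and page counts cheaply:

```sql
-- hypothetical check: tblPrimary should lead on rows, tblSecondary on pages
SELECT [table]   = OBJECT_NAME(ps.[object_id]),
       row_count = SUM(CASE WHEN ps.index_id IN (0,1) THEN ps.row_count END),
       pages     = SUM(ps.used_page_count)  -- across all indexes
FROM sys.dm_db_partition_stats AS ps
WHERE ps.[object_id] IN
      (OBJECT_ID(N'dbo.tblPrimary'), OBJECT_ID(N'dbo.tblSecondary'))
GROUP BY ps.[object_id];
```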
With those tables in place, I'll go back to the Utility database, and create the following wrapper procedure:
USE Utility;
GO

CREATE PROCEDURE dbo.AssessDistribution_ForMultipleTables
  @DatabaseName  nvarchar(260),
  @TableCount    tinyint = 5,
  @PrimaryFactor char(4) = 'rows' -- or 'size'
AS
BEGIN
  SET NOCOUNT ON;

  IF DB_ID(@DatabaseName) IS NULL
  BEGIN
    RAISERROR(N'Database %s does not exist.', 11, 1, @DatabaseName);
    RETURN;
  END

  DECLARE @sql nvarchar(max) = N';WITH x AS (SELECT TOP (@TableCount) object_id, ',
          @SchemaName sysname,
          @ObjectName sysname,
          @RowOrPageCount int,
          @c cursor,
          @Context nvarchar(1024) = COALESCE(QUOTENAME(@DatabaseName) + N'.', N'')
            + N'sys.sp_executesql';

  DECLARE @t table
  (
    rn tinyint PRIMARY KEY,
    SchemaName sysname,
    ObjectName sysname,
    RowOrPageCount int
  );

  IF @PrimaryFactor = 'rows'
  BEGIN
    SET @sql += N'c = SUM(rows) FROM sys.partitions';
  END

  IF @PrimaryFactor = 'size'
  BEGIN
    SET @sql += N'c = COUNT(*) FROM sys.dm_db_database_page_allocations
      (DB_ID(), NULL, NULL, NULL, N''LIMITED'')';
  END

  SET @sql += N' WHERE OBJECTPROPERTY(object_id, N''IsMsShipped'') = 0
    -- AND index_id IN (0,1)
    GROUP BY object_id
    ORDER BY c DESC
  )
  SELECT rn = ROW_NUMBER() OVER (ORDER BY x.c DESC),
         SchemaName = s.name,
         ObjectName = o.name,
         RowOrPageCount = x.c
  FROM x
  INNER JOIN sys.objects AS o ON x.object_id = o.object_id
  INNER JOIN sys.schemas AS s ON o.schema_id = s.schema_id;';

  INSERT @t EXEC @Context @sql, N'@TableCount int', @TableCount;

  IF @@ROWCOUNT = 0
  BEGIN
    RAISERROR(N'No tables found.', 11, 1);
    RETURN;
  END

  SET @c = cursor FORWARD_ONLY STATIC READ_ONLY FOR
    SELECT SchemaName, ObjectName, RowOrPageCount
    FROM @t ORDER BY rn;

  OPEN @c;

  FETCH NEXT FROM @c INTO @SchemaName, @ObjectName, @RowOrPageCount;

  WHILE @@FETCH_STATUS <> -1
  BEGIN
    SELECT obj    = QUOTENAME(@SchemaName) + N'.' + QUOTENAME(@ObjectName),
           factor = @PrimaryFactor,
           num    = @RowOrPageCount;

    EXEC Utility.dbo.AssessDistribution_ByTable
      @ObjectName   = @ObjectName,
      @SchemaName   = @SchemaName,
      @DatabaseName = @DatabaseName;

    FETCH NEXT FROM @c INTO @SchemaName, @ObjectName, @RowOrPageCount;
  END
END
GO
Now, if I call it with this syntax, asking for tables by row count:
EXEC Utility.dbo.AssessDistribution_ForMultipleTables
@DatabaseName = N'gronk',
@PrimaryFactor = 'rows';
I get these results:
And if I switch to ranking tables by size:
EXEC Utility.dbo.AssessDistribution_ForMultipleTables
@DatabaseName = N'gronk',
@PrimaryFactor = 'size';
The results come back in the opposite order (and indicate the number of pages as opposed to the number of rows):
A few caveats:
- This procedure sums the row count or size across all indexes. If you only want to account for the clustered index (or the table, if you have heaps), uncomment the -- AND index_id IN (0,1) line.
- This solution is still untested with columnstore indexes or memory-optimized tables.
- Because it bears repeating, sys.dm_db_database_page_allocations has some potential performance issues; if you ask for 10 tables, you're going to be calling this function 11 times. Just something to keep in mind.
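If the overhead of sys.dm_db_database_page_allocations is a concern for the ranking step, one possible variation (my own suggestion, not something tested in this tip) is to build the 'size' branch on sys.dm_db_partition_stats instead, which reads metadata rather than scanning allocation pages. Since that view also exposes object_id and index_id, the rest of the dynamic statement can stay the same:

```sql
-- possible lighter-weight branch for @PrimaryFactor = 'size':
SET @sql += N'c = SUM(used_page_count) FROM sys.dm_db_partition_stats';
-- the WHERE / GROUP BY / ORDER BY portion of the dynamic SQL is unchanged
```

Note this only changes how the top-n list is ranked; the per-table calls to the Part 2 procedure are unaffected.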
Next Steps
See these related tips and other resources:
- SQL Server File and Filegroup Space Usage – Part 1
- SQL Server Index and Partition Space Usage - Part 2
- Use caution with sys.dm_db_database_page_allocations in SQL Server
- Different ways to determine free space for SQL Server databases and database files
- Collect SQL Server database usage for free space and trending
- Script to determine free space to support shrinking SQL Server database files
- Determine Free Space, Consumed Space and Total Space Allocated for SQL Server databases