By: Jeremy Kadlec | Updated: 2007-12-12 | Comments (6) | Related: > Database Design
Problem
I am concerned our database design is overly denormalized. I believe we have some very wide tables which makes some of our coding very easy, but I am concerned about the data access and overall performance. Before I go too far down the path of just changing some of these tables or trying to change the ways of my team for future database design, how can I find out how wide some of these tables really are based on the data as compared to the theoretical maximum size? Can you provide a script or two that I can run on my SQL Server databases? Can you also provide any resources for practical database design? I know having some denormalization is reasonable, I am just not sure if some of the database design has gone to an extreme or not.
Solution
As far as calculating table width is concerned, a few different options are available. Let's take a look at those scripts and then address the database design independently.
Option 1 - DBCC SHOWCONTIG
The DBCC SHOWCONTIG command reports row related information and is a viable source to consider. This is achieved by using the WITH TABLERESULTS option. Then the MinimumRecordSize, MaximumRecordSize and AverageRecordSize can be reviewed based on your need.
Simple DBCC SHOWCONTIG Command |
USE AdventureWorks; GO DBCC SHOWCONTIG WITH TABLERESULTS; GO |
Keep in mind that this functionality of the DBCC SHOWCONTIG command is available with SQL Server 2000 and SQL Server 2005. Although on busy SQL Server databases it is not recommended to be run during production hours. It is advisable to run the command either during off hours, during a maintenance window or on a backed up version of the database. Here is one point of reference - DBCC SHOWCONTIG Blocking Issues.
Option 2 - sys.dm_db_index_physical_stats
One of the new features with SQL Server 2005 are the dynamic management views and functions. The one that will come in handy in this scenario is the sys.dm_db_index_physical_stats. The beauty of the dynamic management views and functions is that they can be queried with a simple SELECT statement. Here are a few different examples using the AdventureWorks SQL Server 2005 sample database:
sys.dm_db_index_physical_stats - Basic SELECT Statement |
USE AdventureWorks; GO SELECT CAST(DB_NAME(DATABASE_ID) AS VARCHAR(20)) AS 'DatabaseName', CAST(OBJECT_NAME([OBJECT_ID]) AS VARCHAR(20)) AS 'TableName', index_id, index_type_desc, alloc_unit_type_desc, min_record_size_in_bytes, max_record_size_in_bytes, avg_record_size_in_bytes FROM SYS.DM_DB_INDEX_PHYSICAL_STATS (DB_ID('AdventureWorks'),NULL,NULL,NULL,'DETAILED'); GO |
sys.dm_db_index_physical_stats - Basic SELECT Statement with an ORDER BY Clause |
USE AdventureWorks; GO SELECT CAST(DB_NAME(DATABASE_ID) AS VARCHAR(20)) AS 'DatabaseName', CAST(OBJECT_NAME([OBJECT_ID]) AS VARCHAR(20)) AS 'TableName', index_id, index_type_desc, alloc_unit_type_desc, min_record_size_in_bytes, max_record_size_in_bytes, avg_record_size_in_bytes FROM SYS.DM_DB_INDEX_PHYSICAL_STATS (DB_ID('AdventureWorks'),NULL,NULL,NULL,'DETAILED') ORDER BY avg_record_size_in_bytes DESC; GO |
Database Design Considerations
Here are some data points to consider when evaluating your database design situation i.e. wide tables:
- Good or Bad - A wide table is not inherently a bad design depending on how it is used. For reporting environments, some database designs are wide to meet reporting needs to avoid joins or provide a simple interface to users/reporting applications.
- Eliminate Joins - In OLTP environments, data is duplicated in some circumstances to eliminate joining tables. Depending on the circumstances and how the duplicated data is maintained, this may be a critical technique to ensure the user experience.
- Repeating columns - To me this is typically a sure sign that the design was not thought about or it has evolved over time. If a table has 3 or more columns with the same meaning i.e. product1, product2, product3, this is a good candidate for a one to many relationship with the current table and the new product table. Another data point to consider is what will happen when a fourth or fifth product is ordered?
- Off the map - If a database consists of a few (wide) tables and all of the columns are text data types when particular columns would be better suited as integers, date time, etc. as opposed to just characters, then either the design was not thought out at all or some database design education is needed.
- Check out the Next Steps section below for some additional points of reference related to the database design.
Next Steps
- As you begin to tackle your wide tables, consider these scripts as a good starting point to address the potential issue. The scripts should provide a means to focus in on the tables with the most critical issues. Then begin to address them one at a time until you reach an acceptable resolution.
- Check out these tips related to DBCC SHOWCONTIG:
- Check out these tips related to sys.dm_db_index_physical_stats:
- Check out these tips related to database design:
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips
Article Last Updated: 2007-12-12