By: Jeremy Kadlec | Updated: 2011-10-04 | Comments (9) | Related: 1 | 2 | 3 | > Indexing
Problem
I know it is a general best practice to have a clustered index on all of your SQL Server tables. Unfortunately, everyone that works with SQL Server in our company does not know that. I think I have uncovered a pattern where a number of tables do not have a clustered index. Can you provide a script or two to find out which tables do not have a clustered index? Check out this tip to learn more.
Solution
In general it does make sense to have a clustered index on your tables. I would not say this is an absolute rule 100% of the time. However, if a table does not have a clustered index, it should be a conscious, rational decision. To address your problem, I believe we need to address the following questions:
- What is a SQL Server Clustered Index?
- Why do I need a SQL Server Clustered Index?
- How can I find tables in my database without a SQL Server clustered index?
- How can I find out how many tables across all of my SQL Server databases that do have a SQL Server clustered index?
What is a SQL Server Clustered Index?
At the most basic level, a clustered index provides order to a table because the data is stored based on the clustered index columns. Keep in mind that there is 1 clustered index per table, so there is only 1 order for the data. Generally speaking, data can be consistently accessed based on the clustered index as opposed to what is called a heap. A table without a clustered index is called a heap. With a heap, the data is not ordered by an index, so data is not stored in any particular order.
A clustered index is generally a single column for most tables, but clustered indexes can be multiple columns. Often times, clustered indexes are created when the primary key for the table is created in tools like SQL Server Management Studio's Database Designer. However, this is not a requirement; the clustered index and Primary Key are not always best for suited for the same column(s). How the data is searched can be very different from how each row is uniquely identified.
Speaking of order, a clustered index can be created in ascending or descending order. Most clustered indexes are created in ascending order, because that is logically how the data is viewed by users in the application.
For more information about clustered indexes, check out the following tips:
Why do I need a SQL Server Clustered Index?
The most important reason to have a clustered index is to improve data access. Generally speaking the proper clustered index that is properly maintained will improve overall application performance. In general clustered indexes on columns with help these types of operations in queries are improved:
- WHERE clauses with BETWEEN, greater than, less than, etc. logic
- Column(s) used to JOIN tables
- Columns used in ORDER BY or GROUP BY clauses
Keep in mind you can only have 1 clustered index per table, so you need to chose it wisely to benefit the most queries. As I mentioned earlier, it is generally a good idea to have a clustered index, but the definition for the clustered index can change. As such, if how the data is queried changes significantly be sure to verify your clustered index still is appropriate. If not, change your clustered index to match your new data access patterns.
For more information about clustered indexes, check out the following tips:
- Category - Maintenance
- Category - Maintenance Plans
- Finding a better candidate for your SQL Server clustered indexes
How can I find tables in my database without a SQL Server clustered index?
In SQL Server 2005 and beyond, we can query the sys.indexes and sys.objects catalog views to determine which tables in our database are a 'heap', meaning they do not have a clustered index. Let's explain the query below:
- Line 1 - Retrieve the top 1000 records from the database to include only the following columns:
- name from sys.objects which is the table name
- type_desc from sys.indexes which indicates the table is a HEAP
- o.type_desc from sys.indexes which indicates the table is a USER_TABLE
- o.create_date from sys.indexes which indicates when the table was created
- Line 2 - Retrieve data from the sys.indexes catalog view and alias the view as 'i' in the query
- Line 3 - INNER JOIN the data from the sys.objects catalog view and alias the view as 'o' in the query
- Line 4 - Join the data between the sys.indexes and sys.objects catalog views based on the object_id column from both of these catalog views
- Line 5 - WHERE clause to filter the data based on the value from the type_desc column of the sys.objects catalog view being equal to 'USER_TABLE'
- Line 6 - Second WHERE clause to filter the data based on the value from the type_desc column of the sys.indexes catalog view being equal to 'HEAP'
- Line 7 - Order the results by the o.name column
- Line 8 - Execute the batch with the GO command
SELECT TOP 1000 o.name, i.type_desc, o.type_desc, o.create_date FROM sys.indexes i INNER JOIN sys.objects o ON i.object_id = o.object_id WHERE o.type_desc = 'USER_TABLE' AND i.type_desc = 'HEAP' ORDER BY o.name GO
Additional tips:
- Here is a tip on how to find tables without a clustered index using Policy Based Management
How can I find out how many tables across all of my databases that do have a SQL Server clustered index?
To build on the code above for a single database, let's use the sp_MSforeachdb system stored procedure to run a variation of the query above. The code below will determine how many (i.e. count) tables across all of my databases on a single SQL Server instance do not have a clustered index.
Here is an explanation of the code:
- Line 1 - Execute the sp_MSForEachDB system stored procedure
- The '?' i.e. question mark in this code is a placeholder for the database name in the sp_MSForEachDB system stored procedure
- The single quote after the sp_MSForEachDB system stored procedure starts the code being executed against all of the databases
- Line 2 - Retrieve the database name
- The '?' i.e. question mark in this code is a placeholder for the database name in the sp_MSForEachDB system stored procedure
- Line 3 - Retrieve the count of user defined tables that are heaps
- Line 4 - Retrieve data from the sys.indexes catalog view and alias the view as 'i' in the query
- Line 5 - INNER JOIN the data from the sys.objects catalog view and alias the view as 'o' in the query
- Line 6 - Join the data between the sys.indexes and sys.objects catalog views based on the object_id column from both of these catalog views
- Line 7 - WHERE clause to filter the data based on the value from the type_desc column of the sys.objects catalog view being equal to 'USER_TABLE'
- Notice in this example we use double quotes around the WHERE clause parameters
- Line 8 - Second WHERE clause to filter the data based on the value from the type_desc column of the sys.indexes catalog view being equal to 'HEAP'
- The final single quote matches the single quote from the first line to end the code executed against all of the SQL Server databases on the instance
EXEC sp_MSforeachdb 'USE [?] SELECT ''?'' AS DatabaseName, COUNT(o.name) AS Count FROM sys.indexes i INNER JOIN sys.objects o ON i.object_id = o.object_id WHERE o.type_desc = ''USER_TABLE'' AND i.type_desc = ''HEAP'''
Additional tips:
Next Steps
- Since you are concerned about tables in your environment without clustered indexes, I would recommend running the second query in this tip in a development or test environment to get a sense of the magnitude of the issue.
- Once you have a sense of the problem, you can run the first query to determine the tables per database without clustered indexes.
- I would recommend reading these tips to determine which columns to use for your clustered indexes:
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips
Article Last Updated: 2011-10-04