SQL Server Indexing Basics

By:   |   Updated: 2009-07-14   |   Comments (3)   |   Related: 1 | 2 | 3 | > Indexing


Problem

I have seen your tips on indexing and I think they are great. I am relatively new to SQL Server and need to understand the basics. I need some additional background information to understand what sorts of indexing options are available. Hopefully that can lead me into using some of the additional tips. Can you help me?

Solution

As you begin to design and develop databases, be sure to include indexing in that effort.  A well thought out indexing strategy can significantly improve the data access layer and overall user experience.  However, caring for the database indexes is not a one time event.  As your application changes as well as when users start to use the application in new and different ways, your indexes need to reflect these changes.  In addition, as with anything else, indexes need to be maintained.  So be sure to have your indexes included in the regularly scheduled maintenance process.

With all of this being said, let's start to cover some of the basic indexing information.  In this tip, we will try to simplify a number of complex and interdependent topics. As such, the goal of this tip is to introduce the topics and provide additional references for an expanded information.  If you are looking for more intermediate and advanced information, check out the indexing category.  In addition, check out these resources if you are looking into Full Text Search or XML indexes as opposed to traditional relational engine indexes.  With that being said, let's jump in!

Clustered Indexes

A clustered index stores the data for the table based on the columns defined in the create index statement.  As such, only one clustered index can be defined for the table because the data can only be stored and sorted one way per table.  Due to the storage and sorting impacts, be sure to carefully determine the best column for this index. 

Although many implementations only have a single column for the clustered index, in reality a clustered index can have multiple columns.  Just be careful to select the correct columns based on how the data is used.  The number of columns in the clustered (or non clustered) index can have significant performance implications with heavy INSERT, UPDATE and DELETE activity in your database.

An age old question is whether or not a table must have a clustered index.  The answer is no, but in most cases, it is a good idea to have a clustered index on the table to store the data in a specific order.  If a table only has non-clustered indexes it is called a heap.  For more information check out Clustered Tables vs Heap Tables.

Another age old question is whether or not the Primary Key for the table must be the clustered index.  The answer once again is no.  In many implementations the Primary Key is also the clustered index, but it does not have to be.

Non Clustered Indexes

A non clustered index can consist of one or more columns, but the data storage is not dependent on this create index statement as is the case with the clustered index.  For a table without a clustered index, which is called a heap, the non clustered index points the row (data).  In the circumstance where the table has a clustered index, then the non clustered index points to the clustered index for the row (data).

In terms of the number of non clustered indexes, a single table can have up to 249 non clustered indexes.  Although, too much of a good thing can become bad.  Keep in mind that SQL Server needs to keep the indexes updated as you INSERT, UPDATE and/or DELETE data.  As such, it is necessary to strike a balance with the number of indexes created for each table based on the activity on the table i.e. SELECT, INSERT, UPDATE and/or DELETE transactions.

Fill Factor

When you create an index the fill factor option indicates how full the leaf level pages are when the index is created or rebuilt.  Valid values are 0 to 100.  These values represent a percentage of the leaf level pages being used when the index is created or rebuilt.  A fill factor of 0 means that all of the leaf level pages are full.  The same is true with a fill factor equal to 100. 

Here are a few aspects to take into consideration when you select your fill factor:

  • Depending on how the data is inserted, updated and deleted in the table dictates how full the leaf level pages should be in the table.  To fine tune this setting typically takes some testing and analysis.  This could be critical for large active tables in your database.
  • If data is always inserted at the end of the table, then the fill factor could be between 90 to 100 percent since the data will never be inserted into the middle of a page.  UPDATE and DELETE statements may expand (UPDATE) or decrease (DELETE) the space needed for each leaf level page.  This should be fine tuned based on testing.
  • If the data can be inserted anywhere in the table then a fill factor of 60 to 80 percent could be appropriate based on the INSERT, UPDATE and DELETE activity.  However, it is necessary to conduct some testing and analysis to determine the appropriate settings for your environment.
  • With all things being equal i.e. table size, SQL Server versions, options, etc., the lower the fill factor percentage the more storage that could be needed as compared to a higher fill factor where the pages are more compact.
  • Another aspect to take into consideration is your index rebuild schedule.  If you cannot rebuild your indexes on a regular schedule and if you have a high level of INSERT, UPDATE and DELETE activity throughout the table, one consideration may be to have a lower fill factor to limit the fragmentation.  The trade-off may be that more storage is needed.

For more information on fill factor considerations, check out:

Covering Indexes

Covering indexes typically consist of 2 or more columns and have all of the indexes needed to fulfill a query.  One scenario where covering indexes are created is to prevent a bookmark lookup.

Index Order

The index order can either be in ascending or descending order.  Depending on how your queries return the data dictates how the index should be created.  Having the correct index order can improve the performance of queries in many circumstances especially when you have a covering index.  Check out eight examples in the Building SQL Server Indexes in Ascending vs Descending Order tip for detailed performance metrics.

Maintenance

Indexes, just like just about everything else in the world, need maintenance.  Indexes need maintenance because they become fragmented.  Fragmentation is caused by INSERT, UPDATE and DELETE activity on a table splitting pages so that the logical and physical order of the pages do not match.  To resolve this issue, fragmented indexes should be rebuilt on a regular basis.

At MSSQLTips, we care a great deal about fragmentation, so check out these tips:

Additional Options

Although we have only scratched the surface with indexes, we also want to make sure you are aware of the following indexing related topics:

Next Steps
  • Once you have a baseline set of knowledge about indexes, one of the first steps you should take is building a plan to either build, rebuild or re-design your indexes.  This plan should include expanding your knowledge before just jumping into managing your database indexes.  Indexing can easily make or break a user experience due to poor application performance.  Indexing can also become expensive in terms of new hardware.  Our team has been in situations where properly designed and maintained indexes can extend the life of a hardware platform.
  • As we indicated at the beginning of this tip, this information was not intended to cover all of the aspects of indexing, but rather serve as a baseline set of information.  As such, here are some additional tips to continue to learn about indexing:
  • Do you have more questions above indexing?  Did we miss a topic you were expecting?  If so, please visit the link to the forums below and post your question.  Your question may become a future tip.  Happy indexing!


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Jeremy Kadlec Jeremy Kadlec is a Co-Founder, Editor and Author at MSSQLTips.com with more than 300 contributions. He is also the CTO @ Edgewood Solutions and a six-time SQL Server MVP. Jeremy brings 20+ years of SQL Server DBA and Developer experience to the community after earning a bachelor's degree from SSU and master's from UMBC.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips


Article Last Updated: 2009-07-14

Comments For This Article




Friday, April 24, 2015 - 9:42:06 PM - Hernan Back To Top (37032)

Great tip. Helped me understand indexes a lot. I only have one question, when you say "but in most cases, it is a good idea to have a clustered index on the table to store the data in a specific order", everything I read about specific order in SQL Server say that we don't have to consider any particular oner. Can you explain that a little bit more? Thanks in advance!

 

Hernan Martin / Blog: http://hernanmartin.me


Tuesday, July 14, 2009 - 1:16:59 PM - admin Back To Top (3733)

Thanks for the input.  Here is one tip that helps determine a better clustered index based on data from the DMVs.

http://www.mssqltips.com/tip.asp?tip=1642

 


Tuesday, July 14, 2009 - 8:05:17 AM - DrDianaDee Back To Top (3730)

I would have liked to have seen references to articles saying which column(s) should be in the clustered index -- for example, column(s) frequently used in a range search or in an ORDER BY clause.  Likewise, references for articles saying which column(s) to make a nonclustered index for should have been given.

The sentence "Covering indexes typically consist of 2 or more columns and have all of the indexes needed to fulfill a query." should really say "Covering indexes contain all of the columns needed to fulfill a query."  You should have also distinguised between the key column(s), used in the WHERE predicate, and the columns in the SELECT clause, *included* in the leaf level.















get free sql tips
agree to terms