By: Sergey Gigoyan | Updated: 2024-09-04 | Comments (5) | Related: > Indexing
Problem
When data is modified in a Microsoft SQL database, the corresponding indexes are also modified. These modifications cause indexes to become fragmented. Fragmentation means that the logical order of the data inside the index pages does not match its physical order. The more the fragmentation level is increased, the more performance is affected for SELECT statements. With fragmented indexes, the information is not spread logically, making the index's data retrieving operations more time-consuming resulting in query performance issues. Thus, fragmentation of the indexes should be fixed periodically in order to maintain high performance. Rebuild and Reorganize index operations are aimed at defragmenting indexes.
Solution
This SQL tutorial focuses on the differences and similarities of these operations. Before beginning, some important concepts related to this topic will be explained. In particular, background information about the fill factor and statistics will be provided, as these concepts are referred to while discussing Index Rebuild and Reorganization operations.
SQL Server Index Fill Factor
The fill factor setting is used to optimize index storage and performance. The value of the fill factor defines how the percentage of leaf-level index pages are filled with data. The remaining free space is used for future growth. The idea of leaving free space in index leaf pages is to avoid frequent page splits. When the page becomes full, part of the data (approximately half) is moved to a new page. These page splits are resource-intensive operations and cause index fragmentation. Therefore, correctly defining the fill factor value based on a specific workload is important in maintaining performance.
By default, the value of the fill factor is 0 or 100, which are the same and mean that the index will be fully filled. If the underlining data of the index is frequently updated, it is probably better to set a non-zero value of the fill factor to leave a space for index growth and avoid frequent page splits.
Having said that, choosing a non-zero fill factor setting has its drawbacks. When we leave a space on each page, the index will use more storage space. Additionally, read performance will be affected because the index contains more pages, and therefore, retrieving data will take much more time. Thus, it is vital to understand the processes correctly, such as the intensity of INSERT\UPDATE and SELECT operations, and define the ideal middle ground for the value of the fill factor.
It is difficult to suggest an optimal fill factor value. It depends on a specific case, workload, and environment, and it is better to estimate, test, and then set this value based on the particular case.
SQL Server Index Statistics
Not only are indexes important for the SQL Server Query Optimizer in choosing a better execution plan for the query, but also the information about how to use them correctly. This information, or statistics, allows the Query Optimizer to determine the best access method for the data. Statistics exist for the indexes as well as for the columns without indexes and define the uniqueness and distribution of the data.
Using this information, Query Optimizer can decide which index operation to use. For instance, depending on the uniqueness or distribution of the data, the Optimizer can decide whether to use an Index Seek or Index Scan while accessing the data. So, if the estimated result set contains the majority of index rows, it is probably better to choose the Index Scan instead of the Index Seek. However, to estimate a more accurate result set, the Query Optimizer should have updated information. Thus, having up-to-date statistics is essential in terms of performance.
When a new index is created, SQL Server automatically creates statistics on the index key. Over time, when data inside the table is changed, statistics can become outdated and useless. For example, if a huge amount of data has been inserted into the table, the existing statistics will not represent the actual data distribution and uniqueness inside the indexed columns. SQL Server has an AUTO_UPDATE_STATISTICS option, and when it is set to ON, the SQL Server Engine determines when to update statistics. However, updating statistics is a resource-intensive process. Statistics can also be updated manually.
SQL Index REBUILD vs. REORGANIZE
Below is a discussion of the specifications and features of each of these operations. Next, we will compare these operations in detail and provide a chart that easily illustrates the comparisons for these types of index operations.
Index REBUILD
- This operation fixes the fragmentation of indexes by dropping and re-creating the index.
- It can be online or offline (depending on the database edition).
- It is possible to set the fill factor setting of the index while rebuilding.
- Rebuilding also updates statistics on the index.
- It is recommended to rebuild indexes when the fragmentation level is more than 30%. But it should not be considered a rule and should be defined depending on the particular environment.
- It is a much more resource-consuming process than reorganizing since it requires the dropping and recreation of indexes.
- In some cases, some operations with the clustered rowstore index can cause
all non-clustered indexes on the same table to be automatically rebuilt. These
operations are:
- A clustered index is created on a table.
- A clustered index is removed.
- Changes in the clustered key.
- Despite the fact that rebuilding a clustered index drops and recreates it, it does not cause all non-clustered indexes on the same table to be rebuilt automatically.
- During the rebuild process, SQL Server creates the new, optimized index before deleting the original one. Thus, having enough space is mandatory to store both these indexes.
The syntax below rebuilds the PK_TestTableID index on the TestTable in the TestDB database by setting its fill factor value to 70:
USE TestDB GO ALTER INDEX PK_TestTableID ON TestTable REBUILD WITH (FILLFACTOR = 70)
Rebuilding can also be achieved using SQL Server Management Studio (SSMS). To do so, under the corresponding table, click on the index under Indexes, then right-click and choose Rebuild:
Here, we can see that unlike in T-SQL code, the index will be rebuilt using default settings, and there is no option to set other configurations, such as fill factor value:
Index REORGANIZE
- This operation defragments indexes by physically reordering the leaf-level pages to match their logical order (without dropping and re-creating them).
- This is an online operation.
- Reorganizing also compacts index pages using the existing value of the fill factor. However, it cannot reset the fill factor value.
- Statistics are not updated as a result of reorganizing the indexes.
- If the fragmentation level of the index is between 5% and 30%, it is recommended to perform reorganization instead of rebuilding. However, as in the case of rebuilding, this is just an estimate range, and in real-world scenarios, these values should be chosen due to the ongoing situation.
- Compared to the rebuild index operation, reorganizing an index is much less resource-consuming.
This T-SQL code reorganizes all indexes on the TestTable table:
USE TestDB GO ALTER INDEX ALL ON TestTable REORGANIZE
The same can be performed using SSMS by right-clicking on Indexes under the corresponding table and choosing Reorganize All:
To reorganize a particular index, right-click on that index and choose Reorganize.
Comparison Table for Database Index REBUILD vs REORGANIZE
To sum up, below is a table showing the differences and similarities of REBUILD and REORGANIZE index operations:
REBUILD | REORGANIZE | |
---|---|---|
Drops and recreates indexes | Yes | No |
Online operation | Can be both online and offline (also depends on the edition of SQL Server) | Yes |
Can reset the fill factor value | Yes | No |
Updates statistics | Yes | No |
Recommended when the fragmentation level of the index is | >30% | Between 5% and 30% |
Resource consumption | More resource-consuming | Less resource-consuming |
Conclusion
In conclusion, both the REBUILD INDEX and REORGANIZE INDEX operations are designed to defragment indexes inside the relational database engine. The rebuild index operation is more flexible and allows us to reset several index options as it rebuilds and recreates the index on the database table. Therefore, it is a resource-consuming process. Reorganizing the index is less resource-consuming, but can make limited changes in the index structure.
Next Steps
For additional information, please follow the links below:
- Specify Fill Factor for an Index - SQL Server | Microsoft Learn
- Configure the fill factor (server configuration option) - SQL Server | Microsoft Learn
- Maintaining indexes optimally to improve performance and reduce resource utilization - SQL Server | Microsoft Learn
- ALTER INDEX (Transact-SQL) - SQL Server | Microsoft Learn
- Editions and supported features of SQL Server 2022 - SQL Server | Microsoft Learn
- Create Indexes with SQL Server Management Studio
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips
Article Last Updated: 2024-09-04