Handling Large SQL Server Tables with Data Partitioning

By:   |   Updated: 2007-03-15   |   Comments (11)   |   Related: 1 | 2 | 3 | 4 | 5 | 6 | > Partitioning


Problem

With the increasing use of SQL Server to handle all aspects of the organization as well as the increased use of storing more and more data in your databases there comes a time when tables get so large it is very difficult to perform maintenance tasks or the time to perform these maintenance tasks is just not available.  In the past, one way of getting around this issue was to partition very large tables into smaller tables and then use views to handle the data manipulation.  With SQL Server 2005 a new feature was added that handles this data partitioning for you automatically, so the ability to create and manipulate data in partitioned tables is much simpler.

Solution

In SQL Server 2005 a new feature called data partitioning was introduced that offers built-in data partitioning that handles the movement of data to specific underlying objects while presenting you with only one object to manage from the database layer.  The picture below shows how a table may look when it is partitioned.  To the DBA and to the end user it looks like there is only one table, but based on the partition scheme the underling data will be stored in a different partitions and not in one large table.  This makes all of the existing code you have in place work without any changes and you get the advantage of having smaller objects to manage and maintain.

partition scheme
(source: SQL Server books online)

Creating a Partitioned Table

To create a partitioned table there are a few steps that need to be done:

  1. Create additional filegroups if you want to spread the partition over multiple filegroups. 
  2. Create a Partition Function
  3. Create a Partition Scheme
  4. Create the table using the Partition Scheme

Step 1 - Create Additional Filegroups

For this example, I have created four filegroups flg1, flg2, flg3 and flg4.

This is step is not mandatory, you can still use just one filegroup even if you partition the data, but one of the advantages of partitioning a table is to spread the data over multiple filegroups to get better IO throughput.

Step 2 - Create Partition Function

This creates a range of values for the partition. This will create four partitions:
  • values <= 10,000
  • values > 10,000 and <= 100,000
  • values > 100,000 and <= 1,000,000
  • values > then 1,000,000
CREATE PARTITION FUNCTION partRange1 (int)  
AS RANGE LEFT FOR VALUES (10000, 100000, 1000000);
GO

Step 3 - Create Partition Scheme

This creates the partition scheme to determine where each of the partitions will reside.  In this example we are spreading it over four filegroups:
  • values <= 10,000 (flg1)
  • values > 10,000 and <= 100,000 (flg2)
  • values > 100,000 and <= 1,000,000 (flg3)
  • values > then 1,000,000 (flg4)
CREATE PARTITION SCHEME partScheme1 
AS PARTITION partRange1 TO ( flg1, flg2, flg3, flg4 );
GO

Step 4 - Create Table Using Partition Scheme

This creates the table using the partition scheme partScheme1 that was created in step 2.  The column col1 is used to determine what data gets placed in which partition/filegroup.

CREATE TABLE partTable (col1 int, col2 char(10))
ON partScheme1 (col1);
GO

Adding Data to Partitioned Table

After the table has been setup as a partitioned table, when you enter data into the table SQL Server will handle the placement of the data into the correct partition automatically for you.

So, based on the above setup if we run the below commands the data will be placed in the appropriate partition as shown below.

-- data will go to partition 1 on filegroup flg1
INSERT INTO partTable (col1, col2) VALUES (25, 'Test1')

-- data will go to partition 1 on filegroup flg1
INSERT INTO partTable (col1, col2) VALUES (1234, 'Test1')

-- data will go to partition 2 on filegroup flg2
INSERT INTO partTable (col1, col2) VALUES (10243, 'Test1')

-- data will go to partition 4 on filegroup flg4
INSERT INTO partTable (col1, col2) VALUES (25000000, 'Test1')

-- data will go to partition 1 on filegroup flg1
INSERT INTO partTable (col1, col2) VALUES (-2523, 'Test1')

To determine what exists in each partition you can run the following command:

SELECT $PARTITION.partRange1(col1) AS Partition, COUNT(*) AS [COUNT]
FROM dbo.partTable 
GROUP BY $PARTITION.partRange1(col1) 
ORDER 
BY Partition ;

Here is the result from running the above query on our simple test of record inserts.

partition counts

In addition to determining the number of rows that are in each of the partitions we can also see how fragmented each of these partitions are.  By using the DMV sys.dm_db_index_physical_stats we can get this information

SELECT object_id, partition_number, 
avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats (DB_ID(),OBJECT_ID(N'dbo.partTable'), NULL , NULL, NULL);

Based on this results from sys.dm_db_index_physical_stats, you can rebuild an index for a particular partition.  Here is an example of the code that could be used to rebuild index IX_COL1 only on partition #4.

ALTER INDEX IX_COL1
ON dbo.PartTable
REBUILD Partition = 4;
GO

As you can see this is a great enhancement to SQL Server.  The only downside is that it only exists in the Enterprise and Developer editions.

Next Steps


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Greg Robidoux Greg Robidoux is the President and founder of Edgewood Solutions, a technology services company delivering services and solutions for Microsoft SQL Server. He is also one of the co-founders of MSSQLTips.com. Greg has been working with SQL Server since 1999, has authored numerous database-related articles, and delivered several presentations related to SQL Server. Before SQL Server, he worked on many data platforms such as DB2, Oracle, Sybase, and Informix.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips


Article Last Updated: 2007-03-15

Comments For This Article




Tuesday, May 2, 2017 - 8:13:18 AM - Greg Robidoux Back To Top (55381)

Hi Durgadas,

When querying the data using an index, the index will still point to the appropriate partition to get the data so index seeks will remain as index seeks.


Tuesday, May 2, 2017 - 2:42:25 AM - Durgadas Back To Top (55373)

Let me know whether it is appropriate for OLTP applications. What will be the effect on optimization level if we calls the records based on a Non Partitioned Column? Index seek change to scan? 


Wednesday, January 20, 2016 - 1:41:42 PM - darlove Back To Top (40457)

Q: How to partition a table which has data in it?

A: Simply put a clustered index on it and make sure that the index gets built on the relevant partition scheme (the column that the partition function will use must be included in the clustered index that you're going to create). If the table already has a clustered index, drop it and rebuilt (on the scheme). If you want to retain the heap, after you've rebuilt drop the clustered index.


Tuesday, January 28, 2014 - 6:09:50 AM - Nirav Kothari Back To Top (29250)

 thx for sharing the partition concept.


Tuesday, April 23, 2013 - 10:07:48 AM - Greg Robidoux Back To Top (23517)

@murali - are you exporting data from SQL Server out to a text file?  Not exactly sure what you need to do.


Tuesday, April 23, 2013 - 7:57:51 AM - murali Back To Top (23507)

HI,

I want to retrieve the data which is in bulk [1 lakh rows] from sqlserver 2008 R2. When i am retrieving the data it is taking so much time......How can i handle this? any other ways to crack this issue?

 

 

 

Thanks in advance.

K. Murali Krishna.


Monday, September 10, 2012 - 8:44:16 AM - Greg Robidoux Back To Top (19447)

@Daniel - yes you could use this approach to handle very large tables and archive older data very quickly.  Just not that you will need the Enterprise version of SQL Server.

 

See this tip as well: http://www.mssqltips.com/sqlservertip/1406/switching-data-in-and-out-of-a-sql-server-2005-data-partition/

 


Sunday, September 9, 2012 - 4:56:46 PM - Daniel Alvarado Back To Top (19443)

Hi Greg.

This partion table could works for a table with increments about 5 millon records per day?

We're having problems with this detail table, so I'm looking for the best option to manage this table.

Thanks.

PS: Actually the table have 31 millon records and growing every day.


Monday, July 2, 2012 - 10:30:22 AM - Greg Robidoux Back To Top (18301)

Kranthi - see if this tip helps: http://www.mssqltips.com/sqlservertip/1406/switching-data-in-and-out-of-a-sql-server-2005-data-partition/

 


Sunday, July 1, 2012 - 12:47:20 AM - kranthi Back To Top (18283)

Hi Greg

How to partition a table which has data in it.


Sunday, July 1, 2012 - 12:45:11 AM - kranthi Back To Top (18282)

Very good explanation Greg, this post made me understand database partitioning very easily.

Thanks















get free sql tips
agree to terms