SQL Server Convert Cursor to Set Based


By:
Overview

The T-SQL programming language is a declarative, set based language. What does this mean? You issue a query for the results you want and the database engine sweats the details on how best to satisfy the request. Developers that are generally unfamiliar with declarative queries usually turn towards looping mechanisms, such as cursors, which retrieve data one row at a time resulting in query inefficiencies.

Explanation

Cursor logic is one of the common performance killers I come across when troubleshooting database issues. If a cursor is working with few rows, it may not ever manifest itself as an issue. Invariably, you will end up working with large sets of data and, at this point, you will likely see cursor logic begin to cause grief. I once had to troubleshoot a conversion script for a customer that worked with data from 3 related tables. The script processed the first table with a cursor, looked for related entries in a 2nd table which was also looped through with a cursor, followed by the 3rd table, also accessed using a cursor. Total number of rows to process was roughly 120,000. Total execution time was three days (I only know because I had a morbid curiosity to find out, so I let the script run to completion). I re-wrote the script querying the tables as a set (basically INNER JOINing the 3 tables) and I was able to cut the conversion down to about 1 minute. Needless to say, the customer was happy.

Let's examine a simple example. The following stored procedure is used to get some information about an order in AdventureWorks. It uses a set based query to retrieve the results

set statistics time on
go
create procedure GetOrderData @SalesOrderID int
as
begin
 set nocount on
 
 select OH.OrderDate, OH.DueDate, OH.ShipDate, D.ProductID, P.Name
 from Sales.SalesOrderHeader as OH
 inner join Sales.SalesOrderDetail as D on D.SalesOrderID = OH.SalesOrderID
 inner join Production.Product as P on P.ProductID = D.ProductID
 where OH.SalesOrderID = @SalesOrderID
end
go
exec GetOrderData @SalesOrderID = 51721
go

The resulting execution is extremely efficient returning all 72 rows for the resultset in about 5 milliseconds.

 SQL Server Execution Times:
   CPU time = 0 ms,  elapsed time = 5 ms.

Now let's re-write this procedure to retrieve the same results using a cursor to spin through the sales order detail rows and examine the difference.

set statistics time on
go
alter procedure GetOrderData @SalesOrderID int
as
begin
 set nocount on
 
 declare @OrderDate datetime
 declare @DueDate datetime
 declare @ShipDate datetime
 declare @ProductID int
 declare @Name nvarchar(50)
 select @OrderDate = OrderDate,
            @DueDate = DueDate,
            @ShipDate = ShipDate
 from Sales.SalesOrderHeader
 where SalesOrderID = @SalesOrderID
 declare DetailCursor
 cursor for
 (
  select ProductID
  from Sales.SalesOrderDetail
  where SalesOrderID = @SalesOrderID
 )
   
 open DetailCursor
 fetch DetailCursor into @ProductID
 while @@FETCH_STATUS = 0
 begin
  select @Name = Name
  from Production.Product
  where ProductID = @ProductID
  
  select @OrderDate as OrderDate, 
    @DueDate as DueDate, 
    @ShipDate as ShipDate, 
    @ProductID as ProductID,
    @Name as Name
    
  fetch next from DetailCursor into @ProductID
 end
 close DetailCursor;
 deallocate DetailCursor;
end
go
exec GetOrderData @SalesOrderID = 51721
go

The cursor approach in this stored procedure takes longer to produce the same query result

 SQL Server Execution Times:
   CPU time = 15 ms,  elapsed time = 1886 ms.

While this is not an extreme case, you may find the cursor approach start to exponentially degrade as the number of rows to process grows. I generally avoid cursors in production code except under specialized circumstances.

Additional Information

Last Update: 9/10/2011




Comments For This Article

















get free sql tips
agree to terms