Comparing performance for the MERGE statement to SELECT, INSERT, UPDATE or DELETE

By:   |   Updated: 2012-04-19   |   Comments (6)   |   Related: 1 | 2 | 3 | 4 | 5 | 6 | 7 | > TSQL


Problem

Now that SQL Server 2008 has been out for quite awhile I've started to see a lot more use of the MERGE statement that became available in this version. While I do find it makes code more readable, I wondered whether or not there were any performance benefits or drawbacks to using this new statement as compared with just using the classic SELECT/INSERT/UPDATE/DELETE statements to complete the same task.

Solution

Test Setup

For this test we are going to compare the following four scenarios which at a high level insert a new record into a table or update it if it already exists (based on the primary key of the table). We will run each scenario twice, once against a table where the record being inserted/updated already exists and once when it doesn't.

Here are the scenarios:

  1. SELECT record then INSERT or UPDATE based on result of SELECT statement
  2. UPDATE record then SELECT update rowcount and INSERT based on result of SELECT statement
  3. DELETE record then INSERT
  4. MERGE record

Here is the sql used to implement the test scenarios outlined above along with the code to create and load the test table.

-- Create test table and primary key
CREATE TABLE test (col1 INT NOT NULL, col2 VARCHAR(100))
go
ALTER TABLE dbo.test ADD CONSTRAINT
 PK_test PRIMARY KEY CLUSTERED 
 (
 col1
 ) WITH( STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, 
                ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
 
GO
-- Populate table
DECLARE @val INT
SELECT @val=1
WHILE @val < 100000
BEGIN  
   INSERT INTO dbo.test(col1, col2) VALUES(@val,'TEST' + CAST(@val AS VARCHAR))
   SELECT @val=@val+1
END
GO
-- test SELECT record then INSERT or UPDATE based on result of SELECT statement (record exists)
DBCC DROPCLEANBUFFERS
GO
BEGIN TRANSACTION
DECLARE @cnt INTEGER
SELECT @cnt=COUNT(1) FROM test WHERE col1=2
IF @cnt > 0 
  UPDATE test SET col2='update2' WHERE col1=2
ELSE
  INSERT INTO test VALUES (2,'update2')
GO
sp_lock
GO
COMMIT TRANSACTION
-- test SELECT record then INSERT or UPDATE based on result of SELECT statement (record does not exist)
DBCC DROPCLEANBUFFERS
GO
BEGIN TRANSACTION
DECLARE @cnt INTEGER
select @cnt=COUNT(1) from test WHERE col1=100001
IF @cnt > 0 
  UPDATE test SET col2='update100001' WHERE col1=100001
ELSE
  INSERT INTO test VALUES (100001,'update100001')
GO
sp_lock
GO
COMMIT TRANSACTION
-- test UPDATE record then SELECT update rowcount and INSERT based on result of SELECT statement (record exists)
DBCC DROPCLEANBUFFERS
GO
BEGIN TRANSACTION
DECLARE @cnt INTEGER
UPDATE test SET col2='update4' WHERE col1=4
SELECT @cnt=@@ROWCOUNT
IF @cnt < 1
  INSERT INTO test VALUES (4,'update4')
GO
sp_lock
GO
COMMIT TRANSACTION
-- test UPDATE record then SELECT update rowcount and INSERT based on result of SELECT statement (record does not exist)
DBCC DROPCLEANBUFFERS
GO
BEGIN TRANSACTION
DECLARE @cnt INTEGER
UPDATE test SET col2='update100004' WHERE col1=100004
select @cnt=@@ROWCOUNT
IF @cnt < 1
  INSERT INTO test VALUES (100004,'update100004')
GO
sp_lock
GO
COMMIT TRANSACTION
-- test DELETE record then INSERT (record exists)
DBCC DROPCLEANBUFFERS
GO
BEGIN TRANSACTION
DELETE FROM test WHERE col1=40000
INSERT INTO test VALUES (40000,'update40000')
GO
sp_lock
GO
COMMIT TRANSACTION
-- test DELETE record then INSERT (record does not exist)
DBCC DROPCLEANBUFFERS
GO
BEGIN TRANSACTION
DELETE FROM test WHERE col1=100002
INSERT INTO test VALUES (100002,'update100002')
GO
sp_lock
GO
COMMIT TRANSACTION
-- test MERGE record (record exists)
DBCC DROPCLEANBUFFERS
GO
BEGIN TRANSACTION
MERGE test AS TARGET
USING (SELECT 50000 as col1,'update50000' as col2) AS SOURCE
ON (TARGET.col1=SOURCE.col1)
WHEN MATCHED THEN
  UPDATE SET TARGET.col2=SOURCE.col2
WHEN NOT MATCHED THEN
  INSERT (col1,col2) VALUES (SOURCE.col1,SOURCE.col2);
GO
sp_lock
GO
COMMIT TRANSACTION
-- test MERGE record (record does not exist)
DBCC DROPCLEANBUFFERS
GO
BEGIN TRANSACTION
MERGE test AS TARGET
USING (SELECT 100003 as col1,'update100003' as col2) AS SOURCE
ON (TARGET.col1=SOURCE.col1)
WHEN MATCHED THEN
  UPDATE SET TARGET.col2=SOURCE.col2
WHEN NOT MATCHED THEN
  INSERT (col1,col2) VALUES (SOURCE.col1,SOURCE.col2);
GO
sp_lock
GO
COMMIT TRANSACTION
-- cleanup
DROP TABLE test
GO

Comparing Locks Acquired

Using sp_lock before the "COMMIT TRANSACTION" was issued for each scenario I was able to determine that in every case an X lock was acquired on the row being inserted/updated and an IX lock on both the table and page where the record was located. Below is the output of sp_lock for the first scenario. All other scenarios followed the exact same pattern. More information on these lock types can be found here.

spid dbid ObjId IndId Type Resource Mode Status
80 1 1659152956 0 TAB   IX GRANT
80 1 1659152956 1 PAG 1:310 IX GRANT
80 1 1659152956 1 KEY (020068e8b274) X GRANT

Comparing Resources Used

Now that we have confirmed that the MERGE statement acquires the exact same locks as the classic SELECT, INSERT, UPDATE and DELETE would, let's focus on the performance of these statements. We will run through the scenario's we described above and use SQL Profiler to capture some key performance metrics. Here are the results.

    CPU (ms) Reads Writes Duration (ms)
Scenario #1 Record exists 0 4 0 5
Record does not exist 0 12 0 8
Scenario #2 Record exists 0 6 0 10
Record does not exist 0 4 0 2
Scenario #3 Record exists 0 12 1 9
Record does not exist 0 4 0 1
Scenario #4 Record exists 0 10 1 2
Record does not exist 0 10 0 9

Looking at these results we can see that Scenario #2 performs marginally better that the other scenarios. The reason for this can probably be attributed to the fact that when it performs the check (SELECT) it is checking the @@ROWCOUNT variable as compared to checking a physical table. As I mention in the next steps I would guess that if we were to perform a similar test using sets of data, rather than a single record insert/update, the MERGE statement may perform better than the other scenarios. Looks like I may have to have a part 2 for this tip.

Next Steps


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Ben Snaidero Ben Snaidero has been a SQL Server and Oracle DBA for over 10 years and focuses on performance tuning.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips


Article Last Updated: 2012-04-19

Comments For This Article




Saturday, April 21, 2012 - 12:41:16 PM - Mladen Andrijasevic Back To Top (17043)

Ben,

 

Thanks for answering 

 

>>would acquire locks as it inserts so whichever process actually inserts/updates data first would hold the lock and any other process trying to insert/update the same record would receive the error.

 

 

Well,should it not WAIT until the other transaction ends? There seems to be only exclusive locks and they are kept until the end of the transaction.    And when the lock is released then MERGE should update (and not insert) if the key already exists and not result in a PK violation.   What is more, if I execute the same code from two connections without committing one of them the other one waits.   I agree that one should avoid using the serializable isolatoin level, that is precisely why I am trying to understand if we can avoid it.  SERIALIZABLE prevents phantom inserts through key range locking but in this particular case even when I put SERIALIZABLE ( no violations) there do not seem to be key range locks so why would a lesser isolation level not work?   I will check the exact allocations of locks .

 

Thanks

Mladen


Friday, April 20, 2012 - 1:50:41 AM - Allan Hansen Back To Top (17013)

My own testing when I blogged about MERGE showed that I got about 25% increase in performance using MERGE versus just a plain update/insert over 66.000 ish records.




Thursday, April 19, 2012 - 11:51:41 PM - Ben Snaidero Back To Top (17011)

Mladen,

Without seeing a trace of your system under load as well as the sql you are running (specifically to see the explain plans) it's hard to diagnose exactly what's happening in your situation.  That said, I'd have to assume that using the merge statement would be no different than issuing an update or insert from multiple sessions in that the sql engine would acquire locks as it inserts so whichever process actually inserts/updates data first would hold the lock and any other process trying to insert/update the same record would receive the error.  I've never liked using serializable transactions as this reduces concurrency, most of these types of issues can usually be solved with proper indexing.

Thanks for reading

Ben


Thursday, April 19, 2012 - 3:20:55 PM - GMAguilar Back To Top (17008)

UPSERT has the best performance against  (Update or Insert) combintation.

 

begin trans

UPDATE ...

If @rowcount = 0 then

INSERT ...

End If

Commit


Thursday, April 19, 2012 - 8:20:28 AM - Andy Wilbourn Back To Top (16999)

In your sampling perhaps you should update sp_lock with sys.dm_tran_locks as the docs show sp_lock is going away.


Thursday, April 19, 2012 - 8:19:35 AM - Mladen Andrijasevic Back To Top (16998)
Although MERGE is an atomic operation without the need for explicit transactions we do get PK violations if we do not put SET TRANSACTION ISOLATION LEVEL SERIALIZABLE or WITH (HOLDLOCK) in the MERGE statement when there are identical values specified from different connections under heavy load. Why is that the case?














get free sql tips
agree to terms