By: Mohammed Moinudheen | Updated: 2024-02-22 | Comments (1) | Related: 1 | 2 | > Replication
Problem
Can you provide a few examples of known issues encountered in SQL Server transactional replication and a possible fix for the issues faced?
Solution
This tip assumes you are familiar with transactional replication and comfortable configuring it between two databases. To learn more about the topic, refer to these replication tips on MSSQLTips.com. Additionally, Robert Pearl wrote an excellent article, Handling Data Consistency Errors in SQL Server Transactional Replication, regarding the common data consistency errors faced in transactional replication.
For this tip, we will review the possibilities for one of the errors and simulate the scenarios using a sample SQL Server database.
Simulation Set Up
For our simulation, download the WideWorldImporters sample database from Github, restore the database using the available backup, and configure transactional replication between two SQL instances. As mentioned in this tip, let's explore the common data consistency errors we may encounter in transactional replication.
In the WideWorldImporters database, we will use the following tables: [WideWorldImporters].[Sales].[Orders] and [WideWorldImporters].[Sales].[OrderLines]. After transactional replication is configured between the publisher and subscriber, you can check the max value of the primary key on both tables using the script below.
select max(OrderID) from [WideWorldImporters].[Sales].[Orders] select max(OrderLineID) from [WideWorldImporters].[Sales].[OrderLines]
Below are the highest values in each table.
Simulate Replication Failure by Forcing Primary Key Violations
Error Simulation for "2627 Violation of PRIMARY KEY constraint 'PK__A'. Cannot insert duplicate key in object 'dbo.A'."
Based on the information from the max function, new values can be inserted into the subscriber tables. This is to simulate a replication failure for primary key violation. As we are directly entering values into the replicated subscriber tables, the replication is bound to fail when we insert the same values in the publisher table.
Use the script below to insert values into the subscriber table [Sales].[Orders]. As you can see, we are entering a value of 73596 and a few more into the subscriber table.
USE [WideWorldImporters] GO INSERT INTO [Sales].[Orders] ([OrderID] ,[CustomerID] ,[SalespersonPersonID] ,[PickedByPersonID] ,[ContactPersonID] ,[BackorderOrderID] ,[OrderDate] ,[ExpectedDeliveryDate] ,[CustomerPurchaseOrderNumber] ,[IsUndersupplyBackordered] ,[Comments] ,[DeliveryInstructions] ,[InternalComments] ,[PickingCompletedWhen] ,[LastEditedBy] ,[LastEditedWhen]) VALUES (73596 ,11 ,14 ,NULL ,1021 ,NULL ,getdate() ,getdate()+2 ,13445 ,1 ,NULL ,NULL ,NULL ,NULL ,3 ,getdate()) GO
Use the script below to insert values into the subscriber table [Sales].[OrderLines]. This time, we are entering the value 231420 and other values.
USE [WideWorldImporters] GO INSERT INTO [Sales].[OrderLines] ([OrderLineID] ,[OrderID] ,[StockItemID] ,[Description] ,[PackageTypeID] ,[Quantity] ,[UnitPrice] ,[TaxRate] ,[PickedQuantity] ,[PickingCompletedWhen] ,[LastEditedBy] ,[LastEditedWhen]) VALUES (231420 ,73599 ,130 ,'This is just for MSSQLTips testing replication' ,7 ,20 ,54.00 ,15.00 ,19 ,getdate() ,4 ,getdate()) GO
Replication Errors Starts Due to Duplicate Entries
Next, use the script above to insert the same values into the publisher database. In a few minutes, you will notice replication failure when checking the replication monitor, as seen in the screenshot below.
The error message states, "Violation of PRIMARY KEY constraint 'PK_Sales_OrderLines'. Cannot insert duplicate key in object 'Sales.OrderLines'. The duplicate key value is (231420). (Source: MSSQLServer, Error number: 2627)."
We knowingly entered duplicate values in two tables in the subscriber database to generate the errors. However, the replication monitor only reports errors in one of the tables.
Using the script below, you can view the details in the distribution database. We can filter by the error code to get the details of the replication failure.
Use Distribution go select * from dbo.MSrepl_errors where error_code in ('2627')
You can see that the replication is failing continuously with the error as the duplicate key value of 231420 is detected.
If you see similar errors from earlier SQL Server versions as you can see from this tip, you will notice that it can get much harder to fix this type of error as the actual duplicate value is not seen in the error message as shown below. As you can see from this tip from a few years back, we can only see this message "Cannot insert duplicate key in object" but there is no mention of the actual duplicate value which is 231420 in this case.
Few Commands and Options to Collect Additional Information Regarding the Replication Errors
From the errors, we see that the distribution agent is failing and retrying repeatedly due to these duplicate values. You will also notice that the undistributed commands will increase, causing further delays in subscriber syncing with the publisher. This will occur as the publisher database continues to function normally, and data modifications will occur, causing the latency to increase.
There are helpful commands that can be used to collect additional information from the distribution databases, including information on the articles used in replication. This is especially useful in complex scenarios where hundreds of replication components are configured on the server.
Use Distribution Go select * from MSarticles
Output is below.
As seen earlier from the replication monitor, we could also collect information on the actual transaction sequence number where replication is failing by copying the highlighted value and using it in the sp_browsereplcmds option.
Replace the @xact_seqno_start and @xact_seqno_end with the transaction sequence number copied from the replication monitor.
Use distribution go exec sp_browsereplcmds @xact_seqno_start = '0x00000274000010A0000A00000000', @xact_seqno_end = '0x00000274000010A0000A00000000', @publisher_database_id = '2'--Query MSpublisher_databases on distributor DB
You can see that the insert operation is failing for value 231420. Another option to view the error details is to review the distribution agent job directly on the Job Activity Monitor by clicking on the correct job for that database and viewing the job history.
Right-click on the distribution agent job, review the job steps as shown, and view the job history. You will notice a string of duplicate entries in the subscriber database, causing the replication to fail.
Fixing the Errors Related to the Duplicate Key Values
As you can see from the error messages, several duplicate entries in the subscriber tables are causing the replication to fail. You may directly delete those duplicate entries in the subscriber table; in this case, we can see duplicate entries in the 'Sales.OrderLines' table. You can use the below script to delete those entries.
--Run this on subscriber database delete from Sales.OrderLines where OrderLineID in(231420,231421,231422,231423,231424)
Check the 'Replication monitor' again to check the status.
You will see errors appearing for the next subscription table with duplicate entries. When you review the job history for the distribution agent job, you will get additional information, as shown below.
Based on this information, you can further delete duplicate entries in the subscription database tables using this script.
--Run this on subscriber database delete from Sales.OrderLines where OrderLineID in(231425) delete from Sales.Orders where OrderID in(73601,73602)
Refresh the replication monitor after deleting the duplicate entries from the subscriber tables, but give it some time. You may also stop and restart the distribution agent job. In a short while, you will notice that the replication alerts have cleared and the undistributed commands have reduced.
Using this method, you needed to fix the duplicate errors in the subscriber tables to get the replication working again. Our scenario only had a few duplicate entries in the subscriber tables. But, in real-world production environments, you may encounter several of them, which may be time-consuming to analyze and fix.
Option to Skip the Errors in the Distribution Agent Profile
As you can see, the process of deleting the duplicate entries in the subscription tables is time-consuming. Another option is skipping the data consistency errors described in this tip. To skip the data consistency errors, you can update the distribution agent profile to use a non-default profile. Make sure to go to the correct distribution agent profile in the replication monitor. Click on the correct publication node and then right-click on the correct subscription where replication is failing to change the agent profile. If you right-click on the root node on the replication monitor and then change the agent profile, it may impact all the rest of the publications.
Click on the Agent Profile to see the different distribution agent profiles. Here, you will see the 'Default agent profile' option is already selected.
Click on the ellipsis to view this profile's settings. This default profile does not have an entry for the -SkipErrors parameter.
The errors related to primary key violation have an error number of 2627, as shown below.
So, we want to use that error number in the -SkipErrors parameter. We have two options: select the profile 'Continue of data consistency errors' or create a custom profile that skips error 2627.
Option 1: Skip the Errors by Selecting System Profile that Skips Data Consistency Errors
The image below shows that the 'Continue on data consistency errors' checkbox is checked.
You can view the details related to the -SkipErrors parameter by clicking the ellipsis.
In the -SkipErrors parameter, you can see the error numbers 2601:2627:20598, meaning that this parameter will handle issues related to each of these error numbers. Our primary key violation issue is related to error number 2627, which is included in this parameter. In other words, enabling this profile will cover replication issues related to primary key violations.
Option 2: Skip the Errors by Creating a New Custom Profile
As you saw, the system profile for skipping data consistency errors will skip errors related to multiple error numbers. Instead of including multiple error numbers, we can create a new profile only to skip the errors related to error number 2627.
Click the option to create a new profile as shown.
Next, you will see and select one of the profiles available to initialize the parameters. Select 'Default agent profile'.
A new window will appear, allowing you to input the correct value for the -SkipErrors parameter. In this case, enter 2627 and click OK.
Now, you are only skipping the error number causing the data consistency errors.
NOTE: Once either option is implemented, you must stop and restart the distribution agent job for this to work. You can use the script provided in this tip for that purpose.
-- To STOP the Distribution Agent: sp_MSstopdistribution_agent @publisher, @publisher_db, @publication, @subscriber, @subscriber_db --To START the Distribution Agent: sp_MSstartdistribution_agent @publisher, @publisher_db, @publication, @subscriber, @subscriber_db
Risks of Skipping Errors Related to Data Consistency
Ideally, it is not recommended to use this option to skip data consistency errors in the distribution agent profile. However, this could be used as a temporary solution, but why these errors are occurring in your configuration needs to be investigated further. It may be possible for the subscription database to be used for reporting purposes. If several transactions were never applied in the subscriber database due to skipping these errors, it may lead to compliance issues for your business. So, it is recommended that these issues be solved in the long term by preventing them from occurring in the first place. Refer to this tip by Tim Smith, Addressing SQL Server Replication Violation of Primary Key Errors, which addresses replication errors related to primary key violations.
In this tip, you saw a demo of a known issue in transactional replication due to primary key violations. In the next few tips, we will discuss some other errors we may encounter in transactional replication.
Next Steps
- Configure transactional replication in your lab server.
- Try out the scripts in this tip for demo purposes.
- Try to break replication in your lab server and simulate the options to fix the errors.
- Review these tips related to replication on MSSQLTips.com.
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips
Article Last Updated: 2024-02-22