Understanding SQL Data Sync for SQL Server

By:   |   Updated: 2013-10-25   |   Comments (5)   |   Related: > Azure


Problem

Windows Azure SQL Database has a feature called SQL Data Sync which synchronizes data across multiple SQL Server and Windows Azure SQL Database instances. Can you tell me more about this new feature, how it works and what consideration to take when using it?

Solution

Windows Azure SQL Database has a feature called SQL Data Sync which synchronizes data across multiple SQL Server and Windows Azure SQL Database instances. Before you can use data synchronization using SQL Data Sync you need to first create a Sync Agent on the Windows Azure Portal, then create sync groups with which you can specify the databases, tables and columns to synchronize as well as the synchronization schedule.

SQL Data Sync is a cloud-based data synchronization service built on Microsoft Sync Framework technologies. It provides single direction as well as bi-directional data synchronization and data management capabilities allowing data to be easily shared across Windows Azure SQL Databases across multiple data centers or between on-premises SQL Server databases and Windows Azure SQL databases.

These are the different scenarios for data synchronization using SQL Data Sync:

  • Cloud (Windows Azure SQL Database) to cloud (Windows Azure SQL Database) synchronization
  • Enterprise (SQL Server on-premises) to cloud (Windows Azure SQL Database)
  • Cloud (Windows Azure SQL Database) to Enterprise (SQL Server on-premises)
  • Bi-directional (changes made either on SQL Server or Windows Azure SQL Database are automatically synchronized back and forth) or sync-to-hub or sync-from-hub synchronization

Please note, you need to have at least one Windows Azure SQL Database instance which works as a hub for the sync group in the hub-and-spoke topology.

Understanding system requirements before using SQL Data Sync

The operating system platform needed by SQL Data Sync include Windows 2008, Windows 2008 R2, Windows 2012, Windows Vista, Windows 7 or Windows 8 for either 32-bit and 64 bit platform.

You need to have a Windows Azure account and subscription along with SQL Database subscription (one at minimum; if you have Windows Azure SQL Database instances in multiple data centers then you need a SQL Database subscription for each data center).

Though SQL Server 2005 SP2 is supported, SQL Data Sync works best with SQL Server 2008 R2 and later versions as support for Windows Azure SQL Database was added to SQL Server Management Studio (SSMS) in SQL Server 2008 R2.

Understanding limitations of SQL Data Sync

A database can belong to maximum of 5 sync groups whereas the total number of endpoints can be up to 30 in a single sync group (if you have a database belonging to multiple sync groups, each of the them is counted as one endpoint) if there is only one sync group. If you have more than one sync group the total number of endpoints across all sync groups cannot exceed 30. There is another point to note, if you have databases participating from on-premises SQL Server instance, the maximum for them is 5 in a single sync group.

With respect to names of database objects, there are certain restrictions for the name of the objects (database, tables or columns) such as the object name must not contain printable characters, a period (.), left square bracket ([) or right square bracket (]).

The information is based on SQL Data Sync - Preview. It might change in the final release of the product. Please check the latest documentation on msdn.

Best Practices and Optimization

As you might have guessed, the cost of moving data between Azure data centers is greater than moving data within a single data center (throughput within a single data center is about double what it would be between data centers). So ideally, you should have the SQL Data Sync server created in the data center where you have the most data traffic, so that you can reduce overall costs by minimizing the data traffic leaving or entering a data center. Also, as the sync group hub has more traffic to handle, create it in the data center with the most data traffic or where most of the databases are located. If most of the databases from a sync group are from a single data center, its recommended to create the hub in the same data center.

Only include the tables which are required as per your business needs in the sync group; including unnecessary tables can have impact on the overall cost as well as on efficiency of the synchronization.

Prudently set the frequency of automatic synchronization. Synchronize only as often as is required by your business requirement or needs. Synchronizing too often can increase the traffic and hence increasing the overall cost. To offset the traffic, certain sync groups can be synchronized on a less frequent basis. For example, for a payroll sync group you can do synchronization once a month (or a week) instead of doing synchronizations hourly or daily.

There are many other best practices, guidelines and optimization techniques mentioned on msdn documentation.

Conflict Resolution Policies

You can define policies to apply when there are conflicting changes and these policies are:

  • Hub Wins - If you set this policy then the first row change written to the hub is kept intact. SQL Data sync discards subsequent attempts to write to the same row in the hub. This means the first change in a row written to the hub is propagated out to all the member databases by the end of the sync.
  • Client Wins - If you set this policy then every row changed in a client database is written to the hub, overwriting prior changes to the same row. This means the last write to the hub (from the client database) is propagated out to all the member databases by the end of the sync.

Please note, SQL Data Sync is in preview as of writing this tip and features might change in the final release of the SQL Data Sync product; for latest updates on SQL Data Sync refer to this msdn documentation.

Next Steps


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Arshad Ali Arshad Ali is a SQL and BI Developer focusing on Data Warehousing projects for Microsoft.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips


Article Last Updated: 2013-10-25

Comments For This Article




Monday, April 21, 2014 - 3:53:41 PM - TM Back To Top (30385)

I am not clear on the part in this article that states "object name must not contain printable characters".

Could someone clarify?

Thanks.

 


Monday, October 28, 2013 - 6:10:34 PM - S Back To Top (27302)

HierarchyID is supported in Azure but not in the Sync framework as I understand it.


Monday, October 28, 2013 - 4:37:49 PM - Arshad Back To Top (27301)

Hi Christopher, As it serves the same person, you can think of it like replication but these are two different features and have different architecture.

I didn't use HierarchyID recently but as per this documentation, its already supported now:

http://msdn.microsoft.com/en-us/library/windowsazure/ee336233.aspx


Monday, October 28, 2013 - 3:21:31 PM - S Back To Top (27299)

When Will HierarchyID Be supported?

 


Sunday, October 27, 2013 - 1:36:55 PM - CHRISTOPHER JOHNSON Back To Top (27292)

So is this the Windows Azure replication model?















get free sql tips
agree to terms