SQL Server 2012 Multi-Subnet Cluster Part 4

Problem

In a previous tip on Install SQL Server 2008 on a Windows Server 2008 Cluster Part 1, we have seen how to install and configure a SQL Server 2008 on a Windows Server 2008 Failover Cluster. We now have a new requirement to deploy a SQL Server 2012 instance on a multi-subnet cluster that spans different geographical locations. How do I install and configure a SQL Server 2012 on a multi-subnet cluster? Check out this tip to learn how.

Solution

To continue this series on Installing SQL Server 2012 on a Multi-Subnet Cluster, let's look at what we have accomplished so far.

In Part 1, we have configured the storage in both of the servers that we will be using as part of our cluster.
In Part 2, we've created the Windows Server 2008 R2 multi-subnet cluster, configured the cluster heartbeat settings and modified the cluster quorum configuration.
Part 3 completed with a working SQL Server 2012 multi-subnet cluster running on a single node.

In this tip, we will proceed to add a node in a SQL Server 2012 multi-subnet cluster and configure network properties of the cluster resource group.

Adding a Node on a SQL Server 2012 Multi-Subnet Cluster

Adding a node in a SQL Server 2012 multi-subnet cluster is no different than performing the same task in a single-subnet cluster - the steps have been highlighted in this tip. What I would like to point out in this tip, however, are the differences in the sections of the installation that pertains to the cluster spanning multiple subnets. As I've also mentioned in the previous tip, the number of nodes you can add in a failover cluster depends on the editions of SQL Server that you will use. Make sure you include that in your planning phase. I'm also skipping some sections because they're basically the same as when you were installing the SQL Server 2012 failover cluster.

To add a node on a SQL Server 2012 multi-subnet cluster:

Run setup.exe from the installation media to launch SQL Server Installation Center.
Click on the Installation link on the left-hand side. Click the Add node to a SQL Server failover cluster link. This will run the SQL Server 2012 Setup wizard.
In the Setup Support Rules dialog box, validate that the checks return successful results and click OK.
In the Cluster Node Configuration dialog box, validate that the information for the existing SQL Server 2012 cluster is correct. Click Next.
In the Cluster Network Configuration dialog box, enter the virtual IP address and subnet mask that your SQL Server 2012 cluster will use in the subnet that the second node is in. From the previous tip, this would be 192.168.0.13. Similar to the previous tip, notice that the setup process has detected the existence of two network subnets - LAN_DC1 and LAN_DC2. Since we've already configured the virtual IP address for LAN_DC1 in the first subnet, the section has been disabled. You'll also notice the message box that gives you a brief explanation of how the OR logic dependency works similar to how we've defined it in another previous tip. Click the Yes button in the message box. Click Next.
In the Service Accounts dialog box, verify that the information is the same as what you have used to configure the first node. Click Next.
In the Error and Usage Reporting dialog box, click Next.
In the Add Node Rules dialog box, verify that all checks are successful and click Next.
In the Ready to Add Node dialog box, verify that all configurations are correct and click Install.
In the Complete dialog box, click Close. This concludes adding a node to a SQL Server 2012 Multi-Subnet Cluster.

You can validate your cluster installation by expanding the Services and Applications node in Failover Cluster Manager and check the cluster name of your SQL Server instance. You can now see an option to move the service to another node, in this case, the node you've just added in your failover cluster.

Check the cluster name of your SQL Server instance in the Failover Cluster Manager

You probably noticed that I was not able to successfully failover to the newly added node in my multi-subnet cluster. When you look at the clustered resources for the SQL Server resource group, the virtual IP address for the second subnet, the virtual network name and the clustered disks are all online. This helps me narrow down the issue and is definitely not a network nor a disk issue. The only clustered resource that would not go online is the SQL Server service (and, of course, the SQL Server Agent service because it is dependent on the SQL Server service.) If you look at the Windows Event Log entries on the node that was recently added to the cluster, you will see an error message similar to the one below. This will point you to the potential cause of the problem.

Windows Event Propertes - Event 5123, MSSQLSERVER CREATE FILE encountered operating system error 3

I highlighted in the previous tip that if you will be storing the tempdb database in a local drive instead of a clustered drive, you need to make sure that:

The same drive letter and folder structure exists in all of the nodes in the cluster
The SQL Server service account has the appropriate permissions on the folder where tempdb will be created

NOTE: I initially granted the SQL Server service account the Modify permissions on the folder where the tempdb 
database will be stored. However, that didn't allow the creation of the tempdb database during failover, 
thus, preventing the clustered instance to go online on the node. I needed to grant the Full Control 
permissions so I can successfully failover.

Tuning the SQL Server Failover Cluster Instance DNS Settings

System downtime is more often a perception than reality. Even if we manage to perform a failover between cluster nodes in a matter of seconds, users will still think that the system is down if the client applications could not connect to the database server. In a multi-subnet cluster, the perceived downtime is influenced by the behavior of the DNS client's ability to resolve the virtual server name. Client workstations and applications cache DNS entries for a period of time before checking with the DNS server to see if the name resolution has changed. This is called the Time-To-Live (TTL) value and, for cluster resources, the default value is 1200 seconds, or 20 minutes. Consider a situation where a client workstation has just checked in with the DNS server and immediately after that, the cluster resource failed over to the other node on a different subnet. Even though the cluster resource is already online on the other subnet, the client workstation will still have to wait for the next time that it has to check in with the DNS server. This can be addressed by reducing the TTL value of the DNS host record registered by the cluster resource, in this case, the SQL Server virtual server name. While we still do not have any recommendations for SQL Server multi-subnet clusters, we can use the one that is recommended for Exchange Servers. As per this Microsoft TechNet article, we can configure the DNS TTL value of the virtual server name for the SQL Server instance to 300 seconds or 5 minutes.

These properties are not visible via the Failover Cluster Management console so we will be using Windows PowerShell to check and modify this property.

#List different cluster resources
PS C:\>Get-ClusterResource | Select Name, ResourceType
#List parameters and their values of the SQL Server Network name
PS C:\>Get-ClusterResource "SQL Network Name (SQLMULTISUBCLUS)" | Get-ClusterParameter
#Set parameter value
PS C:\>Get-ClusterResource "SQL Network Name (SQLMULTISUBCLUS)" | Set-ClusterParameter HostRecordTTL 300

Tuning the SQL Server Failover Cluster Instance DNS Settings

The first line is simply to retrieve all of the cluster resource names in the cluster. We want to know the cluster resource name of the virtual server name of the SQL Server instance, in this case, SQL Server (MSSQLSERVER). The next line simply grabs all the cluster resource parameters for the cluster resource named SQL Server (MSSQLSERVER). The parameter that we want to modify is HostRecordTTL. As I've previously mentioned, the default value is 1200 seconds. The third line simply changes the HostRecordTTL parameter value to 300.

Once the cluster parameter value has been changed, the client applications will no longer have to wait for 20 minutes in order to get the name resolution correctly and, thus, minimizing "perceived" downtime.

Another thing that we need to look into is how the virtual server name gets registered in the DNS server. Microsoft DNS is usually implemented with Active Directory integration which means that an update in the DNS made in one Active Directory site in a subnet may take a while to replicate to another Active Directory-integrated DNS server on another site with a different subnet. This can be addressed by registering all of the virtual IP addresses for the virtual server name on all of the DNS servers even when the virtual IP address is not online. This gives the DNS client an opportunity to store all of the possible IP addresses for a virtual server name. However, this assumes that the client application is smart enough to retry a connection against all available IP addresses for a given virtual server name. The default behavior of a cluster resource name is to only register the virtual IP address that successfully comes online. In our example, the virtual IP address 172.16.0.113 will only be registered in the DNS server within that subnet when it successfully goes online. The same thing is true with the virtual IP address 192.168.0.113. This results in having just one entry per DNS server per subnet for the virtual server name instead of two. To modify this behavior, run this code:

#List parameters and their values of the SQL Server Network name
PS C:\>Get-ClusterResource "SQL Network Name (SQLMULTISUBCLUS)" | Get-ClusterParameter
#Set parameter value
PS C:\>Get-ClusterResource "SQL Network Name (SQLMULTISUBCLUS)" | Set-ClusterParameter RegisterAllProvidersIP 1

PowerShell code to capture the RegisterAllProvidersIP value

The first line simply grabs all the cluster resource parameters for the cluster resource named SQL Server (MSSQLSERVER). The parameter that we want to modify is RegisterAllProvidersIP. As I've previously mentioned, the default value is 0, which means only the virtual IP address that successfully goes online will be registered with the DNS server in that subnet. The second line simply changes the RegisterAllProvidersIP parameter value to 1.

After changing the parameter value, the virtual server name will now be able to register all of the virtual IP addresses in the DNS server corresponding to the virtual IP address that goes online in that subnet. This also means that the client workstation can now store multiple virtual IP addresses that get registered in the DNS server and, thus, also minimizing "perceived" downtime.

Once the SQL Server instance has been successfully failed over and failed back, you can verify in your DNS servers that the virtual IP addresses that you have defined for your SQL Server Failover Clustered Instance has been successfully registered.

verify in your DNS servers that the virtual IP addresses that you have defined for your SQL Server Failover Clustered Instance has been successfully registered in the DNS Manager

Be sure to test these settings with your applications and come up with a baseline measurement of your client connectivity, tweaking the parameter values based on your requirement, taking note that they should meet your availability objectives.

Testing Application Connectivity

In order for client applications to be automatically redirected during a cluster failover, they need to either be using

the SQL Server Native Client 11.0
the Data Provider for SQL Server in .NET Framework 4.02 or above
the Microsoft JDBC Driver 4.0 for SQL Server

A new connection string parameter named MultiSubnetFailover is made available to allow applications to simultaneously try all the IP addresses assigned for the SQL Server multi-subnet cluster instance name and connects to the first one that responds. When dealing with legacy client libraries or 3rd-party data providers, the MultiSubnetFailover parameter cannot be used. The recommendation is to adjust the connection timeout setting by 21 seconds for each additional IP address. An example of using the MultiSubnetFailover parameter in a connection string is shown below.

The application was created using the .NET Framework 4.0 with the SQL Server Native Client 11.1. Similarly, the parameter can also be used with SQL Server Management Studio under the Additional Connection Parameters tab.

SQL Server Management Studio Additional Connection Parameters to set the MultiSubnetFailover to True

Or the new -M parameter in sqlcmd.

Be sure to do a failover and failback test while your applications are connected to verify that they are indeed getting properly redirected to the correct subnet.

Congratulations! You now have a working two-node SQL Server 2012 multi-subnet cluster and have verified that client applications have been tested for connectivity.

Next Steps

Read more about SQL Server 2012 Multi-Subnet Clustering on MSDN.
Review the previous tips on Install SQL Server 2008 on a Windows Server 2008 Cluster Part 1, Part 2, Part 3 and Part 4.
Download and install an Evaluation copy of Windows Server 2008 R2 for this tip and SQL Server 2012 in preparation for the next one.
Start working on building your test environment in preparation for setting up a SQL Server 2012 multi-subnet cluster on Windows Server 2008 R2. This is a great opportunity to learn more about networking concepts and fundamentals as a SQL Server DBA.

About the author

Edwin M Sarmiento is a Microsoft SQL Server MVP and Microsoft Certified Master from Ottawa, Canada specializing in high availability, disaster recovery and system infrastructures.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips

Article Last Updated: 2012-09-06