Considerations for Implementing SnapManager for SQL Server

By:   |   Updated: 2011-01-06   |   Comments (19)   |   Related: > Backup


Problem

Our company virtualized a large number of SQL Servers to NetApp appliances. The infrastructure team purchased NetApp's application Snap Manager for SQL Server (SMSQL) for snapshot backup and recovery. I'm a DBA, not a storage administrator! What are some best practices prior to implementing this backup tool?

Solution

For this article I'll cover the basic steps for preparing your server to use SMSQL. If you have an existing server that will use the tool then be prepared to do some serious reconfiguring. I won't go into detail as to the underlying technology. For information on what is a snapshot, you can find it here http://blogs.netapp.com/uspub/2010/04/netapp-101-snapshot-your-datas-first-line-of-defense.html#more

Also, I don't work for NetApp and I'm not a NetApp evangelical, but whether we as DBAs like it or not it's becoming more common for large infrastructure changes, like virtualization, to force database administrators into hardware specific solutions. In this case because the storage team chose NetApps as the storage infrastructure there are certain benefits to using a tool such as SMSQL to backup and restore your databases. The benefits mostly come from its integration with the backend snapshot storage technology (Snapshot, SnapRestore, Flexclone) and correctly configuring the systems can significantly cut recovery times for even the largest databases. Still, there are some serious challenges to using what is essentially a storage application to backup SQL databases. As a DBA, it can be frustrating to have the storage design heavily impact how you backup and recover your databases.

First off, for better or worse, the following items must be true in order to successfully implement snapshot backups with SMSQL:

  • System database files cannot reside on the same LUN as user databases
  • User database logfiles and datafiles need to be on separate LUNs
  • Tempdb must have its own LUN
  • A separate LUN needs to be created to store the Snapshots
  • SQL must be installed on the server before installing SnapDrive and SMSQL
  • ONLY database files can exist on the LUN. You'll need to put all application binaries and non-database files on different LUNs.

These restrictions tell us that at a minimum we will need 5 LUNs or, if you prefer, drive letters. If your system was a physical box that was then virtualized and has all the files on one drive then you will need to have the storage team add additional LUNs. You will need to physically move the data and log files to the appropriate drives.

Here is a typical configuration:

 basic steps for preparing your server to use SMSQl

In this configuration:

  • L is for the user database logs.
  • M is for the user database data.
  • S holds the snapshots and
  • T is for the tempdb files.

How you want to name the drive letters and organize the directory structure is completely up to you. The size of your S drive is determined by how many backups you take and store. Hopefully your storage administrator created dynamic LUNs which allows for easy expansion.

The system files can either be on your C drive or you can create a new dedicated LUN. They cannot be on the same LUN as user database files and they cannot be backed up using SMSQL. Why? Because in order to create a snapshot the application must quiesce the database and SQL does not allow you to freeze IO for system databases. You'll need to setup a separate maintenance plan for your system databases. In our company we use a central MSX server and use native SQL backups.

Things to Consider

You will need to consider the relationship between LUNs and volumes. Ideally each database should be on its own volume and LUN. This would mean that if you had 100 databases then you would need 200 LUNs (100 for data files and 100 for logs)! The reason behind this is that the snapshot occurs at the volume level. The snapshot must quiesce every file on the volume whether or not it is being backed up. If you have a large number of databases on the volume then the IO freeze may last too long and SQL will lose its connection to the databases. The recommendation is no more than 25 databases on a single volume. If IO freezes more than 15 seconds, problems can occur.

You will see significant increases in restore time if multiple databases are on the same volume. In this scenario SnapManager needs to snap the volume to a temporary location to restore the single database. It can't apply the snapshot for the volume because it would overwrite and corrupt all the other database files on the volume. This is also why no other application files can exist on the same LUN as your database files. I tend to have a one-database-per-volume configuration only if the database exceeds 200 GB. The beauty of this one-to-one relationship is that even a multi-terabyte database restores almost instantaneously.

Finally, expect to see these messages in the SQL errorlog during your backup times.

expect to see these messages in the SQL errorlog during your backup times

The IO above was frozen for 5 seconds on a database named DBA. That's typical and doesn't seem like a long time, but be aware that some applications are sensitive to session timeouts and users may see their screens freeze or their connections drop. If this happens you'll need to either change the time your backup occurs or move the databases (especially the large ones) to their own volume.

Next Steps


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Scott Shaw Scott Shaw is a Lead SQL Server DBA with extensive experience running SQL Server in a virtualized environment.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips


Article Last Updated: 2011-01-06

Comments For This Article




Wednesday, January 25, 2017 - 2:17:15 PM - Brian Back To Top (45573)

 Hello Scott. I found the information in this article usefull as I have been tasked as a DBA to use SnapMgr. I currently am doing SQL native backups using OLA maintenance scripts. While is see the benefits of using SnapMgr for SQL, I also fear the an I/o freeze in a cluster migth be catastrophic, as many of my luns house greater 35 databases.

The other piece I have to figure out, and can't find any useful information about, is integrating the part of OLA's script that writes to the commandlog after the backups finish. The commandlog insert after db backup completions a requirement of an application running in my SQL cluster.

Do you know of any DBA's who have taken the time to figure out how to intergrate OLA's commandlog into SnapManager after backups complete? 

Thanks, Brian

 


Tuesday, February 1, 2011 - 11:06:51 AM - TIm Boesken Back To Top (12773)

We used SMSQL for our SQL backups and had to deal with weekly unplanned outages. Our % of change is high due to testing data from manufacturing machines. What we discovered is if we do not have a 3x1 Snap Volume to Lun size ration (for a 100 GB data LUN we needed 300 GB for snap space) the vol space would overrun the Lun and drop it offline. Not something you want to deal with production databases!

We had Netapp troubleshoot and they came up with the 3x1 ratio. When we have some databases >500GB, that meant 1.5 terabyte snap space. So it got very spendy very quickly. Needless to say, we moved to a 3rd party streaming backup tool and just use seperate backup luns that we stream our backups to. We then use Snap Mirroring to move the backup files to near store for DR.

While I appreciate the fast backup times of SMSQL, and recovery times, sometimes it does not make finacial sense to "throw disk" at the backup solution.


Friday, January 7, 2011 - 10:58:23 AM - Brian Back To Top (12530)

Nice article wish it had been around a number of years ago when I started using Netapp and SQL Server. When I started using Netapp and SQL server there were some concepts that were overlooked that seem to be over looked in a few posts as well. Here is a brief list of concepts I keep around hope it is helpful

·     There can be multiple luns per Volume

·     Snapshots are taken per volume - all luns in the volume are snapped (backed up)

·     When snapping a volume with a system database(master) the snapshot is streamed like a regular backup.

·     Luns are mounted attached as Drives to the Windows OS but not necessarily 1 to 1.

·     Snapping the tempdb is usually a waist of space make sure it is on its own Volume and not snapped.

·     IF using the simple recovery model separating the logs and datafiles on different volumes is not necessary.

·     If using multiple backup methods and not in simple recovery have only one method manage the transaction log backups.

I feel that snapshots are a little ***bersome for every day backups and restores. I only use the Netapp snapshots (SMSQL) for DR recovery where it works very well in our large environment. This means we us a different solution for daily backups. Some netapp reps will try to tell you that this will not work. The key is the transaction logs just determine how you want to manage the transaction log backups and use only one solution for the transaction logs.

 

 


Friday, January 7, 2011 - 10:07:21 AM - Jason Back To Top (12529)

Oh, I also found the syntax that I use to copy every database on the server, regardless of if databases were added or removed since the last configuration.

C:\Program Files\NetApp\SnapManager for SQL Server\SmsqlJobLauncher.exe new-backup svr 'XXXXDBS' -d 'XXXXDBD', '0' -RetainBackups 3 -lb -bksif -RetainSnapofSnapInfo 4 -trlog -noutm mgmt daily

This is the CMDEXEC in a SQL Agent Job.


Friday, January 7, 2011 - 9:57:43 AM - Matt Back To Top (12528)

You are correct the logs and SnapManager do not have settings for tempdb, it should be ignored but for some reason it seemed it was getting lumped in which I think is because SnapDrive was recording the changes but thats just a guess.  Again, I don't understand NetApp that well at all.  I know enough to scream and yell and break it :).

I am not 100% sure why but relocating the tempdb seems to have calmed things down from what I can tell.  The size of the snapshots were several gigs and I don't have any reason why other than tempdb which is used heavily on this box.  One test would be to create a temp table several gigs on a server that has a small system drive snapshot footprint and see what happens.


Friday, January 7, 2011 - 9:47:40 AM - Jason Back To Top (12527)

Matt,

That's odd.  In my SMSQL interface, I cannot ever select TempDB to be backed up.  I also cannot select any of the System databases to be backed up via Snapshot.  Here's an excerpt of a nightly SMSQL backup log.

[00:00:31.510]  Backup database list:
[00:00:31.510]  1 - XXXXXDBS : master
[00:00:31.511]  2 - XXXXXDBS : model
[00:00:31.511]  3 - XXXXXDBS : msdb
[00:00:31.512]  Getting SnapInfo directory info...
[00:00:31.513]  1 - Database [master] of [XXXXXDBS] SnapInfo directory located at:
[00:00:31.513]  H:\SMSQL_SnapInfo\SQL__XXXXXDBS\DB__master
[00:00:31.514]  2 - Database [model] of [XXXXXDBS] SnapInfo directory located at:
[00:00:31.514]  H:\SMSQL_SnapInfo\SQL__XXXXXDBS\DB__model
[00:00:31.515]  3 - Database [msdb] of [XXXXXDBS] SnapInfo directory located at:
[00:00:31.515]  H:\SMSQL_SnapInfo\SQL__XXXXXDBS\DB__msdb


[00:00:31.516]  CHECK SNAPMANAGER LICENSE ON LUN.
[00:00:31.715]  Querying LUN license: I...


[00:00:31.892]  *** FULL DATABASE BACKUP


[00:00:31.893]  FULL DATABASE BACKUP [#1]


[00:00:31.893]  Full database backup location for database [master]:
[00:00:31.894]  H:\SMSQL_SnapInfo\SQL__XXXXXX\DB__master\FG__\StreamFullBackup\01-07-2011_00.00.22__master__Daily.FBK


[00:00:31.894]  Connecting to SQL Server XXXXX...
[00:00:31.904]  Starting full database backup for database [master]...
[00:00:34.488]  Database [XXXXXDBS:master] Percent: 6 percent processed.6
[00:00:34.589]  Database [XXXXXDBS:master] Percent: 11 percent processed.11
[00:00:34.694]  Database [XXXXXDBS:master] Percent: 15 percent processed.15
[00:00:34.728]  Database [XXXXXDBS:master] Percent: 22 percent processed.22
...

It peforms the steps for Master, Model, MSDB, once those are done, it cleans up old backups based on my settings, and then starts the snapshot process for the User DB's on the two LUNs I have created for those files.  Are you using some other NetApp tool to snapshot your LUNs? 

 

 


Friday, January 7, 2011 - 8:38:59 AM - Matt Back To Top (12526)

Thanks Jason for your elaboration of your setup.  I see what your are saying.

From what I found and working with a NetApp support the tempdb was still being included.
My network team noticed that the system database volumn had fairly large snapshots.  I couldn't find any reason as the physical size was less then a gig.  I am still new to SQL and NetApp so perhaps I am missing something.  I contacted NetApp support and asked about tempdb on system volumn and they told me that it would record the changes if it was included on that lun. 


Thursday, January 6, 2011 - 4:09:31 PM - Jason Back To Top (12523)

Matt,

Since I'm not taking snapshots of the LUN that has the System DB's, I can't possibly be taking snapshots of TempDB.   SMSQL cannot take snapshots of a LUN that has System DB's on it, as it cannot suspend the I/O on those databases.  It does a Stream backup (Native SQL backup) of Master, MSDB, and Model.  That is why it requires that all System DB's are on their own LUN.


If I look in the LUN I use for my Snap Info directory, I see the two subdirectories, one for the two User Database LUN's that I have MDF and LDF files on.  Those are snapshots of the LUNs.  The other directiry is the SQL__ (the System DB LUN backups), and in there I have directories.  One for Master, one for Model, and one for MSDB.  Under a couple more subdirectories, there are files with a ".FBK" extension.  These are the full native backups of the database.

SMSQL simply cannot snapshot a LUN with a system DB located on it... it cannot suspend the I/O for the databases.  Again, I'm using Snapdrive 6.2 and 6.3 and SMSQL versions 5.0 and 5.1 on my systems.


Thursday, January 6, 2011 - 3:17:09 PM - Matt Back To Top (12520)

JASON!!!!! move that tempdb as stated above...  you are taking a snapshot of tempdb if it is with other databases especially system.  I ran into this just the other week.  The documentation is misleading as you think it is not even looking at tempdb but it does when with system databases.

Wish I could get a dedicated Verify server :(

David, I agree with the cleanup needing some improvements but I haven't seen any big issues.
When adding/removing databases you need to run a reconfigure, pain, seems that sometimes its not needed.  You also have to modify the jobs to remove those databases.  I assume if you were doing all databases you could script to collect all databases then use a variable but it is easy enough to update the job.  It is good to run the reconfigure as I have seen application install and put the files in some other location then where I want or some other problem.

 


Thursday, January 6, 2011 - 3:10:33 PM - Jason Back To Top (12519)

David,

Actually, I might be wrong, it seems that the difference isn't selecting the entire server when using the Backup Wizard to get all the databases..  it seems that it's happening on the servers where I have SMSQL 5.1 installed, and not happening on the ones where I have SMSQL 5.0 installed.


Thursday, January 6, 2011 - 3:05:56 PM - Jason Back To Top (12518)

David,

Regarding new databases.... when you Configure a Backup, you can select the entire computer, which will in turn select all the User DB's and all the System DB's.  What this does then is create the actual OS level command without a list of DB names.. it will instead backup any database on the system with the exception of TempDB.    I have my scheduled job stored as a SQL Job, and if I look at the job for a system where I've selected the computer name, the command is as follows. 

C:\Program Files\NetApp\SnapManager for SQL Server\SmsqlJobLauncher.exe new-backup  –svr 'XXXXXX'  -d  'XXXXXXXX', '0' -RetainBackupDays  7 -lb -bksif -RetainSnapofSnapInfo 8 -trlog  -noutm  –mgmt daily

There's no list of Databases there, this is from my system that has 72 databases on it.  Here's another one from a sharepoint instance where I have not selected the entire server for the backup configuration.

C:\Program Files\NetApp\SnapManager for SQL Server\SmsqlJobLauncher.exe new-backup  –svr 'XXXXX'  -d  'XXXXX', '34', 'Application_Registry_Service_DB_9a295edc61634367958d4fd959994ae2', 'BDC_Service_DB', 'Extranet PerformancePoint_a56f3d71333f4872ab5966afdccd045d', 'Extranet_Search_Service_CrawlStoreDB_f52078f0d0e4436db4595feafa297db5', 'Extranet_Search_Service_DB_62ed2f9624b04c349ee9badc9d5a5f72', 'Extranet_Search_Service_PropertyStoreDB_a8af35547e8a4ea38e13fa9105116877', 'ExtranetManagedMetadata', 'ExtranetWordAutomation', 'master', 'model', 'msdb', 'Profile DB', 'SharePoint_AdminContent_602c15be-3abe-4977-9958-941dfaeab314', 'SharePoint_Config', 'Social DB', 'StateService_4366ac1e823f42c7b7327e51f1079460', 'Sync DB', 'WebAnalyticsReporting', 'WebAnalyticsStaging', 'WSS_Content_Ext_Drop_Dept', 'WSS_Content_Ext_Drop_EHS', 'WSS_Content_Ext_Drop_Eng', 'WSS_Content_Ext_Drop_Exec', 'WSS_Content_Ext_Drop_Fin', 'WSS_Content_Ext_Drop_GBS', 'WSS_Content_Ext_Drop_HR', 'WSS_Content_Ext_Drop_LCM', 'WSS_Content_Ext_Drop_Legal', 'WSS_Content_Ext_Drop_Ops', 'WSS_Content_Ext_Drop_ProdMgmt', 'WSS_Content_Ext_Drop_Qual', 'WSS_Content_Ext_Drop_SalesMktg', 'WSS_Content_ExtranetPortal', 'WSS_UsageApplication' -RetainBackupDays  7 -lb -bksif -RetainSnapofSnapInfo 8 -trlog  –mgmt daily

As you can see, I have each DB listed.  Somewhere I have recently also read a tip on how to make a job that will programatically create this job and DB list.  I think by using SP_MSFOREACHDB, or querying the name from sys.databases.


Thursday, January 6, 2011 - 2:33:10 PM - David Benoit Back To Top (12517)

Good article Scott and nice coverage of the topic. We are presently using this technology and it definitely has benefits for DR solutions. A couple points of interest that we have hit in our utilization as follows;

1. The snapshot cleanup process can be a bit problematic in that it actually causes some pretty intense activity which can result in degraded I/O performance for your database. This will most likely vary depending on the utilization of your controller which in our environment is typically high already.

2. When new databases are added / dropped there is some manual effort that is required in order to get SnapManager to recognize the changes. We have not yet found a way to automate that.

3. As with any database that is used for replication you need to have a solid testing plan for recovery even with this technology. The recovery process is very similar to that of standard backup and restore but it should be tested to be sure that you know how to fully recover and get replication working again.

Thanks again for the article and good information in all the comments above.

David


Thursday, January 6, 2011 - 2:30:23 PM - Jason Back To Top (12516)

Great article.  We brought up a new Virtualized center 6 months ago, and it was my first dip into NetApp and Snapdrive / SnapManager for SQL.  We've just upgraded our VMWare from 4.0 to 4.1, and missed a known bug with Snapdrive 6.2 and VMWare 4.1.  It's caused a number of headaches for us... so we're quickly upgrading to Snapdrive 6.3.  We use RDM's for all my SQL Data files, and VMDK's for the OS and Binaries.

My thoughts are a lot of repeated info... I have most of my systems configured with a LUN for User Database datafiles, a LUN for User Database logfiles, a LUN for System Database files, and a LUN for SnapInfo.   I have TempDB online with the other system databases.  The System databases need to be on their own LUN because the I/O cannot be queisced, and so they are backed up as a stream (think Native SQL backup)... and TempDB cannot be backed up.  So there's no harm in having TempDB on the same LUN as the other System Databases.

The recommendation is to not have anything other than datafiles on your LUNs because you're taking a snapshot of the entire LUN, and there's no reason to snapshot other files when you're backing up your databases.  It's just a waste of precious NetApp disk.  The recommendation to split databases into seperate LUN's also stems from the fact that when you take a snapshot you do it of everything on the LUN, and so if you want to backup or restore a single database and you have those files on a LUN with other database files, you will have to backup or restore everything.

The recommendation for 25 is only that, and as has been said, it's due to the time the I/O has been suspended in the databases that have files on those LUNs.  I have one system that has 72 small databases on a set of Data and Log LUNs and everything is fine.  Keep in mind that each LUN needs a volume much larger to store the snapshots, so if you split all your DB's into seperate LUNs, you're goingt to be using a lot more disk than if you can consolidate.

We're running Snapdrive 6.2 and 6.3, and currently using Snapmanager For SQL 5.0 R1 P2.  (P2 is the version that works with SQL Server 2008 R2 and Full Text Catalogs - Since SQL 2008 R2 moved the FT files into the primary MDF). 

I glanced over the article and comments quickly, I didn't see mention of Remote Verification.  This is an awesome feature that I'm employing on my Production databases.  It allows me to complete my Snapshot backups, and then mount those snapshots on another SQL Server that has SnapDrive and SMSQL installed.  It then attaches those database files to the SQL Server instance, and runs DBCC CheckDB on those.  I offload all that expensive itegrity checking from the production servers to an offline server dedicated only to this purpose.  Not only do I verify the database integrity with this, but also that the backup is verified.

As a DBA, I highly recommend getting a NetApp NOW account, and using their forums and community resources for help. 


Thursday, January 6, 2011 - 2:22:22 PM - Matt Back To Top (12515)

Thanks for the clarification.

As to other files on snap luns I will have to make some changes here on my new SQL 2008 boxes and luckily haven't done any point in time restores as everything is simple model, but I am trying to explain benefits of full to our team for some of the major systems.  Although, I don’t know all the requirements with backups that I will need based on our current strategy.

Thanks for the link I will make sure to check it out.

Is there a way to be notified of your new tips? I didn’t see a way to be notified or follow a person.  I get the newsletter if that’s all I need.

I wouldn't mind a topic on flexclone for test and dev :)  Currently mount clone to dev server, copy database over and remove clone all before the next snapshot or verification (disable in some cases just in case) because I think it will say another snapmangager operation message or something.  Trying powershell to script but it is difficult because they are separate machines and figuring out exactly what should be done.

Yet another topic, verification practices... currently verifies happen at night which could happen during a tape backup which only drags the tape slowing it down and hitting the system all the harder and adds the _Mount database data.  I am told there is no real control over when tapes happen as they are based on windows of opportunity and not specific times.

I wish snapmanager would show the size of the snapshots this could help a lot with planning space requirements.  I know network team can get this but I don't have access to anything but snapmanager and limited SnapDisk to dismount snapshots.  I don't even think my Network team understands everything either and know nothing about SMSQL.

Another thing I found is the SnapManager for SQL in program files has reports and you may want to cleanup these logs.
SnapDrive also has some reports and these may need to be cleaned up from time to time as well.  I haven't found any settings in the applications to retain bla bla reports.  And most everything on NetApp site is for Exchange.

In reference to the post about the SQL Jobs from Larry thanks for the added tip, didn't think about management groups :).
I wish my network team would stop going directly to tape (I don't have control over tape backups), we have occasional issues with tapes going off around a snapshot and snapshot failing.  We currently take snapshots around core business hours and tapes at night.  Tapes hog a lot of bandwidth and the heaviest tasks/jobs run at night including index maintenance and verifies.

In reference to the post from Scott and SSMS and SMSQL, totally agree, and I don’t like how you can’t change the path on the new drive when migrating
J.  I tend to make the moves myself and save a copy of original incase system crash.  As far as the default options, I think majority of DBA’s know to review defaults J I know I learned my lesson a few times.  These should not be default by any means and I think the NetApp team needs to bring in or ask SQL community for improvement ideas if they are not J.

Mount Points, Mount Points… I haven’t run out of drive letters yet?  This would be another good topic if it can be explained from a DBA perspective and simply enough for the new
J.

Again, Thanks,
Matt


Thursday, January 6, 2011 - 12:38:08 PM - Larry Back To Top (12513)

I should have mentioned the NetApp documentation I have read and use:

 

1.  SnapManager® 5.0 for Microsoft® SQL Server® Installation and Administration Guide:  I have a copy of this, but for the life of me cannot find a link to it on NetApp's website now.  This is the "bible" I have primarily used. 

2.  SnapManager 5.0 for SQL Server: Best Practices Guide:  http://media.netapp.com/documents/tr-3768.pdf

3.  Best Practice Guide for Microsoft SQL Server on NetApp Storage:  http://media.netapp.com/documents/tr-3821.pdf

4.  Helping DBAs Become More Efficient:  http://media.netapp.com/documents/wp-7079.pdf   (This is really aimed at Oracle DBAs, but it introduces and helps you understand some of NetApps technologies and concepts.)

 


Thursday, January 6, 2011 - 11:53:00 AM - Scott Shaw Back To Top (12512)

Hey Larry,

Great points!  You jogged my memory with your point about mount points. I should have mentioned that in the article and I'm glad you brought that up. That's a great way to get around the LUN limitation.

VM has caused a lot of changes in the way we think as DBAs. Like you mention the concept I grew up with of separating datafiles for performance is no longer valid (or at least out of the control of many DBAs) due to how storage is allocated and cut up. Its frusting to not know what IO from what application you might be sharing. I've looked at some of the shared IO and I think I would have been better not knowing at all...

Our company has built a new data center and moving all the systems (+ 3,000) will be accomplished using Snap technology. After the move there is no going back (there will be no physical Windows servers in our environment) and I hope to take your advice and start using more of the Flexclone and Flexmirror options. I'm glad to hear they are working well for you.

As you know, getting your systems ready for implementation is only part of the battle. Actually backing up the systems is another feat. One thing I've found and will discuss in another article is the lack of true integration between SSMS and SMSQL. For example, SMSQL by default will want to move your databases, verify snaps, and run dbcc checkdb.  For even medium sized databases, accepting the defaults can bring your system to a halt!  I think NetApp has a lot of work to do on this front.

Thanks for your comments!

Scott


Thursday, January 6, 2011 - 11:36:21 AM - Scott Shaw Back To Top (12511)

Hi Matt,

The limitations are based on the amount of time it takes to snap the entire drive - so the number of databases can vary. We choose 25 just to be sure but I've heard around 30 also thrown about if databases are small.  The numbers come from a NetApp trainier who was onsite for a week.  Again, if the snap takes longer than about 15 seconds than SQL will lose the connection and will shut down. Keep in mind also that the more databases you have on a single LUN the longer your recovery times.

Thanks for the clarification. You are correct. You could put misc files on your tempdb. The misc file rule only applies to the LUNS that have database files in the snapshot.This is because if you restore those snapshots to a point-in-time than all other files will be overwritten.

Excellent point about tempdb! It should be excluded from any snapshot and in some cases doing so will save a lot of space.

I feel for you to be new at SQL AND have to deal with snapshots. I'd like to point you to a blog that I think pretty much describes the future DBA role: http://sqlblogcasts.com/blogs/gavinpayneuk/archive/2010/12/18/the-role-of-the-infrastructure-dba.aspx. Trust me - its unfortunate but if others don't feel our pain yet, they soon will!

I plan to follow-up with other tips that go more into depth about how many and how frequently to setup snapshots and also how to handle notifications and errors. Like you said the documentation is thin and I sometimes get the feeling that NetApp is learning right along with us.

I hope I addressed your questions and I look forward to any future comments you may have. Feel free to contact me anytime. Good luck!

Scott 


Thursday, January 6, 2011 - 11:30:26 AM - Larry Back To Top (12510)

Thank you for your article on SnapManager for SQL.

 

Our company has also centralized storage using NetApp appliances.  I definitely agree that there is a learning curve for the DBA, but mostly in the area of backup/recovery.  Also, when properly configured, the DBA does not need to nearly *as* concerned about how to configure drives and where to place your database files, such as separating data, index, log, system, and tempdb files to optimize performance.

 

One point I would like to clarify is that if you utilize mount points, you will not need separate drive letters for each LUN.  We have over 350 user databases in an active-active cluster, where all of our databases are mounted on shared drives (one per server), and utilize mount points (folders/sub-directories) within each drive that represent each of the LUNs in use.  The actual recommendation of databases per LUN is no more than 35 (not 25), so we do this for smaller databases, and have separate LUNs for each of the larger databases.  There are also separate folders (mount points) for system DBs, tempdb, and SnapInfo (snapshots).

 

Another correction to the article is that you are not required to have separate LUNs for data and log files if you do not utilize the full recovery model.  Due to the nature of our business, we are able to use the simple recovery model, so we have our data files and log files on the same LUN, and it works perfect.

 

There are two features about using NetApp technologies that are a tremendous benefit, and these are FlexClone and FlexMirror.  We use FlexClone daily to create our test environment, creating an exact duplicate of all of our production databases.  This process only takes around 15-20 minutes, remembering that there are over 350 databases (several terabytes) on 2 servers!  Our developers and QA team absolutely love this, and it is probably the easiest process that I have come across to do this.  As far as FlexMirror goes, this is used to replicate our snapshots to our DR site.

 

In our environment, one of the few negative things I have found in all this is that SnapManager for SQL does quite a bit of "housekeeping".  The actual snapshot of the volume is very fast, as Mr. Shaw has indicated (usually a few seconds, regardless of DB size), but SMSQL does other things, such as checking licenses prior to each snapshot and cleaning up old snapshots.  So if you have a large number of databases like we do, the whole process takes quite a while to complete.  At first this bothered me, but once it sunk in that the actual snapshots are completed so quickly, I felt more comfortable with how it works.

 

You actually do not need to create a maintenance plan to back up your system databases.  SMSQL can back these up as part of the normal process, but instead of making snapshots, it does a streamed backup, which is just an ordinary backup.  Nothing special has to be done if you have your snapshot process configured to snapshot all databases.  This is how our servers are set up and it works flawlessly.  The backup files of streamed backups are ordinary backup files and are located in the same SnapInfo folder (LUN) as all the other snapshots.

 

One last point I would like to make is that for my configuration, I have two SQL Agent jobs set up to perform snapshots, each with slightly different parameters.  One job snapshots databases every 2 hours, and cleans up snapshots older than 1 day.  The other job snapshots user databases at midnight, and retains snapshots for 3 days. Each job has its own naming scheme (which is called "management group" in SMSQL) to help differentiate between them at the folder level.  So with this configuration, we have snapshots (local on our servers) for every 2 hours for the past day plus the past 3 nights.  The snapshots require a fraction of the space to store compared to traditional backups, so you could actually retain them (locally) much longer if desired.  And as business requirements dictate, you may also want to copy these to long-term storage such as tape.

 

Thanks,

- Larry


Thursday, January 6, 2011 - 11:09:57 AM - Matt Back To Top (12509)

Thank you very much for your post you cover many topics I have been uncertain about.

Quick question on that 25 database limit you have, is that a personal recommendation or do you have documentation that states this?  I am in a similar situation and I am in the middle of planning for SQL 2008 upgrades on all our databases.  If I can prove to Network team of these limitations I might be able to get some extra drives.  Unfortunately, I need proof and documentation is the best method.  I hate the NetApp documentation as it is written from a storage point of view and not DBA :) and this is something I have to live with.

Also, you say you cannot or should not have other files on the lun, why exactly is that?  From what I can tell, I have a text file or something and it exists in snapshot.  What about putting misc files on tempdb drive since this does not get snapshots?

Also, I would like to mention that although tempdb is a system database it does not belong on the system database drive.  This will cause your system database snapshots to be larger than normal because system databases use a different method and will include tempdb which if separated does not need snapshots.  (this is just to cover those that might question, you have yours in its own drive)

Also, a worthy mention is disabling snapshots or what to do during maintenance or large database changes.
I will remove all snapshots and stop the snapmanger services which seems to prevent taking a huge snapshots of changes and drop the drive.  It would be advisable to notify your network team and have them dedicate a huge amount of space to the backend so you do not drop a drive which would allow you to retain backups.  In my situation, this is not always possible.

I wish it would just fail the snapshot and not drop a drive, ARG!!! I hate that!

I am happy to find someone who feels my pain.  I am new enough to SQL having been in the role for a short time and I find it difficult to explain to NetApp support as well as my network team what I see.

Anyway, just thought I would share.

Thanks,
Matt















get free sql tips
agree to terms