By: Eric Blinn | Updated: 2019-05-10 | Comments | Related: More > System Databases
Problem
I want to learn more about the SQL Server MSDB database and what it can offer me as a DBA.
Solution
MSDB is one of the 4 default system databases created when installing SQL Server along with master, model, and TempDB. The purpose of MSDB is to hold most of the information needed to support SQL Server Agent, specifically, it holds the information for the SQL Server Agent jobs, job history, operators, and alerts. For database activity it contains backup history, restore history, log shipping configuration, etc. It also holds the information needed to run Database Mail. Historically it has been the home for DTS packages and SSIS packages, but DTS is no longer supported and SSIS is often stored in a catalog. This tip will focus on backup and restore history.
All of the MSDB tables can be viewed in SQL Server Management Studio (SSMS). At first, when expanding the tables node in SSMS there will appear to be no tables. Instead, all of the tables will be found one node deeper under "System Tables".
SQL Server Backup Information
Anyone that has restored a database using SSMS has noticed that the restore wizard will quickly offer up a potential solution for a most recent date and time restore. It does not get this information from reading files off of disk or looking at the history from the SQL Server Agent backup jobs. Instead, it reads this information from MSDB which remembers all of the native backups taken by any process.
These same tables are available to be queried by a DBA. To follow along with these examples, use SSMS to connect to a SQL Server and open a new query window.
To start out a backup will be needed. The following script will take a backup of MSDB. Any other database will also work. Also, an existing backup may be used.
BACKUP DATABASE msdb TO DISK='C:\MSDB.bak' WITH name='MSSQL TIPS', DESCRIPTION='Made Today', COMPRESSION
The first set of tables to be considered are the backup information tables dbo.backupset, dbo.backupmediaset, and dbo.backupmediafamily. Adjust the file name and WHERE clause as necessary to match the backup chosen in the first step and execute this pair of statements.
RESTORE HEADERONLY FROM DISK = 'C:\MSDB.bak'; SELECT backupset.[name] , backupset.[description] , [type] , expiration_date , is_compressed , Device_Type , [user_name] , server_name , [database_name] , is_copy_only , backup_start_date , backup_finish_date , backup_size , compressed_backup_size , physical_device_name , [backup_set_id] , backupset.media_set_id FROM msdb.dbo.backupset INNER JOIN msdb.dbo.backupmediaset ON backupset.media_set_id = backupmediaset.media_set_id INNER JOIN msdb.dbo.backupmediafamily ON backupset.media_set_id = backupmediafamily.media_set_id WHERE database_name = 'msdb' AND backup_start_date > DATEADD(n, -20, GETDATE());
The information from the MSDB query should look quite a lot like the information from the header restore of the backup file. There is other information available including the actual backup file name, size, datetime, and log sequence numbers of the backup. This will come in handy later.
One important note: The value type can be a little bit misleading. "D" is short for "Database". It indicates a full backup. The value "I" means a differential backup was taken. The value "L" is for transaction log backups. There are other possible values, but these three are far and away the most popular.
Make note of the value for backup_set_id for the next demo.
Most of the information from a RESTORE FILELISTONLY command is stored in the table dbo.backupfile. Run this pair of queries using the backup_set_id and filename from the previous query.
RESTORE FILELISTONLY FROM DISK='C:\MSDB.bak' SELECT logical_name , physical_name , file_type , [filegroup_name] , file_size , backup_set_id FROM msdb.dbo.backupfile WHERE backup_set_id= 4057
This query will show the most recent full backup for a given database and every log backup taken since and will do so in the proper order that they need to be restored. It could be a handy script to have on hand.
SELECT type, physical_device_name FROM msdb.dbo.backupset INNER JOIN msdb.dbo.backupmediafamily ON backupset.media_set_id = backupmediafamily.media_set_id WHERE database_name = 'MSSQLTips' and backup_finish_date >= (SELECT TOP 1 backup_finish_date FROM msdb.dbo.backupset b1 WHERE b1.database_name = 'MSSQLTips' AND b1.type = 'D' ORDER BY b1.backup_finish_date DESC) ORDER BY type, backup_finish_date
How can this information be put to use? Consider this previous tip which used these tables to determine the most recent full backup of each database and also the most recent differential if it exists. It then uses the backup_set_id to look up the logical file names of the underlying database. Finally, it takes that information to restore the database to another machine in a fully automated way.
SQL Server Restore Information
Much in the same way that each native backup is recorded in MSDB, so is each database restore. To see how this works, download the WideWorldImporters bak file from this link and restore it to a new database with a unique name.
RESTORE DATABASE [MSSQLTips] FROM DISK = N'C:\WideWorldImporters-Full.bak' WITH MOVE N'WWI_Primary' TO N'C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\DATA\MSSQLTips.mdf', MOVE N'WWI_UserData' TO N'C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\DATA\MSSQLTips_UserData.ndf', MOVE N'WWI_Log' TO N'C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\DATA\MSSQLTips.ldf', MOVE N'WWI_InMemory_Data_1' TO N'C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\DATA\MSSQLTips_InMemory_Data_1'
Having completed that operation there is a new row in the table restorehistory to reflect the activity.
SELECT TOP 1 * FROM msdb.dbo.restorehistory ORDER BY restore_history_id DESC;
This row indicates that a full (D) restore was done at this time for the target database MSSQLTips.
If the column name backup_set_id looks familiar, it should! As seen in the previous demo, the data in the backup tables is very similar to the file list and header data in the backup files themselves. When a restore is completed from a backup file that wasn’t already listed in the set of backup tables then new rows are created in those tables with the data from the backup file(s). This is why there is a server name column in the backup tables –to indicate the source of the backup information -- and why queries commonly use that column in an argument.
SELECT restorehistory.destination_database_name DestDBName , restorehistory.restore_date , restorehistory.restore_type , backupset.user_name , backupset.server_name OrigSvrName , backupset.database_name OrigDBName , backupset.backup_finish_date FROM msdb.dbo.restorehistory INNER JOIN msdb.dbo.BackupSet ON restorehistory.backup_set_id = backupset.backup_set_id WHERE restore_history_id = 3014;
The user jodebrui must have their name stored in thousands of SQL Server instances across the globe!
How can this information be useful? If there is a test environment that is built by restoring a copy of the production database there is a frequently asked question in any business, "when was the last time we restored test from production?". This can be a hard question to answer with the DBA scouring emails or helpdesk requests to make a best guess. Now, armed with this information, the DBA can go to restore history and say confidently that the restore was done at a specific date and time (msdb.dbo.restorehistory.restore_date) with a backup file from a specific production server (msdb.dbo.backupset.server_name) that was taken at anther specific time (msdb.dbo.backupset.backup_finish_date). That is a very accurate response!
Next Steps
- Check out these resources:
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips
Article Last Updated: 2019-05-10