By: Jeffrey Yao | Updated: 2014-06-25 | Comments (9) | Related: > High Availability
Problem
Each time we get a production SQL Server reboot incident, we need to investigate and prepare an "Incident Report" in which the DBA Team must report the duration that the SQL Server service was unavailable. Is there an easy way to determine the downtime window after a reboot? Check out this tip to learn more.
Solution
First, for a SQL Server service restart notification there are two ways (assuming
Database
Mail is setup correctly) to determine the downtime:
- Send a notification via a startup stored procedure, see sp_procoption
- Send a notification via a SQL Server Agent Job, whose schedule is set to start when the SQL Server Agent service starts. Here we assume SQL Server Agent is set to auto-start, i.e. when the SQL Server service starts, the agent service starts too.
Second, to determine the SQL Server service downtime window there are two ways to determine the downtime as well:
- Finding the time for the last entry in the previous
SQL
Server Error Log and the first entry of the current SQL Server Error Log
- Finding the time for the last entry in the previous default SQL Server Trace file and the first entry of the current default Trace file
Note: these two methods may not give accurate information in some scenarios, such as a sudden power outage (assuming there is not a UPS), but in the majority of cases these two methods work fine. I once had a case whrere a managed clustered node failed over to another node. The whole failover downtime window was only around 14 seconds and since it was so short, nobody noticed this incident except for the DBA team.
SQL Server Error Log Entries Indicating Downtime
Here is what the SQL Server Error Log entries look like:
Query the SQL Server Default Trace to Determine Downtime
If we check the previous default trace file using the code below:
-- need to change to the trace file path to your own:
SELECT f.EventClass, f.EventSubClass, te.name, v.SubClass_Name, f.StartTime FROM fn_trace_gettable('C:\Program Files\Microsoft SQL Server\MSSQL10_50.MSSQLSERVER\MSSQL\Log\log_509.trc', 1) f inner join sys.trace_events te on te.trace_event_id = f.eventclass inner join sys.trace_subclass_values v on v.trace_event_id = f.eventclass and f.eventsubclass = v.subclass_value where f.eventclass=18 ORDER BY f.STARTTIME DESC
We will get:
Determining SQL Server Downtime from the Default SQL Server Trace and Email DBAs
I will provide two methods, one is a stored procedure (checking the default trace) to be used as a startup procedure and another a PowerShell script (checking the SQL Server Error Log) to be used in a SQL Server Agent Job which is scheduled to run when the SQL Server Agent service starts. These scripts assume the default SQL Server Trace is not disabled and PowerShell 2.0 is installed.
/* Created: Jun 1, 2014 by Jeffrey Yao Function: Send email notification about the sql service restart and report the down time window. It is applicable to sql server 2005 and above version only. Assumption: The database mail is already configured and workable on the sql server instance. */ use master if object_id('dbo.uspRestartAlert', 'P') is not null drop proc dbo.uspRestartInfo; go create proc dbo.uspRestartAlert as begin set nocount on; declare @curr_trc_file varchar(256), @prev_trc_file varchar(256); declare @trc_file_name varchar(128), @trc_file_folder varchar(256); declare @trc_num int, @slash_pos int; declare @stopTime datetime, @startTime datetime; declare @msg varchar(500), @subject varchar(100), @crlf char(2); declare @recipients varchar(100), @sqlcmd varchar(max); set @recipients = '<your email account>'; -- change to the proper account set @crlf = nchar(0x0d) + nchar(0x0a); select @curr_trc_file = '', @msg=''; -- check downtime window details select @curr_trc_file = [path] from sys.traces where id =1 and status =1; set @subject= 'Server ' + quotename(@@servername, '[]') + ' has restarted'; if @curr_trc_file <> '' begin -- the following is to try to get the number in the current default trace file name, -- default trace file has a naming convention like <path>\log_<num>.trc set @curr_trc_file = REVERSE(@curr_trc_file); set @slash_pos = CHARINDEX('\', @curr_trc_file); set @trc_file_folder = reverse(SUBSTRING(@curr_trc_file, @slash_pos, 256)); set @trc_file_name = reverse(SUBSTRING(@curr_trc_file, 1, @slash_pos-1)); set @trc_num = cast(SUBSTRING(@trc_file_name, 5, len(@trc_file_name)-8) as int) -- 8 = length of "log_" plus ".trc" set @curr_trc_file = REVERSE(@curr_trc_file); set @prev_trc_file = @trc_file_folder + 'log_' + CAST((@trc_num-1) as varchar(12)) + '.trc' select @stopTime=max(starttime) from fn_trace_gettable(@prev_trc_file, 1) -- get the last StartTime of the previous trace select @startTime=min(starttime) from fn_trace_gettable(@curr_trc_file, 1) -- get the first StartTime of the current trace set @msg = 'The downtime window is from ' + CONVERT(varchar(30), @stopTime, 120) + ' to ' + CONVERT(varchar(30), @startTime, 120) + '; ' + @crlf + @crlf + 'The total down time duration is: [' + cast(DATEDIFF(second, @stopTime, @startTime)/60 as varchar(20)) + ' minutes, ' + cast(DATEDIFF(second, @stopTime, @startTime)%60 as varchar(20)) + ' seconds]'; exec msdb.dbo.sp_send_dbmail @recipients = @recipients, @subject = @subject, @body = @msg--, @body_format='HTML' end else -- there is no trace file exec msdb.dbo.sp_send_dbmail @recipients = @recipients, @subject = @subject, @body = 'Please check'; end go if exists (select * from sys.procedures where name='uspRestartAlert') exec sp_procoption @ProcName='dbo.uspRestartAlert', @OptionName='startup', @OptionValue='on'; go
List_1 - t-sql Code
The notification email will be similar to the following:
The down time window is from 2014-06-03 16:50:37 to 2014-06-03 17:09:05;
The total down time duration is: [18 minutes, 28 seconds]
Use PowerShell to Determine SQL Server Downtime
Next is the PowerShell script. In the script there is additional information about who initiated the shutdown or reboot of the computer. This additional information shows the advantage of the PowerShell script over the T-SQL approach because PowerShell can easily access the Windows System Log as well as the SQL Server Error Log, while the T-SQL solution generally can only access the SQL Server Error Log.
add-type -AssemblyName "microsoft.sqlserver.smo, version=10.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91";#version should be changed according to your installed sql version, sql2008=10.0.0.0 $info=get-eventlog -logname System -entrytype information -source User32 -computer . -after (get-date).addhours(-12) -message "The process * has initiated * on behalf of *" | select timegenerated, message -first 1 if ($info -ne $null) { $info.message -match "on behalf of user (.*?) for" [string]$user=$matches[1]; $svr = new-object 'microsoft.sqlserver.management.smo.server' 'SQLInstance_Name' # change <SQLInstance_Name> accordingly $curr=$svr.ReadErrorLog(0) | Where-Object {$_.Text -match 'Recovery is complete'} | select LogDate -last 1; [string]$msg=''; if ($curr -ne $null) { $prev = $svr.ReadErrorLog(1) | select LogDate -last 1; $d=$curr.logdate - $prev.LogDate; $msg += "The downtime window is from $(($prev.LogDate).ToString('yyyy-MM-dd HH:mm:ss')) to $(($curr.Logdate).toString('yyyy-MM-dd HH:mm:ss')) `r`n"; $msg += "Total downtime is $($d.hours) hours, $($d.Minutes) minutes and $($d.Seconds) seconds `r`n`r`n"; } $msg += $info.Message; [string]$qry=@" exec msdb.dbo.sp_send_dbmail @recipients='<your_email_acct>', @subject='Server is rebooted at $($info.TimeGenerated) by $($user)' , @body= '$($msg)'; "@; $svr.databases['msdb'].ExecuteNonQuery($qry); }
List_2 - PowerShell script
The notification email will be similar to the following:
The downtime window is from 2014-06-03 16:50:37 to 2014-06-03 17:09:30 Total downtime is 0 hours, 18 minutes and 53 seconds
The process C:\WINDOWS\system32\winlogon.exe <computer_name> has initiated the power off of computer <computer_name> on behalf of user
<domain_user_name> for the following reason: No title for this reason could be found
Reason Code: 0x500ff
Shutdown Type: power off
Comment:
Next Steps
- For the auto-start procedure implementation, you just run the T-SQL script in List_1 on your target SQL Server to generate the auto startup procedure.
- For the SQL Server Agent Job implementation, you can save List_2 as a PS file, for example c:\temp\ReportReboot.ps1, and
then, create a new job, with one job step as follows (i.e. powershell.exe
-file c:\temp\ReportReboot.ps1), you also need to set up the job schedule type to be "Start automatically when SQL Server Agent Starts" :
You can read a few other tips related to SQL Server startup procedures/jobs here:
- Automatically Running Stored Procedures at SQL Server Startup
- Identifying What Runs at Startup on SQL Server
Note: The scripts were tested for SQL Server 2008 and SQL Server 2012.
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips
Article Last Updated: 2014-06-25