By: Atul Gaikwad | Updated: 2016-07-13 | Comments (2) | Related: > SQL Server Agent
Problem
We have a few critical applications in our environment where we have monthly SQL Server Agent Jobs which are supposed to be run at a specified time and complete within a specified time frame on a monthly basis. We are responsible to monitor these Jobs and send a status update every hour to the Subject Matter Expert (SME) with the current job status i.e. completed, failed, running at what step, etc. It is a manual process and we have to send an email every hour by checking the status manually with SQL Server Management Studio (SSMS). Are there any other options?
Solution
We are monitoring these SQL Server Agent Jobs, but there are a few occurrences where we missed sending notification to the SME with current status and we didn't realized the job was running longer than expected.
Let's see how we can automate the job monitoring and send a timely notification automatically, so different stakeholders will be aware of the current status including the DBA team.
Step 1: Create a SQL Server Linked Server for the Production server.
1: Create a Linked Server on your monitoring server pointing to your production server.
2: If you are planning to monitor jobs from multiple servers then create the respective Linked Servers.
3: Once your Linked Server is created, ensure it is working properly and tested successfully.
Step 2: Create Reporting tables on your monitoring server to store Job status information
1: There will be three reporting tables. The first table is "Job_Details" which will store the job names and their respective job_id's which you want to monitor. The second table is "Job_Report_Table" which will store the consolidated status for all the jobs for reporting purposes. The third table is "JobRunning" which will store the details of the currently running jobs.
Use the below scripts to create these tables:
CREATE TABLE [dbo].[Job_Details]( [Job_Id] [varchar](max) NULL, [Job_Name] [varchar](50) NULL ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY] GO
CREATE TABLE [dbo].[job_report_table]( [job_name] [varchar](50) NOT NULL, [Status] [varchar](50) NOT NULL, [Duration] [varchar](50) NULL, [Step_name] [varchar](50) NULL ) ON [PRIMARY] GO
CREATE TABLE [dbo].[JobRunning]( [JobID] [uniqueidentifier] NULL, [LastRunDate] [int] NULL, [LastRunTime] [int] NULL, [NextRunDate] [int] NULL, [NextRunTime] [int] NULL, [NextRunScheduleID] [int] NULL, [RequestedToRun] [int] NULL, [RequestSource] [int] NULL, [RequestSourceID] [sysname] NULL, [Running] [int] NULL, [CurrentStep] [int] NULL, [CurrentRetryAttempt] [int] NULL, [State] [int] NULL ) ON [PRIMARY] GO
2: Based on how many jobs you want to monitor, you will have to create multiple job tables. Suppose you have to monitor 3 jobs then you will have 6 tables; 3 base tables and 3 JobDataPull tables i.e. one for each job.
-- One Table for each job. if you have multiple jobs then create table like JobDataPull1, JobDataPull2,JobDataPull3.....So on.
CREATE TABLE [dbo].[JobDataPull1]( [CheckID] [int] NULL, [CheckTime] [varchar](255) NULL, [ServerName] [varchar](255) NULL, [JobName] [varchar](255) NOT NULL, [JobStatus] [varchar](50) NULL, [LastRunDateTime] [datetime] NULL, [JobDuration] [varchar](50) NULL, [CompletionDateTime] [datetime] NULL, [NotifiedBy] [varchar](50) NULL, [Comments] [varchar](255) NULL, [UpdatedOn] [datetime] NULL ) ON [PRIMARY]
GO
Step 3: Create a Monitoring job on your monitoring server to monitor the status of your mainstream jobs.
1: Once you create the respective tables from the above steps, the next step is to create a monitoring job to pull the needed information from the job execution. This Job will have 5 steps.
2: Go to SQL Server Agent Jobs. Right click and select New job. In the new job window provide detail as below.
3: Go to the Steps tab on the left and click New Step. The first step is to truncate old reporting data from the previous execution. Use the below script in step 1.
delete from dbo.job_report_table delete from dbo.JobRunning delete from dbo.JobDataPull1 delete from dbo.JobDataPull2 delete from dbo.JobDataPull3 GO
4: Add a second step to pull information for the first job. If needed add the third and fourth steps for the second and third jobs that are monitored.
5: Use the below script for job step 2. The same script would be used in steps 3 and 4 after the necessary changes are made. Change the following parameters in the script: Linked Server Name, Job Name and Server Name for getting the job running information. All the fields which might need to be changed are in bold.
SET NOCOUNT ON USE Monitoring GO
DECLARE @IsRunning varchar(10) DECLARE @job_name varchar(50) DECLARE @running_status varchar(20) DECLARE @job_duration varchar(20) DECLARE @ThisJobName varchar(255) DECLARE @job_time varchar(20) DECLARE @jobid varchar(255) DECLARE @jobname varchar (255) DECLARE @LinkedServer varchar(50) DECLARE @FirstServerQuery varchar(max) DECLARE @ServerName varchar(100) DECLARE @LinkedServerQuery varchar(max)
----- Email Parameters---- DECLARE @EmailProfile_Name varchar(50) DECLARE @EmailRecipients varchar(max) DECLARE @EmailSubject varchar(max) DECLARE @EmailBody varchar(max) DECLARE @EmailQuery varchar (max) DECLARE @step_name varchar(50)
SET @LinkedServer = 'LinkedServer-A'
--set @job_time=Convert(varchar(19),getdate())
SET @jobid = (select job_id from Monitoring.dbo.Job_Details
where job_name = 'Critical Job 1') -- Fetch the job_id of job to monitor and assign it to @jobid variable -- which will be used through out the script
SET @jobname = '''Critical Job 1''' --Critical job to Monitor Name
SET @ThisJobName = 'Month End Monitor Job' --Your Monitoring job Name
SET @ServerName = 'LinkedServer-A'
SET @EmailProfile_Name = 'ProfileName'
SET @EmailRecipients = 'EmailAddress'
SET @LinkedServerQuery = 'SELECT * FROM OPENQUERY(' + @LinkedServer + ', ''SET FMTONLY OFF set nocount on declare @jobid UNIQUEIDENTIFIER set @jobid = convert(UNIQUEIDENTIFIER,(select job_id from msdb.dbo.sysjobs where name = ''' + @jobName + ''')) exec master.dbo.xp_sqlagent_enum_jobs 1, ''''sa'''', @jobid'')'
-------------------------------------- code for current job step----------------
create table #step_info (job_id varchar(max),job_name varchar(max), start_execution_date varchar(20), current_executed_step_id varchar(30), step_name varchar(30))
insert into #step_info
SELECT ja.job_id,j.name AS job_name,ja.start_execution_date, ISNULL(last_executed_step_id,0)+1 AS current_executed_step_id, Js.step_name FROM [LinkedServer-A].msdb.dbo.sysjobactivity ja LEFT JOIN [LinkedServer-A].msdb.dbo.sysjobhistory jh ON ja.job_history_id = jh.instance_id JOIN [LinkedServer-A].msdb.dbo.sysjobs j ON ja.job_id = j.job_id JOIN [LinkedServer-A].msdb.dbo.sysjobsteps js ON ja.job_id = js.job_id AND ISNULL(ja.last_executed_step_id,0)+1 = js.step_id WHERE ja.session_id = (SELECT TOP 1 session_id FROM [LinkedServer-A].msdb.dbo.syssessions ORDER BY agent_start_date DESC) AND start_execution_date is not null AND stop_execution_date is null and j.job_id= @jobid
set @step_name = (select step_name from #step_info)
drop table #step_info ----------------------------------------------------------------------------
IF EXISTS ( SELECT * FROM Monitoring.dbo.JobRunning WHERE jobid = @jobid ) DELETE FROM Monitoring.dbo.JobRunning WHERE jobid = @jobid
INSERT INTO Monitoring.dbo.JobRunning EXEC(@LinkedServerQuery) /* The script runs exec master.dbo.xp_sqlagent_enum_jobs 1, '0F6322B3-0C9C-4D65-B388-99C3A8239C75' extended SP to get the information of whether the job is currently running or not and insert the state in [JobRunning] table. */
/* The below script Assign the state valur to variable @IsRunning which is 1 if the job is currently running or 4 */
SET @IsRunning = (SELECT [state] FROM Monitoring.dbo.JobRunning WHERE jobid = @jobid)
/* If the status value is other than 1 or 4, the job needs to be checked. */
IF @IsRunning <> 1 AND @IsRunning <> 4 BEGIN SELECT 'IsRunning is in unknown state - Email Check for Problems.' SET @EmailSubject = 'The job status on Server ' + @ServerName + ' at '+ @job_time SET @EmailBody = 'Unknown status for Job name: ' + @jobname + 'on ' + @ServerName + ' server.' + CHAR(13) + CHAR(10) + 'Please check immediately and report status to proper teams for any issues.' EXEC msdb.dbo.sp_send_dbmail @profile_name = @EmailProfile_Name, @recipients = @EmailRecipients, @subject = @EmailSubject, @body = @EmailBody, @execute_query_database='Monitoring'
GOTO ENDQUERY END
/* The below query insert the record in report table when the state is '1' i.e. currently running */
IF @IsRunning = 1 BEGIN
set @running_status='Running' set @job_name='Critical Job 1' set @job_duration= NULL
insert into dbo.job_report_table values (@job_name,@running_status,@job_duration,@step_name)
GOTO ENDQUERY END
/* The below query insert the record in [JobDataPull] table when the state is '4'. The query is the join between the sysjobs and sysjobhistory system tables to checks various values from the column 'run_status' and then insert the status of the job into the report table accordingly */
IF @IsRunning = 4 BEGIN SELECT 'IsRunning is in idle state. Insert SysJobs and SysJobHistory data into JobDataPull'
IF(1=1)
BEGIN
INSERT Monitoring.dbo.JobDataPull1 ( ServerName, JobName, JobStatus, LastRunDateTime, JobDuration, CompletionDateTime ) SELECT @ServerName AS ServerName, sj.name AS JobName, CASE cast(sjh.run_status AS VARCHAR(10)) WHEN '0' THEN 'Failed' WHEN '1' THEN 'Succeeded' WHEN '2' THEN 'Retry' WHEN '3' THEN 'Canceled' WHEN '4' THEN 'In Progress' ELSE cast(sjh.run_status AS VARCHAR(10)) END AS JobStatus, CASE sjh.run_date WHEN '' THEN 'Never' ELSE CONVERT(VARCHAR(10), CONVERT(smalldatetime, CAST(sjh.run_date AS VARCHAR), 120), 120)+' '+ RIGHT('0'+CAST((sjh.run_time/10000) AS VARCHAR), 2)+':'+ RIGHT('0'+CAST((sjh.run_time-((sjh.run_time/10000)*10000))/100 AS VARCHAR), 2)+':'+ RIGHT('0'+CAST((sjh.run_time-((sjh.run_time/10000)*10000) -((sjh.run_time-((sjh.run_time/10000)*10000))/100*100)) AS VARCHAR), 2) END AS LastRunDateTime, CAST(sjh.run_duration / 10000 AS VARCHAR(2)) + ' hr ' + CAST((sjh.run_duration % 10000) / 100 AS VARCHAR(2)) + ' mn ' + CAST(sjh.run_duration % 100 AS VARCHAR(2)) + ' sec' AS JobDuration, CONVERT(DATETIME, RTRIM(sjh.run_date)) + ((sjh.run_time/10000 * 3600) + ((sjh.run_time%10000)/100*60) + (sjh.run_time%10000)%100) / (86399.9964) + ((sjh.run_duration/10000 * 3600) + ((sjh.run_duration%10000)/100*60) + (sjh.run_duration%10000)%100) / (86399.9964) AS CompletionDateTime
FROM [LinkedServer-A].msdb.dbo.sysjobs sj JOIN [LinkedServer-A].msdb.dbo.sysjobhistory sjh ON sj.job_id = sjh.job_id WHERE sj.name IN ( 'Critical Job 1' ) AND CONVERT(VARCHAR(10), CONVERT(smalldatetime, CAST(sjh.run_date AS VARCHAR), 120), 120)+' '+ RIGHT('0'+CAST((sjh.run_time/10000) AS VARCHAR), 2)+':'+ RIGHT('0'+CAST((sjh.run_time-((sjh.run_time/10000)*10000))/100 AS VARCHAR), 2)+':'+ RIGHT('0'+CAST((sjh.run_time-((sjh.run_time/10000)*10000) -((sjh.run_time-((sjh.run_time/10000)*10000))/100*100)) AS VARCHAR), 2) > getdate()-25 AND sjh.step_id = 0 ORDER BY CONVERT(VARCHAR(10), CONVERT(smalldatetime, CAST(sjh.run_date AS VARCHAR), 120), 120)+' '+ RIGHT('0'+CAST((sjh.run_time/10000) AS VARCHAR), 2)+':'+ RIGHT('0'+CAST((sjh.run_time-((sjh.run_time/10000)*10000))/100 AS VARCHAR), 2)+':'+ RIGHT('0'+CAST((sjh.run_time-((sjh.run_time/10000)*10000) -((sjh.run_time-((sjh.run_time/10000)*10000))/100*100)) AS VARCHAR), 2) DESC
IF (SELECT TOP 1 JobStatus FROM Monitoring.dbo.JobDataPull1 WHERE JobName = 'Critical Job 1' AND ServerName = '[LinkedServer-A]' ORDER BY LastRunDateTime DESC) = 'Canceled' BEGIN
set @running_status='Job Canceled' set @job_name='Critical Job 1' set @job_duration= NULL
insert into dbo.job_report_table values (@job_name,@running_status, @job_duration,@step_name)
GOTO ENDQUERY END
IF (SELECT TOP 1 JobStatus FROM Monitoring.dbo.JobDataPull1 WHERE JobName = 'Critical Job 1' AND ServerName = '[LinkedServer-A]' ORDER BY LastRunDateTime DESC) = 'Succeeded' BEGIN
set @running_status='Completed' set @job_name='Critical Job 1' set @job_duration= (select top 1 jobduration from dbo.JobDataPull1)
insert into dbo.job_report_table values (@job_name,@running_status, @job_duration,@step_name)
GOTO ENDQUERY END
IF (SELECT top 1 JobStatus FROM Monitoring.dbo.JobDataPull1 WHERE JobName = 'Critical Job 1' AND ServerName = '[LinkedServer-A]' ORDER BY LastRunDateTime DESC) = 'Failed' BEGIN
set @running_status='Failed' set @job_name='Critical Job 1' set @job_duration= (select top 1 jobduration from dbo.JobDataPull1)
insert into dbo.job_report_table values (@job_name,@running_status, @job_duration,@step_name)
--GOTO ENDQUERY END
GOTO ENDQUERY END
ENDQUERY: PRINT 'QUERY ENDED'
END GO
6: The fifth step is to consolidate all the jobs status (i.e. 3 jobs in my case) and send email notification to the different stakeholders. Use the below scripts for step 5.
declare @rows int declare @sub varchar (70) declare @time varchar(20)
set @time =Convert(varchar(19),getdate()) set @sub = ' Critical Job report at '+ @time
declare @tableHTML NVARCHAR(MAX) ; SET @tableHTML = N'<H4>Critical Job Status at : '+ @time+'</H4>'+ N'<style type="text/css">'+ N'.h1 {font-family: Arial, verdana;font-size:16px;border:0px;background-color:white;} '+ N'.h2 {font-family: Arial, verdana;font-size:12px;border:0px;background-color:white;} '+ N'body {font-family: Arial, verdana;font-size:8px} '+ N'table{font-size:14px; border-collapse:collapse;} '+ N'td{background-color:#F1F1F1; border:1px solid black; padding:3px;} '+ N'th{background-color:#99CFFF;}'+ N'</style>'+ N'<table border="0">'+ N'</table>'+ N'<table border="1">' + N'<tr> <th><b>Job Name</b></th>'+ N'<th><b>Job Status</b></th>' + N'<th><b>Step Name</b></th>' + N'<th><b>Job Duration</b></th>' + N'</tr>' + CAST ( ( SELECT "td/@align"='left' ,td = [job_name], '', "td/@align"='left' ,td = [Status], '', "td/@align"='left' ,td = [Step_name], '', "td/@align"='left' ,td = [Duration], '' FROM dbo.job_report_table FOR XML PATH('tr'), TYPE ) AS NVARCHAR(MAX) ) + N'</table>' ;
EXEC msdb.dbo.sp_send_dbmail @recipients='EmailAddress', @blind_copy_recipients='EmailAddress', @profile_name='ProfileName', @subject = @sub, @body = @tableHTML, @body_format = 'HTML' ; GO
7: Now your job should look similar to the screen shot below:
8: Once all the steps are created. Go to Schedules tab from the left menu. Schedule the job to run between your first job start time and your last job end time. In my case, I have schedule to job run every first day of every month, every 1 hour between 12:00 am to 6:00 am. Because, I have jobs running only on first of every month, if you have different situation like your jobs running daily weekly you can change your schedule accordingly.
Step 4: Monitoring jobs notification.
If you complete Step 3 without any issues then you are all set for automatic monitoring and status communication of your critical jobs. On the day, you will start receiving emails every hour as shown below:
Status Email 1 at 12:05 AM:
Now, you have seen how we can setup automatic monitoring of your critical SQL Server Agent Jobs in simple way and send timely status notification to different stakeholders. Please try this and let me know if you have queries or face any issues.
Next Steps
- Check out these additional resources:
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips
Article Last Updated: 2016-07-13