By: Jeffrey Yao | Updated: 2022-01-21 | Comments | Related: > SQL Server Agent
Problem
I inherited a Microsoft SQL Server environment where all SQL Agent jobs are configured with the native job failure notification as seen in SQL Server Management Studio (SSMS).
This configuration is not always sufficient for our sophisticated business requirements, for example, if a job step fails, we want to send a different email to a different team depending on the failed job step, or when a job fails, we want to log some customized message in a centralized database, or do some automatic post-failure actions, all these can be done if we have a last job step dedicated for job failure processing instead of only using the native job failure notification mechanism.
Is there a way as a Database Administrator (DBA) that I can easily add a last job step in all SQL Agent jobs (about 230+ jobs across 50+ SQL instances)?
Solution
Adding a job step at the bottom of a SQL Agent job is quick, but the challenge is to modify all previous steps.
We want to make sure the following rules are observed for all jobs:
If a job step configures "On failure action" to "Quit the job reporting failure", we need to change the configuration to "go to the [last] step"
If a job step configures "On failure action" to "Go to the next step" or "Go to the step [N]…", we keep the configuration unchanged.
If a job step configures "On failure action" to "Quit the job reporting success", we keep the configuration unchanged for simplicity. (But we can configure it to go to the newly added last step and make further post-failure process if justified)
Tool and Environment
There may be different ways to solve this issue, but I find using the dbatools PowerShell module is probably the most convenient way.
First let’s set up a test environment, I will first create one test job on my local SQL Server 2016 instance.
USE [msdb] -- SQL Server Database GO if not exists (select * from dbo.sysoperators where name = N'DBATeam') EXEC msdb.dbo.sp_add_operator @name=N'DBATeam', @enabled=1, @weekday_pager_start_time=90000, @weekday_pager_end_time=180000, @saturday_pager_start_time=90000, @saturday_pager_end_time=180000, @sunday_pager_start_time=90000, @sunday_pager_end_time=180000, @pager_days=0, @email_address=N'[email protected]', @category_name=N'[Uncategorized]' GO /****** Object: Job [Test_Job] Script Date: 12/5/2021 10:42:12 PM ******/ BEGIN TRANSACTION DECLARE @ReturnCode INT SELECT @ReturnCode = 0 /****** Object: JobCategory [[Uncategorized (Local)]] Script Date: 12/5/2021 10:42:12 PM ******/ IF NOT EXISTS (SELECT name FROM msdb.dbo.syscategories WHERE name=N'[Uncategorized (Local)]' AND category_class=1) BEGIN EXEC @ReturnCode = msdb.dbo.sp_add_category @class=N'JOB', @type=N'LOCAL', @name=N'[Uncategorized (Local)]' IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback END DECLARE @jobId BINARY(16) EXEC @ReturnCode = msdb.dbo.sp_add_job @job_name=N'Test_Job', @enabled=1, @notify_level_eventlog=0, @notify_level_email=2, @notify_level_netsend=0, @notify_level_page=0, @delete_level=0, @description=N'No description available.', @category_name=N'[Uncategorized (Local)]', @owner_login_name=N'sa', @notify_email_operator_name=N'DBATeam', @job_id = @jobId OUTPUT IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback /****** Object: Step [S1] Script Date: 12/5/2021 10:42:12 PM ******/ EXEC @ReturnCode = msdb.dbo.sp_add_jobstep @job_id=@jobId, @step_name=N'S1', @step_id=1, @cmdexec_success_code=0, @on_success_action=3, @on_success_step_id=0, @on_fail_action=4, @on_fail_step_id=4, @retry_attempts=0, @retry_interval=0, @os_run_priority=0, @subsystem=N'TSQL', @command=N'print ''hello world''; ', @database_name=N'MSSQLTips', @flags=0 IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback /****** Object: Step [S2] Script Date: 12/5/2021 10:42:12 PM ******/ EXEC @ReturnCode = msdb.dbo.sp_add_jobstep @job_id=@jobId, @step_name=N'S2', @step_id=2, @cmdexec_success_code=0, @on_success_action=3, @on_success_step_id=0, @on_fail_action=2, @on_fail_step_id=0, @retry_attempts=0, @retry_interval=0, @os_run_priority=0, @subsystem=N'TSQL', @command=N'print ''hello world'' raiserror (''this is a test error'', 16, 1);', @database_name=N'master', @flags=0 IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback /****** Object: Step [S3] Script Date: 12/5/2021 10:42:12 PM ******/ EXEC @ReturnCode = msdb.dbo.sp_add_jobstep @job_id=@jobId, @step_name=N'S3', @step_id=3, @cmdexec_success_code=0, @on_success_action=1, @on_success_step_id=0, @on_fail_action=3, @on_fail_step_id=0, @retry_attempts=0, @retry_interval=0, @os_run_priority=0, @subsystem=N'TSQL', @command=N'print ''hello''', @database_name=N'master', @flags=0 IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback /****** Object: Step [S4] Script Date: 12/5/2021 10:42:12 PM ******/ EXEC @ReturnCode = msdb.dbo.sp_add_jobstep @job_id=@jobId, @step_name=N'S4', @step_id=4, @cmdexec_success_code=0, @on_success_action=3, @on_success_step_id=0, @on_fail_action=2, @on_fail_step_id=0, @retry_attempts=0, @retry_interval=0, @os_run_priority=0, @subsystem=N'TSQL', @command=N'print ''hello world''', @database_name=N'master', @flags=0 IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback /****** Object: Step [S5] Script Date: 12/5/2021 10:42:12 PM ******/ EXEC @ReturnCode = msdb.dbo.sp_add_jobstep @job_id=@jobId, @step_name=N'S5', @step_id=5, @cmdexec_success_code=0, @on_success_action=1, @on_success_step_id=0, @on_fail_action=2, @on_fail_step_id=0, @retry_attempts=0, @retry_interval=0, @os_run_priority=0, @subsystem=N'TSQL', @command=N'print ''hello world''', @database_name=N'master', @flags=0 IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback EXEC @ReturnCode = msdb.dbo.sp_update_job @job_id = @jobId, @start_step_id = 1 IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback EXEC @ReturnCode = msdb.dbo.sp_add_jobserver @job_id = @jobId, @server_name = N'(local)' IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback COMMIT TRANSACTION GOTO EndSave QuitWithRollback: IF (@@TRANCOUNT > 0) ROLLBACK TRANSACTION EndSave: GO
The final job step flow is like the following from SSMS:
As we can see, in this job:
- step 1 (i.e. S1) will go to step 4 (S4) if S1 fails
- step 2, 4 and 5 (S2, S4, S5) will quit the job reporting failure if each fails
- step 3 will go to next step if it fails
What I want is to add a last step so if S2, S4, S5 ever fail, the "On Failure" action will jump to this newly added step.
In this last step, we can include any workflow we want, for example based on different job steps, sending a notification to different teams, this is especially true in a ETL job where we need to retrieve data from different sources, and each source has its own business owner.
But in my test job, I just send out an email reporting the failed job step and the failure time. Database mail with the mail profile is already setup.
Solution Algorithm
To better understand the script, here is quick description of the solution algorithm.
- Add a new step with name [Failure Notification] at the end of the job, and mark down the job step number [N], set both "On success action" and "On failure action" to "Quit the job reporting failure", as this step is used for reporting job failure, so we need the job history to show failure if this step is ever executed.
- Scan through all job steps from the first step (i.e. step 0) to step [N-2], and if a job step "On failure" action is "Quit the job reporting failure", change the action to "Go to step [N-1]" (the last step is N-1 because first step is 0), i.e. to our newly added last step. Otherwise, no change.
Here is the PowerShell script:
# function: update existing jobs by adding a final step for failure notification # after the final step is added, we need to scan previous job steps # if that step's onfailure is "exit reporting error", we change it to "on failure, go to last step" import-module dbatools; [string[]] $ServerList = 'localhost\sql2016'; # you can put a list of servers #loop through each server foreach ($svr in $ServerList) { $jobs = get-dbaagentjob -SqlInstance $svr | where name -like "test*"; # replace "test*" with your desired table name patterns # loop through each job foreach ($j in $jobs) { [string]$qry = @' declare @recipients varchar(255) ='[email protected]' -- change it to your own email , @subject varchar(255)= '[Failed] Job [$(ESCAPE_NONE(JOBNAME))] failed on [$(ESCAPE_NONE(SRVR))]' , @body varchar(max), @crlf char(2)= char(0x0d) + char(0x0a); select top 1 @body = 'The failed step is step_id=' + cast(h.step_id as varchar(5)) + '; step_name=' + h.step_name + @crlf + @crlf + 'Message:' + @crlf + h.message from dbo.sysjobhistory h inner join dbo.sysjobs j on h.job_id = j.job_id and j.name = '$(ESCAPE_NONE(JOBNAME))' where step_id <> 0 and run_status=0 order by instance_id desc print @body; exec sp_send_dbmail @recipients = @recipients , @subject = @subject , @body = @body; '@; $final_step = @{ sqlinstance = $svr; # job = "$($j.Name)"; stepName = "Failure Notification"; command = $qry; Database = "msdb" onFailAction = "QuitWithFailure"; OnSuccessAction = "QuitWithFailure" }; $newstep = New-DbaAgentJobStep @final_step; #adding the final job #scan all previous steps and update any job step with onfailure action as "QuitWithFailure" to "GoToStep" -> the last step. $j.JobSteps.Refresh(); $cnt = $j.JobSteps.count; #$cnt already includes the newly added final step for ($i=0; $i -lt $cnt-1 ; $i++) { $js = $j.JobSteps[$i]; if ($js.OnFailAction -eq 'QuitWithFailure') { $js.OnFailAction = "GoToStep"; $js.OnFailStep = $cnt; $js.alter(); } #if } #foreach $i $j.refresh(); } #foreach $j } #foreach $svr
After I run the script, I can see the job steps changed to the following
What we can see is that Step [S1] when on failure still goes to step [S4], i.e. no change, while all others (S2, S4 and S5) will go to the new final step [Failure Notification] on failure. That’s exactly what is required.
If I run the sample job, it will fail at step 2.
I will get an expected email notification from the final step.
Summary
This tip shows how to add a final step in a job for error notification purposes, and the tricky part is to scan through all previous job steps and pick up the steps whose "on failure" actions are "quit the job reporting failure", and update them to "on failure, go to the last job step".
The script can run against multiple servers and update multiple jobs at one time if the final step is the same.
Next Steps
There may be some other potential common scenarios, for example, adding a first job step to do some business work, such as checking some conditions before starting the job, for this requirement, it is relatively easier as all we need to do is to add a new step and update it to be the starting job step.
There are many SQL Agent job related commands in the dbatools module and they are really handy tools (such as copying a job from one instance to another) and you can find out all of them using the following command:
Get-command -module dbatools -name *agent*
Currently there are 47 cmdlets covering various aspects of SQL Server job management: such as job category, job schedule, proxy, agent log, etc.
Please take a look at these other useful articles:
- Failed SQL Server Agent Jobs
- Use Tokens in Job Steps
- SQL Server Job Change Auditing
- SQL Server Agent Tips
- SQL Server MSDB Database Overview
About the author
This author pledges the content of this article is based on professional experience and not AI generated.
View all my tips
Article Last Updated: 2022-01-21