Monitoring Jobs on Multiple Servers by Using Multi Server Query

By:   |   Updated: 2012-07-03   |   Comments (3)   |   Related: > SQL Server Agent


Problem

There are often SQL Agent jobs that need to run on multiple servers in a coordinated effort, such as refreshing Development and Test environments from a Production server.  This requires that many jobs run on different servers and it requires many different steps.  Some of these steps maybe kicking off jobs to delete replication settings, making sure all jobs that need to run are finished, starting jobs that restore databases, etc...  Because there are so many jobs and many of these jobs maybe running on different servers it is hard to get an overall status for the entire process.  Each job might send an email when it is done, but since there are so many jobs involved it may be difficult to get the overall picture from a bunch of emails.  In this tip, I will walk through a process I setup to monitor multiple jobs from multiple servers from one central location.

Solution

There are many different ways to get this type of data, but for this tip I am going to utilize Multi Server Query. For those who are new to Multi Server Query, please see my previous article Execute SQL Server query on multiple servers at the same time for more detail.

Setup a Multi Server Group

Before you do anything, you need to setup the server group for servers that you want to query. For this tip, I am going to just setup Local Server Groups, but I highly recommend using Central Management Servers, so that other team members can also share the same groups.

setup local server groups

How to Execute a Multi Server Query

First, right click on the group and click "New Query" as shown below.

run a multi server query in SQL Server

This should create a connection to your group of servers as shown below. In our example "_MSSQLTIP_SERVER".

validate connection for multi server query

Once you are connected, you can copy and paste the script or open the script file from the connection box that you just created and configure it to query the jobs you want to track. 

Here is the code that is used for this process.

SET NOCOUNT ON
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
DECLARE @JobList TABLE (Servername varchar(128), Jobname varchar(500), Stage TINYINT);
DECLARE @Stage TINYINT

/* == Stage Level == 1 - Part 1 : Put Maintenance Mode Check 2 - Part 2 : Drop Replication Check 3 - Part 3 : Start Restore 4 - Part 4 : Bring Replication Back and full refresh 5 - Part 5 : Bring all online */
SET @Stage = 1
INSERT INTO @JobList SELECT 'MyServer1\SQLA', 'Test_OSQLFailed', 1 UNION ALL SELECT 'MyServer1\SQLA', 'Test job', 1 UNION ALL SELECT 'MyServer2\SQLB', 'Test job', 2 UNION ALL SELECT 'MyServer2\SQLB', 'syspolicy_purge_history', 2 UNION ALL SELECT 'MyServer3\SQLC', 'Expired subscription clean up', 1 UNION ALL SELECT 'MyServer3\SQLC', 'Test job', 1
--Checking for SQL Server version IF CONVERT(tinyint,(SUBSTRING(CONVERT(CHAR(1),SERVERPROPERTY('productversion')),1,1))) <> 8 BEGIN ---This is for SQL 2k5 and SQL2k8 servers SET NOCOUNT ON SELECT j.name AS job_name, CASE j.enabled WHEN 1 THEN 'Enabled' Else 'Disabled' END AS job_status, CASE jh.run_status WHEN 0 THEN 'Error Failed' WHEN 1 THEN 'Succeeded' WHEN 2 THEN 'Retry' WHEN 3 THEN 'Cancelled' WHEN 4 THEN 'In Progress' ELSE 'Status Unknown' END AS 'last_run_status', ja.run_requested_date as last_run_date, CONVERT(VARCHAR(10),CONVERT(DATETIME,RTRIM(19000101)) +(jh.run_duration * 9 + jh.run_duration % 10000 * 6 + jh.run_duration % 100 * 10) / 216e4,108) AS run_duration, ja.next_scheduled_run_date FROM (msdb.dbo.sysjobactivity ja LEFT JOIN msdb.dbo.sysjobhistory jh ON ja.job_history_id = jh.instance_id) join msdb.dbo.sysjobs_view j on ja.job_id = j.job_id join @JobList jl on jl.Servername = @@servername and jl.Jobname = j.name and jl.Stage = @Stage WHERE ja.session_id=(SELECT MAX(session_id) from msdb.dbo.sysjobactivity) ORDER BY job_name,job_status END ELSE BEGIN --This is for SQL2k servers SET NOCOUNT ON DECLARE @SQL VARCHAR(8000)
--Getting information from sp_help_job to a temp table SET @SQL=' SELECT job_id,name AS job_name,CASE enabled WHEN 1 THEN ''Enabled'' ELSE ''Disabled'' END AS job_status, CASE last_run_outcome WHEN 0 THEN ''Error Failed'' WHEN 1 THEN ''Succeeded'' WHEN 2 THEN ''Retry'' WHEN 3 THEN ''Cancelled'' WHEN 4 THEN ''In Progress'' ELSE ''Status Unknown'' END AS last_run_status, CASE RTRIM(last_run_date) WHEN 0 THEN 19000101 ELSE last_run_date END last_run_date, CASE RTRIM(last_run_time) WHEN 0 THEN 235959 ELSE last_run_time END last_run_time, CASE RTRIM(next_run_date) WHEN 0 THEN 19000101 ELSE next_run_date END next_run_date, CASE RTRIM(next_run_time) WHEN 0 THEN 235959 ELSE next_run_time END next_run_time, last_run_date AS lrd, last_run_time AS lrt INTO ##jobdetails FROM OPENROWSET(''sqloledb'', ''server=(local);trusted_connection=yes'', ''set fmtonly off exec msdb.dbo.sp_help_job'')' exec (@SQL)
--Merging run date & time format, adding run duration and adding step description SELECT distinct jd.job_name,jd.job_status,jd.last_run_status, CONVERT(DATETIME,RTRIM(jd.last_run_date)) +(jd.last_run_time * 9 + jd.last_run_time % 10000 * 6 + jd.last_run_time % 100 * 10) / 216e4 AS last_run_date, CONVERT(VARCHAR(10),CONVERT(DATETIME,RTRIM(19000101))+(jh.run_duration * 9 + jh.run_duration % 10000 * 6 + jh.run_duration % 100 * 10) / 216e4,108) AS run_duration, CONVERT(DATETIME,RTRIM(jd.next_run_date)) +(jd.next_run_time * 9 + jd.next_run_time % 10000 * 6 + jd.next_run_time % 100 * 10) / 216e4 AS next_scheduled_run_date FROM (##jobdetails jd LEFT JOIN msdb.dbo.sysjobhistory jh ON jd.job_id=jh.job_id AND jd.lrd=jh.run_date AND jd.lrt=jh.run_time JOIN @JobList jl on jl.Jobname = jd.job_name and jl.Servername = @@servername and jl.Stage = @Stage) WHERE step_id=0 or step_id is null ORDER BY jd.job_name,jd.job_status
--dropping the temp table DROP TABLE ##jobdetails END

Let me explain how this code is used. 

For this sample code, I defined stages 1 through 5.  This is used as a grouping for the various parts of my process.  I have multiple jobs that run under each stage, so this helps me narrow down what is returned in the query.  Other than that, the stage has no real meaning.

The key part of this process is the jobs that I am inserting into my temp table @JobList. 

There are three values that are used:

  1. SQL Server Instance Name - this is the SQL instance that I want to check for the SQL Agent Job status
  2. Job Name - this is the name of the SQL Agent Job that I want to check
  3. Stage - this is an arbitrary grouping of the jobs.  You can see how I classified them in the Stage Level comments below

Another key part of this code is to set the @Stage variable.  This is basically setting a filter which determines what data is returned in my query.  In this case I am setting the @Stage = 1 which will only return the three jobs that have a Stage = 1 as shown below.  This example is pretty simple, but if you have 10 or 20+ jobs that run for one process, you can see how this filtering could be very useful.

walk through of sql server code

So when the code is run, as you can see below, it only returns the three jobs that have a Stage = 1.

sample output for multi server query

Notes About the SQL Code

I recommend downloading the script here just in case you have issues with copy the code.

  • This code has been tested on SQL 2000 to SQL 2008 R2.
  • Some of the code I've had for a long time and for SQL 2000 I did not write this originally. This is a very smart way to pull information for SQL 2000, so that it works on other versions.
  • The script has limitations that if a job has only one step and it is still running, the last_run_status will return as "Status Unknown", which you can see in my above example.

The code uses a combination of OPENROWSET and "SET FMTONLY OFF" to make sure this works for both SQL Server 2000 and SQL 2005 / SQL 2008. For more details, I highly recommend reading Using SQL Servers OPENROWSET to break the rules. The rest of the code is just formatting the output to be more readable.

There you have it. Obviously, there are so many other ways to use this script and like someone that I know always says, "The sky's the limit" when it comes to using Multi Server Query.

Next Steps
  • Download the script
  • Enhance the query to pull job status information more accurately by using xp_sqlagent_enum_jobs
  • Enhance the query to include the steps if there are multiple steps that run
  • Enhance the query to include the job log information


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Kun Lee Kun Lee is a database administrator and his areas of interest are database administration, architecture, data modeling and development.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips


Article Last Updated: 2012-07-03

Comments For This Article




Tuesday, October 25, 2016 - 2:31:25 PM - Kareermatrix Back To Top (43635)

Can we replicate same in case of a particular job is running on each server like to clean out the things on each server after 30 minutes. Actuall we have to send email wit attachment and then we have to remove that file once the email is sent. So we have to have one process running on each server which will delete that file after every 30 minutes. But if create a job on load balancer or apache server managining multiple tomcat instances then how we can do so?

 

-- CTO Office Kareermatrix


Tuesday, December 31, 2013 - 12:44:56 PM - Pam Ozer Back To Top (27922)

This is a great help!! Thanks for posting this.  I have 50 instances I am trying to get data from and this works perfectly


Tuesday, March 19, 2013 - 12:23:48 PM - Pavel Back To Top (22875)

usefull

 















get free sql tips
agree to terms