SQL Server Agent – Query To Find Long Running Jobs

I use SQL Agent a ton. It is like my go to place to schedule jobs and tasks. Problem is, there are pieces “missing”. Like if a job hangs, you can have it auto stop after a given amount of time. It doesn’t alert on changes to a job, etc, etc.

I asked on twitter what the DBA’s think, @SQLRockstar thinks using OpsMgr is the way, which probably would be great, if I had it. I need a quick and dirty solution.

What I did was query the sysjobs, sysjobactivity, and sysjobhistory tables to get some kind of “look” into the running jobs. That way, if a job that is normally 45 minutes has been running for 5 hours, I should be able to catch it early on, just trying to be more proactive instead of reactive.

SELECT 
	j.job_id AS 'JobId',
	name AS 'JobName',
	start_execution_date AS 'StartTime',
	stop_execution_date AS 'StopTime',
	avgruntimeonsucceed,
	DATEDIFF(s,start_execution_date,GETDATE()) AS 'CurrentRunTime',
	CASE WHEN stop_execution_date IS NULL THEN 
			DATEDIFF(ss,start_execution_date,stop_execution_date) ELSE 0 END 'ActualRunTime',
	CASE 
		WHEN stop_execution_date IS NULL THEN 'JobRunning'
		WHEN DATEDIFF(ss,start_execution_date,stop_execution_date) 
			> (AvgRunTimeOnSucceed + AvgRunTimeOnSucceed * .05) THEN 'LongRunning-History'
		ELSE 'NormalRunning-History'
	END 'JobRun',
	CASE 
		WHEN stop_execution_date IS NULL THEN
			CASE WHEN DATEDIFF(ss,start_execution_date,GETDATE())
						> (AvgRunTimeOnSucceed + AvgRunTimeOnSucceed * .05) THEN 'LongRunning-NOW'
			ELSE 'NormalRunning-NOW'
		END
		ELSE 'JobAlreadyDone'
	END AS 'JobRunning'
 FROM msdb.dbo.sysjobactivity ja
	INNER JOIN msdb.dbo.sysjobs j ON ja.job_id = j.job_id
INNER JOIN (
	SELECT job_id,
	AVG
	((run_duration/10000 * 3600) + ((run_duration%10000)/100*60) + (run_duration%10000)%100)
	+
	STDEV
	((run_duration/10000 * 3600) + ((run_duration%10000)/100*60) + (run_duration%10000)%100) AS 'AvgRuntimeOnSucceed'
	 FROM msdb.dbo.sysjobhistory
	WHERE step_id = 0 AND run_status = 1
	GROUP BY job_id) art 
	ON j.job_id = art.job_id
WHERE 
(stop_execution_date IS NULL) OR 
	(DATEDIFF(ss,start_execution_date,stop_execution_date) > 60
	AND 
	CAST(LEFT(start_execution_date,11) AS DATETIME) = CAST(LEFT(GETDATE(),11) AS DATETIME))
ORDER BY start_execution_date DESC

The inner query looks at the history to get the average runtime, and I added the standard deviation to that to make it more realistic, in case you have a one off that throws the average off. Of course this all depends on you having more than 3 entries in your job history as well.

I then just compare the average to the current runtime (if the job is executing) and to the actual (if it is complete).

You could use this query in a report, to view throughout the day to catch long running jobs, you could also use it to just give you the one that are currently running and over the time threshold, and alert you, so you can catch them right away.

I tested this on a few servers running SQL 2005, but I am guessing it will work on SQL 2008 as well. Find and kill those long running jobs!!

Advertisements

13 thoughts on “SQL Server Agent – Query To Find Long Running Jobs”

  1. OK, found out that OpsMgr uses sp_help_jobactivity and does a datediff between the start execution time and the current time on the server. also, by default the monitor will flag jobs that have run for longer than 1 and 2 minutes, which is less than helpful, really.i love your idea of checking on the average run time. i could put something into PBM but the return results are limited there. so now i am thinking of some powershell, but i don't think that will do what we want here: you want to be notified as soon as a job has exceeded the average by 10%, correct?

    Like

    1. you are right, the query I put on the blog just look if it is over 5% at this point. Takes AVG + STD of historical run times, * 5%, you might need to tweak the pct's accordingly. this query would have to run every X min/seconds and alert

      Like

      1. This is exactly what I have been looking for….  It doesn’t give accurate results in 2008.  Do you have an update to this by any chance?

        Like

  2. OK, found out that OpsMgr uses sp_help_jobactivity and does a datediff between the start execution time and the current time on the server. also, by default the monitor will flag jobs that have run for longer than 1 and 2 minutes, which is less than helpful, really.i love your idea of checking on the average run time. i could put something into PBM but the return results are limited there. so now i am thinking of some powershell, but i don't think that will do what we want here: you want to be notified as soon as a job has exceeded the average by 10%, correct?

    Like

  3. you are right, the query I put on the blog just look if it is over 5% at this point. Takes AVG + STD of historical run times, * 5%, you might need to tweak the pct's accordingly. this query would have to run every X min/seconds and alert

    Like

  4. Couldn't we use a similar script within Ops Manager to only trigger the monitor when the job is x% past avg runtime or some other criteria to make this Ops Manager monitor a bit more useful?

    Like

  5. Hi, I have few jobs that hangs and I need to put it in the SQL 2005 Alert so it pages me and then I can take a look to find out if the jobs hangs and why. How do I use your query to put it in the alert so it pages me when it hangs more than 3 or 4 hours. Thanks for your help. Ali

    Like

    1. I created a proc that I run from a job every 15min. I added this at the end of the script to email me when a job is running long.–********************Long Running NOW**************************************IF EXISTS(SELECT * FROM #tempjobs where jobrunning= 'LongRunning-NOW') BEGIN SELECT * INTO #tempjobs2 FROM #tempjobs where jobrunning= 'LongRunning-NOW' DECLARE @server varchar(max) set @server = '' –add servername DECLARE @subject varchar(max) set @subject = '' DECLARE @body varchar(max) set @body = '' DECLARE @tmpbody2 varchar(max) set @tmpbody2 = null DECLARE @tmpbody3 varchar(max) set @tmpbody3 = null DECLARE @tmpbody4 varchar(max) set @tmpbody4 = nullselect @tmpbody2 = JobName from #tempjobs2select @tmpbody3 = CurrentRuntime from #tempjobs2select @tmpbody4 = avgruntimeonsucceed from #tempjobs2–if @@rowcount > 0set @body = @body + 'Server:' + @server + ' – Job ''' + @tmpbody2 + ''' is RUNNING LONG!' + '.'set @body = @body + ' ' + @tmpbody3 + ' – CurrentRuntime' + ''set @body = @body + @tmpbody4 + ' – AverageRuntime' + ''set @body = @body + ''SET @body = @bodySET @subject = @server + ' – Long Running Jobs' EXEC msdb.dbo.sp_send_dbmail @recipients='', –add email address @subject = @subject, @body = @body, @body_format = 'HTML'

      Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s