CPU Pressure forms vital part in performance monitoring and sys.dm_os_wait_stats is a good place to check. Some background info before we look at the DMV.
Waiting time for a SQL Process is divided into two. They are Resource waits and signal waits( CPU waits ). When a process/query is submitted to SQL Server, SQL Server checks whether all resources required for the process are available.Resources would include things like IO, Network, Memory, Locks, Latch waits etc. For Ex: If a process is waiting for another process to release its lock, then it becomes a resource wait because of locks. If a process wants to write something to a disk but its has to wait for some other IO activity to complete then its a resource wait because of IO. So all these waiting time gets categorized as resource waits..
After a process grabs all the resources it requires, the process gets scheduled to run/executed by the Processor/ CPU. So, it enters the CPU queue. If there are already many processes executing, then it has to wait to get the CPU. The time spent waiting for the CPU is termed as signal waits.
sys.dm_os_wait_stats DMV provides the cumulative sum of the waiting time experienced by all processes at each wait type. dm_os_wait_stats DMV is a SQL Server 2005/2008 equivalent of DBCC SQLPERF(waitstats). A lot more accurate explanation about the DMV is provided here . dm_os_wait_stats maintains these stats since last restart of the server or since the last execution DBCC SQLPERF ('sys.dm_os_wait_stats', CLEAR);
sys.dm_os_wait_stats provides about 485 wait types/wait events and waiting time spent at each wait type. Two interesting columns of this DMV are
Wait_time_ms gives the total time spent at each wait type ( Resource + cpu wait ). signal_wait_time_ms specifies the time spent on waiting for CPU. Resource waittime can be found by wait_time_ms - signal_wait_time_ms.
SELECT SUM(signal_wait_time_ms) * 100 / SUM(wait_time_ms) AS
SUM(signal_wait_time_ms) AS cpu_wait,
SUM(wait_time_ms - signal_wait_time_ms) AS resource_wait,
SUM(wait_time_ms) AS total_wait_time
Consider the query above. cpu_pressure_percentage provides percentage of time spent waiting for CPU, by finding the ratio of signal_time and total wait time. Microsoft recommends this value to be less than 25% for a normal system. Value any higher would indicate a system suffering likely from a CPU stress.
Most common causes for CPU stress/high signal waits can be
* Poor and ineffective query plans / missing indexes / table scans
* Too many Plan Recompliations
* Slow CPU and requires a upgrade
As a production DBA, its a good idea to collect the resultset dm_os_wait_stats DMV once few hours or twice or thrice a day and perhaps analyze the results to notice any surge in wait time. This can be one proactive approach to detect potential major problems.
Sunday, October 31, 2010
Saturday, October 23, 2010
Sometimes after taking a backup, one can forget the path where the backup was taken.
The simple script below provides the list of backups taken on the server, backup type ( log - l, Full -D..),backup start date and time, lsn numbers and the path of the backup. Simple and handy.
WHERE backupset.media_set_id = backupmediafamily.media_set_id
ORDER BY backup_start_date DESC
Tuesday, October 19, 2010
Recently, I had a developer who came to my desk with a request.
' I started a execution of a script from a program. the program has been running for a long time. I don't know whether its executing or hanging. Can you check from database whether the script is running or hanging and if hanging at which statement it is hanging?. If possible please tell me why its hanging..'
Earlier in my blog, I posted the following script to check the currently running query.
TEXT AS [query],
Db_name(sysprocesses.dbid) AS dbname,
OUTER APPLY Fn_get_sql(sql_handle)
WHERE spid > 50
ORDER BY cpu DESC
The above query would give the stored procedure's name or starting statements of the script/batch and not the exact statement within the stored procedure or batch that is currently executing. To find that one should use sys.dm_exec_requests
Substring(txt.TEXT, ( statement_start_offset / 2 ) + 1,
( ( CASE statement_end_offset
WHEN -1 THEN Datalength(txt.TEXT)
- statement_start_offset ) / 2 ) + 1) AS statement_text,
CROSS APPLY sys.Dm_exec_sql_text([sql_handle]) AS txt
WHERE session_id <> @@SPID
AND session_id > 50
The query above gives the currently executing statement within a batch. sys.dm_exec_requests provides the columns statement_start_offset ,statement_end_offset which specify the starting and ending positions of the currently executing query within a batch, in bytes. As 'text' column returned by dm_exec_sql_text is of nvarchar datatype, statement_start_offset/2 is required to get to the starting postion of the query. statement_end_offset returns -1 to indicate the end of the batch.So, statement_text column of the query answers the first part of the developer's request.
Now, for the next part , ie finding the status of the query ie.. running/hanging/sleeping etc
A user request can take the following major states.
> Sleeping : Query completed execution and has nothing to do. Connection
is still not closed.
> Running : Currently running/executing.
> Runnable: Can run but waiting for a processor. Implies waiting in
> Suspended: Waiting for some resource or event. Wait may be because of
locks, IO completion, Latch or Memory wait etc.
Sleeping requests are not shown in sys.dm_exec_requests. Runnable/Suspended state requests are the ones we need to watch out for. Runnable/Suspended states are not abnormal, as long as they dont stay at the same state for a long time.
Now for the third part..' Why it is hanging..'
The wait_type column on the resultset will return the reason why a request is waiting. To know the meaning of each wait type refer here
Popular few wait types are given below
ASYNC_IO_COMPLETION - Waiting for IO
CXPACKET - Wait beacuse of Parallelism.
LOGBUFFER - Waiting for the logbuffer to release some space in the memory.
LCK_M% - Waiting for lock to be released. Refer to Blocking_Session_id for finding the blocking process.
Other interesting columns returned by dm_exec_requests are wait_time returning time spent waiting in ms, percent_complete indicating progress of the query,
cpu_time,reads,writes,logical_reads and row_count. percent_complete mostly indicates the progress for select queries,backup process but not for DML operations.
So, a quick execution of the script and checking the wait state details can clearly
show the current progress/ status of the request.