Wednesday, March 17, 2010

Whats causing my tempdb to grow - SQL Server 2005


In SQL Server 2005, with the introduction of Snapshot Isolation levels, tempdb is even more heavily used.In addition to the points listed here , Online index rebuild, Snapshot isolation, Read comitted snapshot isolation ( RCSI ),MARS, XML operations,cause heavy tempdb usage. The tempdb usage , in SQL 2005 can be divided into 3 cateogories.

User objects : Mainly meant for storing local and global temp tables and
table variables.
Internal objects : Worktables created for order by /sort /group by
operators/ hash algortihms etc.
Version store : Version store is mainly used at two scenarios. They are
* When indexes are built online, version store when
keeps multiple versions of the same row.
* When the database has snapshot isolation enabled
and if any transaction runs at snapshot isolation
or RCSI.

So, to find whats causing the tempdb growth, one needs to find which category of the three contributes to the growth, the most.

Step 1 : categorizing the tempdb growth using dm_db_file_space_usage

The DMV sys.dm_db_file_space_usage can help us in this cause.
Use the following query to check on which areas tempdb is growing

SELECT SUM (user_object_reserved_page_count) *8 AS usr_obj_kb ,
SUM (internal_object_reserved_page_count)*8 AS internal_obj_kb ,
SUM (version_store_reserved_page_count) *8 AS version_store_kb,
SUM (unallocated_extent_page_count) *8 AS freespace_kb ,
SUM (mixed_extent_page_count) * 8 AS mixedextent_kb
FROM sys.dm_db_file_space_usage

If usr_obj_kb is taking the major percentage, then it implies that temp tables /table variables
are used heavily. This is not considered a problem as long as you have enough space on the disk.

If internal_obj_kb is taking the major percentage, then it implies that large amount work tables,worker files are created because of ineffective query plans.

If version_strore_reserved_page_count is high, then its likely due to long running transactions when there is atleast one active transaction under snapshot isolation level.

One can poll the content of the above query on dm_db_file_space_usage to a table
every 2 minutes and use it for monitoring.

Step 2:Identifying queries causing the growth using dm_db_session_space_usage, dm_db_task_space_usage

After identifying which category of tempdb is growing, one needs to pin point which query is causing the growth. Couple of DMVs help in that cause.

sys.dm_db_session_space_usage : Provides the number of user pages and internal pages allocated and deallocated for a particular session/connection. Doesnt
provide the values for currently running query.Provides only for completed queries/requests in a batch.
sys.dm_db_task_space_usage : Provides the same values as dm_db_session_space_usage but it does only for the currently active queries.

The following query can help us zoom in on the queries that use user pages and internal pages heavily.


SELECT   r1.session_id,
         r5.TEXT,
         r1.internal_objects_alloc_page_count + r2.task_internal_objects_alloc_page_count     AS internal_objects_alloc_page_count,
         r1.internal_objects_dealloc_page_count + r2.task_internal_objects_dealloc_page_count AS internal_objects_dealloc_page_count,
         r1.user_objects_alloc_page_count + r2.task_user_objects_alloc_page_count             AS user_objects_alloc_page_count,
         r1.user_objects_dealloc_page_count + r2.task_user_objects_dealloc_page_count         AS user_objects_dealloc_page_count,
         r3.client_net_address,
         r4.host_name,
         r4.program_name,
         r4.last_request_start_time,
         r4.last_request_end_time,
         r4.login_time,
         r4.cpu_time,
         r4.memory_usage,
         r4.reads,
         r4.writes,
         r4.logical_reads,
         r4.status,
         r4.login_name,
         r4.nt_domain,
         r4.nt_user_name,
         (SELECT Count(1)
          FROM   sys.dm_tran_session_transactions t1
          WHERE  t1.session_id = r1.session_id) AS open_transactions
FROM     sys.dm_db_session_space_usage AS r1,
         (SELECT   session_id,
                   Sum(internal_objects_alloc_page_count)   AS task_internal_objects_alloc_page_count,
                   Sum(internal_objects_dealloc_page_count) AS task_internal_objects_dealloc_page_count,
                   Sum(user_objects_alloc_page_count)       AS task_user_objects_alloc_page_count,
                   Sum(user_objects_dealloc_page_count)     AS task_user_objects_dealloc_page_count
          FROM     sys.dm_db_task_space_usage
          WHERE    session_id > 50
          GROUP BY session_id) AS r2,
         sys.dm_exec_sessions r4,
         sys.dm_exec_connections r3
         OUTER APPLY sys.Dm_exec_sql_text(most_recent_sql_handle) r5
WHERE    r1.session_id = r2.session_id
         AND r3.most_recent_session_id = r2.session_id
         AND r3.most_recent_session_id = r4.session_id
         AND r1.session_id > 50
ORDER BY internal_objects_alloc_page_count DESC,
         user_objects_alloc_page_count DESC

Consistent high values on internal_objects_alloc_page_count, user_objects_alloc_page_count columns reflect heavy usage of tempdb, especially if they have not deallocated the allocated pages. Along with allocation and deallocation columns, the query provides few other useful information for monitoring purposes.

One can dump the results of the above query as done for step 1 at the same interval.

Step 3 : Version store monitoring
If Version store occupies most of the usage, it means that there are long running transactions on snapshot isolation level. Corrective measure can be
keeping the transactions on snapshot isolation level short.

The follwing query can track the longest running transactions that depend on version store.


SELECT   qt.TEXT query_name,
         dm_tran_active_snapshot_database_transactions.transaction_id,
         transaction_sequence_num,
         elapsed_time_seconds
FROM     sys.dm_tran_active_snapshot_database_transactions,
         sys.dm_tran_session_transactions,
         sys.dm_exec_requests
         OUTER APPLY sys.Dm_exec_sql_text(sql_handle) qt
WHERE    sys.dm_tran_active_snapshot_database_transactions.transaction_id = sys.dm_tran_session_transactions.transaction_id
         AND sys.dm_exec_requests.session_id = sys.dm_tran_session_transactions.session_id
ORDER BY elapsed_time_seconds DESC


Poll them if your database uses snapshot isolation. If not then tempdb growth should be mainly due to user/internal object storage which can be identified from step 1 and step 2.

The DMVs are really handy in identifying most of the performance related problems. Most part of this post has been reffered from a awesome white paper which can be read here
and books online article which can be read here.Please read the same for complete details on tempdb monitoring.

No comments: