Saturday, August 22, 2009

Fragmentation

Oopsie..I am posting on wrong order..

First it should have been the one on fillfactor..
Then this one..

And then the script to detect fragmentation.

There are two types of fragmentation ie.. internal and external fragmentation. The internal fragmentation which is caused by frequent deletes/inserts and also influenced by fill factors.External fragmentation can be further classified into extent and logical fragmentation which will be briefly explained here.

Before getting into external fragmentation just a quick background info.
As the tables grow new pages are added to the table. The newly added pages may/may not be the located as the immediate page(with the next physical address) to the previous page in the table.Ex: Let us say ,first 100 rows of a table belong to the page at address 500. The next 100 rows in table may be present in the page 515. So this would make the Query engine to jump from Page 500 to 515 while scanning thro the table.

Extents are a group/chain of 8 Pages. An Extent can contain pages from the same table/object or different table/object.When the query engine has to jump outside a current extent to obtain the next row in the table, it is termed as extent switch. The total number of extents in a table which are out of order is refered as extent fragmentation.

The Out of order pages ( only leaf level pages ) in indexes are termed as logical fragmentation.SQL Server provides logical fragmentation values for clustered indexed tables and Nonclustered Indexes. A table with no clustered index's fragmentation is expressed by SQL Server only in terms of extent fragmentation and not by logical fragmentation. When logical fragmentation is specified for a clustered index it implies for the entire table data as the leaf level pages of a clustered index make the actual table data. The script here provides the logical and extent fragmentation for the entire database.

So whats the problem if fragmentation is present? Problem would be a performance dip as SQL Server has to jump from one page to another which slows down reads.The fix is given below.

Fragmentation less than 30% - Reorg can help.Reorg command is ALTER INdex Indexname|ALL ON tablename Reorganize
Fragmentation greater than 30% - Rebuilding the index can help. ALTER INdex Indexname|ALL ON tablename Rebuild.
If its a heap(table with no clutsered index) that is heavily fragmented then creating a clustered index on the table can reduce the fragmentation.

All the options shuold be preferablly at our system's off peak time as they are quite resource intensive.Microsoft's maintenance planner has Rebuild and Reorg defined in it. Rebuild,Reorg,Database Shrinking have an impact on fragmentation and the comparison of these three options will be covered in upcoming posts.

Wednesday, August 5, 2009

Connection leak

A quick script to check connection leaks.


SELECT   spid,
         loginame,
         hostname,
         program_name,
         cpu,
         physical_io,
         memusage,
         open_tran,
         Db_name
(dbid)
FROM     MASTER..sysprocesses
WHERE    spid > 50
         AND Datediff
(hh,last_batch,Getdate()) > 4
ORDER BY last_batch


This script can be handy in checking for unclosed connections by any program on the database server. Last_batch column on sysprocesses indicates the time at which the last query was executed by a connection.Spids below 50 refer to system connections and hence they have been filtered.This query lists down the processes that have been holding on to the connection for more than 4 hours.Most of such connections have their current status to be waiting.


SELECT   spid,
         
query_name.TEXT,

         
loginame,

         
hostname,

         
program_name,

         
cpu,

         
physical_io,

         
memusage,

         open_tran,

         
Db_name(MASTER..sysprocesses.dbid)

FROM     MASTER..sysprocesses

         
CROSS APPLY Fn_get_sql
(sql_handle) query_name
WHERE    spid >
 50
         
AND Datediff
(hh,last_batch,Getdate()) > 4
ORDER BY last_batch


If you are on sql 2005 use the above query to get the last query executed as well.

Tuesday, August 4, 2009

Fill Factor

This post assumes that you know about indexes, clustered index, non clustered indexes
but not about fillfactors :)

If you dont have background info on indexes then run thro these links

http://msdn.microsoft.com/en-us/library/aa933129(SQL.80).aspx

http://msdn.microsoft.com/en-us/library/ms177443(SQL.90).aspx

http://msdn.microsoft.com/en-us/library/ms177484(SQL.90).aspx

If you already know about indexes and also fill factors then dont waste your time and run away from this post :)

ok..so lets get started with fill factors..

When designing indexes one would have noticed a small option to set fill factor percentage. So what exactly is fill factor? Fill factor specifies the percentage up to which leaf pages( clustered indexed table's data pages and non clustered index's leaf pages ) should be filled with. To add more details, when a table is created certain number of pages are allotted to the table. As the rows in the table increases table grows into more number of pages. If the table has a clustered index and if fill factor is set then data pages would be filled up to percentage specified by fill factor.

A sample data page when fill factor is set to 80%. Values mentioned (100,101,115..) are the values of the clustered indexed column.



One needs to reserve space on each data page using fill factor to support the extra space required when random inserts happen on a clustered indexed table. For example if an insert with value 117 (on clustered indexed column) for above-mentioned data page occurs then the other rows (from 120 to 239) need to move to the next row. Another insert say with a value118 would cause another shift. When the data page becomes full SQL Server triggers a 50-50 split. 50-50 split would mean that a new page will be added to the table and the current page and new page would share rows evenly. The problems with 50:50 split are as follows.

  • Half the page will be empty which means more pages needs to be read when rows are to be selected. This results in high IOs
  • 50-50 split takes extra time, which would result in slower inserts.

When the tables have a number of pages which are 50% full then the it would longer for the engine retrieve the rows in the table as it has to scan more pages than when the pages are completely full (or close to full). So Ideally, we would like to keep 50:50 splits as low as possible. So keeping the fill factor to 100% is not a great option if table can expect a number of random inserts (Ex: if your clustered index is on email id column many random inserts are definitely possible). Also, having a low fill factor say 50% would result in major portion of the page remain unused for most of the time. Imagine the case like a table of size 800 KB should fit in 100 pages ( aprroximately) but occupies 1200 pages, would result in query consuming 50% more than it was actually supposed to.So, Picking a wrong fill factor can lead to heavily fragmented data pages, which can cause a serious performance dip on all queries fired to the table. So a fill factor should be picked with the nature of inserts (and updates/deletes) in mind.

After many cycles of random inserts and page splits, it is common for tables to have irregular free space on pages. This type of irregularity is called an internal fragmentation, which can be fixed using DBCC ReIndex/Index frag command. Many DBA’s make Reindex command as a part of their maintenance script for the above-mentioned reason.

In SQL 2000 Internal Fragmentation can be checked using DBCC showcontig command. The Average Page density on the resultset of Showcontig displays the same. For more details on checking fragmentation refer to my other post.