Friday, April 2, 2010

Sparse Columns


Sparse columns are a new column property introduced in SQL Server 2008, to improve the storage of NULL values on columns.When a column is defined as SPARSE, and if a null is inserted on the column then the column doesnt occupy any space at all.

Some facts:

* A Non varying column with a NULL value occupies the entire length of the column.( Datetime column with null occupies 8 bytes)
* A varying column(varchar) with a NULL value occupies a minimum of two bytes.

So, by defining a column sparse one can save the space that is wasted when NULL values are inserted.But Sparse columns come at a cost. When a sparse column
contains NON NULL value / Valid value, it consumes extra 4 bytes of space. So,
one is expected to use sparse columns only when 90% of the value on the sparse column is expected to be null.

Example:

Consider the following table.


CREATE TABLE [dbo].[sparse_col]
  (
     [dt]    [DATETIME] NOT NULL,
     [value] [INT] NULL,
     data    CHAR(500) NULL
  ) 


Data column doesnt have a sparse property defined on it. Let me insert 10,000 rows using the following script


DECLARE @id INT

SET @id = 1

WHILE @id <= 10000
  BEGIN
      INSERT INTO [sparse_col]
      SELECT Getdate(),
             @id,
             NULL

      SET @id = @id + 1
  END 


The spaceused occupied by the table is 6280 kb as shown below.



Let us recreate the table with 'data' column defined as a sparse column.

CREATE TABLE [dbo].[sparse_col]
  (
     [dt]    [DATETIME] NOT NULL,
     [value] [INT] NULL,
     data    CHAR(500)  Sparse NULL
  ) 

The same script provided above is used to insert the rows. The space occupied is just 264KB after switching to sparse column.



Size of a row :

Size of each row in SQL Server 2008 is limited to 8060 bytes. Sparse columns doesnt allow one to exceed this limitation.The size of all Non null sparse columns of a row is limited to 8019 byes.

Example:

Let us change the table definition , by adding addtional column

ALTER TABLE sparse_col ADD data2 CHAR(8000) Sparse NULL

INSERT INTO sparse_col
SELECT Getdate(),
       5,
       NULL,
       NULL 

The above insert works. But the below doesnt.

INSERT INTO sparse_col
SELECT Getdate(),
       5,
       'x',
       'x' 

The insert fails with a error indicating that the size of the row exceeded 8060 byte limitation.

Adding more than 1024 columns:
By default SQL Server 2008 allows 1024 columns per table. But, by using sparse columns one can increase the number of columns to 30,000. However, there can be a maximum of 1024 non sparse columns on the table and the rest have to be sparse columns. The sparse columns defined will have to be grouped using the Column set feature introduced in SQL Server 2008. Column set is a untyped XML column which will group all the sparse columns on the table. The Column set column is a virtual column which doesnt get stored in the table. For more details refer here.

A table with more than 1024 columns is called a wide table and it can be defined with the following script.


DROP TABLE [sparse_col]

GO

CREATE TABLE [dbo].[sparse_col]
  (
     [dt]                  [DATETIME] NOT NULL,
     [value]               [INT] NULL,
     data                  CHAR(500) Sparse NULL,
     specialpurposecolumns XML COLUMN_SET FOR ALL_SPARSE_COLUMNS
  ) 


A table with the column set feature is created using the above script. SpecialPurposeColumns is a column set which will be grouping all the sparse columns in the table.The script provided below adds more sparse columns to the table.


DECLARE @id INT

SET @id = 1

DECLARE @sql NVARCHAR(100)

WHILE @id < 25000
  BEGIN
      SET @sql = 'ALTER TABLE [sparse_col] ADD Col' + CONVERT(VARCHAR, @id) + '  int sparse null '

      EXEC Sp_executesql @sql

      SET @id = @id + 1
  END 

The above script adds 25000 columns to a table. One doubts whether its neccassary.


Sparse columns do come with many restrictions like sparse columns cant participate in primary keys,sparse columns cant have default values etc. For complete set of restrictions refer here

No comments: