Tuesday, 23 April 2013

Computed colums and DateTime queries

Computed columns are there for already a long time, but still, I feel it's worth refreshing one's memory once in a while, so consider this is as a reason for this post.

When using a DateTime value in our applications, we store the complete date and time with a precision of let’s say milliseconds. Most of the time when querying, we will use the date part of the DateTime and in some cases, the time part (up to the minute for example)

When you create an index on the DateTime column, you will create indexes on the precision of milliseconds, which causes the index to be very granular. To avoid this granularity, you should consider creating a computed column or more computed columns of the DateTime column, based upon your query requirements.

A computed Column?

A computed column in a SQL Server Database table is a column which is computed from an expression using other columns in the same table.

Expressions can be a:
  • Function
  • Constant,
  • Non-computed column name
  • Combination of the above connected by operators.
Computed columns are virtual columns that are not physically stored in the database (unless otherwise specified). Their values are recalculated on every query reference. When the PERSISTED keyword is used, the columns are physically stored in the database. The values are updated upon the change of a column that is used in the calculation.

Considerations for computed columns:
  • The expression cannot be a sub query
  • Indexes can be created when the computed column is marked as PERSISTED
  • Computed columns containing CLR functions must be determistic and PERSISTED.

Example
In the example, a computed column will be created for the DateTime column, the column will be persisted and an index will be created on the computed column.

I have a table [dbo].[ComputedColumnsTest] with 200000 records

 


 
We execute the following query:
 


SELECT Name, [Text]
FROM [dbo].[ComputedColumnsTest]

WHERE MONTH([DateTimeEntry]) = 2
 
The execution plan looks as follows:


 
We first Create an index on the DateTimeEntry Column


CREATE NONCLUSTERED INDEX [IX_DatatimeEntry]
ON [dbo].[ComputedColumnsTest]([DateTimeEntry] ASC)
INCLUDE ([Name],[Text])
GO

 
Our execution plan now looks like this:

 

Finnally, we create our computed column specification:

1. Via the SSMS


2. Via T-SQL

ALTER TABLE [dbo].[ComputedColumnsTest]
ADD [ComputedMonth] AS (MONTH([DateTimeEntry])) PERSISTED

We add an Index on the computed column

CREATE NONCLUSTERED INDEX [IX_ComputedMonth]

ON [dbo].[ComputedColumnsTest] ([ComputedMonth])

INCLUDE ([Name],[Text])
GO

And we change the query to

SELECT Name , [Text]
FROM [dbo].[ComputedColumnsTest]
WHERE ComputedMonth = 2

The query plan looks as follows:



When we run both queries side by side, we can see that they both use the index on the computed column.

When we compare both queries again, but we force the first one to use the original index, we have the following query plan.



We can now see that the query using the index on the computed columns has a much lower cost percentage regarding the complete batch.

Conclusion


Creating computed columns and indexes on computed columns do make a difference in performance, even if you are not using the computed column in your query. The SQL Server engine is intelligent enough to create it’s query plans based upon the best performing index.