70-433:Good, Better, Best Index Selection

With SQL Server 2008, you’ve got some additional tricks you can pull off with indexes.  With 2005 you could INCLUDE columns in your index, now with 2008 you can also create indexes with a WHERE clause.  This means you can limit the number of rows in your NONCLUSTERED indexes.

By limiting the number of rows included in your index, you limit the amount of space needed to store the index, plus you can limit the impact of inserts and updates.  Think about it.  If you have data in a table that’s current and historical in nature, but you only read the current data, you could build indexes that only cover the active data.  Updates to historic data will no longer take a hit like they would if you had a traditional index defined.

Let’s dig in to this idea a bit more.  Take the following table:

CREATE TABLE OrderHeader (
    OrderID INT IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED
  , OrderDate DATETIME NOT NULL
  , CustomerID INT NOT NULL
  , SalesPersonID INT NULL
  , CommentDate DATE NULL);

Let’s optimize a query that searches this table by CommentDate and SalesPersonID, it returns the CustomerID and SalesPersonID.  What index would you define on this table given that only 5% of the rows have a non-NULL CommentDate?

You could simply index all three columns.

CREATE INDEX ix_OrderHeader__CommentDate_SalesPersonID_CustomerID ON OrderHeader (
     CommentDate
   , SalesPersonID
   , CustomerID );

Yeah, that would speed up your query, but as of SQL 2005, you don’t have to include the CustomerID in the indexed columns.  I would include it in the INCLUDE LIST

CREATE INDEX ix_OrderHeader__CommentDate_SalesPersonID__INC_CustomerID ON OrderHeader (
     CommentDate
   , SalesPersonID)
INCLUDE (
     CustomerID );

That would speed up your query, and cut the maintenance costs, since you are no longer ordering your index on the CustomerID column, you’re simply including it.  But notice I told you only 5% of the rows have a non-NULL CommentDate.

Since we know we’re going to be looking up data in this table using CommentDate, we won’t be searching for NULL CommentDate.  We no longer have to index all the rows, only those with a non-NULL value.

CREATE INDEX ix_OrderHeader__CommentDate_SalesPersonID__INC_CustomerID ON OrderHeader (
CommentDate
, SalesPersonID)
INCLUDE (
CustomerID )
WHERE
     CommentDate IS NOT NULL;

This is the perfect place to use a WHERE clause, when we know 95% of the rows don’t need to be indexed.  This is the sort of knowledge you’ll need to pass the 70-433!

If you have any questions, please let me know!

Leave a Reply

Your email address will not be published. Required fields are marked *