BeyeBLOGS | BeyeBLOGS Home | Get Your Own Blog

« May 2007 | Main | July 2007 »

June 20, 2007

Design technique for Date type columns in fact table for maximum performance

By: Milind Zodge

Overview
In the Data warehousing project you have dimensions and facts tables. And when you design a Data warehouse or Data Mart you come across many Date data type attributes. In this article I have pointed out a design technique for Date columns which gives highest performance.

Design
Consider a data mart having a fact table "Order" with many columns like Order Number, Order Date, Shipped Date and Amount and a "Time" dimension which have an entry for each day. You use "time_id" for "Order Date" however most of the time "Shipped Date" is kept as a Date column.

Consider in your reporting system you want to design a report to report number of orders shipped in a particular year. Now you will have format the shipped date column so that you can compare its year portion to get result. If you have a massive fact table this query is going to take more time as it will not be using any index, well you can create index to solve this problem.

Now consider you have a report which report number of orders shipped in a particular month, day, quarter etc. To speed up this operation you will have create index probably more than one. However if we use the id column and index on that column then you can avoid the above problem

Add shipped_date_id column along with the Shipped Date column in the fact table. Derive the value by using Time dimension. So whenever you query you always use index.

Conclusion
This way you can achieve maximum performance without adding more indexes. You can just go to your time dimension get the required ids and join it with your fact table which will use index defined on "shipped_date_id" column.

Share: del.icio.us Digg Furl ma.gnolia Netscape Newsvine reddit StumbleUpon Yahoo MyWeb  

Posted by Milind Zodge at 7:30 PM | Comments (1)