« How to design fact table for multicurrency column | Main | Data Staging Strategy considerations »
February 3, 2007
How to design if many to many relationship exists between dimension and fact table
By: Milind Zodge
Overview
While designing a data mart/ warehouse sometimes you get cases where you find many-to-many relationship exists between dimension and fact table. Usually you have one-to-many relationship between dimension and fact table which is best for good OLAP/ cubes. In this article I am focusing on such M2M cases and how to design to resolve these cases.
Design
Let’s consider a case. We have a time dimension, which have regular attributes, like year, month, date, day, WOY, DOW etc. And we have a fact table which have following date columns, schedule date, shipped date, order date and promised date.
As per the regular design we can either store these dates as it is or can store particular time_id from a Time dimension. If we store a time dimension id (surrogate key) we will create many to many relationship. If we don’t store ids we will not be able to use these columns easily for data analysis.
To resolve this we can use a "Role Based Dimension" concept.
1. Create following views "schedule_date_vw", "shipped_date_vw" and "promised_date_vw" using the SELECT * FROM dim_time. This will create different roles of time dimension
2. Now use these views as dimensions in your schema
3. Add dim_time dimension table in the design
4. Use surrogate keys from the above created views/ dimension along with dim_time and store it in the fact table
Conclusion
This will resolve a problem of many to many relationship and can easily use in OLAP and cubes for better data analysis.
Posted by Milind Zodge at February 3, 2007 9:45 PM
Comments
Hi Milind,
Regarding your article. I kind of understand where you are going in terms of the M-M relationship. Since in the Time Dimension we are going to have Day level, Month Level, Year Level data. However, if we introduce a column called Level_Nme in the Time Dimension and load the data based on Date and generate all the Time_IDs for it. Load the data for Month and new set of Time_IDs for it.
When you query the Fact table by qualifying Level_Nme = 'DAY' will also resolve the M-M relationship.
What do you think ?
Posted by: Ramesh Veerappan at August 21, 2007 8:01 AM
Hi Ramesh,
Thanks for the comment.
Yes, that is another way of doing it. However you still will be using a same dimension table and need to have logic while building cubes. If you have three date columns which will be used in analysis or cube then having three different dimension (role based) will be much easier to build a cube. That's what I was trying to say.
Thanks,
Milind
Posted by: Milind at August 28, 2007 9:25 PM
