March 21, 2008
Data Modeling in the BI World
One of the key enablers of successful Business Intelligence programs is the ubiquitous, hard-working data model. The data model is the heart of any software system and, at a fundamental level, provides placeholders in which data elements reside.
Business Intelligence systems with all their paraphernalia - Data Warehouses, Marts, Analytical & Mining systems etc. - typically deal with the largest volumes of data in any enterprise, and hence data models are highly venerated in the Data Warehousing world.
At a high level, a good Data Warehouse data model, and the modeler behind it, should do the following: (Corollary - If you are looking for a data modeler, look for the following traits)
1) Understand the business domain of the organization
2) Understand at a granular level the data generated by the business processes
3) Realize that business data is an ever-changing commodity - The placeholders provided by the data model should be relevant not only for the present but also for the future
4) Be describable at a conceptual and logical level to all relevant stakeholders
5) Allow for uncomplicated conversion to the physical world of databases or data repositories that are manipulated by software systems.
Extensible data models address all 5 points mentioned above and, more specifically, have future-proofing as one of their main stated goals. Such extensible models should also be "consumption agnostic", i.e. they should provide comparable levels of performance irrespective of the way the data is consumed.
Entity-Relationship and Dimensional modeling (http://www.rkimball.com) have been the lingua franca of BI data modelers operating at the conceptual and logical levels. Newer techniques like Data Vault (http://www.danlinstedt.com/) also offer some interesting ideas for building better logical models for Data Warehouses.
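To make the dimensional approach a little more concrete, here is a minimal sketch of a star schema: one fact table surrounded by dimension tables, queried by joining and aggregating. It uses Python's built-in sqlite3 module purely for illustration; the table names, column names and sample rows are all made up for this example and do not come from any particular warehouse.

```python
import sqlite3


def build_star_schema():
    """Create a tiny in-memory star schema with hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Dimension tables hold descriptive attributes.
        CREATE TABLE dim_product (
            product_key  INTEGER PRIMARY KEY,
            product_name TEXT,
            category     TEXT
        );
        CREATE TABLE dim_date (
            date_key      INTEGER PRIMARY KEY,
            calendar_date TEXT,
            month         TEXT
        );
        -- The fact table holds measures plus keys into each dimension.
        CREATE TABLE fact_sales (
            product_key INTEGER REFERENCES dim_product(product_key),
            date_key    INTEGER REFERENCES dim_date(date_key),
            quantity    INTEGER,
            amount      REAL
        );
        """
    )
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?, ?)",
        [(1, "Widget", "Hardware"), (2, "Gadget", "Hardware"), (3, "Manual", "Books")],
    )
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?, ?)",
        [(20080320, "2008-03-20", "2008-03"), (20080321, "2008-03-21", "2008-03")],
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
        [(1, 20080320, 2, 20.0), (2, 20080320, 1, 15.0),
         (3, 20080321, 4, 40.0), (1, 20080321, 1, 10.0)],
    )
    return conn


def sales_by_category(conn):
    """Aggregate the fact table's measure by a dimension attribute."""
    return conn.execute(
        """
        SELECT p.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY p.category
        ORDER BY p.category
        """
    ).fetchall()


if __name__ == "__main__":
    for category, total in sales_by_category(build_star_schema()):
        print(category, total)
```

The same fact table can be sliced by month simply by joining to dim_date instead, which is the kind of flexibility dimensional models are usually chosen for.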
At the physical implementation level, both relational (ROLAP) and multi-dimensional (MOLAP) databases form the backbone of the BI infrastructure. Each of these techniques has its own strengths and weaknesses, so BI data modelers need to be aware of their capabilities to ensure that the right decisions are taken when physicalizing the logical models.
Even among relational OLAP vendors, a space traditionally dominated by row-major databases like Oracle, SQL Server etc., column-major relational databases such as Sybase IQ, Vertica etc. are gaining a lot of popularity with claims of being built from the ground up for data warehousing. The physical layer is also seeing a lot of action with the entry of data warehousing appliance vendors like Netezza, DATAllegro etc. (http://www.dmreview.com/article_sub.cfm?articleId=1009168).
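The row-major versus column-major distinction can be sketched in a few lines of Python. This is only a toy illustration of the two layouts, not how any of the products named above are implemented; the sample orders and function names are invented for the example.

```python
# Row-major layout: each record is stored together, one after another.
ROW_STORE = [
    {"order_id": 1, "region": "East", "amount": 20.0},
    {"order_id": 2, "region": "West", "amount": 15.0},
    {"order_id": 3, "region": "East", "amount": 40.0},
]


def to_column_store(rows):
    """Pivot row-major records into a column-major layout:
    one list of values per column name."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def total_from_rows(rows, field):
    """Row layout: every whole record is visited to read one field."""
    return sum(row[field] for row in rows)


def total_from_columns(columns, field):
    """Column layout: only the one column of interest is read."""
    return sum(columns[field])


if __name__ == "__main__":
    column_store = to_column_store(ROW_STORE)
    print(total_from_rows(ROW_STORE, "amount"))
    print(total_from_columns(column_store, "amount"))
```

Both functions return the same total; the difference is how much data has to be touched to get it. Analytical queries tend to aggregate a few columns over many rows, which is the access pattern a column-major layout favors.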
The intent of this post can be summed up as - BI practitioners should:
a) Understand the BI/analytical goals of the enterprise before deciding on data modeling techniques - Make the model extensible and future-proof
b) Understand the current techniques that help envisage and build data models
c) Be on the look-out for new developments in the data modeling and database world - There is a lot of interesting action happening in this area right now!
Data Modeling is a fascinating area that combines functional knowledge with technology skills, and a good data model goes a long way in ensuring the success of enterprise-wide BI initiatives.
Thanks for reading. Please do share your views / thoughts.
Posted by Karthikeyan Sankaran at March 21, 2008 5:45 AM
