« Macro Environmental Business Intelligence: web mining, data mining, and text mining of external data sources with Oracle. Part I. | Main | It's the tool, stupid! Or is it? »
February 2, 2009
One pass SCD2 load: How to load a Slowly Changing Dimension Type 2 with one SQL Merge statement in Oracle
The other day I came across a useful new feature in the Merge statement for SQL Server 2008. This new feature outputs merged rows for further processing, something which up until now (Oracle 11.1.0.7) is not possible in Oracle. This extra functionality can be used to load a Slowly Changing Dimension Type 2 in one SQL statement. As a result you have only one pass over the data, less logical I/O, and as a result improved performance. If you are interested how exactly the Merge statement can be used in SQL Server 2008 to load an SCD2 have a look at this article. Recently, this has also featured in Kimball's design tip 107. The whole thing was wrecking my head a bit: how can we achieve the same thing in Oracle? Finally, I came up with a solution. I have to admit it is not perfect, as this will only work if you can identify a changed record at source as either an Insert or an Update. So you need to know if the record is a new one or was just modified. In the vast majority of cases you should be able to have this information available at extract time. So if you extract your information from source via change data capture, timestamp created/timestamp modified, audit records , or Oracle Total Recall you should be good to go.
The full test case can be found at my Business Intelligence Quotient blog
Posted by Uli Bethke at February 2, 2009 7:30 AM
