May 15, 2006
Profile the Target
A challenge many data integrators face is the hidden business rules concealed in the data, ready to spring up like a jack-in-the-box at the first hint of integration. Take a target column of product part numbers. The published (defined) format of a part number is nine digits starting with an 8 or a 7, followed by an optional dash ("-") and then an alpha character. A part number that follows the published business rule would look like 856756723-a. So the data modeler and the data integrator design both their target column and their ETL schema around the published definition, and then run the integration process. The problem, using an EII implementation as an example, is that 20 different data sources feed data through the integration engine, and the code that collects and "stages" the part numbers allows for a 12-character string, whereas the published format never exceeds 11 characters. Remember, EII builds virtual data sets, which means requested information is retrieved in real time from the 20 root data sources when needed.
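To make the gap between the published rule and the staged data concrete, here is a minimal profiling sketch in Python. It is an illustration, not the method used in any particular integration tool. It assumes the published rule means nine digits beginning with 7 or 8, then an optional dash, then exactly one letter. The function name `profile_part_numbers` and the sample values are hypothetical.

```python
import re
from collections import Counter

# Assumed reading of the published rule: nine digits, the first a 7 or an 8,
# then an optional dash, then exactly one letter.
PUBLISHED_FORMAT = re.compile(r"[78]\d{8}-?[A-Za-z]")


def profile_part_numbers(values):
    """Compare a column of part numbers with the published format."""
    lengths = Counter()
    violations = []
    for value in values:
        lengths[len(value)] += 1
        # fullmatch requires the whole string to fit the pattern
        if PUBLISHED_FORMAT.fullmatch(value) is None:
            violations.append(value)
    return {
        "total": len(values),
        "conforming": len(values) - len(violations),
        "violations": violations,
        "lengths": dict(lengths),
    }


# Hypothetical staged values: two follow the rule, two do not.
sample = ["856756723-a", "712345678B", "956756723-a", "856756723-ab"]
report = profile_part_numbers(sample)
print(report)
```

The length histogram is the useful part here: any value of 12 characters can only come from a source that does not follow the published definition, which is exactly the kind of hidden rule that profiling is meant to surface before the integration runs.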
A second problem is, and you may have guessed
Posted by Frank Dravis at May 15, 2006 9:16 PM
