December 15, 2009

Semantic rationalization blog series: part 3 - the specifics of abstraction

In my last blog, I finished up my thoughts on the importance of involving non-ETL developers in the data integration process. Now let me talk more about how expressor addresses the challenges involved in accomplishing this goal.

We have patented a number of concepts related to our solution, describing a business abstraction layer where the steward, analyst, and developer can work together and contribute activities that utilize their individual skill sets, all through graphical interfaces. The curve ball here is that although expressor is based on this forward-thinking architecture, we must also allow a single individual to perform all these activities in the event that his or her organization is not ready for 'pipelined' development or needs to execute 'quick and dirty' projects.

The remainder of this blog focuses on the functionality we believe is necessary to achieve the 'abstraction layer' described above.

Fundamental to this approach is the need to refer to data items by their business names rather than their physical names. This capability allows us to lower the 'communication impedance' between the technical staff involved in developing the solution and the business staff involved in defining the requirements, validating the implementation and using the resulting solution.
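To make the idea concrete, here is a minimal sketch of what referring to data by business name could look like. It is purely illustrative and is not expressor's implementation: the source names, column names, business terms, and the helper functions `business_name` and `physical_columns` are all hypothetical.

```python
# Hypothetical mapping from (source, physical column) to a business term.
# None of these names come from expressor; they only illustrate the idea.
PHYSICAL_TO_BUSINESS = {
    ("crm.customers", "CUST_NM"): "Customer Name",
    ("billing.accounts", "cust_name"): "Customer Name",
    ("billing.accounts", "ACCT_OPEN_DT"): "Account Open Date",
}


def business_name(source, column):
    """Return the business term for a physical column, or None if unmapped."""
    return PHYSICAL_TO_BUSINESS.get((source, column))


def physical_columns(term):
    """Return every (source, column) pair known to carry a business term."""
    return sorted(key for key, value in PHYSICAL_TO_BUSINESS.items() if value == term)


if __name__ == "__main__":
    # The analyst asks for "Customer Name"; the developer sees where it lives.
    print(business_name("crm.customers", "CUST_NM"))
    print(physical_columns("Customer Name"))
```

With a lookup like this, the business staff can write requirements against "Customer Name" while the technical staff can still trace that term to each physical column that holds it.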

expressor provides multiple ways for the steward role to introduce business definitions into our metadata repository. The expressor initiator, our metadata bulk loading tool, provides mechanisms to either bulk load definitions from a logical model or organically grow the business's ontology from the available database schemas. The expressor administrator, a Web-based GUI tool, provides an interface to enter definitions and terms one at a time, in a transactional model. Regardless of the approach chosen, the system 'learns' how the organization defines its important data and, over time, is able to recognize new external data by its business definition rather than its physical description.

There are a number of internal mechanisms involved in this process, from simple matching to Soundex-type analysis, to combinatorial analysis, to proximity searching. They are used both to identify possible correlations between the external names and the business definitions and to evaluate the likelihood that the names are related, so that the system can recommend which relationships are most likely the correct ones. The benefit of this functionality is already evident after the initial project, and it grows as subsequent projects add more data to the system.
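For readers who want a feel for how name matching of this kind can work, here is a rough sketch that combines two of the mechanisms mentioned above: simple matching and Soundex-type analysis. It is not expressor's algorithm. The string-similarity step uses Python's standard `difflib.SequenceMatcher` as a stand-in, and the equal weighting, the function names, and the sample terms are all assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher

# Standard American Soundex letter groups.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(word):
    """Return the four-character Soundex code of a word ('' if no letters)."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = _CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code counts again
    return (result + "000")[:4]


def normalize(name):
    """Lower-case a name and split it on anything that is not a letter or digit."""
    return " ".join(re.findall(r"[a-z0-9]+", name.lower()))


def score(physical, business):
    """Score how likely a physical name refers to a business term (0.0 to 1.0)."""
    p, b = normalize(physical), normalize(business)
    if p == b:
        return 1.0  # simple matching: identical after normalization
    similarity = SequenceMatcher(None, p, b).ratio()
    p_codes = {soundex(token) for token in p.split()}
    b_codes = {soundex(token) for token in b.split()}
    phonetic = len(p_codes & b_codes) / max(len(p_codes | b_codes), 1)
    return 0.5 * similarity + 0.5 * phonetic  # assumed equal weighting


def recommend(physical, terms, top=3):
    """Return the best-scoring business terms for a physical name, best first."""
    ranked = sorted(((term, score(physical, term)) for term in terms),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:top]


if __name__ == "__main__":
    terms = ["Customer Name", "Customer Number", "Account Open Date"]
    for term, value in recommend("cust_nm", terms):
        print(f"{term}: {value:.2f}")
```

A real system would presumably also feed the steward's accepted or rejected recommendations back into the scoring, which is one way the 'learning' described earlier could show up. That part is left out of this sketch.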

That's it for today. In the next blog, I'll expand on the benefits of our abstraction layer.

- Michael Ruland, field engineering

Posted by expressor software at December 15, 2009 10:00 AM
