November 24, 2009

Collect: The 1st C

The 4 C's of Practical Data Warehousing take a very open approach to describing what a data warehouse is responsible for doing. Perhaps the function that can be interpreted most broadly from a technical perspective is the first C: Collect.

The first job of a data warehouse is to bring together, in one business tool, the information that an analyst, knowledge worker, or decision maker needs to understand business operations. Whether that collection of information is a literal copy between data stores, a federation of databases, or a more complicated transformation of transactional data into a dimensional model is merely an implementation detail. Some of those choices are more or less practical depending on the underlying systems and data (which we'll discuss later), but to be of any value, the data warehouse has to do the collecting of information so that users don't have to. One of the key benefits of data warehousing is that analysts get a one-stop shop for most of the information they need to do their analysis. In analytically immature organizations, analysts will typically spend 80% of their time collecting data and putting it into some kind of local data store (which might be anything from flat files to MS Access to small databases) and only 20% of their time doing analysis on that data. One of the goals of the data warehouse is to flip that ratio so that knowledge workers are able to spend 80% of their time analyzing business operations and only 20% of their time retrieving data, from as few different sources as possible.

When various analysts from different departments (marketing, strategic planning, sales, finance, etc.) all ask the same people for the same data on a continual basis, the application teams are also left without time to make improvements or plan upgrades to the applications themselves. There are still organizations that have multiple staff members dedicated to fulfilling data extract requests to support internal analytical needs. A data warehouse that collects the information once, into a common enterprise model of business activities, satisfies all of those individual departments with one extract from each source system and a consolidated environment in which to do their analysis.
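To make the "one extract from each source system" idea concrete, here is a minimal sketch in Python using the standard library's sqlite3 module. Everything in it is an assumption made for illustration: the two source extracts, the table and column names, and the two departmental questions are hypothetical, and an in-memory SQLite database stands in for whatever platform a real warehouse would run on.

```python
import sqlite3

# Hypothetical extracts, one per source system. In practice each would be
# pulled from an operational application exactly once per load cycle.
ORDERS_EXTRACT = [
    {"order_id": 1, "cust": "C-100", "amount": 250.0},
    {"order_id": 2, "cust": "C-200", "amount": 75.5},
    {"order_id": 3, "cust": "C-100", "amount": 120.0},
]
CRM_EXTRACT = [
    {"customer_id": "C-100", "segment": "Enterprise"},
    {"customer_id": "C-200", "segment": "Small Business"},
]


def collect(conn):
    """Load each source extract once into a shared (illustrative) model."""
    conn.execute(
        "CREATE TABLE customer (customer_id TEXT PRIMARY KEY, segment TEXT)"
    )
    conn.execute(
        "CREATE TABLE sale ("
        "order_id INTEGER PRIMARY KEY, customer_id TEXT, amount REAL)"
    )
    conn.executemany(
        "INSERT INTO customer VALUES (:customer_id, :segment)", CRM_EXTRACT
    )
    # The order system calls the key "cust"; the common model renames it.
    conn.executemany(
        "INSERT INTO sale VALUES (:order_id, :cust, :amount)", ORDERS_EXTRACT
    )
    conn.commit()


def revenue_by_segment(conn):
    """A question a finance analyst might ask of the shared model."""
    rows = conn.execute(
        "SELECT c.segment, SUM(s.amount) "
        "FROM sale s JOIN customer c ON c.customer_id = s.customer_id "
        "GROUP BY c.segment ORDER BY c.segment"
    ).fetchall()
    return dict(rows)


def order_count_by_customer(conn):
    """A question a sales analyst might ask of the same data."""
    rows = conn.execute(
        "SELECT customer_id, COUNT(*) FROM sale "
        "GROUP BY customer_id ORDER BY customer_id"
    ).fetchall()
    return dict(rows)


conn = sqlite3.connect(":memory:")
collect(conn)
print(revenue_by_segment(conn))
print(order_count_by_customer(conn))
```

Both questions are answered from the same collected tables, so neither analyst has to go back to the application teams for a separate extract. A literal copy is used here only because it is the simplest of the implementation choices mentioned above; a federated or dimensional design would serve the same purpose.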

(Repost from: Sharpening Stones)

Posted by Paul Boal at November 24, 2009 9:30 PM

