March 15, 2010
Innumeracy and Business Intelligence
My inspiration for this post is the wonderful book "Innumeracy" by John Allen Paulos. In it, the author argues that many of us are unable to deal comfortably with numbers in the real world, and that understanding the concepts he lays out gives us a clearer, more quantitative way of looking at the world.
Business Intelligence is arguably the most quantitative area of Information Technology. At a very basic level, BI deals with metrics collected about various business processes. How those metrics have to be managed and manipulated depends on their mathematical nature. If that sounds too profound, well, it is intentional and I urge you to read on!
Any Data Warehouse data modeler will appreciate that the metrics collected in a fact table have to be understood in the context of the fact table's grain. The metrics in a transaction-grain fact table have to be treated differently from those stored in a periodic snapshot or an accumulating snapshot. Think about fully additive and semi-additive facts and you get the idea.
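To make the additive versus semi-additive distinction concrete, here is a minimal Python sketch. The snapshot rows, account names and function names are all hypothetical, made up purely for illustration. It assumes a periodic snapshot of month-end account balances, a textbook semi-additive fact: balances can be summed across accounts within a month, but not across months.

```python
from collections import defaultdict

# Hypothetical periodic-snapshot rows: (month, account, month_end_balance)
snapshots = [
    ("2010-01", "A", 100.0),
    ("2010-01", "B", 50.0),
    ("2010-02", "A", 120.0),
    ("2010-02", "B", 30.0),
]

def total_by_month(rows):
    """Summing balances across accounts within one month is valid."""
    totals = defaultdict(float)
    for month, _account, balance in rows:
        totals[month] += balance
    return dict(totals)

def average_balance_over_time(rows):
    """Across time, average the monthly totals instead of summing them."""
    monthly = total_by_month(rows)
    return sum(monthly.values()) / len(monthly)

def naive_sum_over_time(rows):
    """The mistake: adding balances across months counts the same money twice."""
    return sum(balance for _month, _account, balance in rows)

print(total_by_month(snapshots))             # total per month, across accounts
print(average_balance_over_time(snapshots))  # a meaningful figure across time
print(naive_sum_over_time(snapshots))        # inflated, not a real balance
```

With these made-up rows, the bank holds 150.0 in each month, yet the naive sum across both months reports 300.0, a figure that corresponds to nothing in the business.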
Similarly, a BI report developer deals with numbers on a daily basis. A good understanding of the numbers to be shown on a report (can they be added, averaged or extrapolated?) is essential to arriving at the right information content and the correct way to visualize them. As a simple example, read Ralph Kimball's classic article (aren't all his articles classics!) "SQL Roadblocks and Pitfalls". It exposes the basic limitations of SQL in dealing with moving averages, a very common requirement in BI reporting, and even to decipher it we need the ability to think mathematically.
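For readers who have not met a moving average before, the idea can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the approach from Kimball's article. The function name and the sample figures are my own, and I am assuming a trailing window that simply uses whatever values are available until the window fills up.

```python
from collections import deque

def moving_average(values, window):
    """Trailing moving average: each output is the mean of the last
    `window` values seen so far (fewer at the start of the series)."""
    if window <= 0:
        raise ValueError("window must be positive")
    recent = deque()      # the values currently inside the window
    running_sum = 0.0
    result = []
    for value in values:
        recent.append(value)
        running_sum += value
        if len(recent) > window:
            # Drop the oldest value once the window is full.
            running_sum -= recent.popleft()
        result.append(running_sum / len(recent))
    return result

# Hypothetical monthly sales figures
monthly_sales = [10, 20, 30, 40]
print(moving_average(monthly_sales, 3))  # [10.0, 15.0, 20.0, 30.0]
```

Each output row depends on a sliding set of neighboring rows, which is exactly the kind of ordered, row-to-row calculation that set-oriented queries find awkward to express.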
Moving on to the realm of data mining, predictive analytics and its ilk, we as BI practitioners are starting to tread on areas that require a solid quantitative mindset. In one of my earlier posts, 'The Esoteric World of Predictive Analytics', I argued that traditional statistics is not enough to make sense of predictive analytics when it comes to modeling human behavioral systems, which is what BI applications are all about. More fundamentally, an understanding of probability, central tendency, correlation versus causation, normal distributions, regression models, design of experiments and so on is becoming very important for BI practitioners. With sites like the Rice Virtual Lab in Statistics, it is quite possible to get a grasp of the fundamentals in a short time.
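As a small taste of why central tendency deserves attention, consider the sketch below. The order values are invented for illustration, and it uses only Python's standard `statistics` module. One unusually large order is enough to pull the mean far away from what a typical customer spends, while the median barely notices.

```python
import statistics

# Hypothetical order values; the last one is a single very large order.
order_values = [20, 25, 30, 35, 890]

mean_value = statistics.mean(order_values)      # pulled up by the outlier
median_value = statistics.median(order_values)  # the middle order

print(mean_value, median_value)
```

Here the mean works out to 200 and the median to 30, so a report headline of "average order value" tells two very different stories depending on which measure sits behind it.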
Let me close this post with a passage from 'Innumeracy'. John Allen Paulos writes: "In an increasingly complex world full of senseless coincidence, what's required in many situations is not more facts - we're inundated already - but a better command of known facts, and for this a course in probability is invaluable....Probability, like logic, is not just for mathematicians anymore. It permeates our lives".
BI practitioners, whose lofty ideals relate to helping organizations make sense of their customers' behavior, would do well to give their "Quantitative Gene" a push or shove in the right direction.
Thanks for reading. Please do share your thoughts.
Posted by Karthikeyan Sankaran at 1:45 AM
