September 7, 2008
The Cloud of Unknowing - experimenting with predictive analytics on the web
The Cloud of Unknowing is a quite beautiful mystical text from 14th century England. It emphasizes the importance of direct experience over intellectual knowledge. I often think about it when considering the complexities of implementing successful predictive analytics. My customers often have an almost mystical goal of foreknowledge and insight, but find themselves caught up in the technical details of model building and validation.
Daunted by Quants? Flaunt this taunt.
All this can be overwhelming for the non-technical user trying to get to grips with the technology. But equally, when technical specialists are brought in, the business sponsor can feel blinded with science. The analytic centre of excellence can quickly be seen as the centre of arrogance.
The problem here is that no amount of testing, validation or statistical reflection can compensate for lack of business understanding. There was a great article in the New York Times technology blog about the failure of technology in the subprime crisis. Jim Goodnight, the CEO of SAS, put it succinctly when describing the role of analysts in the crisis: "It was the quants, and they were using our software. But they didn’t understand the underlying vehicle when computing the risk."
My personal goal for Microsoft's Data Mining technology, and the goal which the team has been pursuing long before I joined, is simple: to put predictive analysis in the hands of people who do understand the business. To help clear the cloud of unknowing.
Of course, the specialist will have a ready reprimand for this approach. Surely there is a danger of the user creating the wrong model? Well, perhaps that is a danger, but as George Box once put it, "All models are wrong, but some are useful." In fact, Box went further: "The practical question is how wrong do they have to be to not be useful." The answer to that question is often not to be found in the charts and stats of the quants, but in the direct experience of the savvy business user.
Joining the cloud crowd
Even a quick survey of the predictive analytics market will show that it's not only the traditional need for specialists that is a barrier to adoption. The cost of software is a significant factor too. (There is an excellent TDWI survey, certainly not done quickly, by Wayne Eckerson here.)
At Microsoft, as so often, we're trying to bring capabilities well within the reach of all users. Data mining has been part of the SQL Server product since 2000. Nevertheless, there is still a barrier to just experimenting with predictive analytics. Even our very successful Data Mining Add-ins for Excel need a server to host the modelling technology.
So here's what's new - and it really is quite unique, a lot of fun, and useful too. One of our developers, Bogdan Crivat, has been working over the summer on an incubation - to adapt our simple Table Analysis data mining tools for hosting in the cloud. This is now available for anyone to try - no server infrastructure required, just a plain old web browser. (Yes, Shawn Rogers, you could try this even in Chrome.) You can try it here.
You can use our sample data, or upload your own *.csv file. Folks have been having a lot of fun with this. For example, Brent Ozar blogged enthusiastically and in helpful detail here.
Bogdan has done a super job of this experimental version. And, of course, it is important to emphasise that this is still an experiment at a very early stage. There are plenty of issues to be sorted out before this would be available as a full cloud service.
Yet I am sure there is a future here - enabling users to try before they buy at the very least. Another possibility is that users with only occasional need for predictive analytics - preparing a semi-annual campaign, or analyzing survey results - will love the ability to experiment and perhaps even pay ad-hoc for a simple analytic service.
On premises or on promises?
One of the issues of which I am very aware regarding cloud services is whether businesses are really willing to rely on them for mission-critical applications. I call this the choice between on-premises and on-promises. Once you choose the cloud, you are really in the hands of your service provider and that can be a difficult choice to make for risk-averse executives.
Of course, for these users I still see us delivering server software and powerful desktop clients. But do try this first foray into the cloud. I am sure you will be intrigued.
Feel free to follow me on Twitter.
Posted by Donald Farmer at September 7, 2008 7:45 PM

