August 11, 2007
The world is flat - or at least its files are.
A couple of weeks ago, Scott Humphries held his annual Pacific Northwest BI Summit in Oregon. It is a private event, small but highly valued, and organized impeccably by Scott. The Summit is a real pleasure, with all the ingredients of a memorable symposium - fascinating company, beautiful surroundings, and a wonderful host. However, the Summit is much more than just a good time: it is an opportunity to have conversations and to exchange insights across a very broad spectrum of the BI business, not only with deeply knowledgeable friends, but also with colleagues from companies outside our usual circle of partners.
This year we formally covered four topics - RFID intelligence, software as a service, IT and business alignment, and data warehouse appliances. Informally, the subjects were even more diverse. Coming away from the weekend, I always find that some insights have been new and surprising; some have simply, but valuably, confirmed what I have already been hearing from partners and customers; and some give an interesting new tingle to vaguely defined feelings I have had about the BI industry and its practices.
Here is just one example. We were discussing software as a service, and someone observed that, in their SaaS world, many clients still exchanged data with the service in the form of encrypted flat files, sent over secure HTTP. These customers were unwilling, for security reasons, to open a port in their datacenter to exchange data with the service provider. There was much head-nodding and recognition around the table. Having spent five years working specifically on data integration technologies, I was all too aware that flat files are pervasive.
Nevertheless, one thinks of software as a service as being on the leading edge of innovation, and it was a little surprising to discover that good old flat files are still to be found there - and not only as lingering artifacts of an earlier age, but as a positive choice for otherwise early-adopting customers. It is rather like visiting the restroom in a high-tech Japanese building, and finding a squat toilet – elegant and efficient, but somehow something one expected to be phased out.
I love flat files. You have to marvel at the sheer ingenuity - sometimes inspired, sometimes perverse - with which data architects have been able to overload the meanings of delimiters, work around embedded characters, pad fields, compress fields, normalize, denormalize, you name it. And it’s not only what people have been able to do with the 2**7 characters of ASCII - we had great fun working out how to parse (and help users to define) fixed width columns efficiently in multi-byte character sets. Great stuff!
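To give a flavour of why fixed width columns get interesting in multi-byte character sets, here is a minimal sketch in Python. It assumes the column widths are declared in bytes rather than characters, which is only one possible convention; the function name, the sample record and the choice of UTF-8 are hypothetical illustrations, not a description of any actual product's parser.

```python
def parse_fixed_width(record: bytes, widths: list[int], encoding: str = "utf-8") -> list[str]:
    """Split a record into fields whose widths are given in BYTES (an assumption)."""
    fields = []
    offset = 0
    for width in widths:
        chunk = record[offset:offset + width]           # slice on byte boundaries
        fields.append(chunk.decode(encoding).rstrip())  # decode, then drop space padding
        offset += width
    return fields


# Hypothetical record: a 10-byte name column followed by a 6-byte amount column.
# The two characters of the name occupy six bytes in UTF-8, so four spaces pad it.
record = "東京".encode("utf-8").ljust(10) + b"001234"

by_bytes = parse_fixed_width(record, [10, 6])

# The naive alternative: decode first, then slice by character count.
# The name column swallows part of the amount, because 10 characters is
# more than 10 bytes' worth of this record.
by_chars = record.decode("utf-8")[:10].rstrip()
```

Slicing on bytes keeps the columns aligned here, while slicing the decoded text by character count does not. If the widths were instead declared in characters, the logic would have to be the other way round, which is exactly the sort of thing users need help defining.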
I have a friend in Canada who, in his retirement, carefully watches the Canadian markets. For this he uses Microsoft's MoneyCentral website. Now, as it happens, several of the exchanges that provide data to MoneyCentral use a simple form of compression for their streaming ticker data: they leave out the decimal point from each quote. For a quote to two decimal places, this can amount to between 14% and 25% compression. Every hour or so, the data provider sends a reminder of where the decimal place should be. However, very occasionally, the provider would neglect to send this reminder, and my friend's stocks would appear to jump 10000% in value. At his age, this kind of excitement could be too much for him.
Bud's method for dealing with this scenario was simple enough - he emailed me whenever it happened. After all, I work at Microsoft, so surely I can tell those guys at MoneyCentral to sort it out. Naturally, the team spotted these problems pretty quickly anyway, and the figures would be adjusted within minutes. Nevertheless, Bud was convinced that I was so powerful within Microsoft that all I had to do was pick up the phone and entire teams would jump into action to fix the problem just for him. (Today, I believe the problem is permanently solved. I certainly haven't had that panic email from Bud in a while.)
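The implied decimal point in the ticker story is easy to sketch in Python. This is an illustration only: the function names are hypothetical, and the real feed format is not described here beyond "leave out the decimal point and send a reminder of where it goes".

```python
from decimal import Decimal


def strip_decimal_point(quote: str) -> str:
    """Sender side: drop the decimal point from a quote such as '1234.56'."""
    return quote.replace(".", "")


def restore_decimal_point(digits: str, places: int) -> Decimal:
    """Receiver side: put the point back 'places' digits from the right."""
    return Decimal(digits) / (Decimal(10) ** places)


def fraction_saved(quote: str) -> float:
    """Share of characters saved by omitting the one-character point."""
    return 1 / len(quote)


wire = strip_decimal_point("1234.56")

# With the reminder (two decimal places) the original quote comes back.
with_reminder = restore_decimal_point(wire, places=2)

# Without it (assumed here to mean "no decimal places") the same digits
# read as a value one hundred times larger.
without_reminder = restore_decimal_point(wire, places=0)
```

A four-character quote such as 1.23 saves one character in four (25%), and a seven-character quote such as 1234.56 saves one in seven (about 14%), which is where a range like the one above comes from. It also shows how fragile the scheme is: the digits on the wire are identical in both cases, and only the out-of-band reminder tells the receiver what they mean.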
When I reflect on it, it is natural that flat files still have a role to play in our new world of software as a service. They are, like the squat toilet, simple and efficient. They do, perhaps, involve some manoeuvres to which we, in our technological comforts, have grown unused. (My wife and I concluded that the wonderfully supple and elegant old ladies and men performing Tai Chi in parks of an early morning in Hangzhou were actually practising for what my own grandmother would call their "necessary visits.")
Technologies move more slowly in the real world than they do in the high-energy environment of innovators and start-ups. I have no problem with that. If for some folks the world is still flat, it is a good thing that those of us eager to rush forward with all that is new still have to accommodate them.
Posted by Donald Farmer at August 11, 2007 12:13 PM
