<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
<channel>
<title>Foraging in the Data Forest</title>
<link>http://www.beyeblogs.com/donaldfarmer/</link>
<description>Donald Farmer, from the Microsoft SQL Server Data Mining team, blogs from behind the scenes as his team work to build a leading predictive analytics platform. Subjects covered include data mining, data quality, data integration ... you know, all that data stuff.</description>
<language>en</language>
<copyright>Copyright 2008</copyright>
<lastBuildDate>Mon, 27 Aug 2007 11:32:58 -0700</lastBuildDate>
<generator>http://www.movabletype.org/?v=3.33</generator>
<docs>http://blogs.law.harvard.edu/tech/rss</docs> 


<item>
<title>I&apos;m biased. And so are you.</title>
<description><![CDATA[<p>Earlier this year, I changed teams and moved offices within Microsoft. This interrupted a little habit I had developed: pinning up my “Cognitive Bias of the Week” outside my office.</p>

<p>Cognitive biases are somewhat like optical illusions, but they affect our thinking rather than our vision. A well-known example is confirmation bias: we tend to give more weight to observations that confirm our beliefs than to observations that contradict them. Fortune-tellers may appear successful when people remember one or two correct predictions more readily than the many that were off the mark.</p>

<p>Of course, you wouldn’t make such an error, would you? Think again. Like an optical illusion, many biases are extremely difficult to shake even when you are aware of the effect. In fact, some biases are most effective when we try to think most logically.</p>

<p>I believe it’s important for those of us in the BI world to understand these biases. We represent data and analytic conclusions in highly persuasive ways. We help our customers to get it right or to get it wrong - and at times our influence may be inadvertently malign. With that in mind, I’m going to translate my “Cognitive Bias of the Week” posters into occasional blog posts on particular biases. I hope you’ll find these interesting and relevant. Let me know.</p>

<p>Here’s one to start with. It’s about risk, and it has some revealing insights into how we consider the impact of risk in our decisions. It’s often called “The Pseudocertainty Effect” and it was first examined by <a href="http://www.cs.umu.se/kurser/TDBC12/HT99/Tversky.html">Tversky and Kahneman</a>. </p>

<p>Imagine that the US is at risk from a new disease spreading from Asia. Without treatment, it will kill 600 people, but we have two treatments to choose from.  <br />
  • With Program A, 200 people will certainly live. <br />
  • With Program B there is a 1/3 probability that all 600 people will live. However, there is also a 2/3 probability that they will all die.</p>

<p>Program A is positive – you’re certainly going to save some people. Program B potentially has a better outcome, but it is way less than certain. What treatment program do you recommend?<br />
In the original study, 72% recommended Program A, and only 28% preferred Program B. </p>

<p>Let’s flip the problem round. <br />
  • With Program A, 400 people will certainly die.<br />
  • With Program B there is a 1/3 probability that no-one will die. However, there is also a 2/3 probability that all 600 people will die.</p>

<p>Now, Program A is negative: 400 people will certainly die. Program B is still uncertain: there is a risk it will all go wrong. However, if you do nothing 600 will die anyway, and if you follow Program A, 400 will certainly die. With Program B you have a chance of saving everyone. In the original study, when presented in this way to a different sample, 78% chose Program B. </p>

<p>That’s pretty remarkable. Exactly the same choices, presented in a different way, led to a complete inversion of preferences.</p>

<p>From this example, you can perhaps see why I consider cognitive biases to be an important study for BI analysts and developers. We may think of ourselves, or our users, as super-rational objective analysts of complex data; but in reality we are subject to these same biases. Also, we will tend to fall back on these biases, shortcuts and heuristics when we are making decisions under stress. </p>

<p>As BI becomes ever more pervasive, emergency planners probably would use our tools and techniques to handle an epidemic. But we could also be discussing customer churn rather than a deadly disease. The specific KPIs we choose, the manner in which we present them – the ways in which they influence decisions may be subtle, but the impact can be dramatic.</p>

<p>I’ll try to keep up a regular posting of biases, with examples relevant to the BI world. <br />
</p>]]></description>
<link>http://www.beyeblogs.com/donaldfarmer/archive/2007/08/im_biased_and_s.php</link>
<guid>http://www.beyeblogs.com/donaldfarmer/archive/2007/08/im_biased_and_s.php</guid>
<category></category>
<pubDate>Mon, 27 Aug 2007 11:32:58 -0700</pubDate>
</item>

<item>
<title>Data visualization - in a music video</title>
<description><![CDATA[<p>Not quite BI, but how often do I get the chance to post a link to a <a href="http://www.youtube.com/watch?v=KHEIvF1U4PM">data visualization music video?</a></p>

<p>If you think you recognize the music, you're probably right. It's playing in the background of the Geico caveman advert when he's on the moving walkway in the airport.</p>

<p>My colleague Olivier Matrat points out that the video production is by a French design firm H5 who also made <a href="http://www.youtube.com/watch?v=E3B__ovj2jU">this </a>excellent visualization for a nuclear services company.</p>

<p>Enjoy.</p>]]></description>
<link>http://www.beyeblogs.com/donaldfarmer/archive/2007/08/data_visualizat.php</link>
<guid>http://www.beyeblogs.com/donaldfarmer/archive/2007/08/data_visualizat.php</guid>
<category></category>
<pubDate>Sun, 12 Aug 2007 11:22:53 -0700</pubDate>
</item>

<item>
<title>The world is flat - or at least its files are.</title>
<description><![CDATA[<p>A couple of weeks ago, <a href="http://www.strategic-pr.com/">Scott Humphries</a> held his annual Pacific Northwest BI Summit in Oregon. It is a private event, small but highly valued, and organized impeccably by Scott. The Summit is a real pleasure, with all the ingredients of a memorable symposium - fascinating company, beautiful surroundings, and a wonderful host. However, the Summit is much more than just a good time: it is an opportunity to have conversations and to exchange insights across a very broad spectrum of the BI business, not only with deeply knowledgeable friends, but also with colleagues from companies outside our usual circle of partners.</p>

<p>This year we formally covered four topics - RFID intelligence, software as a service, IT and business alignment, and data warehouse appliances.  Informally, the subjects were ever more diverse. Coming away from the weekend, I always find that some insights have been new and surprising; some have simply, but valuably, confirmed what I have already been hearing from partners and customers; and some give an interesting new tingle to vaguely defined feelings I have had about the BI Industry and its practices.</p>

<p>Here is just one example. We were discussing Software as a Service, and someone observed that, in their SaaS world, many clients still exchanged data with the service in the form of encrypted flat files, exchanged over secure http. These customers were unwilling, for security, to open a port in their datacenter to exchange data with the service provider. There was much head-nodding and recognition around the table. For me especially, having spent five years specifically working on data integration technologies, I was all too aware that flat files are pervasive. </p>

<p>Nevertheless, one thinks of software as a service as being on the leading edge of innovation, and it was a little surprising to discover that good old flat files are still to be found there - and not only as lingering artifacts of an earlier age, but as a positive choice for otherwise early-adopting customers. It is rather like visiting the restroom in a high-tech Japanese building, and finding a squat toilet – elegant and efficient, but somehow something one expected to be phased out.</p>

<p>I love flat files. You have to marvel at the sheer ingenuity - sometimes inspired, sometimes perverse - with which data architects have been able to overload the meanings of delimiters, work around embedded characters, pad fields, compress fields, normalize, denormalize, you name it. And it’s not only what people have been able to do with the 2**7 characters of ASCII – we had great fun working out how efficiently to parse (and help users to define) fixed width columns in multi-byte character sets. Great stuff!</p>

<p>I have a friend in Canada who, in his retirement, carefully watches the Canadian markets. For this he uses Microsoft's <a href="http://moneycentral.msn.com/">MoneyCentral</a> website. Now, as it happens, several of the exchanges that provide data to MoneyCentral use a simple form of compression for their streaming ticker data: they leave out the decimal point from each quote. For a quote to two decimal places, this can account for between 14% and 25% compression. Every hour or so, the data provider sends a reminder of where the decimal place should be. However, very occasionally, the provider would neglect to send this reminder and my friend's stocks appeared to jump 10000% in value. At his age, this kind of excitement could be too much for him. <br />
Bud's method for dealing with this scenario was simple enough - he emailed me whenever this happened. After all, I work at Microsoft, so surely I can tell those guys at MoneyCentral to sort it out. Naturally, the team spots these problems pretty quickly anyway and the figures would be adjusted within minutes. Nevertheless, Bud was convinced that I was so powerful within Microsoft that all I had to do was pick up the phone, and entire teams jumped into action to fix the problem just for him. (Today, I believe the problem is permanently solved. I certainly haven't had that panic email from Bud in a while.)</p>

<p>When I reflect on it, it is natural that flat files still have a role to play in our new world of software as a service. They are, like the squatting toilet, simple and efficient. They do, perhaps, involve some maneuvers to which we, in our technological comforts, have grown unused. (My wife and I concluded that the wonderfully supple and elegant old ladies and men performing Tai Chi in parks of an early morning in Hangzhou were actually practising for what my own grandmother would call their "necessary visits.")</p>

<p>Technologies move more slowly in the real world than they do in the high-energy environment of innovators and start-ups. I have no problem with that. If for some folks the world is still flat, it is a good thing that those of us eager to rush forward with all that is new, still have to accommodate them.</p>]]></description>
<link>http://www.beyeblogs.com/donaldfarmer/archive/2007/08/the_world_is_fl.php</link>
<guid>http://www.beyeblogs.com/donaldfarmer/archive/2007/08/the_world_is_fl.php</guid>
<category></category>
<pubDate>Sat, 11 Aug 2007 12:13:59 -0700</pubDate>
</item>

<item>
<title>Chinese surnames</title>
<description><![CDATA[<p>In January, I posted about <a href="http://www.beyeblogs.com/donaldfarmer/archive/2007/01/jills_surname_m.php">the limited range of surnames </a>in my home community in Scotland - and the problems that can cause for data quality. If it's a problem on a Hebridean island, think of how difficult it must be in China, where there is also a limited range of surnames. 85 percent of Chinese population share 100 surnames! </p>

<p>The Chinese authorities are now waking up to this problem and have introduced <a href="http://www.chinadaily.com.cn/china/2007-06/12/content_891902.htm">a new protocol </a>whereby people can register a composite surname comprising both the father's and mother's name. The hope is that this would create up to 1.3 million new surnames - although the real number is more likely to be much lower: around 10,000. Still an improvement.</p>

<p>I guess these would be rather like the double-barreled names so enjoyed by the British aristocracy. These were used when property or titles were inherited through the female line: the double name signified the new male line and the endowed female line. Think of the first British prime minister: Campbell-Bannerman, where the dominant Campbell family carried the weight of history, wealth and titles in his lineage.</p>

<p>Or perhaps these new composite Chinese names would be more akin to the composite names used by ladies in the US - Hillary Rodham Clinton being an obvious current example. Either way, it's an interesting solution to an increasingly difficult problem. </p>

<p>In Thailand they tackled the Chinese name problem <a href="http://www.apmforum.com/columns/thai4.htm">quite differently</a>. They just insisted that Chinese immigrants registered themselves with unique surnames. In order to ensure uniqueness, more and more suffixes and prefixes had to be added to existing names. The result was extremely long names, which apparently the Chinese immigrants quite enjoyed because they echoed the extremely long names of the Thai nobility. The idea of requiring your name to be a unique identifier appeals to my datahead, if not to my sense of individuality.</p>

<p>Enjoy the links. <br />
</p>]]></description>
<link>http://www.beyeblogs.com/donaldfarmer/archive/2007/06/in_january_i_po.php</link>
<guid>http://www.beyeblogs.com/donaldfarmer/archive/2007/06/in_january_i_po.php</guid>
<category></category>
<pubDate>Tue, 12 Jun 2007 16:45:23 -0700</pubDate>
</item>

<item>
<title>Stratature and the Microsoft platform</title>
<description><![CDATA[<p>It has been some time since I last blogged. Just too much work, with some major conferences thrown in, and not enough time to compose some thoughts. Nevertheless, I cannot let last week’s news pass – that Microsoft has acquired <a href="http://www.stratature.com/">Stratature</a>, a remarkably agile vendor in the MDM space.</p>

<p>I have had many mails and calls from folks wanting to know what it all means. I can understand that – we spend a lot of time in our industry poring over headlines and quotes like the cold-war Kremlin watchers. Is comrade X standing next to general Y? Maybe the tension between their departments is over. And is commissar Z missing from the parade? Similarly, I know many people will be poring over the details of this announcement looking for hints about some grand strategy. </p>

<p>It is really much simpler than that. Like many readers of the b-eye network, a telling number of our customers are asking about MDM, CDI and PIM solutions. In the past, as <a href="http://www.b-eye-network.com/blogs/dyche/archives/2007/06/microsoft_jumps.php">Jill Dyche</a> points out in her blog, we have demonstrated some appealing capabilities using existing components of the very comprehensive Microsoft stack. Yet we have not had a product directly and solely aimed at customers looking for MDM. As I often say we do not have a product with “MDM” stamped on the label.</p>

<p>This acquisition, then, does mark a new step. We will, in the future, have a product focused specifically on the MDM market:  not just rolling various pieces of platform technology but introducing new and unique capabilities for MDM. Stratature is an awesome acquisition for that goal. </p>

<p>On the other hand, the new story is not so <em>very </em>different from our consistent approach to operational and analytic data. We are continuing to build a comprehensive BI and operational platform, now including MDM, built with the Office Business platform and the SQL Server data platform. (We have more platforms than the Jackson Five, and a good thing too.) In this continuing evolution, Stratature is an outstanding acquisition as the technology already dovetails neatly into this framework.</p>

<p>So, as we progress, expect to see some exceptionally usable and effective capabilities emerge from the Stratature acquisition within the Office Business platform – look for the fastest time to the best value in the industry. In parallel, look for the SQL Server platform to grow as the best data platform for operational, analytic and, increasingly, master data.</p>

<p>It’s going to be a stimulating time for Microsoft, our customers, and everyone else with an interest in the MDM space.</p>

<p>One last note. These acquisition announcements rarely capture the full story of how the deal was done.  I’m not going to spill any beans, but I really must congratulate my friend and colleague <a href="http://sqlblog.com/blogs/knightreign/">Kirk Haselden</a> on the tenacity, commitment and dexterity he has shown in this acquisition. Kirk and I had many discussions on this topic: at times tense, (oops, was that a bean?) but ultimately friendly, fully supportive and, as ever, totally focussed on the customer value. On a purely personal note, it’s great to see him shepherd this exciting technology into the Microsoft fold. Great work, Kirk. It’s going to be a pleasure seeing the ripples this will cause!</p>]]></description>
<link>http://www.beyeblogs.com/donaldfarmer/archive/2007/06/stratature_and.php</link>
<guid>http://www.beyeblogs.com/donaldfarmer/archive/2007/06/stratature_and.php</guid>
<category></category>
<pubDate>Sun, 10 Jun 2007 19:22:01 -0700</pubDate>
</item>

<item>
<title>The true art of presenting data</title>
<description><![CDATA[<p>Last week I was either brave, foolish or egotistical enough to share some of my working ideas on presenting data and data solutions. In truth, I expect all three attributes played their part.</p>

<p>This week, however, you should see how a true master of the art performs. Having seen this video of <a href="http://www.youtube.com/watch?v=wUiGGzym_uQ">Demetri Martin</a> you may never create a graph with a straight face again.</p>]]></description>
<link>http://www.beyeblogs.com/donaldfarmer/archive/2007/05/the_true_art_of.php</link>
<guid>http://www.beyeblogs.com/donaldfarmer/archive/2007/05/the_true_art_of.php</guid>
<category></category>
<pubDate>Sun, 06 May 2007 16:48:33 -0700</pubDate>
</item>

<item>
<title>Presentation Skills for Business Intelligence - Nine Points of Roguery.</title>
<description><![CDATA[<p>A long post today; I hope it is interesting. </p>

<p>It’s funny how an idea can be dormant for ages, then suddenly crop up everywhere again. I used to have a simple method for structuring presentations – specifically, where I had to present results of an analysis, and often a related proposal. Most of us in the BI world do this regularly. I had not shared the technique much, but in recent weeks I have found myself describing it in detail several times, sitting down with hassled analysts and helping them pull together summary presentations.</p>

<p>The method is simply an outline that you can use to structure your presentation for best impact. I used to call it <em>The Nine Points of Roguery</em> – there is an old <a href="http://lutheran-hymnal.com/celtic/rj73.mid">fiddle tune </a>of that name – but please don't think I am suggesting that you should be roguish with your clients. Still, the method <em>does </em>describe nine points, as follows:<br />
<blockquote>•	Make three points that your audience will already understand<br />
•	Enhance and extend these three points<br />
•	Introduce three new findings from your work</blockquote></p>

<p>Easy! I’m going to use an example to illustrate some of the ideas. Imagine that I have been tasked with examining customer data quality for a client and coming up with some suggestions for improvement. Here goes …</p>

<p><u><strong>Make three points your audience already understands </strong></u><br />
You will connect best with your audience when you share common ground. By speaking briefly to a few familiar points, you show understanding of their needs. You can even make it clear that you know that they know. <em>Of course, with your business experience, you understand this even better than I.</em> Do not overdo it – flattery will get you nowhere – but it is good if your audience feels you address them as equals. You are all smart people, tackling a non-trivial issue.<br />
What three points should you make? Naturally, the details depend on context, but do choose engaging, substantive, topics. Get to the core of your audience’s problems. If you need more structure, try the following:</p>

<p><strong>Strategic impact</strong>. How does the current topic affect your audience’s long-term goals? How could a successful project help? What would failure look like? <br />
<em><strong>Example</strong>: Direct marketing is a critical component of your client’s customer acquisition strategy. Poor data quality wastes money by inappropriately marketing to the wrong customers. It also risks alienating the public and damaging the company’s reputation.</em></p>

<p><strong>A tactical concern.</strong> Do not spend too long on strategy: you will be aiming too high. What immediate concerns face your listeners? What decisions will they make today or tomorrow? Choose a tactical problem that concerns them directly.<br />
<em><strong>Example:</strong> From mergers and acquisitions, your client has multiple customer data sources. There is an immediate need for a single version of a customer across the enterprise.</em></p>

<p><strong>An obstacle. </strong>Why is the current issue not easy? Get into detail: is there a financial, technical or human barrier to success? Your listeners understand that difficulties exist. Still, you are reassuring them, in effect, that it is not stupid to be in their situation.<br />
<em><strong>Example</strong>:  Their most important source system is effectively legacy software. It has been used for many years, but is not compatible with more modern CRM or data quality applications.</em></p>

<p><br />
<u><strong>Enhance your three points</strong></u><br />
You and your audience now have a baseline of shared understanding. Next, you should show that you have explored their issues further. It can be tempting to pull a rabbit from your hat, dazzling your audience with some revelation that resolves their problems at one stroke. In fact, most often you will not have such an eye-opener. Even if you do, my advice is to wait. In all cases, you must build authority first. Your presentation is not the Sermon on the Mount. You cannot simply announce “Ye have read … but I say unto you …” unless your authority is unquestionable.<br />
So, develop your themes. When you present new findings later, the audience will appreciate your knowledge and experience. You can build this influence in several ways. Indeed, using a variety of techniques will be more appealing.</p>

<p><strong>Extend. </strong>Expand one of your original topics by considering how the matter changes with time, geography, scale or some other dimension. Was this problem easier in the past? Why? Does the passage of time have an effect, making things better, worse, smaller, or larger? Could the impact of this concern vary with geography? Perhaps the US division suffers more than the European division. You get the idea. You are building authority by going beyond the obvious.<br />
<em><strong>Example</strong>: Cleaning your client’s customer data is not a one-off action. Accurate operational data may be critical, but so is the ability to analyze customer behavior over time. Because customer data changes constantly, the client needs good quality historical data too.</em></p>

<p><strong>Contradict</strong>. I’m contrary by nature, so I like this one. However, regardless of my own predilections, finding contradictions is an excellent way in which to expand a topic. Few issues that you cover will be simply positive or negative. Your task here is to find the silver lining in the cloud, or vice versa. The underlying message is, naturally, that not only is the subject not simple, but also that your understanding of it is not simplistic.<br />
<em><strong>Example:</strong>  Creating a single version of your customer data from your client’s various mergers and acquisitions is a great vision. However, that single version will be an even more valuable asset than before. As such it may require additional administration, greater security, high availability and disaster recovery planning.</em></p>

<p><strong>Personalize.</strong> Your clients are human. (If not, mail me: I would love to know more.) People relate most directly to the needs and experiences of other people. So, in every presentation, be sure to expand at least one topic to cover personal impacts. How does this concern affect the daily work of the manager, the DBA, the salesperson? Use named individuals if you like, but at least ensure that your presentation is not abstract. It should be rooted in the effects on real people of the problems you are covering.<br />
<em><strong>Example:</strong>  It is increasingly difficult to find staff skilled in the company’s legacy applications. There remains an administrator, Julie, and one developer, Bob. Julie spends too much time preparing dumps of text files for integration with other applications. Bob is stretched developing new reports to keep up with changing compliance requirements. </em></p>

<p><br />
<u><strong>Introduce three new findings from your work</strong></u><br />
By now you have demonstrated an understanding of your audience’s needs. Further, you have shown experience and authority. It is now time for new results and recommendations. The structure of this section will, again, depend on the specific context. However, if you struggle to get that right, I would suggest that you invert one of the patterns we used earlier. Start with an insight or recommendation at a personal level, and then show new tactical and strategic ideas.</p>

<p><strong>Personal insight.</strong> Do your recommendations or discoveries directly affect individuals, whether employees or customers? If so, be prepared to talk to that very directly. Do not cover every impact: just choose one as an example. A well-chosen example can establish an authentic connection with the audience.<br />
<em><strong>Example: </strong> Everyone in your audience has received junk mail. Many will have received duplicate mailings from one company. From your research, you can show that missing out a good target may be less costly than exasperating a good target. So, you recommend not only consolidating and cleaning customer data, but also aggressively purging duplicates. By setting the proposal in a personal context, to which the audience can relate, you can make this case effectively. </em></p>

<p><strong>Tactical recommendation.  </strong>This should be the pivotal moment. It is when you make an actionable and material recommendation. You may have many tactical points – specific steps your client can take to achieve their strategic goals. Should you not present them all? I would suggest not: you risk overwhelming your audience. Better to choose the most impactful and representative tactic and speak to it well. Your proposal should relate to one of the issues you have raised earlier. This is also a good time to address ROI and costs associated with the problem and solution. Typically, it is easier to evaluate ROI for a tactical recommendation rather than an entire strategy. It may also be more credible to your audience.<br />
<em><strong>Example:</strong> You recommend migrating the legacy system to a new line-of-business or CRM application. Naturally, there are many sub-recommendations to be found in the report. However, overall costs can be estimated here, and supported with SWOT, cost-benefit or gap analyses.</em></p>

<p><strong>Strategic insight and observation</strong>. Now you can close the loop, referring back to your very first point. You have established common ground with your audience and demonstrated that you understand their strategic, tactical, even personal, concerns. You have specific recommendations based on your analyses. Now, you should show that your suggestions are not only tactical, but that they can have strategic impact too. Relate your point directly to the corporate strategies of your client. If your audience does not primarily comprise strategic decision makers, you can still make this point: just do not dwell on it for too long and be sure to relate any suggestion to their own work.<br />
<em><strong>Example:</strong> Direct marketing is still critical to your client’s customer acquisition strategy. With improved customer data quality you can significantly move beyond that approach. You can use your customer data to grow stronger customer relationships. Perhaps now, with a single version of the customer to hand, an effective loyalty scheme is practical across all the divisions of the enterprise, which previously poor data quality prevented. </em></p>

<p><br />
And that’s the outline. Nine simple points which help you balance the client’s current understanding with your new insights. If you try it out, do let me know how it works for you.<br />
</p>]]></description>
<link>http://www.beyeblogs.com/donaldfarmer/archive/2007/04/presentation_sk.php</link>
<guid>http://www.beyeblogs.com/donaldfarmer/archive/2007/04/presentation_sk.php</guid>
<category></category>
<pubDate>Mon, 30 Apr 2007 15:19:38 -0700</pubDate>
</item>

<item>
<title>Retailer found guilty of OLAP</title>
<description><![CDATA[<p>"It's the most flagrant case of aggregation I have ever seen," said the prosecutor.</p>

<p>Ok, I'm kidding. Yet today I did find a headline in the <a href="http://charlotte.com/123/story/85822.html">Charlotte Observer</a>: <strong>"Lenders accused of data mining</strong>." In this case, the financers in question were illegitimately searching a database of student borrowers. There is no doubt that the public have valid concerns over potential misuse of data, but it is awkward (for those who used the term in a rather more limited way) to see the good name of a useful technology tainted in the process.<br />
This new usage - data mining as database search – is easy to see in a <a href="http://feingold.senate.gov/~feingold/releases/03/01/2003116745.html">press release </a>from Senator Russ Feingold. Data Mining, he says, “is a broad search of public and non-public databases in the absence of a particularized suspicion about a person, place or thing.”<br />
Most vendors who, until recently, described their technology as <em>data mining </em>now talk about <em>predictive analytics</em>.  It is an attractive phrase for vendors and commentators, having a technical ring to it, without being intimidating. Currently I use this idiom myself, much more than data mining. Unfortunately, the term is not entirely accurate. Many uses of data mining, predictive analysis or knowledge discovery (an even rarer term these days) are primarily descriptive, to enable business analysts to understand their data better, without querying the model for predictions.<br />
As it happens, while I may regret the inconvenience that a useful term has drifted from my own usage, I see no reason to complain. I have no time for those who talk about the “real” meaning of words. The current meaning of a word or phrase is determined by its usage and I am not going to fight that. Between friends, I may continue to have a gay old time chatting about data mining; but in public, I need to be aware that the meaning has moved on.<br />
However, I do have to wonder what phrase the press will next appropriate to capture the public’s finely nuanced paranoia. I could take a guess. Senator Feingold points out that data mining in his sense requires “a combination of intelligence data and personal information, including an individual's traffic violations, credit card purchases, travel records, medical records, communications records, and virtually any information collected on commercial, public or private governmental databases.” I think we may have to start looking around for an alternative to CDI …<br />
</p>]]></description>
<link>http://www.beyeblogs.com/donaldfarmer/archive/2007/04/retailer_found.php</link>
<guid>http://www.beyeblogs.com/donaldfarmer/archive/2007/04/retailer_found.php</guid>
<category></category>
<pubDate>Mon, 16 Apr 2007 14:24:48 -0700</pubDate>
</item>

<item>
<title>Beer and diapers revisited - not just an urban legend</title>
<description><![CDATA[<p>These are days of turmoil and upheaval in the Business Intelligence world. So it’s good to hear of a story with a happy ending. And what could be happier than discovering that an urban legend is, in fact, true? You know the old tale of the supermarket chain mining their data to discover that sales of beer were uncannily linked to sales of diapers. And every time you hear it, someone is on hand to debunk it.</p>

<p>But wait. Of course, you already know, from coverage on the B-Eye-Network, that leading French retailer, Carremart, have invested heavily in a ground-breaking new BI system. Yesterday, at the end of March, they closed their first quarter’s books using the new technology and by early morning April 1st, they had already completed their first analysis. An inside contact mailed me immediately to say that they had indeed found a correlation between sales of beer and diapers. As you can imagine, I was hardly able to contain myself, so I telephoned Carremart at their Mond Rouge headquarters to interview Avril Poisson, the senior “Data Attendant.” You can read my interview with her below ...</p>

<p><em><strong>First Avril, tell us about your interesting job title. What is a Data Attendant?</strong></em>     As a Data Attendant, I look after the needs of the data.</p>

<p><em><strong>So you’re a Data Steward?</strong></em>     That’s such an old term. Yes, we used to have male “Data Stewards” and female “Data Hostesses” but we felt those terms set the wrong tone. “Data Attendant” describes my role better. The data has a long journey from source to destination. It’s my job to ensure that the right data is in the right place and is refreshed when necessary. </p>

<p><em><strong>As an Attendant, then, it’s your job to load as much data as you can into a highly compressed structure as quickly as possible?</strong></em>     Oh no, this is business data. We don’t compress it so much and it is always loaded first.</p>

<p><em><strong>And if the data has a lot of baggage with it?</strong></em>     Baggage? You must mean metadata. It is of course better if the metadata travels with the data. At least it should be tracked along with the data so they eventually can be linked. But honestly, nobody in this business cares that much – quite often the metadata arrives much later, and sometimes it gets lost altogether. </p>

<p><em><strong>Tell me about this breakthrough analysis, Avril. “Beer and diapers” turned out to be true after all?</strong></em>      Ah, not quite. I am sorry if you misunderstood. Perhaps a bad translation. However, our analytics experts were able to positively correlate sales of large quantities of beer with  - how you say? – adult diapers. </p>

<p><em><strong>Depends?</strong></em>     No, we’re absolutely certain. The more beer sold, the more adult diapers were needed.</p>

<p><em><strong>Interesting. What technology do you actually use in your analytics department?</strong></em>      We use BI tools from our database vendor, Debacle.</p>

<p><em><strong>Which consists of what exactly?</strong></em>      Today it’s a layer of Fiebel CRM over a relational bed, bound with ConFusion middleware and ELTLE from Sumoptions. All presented at the last moment with a glossy presentation layer of Hyperinflation.</p>

<p><em><strong>And have you found the Debacle solution to be well integrated?</strong></em>       Certainly. For example, all the consultants arrived on the same flight. They even shared a minivan to our headquarters – it’s good to keep costs down. We had hoped to see their data mining guy, but he missed the 9am flight from San Francisco due to heavy traffic on the 101. A shame, but you can’t predict these things.</p>

<p><em><strong>But six months on and you’re happy with final results?</strong></em>      To be honest, the results are not quite final. In fact, we’re still installing the Debacle system.</p>

<p><em><strong>Still installing? So, how did you arrive at your conclusions?</strong></em>      With the traditional methods. We copied and pasted everything into Excel and drew some charts.</p>

<p><em><strong>And are you confident in your analysis? </strong></em>     Good question. How you say? Depends …</p>]]></description>
<link>http://www.beyeblogs.com/donaldfarmer/archive/2007/04/beer_and_diaper.php</link>
<guid>http://www.beyeblogs.com/donaldfarmer/archive/2007/04/beer_and_diaper.php</guid>
<category></category>
<pubDate>Sun, 01 Apr 2007 11:05:27 -0700</pubDate>
</item>

<item>
<title>Business Intelligence networking in China</title>
<description><![CDATA[<p>I was enthusiastic to see Shawn’s announcement last week that, with Spring approaching, BeyeNETWORK is sprouting new global sites, especially a new portal for China: <a href="http://BeyeNETWORK.cn">BeyeNETWORK.cn</a> . Over the last couple of years I have been lucky to be more involved in the Chinese market. There are remarkable prospects there for BI as enterprises develop and mature through business cycles at an astounding rate. However, there are also unique social and cultural differences in the business world that western BI specialists may overlook. The challenge is how to succeed with BI in China with the best technology and practices, without compromising these unique needs.</p>

<p>Meeting this challenge will need a blend of experienced specialists (inevitably trained at first in the established IT world of western commerce), and insightful local entrepreneurs. Fortunately, the BI world is remarkably open and welcoming of new ideas; and local BI vendors in China are especially dynamic and innovative. I have high hopes that BeyeNETWORK.cn can capture that interaction live on the web. </p>

<p>As it happens, I had the opportunity to meet with one of these dynamic BI companies almost immediately after Shawn’s announcement, when I returned from TDWI. Frank Fu from the BI team at <a href="http://www.U-Soft.com.cn/en/">U-Soft</a> demonstrated their latest Aurora BI Server. Aurora integrates with SQL Server Analysis Services, and provides a full front-end for business decision makers to browse KPIs and reports in a very interactive manner. For example, in addition to seeing a timeline graph in the dashboard, a user could add various trendlines in the front-end. </p>

<p>Typically, when looking at end-user BI for the western market we think of this analytic functionality as ideally being on every desktop. Frank pointed out to me that in the case of some U-Soft customers, very few end-users (typically only the most senior management or officials) might actually use the end-user tools. This is one of the differences we need to consider – interactive decision making may still be centralized and quite hierarchical, with only more static reports being delivered to a broader audience. Speaking of reports, Aurora easily handles an issue several BI companies have run into in China – the need to deliver standardized reports in a rather complex, and rather inflexible, government format. (It’s tax time here in the US, of course, so we’re in no position to criticize such forms!)</p>

<p>Another area in which local BI vendors in China are expert is interfacing with regionally specialized ERPs: a rapidly growing market serviced by companies such as KingDee and Langchao, who have over 55% of the total ERP market. In other words, as the Chinese economy grows (ignore today’s temporary blip on the exchange!) then perhaps we may see these names become as dominant as SAP, Oracle Apps or Dynamics: at the very least expect them to be as familiar as Sage or Lawson. </p>

<p>Now, please don’t get the impression that all Chinese BI is highly specialized and locally focused. There are BI practices in China which are already delivering great results locally and to US enterprises. A great example is <a href="http://www.minesage.com/html/index.html">Minesage</a>, where my friend Dachuan Yang’s team not only implement excellent practices for the Chinese market – they also build some beautiful and innovative front-ends for cutting-edge uses, such as predictive analytics, segmentation and cluster analyses for the Microsoft adCenter online advertising platform.</p>

<p>So I do wish the BeyeNETWORK team great success in launching their Chinese portal. It’s certainly a market that we in the BI world cannot ignore, and one we have much to learn from. </p>

<p>再见! (Goodbye!)</p>]]></description>
<link>http://www.beyeblogs.com/donaldfarmer/archive/2007/02/business_intell.php</link>
<guid>http://www.beyeblogs.com/donaldfarmer/archive/2007/02/business_intell.php</guid>
<category></category>
<pubDate>Tue, 27 Feb 2007 22:47:40 -0700</pubDate>
</item>

<item>
<title>Predictive Analytics for the Excel user ... and that means for you.</title>
<description><![CDATA[<p>I was speaking at a BI event in Ireland a couple of weeks ago. After one session, one of our partners decided to collar me with some friendly criticism of Microsoft's well-known slogan <strong><em>BI for the masses</em></strong>. "You can have all the dashboards and reports you want," he said, "but until I can open up Excel and do cutting-edge analysis right there, it's all just blowing smoke. Excel is where I do my work, and that is where my BI needs to live to be efficient. It is all well and good having data mining or some such on the server, but it’s too complex. I need it in Excel, and I can't wait for that day to arrive." I could feel, every time he stressed <em><strong>Excel</strong></em>, that he was mentally prodding me, with good humour I must say, until I got the point.<br />
 <br />
I hope I did not look smug - but forgive me if I did. For the answer was simply, "You only have to wait a week or so. SP2 of SQL Server 2005 has exactly what you need." At the next session, I specifically demonstrated the data mining features. We took a table of customer sales data in Excel. With a few clicks, we had detected categories of customers - based on their demographics in that example. We labeled our rows with these new categories. Detecting outliers based on patterns discovered by the mining add-ins was only a few clicks more. We built predictions of future sales, directly inside Excel. We used goal seeking to find how to move customers from low-value categories to those with a higher potential.<br />
 <br />
I watched my friendly critic's expression out of the corner of my eye. He was grinning hugely. I wonder if he thought I stayed up all night coding the demo, as it so exactly matched what he asked for.<br />
 <br />
All these features and many more are in the SQL Server Service Pack 2 released today - February 19th. I believe the <a href="http://www.microsoft.com/sql/technologies/dm/addins.mspx"><strong>Microsoft Data Mining Add-ins for Excel</strong></a> are a real game-changer. They bring the power of predictive analytics to every desktop, along with clustering and segmentation, and some powerful data preparation features. You can read more about the add-ins by following the link, and you can download them freely too. <br />
 <br />
Why are these features so significant? For years now, business users have too often looked on data mining as cryptic - at worst almost mystical - in its complexity. It was the realm of experts in the backroom, applying intimidating algorithms for the greater benefit of all. This miasma of complexity has hung around largely because it has been so difficult for end users - business analysts mainly - to get their hands dirty with predictive analysis. Until now, specialists served the business user with carefully prepared and presented data mining results that were simply not in the business users’ domain of expertise to criticize. At best they could fall back on the words of the wonderful Scottish sculptor <a href="http://en.wikipedia.org/wiki/George_Wyllie">George Wyllie</a>, who memorably said, "Ye don't need to know how a thing works, to know that it is nae workin'."<br />
 <br />
I remember when OLAP had the same air of intimidating difficulty. Multi-dimensional hierarchical browsing sounded like something only Mr. Spock could really understand. When Microsoft brought OLAP to the desktop through Pivot Table Services, end users could start to grasp the power of the dimensional modeling method, and to gain some insight into the techniques and capabilities, and limitations, of the technology. <br />
 <br />
The Data Mining Add-Ins will do the same for predictive analytics. Just as not every analyst rushed off to become a dimensional modeler, I do not expect every analyst to become a data miner either. However, they will learn more about the techniques, will discover their usefulness and limitations, and will respect and trust the technology more because of it. Just as with OLAP, I expect there will be nay-sayers who maintain that there really is something special and arcane about predictive analysis: they'll still be saying that, even as the technology is commoditized on desktops all around them. Because I’ll tell you something else about these Add-Ins, too: they are beautiful to work with. The SQL Server Data Mining team have done a wonderful job with them - the user interfaces are as rich, informative, and nicely crafted as any other Excel feature. </p>

<p>Now, I’m not going to pretend that the Data Mining Add-Ins are a miracle application that somehow can deliver perfect and reliable predictive technology to the inexperienced user. There are still many issues of data preparation and model validation that we’ll explore in future posts. However, analysts close to the data, and already experienced with BI (as most readers of this blog will surely be) - such users will love these new tools; and I in turn would be very happy to hear more of your experiences with them.</p>]]></description>
<link>http://www.beyeblogs.com/donaldfarmer/archive/2007/02/predictive_anal.php</link>
<guid>http://www.beyeblogs.com/donaldfarmer/archive/2007/02/predictive_anal.php</guid>
<category></category>
<pubDate>Mon, 19 Feb 2007 19:51:25 -0700</pubDate>
</item>

<item>
<title>Data Quality and Data Stupidity</title>
<description><![CDATA[<p>"But this isn't about data quality, it's about data stupidity!" It was difficult to disagree. A customer had narrowly avoided dispatching several tons of architectural goods from Seattle to Portland, Maine rather than Portland, Oregon. The address cleansing system had done its best, and only the truck driver (I would prefer to say <em>lorry</em>, but must localize) noticed at the last moment. I had been telling the client for some time that his creaking data quality system was not adequate and that data quality is a lifecycle, a process, a metier, a way of life, a calling ... but for him it was just a function between processing orders and scheduling deliveries.</p>

<p>Here, on the b-eye network, some time ago, <a href="http://www.b-eye-network.com/blogs/imhoff/archives/2005/04/data_quality_or.php">Claudia Imhoff was blogging</a> about data quality. I liked this observation: "While there certainly is useful technology to help with data quality, so much of the assurance part is still heavily dependent on the human being (in this case, usually a business person) eyeballing the cleaned up data to verify its 'quality'."</p>

<p>Our truck driver did the eyeballing in this case, and it is good that he did. The costs of data stupidity can be high indeed, especially when businesses are so large that they cannot eyeball every order, or every action. Take the <a href="http://news.bbc.co.uk/2/hi/uk_news/scotland/north_east/6310633.stm">case</a> of the Halifax Bank of Scotland. Stephanie MacLaughlin asked them for a copy of her bank statement. She received the bank statements of 75,000 customers, delivered in 5 packages to her doorstep. HBOS said the incident was "isolated." I should hope so too. They might have added that it was stupid in the extreme.</p>

<p>(Readers of my post about Scottish names and addresses are perhaps wondering if the mistake is understandable. I assure you, dear reader, that in a country of 5 million, there are not nearly 75,000 Stephanie MacLaughlins.)</p>

<p>Of course, many businesses have quite careful rules in place for validating and catching small errors. However, in the cases described here, the errors were actually so great (each in their own way) that I doubt anyone had thought of putting a rule in place to catch them. The HBOS error went unchecked until poor Ms MacLaughlin received her mail. And there was no business rule preventing my customer from shipping to Portland, Maine, just a truck driver who, quite reasonably, did not think he could get there and back from Seattle in time for dinner. </p>

<p>The problem is how to check for all the various forms of stupidity that can arise. Often we can only do so retroactively. Indeed the mistakes are, as the bank said, isolated. Yet they are critically serious when they occur. I have no doubt that HBOS will now have a business rule implemented somewhere, anywhere, to check for such an issue again. </p>

<p>One approach I have been trying out with another customer - a car dealership - is to use a data mining model to check orders before shipment. The mining APIs in SQL Server 2005 are fast enough to be called on an order-by-order basis, and can be embedded into apps. The business simply builds a clustering model from its existing records, and new orders are checked to see if they fit the known clusters of order types. If they do not, they can be flagged and reviewed. The system is excellent for capturing stupidity. In fact, when run against historical records, it correctly identified over 90% of orders that had to be subsequently cancelled, returned or corrected due to errors, including orders for 1000s of parts instead of 10s, and many over- and under-billings.</p>
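
<p>A minimal sketch of that kind of check, in plain Python rather than the SQL Server mining APIs. Everything here is an assumption made for illustration: the field names <code>quantity</code> and <code>amount</code>, the sample history, the small k-means routine standing in for a real clustering model, and the 1.5 "slack" factor on the distance threshold.</p>

```python
import math

def features(order):
    """Log-scale the fields so 10 parts and 1000 parts land far apart.
    Assumes both values are positive."""
    return (math.log10(order["quantity"]), math.log10(order["amount"]))

def kmeans(points, k, iterations=20):
    """Plain k-means with deterministic seeding from the sorted points."""
    ordered = sorted(points)
    last = len(ordered) - 1
    centroids = [ordered[(i * last) // max(1, k - 1)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            groups[nearest].append(p)
        # Move each centroid to the mean of its group; keep it if the group is empty.
        centroids = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids

def build_checker(history, k=2, slack=1.5):
    """Learn clusters from past orders; return a function that flags misfits."""
    points = [features(o) for o in history]
    centroids = kmeans(points, k)
    # Threshold: the worst-fitting historical order, widened by a slack factor.
    radius = slack * max(min(math.dist(p, c) for c in centroids) for p in points)

    def is_suspicious(order):
        p = features(order)
        return min(math.dist(p, c) for c in centroids) > radius

    return is_suspicious

# Hypothetical history: small counter orders and occasional bulk orders.
history = [
    {"quantity": 8, "amount": 400}, {"quantity": 9, "amount": 450},
    {"quantity": 10, "amount": 500}, {"quantity": 11, "amount": 550},
    {"quantity": 12, "amount": 600}, {"quantity": 90, "amount": 4500},
    {"quantity": 100, "amount": 5000}, {"quantity": 110, "amount": 5500},
]
check = build_checker(history)
```

<p>With this sample history, an order for 10 parts at 520 fits a known cluster, while an order for 1000 parts, or one for 10 parts billed at 5000, sits far from both clusters and is flagged for review. A production system would use many more attributes and a properly validated model.</p>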

<p>I think this idea has legs. Perhaps my next book should be "Data Quality: The Stupidity Dimension."<br />
</p>]]></description>
<link>http://www.beyeblogs.com/donaldfarmer/archive/2007/01/data_quality_an.php</link>
<guid>http://www.beyeblogs.com/donaldfarmer/archive/2007/01/data_quality_an.php</guid>
<category></category>
<pubDate>Tue, 30 Jan 2007 00:00:00 -0700</pubDate>
</item>

<item>
<title>Surnames and nicknames, Mr. Trump, an unusual phonebook, and data quality.</title>
<description><![CDATA[<p>How much more can I cover in one post?</p>

<p>Jill Dyche's blog nearly always makes me smile. Her latest - about her surname and the problems of customer recognition - raised some familiar (and familial) issues for me. There is a BI point to this, but be prepared for diversions on the way.</p>

<p>My own family are largely from the Isle of Lewis, off the Scottish coast. In our district, the population mostly speaks Gaelic, and the large extended families result in a small number of surnames: MacLeod, MacKay, Morrison etc. First names are also regular: Donald, Angus, Murdo, Iain for men: Mary, Catriona, Margaret for women, although many women (my own mother included) have feminized male names, such as Donaldina, Angusina, Murdina, even Torquilina in extreme cases.</p>

<p>First diversion. In the USA most people on first acquaintance want to shorten my name to Don. I'm not a Don, I'm a Donald, and never think of myself otherwise. The Donald - Mr. Trump - is similarly resistant to the short form. That's no surprise to me for his mother, Mary MacLeod, was from Lewis, too. Why do we resist the short form? Well, it simply does not work in Gaelic. In Domhnall the mhn is almost silent - being pronounced more like Doh-al. So the familiar form is generally Dolaidh - pronounced Dolly. I cannot see Dolly Trump saying "You're fired!" with quite the same authority, somehow.</p>

<p>Back to our business intelligence point. Given the small number of first and last names, knowing someone's name on the Isle of Lewis may not help you find them at all. Searching for Donald MacLeod in Lewis, using the UK online phone book, returned 7 pages of results. Perhaps we could narrow it down by address. This would help, would it not, if delivering a package? Well, it might, except that in the countryside, houses are often numbered in the order they were built, rather than by their geographic order. So knowing Donald MacLeod lives at number 17 may still not help at all in finding the right person. </p>

<p>Of course, the community long ago found an answer to this - before the invention of the phone or the phone book. Most everyone, in addition to their name, has a patronymic or matronymic name that identifies their family. I am often referred to as Domhnall Dhomhnaill Bhain, (Donald of white-haired Donald) after my grandfather. However, even with this, some disambiguation may be required. As a result, many people - probably most men - have nicknames. I have friends known as Donald "Rufus" Murray and Iain "Pluto" MacIver. Now we're getting down to a system that can identify individuals more accurately. </p>

<p>Second diversion. Nicknames are often stuck to us when we are children, but some kids arrive at school without them. In such cases, with perhaps 5 Donald MacLeods in the class, the teachers feel a need of them, even if families don't. There used to be a newscaster in the UK known as Donnie B MacLeod. The B stood for nothing: except that on his first day at school, the teacher named his pupils Donnie A, Donnie B, Donnie C. Not very imaginative, but the name stuck for life.</p>

<p>Back to the identity problem. So, if the only way to disambiguate someone is to use their nickname or patronymic, how do you find them in the phone book? The answer is: create your own phone book, listing people by their nicknames. It's a wonderful publication, and it's available here: <br />
<a href="http://www.c-e-n.org/merchandise_2.htm">http://www.c-e-n.org/merchandise_2.htm</a></p>

<p>Now, to be fair, some people don't need nicknames. My father was not from the island, so Donald Farmer stands out, as would Donald Trump. And when an enterprising Asian shopkeeper, Wali Muhammed, arrived from Pakistan in the 60s, his mail was most likely delivered correctly.</p>

<p>The point, which I have to make to clients surprisingly often, is that identifying individuals is not simply a matter of finding a high similarity between two records, but of having a high confidence that the matched record really is the right one. </p>

<p>At the simplest, in SQL Server Integration Services, we have a string matching algorithm returning scores of both similarity and confidence. Donald MacLeod at number 19 may be matched with an incoming record with a very high similarity, but could really be a quite different Donald MacLeod at number 10: similarity is high, but confidence is low. However, Willie Mahamed may be matched with Wali Muhammed with a relatively low similarity, but a high confidence that we have found the right person.</p>
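
<p>The distinction can be sketched in a few lines of Python. This is not the Integration Services algorithm: similarity here is the <code>difflib.SequenceMatcher</code> ratio from the standard library, and confidence is an assumed stand-in, defined as how far the best candidate outscores the runner-up. The reference records are hypothetical.</p>

```python
from difflib import SequenceMatcher
from typing import NamedTuple

class Match(NamedTuple):
    record: str
    similarity: float
    confidence: float

def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in the range 0..1."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def best_match(incoming: str, reference: list[str]) -> Match:
    """Return the closest reference record with its similarity and confidence."""
    scored = sorted(((similarity(incoming, r), r) for r in reference), reverse=True)
    best_score, best = scored[0]
    runner_up = scored[1][0] if len(scored) > 1 else 0.0
    # Assumed confidence measure: 0 when the runner-up ties the winner,
    # approaching 1 when no other record comes close.
    confidence = 1.0 - runner_up / best_score if best_score > 0 else 0.0
    return Match(best, best_score, confidence)

# Hypothetical reference data: name plus house number.
reference = ["Donald MacLeod, 10", "Donald MacLeod, 19", "Wali Muhammed, 3"]

donald = best_match("Donald Macleod, 19", reference)
wali = best_match("Willie Mahamed, 3", reference)
```

<p>Here <code>donald</code> matches with higher similarity than <code>wali</code>, but with much lower confidence, because a second Donald MacLeod scores almost as well. The misspelled Wali has no serious rival, so a weaker similarity still gives a more trustworthy match.</p>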

<p>Naturally, real cases should be more complete, and therefore less error prone. But again and again, I find companies, otherwise quite serious about their customer records and CRM approach, using a small number of attributes, and a naive approach to matching. </p>

<p>And with that, I must go and send an email to Pluto.<br />
</p>]]></description>
<link>http://www.beyeblogs.com/donaldfarmer/archive/2007/01/jills_surname_m.php</link>
<guid>http://www.beyeblogs.com/donaldfarmer/archive/2007/01/jills_surname_m.php</guid>
<category></category>
<pubDate>Tue, 16 Jan 2007 21:30:00 -0700</pubDate>
</item>

<item>
<title>The Shock of the New</title>
<description><![CDATA[<p>What a start to the New Year! Few things could have made me smile more than to see the Intelligent Enterprise Reader's Choice Awards. Microsoft overall had a great showing, but for me the highlight was to see Microsoft taking the "Best ETL Software" gong. Let me tell you why it was especially sweet ...</p>

<p>About 6 years ago, I moved from startups to work at Microsoft, in the SQL Server BI group. The change was fascinating for many reasons - many good and some ... well, let's just say they were still fascinating. One thing was enjoyable and difficult in equal measure: I now had to deal with the somewhat fixed expectations of thousands of vocal users about what our product should do and how we should do it. The installed user base - especially one as large as Microsoft's - can overwhelm you with requests and suggestions. In fact it would be easy to be merely reactive and spend entire development cycles polishing scratches and filling dents as customers point them out.<br />
However, at some point, if you want to move the software, the business, and the customers along significantly, you have to take the plunge and make some radical changes. But you know that doing so will cause some pain to the existing, loyal and even enthusiastic users.<br />
In SQL Server Integration Services we faced this problem in bucketloads. The previous product, DTS, was lightweight but smart, and hugely popular. However, it was also very limited in its capacity to tackle the increasing demands placed on it by existing and potential customers. A complete, ground-up, no-line-of-code-left-standing rewrite was in order. And, as it turned out, the market needs forced architectural changes that made upgrading from the previous version almost impossible.<br />
Naturally, we tackled the resulting issues on a technical level to some extent - but more importantly we had to tell a compelling story to users that the changes and their pain were worth it.<br />
In that light, I can look to the award as an endorsement of the decisions we made, and the astonishing commitment of the team that drove the product along. Especially so, as the award comes from users who have to live and work with the software on a daily basis and therefore represent both the base we had to move, and the basis for our next versions. It's a great start to the New Year for the team, and for me personally a good time to look back and reflect on the long road, and all the various detours and diversions that we passed on the way.</p>]]></description>
<link>http://www.beyeblogs.com/donaldfarmer/archive/2007/01/the_shock_of_th.php</link>
<guid>http://www.beyeblogs.com/donaldfarmer/archive/2007/01/the_shock_of_th.php</guid>
<category></category>
<pubDate>Thu, 04 Jan 2007 16:30:00 -0700</pubDate>
</item>

<item>
<title>Some books for the Christmas list to get you thinking</title>
<description><![CDATA[<p>Long time no blog, and no good excuse really - except for too much work and travel! Travelling - however tiring - does have its advantages. One I have enjoyed recently is getting some time to read on the plane. Here are a few of my recent browsings.</p>

<p>Readers of the b-eye network need no prompting, I'm sure, that <em><strong>Customer Data Integration</strong></em> by Jill Dyche and Evan Levy (Wiley) should be on your bookshelves already. If you have not read it, don't wait for Christmas - I would buy it now, as you will find it immediately valuable and actionable. And enjoyable too.</p>

<p>Of broader scope are two other books I have been digging into, and I recommend these for more expansive moments. They are:</p>

<p>:: <em><strong>Ambient Findability</strong></em> by Peter Morville (O'Reilly)<br />
:: <em><strong>Niche Envy</strong></em> by Joseph Turow (MIT Press)</p>

<p>Peter Morville may be known to you already as the author of <strong><em>Information Architecture for the World Wide Web</em></strong>. His latest book explores the concept of <em>findability</em>. For Morville the world is heading towards a state of complete findability - anyone can find anything at any time. (He has not seen my sock drawer!) Findability emerges from a nexus of usability, information architectures and the literacy of the end user - in this case their literacy with the information systems that marshal the knowledge of the world. It's an easy-going read, and has some real food for thought. Like any such book, its description of the latest interfaces and technologies was out-of-date by the time it hit the press (Microsoft and Google are innovating on the web at a remarkable pace) but the underlying thesis is compelling.</p>

<p>Niche Envy promises to be quite controversial, as it tackles issues of favouritism, trust and privacy surrounding database marketing techniques. At the National Centre for Database Marketing conference in Orlando last week, this book was well displayed in the bookstore, and there were some rather nervous discussions amongst attendees who had read it: I suspect they did not wish to see it become a bestseller. I would say that this is a must-read book for anyone using databases to target better customer relationships, or to market to end-users more precisely. For all your good intentions, you may just find you are stepping into areas with which the public are not at all comfortable. Not all the arguments in this book are well-founded, but it captures very well the difficulties that mass customization and direct marketing pose for customers and practitioners alike.</p>

<p>If you do read these, I would be fascinated to hear your views on them too.</p>]]></description>
<link>http://www.beyeblogs.com/donaldfarmer/archive/2006/12/some_books_for.php</link>
<guid>http://www.beyeblogs.com/donaldfarmer/archive/2006/12/some_books_for.php</guid>
<category></category>
<pubDate>Wed, 20 Dec 2006 17:30:00 -0700</pubDate>
</item>


</channel>
</rss>