<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
<channel>
<title>Frank Dravis On Data Quality</title>
<link>http://www.beyeblogs.com/dravis/</link>
<description>Firstlogic&apos;s Data Quality Blog hosted by Frank Dravis</description>
<language>en</language>
<copyright>Copyright 2011</copyright>
<lastBuildDate>Wed, 10 Oct 2007 17:12:12 -0700</lastBuildDate>
<generator>http://www.movabletype.org/?v=3.33</generator>
<docs>http://blogs.law.harvard.edu/tech/rss</docs> 


<item>
<title>MDM Publishing</title>
<description><![CDATA[<p><strong>October 10, 2007</strong></p><p>Recently, I gave a presentation at the Gartner MDM Summit on how enterprise information management (EIM) enables master data management (MDM) solutions. A common question asked by the attendees at the conference was, "What are the capabilities of MDM systems to publish data?" This question was asked in a general sense, of course, because the capability of any given MDM system to publish data will vary from solution to solution. </p><p>Let's look at this question and define the term "publish." The purpose of an MDM system is to create, maintain, distribute, and otherwise manage master data. "Distribute" is the key word here. How do you get the information out of an MDM system and distribute it to the applications that need it?</p><p><a href="http://eimblog.businessobjects.com/dravis/2007/10/10/mdm-publishing.html">Click to Read More</a>]]></description>
<link>http://www.beyeblogs.com/dravis/archives/2007/10/mdm_publishing_9.php</link>
<guid>http://www.beyeblogs.com/dravis/archives/2007/10/mdm_publishing_9.php</guid>
<category></category>
<pubDate>Wed, 10 Oct 2007 17:12:12 -0700</pubDate>
</item>

<item>
<title>Business Objects to Acquire FUZZY!</title>
<description><![CDATA[<p><strong>September 10, 2007</strong></p><p>In case you missed the press release, Business Objects has announced the intent to acquire FUZZY! Informatik.  FUZZY!, which I will abbreviate to Fazi, is the leading data quality software vendor in Germany. Amongst other capabilities, Fazi has address assignment engines for 31 different European regions &ndash; from Portugal in the west to the Baltic States and Russia in the east. More importantly, these engines use address data from each country to process customer records in double-byte (Unicode) format and in the native language and computer encoding of that region. </p><p><a href="http://eimblog.businessobjects.com/dravis/business-objects-to-acquire-fuzzy.html">Click to Read More</a>]]></description>
<link>http://www.beyeblogs.com/dravis/archives/2007/09/business_object_2.php</link>
<guid>http://www.beyeblogs.com/dravis/archives/2007/09/business_object_2.php</guid>
<category></category>
<pubDate>Mon, 10 Sep 2007 18:58:47 -0700</pubDate>
</item>

<item>
<title>Unstructured to Structured Data</title>
<description><![CDATA[<p><strong>August 21, 2007</strong></p><p>Some of you may have heard that Business Objects acquired Inxight, a text analytics software company. In short, text analytics, in this case, is the ability to scan through unstructured text, perhaps a whole book, and parse the inherent facts, entities, and actions into a series of structured records. Yes, before the experts start emailing me, there is little truly unstructured text. A newspaper article, for example, has substantial structure (title, paragraphs, sentences, context, diction, etc.) that the human mind uses to interpret what is said. However, the data is not parsed out into nice discrete database fields that we can search on, analyze, and otherwise manipulate.</p><p>The possibilities are endless when you integrate a robust text analyzer with an ETL application so that the output of the analyzer is routed into a sophisticated data classifier, categorizer, and standardizer.</p><p><a href="http://eimblog.businessobjects.com/dravis/2007/8/21/unstructured-to-structured-data.html">Click to Read More</a>]]></description>
<link>http://www.beyeblogs.com/dravis/archives/2007/08/unstructured_to_2.php</link>
<guid>http://www.beyeblogs.com/dravis/archives/2007/08/unstructured_to_2.php</guid>
<category></category>
<pubDate>Tue, 21 Aug 2007 19:46:04 -0700</pubDate>
</item>

<item>
<title>SaaS, How it Fits</title>
<description><![CDATA[<p><strong>August 16, 2007</strong></p><p>As part of my job, I conduct market research. One of the many needs for that research is portfolio management, which answers the question: How does the entire mix of products and services a company offers fit together in a cohesive strategy to serve the market? SaaS (software as a service), and the opportunity to offer it, is part of today's enterprise software portfolio. The evolving needs of data administrators and IT directors pose new challenges for software, and I think now is a good time to explore -- especially from the EIM perspective -- how SaaS is a complementary solution to deploying software on-site at a customer's facility.</p><p><a href="http://eimblog.businessobjects.com/dravis/2007/8/16/saas-how-it-fits.html">Click to Read More</a>]]></description>
<link>http://www.beyeblogs.com/dravis/archives/2007/08/saas_how_it_fit_1.php</link>
<guid>http://www.beyeblogs.com/dravis/archives/2007/08/saas_how_it_fit_1.php</guid>
<category></category>
<pubDate>Thu, 16 Aug 2007 15:07:48 -0700</pubDate>
</item>

<item>
<title>Yes, I&apos;m Back Blogging</title>
<description><![CDATA[<p><strong>August 6, 2007</strong></p><p>For those of you who subscribe to my site via RSS, or other feeds, you have probably noticed that I am now blogging again, after a 10-month hiatus. Why start again, you ask? Simply put, I missed it. I like to write and share new ideas in the field of enterprise information management (EIM), and blog posts are a good place to prototype or test article ideas, white paper content, etc. So if you are looking for new thoughts, this is the place to come.  It's why we have established the indexing system (along the right side bar). You can actually use the blog as a repository for retrievable EIM content. </p><p><a href="http://eimblog.businessobjects.com/dravis/2007/8/6/yes-im-back-blogging.html">Click to Read More</a>]]></description>
<link>http://www.beyeblogs.com/dravis/archives/2007/08/yes_im_back_blo_2.php</link>
<guid>http://www.beyeblogs.com/dravis/archives/2007/08/yes_im_back_blo_2.php</guid>
<category></category>
<pubDate>Mon, 06 Aug 2007 21:14:24 -0700</pubDate>
</item>

<item>
<title>Householding and Hierarchy Management</title>
<description><![CDATA[<p><strong>July 30, 2007</strong></p><p>Yes, they mean the same thing. The primary difference between the two is where the concepts originated and the two different data management processes that initiated them.  </p><p>Householding originally comes from a direct marketing concept where data quality software is used to match across customer records and consolidate the records into market groups. </p><p><a href="http://eimblog.businessobjects.com/dravis/2007/7/30/householding-and-hierarchy-management.html">Click to Read More</a>]]></description>
<link>http://www.beyeblogs.com/dravis/archives/2007/07/householding_an_2.php</link>
<guid>http://www.beyeblogs.com/dravis/archives/2007/07/householding_an_2.php</guid>
<category></category>
<pubDate>Mon, 30 Jul 2007 20:27:53 -0700</pubDate>
</item>

<item>
<title>ETL for EAI</title>
<description><![CDATA[<p><strong>July 20, 2007</strong></p><p>I was interviewing a client the other day in the course of a market research project when the customer explained an interesting application of our ETL product, BusinessObjects&trade; Data Integrator. I regularly brief industry analysts and attend strategy sessions with them, so I had the opportunity to discuss this client's unique process with a well-recognized industry analyst from Germany, and he admitted he'd not heard of his clients deploying an ETL solution in such a manner.<br /></p><p><a href="http://eimblog.businessobjects.com/dravis/2007/7/20/etl-for-eai.html">Click to Read More</a>]]></description>
<link>http://www.beyeblogs.com/dravis/archives/2007/07/etl_for_eai_1.php</link>
<guid>http://www.beyeblogs.com/dravis/archives/2007/07/etl_for_eai_1.php</guid>
<category></category>
<pubDate>Fri, 20 Jul 2007 21:35:16 -0700</pubDate>
</item>

<item>
<title>Two Sides of the Metadata Coin</title>
<description><![CDATA[<p><strong>September 26th, 2006</strong></p><p>I've been wondering why, after languishing for the past ten years or so, metadata management has gained traction in the market this past year. And I think I know why. There are two parallel concepts that comprise the sides of what I call the metadata coin. Quite literally, it is the coin that is funding the development and expansion of metadata solutions.</p><p>In years past, metadata was the domain of data architects. It helped them understand what data they had and how it related to the sources and operations from which it came and to which it went. At the first mention of metadata, business users would roll their eyes and head for the conference room door. Surely metadata was the stuff of arcane IT discussions best had out of earshot of people driving and running the business. </p><p><a href="http://eimblog.businessobjects.com/dravis/2006/9/26/two-sides-of-the-metadata-coin.html">Click to Read More</a>]]></description>
<link>http://www.beyeblogs.com/dravis/archives/2006/09/two_sides_of_th.php</link>
<guid>http://www.beyeblogs.com/dravis/archives/2006/09/two_sides_of_th.php</guid>
<category></category>
<pubDate>Tue, 26 Sep 2006 13:39:51 -0700</pubDate>
</item>

<item>
<title>Just Plain Smart</title>
<description><![CDATA[<p><strong>August 14, 2006</strong></p><p>I received a letter in the mail this week from my broker, a national equity trading firm. The letter simply says "Dear Client: We have been notified of a change in your address. If this address change is incorrect, notify us immediately by making the necessary corrections on the lines below and returning this form."</p><p>In the past, stretching back to at least the early '80s when I first started in the EIM space, financial services firms were loath to change customer data. </p><p><a href="http://eimblog.businessobjects.com/dravis/2006/8/16/just-plain-smart.html">Click to Read More</a>]]></description>
<link>http://www.beyeblogs.com/dravis/archives/2006/08/just_plain_smar.php</link>
<guid>http://www.beyeblogs.com/dravis/archives/2006/08/just_plain_smar.php</guid>
<category></category>
<pubDate>Wed, 16 Aug 2006 05:04:23 -0700</pubDate>
</item>

<item>
<title>Framework Synergies, Data Marts and Profiling</title>
<description><![CDATA[<p><strong>July 31, 2006</strong></p><p>A long-standing challenge has been accessing and processing data from applications beyond the built-in data processes supported by those applications. Take SAP, PeopleSoft, or Siebel, for example. These applications work well for their intended purposes, but what happens when you want to access and process the data in ways not supported by the vendor? Data profiling is a case in point. </p><p><a href="http://eimblog.businessobjects.com/dravis/2006/8/1/framework-synergies-data-marts-and-profiling.html">Click to Read More</a>]]></description>
<link>http://www.beyeblogs.com/dravis/archives/2006/08/framework_syner.php</link>
<guid>http://www.beyeblogs.com/dravis/archives/2006/08/framework_syner.php</guid>
<category></category>
<pubDate>Tue, 01 Aug 2006 16:14:20 -0700</pubDate>
</item>

<item>
<title>Just Plain Dumb #8, My Work is Never Done</title>
<description><![CDATA[<p><strong>July 18, 2006</strong></p><p>In my <a href="http://weblogs.firstlogic.com/dravis/2005/8/24/enough-is-enough-just-plain-dumb-5.html" target="new">August 24, 2005 entry </a> I cited a prominent bank that had sent me 15 "priority" offers for a credit card. Well, the count is now up to 32. Yes, I have received 17 additional credit card offers from the same card company. For a period of time there was a lull and I thought "Thank goodness, they've either finally seen the data quality light or run out of marketing budget." </p><p><a href="http://weblogs.firstlogic.com/dravis/2006/7/18/just-plain-dumb-8-my-work-is-never-done.html">Click to Read More</a>]]></description>
<link>http://www.beyeblogs.com/dravis/archives/2006/07/just_plain_dumb.php</link>
<guid>http://www.beyeblogs.com/dravis/archives/2006/07/just_plain_dumb.php</guid>
<category></category>
<pubDate>Tue, 18 Jul 2006 13:48:41 -0700</pubDate>
</item>

<item>
<title>New Survey Supports Old Assertion</title>
<description><![CDATA[<p><strong>July 10, 2006</strong></p><p>One statistic I've heard cited periodically about data quality is the "20% scrap and rework number." That is, 20% of the time we information workers spend managing and using our data is focused on verifying the accuracy of the data and fixing or compensating for defects. In an IQ maturity assessment conducted at a past employer, I was personally able to verify the number. The participants in the assessment said that, on average, they spent 19.7% of their time reconciling data quality issues. But my assessment pool was small: fewer than 20 people, all from one client. </p><p><a href="http://weblogs.firstlogic.com/dravis/2006/7/10/new-survey-supports-old-assertion.html">Click to Read More</a>]]></description>
<link>http://www.beyeblogs.com/dravis/archives/2006/07/new_survey_supp.php</link>
<guid>http://www.beyeblogs.com/dravis/archives/2006/07/new_survey_supp.php</guid>
<category></category>
<pubDate>Mon, 10 Jul 2006 16:22:23 -0700</pubDate>
</item>

<item>
<title>The Value of Data Profiling to BI</title>
<description><![CDATA[<p><strong>July 21, 2006</strong></p><p>A key function of data profiling is continuous monitoring and the ability to generate trend reports. Trend reports graphically show how data quality is either progressing or regressing over time, through specific measurements. In parallel, BI operations are often run repetitively, on a regular schedule, with the results of the analysis published automatically. But what happens when an extraneous event causes the quality of the underlying data to degrade? How will the BI analyst know this? More importantly, when will the analyst know this, before or after the next iteration of BI reports is released?</p><p><a href="http://weblogs.firstlogic.com/dravis/2006/6/23/the-value-of-data-profiling-to-bi.html">Click to Read More</a>]]></description>
<link>http://www.beyeblogs.com/dravis/archives/2006/06/the_value_of_da.php</link>
<guid>http://www.beyeblogs.com/dravis/archives/2006/06/the_value_of_da.php</guid>
<category></category>
<pubDate>Fri, 23 Jun 2006 18:22:41 -0700</pubDate>
</item>

<item>
<title>They Need It All</title>
<description><![CDATA[<p><strong>June 9, 2006</strong></p><p>A common debate in project management circles is how much data quality functionality is needed by "lower end" users. In this case, we define a low-end user based on data volumes.  More specifically, a person who works for a firm with a million or fewer records that need to be processed. </p><p>Oftentimes, to meet the needs of these lower-end customers, vendors try to cut functionality so as to drop their price point. The problem is that whether the customer has 100,000 records or 1,000,000,000 records, they need the same functionality.</p><p><a href="http://weblogs.firstlogic.com/dravis/2006/6/9/they-need-it-all.html">Click to Read More</a>]]></description>
<link>http://www.beyeblogs.com/dravis/archives/2006/06/they_need_it_al.php</link>
<guid>http://www.beyeblogs.com/dravis/archives/2006/06/they_need_it_al.php</guid>
<category></category>
<pubDate>Fri, 09 Jun 2006 14:44:29 -0700</pubDate>
</item>

<item>
<title>Profile the Target</title>
<description><![CDATA[<p><strong>May 15, 2006<br /></strong><br />A challenge many data integrators face is the hidden business rules lying concealed in the data, ready to spring up like a Jack-in-the-Box at the first hint of integration. Take a target column of product part numbers. The published (defined) format of the part numbers is nine digits starting with an 8 or a 7, followed by a possible dash "-" and then an alpha character. A part number following the published business rule would look like 856756723-a. So the data modeler and the data integrator design both their target column and ETL schema around the published definition, and then run the integration process. The problem is, and we'll use an EII implementation as an example, there are 20 different data sources feeding data through the integration engine. The coding for collecting and "staging" the part numbers allows for a 12-character string. Remember, EII builds virtual data sets, which means requested information is retrieved in real time from the 20 root data sources when needed. </p><p>A second problem is, and you may have guessed </p><p><a href="http://weblogs.firstlogic.com/dravis/2006/5/15/profile-the-target.html">Click to Read More</a>]]></description>
<link>http://www.beyeblogs.com/dravis/archives/2006/05/profile_the_tar.php</link>
<guid>http://www.beyeblogs.com/dravis/archives/2006/05/profile_the_tar.php</guid>
<category></category>
<pubDate>Mon, 15 May 2006 21:16:12 -0700</pubDate>
</item>


</channel>
</rss>
