<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
<channel>
<title>DataGoddess</title>
<link>http://www.beyeblogs.com/datagoddess/</link>
<description>This Blog site provides feedback to the media, as well as to tool developers.  Even a data goddess knows that there is ALWAYS something to learn from each other!</description>
<language>en</language>
<copyright>Copyright 2010</copyright>
<lastBuildDate>Mon, 11 Jan 2010 06:15:00 -0700</lastBuildDate>
<generator>http://www.movabletype.org/?v=3.33</generator>
<docs>http://blogs.law.harvard.edu/tech/rss</docs> 


<item>
<title>The Integrated Marriage: Modeling and Profiling</title>
<description><![CDATA[<p>For the road-weary integration warrior, the announcement of the union of data modeling and data profiling is no CNN late-breaking news update.  </p>

<p>Most integration projects understand the need to map all data sources to a data model, which is later (or concurrently) mapped to the target data source.   </p>

<p>Unfortunately, the data modelers (or ETL designers) are usually working without a net.  They either interview system experts for the definitions of the source columns and get 'memory recollection of data content'...or they run one-off queries (read as:  a time-consuming, inefficient task) to check the values of a data source.</p>

<p>Recommendation:   <br />
Identify a scope of work on the project, then: <br />
  <strong>complete data VALUE profiling</strong> for all of the data source objects in that scope of work <br />
  <strong>sort the data sources at two levels</strong>:<br />
   = on whether the data source table/file is Reference Data, Transaction Data, or Summary Data <br />
   = on the data values found during profiling <br />
<em>Most profiling tools can match a Char(3) column to another Char(3) column, but they can NOT match a Char(3) column to an Integer column, even when both hold the values 1 - 999.</em><br />
It is only at this point that you really understand what you have.  NOW the data modeling / data mapping work should begin.   </p>
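
<p>One way around that limitation is to compare the values themselves rather than the declared column types.  What follows is only a minimal Python sketch of that idea, not how any particular profiling tool works: the function names <code>normalize</code> and <code>value_overlap</code> and the sample values are all made up for illustration.  It treats digit-only strings as integers, so a Char(3) value of '042' and an Integer value of 42 count as the same value.</p>

```python
def normalize(value):
    """Map a raw column value to a type-neutral form (hypothetical helper)."""
    text = str(value).strip()
    # Digit-only strings become integers, which drops leading zeros,
    # so '007' from a Char(3) column compares equal to 7 from an Integer column.
    return int(text) if text.isdecimal() else text.upper()


def value_overlap(source_values, target_values):
    """Return the fraction of distinct source values also found in the target."""
    source = {normalize(v) for v in source_values}
    target = {normalize(v) for v in target_values}
    if not source:
        return 0.0
    return len(source.intersection(target)) / len(source)


# Made-up sample data: a Char(3) code column and an Integer key column.
char3_codes = ["001", "042", "999", "ABC"]
integer_keys = [1, 42, 999, 500]

# Three of the four distinct Char(3) values also appear in the Integer column.
print(value_overlap(char3_codes, integer_keys))  # 0.75
```

<p>A high overlap is only a hint that two columns may describe the same thing; someone who knows the data still has to confirm it before the mapping goes into the model.</p>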

<p><br />
Throwing data modelers and data profilers at an integration project is not wedded bliss unless your data profiler courts the project appropriately and presents VALUES to the betrothed data model.  But done right...your integration project can turn into the long-term relationship you always dreamed about!</p>]]></description>
<link>http://www.beyeblogs.com/datagoddess/archive/2010/01/the_integrated.php</link>
<guid>http://www.beyeblogs.com/datagoddess/archive/2010/01/the_integrated.php</guid>
<category></category>
<pubDate>Mon, 11 Jan 2010 06:15:00 -0700</pubDate>
</item>

<item>
<title>Don’t Shoot the Report Writer</title>
<description><![CDATA[<p>I thought he had it!!!</p>

<p>Mike Garrett's March 2, 2006 publication of "Don't Shoot the Report Writer" grabbed my attention!<br />
<a href="http://www.b-eye-network.com/newsletters/inmon/2458">http://www.b-eye-network.com/newsletters/inmon/2458</a></p>

<p>I was expecting him to finally publish the truth!   That all the hoopla about BI tools only made it harder on the report developers!   And I guess he implied it....but let me add to it!  </p>

<p>Mike is right.  The devil is in the details and the top folks do want the detail.   The gap in the story is the level of effort to get you there.    </p>

<p>There is a disconnect of expectations.  The BI tool sales folks tell upper management how quickly you can create cubes and reports.  They don't tell upper management how long it is going to take to <strong>educate</strong> your information users in order to develop the kind of reports and cubes that will transform the way they do their business.   </p>

<p>So...the report developers walk in...are told to transform all of the old reports into the new tool (by the end of the week)...The users are not wowed....and then they hate the tool.    </p>

<p>Is there a way around this issue?  Sure is.  <br />
You have to have a strong BI reporting development management team that supports a solid development process.   <br />
You have to educate the business users to think differently about how they see their business.  <br />
And you have to get them to help show you when aggregations are useful, and when they just clunk up the operations.  </p>

<p>When the process is done well...it is a beautiful experience.  When it's not...they'll be shooting at your Report Writers! </p>

<p><br />
</p>]]></description>
<link>http://www.beyeblogs.com/datagoddess/archive/2006/03/Dont_Shoot_the_.php</link>
<guid>http://www.beyeblogs.com/datagoddess/archive/2006/03/Dont_Shoot_the_.php</guid>
<category></category>
<pubDate>Thu, 02 Mar 2006 09:45:00 -0700</pubDate>
</item>

<item>
<title>Copy-rama...a data perspective</title>
<description><![CDATA[<p>Copy-Rama….Making Copies !!!</p>

<p>Was it back in the 1980s when the first big push was to go paperless?   </p>

<p>The cost of making copies in a company was huge!   Every company was desperately trying to figure out how to reduce its copy costs by 50-80%.    Taking one document and distributing it to 600 employees at a cost of $.0001 per copy was costing the company 6 cents.    The problem was….that this 6 cent process occurred so many times a year….paper costs were costing the company hundreds of thousands of dollars.   That small, small cost on one process was driving business process changes that would help reduce costs.   </p>

<p>The reaction?   Automate.   If there was a way to capture and disperse that information on-line…then there would be no need for copying that piece of paper.    EVERY department in every company across the country quickly responded and started its own technology solution.   Finance departments across the US bought software packages to put their information on-line.   Marketing departments built really cool systems from scratch.   Sales departments may have had someone in their group with the know-how to make a quick little application that did what they needed right now.   Every department responded as quickly as it could to solve the immediate problem of being able to view and update information on-line….and it was good.</p>

<p>Cut to 2006.   IT maintenance budgets need to be cut.   Companies are finding that they are spending 80% of their IT costs not on new development…but on maintenance of existing systems.   They are paying to keep all of their app servers up and going.  They are paying to keep all of their database servers up and going.  They are spending huge budgets to understand and maintain processes for business continuity in case this server or that db goes down. <br />
   <br />
So what’s the next step?   Lower Costs:  Recycle Data</p>

<p>Data is the one thing in your business that really doesn’t change over time….yet it is probably lowest on the totem pole when making architecture and project deadline decisions.    Data is the most re-usable asset a company has, with the highest potential long-term return.    Example:  Airline Industry.   What kind of information was necessary to know in the 1940s?   Planes, Customers, Ticket Prices, Ticket Sales, Flight Schedules, Flight Arrivals, Pilots, Flight Attendants, etc.   And today…what kind of information is necessary to run the business?   Planes, Customers, Ticket Prices, Ticket Sales, Flight Schedules, Flight Arrivals, Pilots, Flight Attendants, etc.   You can take almost any industry and do the same thing: Retail, Manufacturing, Insurance….the kind of data collected doesn’t change much over time.   </p>

<p>Positive or negative…database technology has not changed much in the past 25 years.    There were relational databases back then that are still running today.    So, if the business hasn’t changed much,…and the technology hasn’t changed much…it is probably a place where any long term investment would be safest.      Get this part right, re-use it, leverage the investment.</p>

<p>Makes sense….but if it is that easy...why hasn’t it been done?    <br />
Buying a package with the newest whiz-bang features impacts a department <em>right now</em>, with immediate results that meet immediate business needs.   Trying to integrate a shelf product (prior to SOA) into an existing database has been <em>cost prohibitive</em>.   <br />
<u>Example:</u><br />
Cost of shelf product: $750,000.  <br />
Annual DB cost to support that app: $150,000.   <br />
Cost of integrating that shelf product into an existing DB: $500,000.   <br />
Time to implement the project if a new DB is put in place?  Under 3 months.   <br />
Time to implement the project if the app is integrated into an existing DB?   6-12 months. <br />
  <br />
A quick look at this example project…and putting in the new DB seems like the easiest, most no-brainer cost and time effective decision this project manager has ever made!  Result?  A decision to COPY THAT DATA!!!!   <br />
But the reality is…in 5 years…that database is going to be there.   <br />
In 10 years that database is going to be there….and so are the maintenance costs…and so are all the added dependency complexities when changes need to take place or when an upstream system goes down.   <br />
Now…all of those projects along the way, each based on the same “easiest, most no-brainer cost and time effective decision this project manager has ever made!”, are <u>crippling IT.</u></p>

<p></p>

<p>When I say crippling...I mean...COST!   Figure out the cost of one database.   One database may annually cost your company $350,000 for hardware, licenses, backup, recovery, performance monitoring, tuning, etc.   <br />
Now…each time you make a copy of that data and put it on another database (say 4 departments each copy 20% of it), that cost multiplies….not by the full $350,000, because each one only took 20% of the data….but let’s say each copy runs $150,000 in maintenance per year.   By having 4 departments take copies of only 20% of the data…you have more than doubled your annual data storage costs: $350,000 becomes $950,000.   (I could get distracted here by also asking you to assess the costs of change requests to downstream systems, or the cost of data quality as data is passed and manipulated between systems…but I’ll leave that for another day).</p>
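
<p>The arithmetic behind that claim can be checked in a few lines.  This is only a back-of-the-envelope Python sketch using the example figures from this post; the function name <code>annual_cost_with_copies</code> is made up for illustration, and real costs will vary by company.</p>

```python
def annual_cost_with_copies(primary_cost, copy_cost, copies):
    """Total yearly cost of one primary database plus its departmental copies.

    Hypothetical helper: assumes every copy costs the same to maintain.
    """
    return primary_cost + copy_cost * copies


# Example figures from the post (illustrative, not measured data).
PRIMARY_DB_COST = 350_000    # hardware, licenses, backup, tuning, etc.
COPY_DB_COST = 150_000       # yearly maintenance for each 20% copy
DEPARTMENT_COPIES = 4

total = annual_cost_with_copies(PRIMARY_DB_COST, COPY_DB_COST, DEPARTMENT_COPIES)
ratio = total / PRIMARY_DB_COST

print(total)            # 950000
print(round(ratio, 2))  # 2.71
```

<p>Four partial copies push the yearly bill from $350,000 to $950,000, which is roughly 2.7 times the cost of the original database.</p>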

<p>Would it be hard to figure out how to make all (or at least most) of the systems re-use the same data from its initial data source?   YES.    But the alternative...doing it the easy way...is not necessarily (and not usually) the best long-term solution.   <br />
<em>The long-term impact of retiring one database at a time is like the return you get when you start paying off a high-interest bill.   Once you pay it off...you have more money that month to apply to the next bill.   The more you pay off...the more available cash you will have on hand. </em></p>

<p>The ideal is to record that a sale was made…and have every process access that row of information when needed.    Put it on one database and re-use it.   Yes…it is idealistic.   But…if you set that as a vision, not a directive…you can at least let project managers walk in that direction.  In the above example…removing just 25% of the lowest cost databases would return 31% of your total data storage budget.  Remove 25% of the highest cost databases and you would save 70% of your data storage budget!</p>

<p><br />
We’re not talking just 6 cents a copy here any more.  Copy-rama!!   Making Copies!!  Costing the company money! <br />
</p>]]></description>
<link>http://www.beyeblogs.com/datagoddess/archive/2006/01/Copy-ramaperiod.php</link>
<guid>http://www.beyeblogs.com/datagoddess/archive/2006/01/Copy-ramaperiod.php</guid>
<category></category>
<pubDate>Thu, 12 Jan 2006 13:00:00 -0700</pubDate>
</item>

<item>
<title>ERwin Wish List Item - Subject Area Selection</title>
<description><![CDATA[<p>Creating the proverbial Enterprise Data Model is an awe-inspiring job...not for the faint of heart.   Good tools make this job so much easier.   Computer Associates AllFusion Modeling Suite provides a very feature-rich data modeling tool to help simplify the process.   However, with every tool set...comes a list of wish items to improve the process.    Today's item that I would like to see added?   The "Select all entities associated to Subject Area" Feature.</p>

<p>What I want:   I want to be looking at the "Main Subject Area" and have a feature that allows me to select a given Sub-Subject Area (customer subject area, product subject area, etc) and then have all of the entities in that sub-subject area be highlighted on this main subject area.</p>

<p>The perceived benefits of this feature:  Large Model Navigation, Large Model Organization Maintenance.  </p>

<p>Being able to ask that a certain sub-subject area be highlighted would allow me to see whether I have color coded each of its entities the same way.  It might also help me find rogue entities that are not grouped with like entities.  </p>

<p>I would like to be able to 'Mass Assign' them a certain color. Hmm...but that sounds like a second wish item for the ERwin developers! </p>]]></description>
<link>http://www.beyeblogs.com/datagoddess/archive/2006/01/ERwin_Wish_List.php</link>
<guid>http://www.beyeblogs.com/datagoddess/archive/2006/01/ERwin_Wish_List.php</guid>
<category></category>
<pubDate>Tue, 10 Jan 2006 15:00:00 -0700</pubDate>
</item>


</channel>
</rss>