<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
<channel>
<title>SidelongGlance</title>
<link>http://www.beyeblogs.com/SidelongGlance/</link>
<description>re: ETL and metadata</description>
<language>en</language>
<copyright>Copyright 2008</copyright>
<lastBuildDate>Mon, 16 Jun 2008 07:51:28 -0700</lastBuildDate>
<generator>http://www.movabletype.org/?v=3.33</generator>
<docs>http://blogs.law.harvard.edu/tech/rss</docs> 


<item>
<title>Update</title>
<description><![CDATA[<p>Although the install has completed, the tool has not yet been rolled out.</p>]]></description>
<link>http://www.beyeblogs.com/SidelongGlance/archive/2008/06/update_3.php</link>
<guid>http://www.beyeblogs.com/SidelongGlance/archive/2008/06/update_3.php</guid>
<category></category>
<pubDate>Mon, 16 Jun 2008 07:51:28 -0700</pubDate>
</item>

<item>
<title>update</title>
<description><![CDATA[<p>The Business Objects data import works!<br />
The installation will proceed.</p>]]></description>
<link>http://www.beyeblogs.com/SidelongGlance/archive/2008/05/update_2.php</link>
<guid>http://www.beyeblogs.com/SidelongGlance/archive/2008/05/update_2.php</guid>
<category></category>
<pubDate>Wed, 07 May 2008 09:30:00 -0700</pubDate>
</item>

<item>
<title>Semantic Revolution</title>
<description><![CDATA[<p>Thanks to tye from PerlMonks for posting this to the CB. <br />
http://www.herecomeseverybody.org/2008/04/looking-for-the-mouse.html <br />
 <br />
I think the author makes a good point.  I know that when I carve out time for the participatory web, my TV time is what gets sacrificed.</p>]]></description>
<link>http://www.beyeblogs.com/SidelongGlance/archive/2008/04/semantic_revolu.php</link>
<guid>http://www.beyeblogs.com/SidelongGlance/archive/2008/04/semantic_revolu.php</guid>
<category></category>
<pubDate>Mon, 28 Apr 2008 12:45:00 -0700</pubDate>
</item>

<item>
<title>Wow, almost a year</title>
<description><![CDATA[<p>Just an update on where I am with my pursuit of the Metadata Grail. <br />
 <br />
My company still has not implemented the Metadata Manager tool from Informatica.  The folks working on that aspect have 1 more week to get this accomplished before they will have to drop it to work on other tasks.  We are currently holding up our upgrade to Informatica 8.5 in order to give the installer time to work out issues with Metadata Manager, but the upgrade contains bug fixes that we sorely need and can no longer wait for. <br />
 <br />
As far as I know, the MM tool is installed and works, but not for everything we wanted it to do.  We have successfully imported data from PowerDesigner models (the Sybase data modeling tool) and run the X-Connects for the Informatica repository, but we have been unsuccessful with the Business Objects X-Connect, and the Informatica X-Connect runs very slowly (on the order of 16 to 24 hours). <br />
 <br />
Our co-workers are eager to make use of the metadata and have plans to centralize our development as much as possible in the modeling tool.  There are 2 teams working on using data from PowerDesigner to feed designs to Business Objects to use in creating Universes. <br />
 <br />
Anyone else out there doing these things?</p>]]></description>
<link>http://www.beyeblogs.com/SidelongGlance/archive/2008/04/wow_almost_a_ye.php</link>
<guid>http://www.beyeblogs.com/SidelongGlance/archive/2008/04/wow_almost_a_ye.php</guid>
<category></category>
<pubDate>Fri, 25 Apr 2008 14:45:00 -0700</pubDate>
</item>

<item>
<title>Metadata and Semantics</title>
<description><![CDATA[<p>Hey, I know the last entry was full of excitement, but it proved to be short-lived.</p>

<p>Even though I have a license for the Ab Initio GDE, I haven't had a chance to use it.  Nor has the promised chance to install Metadata Manager for Informatica 8 materialized.</p>

<p>Informatica has been giving us problems again.  The new version, even though it's been out for a while, has some problems, mostly with the transition from older versions to the new one.  That has kept me from installing Metadata Manager so far.</p>

<p>In the meantime, I was able to attend the 3rd Annual Semantics Conference in San Jose last week.  It was really helpful.  I learned that Ontology and Semantics are useful to me as the next level of abstraction above Metadata Modeling, so I will be able to use them to find the common points between Ab Initio's EME and Informatica's MM and my own metadata model.  More will be coming soon on the details of the conference.  In the meantime, here is a link.</p>

<p>http://www.semantic-conference.com/default.html</p>]]></description>
<link>http://www.beyeblogs.com/SidelongGlance/archive/2007/05/metadata_and_se.php</link>
<guid>http://www.beyeblogs.com/SidelongGlance/archive/2007/05/metadata_and_se.php</guid>
<category></category>
<pubDate>Tue, 29 May 2007 10:15:00 -0700</pubDate>
</item>

<item>
<title>More Fun</title>
<description><![CDATA[<p>It's been 2 weeks since I started trying to install the new version of Informatica on my desktop.  We are going to 8.1.1 in Development first, and I am going to try to get Metadata Manager working on that version instead of continuing to fight to get it to work on 7.1.4.  Back in August, I was trying to install SuperGlue (Metadata Manager's old name) on my desktop, and Adobe SVG (the plug-in they use for zooming in on data lineage diagrams) was giving me problems.  It would not install on my machine, no matter what.  So I ended up getting a new box, and that solved the problem.<br />
So, here I am in March trying to install the Client tools for 8.1.1.  They install OK, but when I try to set up a domain for connecting to the servers I get a cryptic error about not being able to save the domain information.  After I tried every which way from Sunday for a week, the help folks at Informatica came to the conclusion that I needed to re-image my machine!  That is finally over, and I have 8.1.1 running.</p>

<p>Next week, I will try setting up the Metadata Manager again.</p>

<p>It's hard to type with crossed fingers...  ;-)</p>

<p>Oh!  And I almost forgot!  I have a temporary license for Ab Initio's GDE now!</p>

<p>Ab Initio's Enterprise Meta Environment is so different from Informatica.  The EME is set up on a corporate server (we are in a subsidiary); and after only a small amount of tweaking, it can show a complete Data Lineage diagram from our ultimate source system, through Informatica, to Ab Initio, to the ultimate target and into a Business Objects Universe!</p>

<p>(I heard that overuse of exclamation points is the sign of a diseased mind!)</p>

<p>It was very exciting.</p>]]></description>
<link>http://www.beyeblogs.com/SidelongGlance/archive/2007/03/more_fun.php</link>
<guid>http://www.beyeblogs.com/SidelongGlance/archive/2007/03/more_fun.php</guid>
<category></category>
<pubDate>Fri, 30 Mar 2007 16:30:00 -0700</pubDate>
</item>

<item>
<title>Ab Initio Profiler</title>
<description><![CDATA[<p>I read the documentation for the Profiling tool from Ab Initio.  I've worked with their ETL tool before, but have not had the chance to use the profiler.  They are possibly coming for a demo in the next month or so, and if it looks useful, we will try to convince the people using Ab Initio to pick up the Profiler as well.</p>

<p>Apparently, it not only collects statistics, but also analyzes dependencies and correlation across and within data sets, and it can generate transformation code to use for validation.</p>

<p>Looking through the docs, the only thing I can think of that it's missing is trend analysis, and that may just be because I missed it.</p>

<p>-------------<br />
Added Feb. 1, 2007:</p>

<p>We had the demo; see the next entry.  I realized that I didn't say much about the Profiler in that entry, and it seems more relevant here.</p>

<p>The Ab Initio Data Profiler is better than others I have seen.  You can profile an entire dataset, or a sample thereof (with parameters), or you can profile a single run of a graph against that data, allowing you to schedule the heavy load of a full profile, and then keep it up to date piece by piece.</p>

<p>We did run into problems with our initial profile runs on some data, but that is because the server we can run the profiler from is in a different location from the data, and we were filling up the pipe between them.  We were able to throttle the process to keep from hogging the pipe, but that makes things run quite slowly.</p>

<p>What was really impressive was that we got a preview of the demo about 3 or 4 weeks prior, and my manager asked about analyzing trends between runs of the incremental profile, and using that trending to determine, in the map, whether we wanted to load the data to the target.  They weren't able to do it at the time, but when they brought the tool in for the demo this week, they had a way to accomplish that task.</p>

<p>I know, it was probably something they were already working on, but it was impressive nonetheless.</p>]]></description>
<link>http://www.beyeblogs.com/SidelongGlance/archive/2007/02/ab_initio_profi.php</link>
<guid>http://www.beyeblogs.com/SidelongGlance/archive/2007/02/ab_initio_profi.php</guid>
<category></category>
<pubDate>Thu, 01 Feb 2007 09:45:00 -0700</pubDate>
</item>

<item>
<title>Ab Initio Demo</title>
<description><![CDATA[<p>The folks from Ab Initio were here Tuesday to give us a demo of their product.</p>

<p>It was a great demo.  I have worked with Ab Initio before, and all the great things I remember liking about it were still there, plus a few new features.</p>

<p>After working with both Informatica and Ab Initio, I have to say that I prefer Ab Initio for several reasons.<br />
Ab Initio is easier to work with.<br />
Informatica has the PowerCenter Designer, where you put together the mappings of the data from source to target, and enter the business rules for transformation.  But to define a source there is another "tab" that you have to switch to, and the targets are defined on another tab, and the reusable pieces of maps (called mapplets) are on yet another tab.  Then the connection between a logical source definition in the tool and the actual table/file in the computer is in a completely separate tool, called the Workflow Manager.  Then, when you run the thing, there is yet another application that you use to monitor the execution.  If you want to see the data you are operating upon, you have to use some other tool to get to it (I use TOAD to see the data we have on Oracle, Teradata SQL Assistant for Teradata, and QMF for our DB2 source).  If you want to see what each individual component is up to?  You are out of luck.  *If* you can get the debugger to run, you might be able to track what the components are up to, but beware: if you have too many components on the map, the debugger won't even load.   In Informatica, parallelism is left to the physical hardware implementation at the network level. (That is, to get parallelism in Informatica, you require more than one server to run it on.)</p>

<p>With Ab Initio, it is all in one place.  You drag and drop the components in the window of the Graphical Development Environment (GDE); there are database table components and file components and myriad transformation components.  Then, to define the input columns, you can import DDL, or double click on the component and use a text edit mode to enter it.  Same thing for the output definitions.  Plus, you can also put a URL or the database connection information into the component and actually browse the data you are defining the DDL for.  The tool gives you a visual indication when the information in the component is not complete enough for the graph to execute.  Then, when you are ready to run it, click on a button and it starts to execute; no window switching.  Plus, you can see the record counts as each component processes, so you can see which components are working as expected, or if there are bottlenecks in your process.  And that's not even mentioning the debugger, which is a quantum leap beyond the execution data.  Another huge advantage of Ab Initio is that parallelism is built in from the graph level.  With Ab Initio, there are components to "Partition" and "Departition" a data flow, which allow the programmer to insert parallelism in the process at the graph level.  And between graphs or between checkpoints within graphs you can land your data flows to "multifile" data sets on disk.</p>

<p>There is even more, especially when it comes to metadata and the tools related to it, but I'm afraid my posting would stray too far toward a rant at that point.</p>

<p>We will not be going whole hog to Ab Initio, because we have significant sunk cost in Informatica.  But we will be using Ab Initio as much as we can (sharing an implementation at the Corporate office); especially for metadata.</p>]]></description>
<link>http://www.beyeblogs.com/SidelongGlance/archive/2007/02/ab_initio_demo.php</link>
<guid>http://www.beyeblogs.com/SidelongGlance/archive/2007/02/ab_initio_demo.php</guid>
<category></category>
<pubDate>Thu, 01 Feb 2007 09:45:00 -0700</pubDate>
</item>

<item>
<title>Data Modeler&apos;s Workbench</title>
<description><![CDATA[<p>This is Steve Hoberman's book.  I met Steve at the same conference where I met Dave Hay, in February this year.  Steve was very practically minded, and a great counterpoint to Dave Hay's abstractness.  My partner and I used the Meta Data Bingo idea to great success.  "Ensuring High-Quality Definitions" was quite helpful as well.  I heartily recommend this book to anyone working on a Meta Data repository.</p>]]></description>
<link>http://www.beyeblogs.com/SidelongGlance/archive/2006/11/data_modelers_w.php</link>
<guid>http://www.beyeblogs.com/SidelongGlance/archive/2006/11/data_modelers_w.php</guid>
<category></category>
<pubDate>Fri, 17 Nov 2006 12:00:00 -0700</pubDate>
</item>

<item>
<title>Data Model Patterns, A Metadata Map</title>
<description><![CDATA[<p>This is Dave Hay's new book.  It's an application of Information Technology techniques to the business of Information Technology.  I was able to hear about the book from Mr. Hay at a seminar in February before the book came out, and his presentations were so compelling that I had to have it.  The book was definitely worth the wait.</p>

<p>It's organized along the lines of the Zachman Architecture Framework, with a few tunings of his own, and proceeds by building data models; the first one is a data model of a data model (hmm, maybe that's why Aristotle is on the cover).</p>

<p>It sounds esoteric, but I found it invaluable in refining my own metamodel and catching some nuances I had missed.</p>

<p>For a flavor of what the book is like, check out Mr. Hay's articles in TDAN, especially the one about <a href="http://www.tdan.com/i033ht02.htm">Modeling Baseball Cards</a>.</p>]]></description>
<link>http://www.beyeblogs.com/SidelongGlance/archive/2006/11/data_model_patt.php</link>
<guid>http://www.beyeblogs.com/SidelongGlance/archive/2006/11/data_model_patt.php</guid>
<category></category>
<pubDate>Thu, 16 Nov 2006 16:15:00 -0700</pubDate>
</item>

<item>
<title>Update</title>
<description><![CDATA[<p>Hi Y'all,</p>

<p>Sorry it's been such a long time since posting; here's the catch-up.</p>

<p>Things are still moving along on the Metadata front; we now have SuperGlue from Informatica, and in between deliverables on other projects I try to get it running.</p>

<p>It's been installed twice; the first time I had it to the point where I had a couple of Xconnects defined and running, although not on a scheduled basis.  The Xconnects are the canned set of mappings that ship with SuperGlue and import the data from various sources, the most important of which is the Informatica Repository itself.</p>

<p>Unfortunately, not long after, we lost a disk in the array where our databases were kept, and the DBAs decided not to recover the SuperGlue repository.</p>

<p>So I started from scratch.</p>

<p>I am now to the point where it's time to start defining the sources and then configuring the Xconnects again, but other projects are eating up my time again.</p>

<p>On the Ab Initio front; one of my co-workers went to a seminar recently on Ab Initio; and also spoke to a manager in another part of our company who is receptive to the idea of sharing his instance with us.</p>

<p>We will see which avenue bears the most fruit.</p>]]></description>
<link>http://www.beyeblogs.com/SidelongGlance/archive/2006/11/update_1.php</link>
<guid>http://www.beyeblogs.com/SidelongGlance/archive/2006/11/update_1.php</guid>
<category></category>
<pubDate>Mon, 13 Nov 2006 09:45:00 -0700</pubDate>
</item>

<item>
<title>update</title>
<description><![CDATA[<p>Hi Y'all,</p>

<p>Things are moving along finally.</p>

<p>After an energy infusion from the DebTech conference, and an enforced lull from metadata concerns (enforced that is by other work deadlines rearing up)</p>]]></description>
<link>http://www.beyeblogs.com/SidelongGlance/archive/2006/04/update.php</link>
<guid>http://www.beyeblogs.com/SidelongGlance/archive/2006/04/update.php</guid>
<category></category>
<pubDate>Fri, 14 Apr 2006 11:45:00 -0700</pubDate>
</item>

<item>
<title>Meta-Data and Data Modeling Summit</title>
<description><![CDATA[<p>Orlando Florida, 2/27 - 3/1/2006<br><br></p>

<p>DebTech put on a nice conference.<br><br></p>

<p>I was able to hear Claudia Imhoff speak on "Sarbanes-Oxley - A Legislative Mandate for Business Intelligence!";<br>Tom Haughey gave a great tutorial on Advanced Modeling;<br> Bonnie O'Neil gave us examples and tools for building a Data Dictionary with "whatever is lying around".<br><br />
Dave Hay waxed philosophical on "Data Model Patterns" and "Advanced Data Model Patterns";<br> and Steve Hoberman brought us back to concreteness.<br><br />
Malcolm Chisholm gave a couple of good talks on Repositories (Build or Buy) and "Master Data Management vs. Reference Data Management".<br><br />
David Schlesinger gave us some great ideas of how to facilitate a sane regulatory compliance and information security strategy with some simple metadata techniques.<br><br></p>

<p>I found myself wishing it would last a little longer.<br><br></p>

<p>Bonnie O'Neil infected me with her energy and enthusiasm.<br><br></p>

<p>Dave Hay took us back to Aristotle's "Posterior Analytics" and modeled semantics and made me want his new book "Data Model Patterns - A Metadata Map" (not to be confused with his 1995 title "Data Model Patterns: Conventions of Thought").  He also humbled me with his dazzling display of Origami expertise.<br><br></p>

<p>Thank you all for an edifying and fun experience.</p>]]></description>
<link>http://www.beyeblogs.com/SidelongGlance/archive/2006/03/Meta-Data_and_D.php</link>
<guid>http://www.beyeblogs.com/SidelongGlance/archive/2006/03/Meta-Data_and_D.php</guid>
<category></category>
<pubDate>Tue, 07 Mar 2006 09:30:00 -0700</pubDate>
</item>

<item>
<title>Introduction</title>
<description><![CDATA[<p>I was invited to open a blog at this site, and so I did.  I am not entirely sure what I will post, or how often, but be patient and I hope that I will be able to provide some useful information, if only by negative example. ;-)<br><br><br />
I work at a small company that was bought by a much larger company, and partnered with a similar company.  In other words, Company A and Company B were in the same industry, but weren't direct competitors due to the regional nature of their businesses; they were bought by Company C and combined into a nationwide company.<br><br></p>

<p>Company A had a data warehouse, and company B had a large Operational Data Store (ODS); Company C has a very large data warehouse that wants data from A and B.  I support the collection of data from companies A+B to provide reporting, as well as trying to supply the data requirements of C.<br><br></p>

<p>We use Informatica to transport data between DB2, Oracle and Teradata, and it's working pretty well for us.  Our data volume from the source system (yes, there is only 1) is getting to the point where we are going to have to do something to keep up.  <br><br></p>

<p>Metadata-wise, we are still in the infancy stage.  We recognize (well, maybe not *all* of us) that metadata is important, but maybe not how important it is.  I have a metadata model that I would like to implement, but without tools it's going to be too much work for 1 person.<br><br></p>

<p>I guess that's deep enough for an introduction.  I will expand on what we are doing, what challenges we are facing, and some of the solutions we are pursuing in later posts.</p>]]></description>
<link>http://www.beyeblogs.com/SidelongGlance/archive/2005/10/Introduction.php</link>
<guid>http://www.beyeblogs.com/SidelongGlance/archive/2005/10/Introduction.php</guid>
<category></category>
<pubDate>Tue, 25 Oct 2005 14:15:00 -0700</pubDate>
</item>


</channel>
</rss>