<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
<channel>
<title>Sharpening Stones, Walking on Coals</title>
<link>http://www.beyeblogs.com/sharpeningstones/</link>
<description>Business Intelligence - How business motivations, technology, and ingenuity are learning to work together to benefit businesses, customers, and communities.</description>
<language>en</language>
<copyright>Copyright 2008</copyright>
<lastBuildDate>Sun, 17 Aug 2008 20:46:50 -0700</lastBuildDate>
<generator>http://www.movabletype.org/?v=3.33</generator>
<docs>http://blogs.law.harvard.edu/tech/rss</docs> 


<item>
<title>Talking Tech</title>
<description><![CDATA[<p>I've just started having meetings with our internal storage team to discuss how we can optimize throughput between the SAN and our Oracle data warehouse.  Turns out that there have been times over the past few months when our test/UAT server has nearly saturated the fiber channels connecting that server to the SAN!</p>

<p>During both that conversation with the storage team and another casual one I was having with our Unix server team, I mentioned the idea that our BI/DW team is starting to have discussions with some data warehousing specialists.  I'm paraphrasing, but "ugh!" was the universal response.  "We don't like <em>proprietary </em>solutions here.  We've standardized on X storage vendor and Y servers.  We don't want to support that."</p>

<p>We're only in preliminary conversations about bringing in a specialty data warehouse solution, but I certainly want to maintain a good relationship with our server and storage teams.  Any advice on how to help them see the logic and design behind specialty data warehouses?</p>]]></description>
<link>http://www.beyeblogs.com/sharpeningstones/archive/2008/08/talking_tech.php</link>
<guid>http://www.beyeblogs.com/sharpeningstones/archive/2008/08/talking_tech.php</guid>
<category></category>
<pubDate>Sun, 17 Aug 2008 20:46:50 -0700</pubDate>
</item>

<item>
<title>Triadic Continuum (Setting Context)</title>
<description><![CDATA[<p>Review the <a href="http://www.amazon.com/Practical-Peirce-Introduction-Continuum-Implemented/dp/0595441122">book </a> and other sources for more information.</p>

<p>One of the key operations in the Triadic Continuum data structure is "setting context," and I would argue that this key operation might be prohibitive in anything but the simplest examples.  I haven't fully studied and analyzed real-world scenarios, but here's my argument:</p>

<p>First, remember that in the Triadic Continuum, data itself is never duplicated.  Every "sub component node" in the tree structure contains no observed data, but rather a pointer to the "sensor nodes" where the data is actually stored.  As in a columnar database, this is great for reducing the amount of data being stored.</p>

<p>Second, think about the ideas of context, constraint, and focus.  These are three key concepts in getting information from the Triadic Continuum.  Context is the idea that you have to provide some level of initial boundary to the question being asked: "I care about sales," for example.  I can imagine this initial definition of context being brutally difficult.  Imagine if you had a single index in a database that represented each column value that appeared anywhere in any column of any table.  Here are a couple of concerns:<br />
<ul>
<li>That gets to be a pretty big single tree or hash table or other structure to search through at the beginning of each query.</li>
<li>Tracing from the sensor node back to the associated subcomponent nodes may result in a very large list of nodes to use in the context: potentially millions or hundreds of millions, depending on how broad the context is.</li>
</ul>
It also seems to me that another huge variable in the performance of the Triadic Continuum is the order in which pieces of data are observed and stored in the structure.  Think about the observation of customer information:  If I look at gender first, then the first level of the tree is nice and small, just two nodes (unless you work in health care, then there are seven genders).  If I look at zip code and then gender, then the first level is about 43,000 nodes and the next level is 86,000 nodes.  If I search for "Male" in the first scenario, I get 1 node; in the second I get 43,000 nodes.  Likewise, if I search for "82601," I get 2 nodes in the first scenario and 1 node in the second.  I'm sure there are some significant and important implications to this fact.</p>
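
<p>To check my own arithmetic, here's a minimal Python sketch of those two orderings.  This is not the book's data structure, just a plain tree where each node is identified by the path of values leading to it.  The function names are mine, the zip codes are synthetic stand-ins (43,000 consecutive numbers that happen to include 82601), and I'm assuming every zip/gender combination has been observed.</p>

```python
# Toy model only: a node is the tuple of values on its path from the root.
# Assumption: every combination of values has been observed.
GENDERS = ["Male", "Female"]
ZIPS = [f"{z:05d}" for z in range(50_000, 93_000)]  # 43,000 synthetic zip codes


def build_levels(*attribute_values):
    """Return one list of nodes per tree level, in observation order."""
    levels = []
    paths = [()]
    for values in attribute_values:
        paths = [path + (v,) for path in paths for v in values]
        levels.append(paths)
    return levels


def nodes_matching(levels, value):
    """Count the nodes whose own (last) value equals `value`."""
    return sum(1 for level in levels for node in level if node[-1] == value)


gender_first = build_levels(GENDERS, ZIPS)
zip_first = build_levels(ZIPS, GENDERS)

print([len(level) for level in gender_first])  # [2, 86000]
print([len(level) for level in zip_first])     # [43000, 86000]
print(nodes_matching(gender_first, "Male"), nodes_matching(zip_first, "Male"))    # 1 43000
print(nodes_matching(gender_first, "82601"), nodes_matching(zip_first, "82601"))  # 2 1
```

<p>Both trees bottom out at the same 86,000 nodes, but the number of nodes you have to start from for a given search value swings from 1 to 43,000 depending on which attribute was observed first.</p>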

<p>If anyone wants to pay me something equivalent to my current salary to go back to school and study this more thoroughly, let me know!</p>]]></description>
<link>http://www.beyeblogs.com/sharpeningstones/archive/2007/12/triadic_continuum_setting_cont.php</link>
<guid>http://www.beyeblogs.com/sharpeningstones/archive/2007/12/triadic_continuum_setting_cont.php</guid>
<category></category>
<pubDate>Fri, 21 Dec 2007 20:00:00 -0700</pubDate>
</item>

<item>
<title>Triadic Continuum (Prologue)</title>
<description><![CDATA[<p>One of the Vice Presidents that I work with sent me an email the other day with the subject line "Is this something we should be looking at?"  He's no pointy-haired boss, but a subject line like that, followed by a cut and paste of an article from an industry magazine certainly does bring images from Dilbert to mind.  As I read <a href="http://www.dmreview.com/specialreports/2007_44/10000157-1.html">the article</a>, those images only got stronger.</p>

<p>This particular article is about something called the triadic continuum - yes, sounds like something out of Star Trek, doesn't it?  Still, it is interesting - especially if you have a PhD in cognitive science.  There's a <a href="http://www.amazon.com/Practical-Peirce-Introduction-Continuum-Implemented/dp/0595441122">book </a>about this invention that goes into more detail for those of you who are curious.</p>

<p>After a couple of back and forth comments in which I tried to look smarter than I am (and considered pitching the idea that if my boss wanted to send me back to school for my PhD, I'd be happy to go), this VP ended with: "But they said <span style="font-style: italic;">simple, scalable, universal</span>."  Yeah, right there on the package!  ;)</p>

<p>My snide remark (being in a healthcare field) was going to be something like:<br />
<span style="font-size:85%;"><br />
<span style="font-family:courier new;">Right, and so is DNA!</span><br />
<span style="font-family:courier new;">  Simple - only 5 compounds in the whole thing!!</span><br />
<span style="font-family:courier new;">  Scalable - from a flea to a whale!!</span><br />
<span style="font-family:courier new;">  Universal - it is a defining characteristic of life!!</span></span></p>

<p>I didn't send that response, yet.<br />
DNA is simple, universal, and scalable, too.</p>

<p>You'll find Dan Linstedt blogging about the Triadic Continuum in <a href="http://www.danlinstedt.com/forums/index.php?showtopic=426">the future</a>, I'm sure.  I'm still busy reading <a href="http://www.amazon.com/Practical-Peirce-Introduction-Continuum-Implemented/dp/0595441122">the book</a>, but my gut tells me there are some very interesting things to be learned from this data structure.  I don't think it's <strong>the</strong> solution to moving along Ackoff's data-information-knowledge-wisdom curve, but I've already started thinking about how to leverage some of the algorithms from the book in relational data models (even though that's what the inventors would discourage us from).</p>

<p>You'll see more from me in future posts as I try to decompose the ideas in the book and relate them back to the day-to-day information management challenges we all face.</p>]]></description>
<link>http://www.beyeblogs.com/sharpeningstones/archive/2007/12/triadic_continuum_prolog.php</link>
<guid>http://www.beyeblogs.com/sharpeningstones/archive/2007/12/triadic_continuum_prolog.php</guid>
<category></category>
<pubDate>Wed, 05 Dec 2007 09:45:00 -0700</pubDate>
</item>

<item>
<title>Visualization</title>
<description><![CDATA[<p>Admittedly, I'm a picture-person.  By that, I mean that I tend to think about things using spatial relationships; and I've been this way for a long time.  In grade school, I can remember thinking, "I don't remember exactly what year Wyoming became a state, but I know that I read the answer about half way down a left-hand page somewhere in the middle of the chapter."  I couldn't remember the detail, but I could recall the context in which that information could be found.  Perhaps that's just a sort of photographic memory... with astigmatism.</p>

<p>I know that other people think in different ways.  My wife, for instance, reads a sentence (or more) at a time.  It took me a while to understand that... "by a sentence, you mean one word at a time trailing into sentences?"  "No, I mean I look at the page and see phrases and sentences, not single words."  Wow.  That's so different than the way I read.  Maybe that says something about my intelligence, or maybe it just says something about how different people consume data in different ways.</p>

<p>I have a colleague that I've worked with for some years who likes reports with lots of numbers on them.  Now, I'm a bit of a math geek, too -- I got a shirt from my daughter a couple years back with this on it:</p>

<p>01000100<br />
01000001<br />
01000100</p>

<p>Still, I think there's more value that can be derived from that sheet of numbers than can be delivered in the bare numbers themselves.  Choosing the right pictorial representation can be very hard, but it can also be very powerful.  Here are a couple of examples of relatively simple (and underused) visualization techniques that I think are very powerful.</p>

<p><b>Radar Charts</b>:<br />
If you plot two or more series on a single <a href="http://en.wikipedia.org/wiki/Radar_chart">radar chart</a> (or spider chart) you can deliver information about both which of the series "covers the most area" as well as in which dimension each series "out performs" each other series.  So, you've got both point-by-point comparisons as well as an overall comparison available in a single view.  The numeric equivalent might be detailed rankings and an overall average ranking.</p>
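
<p>Here's a rough Python sketch of that numeric equivalent.  The clinic names and scores are made up for illustration, and the area calculation assumes the axes are equally spaced around the chart, which is how radar charts are usually drawn.</p>

```python
import math


def radar_area(values):
    """Area of the polygon one series traces on a radar chart with equally spaced axes."""
    n = len(values)
    wedge = math.sin(2 * math.pi / n) / 2  # area factor for each pair of adjacent axes
    return wedge * sum(values[i] * values[(i + 1) % n] for i in range(n))


def axis_winners(series):
    """For each axis, name the series with the largest value on that axis."""
    names = list(series)
    axis_count = len(series[names[0]])
    return [max(names, key=lambda name: series[name][i]) for i in range(axis_count)]


# Hypothetical scores for two clinics on four made-up measures.
scores = {
    "Clinic A": [4, 2, 4, 2],
    "Clinic B": [3, 3, 3, 3],
}

for name, values in scores.items():
    print(name, "average:", sum(values) / len(values), "area:", radar_area(values))
print(axis_winners(scores))
```

<p>In this made-up case both clinics have the same average score, Clinic A wins two of the four axes, and yet Clinic B covers more area.  Keep in mind that the area also depends on the order in which you arrange the axes, so I'd treat it as a visual cue rather than a hard metric.</p>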

<p><b>Bubble Charts</b>:<br />
This <a href="http://en.wikipedia.org/wiki/Chart">kind of chart</a> also allows you to deliver more than one kind of information in a single view.  Bubble charts provide for both a horizontal and vertical plot as well as a third measure that is conveyed using the relative size of the marker on the chart.  So, you could plot "number of infections" versus "number of flu shots" for a set of regional clinics and use the size of the marker on the chart to represent "number of nurses on duty" -- if you had a reason to think some pattern might come out of those three metrics.</p>

<p><b>Gantt Charts</b>:<br />
Don't think I'm just referring to project plans, dependencies, and resources here.  Gantt charts can be used to convey more information than that.  A <a href="http://en.wikipedia.org/wiki/Gantt_chart">Gantt chart</a>, after all, isn't much more than an "activity" versus "time" chart anyway.  Imagine a chart with patient volume on the Y-axis and time on the X-axis.  Great.  You know how patient volume changes throughout the day or over the year.  Add a set of Gantt charts to the bottom showing holidays, the school year, and other special events, and your chart now allows you to correlate patient volume with pieces of information other than "time" generically.</p>

<p>You can imagine adding more dimensions to each of these kinds of charts using shape and color.  You could add animation over time to show a trend.  You could add sound as another mechanism for delivering information, reflecting an upward or downward trend, for example.  Three-dimensional charts are fine, but imagine a 7-dimensional chart that uses X, Y, Z, color, shape, size, and animation to convey so much more information.  Clearly there's a practical limit to how much information someone can consume from a single view, though.</p>

<p>Still, reports with lots of numbers on them probably aren't enough to justify a multi-million dollar reporting tool any more.</p>]]></description>
<link>http://www.beyeblogs.com/sharpeningstones/archive/2007/08/visualization.php</link>
<guid>http://www.beyeblogs.com/sharpeningstones/archive/2007/08/visualization.php</guid>
<category></category>
<pubDate>Tue, 07 Aug 2007 19:45:00 -0700</pubDate>
</item>

<item>
<title>Pushmi-pullyu</title>
<description><![CDATA[<p>Most of the data warehouse projects I've been involved in that are related to a transactional system implementation, migration, or enhancement have followed the implementation of that OLTP system rather than led it or run concurrently with it.  I believe that's primarily because the data warehouse was also often the "reporting database" for those implementations.  Part of the reason was to help justify the construction of the data warehouse - "We can offload reporting from the transactional system to the DW!  The BI team can create those reports for people!"  Very honorable, indeed, but those kinds of projects are often fraught with scope and requirements changes as the transactional system is implemented and people figure out what it really means in the world of day-to-day work.</p>

<p>I'm in the middle of a 5-year initiative that is gutting and reimplementing the underpinnings of all of the operational systems of a 20,000-employee organization right now.  This is happening, though, because the powers-that-be in my organization decided it was part of their mission to become an "information enabled" organization.  Besides implementing new and standardized applications across the company to ensure that we could collect lots and lots of information, the leadership also created a Performance Management department, which funds the Business Intelligence and Data Warehousing group.  Insightful, I think.</p>

<p>Because Business Intelligence has such a prominent role in the organization, the implementation of reporting functionality (operational and some analytical) is happening alongside the implementation of the new ERP and billing systems.  What's even more insightful, I believe, is that in some cases, there are Performance Management objectives influencing how the new systems are being rolled out - specifically in the area of master data management and conformed dimensions.</p>

<p>The theoretical concert between operational changes and business performance management makes a lot of sense - someone with great analytical modeling skills looks at data and determines that the business could perform better if some business process were altered.  So, the business process is changed... sometimes by implementing a new application system that can help make people more productive... but that also means the business model and underlying data on which the original analysis was built change... and the analytical model breaks.  So, the data warehousing group waits for the application to go live so they can rework their technical infrastructure to get the data and put it in some new set of data structures for the analysts to build new models on top of and go through the next iteration of process improvement.  Those cycles take years to happen.</p>

<p>Wouldn't it make more sense if the operational applications that are built to improve business performance used a strategic analytical model as an input into what the system should look like and what kind of data it should capture and deliver back to folks doing performance management?  Would it make any sense for the data warehouse to be built before the operational application, and be used as a validation that the new operational processes being implemented actually match what was designed to improve business performance?  Once a business has gone through an initial cycle of process improvement, who should be pushing whom?</p>]]></description>
<link>http://www.beyeblogs.com/sharpeningstones/archive/2007/08/pushmipullyu.php</link>
<guid>http://www.beyeblogs.com/sharpeningstones/archive/2007/08/pushmipullyu.php</guid>
<category></category>
<pubDate>Sat, 04 Aug 2007 23:15:00 -0700</pubDate>
</item>

<item>
<title>Walking on Coals</title>
<description><![CDATA[<p>With the second phrase from REM's Exhuming McCarthy, "walking on coals," try not to focus on the <a href="http://en.wikipedia.org/wiki/Fire-walking">science</a> of fire-walking, but rather the psychology and imagery - the idea of walking on coals as a test of mind over matter.  </p>

<p>In the world of Business Intelligence and Data Warehousing, here we sit with gobs and gobs of data (rarely enough or exactly the right data, it seems, but gobs nonetheless).  Here we sit with the world in front of us represented in tables and records, entities and attributes, facts and dimensions.  The work it takes to derive value from that mound of information is very much a mind over matter challenge.  Often, it looks simple on paper, but as you dive into the complexities of reality and start revealing "exceptions to rules" and "obscure business logic" and "poor quality data," it takes real cleverness and fortitude to turn that information into action.</p>

<p>Maybe this wasn't Michael Stipe's intent, but it works for me.</p>]]></description>
<link>http://www.beyeblogs.com/sharpeningstones/archive/2007/08/walking_on_coals.php</link>
<guid>http://www.beyeblogs.com/sharpeningstones/archive/2007/08/walking_on_coals.php</guid>
<category></category>
<pubDate>Fri, 03 Aug 2007 19:45:00 -0700</pubDate>
</item>

<item>
<title>Sharpening Stones</title>
<description><![CDATA[<p>You'll notice the reference to REM's lyrics to Exhuming McCarthy in the name of this new BEYEBlog.  I have to give full credit to my wife for choosing that inspiring idea as the title for my blog.  I think it's incredibly fitting to describe how Business Intelligence and Data Warehousing fit into the world today.</p>

<p>Consider the first phrase "Sharpening Stones."  The image I get in my head is of Neanderthals in the stone age chipping flat rocks into pointed spear tips that they use to hunt mammoths and saber-toothed tigers.  It took tens of thousands of years to get from there to metal spear tips.  Lucky for us, technology moves a lot faster.  We've gone from the mere idea that a <a href="http://en.wikipedia.org/wiki/Difference_engine">computing machine</a> could be built in the 1800s to incredibly <a href="http://en.wikipedia.org/wiki/Iphone">small and powerful electronic devices</a> in only 200 years.  Still, I think that the information age we're in today is closer to the stone age than the bronze age.  The computers that we build are giving us the ability to create and gather information in a way that we couldn't have ever imagined before... but we're still barely figuring out what to do with all of that information that technology allows us to collect.</p>

<p>We're still just sharpening stones.  We take a "collect everything that'll fit on the biggest hard drive" approach to information, hoping we find something valuable.  We don't have the precision and power of metalworks.</p>]]></description>
<link>http://www.beyeblogs.com/sharpeningstones/archive/2007/08/sharpening_stones.php</link>
<guid>http://www.beyeblogs.com/sharpeningstones/archive/2007/08/sharpening_stones.php</guid>
<category></category>
<pubDate>Fri, 03 Aug 2007 19:30:00 -0700</pubDate>
</item>


</channel>
</rss>