August 17, 2008
Talking Tech
I've just started having meetings with our internal storage team to discuss how we can optimize throughput between the SAN and our Oracle data warehouse. It turns out that at times over the past few months our test/UAT server has nearly saturated the Fibre Channel links connecting that server to the SAN!
During both that conversation with the storage team and another casual one I was having with our Unix server team, I mentioned the idea that our BI/DW team is starting to have discussions with some data warehousing specialists. I'm paraphrasing, but "ugh!" was the universal response. "We don't like proprietary solutions here. We've standardized on X storage vendor and Y servers. We don't want to support that."
We're only in preliminary conversations about bringing in a specialty data warehouse solution, but I certainly want to maintain a good relationship with our server and storage teams. Any advice on how to help them see the logic and design behind specialty data warehouses?
Posted by Paul Boal at 8:46 PM | Comments (0)
December 21, 2007
Triadic Continuum (Setting Context)
Review the book and other sources for more information.
One of the key operations in the Triadic Continuum data structure is "setting context," and I would argue that this key operation might be prohibitive in anything but the simplest examples. I haven't fully studied and analyzed real-world scenarios, but here's my argument:
First, remember that in the Triadic Continuum, data itself is never duplicated. Every "subcomponent node" in the tree structure contains no observed data itself, only a pointer to the "sensor nodes" where data is actually stored. As in a columnar database, this greatly reduces the amount of data being stored.
Second, think about the ideas of context, constraint, and focus. These are three key concepts in getting information from the Triadic Continuum. Context is the idea that you have to provide some level of initial boundary to the question being asked: "I care about sales," for example. I can imagine this initial definition of context being brutally difficult. Imagine if you had a single index in a database that represented each column value that appeared anywhere in any column of any table. Here are a couple of concerns:
That gets to be a pretty big single tree or hash table or anything to search through at the beginning of each query.
Tracing from the sensor node back to the associated subcomponent nodes may result in a very large list of nodes to use in the context. Potentially millions, hundreds of millions depending on how broad the context is.
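To make those two concerns concrete, here is a minimal Python sketch. It is not the Triadic Continuum itself; it only models the analogy above of one global index over every observed value. The class name ValueIndex, its methods, and the sample data are all hypothetical, invented for illustration.

```python
from collections import defaultdict


class ValueIndex:
    """Toy model: one shared index over every value seen in any field."""

    def __init__(self):
        self.sensors = {}                   # value -> sensor id (each value stored once)
        self.referrers = defaultdict(list)  # sensor id -> ids of records pointing at it

    def observe(self, record_id, values):
        """Record that record_id refers to each of the given values."""
        for value in values:
            sensor_id = self.sensors.setdefault(value, len(self.sensors))
            self.referrers[sensor_id].append(record_id)

    def set_context(self, value):
        """Trace back from a value to every record that refers to it."""
        sensor_id = self.sensors.get(value)
        if sensor_id is None:
            return []
        return self.referrers[sensor_id]


# Hypothetical data: 1,000 records, each with a category and a unique order id.
index = ValueIndex()
for i in range(1000):
    category = "sales" if i % 2 == 0 else "returns"
    index.observe(i, [category, f"order-{i}"])

print(len(index.sensors))                # size of the single global index
print(len(index.set_context("sales")))   # size of a broad context
print(index.set_context("order-7"))      # a narrow context
```

Even in this toy version, the global index grows with every distinct value in every field, and a broad context such as "sales" pulls back half of all records before any constraint or focus has been applied.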
It also seems to me that another huge variable in the performance of the Triadic Continuum is the order in which pieces of data are observed and stored in the structure. Think about the observation of customer information: if I look at gender first, then the first level of the tree is nice and small, just two nodes (unless you work in health care, then there are seven genders). If I look at zip code and then gender, then the first level is about 43,000 nodes and the next level is 86,000 nodes. If I search for "Male" in the first scenario, I get 1 node; in the second I get 43,000 nodes. Likewise, if I search for "82601," I get 2 nodes in the first scenario and 1 node in the second. I'm sure there are some significant and important implications to this fact.
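The arithmetic above can be checked with a small Python sketch. It uses an ordinary nested-dictionary prefix tree as a stand-in, which is an assumption on my part and not the actual Triadic Continuum structure. The function names and the made-up zip codes (00000 through 42999, with two genders for simplicity) are hypothetical.

```python
from itertools import product


def build_tree(records, field_order):
    """Build a nested-dict prefix tree, inserting fields in the given order."""
    root = {}
    for record in records:
        node = root
        for field in field_order:
            node = node.setdefault(record[field], {})
    return root


def nodes_per_level(tree):
    """Return the number of nodes at each depth below the root."""
    counts = []
    level = [tree]
    while True:
        children = [child for node in level for child in node.values()]
        if not children:
            return counts
        counts.append(len(children))
        level = children


def count_matches(tree, value):
    """Count nodes anywhere in the tree whose key equals value."""
    return sum(
        (key == value) + count_matches(child, value)
        for key, child in tree.items()
    )


# Hypothetical customers: every combination of 43,000 zip codes and 2 genders.
zips = [f"{z:05d}" for z in range(43_000)]
records = [{"zip": z, "gender": g} for z, g in product(zips, ["Male", "Female"])]

gender_first = build_tree(records, ["gender", "zip"])
zip_first = build_tree(records, ["zip", "gender"])

print(nodes_per_level(gender_first))  # [2, 86000]
print(nodes_per_level(zip_first))     # [43000, 86000]
print(count_matches(gender_first, "Male"), count_matches(zip_first, "Male"))    # 1 43000
print(count_matches(gender_first, "00042"), count_matches(zip_first, "00042"))  # 2 1
```

The same 86,000 records produce the same number of leaf nodes either way, but the number of nodes a single search value touches swings from 1 to 43,000 depending only on the order of observation.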
If anyone wants to pay me something equivalent to my current salary to go back to school and study this more thoroughly, let me know!
Posted by Paul Boal at 8:00 PM | Comments (0)
December 5, 2007
Triadic Continuum (Prologue)
One of the Vice Presidents that I work with sent me an email the other day with the subject line "Is this something we should be looking at?" He's no pointy-haired boss, but a subject line like that, followed by a cut and paste of an article from an industry magazine certainly does bring images from Dilbert to mind. As I read the article, those images only got stronger.
This particular article is about something called the triadic continuum - yes, sounds like something out of Star Trek, doesn't it? Still, it is interesting - especially if you have a PhD in cognitive science. There's a book about this invention that goes into more detail for those of you who are curious.
After a couple of back-and-forth comments in which I tried to look smarter than I am (and considered pitching the idea that if my boss wanted to send me back to school for my PhD, I'd be happy to go), this VP ended with: "But they said simple, scalable, universal." Yeah, right there on the package! ;)
My snide remark (being in a healthcare field) was going to be something like:
Right, and so is DNA!
Simple - only 5 compounds in the whole thing!!
Scalable - from a flea to a whale!!
Universal - it is a defining characteristic of life!!
I didn't send that response, yet.
DNA is simple, universal, and scalable, too.
You'll find Dan Linstedt blogging about the Triadic Continuum in the future, I'm sure. I'm still busy reading the book, but my gut tells me there are some very interesting things to be learned from this data structure. I don't think it's the solution to moving along Ackoff's data-information-knowledge-wisdom curve, but I've already started thinking about how to leverage some of the algorithms from the book in relational data models (even though that's what the inventors would discourage us from doing).
You'll see more from me in future posts as I try to decompose the ideas in the book and relate them back to the day-to-day information management challenges we all face.
Posted by Paul Boal at 9:45 AM | Comments (0)
