July 27, 2008
Maximizing Data Warehouse ROI?- Keep Most Detailed Data
This is one of those common-sense approaches to getting the maximum out of your Data Warehouse. As you invest in a Data Warehouse environment, you can enhance your ROI by using it for diverse applications. Having the most granular (or detailed) transaction level data is core to broad-basing the Data Warehouse applications.
The traditional use of the Data Warehouse environment, purely for back-room analytics, is no longer the whole picture. It can now serve as a single reference source for any of your BI related information needs. Some examples are:
- Summary Analytics
- Enterprise Reporting (needs transaction level data)
- Performance Management
- Data Mining (mostly needs transaction level data)
- Business Modeling (sophisticated models need transaction level data)
- Operational BI like single customer view for telemarketers, customer support (needs transaction level data)
- Ad-hoc queries by operational staff (needs transaction level data)
- Root Cause investigation for issues, where you need to drill down to problem areas (needs transaction level data)
- Business Applications with Embedded BI modules (may need transaction data)
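To make the point concrete, here is a minimal sketch in Python. The transaction rows, field names and figures are made up for illustration. It shows that a summary can always be derived from transaction level data, and that the same data also supports drill-down, whereas a store holding only the summary could not answer the second question.

```python
from collections import defaultdict

# Hypothetical transaction level rows; fields and values are illustrative only.
transactions = [
    {"txn_id": 1, "customer": "C001", "region": "North", "amount": 120.0},
    {"txn_id": 2, "customer": "C002", "region": "North", "amount": 80.0},
    {"txn_id": 3, "customer": "C001", "region": "South", "amount": 50.0},
    {"txn_id": 4, "customer": "C003", "region": "South", "amount": 200.0},
]


def summarize_by(rows, key):
    """Roll transaction rows up to one total per value of `key`."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row["amount"]
    return dict(totals)


def drill_down(rows, key, value):
    """Return the individual transactions behind one summary figure."""
    return [row for row in rows if row[key] == value]


# Summary analytics and root-cause drill-down from the same granular data.
region_summary = summarize_by(transactions, "region")
south_detail = drill_down(transactions, "region", "South")
```

The drill-down rows always add up to the summary figure because both come from the same source, which is also the consistency argument for a single reference store.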
There are good reasons for using the Data Warehouse as a single reference information source. It helps you to:
- Maintain consistency: If your summary MIS/analysis and enterprise reports come from different sources, you will struggle to keep the numbers in sync.
- Fix data in one place: If your production data needs an offline fix (like standardizing customer and product IDs), it's better to do that data-fix in one place. If you have separate enterprise reporting and analysis platforms, you will need to do that data transformation in two places instead of one.
- Data Auditability: A single information reference point having detailed data will provide a good audit-trail of your summary transactions/analysis.
- ETL synergy: If you have diverse systems and want some level of information integration, it's better to do it in one place. Doing ETL for both a summary data warehouse and a detailed reporting database will almost double your effort.
- Overall platform ease: You maintain only one information infrastructure (administration, scheduling, publishing, performance tuning...).
- Ease of Change Management: Any change in your information requirements, or changes in your source systems will be managed and done at one place.
Then the question is: if this approach is so good, why do many companies use the Data Warehouse only for summary data? Keeping granular data in the data warehouse has its own challenges and demands:
- Brings forth the real issues with transactional data: In summary data warehouses, you can ignore some of the transaction level data issues and do some patch-work to ensure that the aggregated data is of acceptable quality. Bringing in granular data will need more incisive surgery on your data issues. This will extend the implementation time.
- ETL efforts go up: This is related to the first point. Your key plumbing task in the DW will become larger and more complex.
- Existing robust and stable reporting and querying platforms: Why fix what isn't broken?
My answer to the above reasons is that while you can be flexible for functional level data marts, you should go for granular data from Day 1 for your enterprise data warehouse. If you are in a hurry, create a quick-fix data mart on the side. For an enterprise level data warehouse, you can start with a few high-priority business themes, but ensure that they are designed for long-term usage (i.e. granular data).
You can refer to field tips for BI and Data Warehouse in my portal Business Intelligence and Performance Management.
Posted by Rajan Gupta at 10:30 PM | Comments (0)
July 22, 2008
Universal Data Warehouse Dimensions- Is it possible?
I have talked a lot about having universal and foundation dimensions in the data warehouse in my portal Business Intelligence and Performance Management Institute. In brief, a Data Warehouse typically uses a dimensional model, which is different from the relational/OLTP model. In a dimensional model, the dimensions and their attributes sit in their own tables, linked through a central 'Fact Table', which carries all the numbers (or measures).
Now, the idea is that one should have one standard dimension (and set of attributes) for a business entity. Let's say that there are 20 star-schemas (or cubes in OLAP lingo) in a data warehouse, and ten of them use a customer dimension. The concept of a foundation dimension is that all ten of these cubes should use an identical customer dimension. This helps in doing analytics across multiple cubes and significantly reduces the change and development effort.
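As a rough illustration of why a shared dimension helps, here is a minimal sketch using Python's built-in sqlite3 module. The table names, columns and rows are all assumptions invented for this example. Two fact tables link to the same customer dimension, so their measures can be compared by the same attribute (segment) without any mapping between two versions of 'customer'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One shared (foundation) customer dimension, two hypothetical star-schemas.
conn.executescript("""
CREATE TABLE dim_customer (
    customer_key INTEGER PRIMARY KEY,
    name         TEXT,
    segment      TEXT
);
CREATE TABLE fact_sales (
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    amount       REAL
);
CREATE TABLE fact_support_calls (
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    calls        INTEGER
);
INSERT INTO dim_customer VALUES (1, 'Acme', 'corporate'), (2, 'J. Doe', 'retail');
INSERT INTO fact_sales VALUES (1, 500.0), (1, 250.0), (2, 40.0);
INSERT INTO fact_support_calls VALUES (1, 3), (2, 1), (2, 4);
""")

# Aggregate each fact table by the shared dimension attribute, then combine.
query = """
WITH sales AS (
    SELECT d.segment AS segment, SUM(f.amount) AS total_sales
    FROM fact_sales f JOIN dim_customer d USING (customer_key)
    GROUP BY d.segment
),
calls AS (
    SELECT d.segment AS segment, SUM(f.calls) AS total_calls
    FROM fact_support_calls f JOIN dim_customer d USING (customer_key)
    GROUP BY d.segment
)
SELECT segment, total_sales, total_calls
FROM sales JOIN calls USING (segment)
ORDER BY segment
"""
rows = conn.execute(query).fetchall()
```

If each star-schema carried its own customer table with its own keys and segment definitions, the final join would first need a reconciliation step, which is exactly the change and development effort the foundation dimension avoids.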
One reality check is that for a large organization, it is extremely difficult to have a single customer dimension. This is because it may be operating in very diverse markets or products. Getting all the business heads to agree to a single dimension may take years. Many of my clients have asked whether these 'standard' dimensions are practically possible.
My answer to them is that yes, it's possible, if we apply the following techniques:
Create Super-Sets: Take the convenient path of creating a 'super-set' dimension, which can absorb diverse entities. An example is a super-set customer dimension, which can absorb retail, corporate and group customers.
Create different foundation dimensions: You do not need to create a single customer dimension for the whole organization. You can create a 'retail customer' dimension, a 'corporate customer' dimension etc. These dimensions will differ from each other, but each will be universal within its own area. The assumption is that you will not be required to do cross-navigation across retail and corporate customers.
Start with high priority Dimensions: Instead of trying to make universal dimensions for all entities, identify the top 5-6 which are most critical. The Pareto principle applies equally here. The critical dimensions can be customer, vendor, product, location, sales channel etc.
Posted by Rajan Gupta at 1:15 AM | Comments (0)
July 20, 2008
Data Warehouse vs. BI
Data Warehouse and BI are often considered synonymous. The reality is that the Data Warehouse is one of the components of BI. There are many more components which need to be in place for an actionable BI platform from an IT perspective. These elements are:
OLAP server- This is a multi-dimensional database, which provides extensive analytical capabilities. It sits between the data warehouse and the end-user tools. It picks data from the Data Warehouse, summarizes it and stores it in its OLAP multi-dimensional database. The OLAP database is designed in a way that helps analytics and other BI functions. OLAP has a wide range of pre-built analytical functions, which can be used by the users or applications accessing it.
Enterprise Reporting- This set of tools provides enterprise level (mostly scheduled) reporting. These tools take their data from OLAP or directly from the Data Warehouse. OLAP typically has summary data. When you need detailed transaction level data, you will have to take it from the Data Warehouse. In the past, BI was typically used only for analysis. However, as the Data Warehouse and OLAP combination expands its use, enterprise reporting tools have started using the DW and OLAP as their source.
Query and Analytics Tools- These are the tools which enable you to do a wide range of analysis. This typically deals with summary data (you would not look for individual transactions in your analytics). A useful feature of many BI platforms is that they enable you to drill down to the transaction level if you need to investigate the details. This means that in the back-end, you move from the OLAP (summary) database to the detailed data warehouse database.
Performance Management Tools- This is the world of dashboards, scorecards, setting standards, goals, reporting on performance variance etc. This breed of tools enables you to manage functional or enterprise performance, and link it to analytics and enterprise reporting. Therefore, if you have a dip in performance, you can do analytics to find the root cause, and use reporting to list the transactions which are contributing to it.
Data Mining Tools- While the above three end-user tools (reporting, analytics and performance management) are core to an organization's BI capability, Data Mining is the next level of sophistication. These are knowledge discovery tools, which find patterns, trends and correlations in the data.
Business Modeling tools- These tools enable you to create models (like pricing models, actuarial models, business planning models, sales projection models...).
Therefore, when your vendor states that they offer a BI solution, do check which components they are offering. While there are end-to-end BI platforms like SAS and Business Objects, there are many vendors which provide competitive individual components.
For details around end-to-end BI, you can refer to my portal Business Intelligence and Performance Management Institute.
Posted by Rajan Gupta at 9:45 PM | Comments (0)
July 15, 2008
Data Warehouse Infrastructure- Consider all layers
Estimating the Data Warehouse infrastructure is tricky. It is hard to make an intelligent guess about the number of users, the kind of queries and the kind of usage that the Data Warehouse will be put to.
The Data Warehouse is one among many components of a business intelligence platform. The other important components are:
- ETL tools (which extract and load data into Data Warehouse)
- OLAP Server (which picks data from the Data Warehouse and loads it in an analysis friendly multi-dimensional form)
- End-user tools (like enterprise reporting, analytics tools, data mining tools...), which sit over OLAP and Data Warehouse to make use of the data stored in them. In other words, they create information out of the data.
The Data Warehouse is core, as it provides sanitized, integrated and consistent data to the end-user tools. These end-user tools generally access this data through OLAP. They go to the data warehouse when they need transaction level data. The summary level data is generally available at the OLAP level itself.
Given this background, one has to understand that not all query processing in business intelligence happens at the Data Warehouse level. Out of 1000 users accessing the business intelligence platform, only 100 may be accessing the data warehouse. The rest may be accessing and querying the OLAP database or the enterprise reports repository.
As you decide upon your Data Warehouse infrastructure needs (including licensing), you have to consider the load of ETL and the volume of data which you will store. However, for the infrastructure related to the number of users, query load etc., one needs to take into account the entire architecture of OLAP and end-user tools.
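The arithmetic behind this can be sketched in a few lines of Python. The 1000 users and the 100 reaching the data warehouse come from the example above; the split of the remaining 900 between OLAP and the reports repository is purely an assumption for illustration, and in practice it would have to come from your own usage profile.

```python
def split_user_load(total_users, percent_by_layer):
    """Spread a user population across BI layers using whole percentages."""
    if sum(percent_by_layer.values()) != 100:
        raise ValueError("layer percentages must add up to 100")
    return {
        layer: total_users * percent // 100
        for layer, percent in percent_by_layer.items()
    }


# Hypothetical split: only the data warehouse share (10%) is from the text.
layer_share = {
    "data_warehouse": 10,
    "olap_database": 60,
    "reports_repository": 30,
}
users_per_layer = split_user_load(1000, layer_share)
```

Sizing user-driven capacity and licensing for the data warehouse against the full 1000 users, instead of the share that actually reaches it, would overstate that part of the estimate tenfold under these assumptions.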
More details are available in my recent field tip, Data Warehouse Infrastructure Estimate, in my portal Business Intelligence and Performance Management.
Posted by Rajan Gupta at 10:15 PM | Comments (0)
Data Warehouse can wait- Start with Data-Marts
An organization, however big or small, can wait for an enterprise data warehouse. It should start with a few critical Data-Marts. The reasons are:
An Enterprise Data Warehouse is a long term commitment: There are many imperatives (or foundations) which are key for a Data Warehouse. Examples of these imperatives are foundation or conformed dimensions, fine-grained granular data, comprehensive star-schemas etc. Building these foundations needs a high level of readiness and investment. These foundations (though great for data marts as well) can be compromised for the initial set of Data-Marts.
Business Learning- The initial set of data-marts will provide great learning, less on the IT side and more on the business side. Here is the set of learnings from the business side:
- Creating business themes
- Building Data-Mart Business Requirements
- Building Dimensional Model
- Testing of Data-Mart
- Taking business decisions around the extraction and transformation
- Generating the information out of the Data-Mart through end-user tools (like reporting and analytics application)
Examples of IT Learnings:
- Extraction, Transformation and Loading design
- Processing Load Management
- Handling Data Explosion (data goes up exponentially as you add sparse fields- where most of the records are blank)
- Change Management (end-to-end impact analysis if you make a change in the Data Mart Model)
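One way to read the data explosion point is in terms of potential versus populated cells: each added field multiplies the number of possible combinations, while the number of real records stays the same. The sketch below uses made-up dimension sizes and record counts, purely as an illustration of that multiplication.

```python
from math import prod


def potential_cells(cardinalities):
    """Number of possible combinations across all dimensions."""
    return prod(cardinalities.values())


# Hypothetical dimension sizes and fact count for a small data-mart.
dimensions = {"customer": 1000, "product": 200, "day": 365}
populated_records = 500_000

cells_before = potential_cells(dimensions)
density_before = populated_records / cells_before

# Adding one more field with 10 distinct values multiplies the space by 10,
# while the number of populated records is unchanged.
cells_after = potential_cells({**dimensions, "sales_channel": 10})
density_after = populated_records / cells_after
```

With these numbers the potential space grows from 73 million to 730 million cells and the share of populated cells drops by a factor of ten, which is why storage and load design have to account for sparsity early.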
Show-case for sponsors: A successful Data-Mart makes sponsorship of a Data Warehouse much easier.
Quick-hit: A Data-Mart is a quick hit and gives earlier gratification.
Non-Disruptive: It does not take away the attention of an organization from other big things.
You can refer to my portal Business Intelligence and Performance Management for more details.
Posted by Rajan Gupta at 3:15 PM | Comments (0)
July 11, 2008
IBM Data Governance Prediction- My Anxiety
I came across a news item about IBM's top 5 predictions for Data Governance (http://www.ebizq.net/news/9883.html?rss). These predictions come from the IBM Data Governance Council, comprising 50 top-notch companies from a wide range of sectors.
While the predictions were interesting and point to a brighter future for data governance, this one raised some level of dissonance. As a general disclaimer, I beg pardon if I have misinterpreted the text of this prediction. These are only initial thoughts, and I do plan to dig deeper. Please cascade this blog post to your network, as it can generate some healthy discussion.
The text of that prediction is-
'The role of the Chief Information Officer (CIO) will change making this corporate officer responsible for reporting on data quality and risk to the Board of Directors. The CIO will have the mandate to govern the use of information and report on the quality of the information provided to shareholders.'
My Dissonance- I don't agree that the CIO role should take ownership of reporting on information quality as well as governing the use of information. It may lead to role conflict and de-focus the CIO role. These are the questions on which I will seek clarification to address my dissonance:
- The CIO's core role is to be a strategic internal service partner to the business and operations to 'make it happen'. Should we be mixing the role of a service provider with a quasi-governance and oversight role?
- Will we not create confusion between the roles of Audit, CFO, Internal Control and Data Steward (if you have one)? I feel that there are already enough roles to oversee and audit. The issue today is more about owning and delivering on the tasks related to data governance.
- A CIO cannot have the mandate to report on the quality of the information provided to shareholders. This may set him/her up for failure. This has to be a business role, which encompasses not only IT, but also the business processes, manual controls, compliance and regulatory checks outside of systems etc. Aren't we conflating this role with that of the CFO and CEO? The CIO may be responsible for certifying that the data lying in the systems is consistent. However, how can the CIO take ownership of governing the manual adjustment figures entered by finance during period-end processing?
- If something is wrong with the information provided to shareholders, where will the buck stop? Will it be with the CIO or the CFO?
- A CIO cannot have the mandate to govern the use of information. The use of information is defined by the user access matrix and the distribution lists of the various reports and outputs made by the system. This access matrix has to be defined by the business and internal control. How can a CIO decide which groups and which functions can use that information? Secondly, not all information is in the systems, and a lot of it is manual.
- From my point of view, data governance and quality need more business ownership at all levels, as Data Governance goes much beyond the system boundaries. Many of the data issues are due either to faulty data entered in the systems or to a lack of robust business specifications for IT systems. Doesn't the prediction seem to recommend a move in the other direction?
Looking forward to your comments. This post should be taken as an invitation for comments and discussion. I will add more to this post. You may also refer to my portal www.bipminstitute.com. Our main theme has been that data quality and data governance are much more a business issue than an IT issue.
Posted by Rajan Gupta at 9:30 PM | Comments (0)
Business owned applications are a reality- Manage it
A real-life medium to large size organization will have hundreds (if not thousands) of small to medium sized 'applications' which are owned by the business and are not on the IT radar. The key reason is that IT is not able to (rightly so) meet all the business demands within the time and money constraints it has. Therefore, working units in the business create their own applications, which may range from Excel-based tools to full-fledged IT platforms. Many times, these business units have their own 'captive' IT units.
Many of these systems grow and spread over time and become an important link within the business processes. While being critical, they don't have the level of robustness and reliability which is inherent in IT-owned systems. This generates financial, operational and compliance risk.
These applications also become an important part of your data quality and BI agenda. This is because they carry important and business critical information. In my experience, a fair proportion of the effort on any enterprise level data quality or BI initiative goes into mapping, extracting and transforming the data from these sets of apps.
The response of an organization may range from 'fight' to 'flight'. My recommendation is to accept the reality, formalize it and manage it. The informal business applications are here to stay, and you cannot take away the reasons which lead to their existence. Here are the steps one can follow:
- Step I- First of all, one needs sponsorship from the owning business functions to open up their world and let in the teams working on the data mapping or data quality program.
- Step II- One can create a quick inventory list of all informal applications, and do a first level prioritization.
- Step III- Do a more detailed analysis of the inventory list by using a standard questionnaire. Some of the questions in that questionnaire would be:
  - Will the key business processes come to a standstill if the application does not work for one day, one week, or one month?
  - Does this application store or process financial data?
  - Does this application store or process data covered by privacy laws, like credit card numbers or personal contact details?
  - Does this application have disaster recovery in place?
- Step IV- Short-list the applications where you want to have the first go. Make a road-map to bring the critical applications into the IT fold.
- Step V- Issue guidelines on the management of informal applications. As part of these guidelines, you can include:
  - What can be part of the informal applications and what can't be.
  - Procedure for periodic checks on the inventory
  - Procedure for aligning with IT principles and architecture for a given class of applications
  - Sign-off from IT on controls and quality related areas etc.
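The questionnaire in Step III and the short-listing in Step IV can be sketched as a simple scoring exercise. The Python below is only an illustration: the application names, the answers and the scoring weights are all hypothetical, and a real program would calibrate them with the business and internal control.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InformalApp:
    name: str
    # Longest outage (in days) before key processes stop; None if they never do.
    outage_tolerance_days: Optional[int]
    handles_financial_data: bool
    handles_private_data: bool
    has_disaster_recovery: bool


def risk_score(app):
    """Turn questionnaire answers into a rough priority score (weights assumed)."""
    score = 0
    if app.outage_tolerance_days is not None:
        if app.outage_tolerance_days <= 1:
            score += 3
        elif app.outage_tolerance_days <= 7:
            score += 2
        elif app.outage_tolerance_days <= 30:
            score += 1
    if app.handles_financial_data:
        score += 2
    if app.handles_private_data:
        score += 2
    if not app.has_disaster_recovery:
        score += 1
    return score


# Hypothetical inventory from Step II.
inventory = [
    InformalApp("team_rota", None, False, False, False),
    InformalApp("pricing_sheet", 1, True, False, False),
    InformalApp("lead_tracker", 30, False, True, True),
]

# Step IV: the highest scores are the first candidates for the IT fold.
short_list = sorted(inventory, key=risk_score, reverse=True)
```

The value of the exercise is less in the exact weights and more in asking every application owner the same questions, so that the inventory can be compared on one scale.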
There are multiple benefits of this approach:
- Business and IT can work collaboratively.
- Awareness of risk is half the battle won. Once you know the soft spots, you can work on them.
- Your Data Quality, Data Integration and BI initiatives will be smoother and more efficient.
In other words, formalize this reality and you will be able to manage the risk much better. For more details on this subject, you may refer to Business Applications are a reality- Manage it in my portal www.bipminstitute.com.
Posted by Rajan Gupta at 8:15 AM | Comments (0)
July 10, 2008
BI Service Providers- Big may not be the Best
Dear Readers! I have many links to my portal here, as I am struggling to cover a big subject in a single post.
As BI has picked up pace, many IT service providers who have grown big through the OLTP systems business have started their own 'BI practice'. Many times, the 'OLTP' and 'ERP' DNA becomes a significant barrier to building a true-blue BI capability. Therefore, 'Big' may not be 'Best' here.
As I have mentioned in several places in my portal www.bipminstitute.com, BI and Data Warehouse (as a key part of BI) require a different mind-set and capability-set to manage. Some of the reasons are as follows:
- Business requirements are fluid and constantly changing.
- Business Requirements are difficult to articulate and capture.
- Dimensional Model is pretty different from OLTP data modeling
- Needs in-depth domain expertise to model and design.
- Load Management is unpredictable.
- Testing is vastly different
- The DW modeling has to be extensible and flexible, even if business does not ask for it.
- Short attention span from stakeholders, as life can go on without BI (unlike an ERP).
- Storage space and infrastructure needs are less predictable.
- BI is 80% business and 20% IT (disclaimer- I am not short-changing IT, but emphasizing upon the criticality of business stake-holding)
- Etc... Etc..
An OLTP-based vendor has to understand these unique aspects, and bring about a fundamental shift (as an economist would say, 'macro-economic restructuring') in skills and mind-sets.
I have had to struggle to find a service provider which can provide a mix of business, IT, process and modeling skills under one roof. The few which I was able to find were too expensive to afford. Finally I had to resort to a combination of 2 to 3 service providers to complete the skill-basket.
You can refer to Data Warehouse has unique challenges and Business Intelligence Vendor Evaluation to complete the picture.
Posted by Rajan Gupta at 1:15 PM | Comments (0)
July 9, 2008
Simple and Effective- Periodic Reports Rationalization
An organization can take many simple steps to manage its information, more of them than complex technology-driven ones (though both are necessary). These steps do not need massive IT investments. They not only boost effectiveness, but also create a strong foundation for your IT-based BI platforms. Here is one such step:
On a periodic basis, rationalize your reports and views in terms of
- Reports not getting used
- Duplicate or nearly duplicate reports
- Reports with mismatched formulae
You can address the above by de-activating the un-used reports, creating super-set reports, fixing formulae etc...
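If you keep even a simple inventory of reports, the three checks above can be run mechanically. The sketch below is a minimal illustration in Python; the inventory layout (last run date, fields used, formula text per measure), the report names and the 180-day idle threshold are all assumptions made up for this example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical report inventory; names, dates and formulae are illustrative.
reports = [
    {"name": "Sales Weekly", "last_run": date(2008, 7, 1),
     "fields": {"region", "revenue"},
     "formulae": {"margin": "revenue - cost"}},
    {"name": "Sales Weekly v2", "last_run": date(2008, 6, 28),
     "fields": {"region", "revenue"},
     "formulae": {"margin": "(revenue - cost) / revenue"}},
    {"name": "Old Inventory", "last_run": date(2007, 1, 15),
     "fields": {"sku", "stock"},
     "formulae": {}},
]


def unused_reports(inventory, today, max_idle_days=180):
    """Reports that have not been run within the idle threshold."""
    return [r["name"] for r in inventory
            if (today - r["last_run"]).days > max_idle_days]


def duplicate_groups(inventory):
    """Groups of reports built on exactly the same set of fields."""
    groups = defaultdict(list)
    for r in inventory:
        groups[frozenset(r["fields"])].append(r["name"])
    return [names for names in groups.values() if len(names) > 1]


def mismatched_measures(inventory):
    """Measures that are calculated differently in different reports."""
    variants = defaultdict(set)
    for r in inventory:
        for measure, formula in r["formulae"].items():
            variants[measure].add(formula)
    return sorted(m for m, forms in variants.items() if len(forms) > 1)


today = date(2008, 7, 9)
to_deactivate = unused_reports(reports, today)
to_merge = duplicate_groups(reports)
to_fix = mismatched_measures(reports)
```

The output is only a candidate list. Deciding which duplicate becomes the super-set report, or which formula is the right one, remains a business call.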
Key points to note are:
- It takes little time. I have seen people rationalizing 200 reports in a single day, after a few months of experience.
- You don't need perfection. 70-75% achievement is good.
- This does not require funding, high-level sponsorship or a go-ahead. It can be driven by the CIO along with assigned IT and business analysts.
The critical success factor is regularity. If you do it well, then when you go for your BI investments you will have much smarter (and leaner) business requirements in place. If you have not been doing this, please try it once and share your feedback.
More details on this subject are in Periodic Rationalization & Prioritization of Information has multiple benefits in my portal www.bipminstitute.com.
Posted by Rajan Gupta at 10:30 PM | Comments (0)
Open Source Business Intelligence- A fitment with Caution
You can reduce your BI costs by gradually testing and adopting Open Source BI in a select set of areas. There is a difference between open source and commercial open source. In this post we are talking about commercial open source, which is not free but comes at minimal cost, with an adequate support and services infrastructure. Pentaho and Jasper are examples of commercial open source.
I recommend the evaluation and use of Open Source BI for non-core BI areas like presentation and visualization tools, business modeling tools, web-based query and reporting tools (not enterprise reporting tools) etc. In other words, open source BI is not recommended for the core production engines of your BI environment.
While big players like Pentaho and Jasper have bagged some large contracts for core BI platforms, they will need many years of satisfactory field reports to generate wide-spread confidence.
The 'non-core' approach will give you a great cost advantage, as it saves the cost of large scale viewer licenses. It is also a way to test the robustness and support capability of these tools and vendors. Some of the open source end-user tools are fairly competitive in terms of features. When you acquire a commercial open source tool, you should be more diligent about service and support factors.
In case you need some more detail on the same subject, you may refer to Open Source BI- A cautious fit in your BI plans in my portal bipminstitute.com.
Posted by Rajan Gupta at 11:15 AM | Comments (0)
July 8, 2008
Demystifying MDM- What is Master Data Management?
Dear All,
While most of my posts are complete by themselves, you have to link to my knowledge portal for this one, as it is a long page. I think you will gain good clarity about MDM, its various domains and how it is different from CRM, BI, Meta data management, Data Warehouse etc. A few snippets:
- Customer Data Integration is not MDM, but one domain within it.
- Product Master Data Management and Customer Master Data Management are the toughest of all.
- MDM can exist without any Data Warehouse or BI platform.
- CRM is not customer data integration...
Please refer to What is Master Data Management? in my knowledge portal www.bipminstitute.com.
Posted by Rajan Gupta at 10:30 PM | Comments (0)
New Data Standards?- What about existing applications
It is quite an accomplishment if an organization is able to create data management standards (like universal business rules, universal domain values and universal data models for all data entities). Examples are universal standards for the customer entity, the product entity etc.
Beyond this accomplishment, there is the question of what to do with the data (like a customer master following the old Customer ID structure) and the applications (with data validation based on the old rules) which already exist and work on a scattered set of old standards.
A big-bang project to change the existing portfolio as per the new data standards is not an option many organizations would choose. One would need to create a funded road-map for this purpose. Here is a mix of tricks which this road-map can use to make it cheaper and faster:
- Ride on the IT business portfolio plan- A large IT initiative may absorb the retrofitting cost.
- Identify and focus on the key data elements which have the widest impact- The criticality will depend upon the financial, regulatory and productivity impact. It also depends on the cost escalation if you fix it later.
- Use the BI environment to enforce the standards- BI, through its ETL, may provide a work-around which avoids the correction in the production (or source) systems.
- Try a one-go change for an application- If you do get to fix it, try to do all the fixes for an application once and for all. Experience tells me that it becomes messy if we do it in installments.
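The ETL work-around can be pictured as a transformation step that maps old-format values to the new standard on the way into the BI environment, leaving the source rows untouched. The Python below is a minimal sketch under stated assumptions: the 'CUST-' plus six digits format is a hypothetical new standard, and the legacy formats are invented examples.

```python
import re


def standardize_customer_id(raw_id):
    """Map a legacy customer ID to the assumed standard 'CUST-' + 6 digits."""
    digits = re.sub(r"\D", "", raw_id)  # keep only the numeric part
    if not digits or len(digits) > 6:
        raise ValueError(f"cannot standardize customer id: {raw_id!r}")
    return f"CUST-{int(digits):06d}"


def transform(source_rows):
    """ETL transform step: returns new rows and does not modify the source."""
    return [
        {**row, "customer_id": standardize_customer_id(row["customer_id"])}
        for row in source_rows
    ]


# Hypothetical rows from applications still working on old ID structures.
source_rows = [
    {"customer_id": "c-123", "amount": 10.0},
    {"customer_id": " 000045 ", "amount": 20.0},
    {"customer_id": "CUST-000123", "amount": 5.0},
]
warehouse_rows = transform(source_rows)
```

Values that cannot be mapped raise an error instead of being guessed at, so they can be routed back to the data owner. The source applications keep running on the old structure until their turn on the road-map comes.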
Though this post is complete in itself, you can find more details on this subject in New Data Standards?- What about existing applications in my portal bipminstitute.com.
Posted by Rajan Gupta at 7:00 AM | Comments (0)
July 7, 2008
Big things start with a simple step- Customer Data Quality
Today, after I finished the 17th hole, my friend (so close a friend that I have refused to have him as a customer), who manages a USD one billion enterprise, shared his data related problems with me. He has a multitude of issues related to data quality, data integration, single view of customer etc. Before he could go further, I suggested one simple step he can take, which can cover 50% of his journey to resolve his issues (and opportunities). I am sharing the same tip with you. This may sound simple, but if done well, it can lay a strong foundation for your journey in data and information management.
"Can you spare 20% of the time of a person who is part of your management team and among your top performers, to take the single point 'sail or sink' responsibility for the integrity and management of all customer data in your organization? Forget about all other aspects of data quality, data integration, data warehouse etc. Put 2-3 full-time resources under him. Make this responsibility a minimum 20% weightage of his goals for the next one year. This role will be called 'Business owner of Customer Data-Group'."
If the answer is yes, just go ahead and give him the responsibility. Hire a good consulting company, which can work with him to create a road-map and milestones for the work on customer data. Come back to me after this person has been on the job for a few months.
Here are some of the questions which my friend asked me, and the quick answers I gave:
'Will he not ask about his goals and measures of success?'- If he is your top performer and a senior manager, he will himself prepare the draft road-map and get it firmed up. You just have to spell out the expectations at a high level.
'How will we figure out the funding and organization structure?'- A senior executive is paid for managing ambiguities and unknowns. He will come with the proposal for funding and organization once he has figured it out. Give him support from experts. Most aspects of data and information management can be understood by applying first principles.
'What about the other areas like financial data, production data etc.?'- Start with the customer data, and as you fix it, you will fundamentally improve many processes, which will positively impact other data groups.
DISCLAIMER- My friend had major issues with customer data. If you have more issues with, say, production data, you can apply the same principle to production (instead of customer). The simplicity in production is that you generally have a logical owner (say the head of order fulfillment). It's the data-groups like Customer, Vendor, Sales Leads, Locations, Products etc. which pose a bigger challenge.
You can find some more detail on data-groups in Data-Group Master. It may sound a little technical, but it is easy to understand. It is part of a paid product, but not to worry, as the abstract itself will give you some idea of what I am talking about.
Posted by Rajan Gupta at 8:15 PM | Comments (0)
July 6, 2008
Building Business Intelligence Business-Case
A Business Intelligence business case can be a simple subject when a business function is looking to create a data-mart. In this case the demand comes from the business, and the business has worked out its own mathematics to justify the need. The main issue comes when you are building a business case for foundation investments in an end-to-end platform.
Examples of foundation investments include a meta-data repository, an enterprise data warehouse platform, an enterprise reporting tool etc. These investments can be significant. The responsibility for preparing the business case falls upon a hurriedly appointed 'BI champion', who could be the CIO, the CFO or a major business head.
Some hard-hitting justifications include regulatory and compliance adherence, customer satisfaction, customer cross-selling and up-selling, and avoiding financial write-offs.
Quantifying the business benefits is also a challenge, and one can apply tricks like involving external vendors and asking specific questions of business owners.
I have placed a page on some of the hard-hitting justifications for BI, and also some tricks on how one can enhance the impact of the benefit quantification. Please refer to Building Business Intelligence Business case in my portal bipminstitute.com.
Posted by Rajan Gupta at 8:45 PM | Comments (0)
Maximizing the usage of your Data Warehouse
It is sometimes intriguing why a Data Warehouse environment, with all its benefits and user friendliness, is not able to build Business Intelligence platform usage up to expectations. The fact is that a transaction system (like an ERP), which is part of the business process of an organization, has captive and committed users. A BI environment, however, has to struggle to pull people in, as their work is not going to stop if they do not use the BI platform (unless you are using BI for enterprise reporting or operational BI). Here are some of the tricks one can adopt to maximize the usage of a BI environment:
- Maximizing performance
- Optimizing the end user usage of data
- Ongoing user training
- Knowledge circles
- Actual usage of information
- Form an Information Management Council
For more details, please refer to Maximizing Benefits and Usage on my portal bipminstitute.com.
Posted by Rajan Gupta at 8:30 PM | Comments (0)
Maximizing effectiveness of Data Steward
A Data Steward is responsible for the health of the data within an organization. An effective Data Steward role can help you form a strong foundation for any of your BI initiatives. This role has great potential and delivery capability if an organization adopts the following approach:
- Make this role accountable for defining the data quality goals
- Make this role accountable for the Data Quality KPIs
- Make Data Steward accountable for delivering on the Data Quality Goals
- Empower and rightly position the Data Steward
- This role should be part of Business
- This role should be a subject matter expert on the businesses, processes and IT interface.
- Assign functional level Data Stewards
- Assign distinct data stewards for distinct data groups
- Avoid process-based data stewards
To understand the details behind each of the above factors, please refer to How to Maximize the effectiveness of Data Stewardship.
Posted by Rajan Gupta at 8:30 PM | Comments (0)
July 5, 2008
Data Warehouse Testing is Different
Data Warehouse testing is a specialist subject and has significant differences vis-à-vis typical transaction system testing. The testing approach, preparation, resources and skill readiness will be driven by DW testing being peculiar on the following counts:
- User-triggered (OLTP) vs. system-triggered (Data Warehouse)
- Batch (Data Warehouse) vs. online gratification (OLTP)
- High volume of test data (Data Warehouse) vs. low volume (OLTP)
- Infinite possible scenarios/test cases (Data Warehouse) vs. finite cases (OLTP)
- Test data preparation tricks are different for DW vs. OLTP.
- Programming for testing is a challenge. Generally we need to create more complex scripts for Data Warehouse.
To understand why, please refer to Data Warehouse testing is different on my portal bipminstitute.com.
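To give a flavour of the kind of script a batch-load test may need, here is a minimal sketch of a source-to-target reconciliation check. It is only an illustration under my own assumptions: the table names src_orders and dw_fact_orders, the amount column and the reconcile helper are all hypothetical, and an in-memory SQLite database stands in for the real source system and warehouse.

```python
import sqlite3


def reconcile(conn, source_table, target_table, amount_col):
    """Compare row count and summed amount between a source and a target table."""

    def profile(table):
        # Table and column names are trusted identifiers in this sketch.
        row = conn.execute(
            f"SELECT COUNT(*), COALESCE(SUM({amount_col}), 0) FROM {table}"
        ).fetchone()
        return {"rows": row[0], "total": row[1]}

    src, tgt = profile(source_table), profile(target_table)
    return {"source": src, "target": tgt, "match": src == tgt}


# Hypothetical data: the batch load dropped order 3 on its way to the warehouse.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_orders (order_id INTEGER, amount INTEGER);
    CREATE TABLE dw_fact_orders (order_id INTEGER, amount INTEGER);
    INSERT INTO src_orders VALUES (1, 100), (2, 250), (3, 75);
    INSERT INTO dw_fact_orders VALUES (1, 100), (2, 250);
""")

result = reconcile(conn, "src_orders", "dw_fact_orders", "amount")
print(result)
```

A check like this runs after the batch, not on a user click, and it has to cope with full data volumes, which is part of why DW test scripts tend to be more involved than OLTP ones.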
Posted by Rajan Gupta at 11:45 AM | Comments (0)
Integrating stand-alone BI platforms- Gradual Approach
By the time organizations come up with the idea of an enterprise Business Intelligence Program, many stand-alone BI environments have already mushroomed. Business and IT stakeholders are wary of rocking the boat on stabilized and business-critical data-marts. Business leaders are short of stamina to focus on an enterprise BI program, as it is more convenient to grow the existing functional-level data-marts.
The question, then, is how to proceed with integrating these environments. This is with the assumption that you don't want to reinvent the wheel and want to leverage what is already done.
Instead of a big-bang, one can go for a gradual approach, whereby a decent proportion of the plumbing work can be done in the background before you start engaging a broad set of stakeholders.
The steps that I recommend are:
- Integrate the ETL
- Integrate the front-end tools
- Integrate the Data Marts
For the gradual BI integration approach, its steps and cautions, please refer to Integrating your stand-alone BI environments- Gradual Approach on my portal bipminstitute.com.
Posted by Rajan Gupta at 11:45 AM | Comments (0)
Documenting Data Integration- No Choice
Among the most significant challenges of Business Intelligence is to keep track of the what, when, how and who of Data Integration. Unlike the Data Warehouse and OLAP Server, data-integration is not 100% achieved through configuring data-integration tools. The reasons that you cannot rely only upon the DI tools to do the documentation are:
- Source Systems Structure documentation- Your DI system may be able to link up to the structure of the database of the source system, but the underlying programs which impact the data may not be included. You need to document the areas of the source system (database, business rules impacting the data, data-processing windows etc.) which impact your data integration.
- Complex Routines- For complex integrations, you have to go beyond the sheer configuration 'click and choose' capabilities, and write programs to do the extraction and transformation. You need to maintain the functional specs of those programs, as finally business also has to sign off on those transformation rules.
- The risks and gaps left in DI- You may not be able to achieve 100% perfection in DI (say, in Extraction and Transformation) and may have lost some data because it was dirty or incomplete. You also might have done cleansing and enrichment of historical data. One needs to keep track of all this to enable future investigation, to support audit reviews, and to be able to explain mismatches between the reports from production systems and from the BI platform.
More reasons and further details on this post can be seen at Documenting Data Integration on my portal bipminstitute.com.
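As an illustration of keeping track of the gaps left in DI, here is a minimal sketch of a hand-written transformation that records every rejected row together with the reason. Everything in it is an assumption made for the example: the LoadAudit class, the transform function, the customer_id and amount fields and the two cleansing rules are hypothetical, not taken from any particular DI tool.

```python
from dataclasses import dataclass, field


@dataclass
class LoadAudit:
    """Keeps what was loaded and, just as importantly, what was left out and why."""
    loaded: list = field(default_factory=list)
    rejected: list = field(default_factory=list)  # (original row, reason) pairs


def transform(rows):
    """Apply two hypothetical cleansing rules and log every rejected row."""
    audit = LoadAudit()
    for row in rows:
        customer_id = (row.get("customer_id") or "").strip()
        if not customer_id:
            audit.rejected.append((row, "missing customer_id"))
            continue
        try:
            amount = float(row["amount"])
        except (KeyError, TypeError, ValueError):
            audit.rejected.append((row, "amount is not numeric"))
            continue
        # Standardize the customer ID so all downstream reports agree.
        audit.loaded.append({"customer_id": customer_id.upper(), "amount": amount})
    return audit


# Hypothetical extract with one clean row and two dirty ones.
extract = [
    {"customer_id": " c001 ", "amount": "120.50"},
    {"customer_id": "", "amount": "40"},
    {"customer_id": "c002", "amount": "n/a"},
]

audit = transform(extract)
print(len(audit.loaded), "loaded,", len(audit.rejected), "rejected")
for row, reason in audit.rejected:
    print("rejected:", row, "because", reason)
```

The rejected list is the raw material for the documentation: it is what lets you explain later why a BI report shows fewer rows than the production system.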
Posted by Rajan Gupta at 11:45 AM | Comments (0)
Design your Data Warehouse for broader application
It is a myth that the main purpose of a Data Warehouse is data analytics. When you are creating the business case for your Data Warehouse, you can include the following benefits, only some of which belong to traditional analytics:
- Enterprise Reporting
- Offline Operational Data Store
- Data Analytics
- Data Mining
- Business Modeling
- Operational BI
As you design and scope your Data Warehouse, you should account for potential uses, which go beyond data analytics. This is how your Data Warehouse design will be influenced, if you are going for more broad-based applications:
- More granular data
- More Descriptive Attributes
- More Robust and scalable platforms
- Load and Job schedule Management
- Your Dimensional Model
- Your OLAP strategy
This topic is dealt with in more detail in Data Warehouse is not for analytics. Design it for broader applications.
Posted by Rajan Gupta at 11:30 AM | Comments (0)
July 3, 2008
Dimensional Modeling vs. Relational Modeling
Dimensional modeling is a unique way to model a data warehouse and is different from the relational modeling used in OLTP systems. To clarify the terminology- a dimensional model is stored in the database in relational form (relational tables linked to each other through foreign keys). It's the way you logically and physically model the database that is different.
Dimensional modeling has the advantage of:
- Responding to ad-hoc and huge queries
- Symmetrical model
- Extensible and scalable
To understand the details of the comparison, and why one should go for dimensional modeling, one can refer to Dimensional Modeling vs. relational modeling.
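To make the idea concrete, here is a minimal sketch of a tiny star schema held in ordinary relational tables. It is only an illustration: the table names dim_product, dim_date and fact_sales, their columns and the sales_by helper are all hypothetical, and SQLite is used simply because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables carry the descriptive attributes.
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, month TEXT);
    -- The fact table carries the measures plus a foreign key to each dimension.
    CREATE TABLE fact_sales (
        product_key INTEGER REFERENCES dim_product(product_key),
        date_key INTEGER REFERENCES dim_date(date_key),
        amount INTEGER
    );
    INSERT INTO dim_product VALUES (1, 'Books'), (2, 'Music');
    INSERT INTO dim_date VALUES (20080701, '2008-07'), (20080801, '2008-08');
    INSERT INTO fact_sales VALUES
        (1, 20080701, 100), (2, 20080701, 40), (1, 20080801, 60);
""")


def sales_by(dim_table, key_col, attribute):
    """Total sales grouped by one attribute of one dimension."""
    # Identifiers are trusted in this sketch; only the dimension changes.
    sql = f"""
        SELECT d.{attribute}, SUM(f.amount)
        FROM fact_sales f
        JOIN {dim_table} d ON f.{key_col} = d.{key_col}
        GROUP BY d.{attribute}
        ORDER BY d.{attribute}
    """
    return conn.execute(sql).fetchall()


by_category = sales_by("dim_product", "product_key", "category")
by_month = sales_by("dim_date", "date_key", "month")
print(by_category)
print(by_month)
```

Notice that the query has the same shape whichever dimension you pick, which is what I mean by a symmetrical model. Adding a new dimension means one more table and one more key on the fact table, without disturbing the existing queries.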
Posted by Rajan Gupta at 11:00 AM | Comments (0)
Assess Data Warehouse Project Readiness
An Enterprise Data Warehouse initiative is a high-investment initiative. Your approach and estimation should also be driven by the readiness levels of your organization. The factors encompass people, process and technology. Examples are:
- Strength of the Sponsor
- Application Topology
- Interest of the Stakeholders
- Data Topology
- Levels of Data Warehouse skills
- Management Culture in the organization
- Business Performance Management processes
- IT system Topology
Each of these factors will influence your DW plans. Please refer to Assessing Data Warehouse project readiness. This page provides details on each of these factors, and the recommended approach depending upon the level of readiness for each of them.
Posted by Rajan Gupta at 10:45 AM | Comments (0)
