Triaging source data

One of the most difficult things to do in data warehousing is to engage a new source system. Learning about the fields the system has to offer the data warehouse, when they are populated, how "clean" the fields are and when you can get after them with your extract job can be daunting. Then, after going through the process of attaching the extract jobs, scheduling and beginning the cycles, you would want to be set for a while.

Not so fast. Usually 1 day to 2 weeks after putting a data warehouse – any iteration – into production (or prototype) -, users who previously communicated requirements in abstract terms are now seeing the results and requiring changes. New fields and new transformations are not unheard of at this point.

Although data warehousing is very dynamic, it is possible for a practitioner to think beyond initial, spoken requirements and "prime the pump" by bringing additional fields into the ETL process. This concept, known as "triage" works very well if you have a staging area where initial loading from source is "dropped" prior to the majority of the transformations.

With triage and a staging area, the staging area can contain many more fields than are moved forward to the actual data warehouse. Then, if a new field is needed in the warehouse, there is no effect on the source extracts (and no accompanying disruption of source operation and negotiation with the source system team).

But wait, you say. What about

    Requires Free Membership to View

the historical data that usually accompanies such new data sourcing? In tomorrow's tip, we'll address that…

For more information, check out SearchCRM's Best Web Links on Data Warehousing.

To ask William a question about this strategy, simply visit our Ask the Expert section in the  Business Intelligence category.


This was first published in July 2002

Join the conversationComment

Share
Comments

    Results

    Contribute to the conversation

    All fields are required. Comments will appear at the bottom of the article.

    Disclaimer: Our Tips Exchange is a forum for you to share technical advice and expertise with your peers and to learn from other enterprise IT professionals. TechTarget provides the infrastructure to facilitate this sharing of information. However, we cannot guarantee the accuracy or validity of the material submitted. You agree that your use of the Ask The Expert services and your reliance on any questions, answers, information or other materials received through this Web site is at your own risk.