One of the most difficult things to do in data warehousing is to engage a new source system. Learning which fields the system has to offer the data warehouse, when they are populated, how "clean" they are and when your extract job can get at them can be daunting. Then, after going through the process of attaching the extract jobs, scheduling them and beginning the cycles, you would like to be set for a while.
Not so fast. Usually within 1 day to 2 weeks of putting a data warehouse (any iteration) into production or prototype, users who previously communicated requirements in abstract terms are seeing the results and requesting changes. New fields and new transformations are not unheard of at this point.
Although data warehousing is very dynamic, it is possible for a practitioner to think beyond the initial, spoken requirements and "prime the pump" by bringing additional fields into the ETL process. This concept, known as "triage," works very well if you have a staging area where the initial load from the source is "dropped" before the majority of the transformations take place.
With triage and a staging area, the staging area can contain many more fields than are moved forward to the actual data warehouse. Then, if a new field is needed in the warehouse, there is no effect on the source extracts (and no accompanying disruption of source operation and negotiation with the source system team).
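As a rough illustration of the idea, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The table names (stg_customer, dw_customer), the columns and the helper promote_field are all hypothetical, chosen only for this example; a real staging area would be loaded by your own extract jobs and would likely live in a different database product. The staging table carries more fields than were requested, and a later request for a new warehouse field is satisfied from staging alone, without touching the source extract.

```python
import sqlite3


def build_demo():
    """Create a hypothetical staging table and a narrower warehouse table."""
    conn = sqlite3.connect(":memory:")
    # Staging holds every field pulled from the source, including
    # fields (region, credit_limit) that nobody has asked for yet.
    conn.execute(
        "CREATE TABLE stg_customer ("
        "customer_id INTEGER, name TEXT, region TEXT, credit_limit REAL)"
    )
    conn.executemany(
        "INSERT INTO stg_customer VALUES (?, ?, ?, ?)",
        [(1, "Acme", "East", 5000.0), (2, "Globex", "West", 12000.0)],
    )
    # The warehouse initially carries only the requested fields.
    conn.execute(
        "CREATE TABLE dw_customer (customer_id INTEGER PRIMARY KEY, name TEXT)"
    )
    conn.execute(
        "INSERT INTO dw_customer SELECT customer_id, name FROM stg_customer"
    )
    conn.commit()
    return conn


def promote_field(conn, column, sql_type):
    """Move an already-staged field forward into the warehouse table.

    column and sql_type are interpolated into the SQL text, so they must
    come from trusted code, never from user input.
    """
    conn.execute(f"ALTER TABLE dw_customer ADD COLUMN {column} {sql_type}")
    # Backfill from staging; the source system is not involved at all.
    conn.execute(
        f"UPDATE dw_customer SET {column} = ("
        f"SELECT s.{column} FROM stg_customer s "
        f"WHERE s.customer_id = dw_customer.customer_id)"
    )
    conn.commit()


if __name__ == "__main__":
    conn = build_demo()
    promote_field(conn, "region", "TEXT")  # users now want region
    for row in conn.execute("SELECT * FROM dw_customer ORDER BY customer_id"):
        print(row)
```

The point of the sketch is the shape of the change: when users ask for region, the only work is downstream of staging, so there is no new extract to negotiate with the source system team.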
But wait, you say. What about
To ask William a question about this strategy, simply visit our Ask the Expert section in the Business Intelligence category.
This was first published in July 2002