The idea behind the persistent staging area is to keep all data in the staging area, both the "triaged" data (see yesterday's tip) and its history. That way, when requirements change post-production (again, see yesterday's tip), you have not only the ETL "primed" but also the historical data, sitting in the persistent staging area and ready to be moved forward to the warehouse.
Persistent staging areas almost always require a separate DBMS instance from the data warehouse DBMS due to the volume that will accumulate in them.
Since historical data is also kept in the warehouse, what distinguishes the persistent staging area is that it captures the triaged data, so the history of any data that becomes required post-implementation is ready to load. Because it holds that triaged data as well as the history, it will be bigger than the warehouse itself.
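To make the idea concrete, here is a minimal sketch of a post-production backfill from a persistent staging area. Everything in it is an assumption for illustration: the table names (stg_customer, dw_customer), the columns, the sample rows, and the reading of "triaged" data as source fields captured in staging but not yet required in the warehouse. SQLite in a single in-memory database stands in for what would normally be two separate DBMS instances.

```python
import sqlite3


def build_demo():
    """Create a hypothetical staging table and warehouse table, then load both."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    # Persistent staging: every source column from every load is kept.
    cur.execute(
        "CREATE TABLE stg_customer ("
        "customer_id INTEGER, name TEXT, region TEXT, load_date TEXT)"
    )
    # Warehouse: only the columns that were required at go-live.
    cur.execute(
        "CREATE TABLE dw_customer ("
        "customer_id INTEGER, name TEXT, load_date TEXT)"
    )
    rows = [
        (1, "Acme", "East", "2002-05-01"),
        (2, "Bolt", "West", "2002-05-01"),
        (1, "Acme", "North", "2002-06-01"),
    ]
    cur.executemany("INSERT INTO stg_customer VALUES (?, ?, ?, ?)", rows)
    # Initial ETL: 'region' is staged but not moved forward to the warehouse.
    cur.execute(
        "INSERT INTO dw_customer "
        "SELECT customer_id, name, load_date FROM stg_customer"
    )
    conn.commit()
    return conn


def backfill_region(conn):
    """Requirements change: add 'region' and load its history from staging."""
    cur = conn.cursor()
    cur.execute("ALTER TABLE dw_customer ADD COLUMN region TEXT")
    # Match each warehouse row to the staged row from the same load.
    cur.execute(
        "UPDATE dw_customer SET region = ("
        "SELECT s.region FROM stg_customer s "
        "WHERE s.customer_id = dw_customer.customer_id "
        "AND s.load_date = dw_customer.load_date)"
    )
    conn.commit()
    return cur.execute(
        "SELECT customer_id, load_date, region FROM dw_customer "
        "ORDER BY customer_id, load_date"
    ).fetchall()


if __name__ == "__main__":
    connection = build_demo()
    for row in backfill_region(connection):
        print(row)
```

The point of the sketch is that the backfill never goes back to the source system: the history of the newly required column is already sitting in staging, which is also why the staging tables end up larger than the warehouse tables they feed.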
Although I do not usually use this technique in my data warehouses, it would be very applicable if there were a high likelihood that requirements would be very dynamic after production and disk cost were not an issue.
To ask William a question about this strategy, simply visit our Ask the Expert section in the Business Intelligence category.
This was first published in July 2002