One way to comprehend the tradeoffs made in an ETL process is to compare it to going out to a restaurant. Some shops are enjoying filet mignon at a 4-star restaurant while others are making a run for the border. Consider the following:
When the mood strikes for a meal, the restaurant needs to be open. A restaurant designed to operate only at night cannot serve you at lunchtime. Likewise, if the ETL is designed to run only at night, with processes dependent on the time of day (e.g., replete with references to CURRENT DATE - 1 DAY and the like), that ETL cannot be run at any other time. When the users decide they want more frequent access, the ETL has to be rewritten.
ETL processes that run periodically should depend only on picking up where the last run left off. Usually this means gathering data from last night (for nightly ETLs) forward, but as anyone who has been around these projects for any length of time knows, jobs do not always run when they should, and load frequency requirements change. When frequency requirements change for 4-star ETL processes, they are simply scheduled to run more frequently.
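As a rough illustration of "picking up where the last process left off," the sketch below stores a high-water mark in a control table and loads only the source rows newer than it, instead of computing a window from the clock. It is a minimal example, not a production design: the table names (src_orders, dw_orders, etl_watermark), the updated_at column, and the 'orders' job name are all hypothetical, and SQLite stands in for whatever database the warehouse actually uses.

```python
import sqlite3


def run_incremental_load(conn):
    """Copy source rows newer than the stored watermark, then advance it."""
    cur = conn.cursor()
    # Ask the control table where the last successful run left off
    # (hypothetical table and job name).
    (last_loaded,) = cur.execute(
        "SELECT last_loaded_at FROM etl_watermark WHERE job = 'orders'"
    ).fetchone()
    # Extract only rows changed since then; no reference to the current date.
    rows = cur.execute(
        "SELECT id, amount, updated_at FROM src_orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_loaded,),
    ).fetchall()
    if rows:
        cur.executemany(
            "INSERT INTO dw_orders (id, amount, updated_at) VALUES (?, ?, ?)",
            rows,
        )
        # Advance the watermark to the newest row actually loaded.
        cur.execute(
            "UPDATE etl_watermark SET last_loaded_at = ? WHERE job = 'orders'",
            (rows[-1][2],),
        )
    conn.commit()
    return len(rows)


def demo():
    """Run the load three times against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE src_orders (id INTEGER, amount REAL, updated_at TEXT);
        CREATE TABLE dw_orders (id INTEGER, amount REAL, updated_at TEXT);
        CREATE TABLE etl_watermark (job TEXT PRIMARY KEY, last_loaded_at TEXT);
        INSERT INTO etl_watermark VALUES ('orders', '2002-01-01T00:00:00');
        INSERT INTO src_orders VALUES (1, 10.0, '2002-01-02T09:00:00');
        INSERT INTO src_orders VALUES (2, 20.0, '2002-01-03T09:00:00');
    """)
    first = run_incremental_load(conn)   # catches up on everything pending
    conn.execute(
        "INSERT INTO src_orders VALUES (3, 30.0, '2002-01-03T15:00:00')"
    )
    second = run_incremental_load(conn)  # a mid-day run picks up only the new row
    third = run_incremental_load(conn)   # nothing new, so nothing is loaded
    return first, second, third
```

Because the job asks the control table rather than the clock, a skipped night is caught up on the next run, and moving from nightly to hourly loads is only a scheduling change. The strict "greater than" comparison assumes source timestamps never arrive late or collide with the stored watermark; a real load would need its own policy for those cases.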
When going to a restaurant, all things being equal, fresh food is preferable. One sign of a 4-star ETL is fresh data. If the ETL is supposed to run nightly, then it ran last night and the database contains data through last night. That being said, nightly may not be frequent enough. Most data warehouse programs
Tomorrow we'll continue our visit to the ETL restaurant, where most BI professionals end up dining.
This was first published in January 2002