Another question I received at the guru sessions in San Diego is typical of those I get about database size and the ability of database management systems to support terabyte-sized volumes of data.
Indeed, data volumes are growing in most data warehouses due to the advent of third-party data, the addition of new data areas to the warehouse and the continual accumulation of historical data. It seems that however much we plan for the "age off" of older data from the online, available warehouse, it never happens. Users want the value that historical data provides. More data, cleaner data and better data are all extremely useful to warehouse users trying to gain a competitive edge.
Terabyte-sized data warehouses used to be the exclusive domain of the Fortune 50, and usually only of high-data-generating industries such as telecommunications and retail. Now, terabyte-sized data warehouses are the norm, and it is not at all unusual for a Fortune 500 division or a midsize company, in almost any industry, to have one or more warehouse efforts that will reach this size easily.
Before addressing the question of whether a DBMS can handle this size, let's compare apples to apples. When I speak of a terabyte-sized warehouse, I am referring to the entirety of the data warehouse, staging area, data marts, and multidimensional databases -- and the total disk size that needs to be purchased for all of them. This "total disk size" includes system overhead, indexes, temporary
The reason I use this measuring stick is simple: it is the biggest possible one, and it is what the vendors use, since they are trying to back up their claim that they support high volumes.
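To make the measuring stick concrete, here is a minimal sketch of the arithmetic in Python. Every figure in it is hypothetical and chosen only for illustration, and the three space categories ("raw data", "indexes", "temp and overhead") are my own simplification of what goes into total disk size, not an exhaustive list.

```python
# All figures are hypothetical, in gigabytes, and for illustration only.
# The categories are an assumed simplification of "total disk size".
warehouse = {
    "data warehouse": {"raw data": 300, "indexes": 150, "temp and overhead": 100},
    "staging area": {"raw data": 120, "indexes": 0, "temp and overhead": 30},
    "data marts": {"raw data": 100, "indexes": 60, "temp and overhead": 40},
    "multidimensional databases": {"raw data": 80, "indexes": 0, "temp and overhead": 44},
}


def total_disk_gb(components):
    """Sum every space category across every component of the warehouse."""
    return sum(sum(parts.values()) for parts in components.values())


def raw_data_gb(components):
    """Sum only the raw data, ignoring indexes, temp space and overhead."""
    return sum(parts["raw data"] for parts in components.values())


total = total_disk_gb(warehouse)
raw = raw_data_gb(warehouse)
print(f"raw data: {raw} GB")
print(f"total disk: {total} GB ({total / 1024:.2f} TB)")
```

With these made-up numbers, 600 GB of raw data becomes a full terabyte of purchased disk, which is why it matters which of the two figures someone is quoting.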
This was first published in May 2002