If you are comparing based only on your raw input data, keep this in mind: you need to buy at least 3, and probably closer to 5, times that much total disk. That has been true in all our implementations. Also take a look at www.tpc.org, specifically benchmarks TPC-H
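As a rough illustration of that rule of thumb, the arithmetic can be sketched in a few lines of Python. The function name and the 200 GB input are made up for the example; only the 3x and 5x multipliers come from the advice above.

```python
def disk_range_gb(raw_gb, low_factor=3.0, high_factor=5.0):
    """Return (low, high) total disk estimates in GB for a raw data size.

    The 3x and 5x defaults are the rule-of-thumb multipliers from the text;
    the function itself is a hypothetical helper, not a vendor formula.
    """
    if raw_gb < 0:
        raise ValueError("raw_gb must be non-negative")
    return raw_gb * low_factor, raw_gb * high_factor


if __name__ == "__main__":
    # Hypothetical example: 200 GB of raw input data.
    low, high = disk_range_gb(200)
    print(f"Plan for roughly {low:.0f} to {high:.0f} GB of total disk")
```

The multipliers are parameters so you can substitute whatever factor your own comparison of production systems suggests.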
The exact way to size is to know your architecture, all your tables, and how many rows will be placed in each table over the next 3+ years. From that you can compute rows per page and then the number of pages required. Of course, you usually need the system in place well before you know all this, including the location of all indexes and summary tables, so this precise approach is rarely feasible, and it is not necessary either. Hence the rules of thumb, and the importance of getting into something scalable.
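For the cases where the table details are known, the rows-per-page and page-count arithmetic looks roughly like the sketch below. Every number in it is an assumption chosen for illustration: the 8192-byte page, the 96-byte page overhead, the fill factor and the row size all vary by DBMS and by table, so check your own product's storage documentation before relying on them.

```python
import math


def pages_needed(row_count, row_bytes, page_bytes=8192,
                 page_overhead=96, fill_factor=1.0):
    """Estimate the number of pages needed to hold row_count rows.

    All defaults are illustrative assumptions, not values for any
    particular DBMS.
    """
    # Space on each page that is actually available for rows.
    usable = int((page_bytes - page_overhead) * fill_factor)
    # Rows do not span pages in this simplified model.
    rows_per_page = usable // row_bytes
    if rows_per_page < 1:
        raise ValueError("row does not fit on a single page")
    return math.ceil(row_count / rows_per_page)


def table_bytes(row_count, row_bytes, page_bytes=8192, **kwargs):
    """Estimate table size on disk as page count times page size."""
    return pages_needed(row_count, row_bytes, page_bytes, **kwargs) * page_bytes


if __name__ == "__main__":
    # Hypothetical table: 1,000,000 rows of 200 bytes each.
    print(pages_needed(1_000_000, 200))
    print(table_bytes(1_000_000, 200))
```

Summing this over every table, index and summary table, at the row counts expected three or more years out, gives the precise estimate, which is exactly the information you rarely have at purchase time.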
So, one important step in choosing the DBMS for a large warehouse is to compare your profile to existing production warehouses on the DBMS you are considering. When you do this, make sure you compare apples to apples, which means understanding something about the take-up factors and the architectures of the systems that the vendors put forward to you for comparison.
This was first published in May 2002