Tip

Sizing data warehouses, part 2

If you are just comparing based on your raw, input data, keep this in mind -- you need to buy at least 3, and probably closer to 5, times that much total disk size. It's true in all our implementations. Also take a look at www.tpc.org, specifically benchmarks TPC-H

    Requires Free Membership to View

and TPC-R. They list combinations of hardware and DBMS that the vendors are putting forward for benchmark purposes. In the disclosures, the "take up" from raw disk is mentioned. Many of them go well beyond 5 times raw, but since none of us are implementing the benchmarks and needing every ounce of performance possible at any price, 3-5 times is a safe estimate.

The exact way to size is to know your architecture, all your tables and how many rows will be placed in each table for the next 3+ years. Rows per page can then be computed followed by the number of pages necessary. Of course, usually you need the system in place well before the time you know all this information, including the location of all indexes and summary tables. This precise approach is neither necessary nor feasible. Hence, the rules of thumb and the importance of getting into something scalable.

So, one important step in determining the DBMS for large-sized warehouses is to compare your profile to existing warehouses in production for the DBMS you are considering. When you do this, just make sure you compare apples-to-apples, understanding something about the take-up factors and the architectures of the systems that the vendors put forward to you for comparison.

For more information, check out SearchCRM's Best Web Links on Data Warehousing.


This was first published in May 2002

Join the conversationComment

Share
Comments

    Results

    Contribute to the conversation

    All fields are required. Comments will appear at the bottom of the article.

    Disclaimer: Our Tips Exchange is a forum for you to share technical advice and expertise with your peers and to learn from other enterprise IT professionals. TechTarget provides the infrastructure to facilitate this sharing of information. However, we cannot guarantee the accuracy or validity of the material submitted. You agree that your use of the Ask The Expert services and your reliance on any questions, answers, information or other materials received through this Web site is at your own risk.