Datawarehousing hardware requirements

Datawarehousing hardware requirements

Can you please guide me to what would be the preferred hardware requirement for a data warehousing setup with the following components:

  1. An ETL process to populate the entire data warehouse.

     

  2. Data warehouse size 100-1000 GB, 25 million fact records with five to six dimensions having more than 50,000 members, and one to two SCD's having 1 million records. The other 35-38 dimensions having less than 10,000 members.

     

  3. The Microsoft analysis services containing:
    • Eight regular cubes with distinct count measure in it.
    • A single regular cube consisting of around 20-25 additive (SUM) measures.
    • Six virtual cubes consisting distinct count, additive and calculated measures with 130-140 measures in total.


The

    Requires Free Membership to View

    When you register, you'll begin receiving targeted emails from my team of award-winning editorial writers on the latest customer relationship management (CRM)and call center technology issues today. Our goal is to keep you informed on the hottest issues facing this fast-changing industry.

    Hannah Smalltree, Editorial Director

    By submitting your registration information to SearchCRM.com you agree to receive email communications from TechTarget and TechTarget partners. We encourage you to read our Privacy Policy which contains important disclosures about how we collect and use your registration and other information. If you reside outside of the United States, by submitting this registration information you consent to having your personal data transferred to and processed in the United States. Your use of SearchCRM.com is governed by our Terms of Use. You may contact us at webmaster@TechTarget.com.

first thing I'm going to look at is database size. You have provided a very large range of database size, from 100 GB to a terabyte. I would recommend differently for each extreme so I encourage you to try to get more specific about this. And furthermore, is this initial data warehouse size or does it account for growth?

We use database size as a primary determinant of platform efficiency because, in larger implementations with hundreds of users, usage tends to become predictably multi-faceted. In other words, most 3 TB data warehouses will have 250–400 users with 75% doing desktop OLAP (across a complete bell curve in terms of actual usage volume), 15% writing reports and 5% data mining, etc.

Pick a platform that works for at least your anticipated three-year scale. For example, if the system is not anticipated to become a terabyte within three years, you can choose a system that will not work today at the terabyte level. If you happen to get to that level in, say, five years, in all likelihood, the system you choose will also have scaled itself to support that level of usage then. In other words, let's not buy a system to "kill an ant with an atom bomb."

When I say "work," I mean work with high likelihood of success and without the need to tune a system at its 99th percentile of tuning. The systems we recommend have a "low" risk factor for BI platforms which means a low likelihood of not scaling or needing augmentation with other systems (you want to buy systems that "do it all") and having to be replaced.

You have provided information about the database, but not the users and usage. That will also be an important input to the hardware requirements. Usually BI systems will have heterogeneous usage that approaches a vibrant usage mix. But maybe your shop won't. Another assumption is that an end-user shop would not wish to become an R&D shop for niche software in its main areas of BI software need. Here again, maybe your shop is OK with taking some chances.

Nonetheless, your shop is "up there" in terms of SQL Server usage and you may consider something like a Dell PowerEdge 8450 8 Intel® Pentium® III Xeon™ 900 MHz processors with 24-32 gigabyte (GB) RAM but do get more specific with the requirements before you go shopping.

This was first published in March 2004

Join the conversationComment

Share
Comments

    Results

    Contribute to the conversation

    All fields are required. Comments will appear at the bottom of the article.