Performance management is a major challenge for most data warehouse projects. While there are numerous ways to handle it, summarization is one of the most effective performance enhancement techniques. Summarization improves query performance (i.e., reduces response time) by eliminating real-time calculations and reducing I/O, CPU, RAM and swap requirements.
However, determining when to create pre-summarized tables can be a significant challenge. The question most often faced is which level of the dimensional hierarchy is suitable for summarization. Below are some considerations for building a summary table:
- How frequently will the summary table be used? Monitor user queries against the lowest-grain fact table to see how often each level of summarization is requested. There is no point in spending disk space on a summary table that will rarely be accessed.
- Will the row compression ratio achieved be beneficial? As a rule of thumb, look for a reduction of at least one-third in the number of rows relative to the lowest-grain fact table.
- Is the dimensional hierarchy considered for summarization relatively static? Because summarization rolls the child level of a dimensional hierarchy up to the parent level, any future change to a parent-child relationship necessitates creation of a new summarized entity. Avoid building the summary table if such changes are frequent and the table is large.
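The row-reduction check above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`summarize`, `row_reduction`, `worth_building`), the product-to-category hierarchy and the sample rows are all hypothetical, and a real warehouse would run the equivalent aggregation in the database rather than in application code.

```python
from collections import defaultdict

def summarize(fact_rows, child_to_parent):
    """Roll (child_key, period, amount) rows up to (parent_key, period) totals."""
    totals = defaultdict(float)
    for child_key, period, amount in fact_rows:
        parent_key = child_to_parent[child_key]  # hypothetical hierarchy lookup
        totals[(parent_key, period)] += amount
    return dict(totals)

def row_reduction(fact_row_count, summary_row_count):
    """Fraction of rows eliminated by summarizing (0.0 means no reduction)."""
    if fact_row_count == 0:
        return 0.0
    return 1 - summary_row_count / fact_row_count

def worth_building(fact_row_count, summary_row_count, threshold=1 / 3):
    """Apply the rule of thumb: build only if at least `threshold` of rows go away."""
    return row_reduction(fact_row_count, summary_row_count) >= threshold

# Illustrative data: four products in two categories, two monthly periods.
hierarchy = {"p1": "catA", "p2": "catA", "p3": "catB", "p4": "catB"}
facts = [
    ("p1", "2002-01", 10.0), ("p2", "2002-01", 5.0),
    ("p3", "2002-01", 7.5),  ("p4", "2002-01", 2.5),
    ("p1", "2002-02", 4.0),  ("p2", "2002-02", 6.0),
    ("p3", "2002-02", 1.5),  ("p4", "2002-02", 3.5),
]

summary = summarize(facts, hierarchy)
reduction = row_reduction(len(facts), len(summary))
print(f"{len(facts)} fact rows -> {len(summary)} summary rows "
      f"({reduction:.0%} reduction, build: {worth_building(len(facts), len(summary))})")
```

Note that `summarize` depends on the `hierarchy` mapping: if a product were later moved to a different category, the stored totals would no longer match and the summary would have to be rebuilt, which is why a relatively static hierarchy matters.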
For more information, check out searchCRM's Best
This was first published in March 2002