Many are not aware of the significant differences between DBMS architectures. One architecture that most are unfamiliar with is the column-oriented DBMS. These systems store all the values of a column together, as opposed to storing all the columns of a row together.
For example, instead of storing:
John Smith 123 Main St. Anywhere, CA USA
Jane Doe 456 Elm St. Anywhere, CA USA
the following is stored:
John Jane
Smith Doe
123 Main St. 456 Elm St.
Anywhere Anywhere
CA CA
USA USA
Single-column aggregate functions such as AVG, MIN, MAX, and SUM perform well with this storage approach because all the data they need is stored together, so fewer I/Os are required. Extraneous columns not relevant to the function do not need to be "skipped over" in this architecture. Multi-column retrievals and joins, by contrast, are better suited to a row-oriented storage approach, where predictable aggregate needs can be pre-calculated.
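The difference in how much data an aggregate has to touch can be illustrated with a minimal Python sketch. It is an in-memory toy, not how any particular DBMS lays out its pages. The `order_total` column, its values, and all function names are assumptions added for illustration, since the example above has no numeric column.

```python
# Row-oriented layout: each record keeps all of its columns together.
# The last field (order_total) is a hypothetical numeric column.
COLUMNS = ("first", "last", "street", "city", "state", "country", "order_total")
row_store = [
    ("John", "Smith", "123 Main St.", "Anywhere", "CA", "USA", 120.0),
    ("Jane", "Doe", "456 Elm St.", "Anywhere", "CA", "USA", 80.0),
]


def to_column_store(rows, columns):
    """Pivot rows into one list per column (column-oriented layout)."""
    return {name: [row[i] for row in rows] for i, name in enumerate(columns)}


def sum_row_store(rows, columns, name):
    """SUM over a row store: every field of every row is fetched."""
    idx = columns.index(name)
    total = 0.0
    fields_read = 0
    for row in rows:
        fields_read += len(row)  # the whole row comes along with the value
        total += row[idx]
    return total, fields_read


def sum_column_store(store, name):
    """SUM over a column store: only the requested column is fetched."""
    values = store[name]
    return sum(values), len(values)


column_store = to_column_store(row_store, COLUMNS)
print(sum_row_store(row_store, COLUMNS, "order_total"))  # (200.0, 14)
print(sum_column_store(column_store, "order_total"))     # (200.0, 2)
```

Both calls return the same total, but the row-store version reads 14 fields to get it while the column-store version reads 2. The field count here is only a stand-in for I/O; a real system reads pages or blocks rather than individual fields.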
Additionally, this storage method facilitates compression, because within a column a value is likely to repeat from one row to the next, as with CA and USA in the example above. The effect is strongest for low-cardinality columns.
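One common way to exploit this kind of repetition is run-length encoding, sketched below in Python. The article does not say which compression scheme a given column-oriented DBMS uses, so this is only one plausible illustration. The sample `state_column` data and the function names are assumptions.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original column."""
    return [value for value, count in runs for _ in range(count)]


# Hypothetical low-cardinality column as it would sit in a column store.
state_column = ["CA", "CA", "CA", "NY", "NY", "CA"]

encoded = rle_encode(state_column)
print(encoded)  # [('CA', 3), ('NY', 2), ('CA', 1)]
assert rle_decode(encoded) == state_column
```

In a row-oriented layout the CA values would be separated by the other fields of each record, so runs like these would not appear in the stored byte stream.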
On balance, however, all else being equal, this approach is not well suited to the active, mixed-workload environments that characterize today's best-practice data warehouses.
This was first published in July 2002