Data is not always stored in rows

Many people are unaware of the significant differences between DBMS architectures. One architecture that is unfamiliar to most is the column-oriented DBMS. Such a DBMS stores all the values of a column together, as opposed to storing all the columns of a row together.

For example, instead of storing:

John Smith 123 Main St. Anywhere, CA USA
Jane Doe 456 Elm St. Anywhere, CA USA

a column-oriented DBMS stores:

John Jane
Smith Doe
123 Main St. 456 Elm St.
Anywhere Anywhere
CA CA
USA USA

Single-column aggregate functions such as AVG, MIN, MAX, and SUM perform well with this storage approach because all the data they need is stored together, so fewer I/Os are required. Extraneous columns not relevant to the function do not need to be "skipped over" in this architecture. Multi-column retrievals and joins are better suited to a row-oriented storage approach, where predictable aggregate needs can be pre-calculated.
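The difference between the two layouts can be sketched in a few lines of Python. This is only an in-memory illustration, not how any particular DBMS lays out its pages. The field names, the numeric `balance` column and its values, and the helper functions are all hypothetical additions made so that AVG has something to work on.

```python
# Hypothetical field names; "balance" is an invented numeric column.
FIELDS = ("first", "last", "street", "city", "state", "country", "balance")

# Row-oriented layout: every record keeps all of its fields together.
rows = [
    ("John", "Smith", "123 Main St.", "Anywhere", "CA", "USA", 120.0),
    ("Jane", "Doe", "456 Elm St.", "Anywhere", "CA", "USA", 80.0),
]


def to_columns(rows, fields):
    """Pivot row tuples into one list of values per column."""
    return {name: [row[i] for row in rows] for i, name in enumerate(fields)}


# Column-oriented layout: every column keeps all of its values together.
columns = to_columns(rows, FIELDS)


def avg_row_store(rows, index):
    """AVG over a row store: each full record is visited to pick out one field."""
    values = [row[index] for row in rows]
    return sum(values) / len(values)


def avg_column_store(columns, name):
    """AVG over a column store: only the one relevant list is read."""
    values = columns[name]
    return sum(values) / len(values)


print(columns["state"])
print(avg_row_store(rows, FIELDS.index("balance")))
print(avg_column_store(columns, "balance"))
```

Both functions return the same average. The point of the sketch is that the column-store version never touches names or addresses, which is the in-memory analogue of reading fewer pages from disk.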

Additionally, this storage method facilitates compression, because values stored next to each other in a column are likely to repeat from one row to the next, as with CA and USA in the example above. The effect is strongest for low-cardinality columns.
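One simple way to exploit this repetition is run-length encoding, shown here as a minimal sketch. The article does not name a specific compression scheme, so the choice of run-length encoding is an assumption, as are the function names and the sample `state` column.

```python
from itertools import groupby


def rle_encode(column):
    """Collapse each run of consecutive equal values into a (value, count) pair."""
    return [(value, len(list(run))) for value, run in groupby(column)]


def rle_decode(pairs):
    """Expand (value, count) pairs back into the original column."""
    return [value for value, count in pairs for _ in range(count)]


# Hypothetical low-cardinality column, stored contiguously as in a column store.
state = ["CA", "CA", "CA", "NY", "NY", "CA"]

encoded = rle_encode(state)
print(encoded)
print(rle_decode(encoded) == state)
```

Six stored values shrink to three pairs here. In a row-oriented layout the same values would be separated by the other fields of each record, so runs like these would not sit next to each other.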

On balance, however, and all else being equal, this approach is not well suited to the active, mixed-workload environments that characterize today's best-practice data warehouses.

For more information, check out SearchCRM's Best Web Links on Data Warehousing.


This was first published in July 2002
