Definitions of data quality

There are many definitions of data quality in circulation. Some start from the expectations users have of the data. Others take a scorecard approach that assesses data on its apparent technical merits. I suggest a blend of both approaches.

Data quality is about the data meeting user expectations. Some components of cleanliness can be derived from a robust data model: adherence to referential integrity rules, adherence to cardinality rules and the like. DBMS referential integrity is usually turned off in the warehouse, so how is your programmatic referential integrity doing? How many rows in the data violate the cardinalities indicated on the data model that the data warehouse/mart was implemented from? If you have specialization/generalization in the model, are those rules (like an employee must be either a "full-time employee" or a "contractor") holding up in the actual implementation? A robust logical data model is very important to data quality in the data warehouse. Please don't start your data warehouse implementation by sitting down and typing the words "CREATE TABLE."
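The structural questions above can be turned into counts. What follows is only a minimal sketch, assuming a SQLite database reached through Python's standard sqlite3 module. The tables (customer, orders, primary_address, employee, full_time_employee, contractor), the function names and the "at most one primary address per customer" rule are all hypothetical, made up for illustration.

```python
import sqlite3


def count_orphans(conn, child, fk, parent, pk):
    """Count child rows whose non-null foreign key matches no parent row."""
    # Table and column names are interpolated directly, so they must come
    # from trusted metadata, never from user input.
    sql = (
        f"SELECT COUNT(*) FROM {child} c "
        f"LEFT JOIN {parent} p ON c.{fk} = p.{pk} "
        f"WHERE c.{fk} IS NOT NULL AND p.{pk} IS NULL"
    )
    return conn.execute(sql).fetchone()[0]


def count_over_max(conn, child, fk, max_children):
    """Count parent keys that own more child rows than the model allows."""
    sql = (
        f"SELECT COUNT(*) FROM ("
        f"SELECT {fk} FROM {child} GROUP BY {fk} HAVING COUNT(*) > ?)"
    )
    return conn.execute(sql, (max_children,)).fetchone()[0]


def count_subtype_violations(conn):
    """Count employees that appear in neither subtype table, or in both."""
    sql = """
        SELECT COUNT(*) FROM employee e
        WHERE (SELECT COUNT(*) FROM full_time_employee f
               WHERE f.employee_id = e.employee_id)
            + (SELECT COUNT(*) FROM contractor k
               WHERE k.employee_id = e.employee_id) <> 1
    """
    return conn.execute(sql).fetchone()[0]


def build_demo():
    """Create a small in-memory database with deliberate violations."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (customer_id INTEGER);
        CREATE TABLE orders (order_id INTEGER, customer_id INTEGER);
        CREATE TABLE primary_address (address_id INTEGER, customer_id INTEGER);
        CREATE TABLE employee (employee_id INTEGER);
        CREATE TABLE full_time_employee (employee_id INTEGER);
        CREATE TABLE contractor (employee_id INTEGER);
        INSERT INTO customer VALUES (1), (2);
        INSERT INTO orders VALUES (10, 1), (11, 2), (12, 99);
        INSERT INTO primary_address VALUES (1, 1), (2, 1), (3, 2);
        INSERT INTO employee VALUES (1), (2), (3), (4);
        INSERT INTO full_time_employee VALUES (1), (2);
        INSERT INTO contractor VALUES (2), (3);
    """)
    return conn


if __name__ == "__main__":
    conn = build_demo()
    print("orphaned orders:",
          count_orphans(conn, "orders", "customer_id", "customer", "customer_id"))
    print("customers over the address limit:",
          count_over_max(conn, "primary_address", "customer_id", 1))
    print("employee subtype violations:", count_subtype_violations(conn))
```

Each function returns a plain count, so the results can be tracked load over load. In a real warehouse the same queries would run against the production DBMS, with the rules taken from the logical data model.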

There are also measures associated with data value appropriateness. Are data columns being used for multiple meanings? Do values, especially numeric ones, fall within reasonable domains? Finally, does the data conform to the expected set of "clean" values for the column?
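The value-level questions lend themselves to simple checks as well. Here is a rough sketch in Python. The function names are hypothetical, and the "shape" tally is only one assumed heuristic for spotting a column that may carry more than one meaning, not an established test.

```python
from collections import Counter


def out_of_range(values, low, high):
    """Return non-null values that fall outside the closed range [low, high]."""
    return [v for v in values if v is not None and not (low <= v <= high)]


def outside_domain(values, allowed):
    """Return values that are not in the expected set of clean values."""
    return [v for v in values if v not in allowed]


def shape_profile(values):
    """Tally values by rough shape; several shapes may hint at an overloaded column."""
    def shape(value):
        text = str(value).strip()
        if text.isdigit():
            return "digits"
        if text.isalpha():
            return "letters"
        return "other"
    return Counter(shape(v) for v in values)


if __name__ == "__main__":
    # Invented sample data for illustration only.
    print(out_of_range([34, -2, 130, None, 61], 0, 120))
    print(outside_domain(["CA", "NY", "ca", "XX"], {"CA", "NY", "TX"}))
    print(shape_profile(["1001", "1002", "N/A", "CLOSED"]))
```

The ranges and allowed sets have to come from the users and the data model. The code only reports where the data departs from them.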

Form your data quality scorecard based on user expectations, and build it from both structure and data value metrics. Characterize your data and give it a score. You owe it to the funders to address this high risk to data warehouse success.
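One possible way to turn structure and data value metrics into a score is sketched below in Python. The Metric class, the pass-rate formula and the weights are assumptions made for illustration; the weights stand in for how much each rule matters to users and would need to be agreed with them.

```python
from dataclasses import dataclass


@dataclass
class Metric:
    name: str
    kind: str            # "structure" or "value"
    checked: int         # rows or keys examined
    violations: int      # rows or keys that failed the rule
    weight: float = 1.0  # assumed importance to users; illustrative only

    @property
    def score(self):
        """Share of examined items that passed, from 0.0 to 1.0."""
        if self.checked == 0:
            return 1.0  # nothing examined, so nothing failed
        return 1.0 - self.violations / self.checked


def weighted_score(metrics):
    """Weighted average of the metric scores."""
    if not metrics:
        raise ValueError("at least one metric is required")
    total = sum(m.weight for m in metrics)
    return sum(m.score * m.weight for m in metrics) / total


def scorecard(metrics):
    """Return an overall score plus one score per metric kind."""
    card = {"overall": weighted_score(metrics)}
    for kind in sorted({m.kind for m in metrics}):
        card[kind] = weighted_score([m for m in metrics if m.kind == kind])
    return card


if __name__ == "__main__":
    # Invented numbers for illustration only.
    metrics = [
        Metric("orphaned orders", "structure", checked=1000, violations=50),
        Metric("state code domain", "value", checked=1000, violations=150),
    ]
    for label, value in scorecard(metrics).items():
        print(f"{label}: {value:.1%}")
```

Keeping the structure and value scores separate next to the overall figure shows whether a low score stems from the model rules or from the column contents.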

William McKnight answers your questions on Business Intelligence in searchCRM's Ask the Experts.


This was first published in October 2001
