Expert: Data quality is misunderstood
By Kerry Glance, Editor
23 Feb 2005 | SearchCRM.com
Larry English
SearchCRM.com: What is the best way for companies to get
started with data quality programs?
The first step is to assess whether the company understands the principles of quality management as
applied to data and information. Sometimes organizations have a real technology bias to
implementing data quality. They tend to implement profiling tools to discover the patterns in the
data or data cleansing tools that address the correction of data. But information quality
management is more than just data in the database.
SearchCRM.com: What is the most common mistake
organizations make when it comes to data quality improvement?
Most organizations don't understand real quality management principles. Data quality is not just
data cleanup. Data cleanup or data correction is "information scrap and rework," the equivalent of
what manufacturing does with defective products: you either fix them or scrap them. Data is
subject to information quality decay -- people move, people get married, people get divorced -- and
if we do not have processes in place to capture that updated information, then we will be
condemning ourselves to continual data correction. Real information quality improvement applies
process improvement. When you find defective data, you need to determine the root cause of the
problem and once you do that, you can define processes to prevent the recurrence of those defects.
SearchCRM.com: What are some techniques for assessing information quality?
Profiling to discover anomalies is one. Another, validity assessment, is to define business rules
and measure for conformance. But one of the most important techniques is accuracy assessment. With
customer information, for example, you have to verify the data that you have against the real world
subject that the data represents. It requires actually taking a sample of customer records and then
contacting the customers to verify the information. When one of my clients -- a large financial
organization -- assessed their customer data, they did a validity test. That is, they measured
whether the customers' marital status codes had valid values ("M" for married, "S" for single,
etc.). There were virtually no errors -- no invalid values. However, when they contacted the
customers to verify it, they learned that 23.3% of those codes, while valid, were inaccurate.
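The difference between the two measurements can be sketched in a few lines of Python. This is a minimal illustration, not the client's actual procedure: the records and the `verified` answers are invented sample data, the function names are hypothetical, and the code set beyond "M" and "S" is an assumption.

```python
import random

# Assumed code set; the interview names only "M" and "S".
VALID_MARITAL_CODES = {"M", "S", "D", "W"}

def validity_rate(records):
    """Share of records whose marital status code is in the allowed set."""
    valid = sum(1 for r in records if r["marital_status"] in VALID_MARITAL_CODES)
    return valid / len(records)

def accuracy_rate(records, verified, sample_size, seed=0):
    """Share of a random sample whose stored code matches the verified value.

    `verified` maps customer id -> the status confirmed by contacting the customer.
    """
    rng = random.Random(seed)
    sample = rng.sample(records, min(sample_size, len(records)))
    matches = sum(1 for r in sample if verified[r["id"]] == r["marital_status"])
    return matches / len(sample)

# Invented sample data, for illustration only.
records = [
    {"id": 1, "marital_status": "M"},
    {"id": 2, "marital_status": "S"},
    {"id": 3, "marital_status": "S"},
    {"id": 4, "marital_status": "M"},
]
verified = {1: "M", 2: "M", 3: "S", 4: "D"}

print(validity_rate(records))                           # 1.0: every code is valid
print(accuracy_rate(records, verified, sample_size=4))  # 0.5: half are wrong
```

Every code passes the validity check, yet half of the sampled records disagree with what the customers reported. Only the second measurement requires going back to the real-world source.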
SearchCRM.com: So it takes a combination of sound processes -- including measuring for accuracy and
not just validity -- and then a process improvement, not just data correction or cleansing.
Right. One problem is that we have automated silos of information and processes, which means a lot
of redundancy. If all of the databases had data that was named and defined consistently with
consistent values, we'd have a much easier time of doing reconciliation and identifying duplicate
customers. But most of the time, they are defined uniquely to a given departmental line of
business. Managing the business down the silos creates fiefdoms and that creates politics. The
politics are there because most organizations reward behavior that is individual. But when it comes
to creating information for other parts of the business, there is not a willingness because there
is no reward. So we have to reorient the organization to a horizontal, value-chain view.
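To illustrate why consistent names and values make reconciliation easier, here is a rough Python sketch. The department names, field names and mappings are all hypothetical, and matching on an exact normalized name and postal code is a deliberate simplification; real duplicate detection usually needs fuzzier comparison.

```python
# Hypothetical per-department mappings onto one shared definition.
FIELD_MAPS = {
    "sales": {"cust_name": "name", "zip": "postal_code"},
    "billing": {"customer": "name", "postcode": "postal_code"},
}

def to_shared(department, row):
    """Rename a department's fields to the shared names and normalize values."""
    mapping = FIELD_MAPS[department]
    shared = {mapping[k]: v for k, v in row.items() if k in mapping}
    # Collapse whitespace and case so equivalent values compare equal.
    shared["name"] = " ".join(shared["name"].lower().split())
    shared["postal_code"] = shared["postal_code"].replace(" ", "").upper()
    return shared

def find_duplicates(rows_by_department):
    """Group records by (name, postal_code); return keys seen more than once."""
    groups = {}
    for department, rows in rows_by_department.items():
        for row in rows:
            shared = to_shared(department, row)
            key = (shared["name"], shared["postal_code"])
            groups.setdefault(key, []).append(department)
    return {key: deps for key, deps in groups.items() if len(deps) > 1}

# Invented sample data: the same customer held by two departments.
silos = {
    "sales": [{"cust_name": "Ann  Lee", "zip": "ab1 2cd"}],
    "billing": [
        {"customer": "ann lee", "postcode": "AB12CD"},
        {"customer": "Bob Roy", "postcode": "ZZ9 9ZZ"},
    ],
}
print(find_duplicates(silos))
```

The mapping table is the costly part: it exists only because each silo defined the same customer attributes in its own way.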
SearchCRM.com: What exactly does that entail?
English: We have to reorient the organization to a horizontal, value-chain view. Information tends
to be produced in one part of the business, but used in another part, usually downstream.
The real accountability is not with an appointed data steward but with the manager who oversees
processes that create or update data. That manager needs to be accountable for the quality of
information produced by their processes for others in the organization who depend on it. It's a
supplier/customer relationship. And there are forward-thinking companies that are putting that
accountability into managers' job descriptions. We have the technology, but we have not understood
the principles of the Information Age. The Information Age requires management of processes
horizontally across the value chains, and it requires holding managers accountable for their
information. As a result, we have forced our knowledge workers to become information hunters and
gatherers.
SearchCRM.com: Do you think vendors understand this "value chain approach" to data
quality?
Some of them do. But in many cases, even if software providers do have a process methodology, it
tends to be based on the features and functionality of their tools and not on quality management
principles, per se.
SearchCRM.com: What should companies look for when evaluating data quality
tools and technology?
There are a variety of information quality software functionalities such as profiling or cleansing.
Defect prevention is another example and it's actually the highest form of software capability in
the information quality space. But companies need to understand the problems they are seeking to
solve. Stop focusing on fixing the defects (i.e., "scrap and rework") because it's costly. Experts
like [W. Edwards] Deming have taught us that there's a better way: designing quality in. So my
advice is to look for tools that help solve the process problem -- tools not just for cleansing,
but also for defect prevention capabilities. And it's important to understand the limitations of
technology -- electronically you cannot correct all data and you may or may not be able to
guarantee the accuracy of the data provided. The best place to solve the problems with defective
data is right at the point of knowledge capture -- verify the information is correct and complete.
The little bit of time it takes to do that prevents so much grief and customer alienation.
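As a rough sketch of what checking at the point of capture might look like, the Python below rejects a record that is incomplete or carries an invalid code before it is ever stored. The field names, the code set and the function names are assumptions made for illustration. Note the limitation mentioned above: software can test completeness and validity, but accuracy still has to be confirmed with the customer.

```python
# Hypothetical field names and an assumed code set, for illustration only.
REQUIRED_FIELDS = ("name", "postal_code", "marital_status")
VALID_MARITAL_CODES = {"M", "S", "D", "W"}

def capture_problems(record):
    """Return a list of problems found in a record at the point of capture."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            problems.append(f"missing {field}")
    status = record.get("marital_status")
    if status and status not in VALID_MARITAL_CODES:
        problems.append(f"invalid marital_status: {status!r}")
    return problems

def capture(record, store):
    """Save the record only if it passes the checks; otherwise report the problems."""
    problems = capture_problems(record)
    if not problems:
        store.append(record)
    return problems

store = []
print(capture({"name": "Ann Lee", "postal_code": "12345", "marital_status": "M"}, store))
print(capture({"name": "", "postal_code": "12345", "marital_status": "X"}, store))
```

The first record is stored; the second is returned to the person entering it with a list of what to fix, so the defect never reaches the database.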
