Over time, anyone seriously using customer information for business intelligence must begin to de-duplicate and merge the data. Ideally this happens up front and is planned into both the initial build and the ongoing update of the customer data. Where possible, these "cleansing" processes should also start at the point of data entry, where data is filtered and entry operators are given access to existing information to help ensure that transactions are assigned to the appropriate customer.
But even organizations that have improved their data entry environments and processes still need to deploy back-end de-duplication and matching in the data warehouse.
The purpose of de-duplication is fairly obvious -- we want one record per customer. In the race to establish an accurate profile, we can't afford to split one customer's records across multiple customer profiles just because the customer went to a different store, changed employers or re-registered with us.
Equally important is the need to keep separate (to "duplicate") those customers who are distinct but otherwise appear similar. For example, the suffix part of a name can be important when distributing prescriptions to John Doe Sr. and John Doe Jr. when both live in the same house.
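As a rough illustration of both goals, the Python sketch below groups records on a simple match key made of the normalized name, the name suffix and the address. Everything in it is an assumption made for the example: the field names, the suffix list, the sample records and the `match_key` and `deduplicate` helpers are hypothetical, and exact key matching stands in for real parsing and matching, which is considerably more involved.

```python
from collections import defaultdict

# Hypothetical list of name suffixes that distinguish otherwise identical names.
SUFFIXES = {"sr", "jr", "ii", "iii", "iv"}


def match_key(record):
    """Build an illustrative match key: (normalized name, suffix, address)."""
    tokens = record["name"].lower().replace(".", "").replace(",", "").split()
    suffix = tokens[-1] if tokens and tokens[-1] in SUFFIXES else ""
    if suffix:
        tokens = tokens[:-1]
    # Collapse case and repeated whitespace in the address.
    address = " ".join(record["address"].lower().split())
    return (" ".join(tokens), suffix, address)


def deduplicate(records):
    """Group records that share a match key; each group is one customer."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record)].append(record)
    return list(groups.values())


# Sample data invented for this example.
records = [
    {"name": "John Doe Sr.", "address": "12 Elm St", "source": "store_a"},
    {"name": "john doe sr", "address": "12  Elm St", "source": "store_b"},
    {"name": "John Doe Jr.", "address": "12 Elm St", "source": "store_a"},
]

customers = deduplicate(records)
for group in customers:
    print(match_key(group[0]), [r["source"] for r in group])
```

With this sample data, the two John Doe Sr. records from different stores fall into one group, while John Doe Jr. stays a separate customer even though he shares the address. Keeping the suffix in the key is what prevents the father and son from being collapsed into one record.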
Merging is the process of defining relationships among customers, such as householding, other family ties and circles of influence. Merging takes existing records from similar or disparate systems and merges them into one record or builds
Parsing and matching technologies are key to effective de-duplication and merging. We'll explore them in the next tip.
This was first published in March 2002