Aggressive and literal matching are two techniques used in the process of comparing customer (or other) records for purposes of de-duplication and merging to achieve quality customer data.
Literal matching requires field values to match exactly to be considered a match. Aggressive matching takes more liberties with customer data values, bringing in intelligence and using partial and heuristic matching, in which close or partial matches are scored.
One approach to aggressive customer data matching is to assign a comparative score to each pair of data fields being compared.
Many methods of customer data comparison are possible, including direct comparison, comparison of extended attributes and various forms of spelling and phonetic checking. The highest score produced by any method can be compared to a threshold value to determine the success of the match. Alternatively, each method can be weighted, producing an overall comparative score that, again, can be compared to a threshold value to determine the success of the match.
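The two ways of turning per-method scores into a match decision can be sketched as follows. This is a minimal illustration, not a reference implementation: the function names, the 0-to-1 score scale, the sample weights and the 0.8 threshold are all assumptions made for the example.

```python
# Hypothetical sketch: combine per-method scores into a match decision.
# Scores are assumed to lie between 0.0 (no match) and 1.0 (exact match).

def best_score(scores):
    """Option 1: take the score of the method that scored highest."""
    return max(scores.values())


def weighted_score(scores, weights):
    """Option 2: weighted average of all method scores."""
    total_weight = sum(weights[method] for method in scores)
    return sum(scores[method] * weights[method] for method in scores) / total_weight


def is_match(score, threshold=0.8):
    """Compare a score to the threshold (0.8 is an illustrative value)."""
    return score >= threshold


# Illustrative scores for one pair of name fields, e.g. "Smith" vs. "Smyth".
scores = {"direct": 0.0, "spelling": 0.9, "phonetic": 1.0}
weights = {"direct": 0.5, "spelling": 0.3, "phonetic": 0.2}

match_by_best = is_match(best_score(scores))
match_by_weight = is_match(weighted_score(scores, weights))
```

With these sample numbers the two options disagree: the highest single score passes the threshold, while the weighted score does not because the direct comparison carries the most weight. Choosing between them is a decision about how aggressive the matching should be.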
In aggressive customer data matching, a direct comparison is the equivalent of literal matching: the fields are compared exactly as written. Using extended attributes looks at the literal and phonetic values in adjoining, relevant fields (like address, phone number and zip code when comparing customers for householding). Spelling checks consider small spelling differences and common misspellings of words. Finally, phonetic checks consider words with similar sounds as potential matches.
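The four kinds of comparison described above might look like this in code. It is a rough sketch under stated assumptions: the spelling check uses the standard library's difflib similarity ratio as a stand-in for a real spelling checker, the phonetic check uses a simplified Soundex code, and the record layout and field names (name, address, zip) are hypothetical.

```python
from difflib import SequenceMatcher

# Soundex digit for each consonant; vowels, H, W and Y have no digit.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(text):
    """Simplified Soundex: first letter plus up to three digits."""
    letters = [c for c in text.upper() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = _CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        if ch not in "HW":  # H and W do not separate repeated codes
            prev = digit
    return (code + "000")[:4]


def direct_score(a, b):
    """Literal matching: the values must be identical."""
    return 1.0 if a == b else 0.0


def spelling_score(a, b):
    """Tolerates small spelling differences (difflib ratio, 0.0 to 1.0)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def phonetic_score(a, b):
    """Values that sound alike share a Soundex code."""
    code_a, code_b = soundex(a), soundex(b)
    return 1.0 if code_a and code_a == code_b else 0.0


def extended_score(rec_a, rec_b, fields=("address", "zip")):
    """Average the literal-or-phonetic score over adjoining fields."""
    per_field = [max(direct_score(rec_a[f], rec_b[f]),
                     phonetic_score(rec_a[f], rec_b[f])) for f in fields]
    return sum(per_field) / len(per_field)


a = {"name": "Jon Smyth", "address": "12 Main St", "zip": "02139"}
b = {"name": "John Smith", "address": "12 Main Street", "zip": "02139"}

name_scores = {
    "direct": direct_score(a["name"], b["name"]),
    "spelling": spelling_score(a["name"], b["name"]),
    "phonetic": phonetic_score(a["name"], b["name"]),
    "extended": extended_score(a, b),
}
```

Here the two name values fail the direct comparison but agree phonetically, and the adjoining address and zip fields support the match. A production system would likely normalize addresses and use a more refined phonetic algorithm than this simplified Soundex.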
This was first published in March 2002