Text mining technology, which promises to dig deep into your company's textual data and deliver valuable, actionable intelligence, was a hot topic several years ago. But it has been relatively slow to evolve and gain widespread popularity.
"It hasn't gotten the huge traction people thought it would," said Alexander Linden, a research vice president with Stamford, Conn.-based Gartner Inc. "It's occupying niche industries. We remain optimistic, but the adoption speed is slow."
Areas like claims fraud in the insurance industry, criminal intelligence in government agencies and clinical trial testing in the life sciences industry have all seen some success with the technology, but such scenarios remain fragmented and infrequent. Where text mining has gained more traction is in customer-facing operations and in integration with CRM systems.
"The breakthrough in text mining is with some of the CRM solutions and the people who have applied that to CRM," Linden said.
One company that has found some text mining success is Palo Alto, Calif.-based Hewlett-Packard Co. The high-tech giant has used the Text Miner application from SAS Institute Inc. in Cary, N.C., to smooth its merger with Compaq and to see what its most valuable business-to-business customers were discussing.
Randall Collica, a senior business analyst, joined the company after the Compaq merger. He works out of HP's call center offices in Littleton, Mass., providing marketing intelligence for the company's campaigns in the Americas region.
"When I first came into the CRM project, I noticed we had a lot of character-based data -- unstructured data -- that we would love to have made use of," he said. "IT was struggling to find ways to report on it. It was too difficult to do in any kind of volume."
Telesales operators working out of the Littleton call centers had a field associated with customer records where they could enter notes on customer interactions, business process issues or text they cut and pasted from customer e-mails.
"I looked [at it] and said this is valuable information for marketing analysis," Collica said.
HP had a system for clustering structured data, so Collica was able to cluster groups of customers with similar characteristics, then mine the text from the free-form field and determine what the company's most valuable customers were talking about. Additionally, Collica was able to sync the structured data with the unstructured data. HP could then see what kind of service problems or concerns different customer segments were having.
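The idea of pairing customer segments with free-form notes can be illustrated with a minimal Python sketch. It is not HP's system or SAS Text Miner: it assumes the segment labels have already been produced by a structured-data clustering step, and the customer IDs, segment names, notes and stopword list are all hypothetical. It simply counts the most frequent terms in the notes for each segment.

```python
from collections import Counter, defaultdict
import re

# Hypothetical stopword list; a real deployment would use a much larger one.
STOPWORDS = {"the", "a", "an", "and", "to", "of", "is", "on", "for",
             "with", "about", "again"}

def tokenize(text):
    """Lowercase the note, keep alphabetic words, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def top_terms_by_segment(segments, notes, n=3):
    """segments: customer_id -> segment label (assumed to come from a
    separate clustering of structured data).
    notes: list of (customer_id, free-form text) pairs."""
    counts = defaultdict(Counter)
    for customer_id, text in notes:
        segment = segments.get(customer_id)
        if segment is None:
            continue  # note has no matching structured record; skip it
        counts[segment].update(tokenize(text))
    return {seg: [term for term, _ in c.most_common(n)]
            for seg, c in counts.items()}

# Hypothetical data for illustration only.
segments = {"c1": "high_value", "c2": "high_value", "c3": "low_value"}
notes = [
    ("c1", "Customer asked about server delivery delay"),
    ("c2", "Delivery delay on server order again"),
    ("c3", "Question about printer ink pricing"),
    ("c9", "No segment on record for this customer"),
]
result = top_terms_by_segment(segments, notes)
print(result)
```

Raw term counts are the crudest possible measure. A production text mining tool would typically add weighting, stemming and phrase detection, which this sketch leaves out.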
Millions and millions of SKUs
As a result of the Compaq merger, HP found itself with products that had entirely different histories and hierarchies.
"Our group had the wonderful, dubious task of finding out how products merged together," Collica said. "Until we had custom hierarchy, we were in a bit of a fix."
The answer lay in taking the product descriptions of the million stock-keeping units (SKUs) from one database and the million SKUs from another and applying Text Miner. The textual analysis was able to merge them into common themes and sets and establish a rough set of similar product groups.
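A rough sense of how descriptions from two catalogs can fall into shared product groups comes from the small Python sketch below. It does not reproduce Text Miner's method, which the article does not describe. It swaps in a simple word-overlap (Jaccard) similarity with a greedy grouping pass, and the SKU codes, descriptions and 0.4 threshold are hypothetical.

```python
import re

def tokens(description):
    """Split a product description into a set of lowercase word/number tokens."""
    return set(re.findall(r"[a-z0-9]+", description.lower()))

def jaccard(a, b):
    """Share of tokens two descriptions have in common (0.0 to 1.0)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def group_skus(catalog, threshold=0.4):
    """catalog: list of (sku, description) pairs.
    Greedy single pass: each SKU joins the first group whose seed
    description is similar enough, otherwise it starts a new group."""
    groups = []  # each group: {"seed": token set, "skus": [sku, ...]}
    for sku, description in catalog:
        toks = tokens(description)
        for group in groups:
            if jaccard(toks, group["seed"]) >= threshold:
                group["skus"].append(sku)
                break
        else:
            groups.append({"seed": toks, "skus": [sku]})
    return [group["skus"] for group in groups]

# Hypothetical SKUs standing in for the two merged product databases.
catalog_a = [("A-100", "Notebook PC 15 inch laptop"),
             ("A-200", "Desktop PC tower")]
catalog_b = [("B-901", "Laptop notebook PC 14 inch"),
             ("B-902", "Tower desktop PC workstation")]

grouped = group_skus(catalog_a + catalog_b)
print(grouped)
```

The greedy pass compares every SKU against every existing group seed, so it is only practical for small samples. At the scale of millions of SKUs, a real tool would need indexing or a proper clustering algorithm.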
"A lot of the time in marketing they don't care whether something is in one product division or another, they just want to know [whether] it [is] a laptop or desktop," Collica said. "For eight months, we were able to solve that problem."
HP does have the advantage of possessing a good amount of textual data in the first place -- from multiple call centers, Web-based customer data such as PDF files, and other sources. It also has seven analysts working with SAS data mining tools.
According to Linden, it is companies with large amounts of data that are getting the most value out of text mining technology, but people with the necessary skills to use it are not easily found.
"Data mining is a skill," Linden said. "With text mining, not only do you need data mining but you need linguistic skills. It's quite a bit more challenging to deploy these solutions if you have to build it yourself. The customization hasn't really happened yet."