Mining Data for Nuggets of Knowledge

When Richard Fairbank, CEO of Capital One, met with Wall Street analysts last month, he told them an unusual story about why his company, a Virginia-based issuer of Visa and MasterCard credit cards, has been doing well at a time when other credit card issuers are struggling with rising interest rates and ineffective promotions. Recognizing that students represented an undertapped market, Capital One recently delved into new mailing lists and pitched highly targeted offers tailored to students’ needs. Result: students, who often dump most credit card offers into dustbins, responded in droves. As Business Week (Nov. 22) points out, Capital One’s mailings returned 70% more responses than similar offers to other mailing lists of students.

What was Capital One’s secret? The answer lies in a fast-emerging but arcane discipline called data mining, which more and more companies are beginning to employ. As computers have proliferated, most businesses now routinely collect large volumes of data about customers and their transactions. Telecommunications companies and credit card issuers, for example, track phone calls and transactions by millions of customers. Data mining uses sophisticated models drawn from fields ranging from statistics and computer science to artificial intelligence to drill through billions of bits of data for nuggets of information. Ultimately, that information can translate into knowledge and insights about customers and markets.

While companies have long used statistical tools to monitor customer behavior, what sets data mining apart is that it can juggle huge volumes of data, according to Jacob Zahavi, a visiting professor at Wharton. "Conventional statistical methods work well with small data sets," he explains. "Today’s databases, however, can involve millions of rows and scores of columns of data."

Zahavi and two colleagues—Lyle Ungar from the University of Pennsylvania’s Computer and Information Science department and Robert Stine of Wharton’s Statistics department—will teach a short Executive Education course in December about data mining. While industries ranging from financial services to telecommunications have been turning to data mining, the discipline already faces several challenges. One is coping with the coming of the Internet, which has made it easy and inexpensive for companies to collect even larger volumes of data. Another is the strategic challenge that managers face: Who takes ownership of data mining projects in companies where teams from several departments must collaborate to implement them?

Companies like Capital One seem to have answered that question, if the results obtained through their data mining efforts are any indication. Capital One executives analyzed how various classes of customers respond to features ranging from annual fees to interest rates. Then they set varying rates of fees, interest rates and features, developing in all some 7,000 varieties of MasterCard and Visa products for different customer groups. Such segmentation allows credit-card companies to manage risk, Zahavi explains. "High-risk customers get offers with high interest rates while those who pay their bills on time get offers with low interest rates," he says. If the data mining model works, it increases the likelihood that customers will receive offers they are likely to respond to, rather than ones they toss in the garbage can.

Ungar points out that credit-card issuers have also found data mining useful in fraud detection. Companies do this by scanning databases for transactions that don’t jell with a customer’s past card usage patterns. For instance, a credit card transaction in which a customer, who does not usually do so, buys $2 worth of gasoline immediately raises a red flag, Ungar says, because it might suggest experimental use of a stolen credit card. Credit-card issuers also use data mining to identify customers who represent a high risk of declaring bankruptcy. Telecommunications companies also employ data mining models to track fraudulent phone usage. Long international calls to a country that a user has never called before can signal a stolen phone card or some other kind of abuse.
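
The kind of scan Ungar describes can be illustrated with a very small sketch. It is not how any particular card issuer works; it simply assumes that a purchase is suspicious when its amount sits far outside the customer's past spending, measured in standard deviations. The function name flag_unusual, the threshold of 3.0 and the sample amounts are all hypothetical choices made for illustration.

```python
from statistics import mean, stdev


def flag_unusual(history, new_amount, threshold=3.0):
    """Return True if new_amount is far from the customer's usual spending.

    history    -- past transaction amounts for one customer (illustrative data)
    new_amount -- the amount of the transaction being checked
    threshold  -- how many standard deviations counts as unusual (an assumption)
    """
    if len(history) < 2:
        return False  # too little history to judge either way
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # Every past purchase was identical, so any other amount stands out.
        return new_amount != mu
    return abs(new_amount - mu) / sigma > threshold


# A customer who usually spends about $40 per purchase (made-up numbers).
past_purchases = [42.0, 38.5, 45.0, 40.0, 41.5]

print(flag_unusual(past_purchases, 2.0))   # the $2 test purchase is flagged
print(flag_unusual(past_purchases, 43.0))  # a typical purchase is not
```

A real system would look at far more than the amount, such as the merchant type, the location and the time of day, which is where the scale of data mining comes in.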

Pharmaceutical companies often use data mining for both clinical and marketing operations. Stine explains that big drug firms often sort through massive databases of compounds to screen for the most promising ones, a task that is nearly impossible to perform manually. Sometimes, success arrives serendipitously. Pfizer, for example, did not intend to develop Viagra as a treatment for impotence. The drug’s original purpose was to relieve angina pain, but data analysis revealed that men who used it experienced sexual arousal, which eventually led to its development as a treatment for impotence.

Despite such successes, data mining also faces major challenges. Zahavi notes that on the technical front, the key hurdle is to develop better algorithms and models that can handle ever-larger data sets as rapidly as possible. "Scalability is a huge issue in data mining," he says. "Another technical challenge is developing models that can do a better job analyzing data, detecting non-linear relationships and interaction between elements."

The key business challenge is identifying problems that can suitably be analyzed with data mining tools. This issue has emerged as a particularly nettlesome one since the explosion of the Internet, as more companies move towards introducing e-commerce. "The rules of customer behavior are different in the Internet environment than they are in the physical world," says Zahavi. "Data are easier to capture on the web, but at the same time customer decision periods are much shorter. Once someone logs onto a certain website, you want to be able to make the right offer to that user during that session. Special data mining tools may have to be developed to address web-site decisions."

Even if data mining experts figure out ways to deal with the impact of the Internet, a major challenge will still remain. As Stine puts it, at one pharmaceutical company the biggest issues that came up during data mining meetings were organizational rather than statistical. "The database was being designed by the computer sciences group, but the chemistry group was going to collect the data and the statistics group was going to organize it," he says. "So who was responsible for this project? Each group had its own quarterly objectives which didn’t necessarily depend on the success of the data mining venture." Stine argues that unless organizations find ways to overcome this kind of silo mentality, it can undermine data mining projects.

For companies that succeed in data mining, however, the rewards can be enormous. "Every company now has more data than ever before about who is using its products and how, and this feedback contains nuggets or patterns," says Stine. "If you could only see those nuggets and recognize those patterns, that could result in a burst of insight." And some bursts of insight can be very profitable. Ask Capital One.
