Equifax and FICO on Applying Machine Learning to Open Data
Credit rating organizations explore different approaches to understanding a flood of transaction information.
Teams that work with open data may feel like they face an explosion of information these days, but there are resources being brought to bear to process such data and stem the tide.
Last week’s FICO World conference in New York revealed some of the varied ways the credit niche of the financial world tries to apply big data analytics and so-called decision technology.
The conference was largely a showcase for data analytics company FICO, but some presentations spoke to a broader context -- using machine learning and other resources to process vast amounts of data. Peter Maynard, senior vice president of data and analytics for strategic client and partner engagement at Equifax, spoke about a partnership between his consumer credit reporting agency and FICO. He was joined by Tom Johnson, senior director with FICO, to discuss their joint effort combining data in a platform for decision making.
Part of the drive for the platform, Maynard said, was a desire to bring together data assets from FICO and Equifax. “We both developed machine learning capabilities exploiting AI,” he said. Through the platform, Maynard said, those capabilities are put to work for use cases such as decisioning or predictive purposes. AI and algorithms, he said, are effective at identifying interactions and patterns of behavior in more efficient ways than a data scientist. “The work has been done for you,” he said. “You don’t have to go out and get all the data sources.”
Johnson said the intent of creating the platform was to address problems such as the inherent inefficiency in analytics when it comes to ever-growing datasets. The goal, he said, was to deliver insights quickly to help organizations be more responsive. “A lot of times, clients told us it might take three months to years to get new analytics,” he said.
More data can lead to greater competitive insight, Johnson said, but the time spent waiting for analysis meant circumstances might have changed significantly. The dashboard could be used to compare performance with an organization’s biggest competitors, he said. “It allows you to take advantage of opportunities in the market that you need to move quickly.”
Johnson said the platform was designed to increase speed and reduce friction through interoperability with existing third-party systems. He said analytics derived from the platform can be used in multiple lines of business, as well as across the credit life cycle. “We didn’t want you to have to rip and replace all of your systems in order to take advantage of this,” he said.
There is a need, Maynard said, for fresh and current understanding of data for organizations to remain viable. “If you’re not testing new data every time you build a new strategy or model, you’re falling behind your competition.”
Understanding data, though, may call for varied approaches. Later in the conference, Joseph Murray, director of analytic science at FICO, spoke about research in unsupervised machine learning. Technology FICO developed for quantitative estimation, he said, is used in platforms to detect and prevent abuse in procurement and expense reports. Some challenges, though, require unsupervised machine learning, Murray said.
Supervised machine learning, he said, uses data that is already labeled, so the classes the model is meant to detect are known in advance. That might mean labeling each record as fraud or not fraud. “The machine learning model’s job is to try to separate those two classes,” said Murray.
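To make the idea of separating two labeled classes concrete, here is a minimal sketch of a nearest-centroid classifier in Python. It is an illustration only, not a method Murray described: the function names, the two features (transaction amount and transactions in the past hour), and the sample values are all hypothetical.

```python
from statistics import mean


def fit_centroids(samples, labels):
    """Return {label: centroid}, where a centroid is the per-feature mean of that class."""
    grouped = {}
    for features, label in zip(samples, labels):
        grouped.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(centroids, features):
    """Assign the label whose centroid is closest in squared Euclidean distance."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(centroids, key=lambda label: squared_distance(centroids[label], features))


if __name__ == "__main__":
    # Hypothetical labeled records: (amount, transactions in the past hour).
    samples = [(20.0, 1.0), (35.0, 2.0), (25.0, 1.0),
               (900.0, 9.0), (1200.0, 8.0), (950.0, 10.0)]
    labels = ["not_fraud"] * 3 + ["fraud"] * 3

    centroids = fit_centroids(samples, labels)
    print(predict(centroids, (30.0, 1.0)))
    print(predict(centroids, (1000.0, 9.0)))
```

The point of the sketch is that the labels do the heavy lifting: without the "fraud" and "not_fraud" tags on the training records, there would be no centroids to compare against.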
Some of the most common paradigms in machine learning, he said, are focused on supervised learning. At FICO, Murray said his team deals with challenges that require unsupervised machine learning. “In our case, a lot of what we are looking for is very rare events, unusual behaviors and anomalies,” he said.
Joseph Murray, director of analytic science at FICO (Image: Joao-Pierre S. Ruth)
Unsupervised machine learning, Murray said, can be applied to “tip of the iceberg” type problems with data. “We know there is a lot hiding under the surface but we’re not able to identify which class or label it belongs to,” he said. There are instances where it can be difficult to label all data or get feedback to discern its true underlying classes. That can be the case, Murray said, in the anti-money laundering world where the aim is to find suspicious actors. “Most of the systems will only be able to identify a small percentage of the population to be subject to further investigation,” Murray said.
FICO uses self-calibrated, self-adapted quantile estimation algorithms, he said, to cover such needs. While it might be possible to look at a set of transaction data over an extended period, it can be difficult to do so quickly, especially with streaming data. Murray said this is where unsupervised machine learning can come into play. “That’s the type of algorithm we’ve developed,” he said, “that can work with a small set of memory, a small footprint in database storage, and accurately and quickly update.”
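FICO did not detail its algorithm in the talk, but the general idea of tracking a quantile over streaming data with a small memory footprint can be sketched with a textbook stochastic-approximation update. The Python below is an assumption-laden illustration, not FICO's method: the class name, the fixed step size, and the simulated stream are all hypothetical.

```python
import random


class StreamingQuantile:
    """Track one quantile of a stream in constant memory.

    Uses a simple stochastic-approximation rule: nudge the estimate up when a
    value exceeds it and down otherwise, weighted so the estimate settles near
    the p-th quantile. The step size is in the same units as the data.
    """

    def __init__(self, p, step=0.01):
        if not 0.0 < p < 1.0:
            raise ValueError("p must be strictly between 0 and 1")
        self.p = p
        self.step = step
        self.estimate = None  # set from the first observation
        self.count = 0

    def update(self, value):
        """Fold one observation into the estimate and return the new estimate."""
        self.count += 1
        if self.estimate is None:
            self.estimate = float(value)
        elif value > self.estimate:
            self.estimate += self.step * self.p
        else:
            self.estimate -= self.step * (1.0 - self.p)
        return self.estimate

    def exceeds(self, value):
        """Return True if the value lies above the current quantile estimate."""
        return self.estimate is not None and value > self.estimate


if __name__ == "__main__":
    random.seed(0)
    tracker = StreamingQuantile(p=0.95, step=0.01)
    # Simulated stream: values drawn uniformly from [0, 1).
    for _ in range(20000):
        tracker.update(random.random())
    print(f"estimated 95th percentile: {tracker.estimate:.3f}")
    print("0.99 flagged as unusual:", tracker.exceeds(0.99))
```

The tracker stores only a single number and a counter, no matter how many records pass through, which is the property Murray's description emphasizes. A fixed step keeps the estimate responsive to drift at the cost of some jitter; a production system would likely tune or adapt that step.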