Bloomberg Adds Machine-Learning To Apache Solr Search

Bloomberg has created an open source way to use machine-learning models to weight searches, then add their values to the Solr search engine.

Charles Babcock, Editor at Large, Cloud

February 22, 2017


Bloomberg has found the Solr open source search engine valuable as a core element of several Bloomberg products. Now it has come up with a Solr plug-in that lets Solr users build a machine learning model of what information is most valuable to them, then add that information through the plug-in to Solr searches.

Bloomberg is a major distributor of financial news and the supplier of the widely used Bloomberg Terminal, which streams business news to the financial services industry. An effective search engine is therefore part of maintaining its competitive advantage.

Creating a means to train that engine to rank search results according to criteria decided by its internal teams gives Bloomberg an ongoing advantage over other news and terminal services suppliers. Bloomberg has done that through its Learning-To-Rank plug-in for Solr. It has also made the code for the plug-in open source through the Apache Software Foundation, host to the Solr search engine project.

"We have smart people working on Solr," said Kevin Fleming, member of the CTO's office and head of the open source community at Bloomberg, in an interview. "You want to be a good community citizen, you want to contribute code and fix bugs. You also want to make good business decisions," he explained on Feb. 13, as he prepared to address a session at the Linux Foundation's Open Source Leadership Summit at Lake Tahoe Feb. 14-16.

Making the code open source assures Bloomberg customers that its ongoing development will continue at a rapid pace through an independent governing board and outside contributors, Fleming explained. It will continue even if Bloomberg's interest in the project flags. Solr and Learning-To-Rank will remain available as open source, even if Bloomberg were, for some reason, to drop its involvement.


Bloomberg has enhanced its news services' appeal by illustrating that it knows how to help its customers make use of the latest technology. Solr users can build ranking models and coach them on what they want through machine learning. Learning-To-Rank can then take the results from those models and apply them to the search results.

"Tens of thousands of news sources are fed into Bloomberg Terminal or Bloomberg Professional Services," noted Fleming. A ranking model is something like a spreadsheet with designated "features" and weighting factors for what the user wants to look for. The goal is to let computers sift the thousands of results and assign a specified value to the rankings instead of forcing customers to manually go through hundreds of repetitious searches, filtering down the results to what they really want.
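The spreadsheet analogy maps most directly onto a simple linear model: each result has a row of feature values, each feature has a weight, and the score is their weighted sum. The sketch below illustrates that idea only. It is not code from the Learning-To-Rank plug-in, and the feature names, weights, and documents are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Document:
    doc_id: str
    features: Dict[str, float]  # feature name -> value for this result


# Hypothetical weighting factors; in practice these would be learned
# from training data rather than written by hand.
WEIGHTS = {"recency": 0.5, "source_authority": 0.3, "title_match": 0.2}


def score(doc: Document, weights: Dict[str, float]) -> float:
    """Weighted sum of feature values; missing features count as 0."""
    return sum(w * doc.features.get(name, 0.0) for name, w in weights.items())


def rerank(docs: List[Document], weights: Dict[str, float]) -> List[Document]:
    """Return the documents ordered from highest to lowest model score."""
    return sorted(docs, key=lambda d: score(d, weights), reverse=True)


docs = [
    Document("a", {"recency": 0.2, "source_authority": 0.9, "title_match": 1.0}),
    Document("b", {"recency": 0.9, "source_authority": 0.5, "title_match": 0.0}),
    Document("c", {"recency": 0.1, "source_authority": 0.1, "title_match": 0.0}),
]

for doc in rerank(docs, WEIGHTS):
    print(doc.doc_id, round(score(doc, WEIGHTS), 2))
```

Changing the weights changes the ordering without touching the search itself, which is the sense in which the model, not the user, does the sifting.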

"This will save a lot of search teams a lot of time," he predicted.

Bloomberg provides the software framework for building the machine-learning model, which remains a proprietary offering. Making the plug-in open source, however, means other parties will be able to build competing model frameworks that will also work with the Learning-To-Rank plug-in and Solr.

"The plug-in will do exactly what the model tells it to do," said Fleming. "Every major search application has that as a goal – to present results that are most desirable to the user," he said.

The LTR plug-in was regarded as useful enough and mature enough to be included as part of the latest 6.4 release of Solr. It's not activated automatically. Solr users need to choose to use the plug-in, but "it's available and ready for everyone to experiment with," Fleming said.
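As a rough illustration of what opting in can look like at query time, Solr's Learning To Rank documentation describes a re-rank parameter of the form rq={!ltr model=... reRankDocs=...}. The sketch below only builds such a request URL and does not contact a server. The host, the collection name "news", and the model name "exampleModel" are hypothetical, and it assumes the plug-in has already been enabled and a model uploaded.

```python
from urllib.parse import urlencode, parse_qs

# Hypothetical Solr endpoint and collection name, for illustration only.
SOLR_SELECT_URL = "http://localhost:8983/solr/news/select"


def build_ltr_query(user_query: str, model_name: str, rerank_docs: int = 100) -> str:
    """Build a select URL that asks Solr to re-rank the top results
    with a previously uploaded LTR model."""
    params = {
        "q": user_query,
        # Re-rank query: apply the named model to the top `rerank_docs` hits.
        "rq": f"{{!ltr model={model_name} reRankDocs={rerank_docs}}}",
        "fl": "id,score",
    }
    return f"{SOLR_SELECT_URL}?{urlencode(params)}"


url = build_ltr_query("bond yields", "exampleModel")
print(url)
```

A query without the rq parameter is served with Solr's normal ranking, so re-ranking stays a per-request choice.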

Fleming said the plug-in is so new to Solr that Bloomberg's development team hasn't received much feedback on it yet, but he expects suggestions for new features and functions to start arriving as soon as the Solr community is familiar with it.

About the Author(s)

Charles Babcock

Editor at Large, Cloud

Charles Babcock is an editor-at-large for InformationWeek and author of Management Strategies for the Cloud Revolution, a McGraw-Hill book. He is the former editor-in-chief of Digital News, former software editor of Computerworld and former technology editor of Interactive Week. He is a graduate of Syracuse University where he obtained a bachelor's degree in journalism. He joined the publication in 2003.
