Breakthrough NoSQL-style text analysis speeds search for technology developers.
IBM and North Carolina State University (NCSU) have revealed details of an innovative text analytics application that is helping the university's Office of Technology Transfer find potential partners.
NCSU faculty, staff and students developing new technologies don't always know where to turn to commercialize their inventions. It's the job of the Office of Technology Transfer at the Cary, NC, campus to manage the university's intellectual property and find the best possible development partners for early-stage technologies.
Seven full-time licensing professionals in the office handle the task of combing through sources including SEC filings, tech blogs and Internet sites to find would-be partners. But conventional search methods were too time-consuming to keep up with demand.
"We manage more than 3,000 technologies at any one time, so the case load is large and we needed something that could speed the process of finding potential partners," said Billy Houghteling, Director of the Office of Technology Transfer.
Houghteling's group started working with IBM about eight months ago as part of a much larger innovation management initiative across the entire 16-campus University of North Carolina system, of which NCSU is a part. To meet the unit's fine-grained search needs, an IBM team recommended two enabling technologies: IBM Big Sheets and the IBM Content Analyzer.
Big Sheets is a non-relational data-analysis tool that provides a spreadsheet-style analysis front end for Hadoop. Often associated with the budding NoSQL movement, Hadoop is an open-source framework for distributed storage and processing, rather than a database in the conventional sense, and it can handle huge volumes of unstructured data. Web content and social-network data streams are typical of the kinds of data handled in Hadoop.
IBM Content Analyzer is a language-aware search and text analytics engine. The tool has been integrated with Big Sheets so it can explore and analyze the unstructured data exposed there. The two tools work together to spot relevant content with far greater precision than simplistic keyword-search tools.
In a pilot test conducted this spring, the Big Sheets/Content Analyzer combination turned up concise, highly targeted lists of partner prospects for two promising technologies developed at NCSU.
"The tools identified the same 10 to 15 companies that we identified with our traditional triage methods, but they also uncovered 25 additional companies, more than doubling the number of potential partners," Houghteling said.
To be precise, the Big Sheets/Content Analyzer combo didn't hand researchers 40 leads on a silver platter. Rather, prospects were winnowed down from a relevance-prioritized list of about 1,000 candidate companies. Researchers focused on the first few hundred hits to get to the list of final candidates. But the bottom line is that the Big Sheets/Content Analyzer approach took a much shorter time to yield a more complete list of prospects.
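For readers who want a feel for what a "relevance-prioritized list" involves, here is a minimal sketch in Python. It is not IBM's method; the article does not describe how Content Analyzer scores documents. The sketch assumes a simple TF-IDF weighting, and the function names, company names and descriptions are all hypothetical, made up purely for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and strip basic punctuation from each word."""
    words = (w.strip(".,;:()").lower() for w in text.split())
    return [w for w in words if w]


def rank_candidates(query, documents, top_n=3):
    """Return up to top_n (name, score) pairs, highest score first.

    Scoring is a plain TF-IDF sum over the query terms. This is an
    assumed stand-in for illustration, not the vendor's algorithm.
    """
    tokenized = {name: tokenize(text) for name, text in documents.items()}
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for tokens in tokenized.values():
        df.update(set(tokens))

    query_terms = set(tokenize(query))
    scores = {}
    for name, tokens in tokenized.items():
        if not tokens:
            scores[name] = 0.0
            continue
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if tf[term]:
                # Smoothed inverse document frequency: rarer terms weigh more.
                idf = math.log((1 + n_docs) / (1 + df[term])) + 1.0
                score += (tf[term] / len(tokens)) * idf
        scores[name] = score

    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    # Drop candidates that match none of the query terms.
    return [(name, score) for name, score in ranked if score > 0][:top_n]


# Hypothetical candidate companies and descriptions.
candidates = {
    "Acme Biotech": "develops vaccine delivery technology for poultry vaccine programs",
    "Globex Software": "builds accounting software for small retailers",
    "Initech Agri": "sells poultry feed and farm equipment",
}

shortlist = rank_candidates("poultry vaccine", candidates)
```

In this toy run the company whose description mentions both query terms ranks first, the one that mentions only "poultry" ranks second, and the unrelated software firm drops out. A language-aware engine would go well beyond this kind of term matching, but the workflow is the same: score every candidate, sort, and have people review the top of the list.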
"With these tools I'm confident we can take a triage process that took about four months to identify a licensee down to as little as one month," Houghteling said.
Big Sheets and the IBM Content Analyzer are now being deployed as part of a larger collection of tools and repositories that will enable the Office of Technology Transfer to quickly explore the data sources it has always used. The key difference is that they'll cut to the licensing stage in a matter of weeks instead of months.
Ironically, a well-known analytics language got its start at NCSU before it was commercially developed by SAS. There's even a SAS Hall on campus, but apparently the home-town tech vendor doesn't have a monopoly on NCSU's choice of analytics.