Q&A: IBM's Aaron Brown on Text Analytics for Legal Compliance

The burden of legal e-discovery mandates is driving demand for "knowledge discovery" technologies. The program director of IBM's Content Discovery and Search unit discussed the use of text analytics in legal work and other emerging applications.

So we have conceptual and semantic search, information extraction, clustering for term reduction and document processing, link and association analysis: many varieties of text-analytics and data-mining mojo. What's IBM's near-term approach to meeting legal-sector needs?

We're putting a substantial focus on adapting the technology to the domain, working closely with key clients and partners who have very good visibility into, and deep expertise in, the ways the legal world works. We're also banking on the market shift that's already happening as more and more corporations, fed up with the cost of paying outside providers to handle e-discovery reactively, are starting to bring e-discovery in-house. As they do this, e-discovery becomes a proactive solution purchased and operated as a joint venture between IT and Legal.

Successful technology solutions focus on end-user requirements. We technologists should remember this principle, which applies to every business domain, including the legal sector. Your thoughts?

I’ve read your blog on this point… and you're absolutely right that domain-specific applications, workflows, and interfaces are essential to the infiltration of text analytics into the legal space. I've been spending a lot of my customer-facing time recently talking with legal officers at many of our large clients, and this is a message that comes through loud and clear. We've seen that they're less impressed with the latest whiz-bang visualizations and extraction technology, but when we demonstrate how we can embed that same analytics technology so it disappears into tools that accelerate or simplify their existing, deeply entrenched processes, they can't get enough of it.

What about other application domains? Beyond proven text-analytics successes in areas such as intelligence and life sciences, what are the most promising, emerging applications?

Following close behind compliance and discovery is a second cluster of applications around the topic of customer insight: applications that leverage text analytics to help companies reach a deeper understanding of their customers and the public at large to drive enhanced business value. This cluster starts with solutions that mine insights out of direct customer interactions with contact centers – the so-called "voice of the customer" – and extends to solutions that mine public discourse on the Web, and other internal and external sources, to provide insight into public perception of products and services.

In the first set of solutions, text analysis brings the unstructured aspect of the customer dialogue, such as notes, e-mail, chats, voice transcription, or even the public conversation on the Web, together with traditional structured data already being collected, synthesizing the full view of the customer. We use the resulting insights to improve agent performance and compliance, to discover which types of interactions lead to better results, for early warning of quality issues, to find new opportunities for cross-sell/upsell or for developing new products and services, and for competitive intelligence. In the second set of solutions, text analytics provides insight into the ongoing conversation taking place on the Web (and increasingly within the company), helping detect emerging trends and patterns in the tone of conversation, helping highlight product quality issues early on, and exposing opportunities to improve marketing and future product capabilities.

And you cited ECM earlier...

We're seeing a new cluster of text-analytics applications starting to emerge around more traditional content management use cases. These are very exciting as they are leading indicators that text analytics is expanding beyond its traditional reaches into a market that's rich with content but sparse on analysis and insight today.

Enterprises are starting to see text analytics as something they can apply as a horizontal capability to better understand the volumes of content that they've been storing and managing over the years. That might mean analyzing customer correspondence files to identify patterns linking customer satisfaction to buying behaviors, analyzing contracts or financial instruments to extract unusual elements and kick off business processes to mitigate them appropriately, mining insurance claim materials to identify patterns that indicate fraudulent behavior, or any other use case that revolves around the wealth of insight buried in archived content. It's going to take some time for these applications to mature, but over the long term I definitely see this as a substantial growth area for text analytics.