Brand-reputation management, market research, competitive intelligence, and customer-service needs have led to fast growth. Here's what's driving interest.
Software and service text-analytics revenues now total $835 million globally, according to a 2010 market study I completed recently.
Growth is steepest for applications that seek business insight in social networks, online media, and surveys. Applications include brand-reputation management, market research, competitive intelligence, and customer service and support. For these applications and others, text analytics brings automated, natural-language processing techniques to bear to identify and extract names, facts, relationships, sentiment, and other information in blogs, forums, news, social updates, e-mail, and a range of enterprise sources.
My $835 million market-size estimate covers software licenses, service subscriptions, and vendor-provided technical support and professional services. Despite strong growth, that figure remains a small fraction, roughly 8%, of Gartner's $10.5 billion 2010 valuation of the broader BI, analytics, and performance-management software market.
My estimate captures the value both of core content-analysis capabilities and of text analytics' contribution to four content-related application categories: information capture (most often via Web scraping), information management (descriptive metadata and "unstructured" text), text-fueled enterprise applications, and search-based applications.
The search-based applications category is worth an estimated $300 million of the $835 million text-analytics total value. It includes Web and enterprise search, e-discovery, and business, scientific, and legal information services. The last three are typically accessed via search interfaces and rely on knowledge bases populated by mining textual sources, such as judicial records, research papers, and online forums. Examples include the West Litigation Monitor from Thomson Reuters, Elsevier's SciVerse platform, and ConsumerBase from NetBase.
The enterprise-applications category comprises software and services for business functions such as customer relationship management (CRM), market research, enterprise feedback management/surveys, and competitive intelligence.
Enterprise information management (EIM) systems store both text and accompanying descriptive metadata (such as author, title, topic, publication date, and tags) that facilitates publishing the stored text as "smart content." EIM places a premium on multi-channel publishing, reuse, and targeting.
Information capture or acquisition starts with Web crawling and page or document retrieval: for example, locating and scraping prices from online commerce sites for competitive-intelligence purposes. It further includes text extraction from mark-up and binary formats such as HTML, PDF, and Word; data cleansing that removes ads, menus, spam, and other extraneous content; metadata extraction; and deduplication.
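To make the capture steps concrete, here is a minimal Python sketch of three of them: text extraction from HTML, a crude cleansing pass, and deduplication. It is an illustration under stated assumptions, not a description of any vendor's product. The names `SKIP_TAGS`, `TextExtractor`, `extract_text`, `fingerprint`, and `capture` are hypothetical, the list of tags treated as extraneous is a guess, and pages are assumed to be already retrieved as HTML strings. Only the Python standard library is used.

```python
import hashlib
import re
from html.parser import HTMLParser

# Tags whose contents are treated as extraneous (an assumption for this sketch;
# real cleansing of ads, menus, and spam needs far more than a tag list).
SKIP_TAGS = {"script", "style", "nav", "aside", "footer"}


class TextExtractor(HTMLParser):
    """Collect visible text, ignoring anything nested inside SKIP_TAGS."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # > 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        if self.skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def extract_text(html):
    """Text extraction plus cleansing: strip mark-up and skipped elements."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def fingerprint(text):
    """Hash of whitespace- and case-normalized text, used to spot duplicates."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def capture(pages):
    """Return cleaned text for each page, dropping empty and duplicate documents."""
    seen = set()
    documents = []
    for html in pages:
        text = extract_text(html)
        key = fingerprint(text)
        if text and key not in seen:
            seen.add(key)
            documents.append(text)
    return documents
```

Calling `capture` on a list of HTML strings returns one cleaned text per distinct page. Exact-hash deduplication only catches copies that match after normalization; near-duplicate detection, PDF and Word extraction, and metadata extraction are left out of this sketch.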