Embedded, automatic and easy new approaches meet growing demands for do-it-yourself data analysis.
Attivio: Universal and Unified
Enterprise search and BI have each been around for decades, largely operating in information silos, one restricted to documents and the other to data collected from operational and transactional systems. Attivio's aim, dating to the company's 2007 founding by refugees of FAST (a Microsoft subsidiary since 2008), has been to break down the database-document barrier by providing a search interface that relies on a single, unified index. Attivio delivers results in familiar BI dashboards and analysis widgets.
Attivio pulls data from a wide variety of sources: files, databases, e-mail, content-management systems, and enterprise applications, reached via APIs and connectors (supplied by the company and partners).
The Attivio Active Intelligence Engine (AIE) extracts content (text, metadata, structure information), manipulates it, enriches it, and links or joins it.
"Enrichment components such as sentiment and entity extraction and classification can be used to add intelligence to the integration process," says company co-founder and CTO Sid Probstein. "They require some setup work, mostly training on the customer's data."
Attivio performs "dynamic schema creation" based on discovered data values and types, and "we have a number of components that identify and report on integration opportunities after a small data set is processed," Probstein says. He lists the following:
"Detecting, by name or by content, that two columns, from same or different sources, appear to be the same.
Detecting that two tables, from same or different sources, appear to be joinable based on some common key values.
Detecting anomalous values within a single column or table.
Detecting type differences between columns that have similar names.
Detecting duplicate or near-duplicate records based on a variety of keys."
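To make two of these checks concrete, here is a minimal Python sketch of join-candidate detection and type-mismatch detection over a small sample of data. It is not Attivio's implementation: the function names (`infer_type`, `overlap`, `integration_report`), the table layout, the 0.5 overlap threshold, and the sample values are all assumptions made for illustration.

```python
from itertools import combinations


def infer_type(values):
    """Guess a column type ('int', 'float', or 'str') from its sampled values."""
    def parses(cast, value):
        try:
            cast(str(value))
            return True
        except ValueError:
            return False

    if all(parses(int, v) for v in values):
        return "int"
    if all(parses(float, v) for v in values):
        return "float"
    return "str"


def overlap(col_a, col_b):
    """Share of the smaller column's distinct values that also occur in the other."""
    a, b = set(col_a), set(col_b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def integration_report(tables, min_overlap=0.5):
    """Compare columns across tables; tables maps name -> {column: [values]}."""
    columns = [(t, c, vals) for t, cols in tables.items() for c, vals in cols.items()]
    report = []
    for (t1, c1, v1), (t2, c2, v2) in combinations(columns, 2):
        if t1 == t2:
            continue  # only compare columns from different tables
        if c1.lower() == c2.lower() and infer_type(v1) != infer_type(v2):
            report.append(("type_mismatch", f"{t1}.{c1}", f"{t2}.{c2}"))
        elif overlap(v1, v2) >= min_overlap:
            report.append(("join_candidate", f"{t1}.{c1}", f"{t2}.{c2}"))
    return report


# Hypothetical sample: two sources that share customer keys under different names.
tables = {
    "crm": {"customer_id": ["101", "102", "103"], "zip": ["02139", "10001", "94105"]},
    "orders": {"cust_ref": ["101", "103", "103"], "zip": ["MA", "NY", "CA"]},
}
report = integration_report(tables)
for finding in report:
    print(finding)
```

On this sample the sketch reports `crm.customer_id` and `orders.cust_ref` as a join candidate, because their values overlap, and flags the two `zip` columns as a type mismatch, because one parses as integers and the other does not. A production system would also need fuzzy name matching and near-duplicate detection, which are left out here.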
Attivio AIE's dynamic schemas support ad-hoc integration of diverse data, but AIE is by no means the only credible search-BI technology on the market. Endeca's Information Access Platform (IAP) uses similar techniques to provide similar capabilities, targeting online and mobile commerce and publishing in addition to search-BI. Other, specialized platforms adapt these integration techniques to focused business problems and information domains.
FirstRain Senses Time
FirstRain is a business-information search and monitoring tool that mines and integrates information from the open Web -- news, blogs, and industry, government, scientific, and academic sources -- in addition to a set of key corporate-information databases. The aim, per the company's Web site, is to "derive relationships, spot changes in management or business structure, and track trends across industries."
"The application of semantic analysis that is 'business structure aware' is crucial to be able to identify and deliver relevant business information that is scattered throughout [disparate] sources," says the company's technology vice president, Marty Betz. Also crucial is the ability to synthesize time sequence from pages found on the open Web.
(Time sequence is important! Indeed, the number-three result returned by a Google search for "us senator pennsylvania" was now-former Senator Arlen Specter's now-disappeared Senate Web page.)
"By analyzing the flow of content through our pipeline, the system can dynamically model and adjust its understanding of the market ecosystems around companies and industries," Betz says.
Betz describes the use of trending and anomaly detection, applied to unstructured narrative content from a variety of sources, to enable a different class of questions to be systematically asked, analyzed and answered: questions whose answers require "connecting the dots."
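As a rough illustration of anomaly detection on narrative content, the Python sketch below flags days on which mentions of a company spike relative to the preceding week. It is not FirstRain's method: it assumes the text has already been reduced to daily mention counts, and the function name `spikes`, the seven-day window, the threshold of three standard deviations, and the sample counts are all hypothetical.

```python
from statistics import mean, stdev


def spikes(daily_counts, window=7, threshold=3.0):
    """Return indices of days whose count exceeds the trailing-window mean
    by more than `threshold` standard deviations."""
    flagged = []
    for i in range(window, len(daily_counts)):
        history = daily_counts[i - window:i]
        mu = mean(history)
        # Assumption: fall back to 1.0 when the history is perfectly flat,
        # so that the division below is always defined.
        sigma = stdev(history) or 1.0
        if (daily_counts[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Hypothetical daily mention counts for one company; day 8 carries a burst of coverage.
counts = [4, 5, 3, 4, 6, 5, 4, 5, 21, 5]
print(spikes(counts))
```

A trailing window is used so that each day is judged only against what came before it, which matches the article's point about time sequence: the same count can be routine in one week and anomalous in another.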
So in FirstRain we have broad-but-selective content acquisition and integration, with the application of goal-relevant organizing principles, to respond to a high-value business need: timely access to corporate developments.