Seeing the Connection

Thanks to text mining, visualization software can now depict relationships between concepts found in documents and email.

Relationship-network visualization comes into its own when linked closely to emerging "unstructured information" analytics. For example, Spoke infers social networks, depicted in the SpokeMap, by mining email: it looks at the people with whom you interact and at email frequency and patterns, then links this information with Web-search results. Although I described deficiencies in Spoke's approach in an earlier column on social networking, that approach is compelling when applied in larger-scale enterprise settings. The visualization options appear to boost the value of Spoke-managed social networks significantly.
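To make the idea of inferring a network from email frequency concrete, here is a minimal Python sketch. It is not Spoke's method, which the vendor does not detail here. It assumes messages are available as simple (sender, recipients) pairs, and the function names build_email_network and strongest_ties are hypothetical, invented for illustration. Each exchanged message adds weight to the tie between two people; the weighted ties are what a relationship map would then draw.

```python
from collections import Counter

def build_email_network(messages):
    """Count messages exchanged between each pair of people.

    messages: iterable of (sender, [recipients]) tuples (an assumed format).
    Returns a Counter keyed by an alphabetically sorted (person, person) pair.
    """
    weights = Counter()
    for sender, recipients in messages:
        for recipient in recipients:
            if recipient != sender:
                # Treat the tie as undirected: ann->bob and bob->ann share a key.
                weights[tuple(sorted((sender, recipient)))] += 1
    return weights

def strongest_ties(weights, person, top=3):
    """Return up to `top` contacts of `person`, heaviest ties first."""
    ties = []
    for (a, b), count in weights.items():
        if person == a:
            ties.append((b, count))
        elif person == b:
            ties.append((a, count))
    # Sort by descending weight, then by name for a stable order.
    ties.sort(key=lambda tie: (-tie[1], tie[0]))
    return ties[:top]

# Made-up sample data, for illustration only.
messages = [
    ("ann", ["bob", "cho"]),
    ("bob", ["ann"]),
    ("cho", ["dev"]),
    ("ann", ["bob"]),
]
network = build_email_network(messages)
print(strongest_ties(network, "ann"))
```

A real system would also have to resolve multiple addresses to one person and weigh recency and reply patterns, none of which this sketch attempts.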

Inxight's Rao expressed the goal as "trying to create a conceptual model of [an information] space and then project it into a perceptual model," a diagram. For text-mining vendors such as Inxight, the information space is a set of documents. Its software identifies, extracts, and taxonomically organizes concepts to form categories. It tags identified concepts within the documents and then classifies the documents according to the categories using statistical algorithms. The result is conceptually based, linked clusters of the sort depicted in the Lombardi diagrams, ones that trade interactivity for artistic merit. You can try one variation of these approaches yourself by installing the trial version of the Grokker Web-search visualization tools.
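The tag-then-link pipeline described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Inxight's software: the CONCEPTS vocabulary is hand-built and hypothetical, plain keyword matching stands in for the statistical algorithms, and the function names tag_concepts and concept_links are invented here. The output is a set of links between concepts, each backed by the documents in which both concepts appear, which is the raw material a relationship diagram would display.

```python
import re
from collections import defaultdict
from itertools import combinations

# Hypothetical, hand-built concept vocabulary (category -> trigger terms).
CONCEPTS = {
    "finance": {"bank", "loan", "fund"},
    "politics": {"campaign", "donor", "election"},
    "energy": {"oil", "pipeline"},
}

def tag_concepts(text):
    """Return the set of concepts whose trigger terms appear in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return {concept for concept, terms in CONCEPTS.items() if words & terms}

def concept_links(documents):
    """Link every pair of concepts that co-occur in a document.

    documents: dict mapping a document id to its text.
    Returns a dict mapping a sorted (concept, concept) pair to the set of
    document ids that support the link.
    """
    links = defaultdict(set)
    for doc_id, text in documents.items():
        for a, b in combinations(sorted(tag_concepts(text)), 2):
            links[(a, b)].add(doc_id)
    return dict(links)

# Made-up sample documents, for illustration only.
documents = {
    "d1": "The bank issued a loan to the campaign.",
    "d2": "An oil pipeline fund was announced.",
    "d3": "The election results arrived late.",
}
links = concept_links(documents)
for pair, doc_ids in sorted(links.items()):
    print(pair, sorted(doc_ids))
```

Documents that touch only one concept, such as d3 in the sample, get tagged but contribute no link.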

Although relationship-network visualization can help you understand, analyze, and exploit connectedness, I do see a shortcoming in its inability to drive analytics. The path from data to graph is one way, and you can't generate or alter models implemented in equations and data structures by manipulating a visual display. These are hard problems that will ultimately be solved. In the meantime, relationship-network visualization is poised to become as invaluable for work with relationship-centered information as conventional charts are for numeric data.

Seth Grimes heads Alta Plana Corp., a Washington, D.C.-based consultancy specializing in business analytics and demographic and economic statistics.


Resources

Advizor Solutions: www.advizorsolutions.com

Grokker: www.grokker.com

Inxight Software: www.inxight.com

Spoke Software: www.spoke.com

Tom Sawyer Software: www.tomsawyer.com

TouchGraph: www.touchgraph.com

Mark Lombardi "global networks": pierogi2000.com/flatfile/lombardi.html

"Spheres of Influence: The Bush Campaign Pioneers": washingtonpost.com/wp-srv/politics/pioneers/pioneers_spheres.html?nav=tablet

Additional Columns at IntelligentEnterprise.com:

"The Word on Text Mining" Dec. 10, 2003
"Matchmaker, Matchmaker" (social networks) April 17, 2004