Seeing the Connection - InformationWeek

Seeing the Connection

Thanks to text mining, visualization software can now depict relationships between concepts found in documents and email.

Numbers communicate facts but lack meaning on their own. It takes textual, tabular, or graphic presentation to lend them context and the ability to tell a story. The same applies to non-numeric information — particularly representations of concept-based and interpersonal relationships. With text mining's emergence, we can now systematically extract concepts from email and textual documents and exploit them to classify, analyze, and automatically process source materials. New classes of relationship-network visualization software provide the same graphic accessibility to this "unstructured" information as charts and dashboards provide for the numeric data that has, until recently, been the near-exclusive basis for enterprise decision-making.

By the Numbers

For numeric data, we gravitate toward graphics rather than data tables or narrative text because graphics are efficient. Graphics depict facts in abstract but directly accessible forms that emphasize relationships — how values have changed, what contributing factors are most significant — relocating contextual clutter to a backdrop. It's those relationships — not the context-lacking absolutes — that matter most in corporate decision-making.

Monitoring business processes, measuring performance, and constructing data warehouses, forecasts, and predictive models are worthless unless they impart actionable information. However, most corporate information is unstructured and not easily analyzed, much less rendered in the basic line, bar, pie, and scatter charts that show relative numeric values.

Visualizing Relationships

Organization charts are the ancestors of the relationship-network graphics that I'm describing. Org charts can be useful, but they're not good at depicting data that's dynamic or high-volume or that doesn't fit a rigid hierarchy. Computer- and communications-network diagrams add the missing elements but were designed to map physical (rather than conceptual) networks. They do effectively use symbols and lines that are varied in type, orientation, size, and coloring to imply how the nodes and connections in the mapped networks may be classified. And they effectively use layout to depict dispersal or distribution of nodes in a space, even if what is mapped is the physical rather than the conceptual arrangement we seek to depict in relationship graphics.

Work by now-deceased artist Mark Lombardi was the first attempt I know of to graphically depict extensive, hierarchical, multidimensional relationship networks. His New York Times obituary explained:

Sometimes measuring as much as 10 feet across, these drawings nonetheless had tremendous visual verve, delicately tracing the convoluted unfoldings of contemporary morality tales like the savings and loan scandal, Whitewater, Iran-Contra and the Vatican bank scandal.

The small circles in his drawings identified the main players — individuals, corporations, and governments — along a time line. The arcing lines showed personal and professional links, conflicts of interest, malfeasance, and fraud.

Solid lines traced influence, dotted lines traced assets, and wavy lines traced frozen assets. Final denouements like court judgment, bankruptcy, and death were noted in red. (Roberta Smith, March 25, 2000)

New classes of network-visualization tools arose out of academic and industrial research in the 1990s. They automate the graphic rendering Lombardi did painstakingly by hand. I've seen these tools used mostly in connection with law enforcement and counterterrorism text mining, but their use is spreading. For example, perhaps in the spirit of Lombardi's work, the Washington Post ran a couple of interesting relationship charts earlier this year, one depicting the business and government connections of controversial former Department of Defense official Richard Perle and the other illustrating President Bush's fundraising network. The print versions are of course static; the online version (see "Spheres of Influence" in Resources) adds interactivity that lets the user explore the diagram by focusing and zooming in on sections and nodes and viewing associated annotations.

The Post's relationship graphs, even though they support interactive exploration, remain static in the sense that there's no active database connection and no ability to alter the layout, arrangement, focus, orientation, or other aspects of the diagrams. But then the Post's aim is to disseminate information, not to provide extensive online analytics. Commercial tools from companies such as Advizor Solutions, Inxight, and Tom Sawyer Software offer these capabilities. (Advizor's software is licensed by BI providers including Business Objects and Information Builders.)

Figure 1 displays a form of data constellation generated by Advizor's software, and Figure 2 is a social-networking implementation of the TouchGraph open-source visualization system. Visit the Web sites of the software providers listed in Resources for dozens of intriguing relationship-network representations.

FIGURE 1 Data constellation from Advizor Solutions

FIGURE 2 The Spoke system provides interactive social-network visualization via SpokeMap, based on the TouchGraph open-source system.

Extraction and Display

Relationship-network visualization is of course not an end in itself. Ramana Rao, CTO and cofounder of Inxight, which provides tools for both visualization and text mining, explained in an interview that "there are two halves to these problems: an extraction half and a display half.... The diagrams can be only as good as the data." The aim is to enable knowledge discovery in the data, and, in these cases, the visualization tools provide the best means of understanding how the inherent relationship networks are organized — how nodes are distributed and clustered, what paths exist between nodes, and their costs (indicating the proximity of the internode relationships) — and ultimately how they may be optimized and exploited.
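To make the "two halves" concrete, here is a minimal Python sketch, not a description of how Inxight, Advizor, or any other vendor's product works. It assumes a text-mining step has already produced a set of concepts per document (the concept names below are invented for illustration), counts how often concepts co-occur, and treats 1/count as the cost of a link, so that concepts appearing together more often are "closer." That cost rule, the function names, and the sample data are all assumptions. The sketch then answers the questions raised above: what is the cheapest path between two nodes, and how do nodes cluster?

```python
import heapq
from collections import defaultdict
from itertools import combinations

# Hypothetical output of the extraction half: concepts found in each document.
DOCS = [
    {"Acme", "Bolt Bank", "J. Doe"},
    {"Acme", "Bolt Bank"},
    {"Bolt Bank", "Crest Fund"},
    {"Delta LLC", "Echo Ltd"},
]

def build_network(docs):
    """Count concept co-occurrences; assumed link cost is 1 / count."""
    counts = defaultdict(int)
    for concepts in docs:
        for a, b in combinations(sorted(concepts), 2):
            counts[(a, b)] += 1
    graph = defaultdict(dict)
    for (a, b), n in counts.items():
        graph[a][b] = graph[b][a] = 1.0 / n
    return dict(graph)

def cheapest_path(graph, start, goal):
    """Dijkstra's algorithm; returns (cost, path), or (inf, []) if unreachable."""
    queue = [(0.0, start, [start])]
    seen = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for nxt, weight in graph.get(node, {}).items():
            if nxt not in seen:
                heapq.heappush(queue, (cost + weight, nxt, path + [nxt]))
    return float("inf"), []

def clusters(graph):
    """Group nodes into connected components (the simplest notion of a cluster)."""
    seen, result = set(), []
    for node in sorted(graph):
        if node in seen:
            continue
        stack, component = [node], set()
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(graph[current])
        seen |= component
        result.append(component)
    return result

network = build_network(DOCS)
print(cheapest_path(network, "Acme", "Crest Fund"))
print(clusters(network))
```

A display tool would then lay out these nodes and links graphically; the point of the sketch is only that the picture can be no better than the extracted network behind it.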
