There are two features in this visualization that you won't get with off-the-shelf technology: explanatory textual annotations and editorial judgment in deciding which words represent major themes. (N.B., the Times's editorial judgment also extended to excluding candidates it didn't see as major, such as Dennis Kucinich and Ron Paul.)
I found the Times's second visualization, Naming Names, to be an interesting stunt that helps those who are so inclined to understand a bit about campaign dynamics. It depicts a sort of social network: which candidates commented on which other candidates. There are annotations here too, in the form of representative quotations, and additional data is cleverly folded in: the thickness of a segment corresponds to the total number of words spoken by each candidate in each debate.
Lastly, the on-line versions of the graphics use the now familiar red and blue colors for Republicans and Democrats, respectively, which lends the visualizations immediate familiarity.
Contrast a set of tag clouds for words used in an April 26 debate of Democratic candidates. The arrangement of the clouds makes it difficult to compare them, but worse, the size of each word in each cloud is relative to other words spoken by the particular candidate. That is, word sizing is not consistent across the set of clouds; a given size will represent a different number of utterances in each cloud.
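To make the sizing problem concrete, here is a minimal sketch in Python. It is not how those tag clouds were actually generated; the word counts, candidate names, point sizes, and the `font_sizes` helper are all made up for illustration. It compares scaling each cloud against its own most frequent word with scaling every cloud against one shared maximum.

```python
def font_sizes(counts, max_count, min_pt=10, max_pt=40):
    """Map each word count to a font size, linear in count / max_count."""
    return {
        word: round(min_pt + (max_pt - min_pt) * n / max_count)
        for word, n in counts.items()
    }


# Hypothetical word counts for two candidates (invented numbers).
clouds = {
    "candidate_a": {"iraq": 12, "health": 6},
    "candidate_b": {"iraq": 4, "health": 2},
}

# Per-cloud scaling: each cloud is sized against its own top word,
# so a given font size means a different count in each cloud.
per_cloud = {
    name: font_sizes(counts, max(counts.values()))
    for name, counts in clouds.items()
}

# Shared scaling: every cloud is sized against the same global maximum,
# so a given font size means the same count everywhere.
global_max = max(n for counts in clouds.values() for n in counts.values())
shared = {
    name: font_sizes(counts, global_max)
    for name, counts in clouds.items()
}

print(per_cloud)
print(shared)
```

With per-cloud scaling the two invented candidates get identical clouds even though one used each word three times as often as the other. With a shared scale the difference shows up in the type size, which is what a reader comparing clouds side by side would expect.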
These tag clouds were reportedly created by the head of a media-analysis firm, yet even beyond lack of comparability, little editorial judgment was applied. How else to explain the inclusion of words such as "away," "given," and "brian"? (Was the moderator perhaps named Brian?) And then there's a general defect with tag clouds, that the spatial arrangement of words within a cloud means nothing. Wouldn't it be great if folks who generate tag clouds would find a way to position words according to, say, semantic similarity? "Iraq," "war," and "fighting" would be clustered, as would "health," "mental," and "insurance," creating structure from what is otherwise just a bag of words.
So compliments to the New York Times for effective and judicious use of visualizations to communicate insights from what in less skillful hands is just a jumble of data.
Seth Grimes is an analytics strategist with Washington, DC-based Alta Plana Corporation. He consults on data management and analysis systems.
I do admire a nice visualization, one whose composition suits the nature of the underlying data, one designed to communicate rather than as a means of showing off technology. Given these criteria, the New York Times delivered twice last Sunday with a pair of visualizations that nicely distill presidential-campaign themes and dynamics from what was otherwise a mighty big pile of words: debate transcripts. The Times's visualizations are useful in another way. They exemplify good design, especially when contrasted with other technology-first visualizations on similar topics.