There's growing demand to analyze Facebook, Twitter and other social media, but most tools fall short. Here are six capabilities to look for in next-generation products.
On to my categories: Metadata, Resolution, Integration, Alignment, Interface, and Walk the Talk.
Metadata: Even the most basic SMA tools can pull in Twitter, blog, news, Facebook, and other feeds, if not directly, then via services such as Spinn3r, Gnip, Factiva, and Moreover. Raw feeds are fine, but I'm interested in what's in and surrounds the feeds, namely metadata. Metadata records who posted, where (on which site or channel), and when; it may also capture geographic location.
Strong SMA will look further at the message "envelope" to discern interrelationships. Is a given message a retweet, a reply, a comment, or some other form of link to an external resource or message? Is it part of an exchange? Let's not look at messages in isolation, as so many tools do. SMA tool makers: Help us understand message diffusion and discourse (threaded conversations) with an analytic that incorporates demographics.
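To make the "envelope" idea concrete, here is a minimal Python sketch of how a tool might label a message's relationships and trace it back to the start of an exchange. The field names (id, in_reply_to, retweet_of, urls) are hypothetical and do not correspond to any particular feed or API; a real tool would map whatever metadata its sources provide onto something similar.

```python
def classify_envelope(msg):
    """Label how a message relates to other messages or resources.

    `msg` is a dict with hypothetical keys: 'retweet_of' and
    'in_reply_to' (IDs of other messages) and 'urls' (external links).
    """
    labels = set()
    if msg.get("retweet_of"):
        labels.add("retweet")
    if msg.get("in_reply_to"):
        labels.add("reply")
    if msg.get("urls"):
        labels.add("link")
    return labels or {"standalone"}


def thread_root(msg_id, parents):
    """Follow parent pointers up to the first message of an exchange.

    `parents` maps a message ID to its parent ID (None for a root).
    Stops if the parent is missing from the data or a cycle appears.
    """
    seen = set()
    while parents.get(msg_id) is not None and msg_id not in seen:
        seen.add(msg_id)
        msg_id = parents[msg_id]
    return msg_id


if __name__ == "__main__":
    # Illustrative data only.
    messages = [
        {"id": 1, "urls": ["http://example.com/post"]},
        {"id": 2, "in_reply_to": 1},
        {"id": 3, "retweet_of": 2},
    ]
    parents = {m["id"]: m.get("in_reply_to") or m.get("retweet_of")
               for m in messages}
    for m in messages:
        print(m["id"], sorted(classify_envelope(m)), thread_root(m["id"], parents))
```

Grouping messages by their root gives threaded conversations; counting the messages under each root gives a rough measure of diffusion.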
Resolution is the ability to extract data from the content of social postings and other source materials, both non-explicit metadata and information locked in the content. By "non-explicit metadata," I mean primarily identity information. Take my own Twitter account as an example. My profile includes my real name and my account location (which is different from a tweet's location), and I also include, as many Twitter users do, a link to a page with lots more information about me. A strong SMA tool will tap this information.
Content analysis is the real challenge, getting at the entities (names of people, companies, places, products, etc.), facts, opinions, and signals (for example, "I'm shopping for a new car"; "Can anyone recommend a restaurant in Duluth?"). For this, you need sophisticated natural language processing (NLP) and sentiment analysis with the ability to resolve parts of speech and, especially for source materials longer than tweets, to spot co-references including anaphora -- instances where, for example, "Barack Obama," "the president," and "he" refer to the same person.
In my opinion, SMA done right can resolve sentiment at the feature level, related to entities and topics, and can distinguish opinion holder from object. A tweet that says:
@consumerist Gilbert Gottfried Loses Aflac Duck Gig Because He Thinks The Japan Tsunami Is Hilarious
illustrates the latter need: @consumerist ≠ Gilbert Gottfried. I found that tweet, by the way, via a search on "japan tsunami" using the TweetFeel Twitter sentiment tool. TweetFeel correctly saw the "japan tsunami" sentiment expressed as positive; it's sentiment about Gilbert Gottfried that is negative, but it's not "Gilbert Gottfried" that I searched on.
Resolution needs to extend to complex, compound messages. The tweet
@OutsellInc RT @jillfgibson: Love this take of #visualization through the ages RT @infobeautiful Vintage InfoPorn No.1 http://bit.ly/ibl9Uy
involves three tweeters and three submessages (let's call them), which strong SMA would recognize and decompose. Again, message-level analysis ignores information.
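As a rough illustration of that decomposition, the Python sketch below splits an old-style "RT @user" chain into (author, text) pairs. It assumes, as in the example above, that the leading @handle is the account that posted the message and that each "RT @user" marker introduces text written by that user. Both are assumptions about this display format, not rules that hold for every tweet.

```python
import re

# "RT @user" or "RT @user:" marks the start of a quoted submessage.
RT_MARKER = re.compile(r"\bRT\s+@(\w+):?\s*")
# Assumption: a leading @handle names the account that posted the message.
LEADING_HANDLE = re.compile(r"^@(\w+)\s*")


def decompose(tweet):
    """Split a retweet chain into (author, text) pairs, outermost first."""
    text = tweet.strip()
    author = None
    lead = LEADING_HANDLE.match(text)
    if lead:
        author = lead.group(1)
        text = text[lead.end():]

    parts = []
    pos = 0
    for marker in RT_MARKER.finditer(text):
        # Everything before this marker belongs to the current author.
        parts.append((author, text[pos:marker.start()].strip()))
        author = marker.group(1)
        pos = marker.end()
    parts.append((author, text[pos:].strip()))
    return parts


if __name__ == "__main__":
    tweet = ("@OutsellInc RT @jillfgibson: Love this take of #visualization "
             "through the ages RT @infobeautiful Vintage InfoPorn No.1 "
             "http://bit.ly/ibl9Uy")
    for who, said in decompose(tweet):
        print(who, "|", said)
```

For the example tweet this yields three pairs: an empty submessage for @OutsellInc (a retweet with no added comment), the "Love this take..." comment for @jillfgibson, and the original post with its link for @infobeautiful. Sentiment can then be attributed to the person who expressed it rather than to the message as a whole.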
The Integration imperative is captured in two snippets from my Down-with-Silos tirade, above. I look for social analysis that "bridges into the enterprise systems that mediate and record value-generating ($$$) business transactions" and that can tell me "what hearers did in response" to social messages.
To integrate, or link records across sources, you need to capture or discern identity. I think the information is more available than most people would suppose, though discerning it can take significant digital sleuthing. Of course, sleuthing, or even use of openly available or permitted identity information -- few users read terms of service nowadays -- can create a creepiness factor at best and, at worst, a violation of privacy rules, with implications for your reputation.
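A minimal sketch of what such linking might look like, in Python: match social profiles to customer records on a normalized name and location, and keep only unambiguous matches. The record layouts (handle, real_name, location, customer_id, name, city) are hypothetical, and exact matching on two fields is a deliberately naive stand-in for real identity resolution, which would need fuzzier matching and, per the caution above, attention to what users have actually permitted.

```python
def normalize(value):
    """Lowercase and collapse whitespace so trivially different strings match."""
    return " ".join(value.lower().split())


def link_records(profiles, customers):
    """Map social handles to customer IDs where name and location agree.

    Only links a profile when exactly one customer record matches,
    to avoid attaching social activity to the wrong person.
    """
    index = {}
    for c in customers:
        key = (normalize(c["name"]), normalize(c["city"]))
        index.setdefault(key, []).append(c["customer_id"])

    links = {}
    for p in profiles:
        key = (normalize(p["real_name"]), normalize(p["location"]))
        matches = index.get(key, [])
        if len(matches) == 1:
            links[p["handle"]] = matches[0]
    return links


if __name__ == "__main__":
    # Fictional data for illustration only.
    profiles = [
        {"handle": "@jdoe", "real_name": "Jane  Doe", "location": "Duluth"},
        {"handle": "@anon", "real_name": "A. Nonymous", "location": "Nowhere"},
    ]
    customers = [
        {"customer_id": "C-001", "name": "jane doe", "city": "duluth"},
        {"customer_id": "C-002", "name": "John Roe", "city": "Duluth"},
    ]
    print(link_records(profiles, customers))
```

Once a handle is tied to a customer ID, social messages can be joined to transaction records, which is what it takes to ask what hearers did in response.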