Big Data is Watching You! - InformationWeek

Commentary
8/11/2010
03:55 PM
Curt Monash

Big Data is Watching You!

The subjects of large-scale analytics may be categorized as: people, financial trades, electronic networks, and everything else... Some of the most interesting -- and potentially privacy-invading -- use cases are concentrated in the areas of identifying individuals, groups of people, or behaviors of (groups of) people. For example

There's a boom in large-scale analytics. The subjects of this analysis may be categorized as:

  • People
  • Financial trades
  • Electronic networks
  • Everything else
The most varied, interesting, and valuable of those four categories is the first one. That may change some day, with the growing importance of machine-generated data, and of big-data science in particular. But I think it's a fair assessment at the present, and for at least the next few years.

Some of the most interesting use cases are concentrated in the areas of identifying individuals, groups of people, or behaviors of (groups of) people. For example...

  • comScore works hard to identify individual web surfers -- i.e. to deanonymize them -- even though they may have given incomplete or false personal information.
  • Other companies at least try to figure out which information in a user's profile is unreliable, so as to classify them better. (Yes, there are 62-year-old video-game-obsessed Lady Gaga fans, but that's generally not the way to bet.)
  • Multiple telecom vendors try to identify who their most influential customers are (to a first approximation, they're the ones most often called by the most people, but it surely gets more sophisticated than that). This information is then used to reduce churn, either by working hard to retain those users, or -- if they do churn -- to move very fast to retain the business from their friends.
  • Other kinds of companies do similar kinds of analysis, to the extent that they have enough of a social graph to do so. (This application is a case where the term "social graph" is not a misnomer.)
  • Turing detectives (I just coined that phrase) try to determine whether users are humans or bots.
  • Central to detecting insurance fraud is identifying suspiciously close connections between claimants, service providers, and so on.
  • Identifying groups of people is also important in flagging insider trading. Even more important are other kinds of analysis, along the lines of "is this normal innocent trading behavior?"
  • Intelligence agencies try to detect networks of terrorists and their sympathizers. They further try to identify unusual patterns of communication or meetings along those networks that might indicate terrorist acts are being planned. (Civilian law enforcement agencies can use similar techniques.)
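The telecom example's "first approximation" of influence can be sketched in a few lines. This is only an illustration, not any vendor's actual method: it assumes a call log reduced to (caller, callee) pairs, and it scores each customer by the number of distinct people who called them, ignoring call frequency. The function name `rank_by_distinct_callers` and the sample data are hypothetical.

```python
from collections import defaultdict


def rank_by_distinct_callers(call_records):
    """Rank customers by how many distinct people called them.

    call_records is an iterable of (caller, callee) pairs (an assumed,
    simplified format). Repeat calls from the same caller count once.
    """
    callers_of = defaultdict(set)
    for caller, callee in call_records:
        if caller != callee:  # ignore self-calls
            callers_of[callee].add(caller)
    # Sort by distinct-caller count (descending), then by name for stable output.
    return sorted(
        ((callee, len(callers)) for callee, callers in callers_of.items()),
        key=lambda item: (-item[1], item[0]),
    )


# Hypothetical call log, invented for illustration.
calls = [
    ("ann", "bob"), ("cat", "bob"), ("dan", "bob"),
    ("ann", "cat"), ("bob", "cat"),
    ("ann", "bob"),  # repeat call from the same caller: not counted twice
]
ranking = rank_by_distinct_callers(calls)
print(ranking)  # [('bob', 3), ('cat', 2)]
```

A production model would surely weigh more than this (call frequency, recency, who calls whom back), which is the "more sophisticated than that" part.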
In most cases, the analysis and/or run-time execution of the relevant models is done with the help of analytic DBMSs. Other technologies that come into play include non-DBMS MapReduce (Hadoop), graph engines, and CEP (Complex Event Processing). The vendor most heavily represented on that list is probably Aster Data, because:
  • Aster Data is focused on hard-core analytics.
  • I talk a lot with Aster Data, and in particular had a long, detailed use-cases discussion with them last week.
  • The comScore example happens to come from a speaker at an Aster event I also participated in.
And by the way, all this only scratches the surface of what will be possible down the road. It's based mainly on where you live, what you purchase, how you behave on websites, and who you communicate with. Other kinds of data, which could be used to be yet more intrusive, generally aren't involved.
I actually have two points in drawing up this list. One is golly-gee-whiz about how a lot of analytically sophisticated applications are actually getting into production. The other is to highlight the privacy and liberty threats If This Goes On Unchecked (which is why I didn't include some other less-people-focused examples). There's also a related danger that, to the extent we don't get some smart regulations to keep us safe(r), we'll get a bunch of stupid regulations instead.

The Analytic Era has only just begun.
