What is Prism? If you're the vendor that sold it to the National Security Agency, Prism is a proprietary black box that applies state-of-the-art predictive analytics to big data to infer relationships between known terrorists and their social networks. That's marketing jargon, so let's break it down.
Note that the only thing proprietary in that last paragraph is the vendor's hokey sales pitch. Everything mentioned there can be built with open-source tools, specifically a scalable graph database such as Neo4j and some natural language processing (NLP) libraries from Stanford University. So if you're in government IT or purchasing, don't buy the vendor BS.
First, the graph ...
In theory, every person in the world can be a node on a graph. And every communication between two people is just a relationship between those two unique nodes. So if you were able to compel Verizon and every carrier in the world to give you their complete call records, you could create the world's largest game of Six Degrees of Kevin Bacon.
Supplement those phone records, which supply the basic connections between people, with emails, instant messages, known aliases and financial transactions, and your ability to infer relationships dramatically improves.
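To make the graph idea concrete, here is a minimal sketch in plain Python rather than in a graph database such as Neo4j. The names, the record layout and the helper functions `build_graph` and `degrees_of_separation` are all invented for illustration; nothing here reflects how Prism is actually built. Each record links two people, whatever the channel, and a breadth-first search counts the hops between them.

```python
from collections import defaultdict, deque

# Hypothetical records: (person_a, person_b, channel). All names are made up.
RECORDS = [
    ("alice", "bob", "phone"),
    ("bob", "carol", "email"),
    ("carol", "dave", "phone"),
    ("alice", "bob", "im"),        # a second channel between the same pair
    ("erin", "frank", "payment"),  # a separate, unconnected cluster
]

def build_graph(records):
    """Map each person to the set of people they have communicated with."""
    graph = defaultdict(set)
    for a, b, _channel in records:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def degrees_of_separation(graph, start, target):
    """Breadth-first search: return the hop count, or None if unconnected."""
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, hops = queue.popleft()
        for contact in graph.get(person, ()):
            if contact == target:
                return hops + 1
            if contact not in seen:
                seen.add(contact)
                queue.append((contact, hops + 1))
    return None

graph = build_graph(RECORDS)
print(degrees_of_separation(graph, "alice", "dave"))
print(degrees_of_separation(graph, "alice", "erin"))
```

In this toy data, alice reaches dave through bob and carol, while erin sits in a cluster with no link to alice at all. A real deployment would presumably keep the channel and timestamp on each edge instead of discarding them.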
That, by the way, is the same kind of inference engine that companies such as Amazon use to figure out which products to suggest you buy. It's a more sophisticated way of asking if you want fries with that. Only in this case, instead of advancing commercialism, law enforcement gets to quickly determine the social networks of known terrorists.
This isn't some dystopian Minority Report-like future. This is good old-fashioned policing supplemented by technology. Instead of manually sifting through phone records and drawing lines on a whiteboard between grainy pictures of suspects (a la every serial killer movie you've ever seen), the NSA is using a graphing engine.
And for the best reason possible: to speed up the narrowing of the search.
Next, the NLP ...
So now you know who's communicating with whom. How can you make sense of the content: the billions of hours of real-time voice and email exchanges between people? You certainly don't want to hire tens of millions of analysts to listen, translate and raise their hands whenever someone who's two degrees away from some blind sheikh uses the word jihad.
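The hand-raising job described above can be sketched as a filter: keep only messages from people within two hops of a known suspect, then check the text against a watch list. The Python below is a rough illustration under loud assumptions. The contact map, the messages, the watch term and the helpers `within_hops` and `flag_messages` are all hypothetical, and a naive word match stands in for real NLP, which would also have to handle translation, speech and context.

```python
from collections import deque
import re

# Hypothetical contact graph: person -> set of direct contacts.
CONTACTS = {
    "suspect": {"a"},
    "a": {"suspect", "b"},
    "b": {"a", "c"},
    "c": {"b"},
}

# Hypothetical intercepted messages: (sender, text).
MESSAGES = [
    ("a", "lunch at noon?"),
    ("b", "The package arrives Tuesday."),
    ("c", "package is late"),
]

WATCH_TERMS = {"package"}  # placeholder watch list

def within_hops(contacts, start, max_hops):
    """Return everyone reachable from start in at most max_hops steps."""
    reached = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if reached[person] == max_hops:
            continue  # do not expand beyond the hop limit
        for other in contacts.get(person, ()):
            if other not in reached:
                reached[other] = reached[person] + 1
                queue.append(other)
    return set(reached)

def flag_messages(messages, people, terms):
    """Keep messages sent by someone in people that contain a watch term."""
    flagged = []
    for sender, text in messages:
        words = set(re.findall(r"[a-z']+", text.lower()))
        if sender in people and words & terms:
            flagged.append((sender, text))
    return flagged

nearby = within_hops(CONTACTS, "suspect", 2)
flagged = flag_messages(MESSAGES, nearby, WATCH_TERMS)
print(flagged)
```

Here only the message from b is flagged: a is close enough but says nothing on the watch list, and c uses the term but is three hops out. The point of the sketch is the division of labor, with the graph narrowing whose messages get examined and the text analysis deciding which of those deserve a human's attention.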