Hadoop and other big data analysis apps promise to help cops catch more criminals, but balancing privacy concerns won't be easy.
Despite a name lacking any street cred, a mission only vaguely understood by most non-specialists, and concerns raised by privacy and Constitutional-law experts, the Hadoop big data analysis app is gathering fans on both sides of the thin blue line.
Most enthusiastic are analysts specializing in either crime or data analysis, both of whom can claim a clear potential benefit from piles of data large enough that the answer to a given question can't help but be in there somewhere.
For example, big data comprising more than 1 million police emergency-call records helped redraw a central Texas city's police patrol-beat boundaries into fiefdoms far more easily patrolled than if the entire city were one big hot zone, according to Scott Dickson, a Texas crime analyst and consultant.
By pooling results of all those 911 calls, Dickson was able to assign specific types and levels of risk in different areas of town to decide how many patrols were needed in which areas, and for what type of crime. Frequent but low-intensity crimes such as vandalism or burglary require more day-to-day attention than murders, for example, because murder is comparatively rare.
The result was a more efficient use of a very scarce resource (cops on the beat) and a high level of confidence in levels of police coverage, Dickson said.
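To make the idea concrete, here is a minimal sketch of that kind of pooling: count calls per patrol beat, then split a fixed number of patrols in proportion to call volume. The records, beat names, and function names are hypothetical, and this is not Dickson's actual method, which the article does not describe in detail.

```python
from collections import Counter

# Hypothetical 911 records as (beat, call_type) pairs; real data would be far larger.
CALLS = [
    ("north", "burglary"),
    ("north", "vandalism"),
    ("north", "vandalism"),
    ("south", "burglary"),
    ("east", "vandalism"),
    ("east", "burglary"),
]


def count_calls_by_beat(calls):
    """Count how many calls came from each beat."""
    return dict(Counter(beat for beat, _call_type in calls))


def allocate_patrols(counts, total_patrols):
    """Split patrols across beats in proportion to call volume.

    Uses largest-remainder rounding so the shares add up to total_patrols.
    """
    total_calls = sum(counts.values())
    if total_calls == 0:
        return {beat: 0 for beat in counts}
    quotas = {beat: total_patrols * n / total_calls for beat, n in counts.items()}
    alloc = {beat: int(q) for beat, q in quotas.items()}
    leftover = total_patrols - sum(alloc.values())
    # Give any remaining patrols to the beats with the largest fractional share.
    by_remainder = sorted(quotas, key=lambda b: quotas[b] - alloc[b], reverse=True)
    for beat in by_remainder[:leftover]:
        alloc[beat] += 1
    return alloc


if __name__ == "__main__":
    print(allocate_patrols(count_calls_by_beat(CALLS), 10))
```

Weighting by raw call volume reflects the point above that frequent, low-intensity crimes drive day-to-day patrol needs; a real analysis would also break the counts down by crime type and time of day.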
Most departments have vast piles of data that could be analyzed to make staff and budget decisions more efficient, but only the largest, best-funded police departments are able to consider it, according to the report "Big Data Solutions for Law Enforcement" from consulting company CTOVision.
"The largest, best-funded agencies such as the LAPD handle information with special departments, contracts with firms like IBM, and partnerships with universities," according to Alexander Olesker, technology research analyst at Crucial Point, which owns CTOVision. "As a result, many departments only collect information structured and small enough to fit in a spreadsheet and do nothing more complicated than sums and averages with it to determine crime rates," he said. More departments able to store more data about more incidents can make it easier to reach and defend decisions about what to do about crime hotspots and repeat offenders, Olesker wrote.
Big data analysis can not only pinpoint potential trouble spots in the neighborhood before violence erupts, but also predict more crimes, and more accurately, than any other available method.
UCLA mathematician and big data pioneer George Mohler made the connection that crime, because it occurs in quantifiable patterns, is as predictable as any other cyclical human activity or characteristic--birth rates or political affiliations, for example--making it theoretically possible for the police to get to a crime scene before the culprits.
In Santa Cruz, Calif., big data analysis helped solve a crime by estimating the probability that houses in a particular area would be burglarized on a specific day and at a specific time of the week.
Those records are not complete enough and the algorithms are not sophisticated enough to identify actual criminals or specific crimes before the crimes are committed, but that doesn't stop police from wishing big data could keep both the streets and the cops safer while reducing crime and saving money for taxpayers by making police work more efficient.
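As a rough illustration of a day-and-time estimate like the one described for Santa Cruz, the sketch below computes the share of past burglaries in one area that fell into each weekday-and-hour slot. The report timestamps are invented, and this simple frequency count is an assumption made for illustration, not the model used in Santa Cruz.

```python
from collections import Counter
from datetime import datetime

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

# Hypothetical burglary reports for a single neighborhood.
REPORTS = [
    "2011-06-06T14:10",  # Monday
    "2011-06-13T14:45",  # Monday
    "2011-06-20T15:05",  # Monday
    "2011-06-08T22:30",  # Wednesday
]


def slot_probabilities(timestamps):
    """Return the fraction of incidents in each (weekday, hour) slot."""
    slots = Counter()
    for ts in timestamps:
        when = datetime.fromisoformat(ts)
        slots[(DAYS[when.weekday()], when.hour)] += 1
    total = sum(slots.values())
    return {slot: n / total for slot, n in slots.items()}


if __name__ == "__main__":
    probs = slot_probabilities(REPORTS)
    # Show the slot with the highest historical share of burglaries.
    print(max(probs, key=probs.get))
```

With so few records the estimate means little; the approach only becomes useful with the kind of data volume the analysts quoted here are describing.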
Efficiency Vs. Privacy
That raises the central contradiction of big data within the discussion of public interest and law enforcement: The legitimate use of big data unquestionably delivers tremendous benefits to those seeking answers buried like molecules of needle in a galaxy-sized haystack, according to Omer Tene, associate professor at the College of Management Haim Striks School of Law.
"Data deluge"--the construction of massive databases of customer-usage data in an effort to find a thread of truth within it--raises the immediate risk of privacy violations that would turn most consumers off to big data and law enforcement at the same time.
Big data analysis can predict the type, outbreak and severity of influenza epidemics, for example, only if analysts are able to assemble enough data to make analysis meaningful, he wrote.
Consent-based regulation--asking customers to opt in to various forms of behavior tracking--distorts any picture emerging from the data because it removes the customers most able to provide a contrarian view of the question at hand.
Privacy advocates continue to object to both data mining and behavioral analytics as tools for police investigations for the same reason they object to racial profiling--both are efforts to punish people for acts they may not yet have committed. Consider the privacy-protection activists who warn about initiatives like those in Arizona and California to take pictures of every license plate on every car that passes through the state on the off chance one is occupied by a fugitive. That's not law enforcement, it's Minority Report, some complain.
One solution is to create a balance between the needs of police investigators and privacy protection for citizens, Tene wrote.
That would require limits on what police were allowed to do to collect new data and how they might be permitted to do so without violating rules of privilege and controls over authority laid out in the Constitution.
"Privacy advocates and data regulators increasingly decry the era of big data as they observe the growing ubiquity of data collection and the increasingly robust uses of data enabled by powerful processors and unlimited storage," Tene wrote. "The question of the legitimacy of data use has always been intended to take into account additional values beyond privacy, as seen in the example of law enforcement, which has traditionally been allotted a degree of freedom to override privacy restrictions."
In an age when everyone from AT&T to websites to government agencies is gathering as much data on individuals as possible to sell them something or arrest them the instant they commit a crime, reaching a practical balance between the two is likely to be very difficult.
That, at least, is the view of analysts and, oddly, the Federal Trade Commission, which is looking at how effective Do Not Track programs are in protecting consumer privacy and how far it might have to go in restricting the ability of marketers and cops to strike while the iron is still getting hot.
Minority Report was an interesting movie but, like most movies, it didn't accurately represent either the technical possibilities or the ethical concerns surrounding cops and big data.