Sometimes it's not the size of the data but what you can do with it.
The City of Buffalo, N.Y., has been doing neighborhood "clean sweeps" for more than a decade, trying to address trouble spots in neighborhoods. But the city wasn't using much more than gut feeling to decide where to go.
"We were going by what law enforcement was telling us were hot streets," recalled Oswaldo Mestre Jr., the director of Buffalo's division of citizen services. Mestre, who oversees Buffalo's call center, had data of his own from the 311 system installed in 2006. By 2011, he thought he should be able to use it to get a better idea of what neighborhoods needed clean sweeps. He also knew that the previous head of MIS, Raj Mehta, had finagled a way to hire a geographical information specialist to help with a redistricting project in 2010.
So Mestre asked for a meeting with IT in early 2012, and presented his data dilemma. Daryl Springer, now Buffalo's supervisor of data processing operations, was in that meeting. "Oswaldo had a whole bunch of data and data sources from agencies in and out of the city, from the 311 application, from the Common Council (Buffalo's legislative branch), and from citizen services, and they wanted to compile all this data," Springer said.
Mestre wanted to see what it would look like to geocode housing violation data on a density map, along with data on major crimes -- murder, assault, armed robbery and narcotics -- to see if there were correlations.
Springer knew it would be a challenge. "It's always a snap working with lots of different data sources," he said, with a sardonic laugh. More seriously, Buffalo had not done a significant geocoding project of this sort. But Springer knew he could work with the call center's CRM system, Kana's LAGAN, to set up a way to extract housing violations data into an Excel spreadsheet for geocoding, which took him a few days of work.
The amount of data was not huge by big data standards; for instance, the 311 system in Buffalo gets about 300,000 calls a year. Still, Christopher Conlee, GIS specialist in Buffalo's MIS department, found that a year's worth of calls, mixed with crime report and housing violation data, was useless. "You get to a certain point where you have too much information on a map and it loses its value," Conlee said. "You can make any correlation or see any types of trends you want at that level."
It took Conlee a day or so of working with the data to find that limiting the data set to six months gave the best picture of the city's issues. He created two maps, one built around crime data, the other including socioeconomic data derived from the U.S. Census Bureau.
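To make the six-month cutoff concrete, here is a minimal sketch of trimming a set of dated records to a trailing window. It is not Conlee's actual workflow, which the article does not describe in detail; the record layout, the function name `within_window`, the sample dates, and the use of 183 days as a stand-in for "six months" are all assumptions made for illustration.

```python
from datetime import date, timedelta

def within_window(records, end, days=183):
    """Keep records dated after (end - days) and on or before end.

    `records` is a list of dicts with a "date" key (hypothetical layout).
    183 days is an assumed approximation of six months.
    """
    start = end - timedelta(days=days)
    return [r for r in records if start < r["date"] <= end]

# Hypothetical sample records, not real Buffalo 311 data.
calls = [
    {"id": 1, "date": date(2012, 1, 15)},
    {"id": 2, "date": date(2012, 5, 2)},
    {"id": 3, "date": date(2012, 8, 20)},
]

# Only records from roughly the last six months before Sept. 1 remain.
recent = within_window(calls, end=date(2012, 9, 1))
```

The same filter would be applied to each source (311 calls, crime reports, housing violations) before mapping, so that every layer covers the same period.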
The Clean Sweep maps are some of the most complex GIS visualizations Buffalo uses. Conlee receives data in Excel format from the nine Common Council districts, from the police department, from the 311 system, and from the Mayor's office and community block clubs. He then geocodes the data and uses a spatial analysis module in ESRI's GIS product to create density maps.
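The idea behind a density map can be sketched without GIS software. The example below is a simplified stand-in, not ESRI's spatial analysis module: it counts already-geocoded points per fixed-size grid cell and then lists the cells where two layers are both dense. The function names `density_grid` and `hot_cells`, the 0.01-degree cell size, the `min_count` threshold, and the coordinates are all hypothetical.

```python
import math
from collections import Counter

def density_grid(points, cell_deg=0.01):
    """Count (lat, lon) points per grid cell of `cell_deg` degrees.

    This is plain grid binning, an assumed simplification of what a
    GIS density tool produces.
    """
    counts = Counter()
    for lat, lon in points:
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        counts[cell] += 1
    return counts

def hot_cells(a, b, min_count=2):
    """Return the cells where both layers reach at least `min_count` points."""
    return sorted(
        cell for cell in a
        if a[cell] >= min_count and b.get(cell, 0) >= min_count
    )

# Hypothetical geocoded points, not real Buffalo data.
violations = [(42.9153, -78.8421), (42.9158, -78.8427), (42.9301, -78.8502)]
crimes = [(42.9151, -78.8429), (42.9159, -78.8422), (42.9702, -78.8003)]

violation_grid = density_grid(violations)
crime_grid = density_grid(crimes)

# Cells where housing violations and crimes are both concentrated.
overlap = hot_cells(violation_grid, crime_grid)
```

A fixed grid is the crudest way to estimate density; it is used here only because it makes the overlap between layers easy to see and needs nothing beyond the standard library.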
Creating these visualizations required Buffalo to expand its virtual data center, moving from two CPU cores to five on both of its GIS servers, and from about 4 GB of RAM to as much as 8 GB.
For Mestre, the maps have been a revelation. With them, he can show where poverty, crime and 311 calls overlap, suggesting neighborhoods with special needs. Working through Mayor Byron W. Brown, Mestre began expanding the groups that participated in the clean sweeps, adding a number of local government service departments, county and state agencies and non-profits. Kana Software's Steve Carter said many of its customers do things similar to clean sweeps, but Buffalo is unusual for the breadth of cooperation between services and agencies, and the things that get fixed.
On a recent Wednesday afternoon, Mestre had just come from a Clean Sweep in Delavan Grider, part of the Masten district in east Buffalo. Along with his department were 70 to 80 workers from different city and county agencies, as well as some area non-profits and local utilities, the latter because the area had indicators of illegal cable and electricity sharing. Streetlights were fixed, vacant buildings were boarded up, and home inspections were performed.
"I'm not saying that doing the Clean Sweep is the end of all problems," Mestre said. "But in neighborhoods that feel like they've been forgotten, they say like, 'wow.'"
Buffalo residents have a lot more opportunities to say wow. Last year, the first with the new density maps, Buffalo more than tripled the number of clean sweeps, doing 27. This year, Buffalo will set a new record, 28 clean sweeps. It also expanded its Enforcement Mini Sweeps, where groups like the Buffalo police, state parole officers, county probation officers and the Buffalo Peacemakers Gang Intervention Program target particular high-crime areas. Last year, there were two. This year, there are 20 so far, on track for 25.
Those numbers aren't big data, either, but they're huge in the life of the city. "Our partners have grown, the amount of clean sweeps we've done has grown, my department hasn't grown," Mestre said. He credits various partner groups for stepping up, giving better data to help refine the clean sweep process. And he gives IT a plug, too. "I'm excited about the fact we can go up to our IT department and get a one-stop shop, somebody who understands our needs and can respond."
Making decisions based on flashy macro trends while ignoring "little data" fundamentals is a recipe for failure. Also in the new, all-digital Blinded By Big Data issue of InformationWeek: How Coke Bottling's CIO manages mobile strategy. (Free registration required.)