
Big Data's Big Question: What To Keep

Keep as much data as your budget will allow, advises security expert -- it may answer questions you haven't thought up yet.
How much data should your organization save? Storing terabytes of digital information is not only costly, it also can lead to decision-making headaches such as, "What data do I need to keep?"

Opinions on this evolving topic vary, naturally. Some big data gurus advise against saving everything, calling that approach a waste of space and money. On the other hand, increasingly cheap disk and cloud storage may make the save-it-all approach attractive to digital hoarders because, well, you never know when those expense reports from 1955 might come in handy.

Enterprise security expert Joe Gottlieb is no data hoarder, but he does recommend that organizations err on the side of caution when it comes to saving information. Formerly the CEO of Sensage, a security information and event management (SIEM) vendor acquired by cybersecurity firm KEYW last October, Gottlieb is now the head of commercial products at KEYW.

"You don't know what question you're going to answer tomorrow, and when you ask it, you'll be relieved that you kept the data," Gottlieb told InformationWeek in a phone interview.

But once you've saved this data, don't forget about it. "Err on the side of keeping more stuff, but keep an eye on it," Gottlieb added. "Keep as much as you can afford to, and push yourself to use it. If you're not using it, you're going to start to feel badly about spending the money to store it."

If you're jumping feet first into big data, don't ignore the security implications. "Are people just chasing big data solutions now, and not necessarily wrapping them with the necessary security?" Gottlieb asked rhetorically. "Of course -- as is similar to any other juggernaut industry trend."

When it comes to big data security, cutting-edge solutions such as predictive analytics may work in some cases, but they're still rough around the edges. "If you have an open security data warehouse, like we do, you could aim predictive analytics at it," said Gottlieb. "You could start running analysis on this data and use predictive analytics algorithms to indicate something that might happen in the future."

This security technology, however, needs work before it goes mainstream. "We're not yet seeing the maturity in that state to really predict attacks," said Gottlieb. "What is more readily available today is to understand that attacks happen over time, and they start throwing off events that are indicative of an initial level of compromise before they get to their ultimate destination or target."

That is the profile of a botnet or other malcode, which may land on an end-user workstation when, for instance, the user clicks on something they shouldn't have, Gottlieb noted. "Then there's code on the user's machine, and that code starts to signal to a control server, and starts to migrate laterally from machine to machine until it gets to an application server that happens to have a lot of data on it," he added.

As the malcode moves laterally across your network, it produces subtle behavioral changes that are detectable if you're paying attention. "These attacks tend to go unnoticed over long periods of time," said Gottlieb. "But if you could notice them via data analysis, you'd be able to thwart them."
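To make the idea of spotting lateral movement concrete, here is a minimal Python sketch. It is not Gottlieb's or KEYW's method; it simply assumes you have connection logs reduced to (source host, destination host) pairs, and it flags machines that suddenly start talking to many machines they never contacted during a baseline period. The function names, the host names, and the `max_new_peers` threshold are all hypothetical.

```python
from collections import defaultdict


def new_peer_counts(baseline, recent):
    """Count, per source host, destinations seen in `recent` but never in `baseline`.

    Both arguments are iterables of (source_host, destination_host) pairs.
    """
    # Build the set of destinations each source contacted during the baseline.
    known = defaultdict(set)
    for src, dst in baseline:
        known[src].add(dst)

    # Collect destinations that are new for each source in the recent period.
    fresh = defaultdict(set)
    for src, dst in recent:
        if dst not in known[src]:
            fresh[src].add(dst)
    return {src: len(dsts) for src, dsts in fresh.items()}


def suspicious_hosts(baseline, recent, max_new_peers=3):
    """Hosts that contacted more than `max_new_peers` previously unseen machines.

    The default threshold is an assumption for illustration only.
    """
    counts = new_peer_counts(baseline, recent)
    return sorted(host for host, n in counts.items() if n > max_new_peers)
```

A real deployment would need a longer baseline and allowances for legitimate changes, such as a newly provisioned server, but the comparison of recent behavior against retained history is the part that depends on having kept the data.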

One good way to find malcode is to study users' behavior. Gottlieb recommends asking these questions:

-- How much are users downloading?

-- Are they creating new accounts?

-- Is data moving into other user accounts?

-- Are users making different connections after making large downloads?

-- Is this activity being initiated with so much speed that a human couldn't do it? If so, "clearly some sort of automated malcode is doing it," Gottlieb said.
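The last question in the list, whether activity is happening faster than a human could manage, lends itself to a simple check. The Python sketch below assumes event logs reduced to (user, timestamp) pairs and flags any user with more than a set number of actions inside a short sliding window. The threshold values and the function name are hypothetical and would need tuning against real traffic.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical thresholds: more than MAX_HUMAN_ACTIONS events inside one
# WINDOW is treated as too fast for a person.
MAX_HUMAN_ACTIONS = 5
WINDOW = timedelta(seconds=1)


def flag_automated_users(events):
    """Return the set of users with more than MAX_HUMAN_ACTIONS events in any WINDOW.

    `events` is an iterable of (user, timestamp) pairs with datetime timestamps.
    """
    by_user = defaultdict(list)
    for user, ts in events:
        by_user[user].append(ts)

    flagged = set()
    for user, stamps in by_user.items():
        stamps.sort()
        start = 0
        for end in range(len(stamps)):
            # Advance the left edge until the window spans at most WINDOW.
            while stamps[end] - stamps[start] > WINDOW:
                start += 1
            if end - start + 1 > MAX_HUMAN_ACTIONS:
                flagged.add(user)
                break
    return flagged
```

The other questions (download volume, new accounts, data moving between accounts) could be handled the same way, by comparing each user's recent counts against that user's own history.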
