Just how big is big data? What problems does Hadoop solve? This mom explained it in plain English using the example of an online game.

Daria Hutchinson, Sr. Manager, Technical Publications, Platfora

July 2, 2014

to record every 30 minutes. For just one day, that would be over 14 million things.

Jo: Wow. That sounds like a lot!

Me: If you think it's big, then it's big.

Variety: Big data is complicated
Me: The second thing about big data is that it comes in all different formats, which makes it hard to store it all together in the same place and to look at it all together at the same time. For example, your game has an online version and a mobile app, which are both collecting different data about what you're doing.

Jo: That sounds complicated.

Me: It is complicated. Companies are collecting data from a lot of different places, and they need somewhere to put it so it can all live together. There is software called Hadoop that solves this problem. Hadoop works like one big file system spread across many computers, and just like the file system on your laptop, it can hold many different kinds of files easily. Hadoop solves the size problem of big data, too. If your data gets too big to fit on the computers you have, you can just add more computers. It grows as your data grows.

Jo: So Hadoop solves all of the problems of big data?

Me: It solves some, but not all of them. It does a good job at storing lots of data of all different kinds, but it is still hard to read and understand the data. Each type of file needs a special program to read it, just like you need a music player to listen to your songs and a photo viewer to look at your pictures. It is really hard to make these special programs for reading data, so companies don't always ask all of the questions they want.
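For readers who want to see the "special program" idea in code, here is a minimal Python sketch. It assumes, purely for illustration, that the online version of the game exports purchases as CSV and the mobile app exports them as JSON; the article does not say which formats the game company actually uses, and the sample data and function names are made up. Each format gets its own small reader, and only after both are converted into the same shape can one question be asked of all the data together.

```python
import csv
import io
import json

# Hypothetical sample data; the real game's formats are not given in the article.
WEB_CSV = "player,item,coins\njo,purse,120\njo,pet,300\n"
MOBILE_JSON = '[{"player": "jo", "item": "hat", "coins": 50}]'


def read_web_csv(text):
    """Reader for the (assumed) CSV export of the online version."""
    rows = csv.DictReader(io.StringIO(text))
    return [
        {"player": r["player"], "item": r["item"], "coins": int(r["coins"])}
        for r in rows
    ]


def read_mobile_json(text):
    """Reader for the (assumed) JSON export of the mobile app."""
    return [
        {"player": r["player"], "item": r["item"], "coins": int(r["coins"])}
        for r in json.loads(text)
    ]


def coins_spent(records, player):
    """One question asked across both sources: total coins a player spent."""
    return sum(r["coins"] for r in records if r["player"] == player)


# Both sources now share one shape, so they can be looked at together.
all_records = read_web_csv(WEB_CSV) + read_mobile_json(MOBILE_JSON)
total = coins_spent(all_records, "jo")
print(total)
```

Every new format would need another reader like these two, which is why reading varied data takes so much work.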

Velocity: How long are you willing to wait for answers?
Me: The third thing about big data is that it can come in very fast at times. Just think about when your game has a special promotion. More players start using the game, so the game company has to collect more data than usual. It can also be hard to read the data quickly. With big data, you have to be fast at capturing the data, but you also have to be fast at reading it. Nobody likes to wait. For some questions, a little bit of waiting is acceptable. For others, answers are needed right away.

Jo: Like what?

Me: What if you bought a new purse in your game but had to wait 24 hours until you could use it? Would that be OK with you?

Jo: No. I want it right away!

Me: Tracking what you bought might have to be handled in a different way, so you never have to wait. How about adding new items to the store for you to buy? How long would you be willing to wait for that?

Jo: I'd be OK waiting a few days for new things to be added to the store… as long as they were cute things.

Me: So deciding what new things to add to the store is something that can take a little longer. The game company might need to ask a lot of questions of the data. If the company had to wait a long time for the answers, it might not ask all the questions necessary to figure out what those things are.

Value: Big data means nothing without human insight
Jo: So if my game company could keep all the data, look at it all together, and ask questions really fast, they would be able to make the game more fun for me? Do you think they would ever add pets to their mobile app?

Me: That is the whole point. Finding something valuable in all that data is like winning a purse full of gold coins. When people can see the data and explore it easily, then they can learn some very interesting things. So, yes, if the people at your game company knew how much time and money you spent on pets, I bet they would add pets to the mobile app.

Jo: That would be really cool! Now I get big data.

Success! I had explained big data to a fifth grader. It was a great chance to show Jo what I do during my workday and to fuel her curiosity about technology. Perhaps one day she, too, will choose to join the growing ranks of women in our industry.

About the Author(s)

Daria Hutchinson is the Senior Manager of Technical Publications at Platfora, a big-data analytics software company, where she writes all types of technical prose, including user guides, training curriculum, instructional videos, whitepapers, and online help systems. Daria is a former Nordstrom buyer turned big-data enthusiast. 
