The week between Christmas and New Year's Day brought no rest in the world of big data and analytics. But in case you took a few days off, we've got you covered. Here's what you may have missed if you blinked during the quiet holiday week.
We recap news about an analytics company with roots in serving the US intelligence community, new insight about what programming language you should learn, benchmarking on big data streaming technologies, and an important application of data science -- analyzing networks of characters in the popular holiday movie Love Actually.
Let's start off with the analytics company. Last week Palantir Technologies raised a hefty $880 million in its most recent round of funding, according to The New York Times, which cited people familiar with the transaction.
The new round brings the company's total amount raised to about $2 billion and makes it worth about $20 billion, according to the report.
Palantir offers analytics and visualization software and works with government agencies including law enforcement and intelligence, as well as with customers in the private sector. Former PayPal employees and Stanford computer scientists started the company in Palo Alto, Calif., in 2004, according to Crunchbase. The New York Times report places the employee number at around 2,000.
Alex Karp, CEO and cofounder, has indicated that the company has no interest in going public due to the nature of its work. According to its website, Palantir works on problems including crime investigation, government intelligence, insurance fraud investigation, disaster relief resource mobilization and delivery, identifying sources of disease outbreaks, and tracking the spread of a disease.
Palantir specializes in intelligence augmentation, which helps experts find patterns in large amounts of data, according to the report. For instance, it could build a profile of a terrorist cell from bank records and mobile phone calls.
The Programming Language to Learn?
If you are looking to advance your career or considering whom to hire in 2016, you may be looking at particular skill sets, including coding. But what languages should you target? One programming language instructor has made a discovery in Google Trends: In November 2015, for the first time ever, more people searched for "learn Python" than for "learn Java." The two trend lines converged that month after Python searches had risen steadily and Java searches had edged only slightly upward over the last few years, the instructor reports.
"Looking at the last 5 years, the demand to 'learn python' was constantly rising," Oli Moser wrote. "Python has already become the number 1 programming course for beginners in many universities years ago. The main reason for this is simple: Python is simple. I think it has the simplest and most intuitive syntax from all programming languages."
The author goes on to recommend that those who aspire to big data positions learn this particular language.
Streaming technologies were a big topic in big data circles in 2015, and just in time for the new year, Yahoo, the company that spawned Hadoop, has benchmarked three stream processing frameworks: Apache Flink, Spark, and Storm. Yahoo says it has been using Storm since 2012 but wanted to look at other options, so it devised a benchmark that is now published on GitHub.
Yahoo tested an application related to advertising with 100 campaigns and 10 ads per campaign. It used five Kafka nodes to generate JSON events. More info about the testing is available at a Yahoo engineering blog, and tests are ongoing.
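To make the setup concrete, here is a minimal Python sketch of the kind of workload described above. It is not Yahoo's benchmark code. It only reproduces the 100-campaign, 10-ads-per-campaign shape and emits JSON events in memory, with no Kafka involved. The field names (`ad_id`, `event_type`, `event_time`), the event types, and the per-campaign view count are assumptions made for illustration.

```python
import json
import random
from collections import Counter

NUM_CAMPAIGNS = 100     # from the article
ADS_PER_CAMPAIGN = 10   # from the article


def build_ad_to_campaign():
    """Map every ad ID to its campaign ID (the ID formats are hypothetical)."""
    mapping = {}
    for c in range(NUM_CAMPAIGNS):
        for a in range(ADS_PER_CAMPAIGN):
            mapping[f"ad-{c}-{a}"] = f"campaign-{c}"
    return mapping


def generate_events(ad_ids, n, seed=0):
    """Yield n JSON-encoded ad events; the schema is an assumption."""
    rng = random.Random(seed)
    for i in range(n):
        yield json.dumps({
            "ad_id": rng.choice(ad_ids),
            "event_type": rng.choice(["view", "click", "purchase"]),
            "event_time": i,  # stand-in for a real timestamp
        })


def count_views_per_campaign(events, ad_to_campaign):
    """Parse each event, keep only views, and count them per campaign."""
    counts = Counter()
    for raw in events:
        event = json.loads(raw)
        if event["event_type"] == "view":
            counts[ad_to_campaign[event["ad_id"]]] += 1
    return counts


if __name__ == "__main__":
    ad_to_campaign = build_ad_to_campaign()
    events = generate_events(list(ad_to_campaign), 1000)
    counts = count_views_per_campaign(events, ad_to_campaign)
    print(counts.most_common(3))
```

In a streaming framework the counting step would run continuously over incoming events rather than over a finite list.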
Love Is All Around
Finally this week, you may be one of the people who watch the British holiday-themed romantic comedy Love Actually as part of a more modern Christmas tradition. The film follows several different story lines of loosely related characters, and days later you may wonder how this guy is related to that other woman. Well, now someone has used data modeling tools to deliver the answer and has written a blog post about it.
"Even on the eighth or ninth viewing, it's impressive what an intricate network of characters it builds," wrote David Robinson, a data scientist at Stack Overflow, in the post. "This got me wondering how we could visualize the connections quantitatively, based on how often characters share scenes. So last night while my family was watching the movie, I loaded up RStudio, downloaded a transcript, and started analyzing."
What follows is a series of visualizations that add a new dimension to the film.
Robinson concludes: "If you look for it, I've got a sneaky feeling you'll find that data actually is all around us."
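The core idea, counting how often characters share scenes, can be sketched in a few lines. Robinson worked in R from a film transcript. The Python below is not his code, and it uses placeholder character names rather than data from the film. It assumes each scene has already been reduced to a list of the characters who appear in it.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(scenes):
    """Count, for each pair of characters, how many scenes they share.

    `scenes` is a list of lists of character names (a hypothetical input
    format; extracting it from a transcript is a separate step).
    """
    counts = Counter()
    for characters in scenes:
        # Deduplicate and sort so each pair has one canonical ordering.
        for pair in combinations(sorted(set(characters)), 2):
            counts[pair] += 1
    return counts


# Placeholder scenes, not taken from the film.
scenes = [
    ["A", "B"],
    ["B", "C", "A"],
    ["C"],
]

pair_counts = cooccurrence_counts(scenes)
for pair, n in sorted(pair_counts.items()):
    print(pair, n)
```

The resulting pair counts can serve as edge weights in a network graph, which is the kind of structure a character-network visualization is built on.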