Data integration vendors are hot to get in on big data
When Hadoop first emerged, we all heard it would displace ETL. That's at least partially true for some transformation processing, but data-integration vendors -- like Informatica, Paxata, and now Pentaho -- are saying their tools are needed for all sorts of data prep and processing ahead of big-data analysis. It's another case of offering an alternative to clunky MapReduce processing, but I haven't talked to enough customers who can validate how useful these tools are in big-data-analysis scenarios.
The "80% of the work" line above seems like a relic of relational data warehousing approaches, but I need to hear from more practitioners -- yes, this is a naked plea for comments from practitioners -- before writing this off as an overstatement or marketing ploy.