
HP Offers Appliance For Microsoft Analytics Platform

HP ConvergedSystem 300 promises fast deployment of SQL Server 2014 and HDInsight Hadoop.

User Rank: Apprentice
7/29/2014 | 12:37:58 PM
Re: Want a Hadoop Appliance? Go Build It Yourself
Replying to the comment by Mr. D. Henschen


Maybe Microsoft APS with HDInsight region(s) was designed:

- to take advantage of the internal rack network speed (54 GB/sec) when accessing non-structured data (PolyBase lets you access structured and non-structured data using just regular T-SQL. You don't need to learn MapReduce, Hive, Pig, etc. to access non-structured data).

- or to comply with regulatory restrictions (like: "- All the sensitive data must be stored inside the country, not in an I-don't-know-where distributed storage farm")

- or just because accessing data outside the appliance means using the corporate network, which is often not a dedicated network, so performance is not as good as it should be.

- or because statistical information could be stored in the control node, so it's possible to create improved execution plans when accessing non-structured data

- or because the control node can manage the different workloads, providing better performance to end-user applications

- or because it's easier to manage the HDFS if it's on-prem

- or because it's easier to manage security (who accesses what) in structured and non-structured data.

- ...

So, IMHO, I guess there are many reasons to deploy such a "mixed" architecture in one rack.

As you mention, at least 1 basic unit (2 compute nodes + storage in the HP appliance, or 3 compute nodes + storage in the Dell appliance) with SQL Server EE is required. All the other nodes could be HDInsight regions (in any flavour: Hadoop, Cloudera, etc.).
D. Henschen
User Rank: Author
5/9/2014 | 10:02:28 AM
Want a Hadoop Appliance? Go Build It Yourself
Microsoft APS follows the route of adding Hadoop -- in this case Microsoft's HDInsight distribution -- to what is otherwise a database appliance -- in this case Microsoft SQL Server 2014 Parallel Data Warehouse. Want a Hadoop-only deployment? You'll have to use Hortonworks Data Platform for Windows. HP does have suggested hardware configurations for HDP, but if you want a Hadoop appliance, it's probably going to throw HP Vertica Community Edition into the bundle -- just as Pivotal likes to mingle Greenplum with the Pivotal HD Hadoop distribution.

I'm not sure these guys are thinking these things through to scale. Relational databases and Hadoop do work together, but if you're doing a data lake with Hadoop, I'm thinking you're only going to need one rack with RDBMS while the Hadoop cluster is going to potentially grow across many racks. Or maybe you would have a couple of RDBMS racks and a handful for Hadoop. Why intermix on a single rack unless you're not likely to move beyond the confines of a single rack? PolyBase connects the two worlds no matter where the clusters might be: on separate racks or even in the cloud on Azure. Opinions?