Cloudera Aims to Replace MapReduce With Spark as Default Hadoop Framework  
   
  Monday, September 14, 2015  
  Looking to tie the Apache Spark in-memory computing framework much closer to Apache Hadoop, Cloudera announced it is leading an effort to make Spark the default data processing framework for Hadoop.
Most IT organizations consider MapReduce to be a fairly arcane programming tool. For that reason, many have adopted any number of SQL engines as mechanisms for querying Hadoop data.
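The "arcane" reputation comes from MapReduce forcing developers to express even simple jobs as explicit map and reduce phases. As a rough illustration only, here is the classic word-count pattern emulated in plain Python (so it runs without a cluster); the function names and sample input are invented for the sketch, and the Spark equivalent is shown as a comment:

```python
from collections import defaultdict

# Classic MapReduce style: the developer writes an explicit map phase,
# an explicit reduce phase, and reasons about the shuffle/grouping step
# that sits between them.
def mapper(line):
    for word in line.split():
        yield (word, 1)

def reducer(word, counts):
    return (word, sum(counts))

def mapreduce_word_count(lines):
    groups = defaultdict(list)
    for line in lines:                  # map phase
        for key, value in mapper(line):
            groups[key].append(value)   # shuffle: group values by key
    return dict(reducer(k, v) for k, v in groups.items())  # reduce phase

lines = ["spark on hadoop", "spark streaming"]
print(mapreduce_word_count(lines))
# {'spark': 2, 'on': 1, 'hadoop': 1, 'streaming': 1}

# In Spark the same job collapses into one chain of transformations, e.g.:
#   sc.textFile(path).flatMap(str.split).map(lambda w: (w, 1)).reduceByKey(add)
# (PySpark sketch; it needs a live SparkContext, so it is shown as a comment.)
```

The contrast in the comments is the crux of the argument: Spark's chained, functional API hides the boilerplate that MapReduce makes the programmer write by hand.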
Brandwein noted that there are at least 50 percent more active Spark projects than Hadoop projects. Cloudera also has five times more engineering resources dedicated to Spark than any other Hadoop vendor and has contributed more than 370 patches and 43,000 lines of code to the open source project. The company's long-term goal is to make it possible for Spark jobs to scale simultaneously across multi-tenant clusters with more than 10,000 nodes, which will require significant improvements in Spark reliability, stability, and performance.
Cloudera also led the integration of Spark with YARN for shared resource management on Hadoop, as well as efforts to make Spark simpler to manage in enterprise production environments and to ensure that Spark Streaming supports at least 80 percent of common stream processing workloads. Finally, Cloudera will look to improve Spark Streaming performance and to open up those real-time workloads to higher-level language extensions.
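In practice, the Spark-on-YARN integration surfaces as a submission mode of `spark-submit`. A minimal sketch, assuming a configured Hadoop/YARN cluster: `--master yarn` and `--deploy-mode` are standard Spark options, while the script name and resource sizes below are illustrative placeholders, not values from the article:

```shell
# Submit a Spark application to a YARN-managed Hadoop cluster,
# letting YARN handle shared resource allocation across tenants.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 4 \
  --executor-memory 2g \
  my_job.py   # placeholder application script
```

Running under YARN rather than Spark's standalone scheduler is what lets Spark jobs share cluster resources with other Hadoop workloads, which is the multi-tenancy scenario Cloudera's scaling goals target.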
 
     
 
COPYRIGHT (C) 1992-2015 China HPC Technology ALL RIGHTS RESERVED.