Apache Spark Graduates from Incubator

By Joe Casad

Cool Big Data analysis framework offers up to 100x speed increase over ordinary MapReduce.

The Apache Software Foundation has voted to graduate the Apache Spark project from the Apache Incubator and give it full status as an Apache top-level project. The Apache Spark project describes itself as "… a fast and general engine for large-scale data processing." The tool has recently come to prominence as a fast analytics framework for Hadoop. According to the Spark project website, Spark can "run programs up to 100 times faster than Hadoop MapReduce in memory or 10 times faster on disk."
The recent attention to Big Data technologies has raised the visibility of Spark, as developers look for ways to speed up the processing of large and complex queries. The project claims to have had 5,000 commits in the past six months, and other promising data-analysis projects, such as Shark for SQL, MLlib, and GraphX, are based on Spark.

02/18/2014