Free Apache Hadoop Alternatives
The best free alternative to Apache Hadoop is Apache Spark, which is also Open Source. If that doesn't suit you, our users have ranked seven alternatives to Apache Hadoop, and five of them are free, so hopefully you can find a suitable replacement. Other interesting free alternatives to Apache Hadoop are Apache Flink, Disco MapReduce, HPCC Systems and dispy.
Apache Hadoop alternatives are mainly Cloud Computing Services but may also be Web Analytics Services. Filter by these if you want a narrower list of alternatives or are looking for a specific functionality of Apache Hadoop.
- 9 Apache Spark alternatives
- Cloud Computing Service
- Web Analytics Service
- Free • Open Source
- Mac
- Windows
- Linux
Apache Spark™ is a fast and general engine for large-scale data processing. Speed: run programs up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk.
Spark has an advanced DAG execution engine that supports acyclic data flow and in-memory computing.
- Apache Spark is the most popular Windows, Mac & Linux alternative to Apache Hadoop.
- Apache Spark is the most popular Open Source & free alternative to Apache Hadoop.
Apache Spark Features
- 15 Apache Flink alternatives
- Cloud Computing Service
- Free • Open Source
- Mac
- Windows
- Linux
- BSD
Flink’s core is a streaming dataflow engine that provides data distribution, communication, and fault tolerance for distributed computations over data streams.
Apache Flink Features
- 6 Disco MapReduce alternatives
- Free • Open Source
- Mac
- Windows
- Linux
Disco is an implementation of MapReduce for distributed computing. Disco supports parallel computations over large data sets stored on an unreliable cluster of computers, as in the original framework created by Google.
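For readers unfamiliar with the MapReduce model mentioned above, the following is a minimal single-machine sketch in plain Python. It does not use Disco's API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `map_reduce`) and the sample input are hypothetical and chosen only for illustration. A real framework would run the map and reduce steps on many machines and handle node failures.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key, as a framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values collected for one key into a single result."""
    return key, sum(values)


def map_reduce(lines):
    """Run map, shuffle and reduce in sequence on one machine."""
    pairs = (pair for line in lines for pair in map_phase(line))
    groups = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in groups.items())


# Hypothetical sample input: two short "documents".
counts = map_reduce(["big data big clusters", "data flows"])
```

Because each call to `map_phase` and each call to `reduce_phase` is independent of the others, a distributed framework can assign them to different machines, which is what makes the model suitable for large data sets.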
Disco MapReduce Features
HPCC Systems offers an open source cluster computing platform used to solve Big Data problems. Its unique architecture and simple yet powerful data programming language (ECL) make it a compelling solution for data-intensive computing needs.
HPCC Systems Features
dispy is a Python framework for parallel execution of computations by distributing them across multiple processors on a single machine (SMP), or among many machines in a cluster or grid. dispy is well suited for the data-parallel (SIMD) paradigm where a computation is evaluated with...
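The data-parallel pattern described above, in which one computation is applied to many independent inputs, can be sketched with Python's standard library alone. This is not dispy's API: a `ThreadPoolExecutor` stands in for the processors or cluster nodes, and `compute` and `run_data_parallel` are hypothetical names used only for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def compute(n):
    """The single computation that is applied to every input."""
    return sum(i * i for i in range(n))


def run_data_parallel(inputs, workers=4):
    """Apply `compute` to each input using a pool of workers.

    Assumption: threads stand in for the processors or machines that a
    framework such as dispy would use.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the inputs.
        return list(pool.map(compute, inputs))


results = run_data_parallel([1, 2, 3, 4])
```

For CPU-bound work in CPython, threads mainly show the structure of the pattern; a process pool or a cluster framework would be the more realistic choice for actual speedups.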
dispy Features