This topic is about Apache Flink – Flink vs Spark vs Hadoop.
The table below compares three of the most popular big data frameworks: Apache Hadoop, Apache Spark and Apache Flink.
Feature | Apache Hadoop | Apache Spark | Apache Flink |
---|---|---|---|
Year of Origin | 2005 | 2009 | 2009 |
Place of Origin | MapReduce (Google), Hadoop (Yahoo) | University of California, Berkeley | Technical University of Berlin |
Data Processing Engine | Batch | Batch | Stream |
Processing Speed | Slower than Spark and Flink | Up to 100x faster than Hadoop MapReduce for in-memory workloads | Generally faster than Spark for streaming workloads |
Programming Languages | Java, C, C++, Ruby, Groovy, Perl, Python | Java, Scala, Python and R | Java and Scala |
Programming Model | MapReduce | Resilient Distributed Datasets (RDD) | Cyclic dataflows |
Data Transfer | Batch | Batch | Pipelined and Batch |
Memory Management | Disk-based | JVM-managed | Actively managed |
Latency | High | Medium | Low |
Throughput | Medium | High | High |
Optimization | Manual | Manual | Automatic |
API | Low-level | High-level | High-level |
Streaming Support | NA | Spark Streaming | Flink Streaming |
SQL Support | Hive, Impala | SparkSQL | Table API and SQL |
Graph Support | NA | GraphX | Gelly |
Machine Learning Support | NA | MLlib | FlinkML |
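
The "Data Transfer" row is easier to picture with a small example. The sketch below is plain Python, not Hadoop, Spark or Flink code, and the function names `batch_word_count` and `pipelined_word_count` are made up for illustration. It only contrasts the two ideas: a batch job finishes each stage over the whole input before the next stage starts, while a pipelined job pushes each record through all stages as it arrives and can emit running results.

```python
from collections import Counter
from typing import Iterable, Iterator, List, Tuple

# Toy input; in a real job this would come from files or a message stream.
LINES = ["flink spark hadoop", "spark flink", "flink"]


def batch_word_count(lines: List[str]) -> Counter:
    """Batch style: every stage materializes its full output first."""
    words = [w for line in lines for w in line.split()]  # stage 1: all words
    pairs = [(w, 1) for w in words]                      # stage 2: all pairs
    counts: Counter = Counter()
    for word, one in pairs:                              # stage 3: reduce
        counts[word] += one
    return counts


def pipelined_word_count(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Pipelined style: each record flows through all stages immediately,
    and an updated running count is emitted per word."""
    counts: Counter = Counter()
    for line in lines:
        for word in line.split():
            counts[word] += 1
            yield word, counts[word]


if __name__ == "__main__":
    print(dict(batch_word_count(LINES)))
    for update in pipelined_word_count(LINES):
        print(update)
```

The batch version returns a single final result, whereas the pipelined version yields an update for every incoming word. This is the general reason a pipelined engine can offer lower latency on unbounded input.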