This topic compares batch processing and real-time processing in the context of Apache Flink.
In terms of Big Data, there are two types of processing −
- Batch Processing
- Real-time Processing
Processing based on data collected over time is called Batch Processing. For example, a bank manager may want to process the past month's data (collected over time) to find out how many cheques were cancelled during that month.
Processing based on immediate data to produce an instant result is called Real-time Processing. For example, a bank manager receives a fraud alert immediately after a fraudulent transaction has occurred (instant result).
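The two bank examples above can be sketched in plain Python. This is only an illustration of the two processing styles, not Flink code: the `Transaction` class, the function names, and the amount-based fraud rule are all hypothetical and chosen for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class Transaction:
    # Hypothetical record type, invented for this illustration.
    account: str
    amount: float
    cheque_cancelled: bool = False


def count_cancelled_cheques(history: List[Transaction]) -> int:
    """Batch style: the whole data set already exists, so scan it in one pass."""
    return sum(1 for tx in history if tx.cheque_cancelled)


def fraud_alerts(stream: Iterable[Transaction], limit: float) -> Iterator[str]:
    """Streaming style: handle each event as it arrives and emit alerts at once."""
    for tx in stream:
        # Assumed fraud rule (amount above a limit), for illustration only.
        if tx.amount > limit:
            yield f"ALERT: {tx.account} spent {tx.amount:.2f}"


transactions = [
    Transaction("A-1", 120.0),
    Transaction("B-7", 9800.0),
    Transaction("A-1", 45.0, cheque_cancelled=True),
    Transaction("C-3", 15000.0, cheque_cancelled=True),
]

# Batch: one answer, computed after all the data has been collected.
cancelled = count_cancelled_cheques(transactions)
print("Cancelled cheques:", cancelled)

# Streaming: an alert is produced as soon as the offending event is seen.
alerts = list(fraud_alerts(iter(transactions), limit=5000.0))
for alert in alerts:
    print(alert)
```

The batch function needs the complete list before it can return anything, whereas the generator yields its first alert without reading the rest of the input. That difference in when results become available is the core of the distinction.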
The table below lists the differences between Batch and Real-Time Processing −
Batch Processing | Real-Time Processing |
---|---|
Static Files | Event Streams |
Processed periodically (every minute, hour, day, etc.) | Processed immediately as events arrive (sub-second latency) |
Past data on disk storage | In Memory Storage |
Example − Bill Generation | Example − ATM Transaction Alert |
Real-time processing is now widely used across organizations. Use cases such as fraud detection, real-time alerts in healthcare, and network attack alerts require immediate processing of incoming data; a delay of even a few milliseconds can have a huge impact.
An ideal tool for such real-time use cases is one that takes its input as a stream rather than as a batch. Apache Flink is such a real-time processing tool.