Apache Flink – Batch vs Real-time Processing

In Big Data, there are two broad types of processing −

  • Batch Processing
  • Real-time Processing

Processing data that has been collected over a period of time is called Batch Processing. For example, a bank manager processes the past month's data (collected over time) to find out how many cheques were cancelled during that month.
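As a minimal sketch of the batch example above, the following Python snippet counts the cancelled cheques in a set of records that has already been collected. The record layout, the `count_cancelled` helper and the sample data are hypothetical and chosen only for illustration; this is plain Python, not Flink code.

```python
from datetime import date

# Hypothetical cheque records collected over time (illustrative data only).
CHEQUES = [
    {"id": 1, "date": date(2024, 3, 2), "status": "cleared"},
    {"id": 2, "date": date(2024, 3, 9), "status": "cancelled"},
    {"id": 3, "date": date(2024, 3, 21), "status": "cancelled"},
    {"id": 4, "date": date(2024, 4, 1), "status": "cancelled"},
]


def count_cancelled(records, year, month):
    """Batch job: scan the full, already-stored data set once and return a total."""
    return sum(
        1
        for r in records
        if r["date"].year == year
        and r["date"].month == month
        and r["status"] == "cancelled"
    )


if __name__ == "__main__":
    # The answer is available only after the whole month's data has been read.
    print(count_cancelled(CHEQUES, 2024, 3))
```

The point of the sketch is that the input is bounded: the job starts after the data exists, reads all of it, and produces one result at the end.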

Processing data as soon as it arrives, in order to produce an instant result, is called Real-time Processing. For example, a bank manager receives a fraud alert immediately after a fraudulent transaction has occurred (instant result).
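The fraud alert example can be sketched in a few lines of Python, assuming a simple rule that flags any transaction above a fixed amount. The threshold, the event fields and the `transaction_stream` source are hypothetical stand-ins; a real deployment would read from an unbounded source and would likely use a far richer detection rule. This is plain Python, not Flink code.

```python
from typing import Iterable, Iterator

# Hypothetical rule: flag any single transaction above this amount.
FRAUD_THRESHOLD = 10_000


def fraud_alerts(transactions: Iterable[dict]) -> Iterator[str]:
    """Handle each event as it arrives and emit an alert right away."""
    for tx in transactions:
        if tx["amount"] > FRAUD_THRESHOLD:
            yield f"ALERT: suspicious transaction {tx['id']} for {tx['amount']}"


def transaction_stream() -> Iterator[dict]:
    """Stand-in for an unbounded event source (for example a message queue)."""
    yield {"id": "t1", "amount": 120}
    yield {"id": "t2", "amount": 25_000}
    yield {"id": "t3", "amount": 80}


if __name__ == "__main__":
    # Each alert is produced as soon as its event is seen,
    # without waiting for the stream to end.
    for alert in fraud_alerts(transaction_stream()):
        print(alert)
```

Because `fraud_alerts` is a generator, it never needs the complete data set: each event is evaluated on its own, which is the essential difference from a job that runs over stored data.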

The table below lists the differences between Batch and Real-time Processing −

Batch Processing                                      | Real-Time Processing
------------------------------------------------------|------------------------------------
Static files                                          | Event streams
Processed periodically, in minutes, hours, days, etc. | Processed immediately (nanoseconds)
Past data on disk storage                             | In-memory storage
Example − Bill Generation                             | Example − ATM Transaction Alert

Real-time processing is now widely used across organizations. Use cases such as fraud detection, real-time alerts in healthcare, and network attack alerts require immediate processing of incoming data; a delay of even a few milliseconds can have a significant impact.

An ideal tool for such real-time use cases is one that can take its input as a stream rather than as a batch. Apache Flink is such a real-time processing tool.
