S K S, Author at Adglob Infosystem Pvt Ltd

Storm – Distributed Messaging System

Post author: S K S
Post published: August 14, 2021
Post category: apache

Apache Storm processes real-time data, and its input normally comes from a message queuing system. An external distributed messaging system provides the input needed for the real-time computation. Spout…

Apache Storm – Workflow

Post author: S K S
Post published: August 14, 2021
Post category: apache

A working Storm cluster should have one nimbus and one or more supervisors. Another important node is Apache ZooKeeper, which is used for coordination between the nimbus and…

Apache Storm – Cluster Architecture

Post author: S K S
Post published: August 14, 2021
Post category: apache

One of the main highlights of Apache Storm is that it is a fast, fault-tolerant distributed application with no “Single Point of Failure” (SPOF). We can install Apache Storm…

Apache Storm – Core Concepts

Post author: S K S
Post published: August 14, 2021
Post category: apache

Apache Storm reads a raw stream of real-time data at one end, passes it through a sequence of small processing units, and outputs the processed, useful information at the…
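The flow described in this excerpt, a raw stream passing through a sequence of small processing units, can be sketched in plain Python. This is only a conceptual model and does not use the Storm API: the function names (`sentence_source`, `split_words`, `count_words`, `run_pipeline`) are hypothetical and chosen for illustration.

```python
from collections import Counter
from typing import Iterable, Iterator


def sentence_source(lines: Iterable[str]) -> Iterator[str]:
    """Stand-in for the stream's entry point: emits raw records one at a time."""
    for line in lines:
        yield line


def split_words(sentences: Iterable[str]) -> Iterator[str]:
    """First processing unit: breaks each sentence into lowercase words."""
    for sentence in sentences:
        for word in sentence.lower().split():
            yield word


def count_words(words: Iterable[str]) -> Counter:
    """Final processing unit: aggregates the stream into word counts."""
    return Counter(words)


def run_pipeline(lines: Iterable[str]) -> Counter:
    # Chain the units so each record flows from the source to the output.
    return count_words(split_words(sentence_source(lines)))


if __name__ == "__main__":
    print(run_pipeline(["storm reads a stream", "a stream of tuples"]))
```

Because each stage is a generator, records move through the chain one at a time, which loosely mirrors how a streaming system handles data as it arrives.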

Apache Storm – Home

Post author: S K S
Post published: August 14, 2021
Post category: apache

Storm was originally created by Nathan Marz and his team at BackType, a social analytics company. Later, Twitter acquired BackType and open-sourced Storm. In a short time, Apache Storm became a standard…

Advanced Spark Programming

Post author: S K S
Post published: August 14, 2021
Post category: apache

Spark provides two types of shared variables: broadcast variables and accumulators. Broadcast variables are used to distribute large values efficiently. Accumulators are used to aggregate information from a particular collection.…
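To make the two roles concrete, here is a minimal plain-Python sketch. It is not the Spark API: the `Accumulator` class and the `process_partition` and `resolve_all` helpers are hypothetical stand-ins. The lookup table plays the part of a broadcast variable (shared, read-only), and the counter plays the part of an accumulator (tasks only add to it, the caller reads the total).

```python
from types import MappingProxyType


class Accumulator:
    """Hypothetical stand-in: tasks may only add; the driver reads the total."""

    def __init__(self, initial: int = 0) -> None:
        self.value = initial

    def add(self, amount: int) -> None:
        self.value += amount


def process_partition(codes, country_names, unknown_codes):
    """Simulated task: resolves codes via the shared read-only lookup table."""
    resolved = []
    for code in codes:
        if code in country_names:
            resolved.append(country_names[code])
        else:
            unknown_codes.add(1)  # count records that could not be resolved
    return resolved


def resolve_all(partitions, lookup):
    # "Broadcast": one read-only view of the table shared by every task.
    country_names = MappingProxyType(lookup)
    unknown_codes = Accumulator()
    results = [process_partition(p, country_names, unknown_codes) for p in partitions]
    return results, unknown_codes.value
```

In this sketch everything runs in one process, so nothing is actually shipped between machines. It only shows the division of labor: large read-only data is shared once, and side information is collected through an add-only counter.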

Apache Spark – Deployment

Post author: S K S
Post published: August 14, 2021
Post category: apache

spark-submit is a shell command used to deploy a Spark application on a cluster. It works with all supported cluster managers through a uniform interface. Therefore, you do…

Apache Spark – Core Programming

Post author: S K S
Post published: August 14, 2021
Post category: apache

Spark Core is the base of the whole project. It provides distributed task dispatching, scheduling, and basic I/O functionality. Spark uses a specialized fundamental data structure known as RDD (Resilient…

Apache Spark – Installation

Post author: S K S
Post published: August 14, 2021
Post category: apache

Spark is often deployed alongside Hadoop. Therefore, it is better to install Spark on a Linux-based system. The following steps show how to install Apache Spark. Step 1: Verifying Java Installation…

Apache Spark – RDD

Post author: S K S
Post published: August 14, 2021
Post category: apache

Resilient Distributed Datasets (RDDs) are a fundamental data structure of Spark. An RDD is an immutable distributed collection of objects. Each dataset in an RDD is divided into logical…
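The idea of an immutable collection divided into logical partitions can be illustrated with a small plain-Python sketch. It is not Spark code: `partition` and `map_partitions` are hypothetical helpers, and tuples are used here only to imitate immutability.

```python
def partition(items, num_partitions):
    """Split a sequence into num_partitions contiguous logical partitions."""
    items = tuple(items)  # tuples imitate the immutability of an RDD
    size, extra = divmod(len(items), num_partitions)
    parts, start = [], 0
    for i in range(num_partitions):
        # The first `extra` partitions take one additional element each.
        end = start + size + (1 if i < extra else 0)
        parts.append(items[start:end])
        start = end
    return parts


def map_partitions(parts, fn):
    """Apply fn to every element, producing new partitions (inputs untouched)."""
    return [tuple(fn(x) for x in part) for part in parts]
```

A transformation returns new partitions instead of modifying the existing ones. In a real cluster each partition could be processed on a different node; here they are simply processed in turn.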