Apache Storm workflow


A working Storm cluster has one Nimbus node and one or more supervisor nodes. Another important component is Apache ZooKeeper, which coordinates communication between Nimbus and the supervisors.
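This wiring is typically declared in each node's storm.yaml configuration file. A minimal sketch, assuming a one-ZooKeeper, one-Nimbus cluster (the host names and ports below are placeholders, not Storm defaults):

```yaml
# storm.yaml — hypothetical hosts for illustration
storm.zookeeper.servers:
  - "zk1.example.com"        # ZooKeeper ensemble used for coordination
nimbus.seeds: ["nimbus1.example.com"]  # where supervisors find Nimbus
supervisor.slots.ports:      # one worker slot per listed port
  - 6700
  - 6701
```

Each supervisor node exposes one worker slot per port listed, which is how Nimbus knows how many workers it can schedule there.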

Now let’s take a closer look at the workflow of Apache Storm:

  • Initially, Nimbus waits for a Storm topology to be submitted to it.

  • Once a topology is submitted, Nimbus processes it, collecting all the tasks to be executed and the order in which they must run.

  • Nimbus then distributes the tasks evenly among all the available supervisors.

  • At specific intervals, every supervisor sends a heartbeat to Nimbus to signal that it is still alive.

  • If a supervisor dies and stops sending heartbeats, Nimbus reassigns its tasks to another supervisor.

  • If Nimbus itself dies, the supervisors keep working on their already-assigned tasks without any problem.

  • Once all its tasks are completed, a supervisor waits for new tasks to come in.

  • Meanwhile, the dead Nimbus is restarted automatically by a service monitoring tool.

  • The restarted Nimbus continues from where it stopped. Likewise, a dead supervisor can be restarted automatically. Since both Nimbus and the supervisors can be restarted automatically, and both resume where they left off, Storm guarantees that every task is processed at least once.

  • Once all the submitted topologies are processed, Nimbus waits for new topologies to arrive, and similarly the supervisors wait for new tasks.
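In a real cluster, Nimbus tracks heartbeats through ZooKeeper. As a loose, Storm-free illustration of the reassignment step described above, here is a toy Java model: each task has an owning supervisor, each supervisor has a last-heartbeat timestamp, and tasks whose owner has gone stale are moved to a live supervisor. All names and the timeout value are hypothetical, not Storm's actual scheduler logic.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class HeartbeatToy {
    // Toy timeout: a supervisor with no heartbeat for this long counts as dead.
    static final long TIMEOUT_MS = 30_000;

    /**
     * Returns a new task-to-supervisor assignment in which every task owned
     * by a stale supervisor is handed to some supervisor that is still alive.
     */
    static Map<String, String> reassign(
            Map<String, String> taskOwner,    // task -> owning supervisor
            Map<String, Long> lastHeartbeat,  // supervisor -> last heartbeat (ms)
            long now) {
        Map<String, String> next = new LinkedHashMap<>(taskOwner);
        // Pick any supervisor whose heartbeat is still fresh.
        String live = lastHeartbeat.entrySet().stream()
                .filter(e -> now - e.getValue() <= TIMEOUT_MS)
                .map(Map.Entry::getKey)
                .findFirst().orElse(null);
        for (Map.Entry<String, String> e : next.entrySet()) {
            long seen = lastHeartbeat.getOrDefault(e.getValue(), 0L);
            if (now - seen > TIMEOUT_MS && live != null) {
                e.setValue(live); // owner is dead: move the task to a live node
            }
        }
        return next;
    }

    public static void main(String[] args) {
        Map<String, String> tasks = new LinkedHashMap<>();
        tasks.put("task-1", "supervisor-A");
        tasks.put("task-2", "supervisor-B");

        Map<String, Long> beats = new HashMap<>();
        beats.put("supervisor-A", 100_000L); // fresh heartbeat
        beats.put("supervisor-B", 10_000L);  // stale: silent for 90 seconds

        Map<String, String> next = reassign(tasks, beats, 100_000L);
        System.out.println(next.get("task-1")); // supervisor-A (unchanged)
        System.out.println(next.get("task-2")); // supervisor-A (reassigned)
    }
}
```

The point of the sketch is the decision rule, not the mechanics: liveness is inferred purely from heartbeat age, so a supervisor never has to announce its own failure.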

A Storm cluster can run in one of two modes:

  • Local mode - This mode is used for development, testing, and debugging, because it is the easiest way to see all the topology components working together. In this mode we can adjust parameters and observe how our topology behaves under different Storm configurations. In local mode, the topology runs in a single JVM on the local machine.

  • Production mode - In this mode, we submit the topology to a working Storm cluster, which consists of many processes, usually running on different machines. As discussed in the workflow above, a working cluster runs indefinitely until it is shut down.
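To make the "single JVM" point concrete without pulling in Storm itself, here is a toy Java sketch of what local mode amounts to: a spout thread emits tuples into an in-memory queue and a bolt thread consumes them, all inside one process. This is only loosely analogous to what Storm's local cluster does; every name here is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class LocalModeToy {
    /** Runs a toy spout -> bolt pipeline inside this JVM and returns the output. */
    static List<String> runOnce(String[] words) {
        BlockingQueue<String> stream = new ArrayBlockingQueue<>(16);
        List<String> results = new ArrayList<>();

        // Toy "spout": emits a fixed batch of tuples, then a terminator marker.
        Thread spout = new Thread(() -> {
            try {
                for (String w : words) stream.put(w);
                stream.put("__END__");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Toy "bolt": uppercases each tuple until the terminator arrives.
        Thread bolt = new Thread(() -> {
            try {
                for (String t = stream.take(); !t.equals("__END__"); t = stream.take()) {
                    results.add(t.toUpperCase());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        spout.start();
        bolt.start();
        try {
            spout.join();
            bolt.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(runOnce(new String[]{"storm", "local", "mode"}));
        // prints [STORM, LOCAL, MODE]
    }
}
```

In production mode, the same logical spout-to-bolt stream would instead cross process and machine boundaries, which is exactly why the cluster needs Nimbus, the supervisors, and ZooKeeper to coordinate it.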