Failure of worker node

Silambarasan Selvamani
Silambarasan Selvamani Posts: 1
edited September 2023 in Stream
  1. When a worker node fails (goes down for whatever reason), what happens to the data that worker had already received for processing (for example, in-memory data or PQ data)?
  2. Worker nodes receive data in a distributed manner for processing. How does Stream ensure that data delivered to a third-party destination (e.g. Splunk) arrives in the chronological order of the events?
    1. For example, if a worker node is lagging in processing data because of hardware issues or slowness, should we expect latency in the data sent to an output destination?

Best Answer

  • Raanan Dagan
    Raanan Dagan Posts: 101 mod
    Answer ✓

    The Worker Nodes and Worker Processes are stateless. Therefore, if one of them dies, the Splunk Forwarder is notified that the TCP connection has died, and it resubmits the request to the next available one. Furthermore, to increase the system’s resiliency, the leader process also acts as a watchdog for worker processes, restarting any that exit or crash.
    As for PQ: for both Source PQ and Destination PQ, once the receiver is ready, the output starts draining the queues in FIFO (First In, First Out) fashion, so order is maintained.
    Alternatively, during the draining process for Destination PQ, if Strict ordering is disabled, Cribl Stream prioritizes new events over draining the queue. This behaves more like LIFO (Last In, First Out).
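
    To illustrate the resubmit-on-failure idea described above, here is a minimal Python sketch. It is not Splunk Forwarder or Cribl code: the names `WorkerDown`, `send_with_failover`, and the `send` callback are hypothetical, and the "connection" is simulated rather than a real TCP socket.

```python
from typing import Callable, Sequence


class WorkerDown(Exception):
    """Hypothetical error raised when the connection to a worker is lost."""


def send_with_failover(
    event: str,
    workers: Sequence[str],
    send: Callable[[str, str], None],
) -> str:
    """Try each worker in turn and return the name of the one that accepted the event."""
    for worker in workers:
        try:
            send(worker, event)
            return worker
        except WorkerDown:
            # The connection died: resubmit the same event to the next worker.
            continue
    raise RuntimeError("no worker available")


def make_fake_send(down: set, delivered: list) -> Callable[[str, str], None]:
    """Build a simulated send function; workers listed in `down` always fail."""
    def fake_send(worker: str, event: str) -> None:
        if worker in down:
            raise WorkerDown(worker)
        delivered.append((worker, event))
    return fake_send


if __name__ == "__main__":
    delivered: list = []
    send = make_fake_send({"worker-1"}, delivered)
    print(send_with_failover("event-a", ["worker-1", "worker-2"], send))
    print(delivered)
```

    Because the workers hold no state in this model, the sender can retry the same event against any other worker without coordination.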
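
    The difference between the two draining modes can be pictured with a toy model. This is only a sketch under simplifying assumptions, not how Cribl Stream is implemented: the function `delivery_order` and its parameters are hypothetical, and it assumes all queued and new events are known up front, whereas a real system receives them continuously.

```python
from typing import List, Sequence


def delivery_order(
    queued: Sequence[str],
    incoming: Sequence[str],
    strict_ordering: bool,
) -> List[str]:
    """Return the order in which events reach the destination once it recovers.

    queued:   events persisted to the queue while the destination was down,
              oldest first.
    incoming: new events arriving after the destination is ready again.
    """
    if strict_ordering:
        # FIFO: drain the whole backlog first, then deliver new events.
        return list(queued) + list(incoming)
    # Strict ordering disabled: new events take priority; the backlog is
    # still drained in its own original order afterwards.
    return list(incoming) + list(queued)


if __name__ == "__main__":
    backlog = ["q1", "q2", "q3"]
    fresh = ["n1", "n2"]
    print(delivery_order(backlog, fresh, strict_ordering=True))
    print(delivery_order(backlog, fresh, strict_ordering=False))
```

    With strict ordering the destination sees events in arrival order at the cost of new events waiting behind the backlog; without it, fresh data arrives sooner but out of chronological order relative to the queued events.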
