
Guidance on when to add additional processors due to a large number of consecutive tasks?


The sizing pages discuss scaling with regard to data volumes in and out. However, is there any guidance on when to add processors because of a large number of consecutive tasks (i.e., Collection Jobs)? Does each Collection Job run in its own worker process? On a 2-node worker group with 8 vCPUs each, could we run into queuing issues if 50 or 100 collectors attempted to run at the same time?

Answers

  • Brandon McCombs

    Jobs are broken into tasks, which are placed on a job queue on the leader node. As worker processes in the group request work, tasks are taken off the queue and handed to them to complete. In this way, all the data that was discovered is distributed as evenly as possible across the worker group.

  • Brendan Dalpe

    Something to consider is the limits page regarding the number of jobs/tasks that can be run concurrently: https://docs.cribl.io/stream/collectors-job-limits/

  • Brandon McCombs

    It's best to avoid scheduling jobs so that they run simultaneously. Some overlap may be unavoidable, but the more worker processes that are available, the more tasks can be executed in parallel to finish a job.

  • Brendan Dalpe

    Thank you all

  • Jon Rust

    For larger collection use cases, I'd encourage a separate worker group dedicated to collection.
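
The pull-based queue described in the answers above can be illustrated with a rough simulation. This is only a sketch under stated assumptions, not Cribl's actual scheduler: it assumes a single FIFO task queue, that any idle worker process takes the next task, and it ignores the concurrency limits from the job limits page. The function name `simulate_makespan` and the workload numbers (50 jobs, 4 tasks per job, 10 seconds per task) are hypothetical.

```python
import heapq


def simulate_makespan(task_durations, worker_processes):
    """Return when the last task finishes if idle processes pull tasks in FIFO order.

    Simplified model (an assumption, not Cribl's real scheduler): one shared
    queue, no per-job or per-group concurrency limits, no scheduling overhead.
    """
    if worker_processes < 1:
        raise ValueError("worker_processes must be at least 1")

    # Each heap entry is the time at which one worker process becomes free.
    free_at = [0.0] * worker_processes
    heapq.heapify(free_at)

    finish = 0.0
    for duration in task_durations:
        start = heapq.heappop(free_at)  # the earliest-free process pulls the next task
        end = start + duration
        finish = max(finish, end)
        heapq.heappush(free_at, end)
    return finish


# Hypothetical workload: 50 collection jobs, each split into 4 tasks of 10 seconds.
tasks = [10.0] * (50 * 4)

for processes in (8, 16, 32):
    print(processes, "processes ->", simulate_makespan(tasks, processes), "seconds")
```

With these made-up numbers, the backlog drains in 250 seconds on 8 processes, 130 seconds on 16, and 70 seconds on 32. The point is only that queued tasks wait until a process is free, so overlapping schedules lengthen completion time unless more processes are available.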