Guidance on when to add additional processors due to a large number of consecutive tasks?

Brendan Dalpe · September 2023

The sizing pages talk about scaling with regard to data volumes in and out. However, is there any guidance on when to add additional processors due to a large number of consecutive tasks (i.e. Collection Jobs)? Does each Collection Job run in its own worker process? On a 2-node worker group with 8 vCPUs each, could we run into queuing issues if we had 50 or 100 collectors attempting to run at the same time?

Brandon McCombs · September 2023

Jobs are broken into tasks, which are placed in a job queue on the leader node. Worker processes in the group request tasks from that queue and complete them. In this manner, all the data that was discovered is distributed as evenly as possible across the worker group.
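
To make the pull-based model concrete, here is a minimal sketch of a shared queue where each task goes to whichever worker process frees up first. It is an illustration only, not Cribl's implementation: the function name, the uniform task durations, and the figure of 16 worker processes (2 nodes x 8 vCPUs, assuming one process per vCPU) are all assumptions.

```python
import heapq
from collections import Counter


def simulate_pull_queue(task_durations, num_workers):
    """Hand tasks from a shared FIFO queue to whichever worker is free first.

    Hypothetical model: returns (time the last task finishes, tasks per worker).
    """
    # Each heap entry is (time_worker_becomes_free, worker_id).
    free_at = [(0.0, w) for w in range(num_workers)]
    heapq.heapify(free_at)
    tasks_per_worker = Counter()

    for duration in task_durations:
        # The worker that becomes free earliest pulls the next task.
        free_time, worker = heapq.heappop(free_at)
        tasks_per_worker[worker] += 1
        heapq.heappush(free_at, (free_time + duration, worker))

    makespan = max(t for t, _ in free_at)
    return makespan, tasks_per_worker


# Assumed numbers: 100 tasks of 1 time unit each, 16 worker processes.
makespan, counts = simulate_pull_queue([1.0] * 100, 16)
print(makespan)                                      # 7.0
print(max(counts.values()), min(counts.values()))    # 7 6
```

Under these assumptions the tasks spread almost evenly (6 or 7 per process), and the 100 tasks finish in 7 "waves" rather than all at once, which is the queuing effect the question asks about.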

Brendan Dalpe · September 2023

Something to consider is the limits page regarding the number of jobs/tasks that can be run concurrently: https://docs.cribl.io/stream/collectors-job-limits/

Brandon McCombs · September 2023

It's best to avoid scheduling jobs in such a way that they run simultaneously. Some overlap may be unavoidable, but the more worker processes that are available, the more tasks can be executed at once to finish a job.
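
One way to avoid simultaneous starts is to stagger the schedules. The sketch below assumes the collectors run hourly and are scheduled with standard five-field cron expressions. The helper name is hypothetical, and the right spacing depends on how long each job actually takes.

```python
def staggered_hourly_crons(num_jobs):
    """Return one hourly cron expression per job, spreading start minutes over the hour."""
    crons = []
    for i in range(num_jobs):
        # Space start minutes evenly across 0..59; with more than 60 jobs
        # some jobs will share a start minute.
        minute = (i * 60) // num_jobs
        crons.append(f"{minute} * * * *")
    return crons


# Example: four hourly collectors starting 15 minutes apart.
print(staggered_hourly_crons(4))
# ['0 * * * *', '15 * * * *', '30 * * * *', '45 * * * *']
```

With 50 collectors this gives every job its own start minute. With 100, pairs of jobs share a minute, so some overlap remains.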

Brendan Dalpe · September 2023

Thank you all

Jon Rust · September 2023

For larger collection use cases, I'd encourage a separate worker group dedicated to collection.