Cribl on Kubernetes Troubleshooting

This article addresses some of the common issues related to running Cribl on Kubernetes.

Issue: A pod is crashing continuously or changes its status to "CrashLoopBackOff"

Potential Resolutions:

Run the following command to get more information about what may be causing the condition. Pay special attention to the events at the bottom of the command's output:

kubectl describe pod <podname> -n <namespace>
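
If the events alone do not explain the crash, the logs of the previously terminated container are often the next place to look. The sketch below wraps the command above with two follow-up checks. `crashloop_info` is a hypothetical helper name, not part of kubectl or Cribl, and the 50-line tail is an arbitrary choice.

```shell
# Hypothetical helper: collect basic diagnostics for a crashing pod.
# Arguments: pod name, namespace.
crashloop_info() {
  pod="$1"
  ns="$2"
  # Pod status and recent events (same command as above).
  kubectl describe pod "$pod" -n "$ns"
  # Logs from the previous (crashed) container instance.
  kubectl logs "$pod" -n "$ns" --previous --tail=50
  # Events for this pod only, oldest first.
  kubectl get events -n "$ns" \
    --field-selector "involvedObject.name=$pod" \
    --sort-by=.lastTimestamp
}
```

Call it as `crashloop_info <podname> <namespace>`.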

Potential Causes:

A container may attempt to create too many Worker Processes on startup, before it contacts the Leader. Cribl may not get the correct CPU information for the container and may instead read the CPU count of the host.

Potential Resolutions:

Edit the environment variables so that a newly created container starts with only 2 Worker Processes. Once the Worker contacts the Leader, it receives the correct configuration bundle.

env:
  CRIBL_BOOTSTRAP: |
    cribl/cribl.yml:
      workers:
        count: 2

For deployment.yaml:

env:
  - name: CRIBL_BOOTSTRAP
    value: |
      cribl/cribl.yml:
        workers:
          count: 2
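
After applying the change, it can help to confirm that the variable actually reached the pod spec. The following is a minimal sketch that reads the value back from the pod definition, so it also works while the container is still crash-looping. `show_bootstrap` is a hypothetical helper name.

```shell
# Hypothetical helper: print the CRIBL_BOOTSTRAP value defined in a pod's spec.
# Arguments: pod name, namespace.
show_bootstrap() {
  pod="$1"
  ns="$2"
  # The JSONPath filter selects the env entry named CRIBL_BOOTSTRAP
  # from every container in the pod.
  kubectl get pod "$pod" -n "$ns" \
    -o jsonpath='{.spec.containers[*].env[?(@.name=="CRIBL_BOOTSTRAP")].value}'
}
```

An empty result suggests the variable name is misspelled or the manifest was not applied to this pod.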

Change the Worker Process Count in the Worker Group settings from "-2" to "2" (or another positive integer) so that Stream does not attempt to create an incorrect number of processes.
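
To illustrate why a negative count is risky when the container reports the host's CPUs, the sketch below estimates the resulting number of Worker Processes. It assumes that a negative count means "detected CPUs minus that many" and that the result never drops below 1. Both are assumptions made for illustration, and `effective_workers` is a hypothetical helper, not a Cribl command.

```shell
# Hypothetical helper: estimate the Worker Process count.
# Arguments: configured count, number of CPUs detected.
effective_workers() {
  count="$1"
  cpus="$2"
  if [ "$count" -gt 0 ]; then
    # A positive value is used as-is, regardless of detected CPUs.
    echo "$count"
  else
    # Assumed behavior: a zero or negative value is relative to detected CPUs.
    n=$((cpus + count))
    # Assumed floor of one process.
    if [ "$n" -lt 1 ]; then
      n=1
    fi
    echo "$n"
  fi
}

effective_workers -2 64   # host CPU count detected: prints 62
effective_workers 2 64    # fixed count: prints 2
```

Under these assumptions, a container limited to 2 CPUs on a 64-CPU host would try to start 62 processes with a count of "-2", but only 2 with a count of "2".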

Potential Resolutions:

Change the host value from "127.0.0.1" to "0.0.0.0" under Edge -> Fleet -> Fleet Settings -> System -> General -> API Server Settings -> General.

Potential Resolutions:

Follow the AWS guide to configure a Kubernetes service account to assume an IAM role.
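
As a rough sketch of one step from that guide, the helper below annotates a service account with the role it should assume. It assumes Amazon EKS with IAM roles for service accounts, and that the IAM role and its trust policy already exist. `annotate_sa_role` is a hypothetical helper name, and the AWS guide remains the authoritative source for the full procedure.

```shell
# Hypothetical helper: attach an IAM role ARN to a Kubernetes service account.
# Arguments: service account name, namespace, IAM role ARN.
annotate_sa_role() {
  sa="$1"
  ns="$2"
  role_arn="$3"
  # eks.amazonaws.com/role-arn is the annotation key used by EKS
  # to associate a service account with an IAM role.
  kubectl annotate serviceaccount "$sa" -n "$ns" \
    "eks.amazonaws.com/role-arn=$role_arn"
}
```

Call it as `annotate_sa_role <serviceaccount> <namespace> <role-arn>`. Existing pods typically need to be recreated before they pick up the new role.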