This article addresses some of the common issues related to running Cribl on Kubernetes.
Issue: A pod is crashing continuously or changes its status to "CrashLoopBackOff"
Potential Resolutions:
- Run the following command to get more information about what may be causing the condition. Pay special attention to the Events section at the bottom of the output:
kubectl describe pod <podname> -n <namespace>
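If the describe output alone does not explain the crash, the pod's events and the logs of the previously crashed container are often the next things to check. The following is a minimal sketch, not an official Cribl procedure. The function name crashloop_diag and the KUBECTL override variable are hypothetical and exist only for this illustration; the kubectl subcommands and flags themselves are standard.

```shell
# Hypothetical helper: gather extra detail for a crash-looping pod.
# Set KUBECTL to another binary (or to "echo" for a dry run) if needed.
crashloop_diag() {
  pod=$1
  ns=$2
  kubectl_bin=${KUBECTL:-kubectl}

  # Events that involve this pod, oldest first
  $kubectl_bin get events -n "$ns" \
    --field-selector "involvedObject.name=$pod" \
    --sort-by=.lastTimestamp

  # Logs from the previous (crashed) container instance
  $kubectl_bin logs "$pod" -n "$ns" --previous
}
```

Usage: crashloop_diag <podname> <namespace>. Running it with KUBECTL=echo prints the commands' arguments instead of executing them.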
Issue: Too many worker processes are being launched at startup
Potential Causes:
- A container may attempt to create too many Worker Processes on startup, before it contacts the Leader. Cribl may not get the correct CPU information for the container and may instead read the CPU count of the host.
Potential Resolutions:
- Edit the environment variables so that a newly created container is limited to 2 Worker Processes (a workers count of 2). Once the Worker contacts the Leader, it receives the correct configuration bundle.
For the Helm chart:
env:
  CRIBL_BOOTSTRAP: |
    cribl/cribl.yml:
      workers:
        count: 2
For deployment.yaml:
env:
  - name: CRIBL_BOOTSTRAP
    value: |
      cribl/cribl.yml:
        workers:
          count: 2
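To see the mismatch described under Potential Causes, you can compare the CPU count the container sees with the CPU limit set on it. This is a rough sketch that assumes a Linux container using cgroup v2, where the limit is exposed in /sys/fs/cgroup/cpu.max. It is not how Cribl itself detects CPUs, and the function name cpus_from_cpu_max is hypothetical.

```shell
# Convert a cgroup v2 cpu.max line ("<quota> <period>" or "max <period>")
# into a whole number of CPUs. Prints "unlimited" when no quota is set.
cpus_from_cpu_max() {
  # Split the single argument on whitespace into quota and period
  set -- $1
  quota=$1
  period=$2
  if [ "$quota" = "max" ]; then
    echo "unlimited"
  else
    # Round up, so a quota of 1.5 periods counts as 2 CPUs
    echo $(( (quota + period - 1) / period ))
  fi
}
```

Inside the container, nproc typically reports the host's CPUs when only a CPU limit is set, while cpus_from_cpu_max "$(cat /sys/fs/cgroup/cpu.max)" reflects the pod's limit. If the second number is much smaller than the first, the startup behavior described above is a plausible explanation.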
Issue: Pods are dying due to failed health checks (for Cribl Edge)
Potential Resolutions:
- Change the host value from "127.0.0.1" to "0.0.0.0" in Edge -> Fleet -> Fleet Settings -> System -> General -> API Server Settings -> General, so that the API server listens on all interfaces and the health checks can reach it. See here for more.
Issue: When attempting to write to AWS S3 from EKS Stream Workers or Edge nodes, you get an "Access Denied" error
Potential Resolutions:
- Follow the AWS guide to configure a Kubernetes service account to assume an IAM role (IAM roles for service accounts).
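One step in that guide is annotating the service account with the ARN of the IAM role. The sketch below shows only that step and assumes the cluster's OIDC provider, the IAM role, and its trust policy already exist. The function name annotate_sa_with_role and the KUBECTL override variable are hypothetical; the eks.amazonaws.com/role-arn annotation key comes from the AWS guide.

```shell
# Hypothetical helper: attach an IAM role ARN to a Kubernetes service account.
# Set KUBECTL to another binary (or to "echo" for a dry run) if needed.
annotate_sa_with_role() {
  sa=$1
  ns=$2
  role_arn=$3
  kubectl_bin=${KUBECTL:-kubectl}

  # --overwrite replaces the annotation if it is already present
  $kubectl_bin annotate serviceaccount "$sa" -n "$ns" \
    "eks.amazonaws.com/role-arn=$role_arn" --overwrite
}
```

Usage: annotate_sa_with_role <service-account> <namespace> <role-arn>. Pods generally need to be recreated after the annotation changes, because the role credentials are injected when a pod is created.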