Handling System Metrics overwhelming 4200 port
William Chen
Posts: 1 mod
in Cloud
Has anyone seen system metrics overwhelm port 4200? Are there best practices or documentation for combating this, and for configuring system metrics in general?
Answers
- What are you using to send the metrics to Stream? Are you using Cribl Edge or another Stream instance?
- The right action depends on what the actual problem with the metrics is. Determining that may require more analysis or simply some trial and error.
- Options, in no particular order:
- Disable the Full Fidelity option in the Cribl internal metrics source. With it enabled, full-fidelity metrics are sent not only through the pipelines but also to the leader.
- If disk I/O is high, disable Metrics Persistence on the leader node. The leader still receives metrics from clients, but the cost of writing them to disk is minimized. The side effect is that metrics will not be retained if you restart the leader node (or use leader HA and have a failover).
- Reduce the cardinality limit in Group Settings->General->Limits->Metrics. It can be difficult to know whether to modify this. If the metrics are being sent to an analytics tool downstream, that tool can be used to determine whether cardinality is high for any particular metric.
- In a fleet's or worker group's settings, modify the Metrics Never Drop List by removing anything you may have added in the past. By default only total.* and system.* are specified; these populate the Event In/Out and CPU Load graphs on the leader's Monitoring page, so if you remove them (which you may need to) you'll lose those graphs. If you've added more to this list, more metrics may be reaching the leader that the clients would otherwise have dropped.
- Add more entries to the Disable field metrics list to exclude metrics from being sent to the leader. By default this contains host, source, sourcetype, index, and project.
If there are numerous Edge clients (more than Stream workers) then you may need to modify the Metrics to send from Edge Nodes setting to reduce the metrics being sent. By default it's set to Basic. If you've increased the metrics in the past by modifying this setting, change it back to Basic or even try Minimal.
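
If you don't have a downstream analytics tool handy, a rough way to reason about the cardinality and field-exclusion suggestions above is to count distinct dimension combinations per metric name in a sample of events. The sketch below is plain Python and does not call any Cribl API. The event shape (`name` plus a `dims` dict), the metric names, and the helper `series_per_metric` are all assumptions made up for illustration, so adapt them to however you actually export your metrics.

```python
from collections import defaultdict


def series_per_metric(events, ignored_dims=()):
    """Count distinct dimension combinations (series) per metric name.

    `events` is a hypothetical list of dicts shaped like
    {"name": <metric name>, "dims": {<dimension>: <value>, ...}}.
    `ignored_dims` simulates dimensions that are excluded before sending.
    """
    ignored = set(ignored_dims)
    seen = defaultdict(set)
    for event in events:
        # Sort so the same dimensions in a different order count once.
        dims = tuple(sorted(
            (key, value)
            for key, value in event["dims"].items()
            if key not in ignored
        ))
        seen[event["name"]].add(dims)
    return {name: len(combos) for name, combos in seen.items()}


# Made-up sample data, not real Cribl output.
events = [
    {"name": "total.in_events", "dims": {"host": "w1", "source": "syslog"}},
    {"name": "total.in_events", "dims": {"host": "w2", "source": "syslog"}},
    {"name": "total.in_events", "dims": {"host": "w2", "source": "http"}},
    {"name": "system.load_avg", "dims": {"host": "w1"}},
]

baseline = series_per_metric(events)
reduced = series_per_metric(events, ignored_dims=["host"])
print(baseline)  # series count per metric with every dimension kept
print(reduced)   # series count per metric with "host" excluded
```

Comparing the two dictionaries shows which metric names carry the most series and how much a given dimension contributes, which can help decide whether lowering the cardinality limit or excluding a field is the better first move.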