Summary
Cribl Stream can act as a syslog receiver, which simplifies and streamlines your data collection architecture. Below are some best practices and other tips for receiving syslog data. For more information, including a detailed video on architecture, check out the blog. And check out this one for even more syslog goodness!
Best Practices
When possible, set the Stream Syslog Source to listen on ports above 1024.
- This avoids having to grant Stream privileged OS access, since binding to ports below 1024 normally requires elevated privileges.
- If a lower port is unavoidable, such as when the originating source is hard-coded to use port 514, follow the instructions.
Use a load balancer to spread the data over multiple Workers / processes.
- Most syslog data sources won't have a way to balance load automatically, so a load balancer can help take advantage of scaled Workers.
- Try to avoid stickiness so that the load balancer does not get 'stuck' sending to one Worker.
- This can also help avoid a 'pinning' situation, where one process takes the entire load and can be overwhelmed by too much data. NOTE: as of version 4.6, Cribl Stream includes built-in TCP load balancing for syslog to alleviate this issue.
Set up multiple Stream Sources with different ports for different syslog senders or source types.
- This can help with the pinning issue described above.
- It can also simplify filters, such as in Routes, where you can just use the ID of the Source (as opposed to constructing filters that combine the Source with other event information). Different sources typically have different Pipelines anyway, so you will likely create separate Routes. Separate Sources make this straightforward and make the Routes more readable.
Other considerations:
- If you have a high volume of UDP data, check out the buffer tuning and considerations in the docs.
- Where possible, standardize on UTC for timestamps. Double-check that the syslog source data has correct timezone information. If not, use Cribl techniques like Lookups or Pipelines, or try out the Syslog Pack in the Dispensary for pre-built Pipelines that condition syslog data (including timestamps).
- Since many syslog senders do NOT queue data, consider using Stream's Persistent Queueing to hold data when there are issues sending to your Destinations.
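On the UDP buffer point above: before tuning anything, it can help to see what receive buffer the operating system will actually grant. The following is a minimal Python sketch, not a Cribl tool or setting. The function name effective_udp_rcvbuf and the 1 MiB request are illustrative assumptions. It asks the kernel for a UDP receive buffer of a given size and reports what was granted, which on many systems is capped by an OS-level limit.

```python
import socket


def effective_udp_rcvbuf(requested_bytes: int) -> int:
    """Request a UDP receive buffer size and return what the OS reports back.

    Hypothetical helper for illustration; it does not touch Cribl Stream.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # Ask for the buffer size; the OS may cap or adjust the value.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested_bytes)
        # Read back the size the OS actually applied to this socket.
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


if __name__ == "__main__":
    requested = 1024 * 1024  # 1 MiB, an arbitrary illustrative value
    granted = effective_udp_rcvbuf(requested)
    print(f"requested={requested} granted={granted}")
```

If the granted value is far below the requested one, the OS limit is likely the constraint, and the buffer tuning guidance in the docs is the place to look next.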
Troubleshooting
Use the internal field '__syslogFail' to assist with troubleshooting or for data quality.
- The Syslog Source checks whether each message conforms to the syslog RFCs and sets this internal field to the result.
- If you see data issues, check this field value on events in Stream. If 'true', the originating source may not be sending properly formatted messages.
- Some customers use this field as a filter to block or drop messages that aren't properly formatted.
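To make "properly formatted" more concrete, here is a rough Python sketch of one small part of syslog validation: the leading PRI header (for example <34>), which encodes facility and severity. This is NOT how Cribl computes '__syslogFail'. The names parse_pri and looks_malformed are hypothetical, and a real check covers much more than the PRI.

```python
import re

# Matches the leading "<PRI>" of a syslog message; PRI is 1 to 3 digits.
_PRI_RE = re.compile(r"^<(\d{1,3})>")


def parse_pri(message: str):
    """Return (facility, severity) if the message starts with a valid PRI, else None."""
    match = _PRI_RE.match(message)
    if match is None:
        return None
    pri = int(match.group(1))
    if pri > 191:  # highest valid PRI: facility 23 * 8 + severity 7
        return None
    # PRI = facility * 8 + severity
    return divmod(pri, 8)


def looks_malformed(message: str) -> bool:
    """Hypothetical stand-in for a format check; not Cribl's actual logic."""
    return parse_pri(message) is None


if __name__ == "__main__":
    sample = "<34>Oct 11 22:14:15 mymachine su: 'su root' failed"
    print(parse_pri(sample))            # (4, 2): facility 4, severity 2
    print(looks_malformed("no header"))  # True
```

Running a captured raw message through a check like this can quickly show whether the sender is omitting or mangling the header.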
If data is not showing up in Stream, for example when a capture is empty or the Source shows no incoming data:
- Commit and Deploy after making any changes to the Source(s).
- Make sure the port is above 1024 OR ensure Cribl has been given access to listen on lower ports.
- Run 'netstat' on a target Worker and confirm the port is open and the Cribl process is the listener. A 'telnet' from another authorized system is a good test of port availability.
- Confirm network connectivity end-to-end; typical issues include proxies (known and misconfigured, or unknown), firewalls or security groups, and DNS (it's always DNS! /joke). It's very rare to have a Cribl receiving issue, assuming the port is open and active.
- Try to netcat (nc) a message to a Worker from another system. If successful, suspect the originating source or networking between it and the Worker(s).
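If 'telnet' or 'nc' is not available on the test system, the same checks can be approximated with a short Python script. This is a sketch under assumptions: the function names are illustrative, and the host and port in the usage comment are placeholders for your own Worker and Source port. It builds an RFC 5424-style test line, checks whether a TCP port accepts connections, and sends the line over UDP or TCP.

```python
import socket
from datetime import datetime, timezone


def build_test_message(hostname: str = "testhost", app: str = "cribl-test") -> bytes:
    """Build an RFC 5424-style line with PRI 14 (facility user, severity info)."""
    ts = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    # Layout: <PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID MSGID SD MSG
    return f"<14>1 {ts} {hostname} {app} - - - connectivity test\n".encode("utf-8")


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Rough equivalent of a telnet test: can a TCP connection be opened?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def send_udp(host: str, port: int, payload: bytes) -> None:
    """Send one datagram; UDP gives no confirmation that it was received."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


def send_tcp(host: str, port: int, payload: bytes, timeout: float = 3.0) -> None:
    """Open a TCP connection, send the payload, and close."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)


# Example usage (placeholder host and port, replace with your own):
#   if tcp_port_open("worker.example.com", 9514):
#       send_tcp("worker.example.com", 9514, build_test_message())
```

After sending, look for the "connectivity test" event in a capture on the Source. Because UDP is fire-and-forget, a successful send_udp call only shows that the datagram left the test system, not that the Worker received it.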