Filesystem Collector and Event Breaker Inconsistencies
I have created an event breaker rule that works in the Knowledge area of Cribl Stream. But when I run a filesystem collection job to pick up the same file I used to create the event breaker, the rule does not work.
[Screenshot: Event breaker rules in]
Best Answer
-
I "solved" the issue. This may need to become an engineering ticket.
I looked back at the file encoding…
sample.tsv: UTF-8 Unicode (with BOM) text, with CRLF line terminators
There may be an issue with the filesystem collector interpreting the byte order mark?
I used Notepad++ to remove the BOM and encoded it as just UTF-8 and it worked.
This is a band-aid fix for me, as converting the encoding of 50-100GB of files each day prior to ingest is not particularly scalable or effective.
Thanks Dan for the assist!
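If pre-processing is unavoidable until the collector handles the BOM, the file does not need to be re-encoded: a UTF-8 BOM is just three bytes (EF BB BF) at the start. A minimal sketch in Python, assuming the files can be streamed through a copy step before ingest (the function name `strip_utf8_bom` and the chunk size are illustrative, not part of Cribl):

```python
import codecs
import io


def strip_utf8_bom(src, dst, chunk_size=1024 * 1024):
    """Copy the binary stream src to dst, dropping a leading UTF-8 BOM.

    Returns True if a BOM was found and removed. Only the first three
    bytes are inspected; the rest is copied unchanged in chunks, so
    memory use stays flat regardless of file size.
    """
    head = src.read(len(codecs.BOM_UTF8))
    had_bom = head == codecs.BOM_UTF8
    if not had_bom:
        # Not a BOM: these bytes are real data and must be kept.
        dst.write(head)
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
    return had_bom
```

With real files this would be called as `strip_utf8_bom(open(src_path, "rb"), open(dst_path, "wb"))`. It still rewrites every byte once, so it reduces the cost of the workaround rather than removing it.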
1
Answers
-
Andrew,
I have a feeling your header line regex is matching the first #separator line. I can test it out, but I think maybe you would want to change your header line to ^#[Ff] so we ignore the lines before the fields line.
1 -
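To make the difference between the two header patterns concrete, here is a small sketch using Python's `re` module. The sample lines are abbreviated from the file in this thread, and Cribl evaluates its regexes in its own engine, so treat this only as an illustration of what each pattern matches:

```python
import re

# Abbreviated lines in the style of the file discussed in this thread.
lines = [
    "#separator \\x09",
    "#set_separator ,",
    "#fields ts uid id.orig_h",
    "#types time string addr",
    "2022-05-01 00:00:00.000012 CUaMDI3N3CtEwGXbX9 128.83.27.4",
]

any_comment = re.compile(r"^#")      # every line starting with '#'
fields_only = re.compile(r"^#[Ff]")  # only '#f...' or '#F...' lines

matches_any = [ln for ln in lines if any_comment.search(ln)]
matches_fields = [ln for ln in lines if fields_only.search(ln)]

print(len(matches_any))   # 4
print(matches_fields)     # ['#fields ts uid id.orig_h']
```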
Can you validate you've committed and deployed?
0 -
Yeah sorry I think the header line is actually excluding everything that starts with #. What do your first few events look like on the file import?
0 -
Sorry, I mean what do the first few events look like in Stream when you run the job? You had a screenshot above but it started at event 7, just wondering how the first few events look.
0 -
The breaker works for me in the preview, but not when I run an actual collection.
sample.tsv: UTF-8 Unicode (with BOM) text, with CRLF line terminators
0 -
It is being pulled from an NFS mount (Synology NAS) on a linux (ubuntu 20.04) vm.
0 -
Which is strange since it is showing you are hitting the tsv-bro breaker.
0 -
How big is the file you are collecting?
0 -
The filetype is just a variable on the path for a filesystem. I have about 15 different "filetypes"… I'm hitting the right breaker, as seen in my collection. I'm on Cribl 3.4.1 for both leader and worker.
0 -
It's a 10,000-line sample, but when the file actually comes in it's about 10-50 GB.
0 -
Can you try bumping your max event size to the max, 134217728 (128MB)? How big are your working IIS logs?
0 -
You could also try to bump up the event breaker buffer timeout on the file system collector.
0 -
The IIS logs would have been about the same size, 10,000 line sample. Each event about the same size. Those whole files are anywhere from 100KB to 2-3GB. Any recommendation on the buffer timeout?
I will recreate the whole event breaker tomorrow using the same config and max size you recommend, and test again in case I flubbed something else up somewhere. I'll let you know!
0 -
I'm still having the issue. This was my order of operations to fix/recreate the issue.
Increased event breaker buffer timeout on collector source to 600000, commit, deploy.
Unsuccessful.
Delete event breaker, commit, deploy.
Recreate event breaker with settings you recommended to try, commit, deploy.
Filesystem collection has same results/symptoms.
I connected to the worker UI from the leader, and verified the breaker exists on the worker. The file preview with the breaker on the worker works.
I checked logs for the specific adhoc run and I only see 1 error.
{
"time": "2022-05-04T13:29:35.668Z",
"cid": "api",
"channel": "Job",
"level": "error",
"message": "failed to cancel task",
"jobId": "1651670972.72.adhoc.NDCA-collector",
"taskId": "collect.0",
"reason": {
"message": "Instance 1651670972.72.adhoc.NDCA-collector|collect.0 not registered",
"stack": "RpcInstanceNotFoundError: Instance 1651670972.72.adhoc.NDCA-collector|collect.0 not registered\n at /opt/cribl/bin/cribl.js:14:13169102\n at /opt/cribl/bin/cribl.js:14:11427356\n at runMicrotasks ()\n at processTicksAndRejections (internal/process/task_queues.js:95:5)\n at async k.handleRequest (/opt/cribl/bin/cribl.js:14:13168338)",
"name": "RpcInstanceNotFoundError",
"req": {
"instanceId": "1651670972.72.adhoc.NDCA-collector|collect.0",
"method": "cancel",
"args": []
}
},
"source": "/opt/cribl/state/jobs/default/1651670972.72.adhoc.NDCA-collector/logs/job/job.log"
}
0 -
That just looks like the worker was restarting. Have you tried collecting a small file ~20 events?
0 -
Yeah, the error looked benign, but just providing info.
I edited the file down to about 20 lines, changed EOL to LF from CRLF. Same symptoms.
0 -
Yes encoding has hit me a few times. My next suggestion was to run head on your file and see if you had any extra stuff at the beginning. I will add this to a feature request I already had in for supporting additional encoding on file system collector.
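The "run head on your file" check can also be scripted. A minimal sketch, assuming Python is available on the worker; the BOM constants come from the standard `codecs` module, while `detect_bom` and `sniff_file` are hypothetical helper names:

```python
import codecs

# Known BOM signatures. UTF-32 is listed before UTF-16 because the
# UTF-32 LE BOM begins with the same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(head: bytes):
    """Return the encoding name implied by a leading BOM, or None."""
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return None


def sniff_file(path, n=16):
    """Read the first n bytes of a file and report any BOM found."""
    with open(path, "rb") as fh:
        head = fh.read(n)
    return head, detect_bom(head)
```

For the file in this thread, `sniff_file("sample.tsv")` would be expected to return a head starting with `b'\xef\xbb\xbf'` and the name `"utf-8"`, matching what `file` reported.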
0 -
[Screenshot: Filesystem collection results (events do not have correct fields/values)]
0 -
[Screenshot: Event breaker rules out]
0 -
UTF-8 should be fine.
This is using the breaker I posted above and pulling with a file system collector.
The only difference is my filter. When I add a 'filetype' field on the collector and use bro with your filter, it breaks. Where are you adding filetype?
0 -
I ran what you sent above through a collector with what I think is the same breaker as you and it worked.
The next thing I would check is what type of encoding you have on your file. Is this pulling from a Linux machine? If so, run
file testfile.tsv
on your test file.
0 -
Sorry, the first few events are the commented rows from the log. Exactly as they appear in the log.
0 -
I want it to exclude the # lines since those are not events. The first real events are tab separated and the field names are in that field list. When I tried changing the header line to "^#[Ff]", the event breaker preview completely failed.
Here are the first few lines of the file. With my original settings, the import looks fine and field/value pairs look good. But when run with a filesystem collector, it fails. I'm going to try with the header line changes that you recommended.
#separator \x09
#set_separator ,
#empty_field (empty)
#unset_field -
#path conn
#open 2021-11-12-12-45-00
#fields ts uid id.orig_h id.orig_p id.resp_h id.resp_p id.vlan id.vlan_inner proto service duration orig_bytes resp_bytes conn_state local_orig local_resp missed_bytes history orig_pkts orig_ip_bytes resp_pkts resp_ip_bytes tunnel_parents orig_cc resp_cc suri_ids community_id
#types time string addr port addr port int int enum string interval count count string bool bool count string count count count count set[string] string string set[string] string
2022-05-01 00:00:00.000012 CUaMDI3N3CtEwGXbX9 128.83.27.4 46210 170.114.10.87 443 4020 \N tcp \N 65.207075 0 6218 SHR 1 0 0 ^hdf 0 0 9 6590 \N US US \N 1:1nbEONdQpmuQtjlL3SSQbc28Wyo=
2022-05-01 00:00:00.000320 CAZzJv4QRVv5Yek7Oh 128.83.130.204 54935 58.247.212.36 53 4020 \N udp dns \N \N \N SHR 1 0 0 ^d 0 0 1 156 \N US CN \N 1:KJjQRZuB5bkT7+ebSf4FW7RJiL8=
2022-05-01 00:00:00.000432 CdRza81SzhESDDyhI9 128.83.72.175 58632 192.111.4.106 443 4020 \N tcp ssl 376.280685 1458 6534 S1 1 0 0 ShDd 3 1590 7 6826 \N US US \N 1:ZqDFOlfGk/8wlEO1gmawxhE6YBg=
2022-05-01 00:00:00.001140 CAcMyE40njQ2DatMNc 128.83.28.30 59755 205.251.197.3 53 4020 \N udp dns \N \N \N S0 1 0 0 D 1 140 0 0 \N US US \N 1:SeSWa3fEVB/I60glsRug0PmDPys=
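For anyone reproducing this outside Cribl, the sketch below shows in general terms how a header like this maps to field/value pairs, and how a leading byte order mark can stop the very first line from looking like a `#` comment. It is plain Python with a shortened, hypothetical three-field sample, and it says nothing about how the filesystem collector is actually implemented:

```python
import codecs
import io


def parse_zeek_tsv(text_stream):
    """Yield one dict per data row, using names from the '#fields' line."""
    fields = []
    for raw in text_stream:
        line = raw.rstrip("\r\n")
        if line.startswith("#"):
            if line.startswith("#fields"):
                fields = line.split("\t")[1:]
            continue  # header and comment lines are not events
        if line:
            yield dict(zip(fields, line.split("\t")))


# Shortened sample (assumption: three fields only), CRLF line endings.
SAMPLE = (
    "#separator \\x09\r\n"
    "#fields\tts\tuid\tproto\r\n"
    "#types\ttime\tstring\tenum\r\n"
    "2022-05-01 00:00:00.000012\tCUaMDI3N3CtEwGXbX9\ttcp\r\n"
).encode("utf-8")

with_bom = codecs.BOM_UTF8 + SAMPLE

# 'utf-8-sig' drops the BOM; plain 'utf-8' keeps it as '\ufeff', so the
# first line no longer starts with '#' and is treated as a data row.
clean = list(parse_zeek_tsv(io.StringIO(with_bom.decode("utf-8-sig"))))
dirty = list(parse_zeek_tsv(io.StringIO(with_bom.decode("utf-8"))))

print(len(clean), len(dirty))  # 1 2
```

In the BOM-preserving case the extra "event" is the `#separator` line itself, which is similar to the symptom of commented rows showing up as the first events.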
0 -
Dan, that does make sense. I'll reconfigure that one and reply back. I have an example where that would be inconsistent if that's the case.
This event breaker works, collects, and extracts correctly, even though the first few lines match ^# as well.
Jon, yes I have. On multiple occasions, for each attempt. I had restarted the worker too just in case.
0