Extracting values from a netscaler log

This is beyond my skill set, so I was hoping someone could help. I have NetScaler logs coming in as key<space>value pairs, and I'm hoping there is a regex that extracts them all. They look like this: ```Source 1.2.3.4:18356 - Vserver 4.5.6.7:389 - NatIP 8.9.0.1:18356 - Destination 6.6.6.6:389 -``` That's just one sample; the rest arrive in a similar format. I'm looking for a single regex that extracts the pairs and puts them into fields.
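For readers who want to see the regex idea on its own, here is a minimal Python sketch run against the sample above. It assumes every pair has the shape `key<space>value` followed by a ` -` separator, as in the one sample shown; the names `PAIR_RE` and `extract_pairs` are made up for illustration and are not part of any product.

```python
import re

# Sample event copied from the question.
SAMPLE = (
    "Source 1.2.3.4:18356 - Vserver 4.5.6.7:389 - "
    "NatIP 8.9.0.1:18356 - Destination 6.6.6.6:389 -"
)

# A key (word characters), whitespace, a value with no spaces,
# then the " -" separator that closes each pair in the sample.
PAIR_RE = re.compile(r"(?P<key>\w+)\s+(?P<value>\S+)\s+-")


def extract_pairs(event: str) -> dict[str, str]:
    """Return every key/value pair found in the event as a dict."""
    return {m.group("key"): m.group("value") for m in PAIR_RE.finditer(event)}


fields = extract_pairs(SAMPLE)
print(fields)
```

Events that deviate from the sample (values containing spaces, a missing trailing separator) would need a looser pattern.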

Answers

  • Franky Laarits
    Franky Laarits Posts: 59 ✭✭

    wow. Thanks, so much!

  • Franky Laarits
    Franky Laarits Posts: 59 ✭✭

    hey <@ULBGHDPNY> - thanks for the help here. I was wondering if I could run another one by you; you seem to be the regex expert on this channel. I have a feed coming in that uses key-value events. String values are enclosed in double quotes ("), which makes them useless in Splunk for the TERM command. I'd like to remove the quotes, where they exist, from each key-value pair and, if possible, replace spaces in the values with underscores. Is that possible? Here's an example: ```1.2.3.4 devname="DEVNAME" devid="AZEVTM21000028" eventtime=1681220885920164816 tz="-0400" logid="0000000013" type="traffic" subtype="forward" level="notice" vd="root" srcip=4.5.6.7 srcport=50239 srcintf="port2" srcintfrole="undefined" dstip=7.8.9.0 dstport=10000 dstintf="port5" dstintfrole="undefined" srccountry="Reserved" dstcountry="Reserved" sessionid=1316660221 proto=6 vrf=1 action="timeout" policyid=42 policytype="policy" poluuid="65eAAA79a-88Z1-51ec-87EGF8-61fd6d21915b" policyname="AWS-POLICHY-TO-MADEUP-Subnets" service="TCP-1000" trandisp="noop" duration=10 sentbyte=52 rcvdbyte=0 sentpkt=1 rcvdpkt=0 appcat="unscanned" testfield="SOME TEXT"```

  • Jon Rust
    Jon Rust Posts: 487 mod

    No regex required for the first ask. Just use the Parser function with Operation mode set to Reserialize, Type K=V, and source field _raw. Boom. Quotes gone.

  • Jon Rust
    Jon Rust Posts: 487 mod

    To combine the second ask, change the Parser to extract, saving to a new field, call it `parsed`. Then use the Mask function to clean up spaces. Then the Serialize function to change back to K=V. Finally, an Eval to drop the `parsed` field. ```{ "conf": { "output": "default", "streamtags": [], "groups": {}, "asyncFuncTimeout": 1000, "functions": [ { "filter": "true", "conf": { "mode": "extract", "type": "kvp", "srcField": "_raw", "cleanFields": false, "allowedKeyChars": [], "allowedValueChars": [], "dstField": "parsed" }, "id": "serde" }, { "filter": "true", "conf": { "rules": [ { "matchRegex": "/\\s+/g", "replaceExpr": "''" } ], "fields": [ "parsed" ], "depth": 5 }, "id": "mask" }, { "filter": "true", "conf": { "type": "kvp", "fields": [ "" ], "dstField": "_raw", "cleanFields": false, "srcField": "parsed" }, "id": "serialize" }, { "filter": "true", "conf": { "remove": [ "parsed" ] }, "id": "eval" } ] }, "id": "scottB" }```
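To make the transformation itself concrete outside of a pipeline, here is a rough Python sketch of the same idea: find each `key=value` pair, unwrap quoted values, and rewrite the whitespace inside them. It is not how the Parser, Mask, or Serialize functions are implemented. `KV_RE` and `clean_kv` are illustrative names, and the underscore default follows the original ask, whereas the Mask rule in the config above replaces whitespace with an empty string.

```python
import re

# key=value where the value is either double-quoted or a bare token.
KV_RE = re.compile(r'(?P<key>\w+)=(?:"(?P<quoted>[^"]*)"|(?P<bare>\S+))')


def clean_kv(event: str, space_replacement: str = "_") -> str:
    """Strip quotes from K=V values and replace whitespace inside them."""

    def rewrite(m: re.Match) -> str:
        quoted = m.group("quoted")
        value = quoted if quoted is not None else m.group("bare")
        value = re.sub(r"\s+", space_replacement, value)
        return f"{m.group('key')}={value}"

    # Text that is not part of a key=value pair (the leading IP) is left as-is.
    return KV_RE.sub(rewrite, event)


sample = (
    '1.2.3.4 devname="DEVNAME" srcip=4.5.6.7 '
    'srcintfrole="undefined" testfield="SOME TEXT"'
)
cleaned = clean_kv(sample)
print(cleaned)
```

Passing `space_replacement=""` mirrors what the Mask rule in the posted config does.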

  • Franky Laarits
    Franky Laarits Posts: 59 ✭✭

    awesome. I'll try this out. Thanks so much.

  • Jon Rust
    Jon Rust Posts: 487 mod

    result sample:

  • Franky Laarits
    Franky Laarits Posts: 59 ✭✭

    did the first part. Amazing. Love this thing!

  • Jon Rust
    Jon Rust Posts: 487 mod

    ```{ "conf": { "output": "default", "streamtags": [], "groups": {}, "asyncFuncTimeout": 1000, "functions": [ { "filter": "true", "conf": { "mode": "extract", "type": "kvp", "srcField": "_raw", "cleanFields": false, "allowedKeyChars": [], "allowedValueChars": [], "fieldFilterExpr": "value != null && value != 'undefined'", "dstField": "parsed" }, "id": "serde" }, { "filter": "true", "conf": { "srcField": "parsed.eventtime", "dstField": "_time", "defaultTimezone": "local", "timeExpression": "time.getTime() / 1000", "offset": 0, "maxLen": 150, "defaultTime": "now", "latestDateAllowed": "+1week", "earliestDateAllowed": "-420weeks" }, "id": "auto_timestamp" }, { "filter": "true", "conf": { "rules": [ { "matchRegex": "/\\s+/g", "replaceExpr": "''" } ], "fields": [ "parsed" ], "depth": 5 }, "id": "mask" }, { "filter": "true", "conf": { "type": "kvp", "fields": [ "!eventtime", "!tz", "" ], "dstField": "_raw", "cleanFields": false, "srcField": "parsed" }, "id": "serialize" }, { "filter": "true", "conf": { "remove": [ "parsed" ] }, "id": "eval" } ] }, "id": "scottB" }```

  • Jon Rust
    Jon Rust Posts: 487 mod

    ^^^ My simple clean-up recs:
    » Ditch fields that are empty or 'undefined'
    » Use eventtime for _time, but then drop it and the pointless tz field

  • Jon Rust
    Jon Rust Posts: 487 mod

    (tz doesn't make sense in the context of epoch-style timestamps)
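The clean-up recommendations can be sketched in plain Python as well. This is only an illustration of the logic, not the Auto Timestamp function's behaviour. It assumes `eventtime` is epoch nanoseconds, which the 19-digit sample value suggests but the thread never states; `tidy` is a hypothetical helper name.

```python
def tidy(fields: dict[str, str]) -> dict[str, object]:
    """Drop empty/'undefined' values, promote eventtime to _time, drop tz."""
    kept: dict[str, object] = {
        key: value
        for key, value in fields.items()
        if value not in ("", "undefined")
    }

    eventtime = kept.pop("eventtime", None)
    if eventtime is not None:
        # Assumption: eventtime is epoch nanoseconds; convert to seconds.
        kept["_time"] = int(eventtime) / 1_000_000_000

    # An epoch timestamp is already absolute, so the tz field adds nothing.
    kept.pop("tz", None)
    return kept


event = {
    "eventtime": "1681220885920164816",
    "tz": "-0400",
    "srcintfrole": "undefined",
    "action": "timeout",
}
result = tidy(event)
print(result)
```

The time is not lost by dropping `eventtime`; it lives on in `_time`.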

  • Franky Laarits
    Franky Laarits Posts: 59 ✭✭

    yeah, I'm debating dropping the time. Need to run it by the analysts. They'll probably want to keep it until they feel more comfortable.

  • Jon Rust
    Jon Rust Posts: 487 mod

    you're keeping it 🙂 in _time

  • Franky Laarits
    Franky Laarits Posts: 59 ✭✭

    Didn't realize just how many undefined values were in this feed. Dropped. Gotta love it.

  • Jon Rust
    Jon Rust Posts: 487 mod

    Between the quotes, the undefined fields, and the time field, I'd guess you'll whack 20%+ off the overall volume.

  • Franky Laarits
    Franky Laarits Posts: 59 ✭✭

    hey <@ULBGHDPNY> - sorry to keep bugging you. I have one last one, if that's ok. I have a key<space>value feed and I want to turn it into key=value format. I tried what you mentioned above and a few other things, but nothing worked. Any suggestions?

  • Jon Rust
    Jon Rust Posts: 487 mod

    sure, give me a few minutes to wrap up a call

  • Jon Rust
    Jon Rust Posts: 487 mod

    can you provide a sample event with this format?

  • Franky Laarits
    Franky Laarits Posts: 59 ✭✭

    sure

  • Franky Laarits
    Franky Laarits Posts: 59 ✭✭

    ```
    Apr 11 17:29:09 host1 shd_logs_bdc1nx: Status: CPULd 3.4 DskUtil 7.4 RAMUtil 17.7 Reqs 171 Band 82695 Latency 68 CacheHit 7 CliConn 19821 SrvConn 20379 MemBuf 98 SwpPgOut 54243 ProxLd 31 Wbrs_WucLd 0.0 LogLd 0.0 RptLd 0.0 WebrootLd 1.1 SophosLd 15.7 McafeeLd 0.0 WTTLd 0.0
    Apr 11 17:29:03 host2 shd_logs_bdc3nx: Status: CPULd 4.9 DskUtil 8.0 RAMUtil 18.4 Reqs 184 Band 401701 Latency 147 CacheHit 4 CliConn 19516 SrvConn 20169 MemBuf 98 SwpPgOut 56092 ProxLd 45 Wbrs_WucLd 0.0 LogLd 0.0 RptLd 0.0 WebrootLd 0.0 SophosLd 19.2 McafeeLd 0.0 WTTLd 0.0
    Apr 11 17:29:00 host3 shd_logs_ndc2nx: Status: CPULd 1.3 DskUtil 5.0 RAMUtil 14.6 Reqs 0 Band 0 Latency 4 CacheHit 0 CliConn 7 SrvConn 10 MemBuf 63 SwpPgOut 0 ProxLd 0 Wbrs_WucLd 0.0 LogLd 0.0 RptLd 0.0 WebrootLd 0.0 SophosLd 0.0 McafeeLd 0.0 WTTLd 0.0
    ```

  • Jon Rust
    Jon Rust Posts: 487 mod

    My take on it:
    » Extract the part that has the KV pairs into `payload`
    » Use Regex Extract on `payload` with KEY_0 and VALUE_0 shenanigans
    » Use Serialize to push those extracted fields back into _raw as K=V
    » Use Eval to clean up the mess
    ```{ "conf": { "output": "default", "streamtags": [], "groups": {}, "asyncFuncTimeout": 1000, "functions": [ { "filter": "true", "conf": { "source": "_raw", "iterations": 100, "overwrite": false, "regex": "/Status: (?<payload>.*)/" }, "id": "regex_extract" }, { "filter": "true", "conf": { "source": "payload", "iterations": 100, "overwrite": false, "regex": "/(?<_KEY_0>\\S+)\\s+(?<_VALUE_0>\\S+)/" }, "id": "regex_extract" }, { "filter": "true", "conf": { "type": "kvp", "fields": [ "!", "!cribl_breaker", "!host", "!source", "!payload", "!index", "" ], "dstField": "_raw", "cleanFields": false }, "id": "serialize" }, { "filter": "true", "conf": { "keep": [ "_raw", "_time", "source", "index" ], "remove": [ "*" ] }, "id": "eval" } ] }, "id": "scottb2" }```
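The same steps can be tried outside the pipeline with a short Python sketch. It reuses the two patterns from the config (written with Python's `(?P<name>...)` group syntax) and assumes each event carries a single `Status:` payload of whitespace-separated key/value tokens; `to_kv` is an illustrative name only.

```python
import re

# Step 1: everything after "Status: " is the key/value payload.
STATUS_RE = re.compile(r"Status: (?P<payload>.*)")
# Step 2: consecutive non-space tokens are consumed as key, value, key, value...
PAIR_RE = re.compile(r"(?P<key>\S+)\s+(?P<value>\S+)")


def to_kv(event: str) -> str:
    """Rewrite 'key value key value' after 'Status:' as 'key=value ...'."""
    match = STATUS_RE.search(event)
    if match is None:
        return event  # no Status payload; leave the event untouched
    pairs = PAIR_RE.findall(match.group("payload"))
    return " ".join(f"{key}={value}" for key, value in pairs)


sample = (
    "Apr 11 17:29:09 host1 shd_logs_bdc1nx: Status: "
    "CPULd 3.4 DskUtil 7.4 RAMUtil 17.7 Reqs 171"
)
converted = to_kv(sample)
print(converted)
```

Like the pipeline, this drops the syslog prefix from the rewritten event, so the timestamp and host need to be kept elsewhere (for example in `_time` and `host`) if they matter downstream.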

  • Franky Laarits
    Franky Laarits Posts: 59 ✭✭

    thanks. Perfect. I can learn a lot from this.