Extracting values from a netscaler log

This is beyond my skill set, so I was hoping someone could help me. I have NetScaler logs that come in as key<space>value pairs, and I was hoping there is a regex to extract them all. It goes like this: ```Source 1.2.3.4:18356 - Vserver 4.5.6.7:389 - NatIP 8.9.0.1:18356 - Destination 6.6.6.6:389 -``` That's just one sample; the rest arrive in a similar format. I'm looking for one regex to extract the pairs and put them into fields.
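
As a minimal sketch of what a single-regex extraction could look like for the sample above (outside of any particular product), the Python below treats a word-like token followed by one space and a run of non-space characters as a pair. The pattern and the `extract_pairs` helper are assumptions for illustration, not something taken from the thread.

```python
import re

# Assumed pattern: a key that starts with a letter, one space,
# then a value made of non-whitespace characters (IP:port here).
PAIR_RE = re.compile(r"\b(?P<key>[A-Za-z]\w*) (?P<value>\S+)")


def extract_pairs(line):
    """Return a dict with one entry per key<space>value pair in the line."""
    return {m.group("key"): m.group("value") for m in PAIR_RE.finditer(line)}


sample = (
    "Source 1.2.3.4:18356 - Vserver 4.5.6.7:389 - "
    "NatIP 8.9.0.1:18356 - Destination 6.6.6.6:389 -"
)
fields = extract_pairs(sample)
print(fields)
```

The ` - ` separators never match because a key must start with a letter, so they are skipped without special handling. Events whose values contain spaces would need a stricter pattern.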

Answers

  • Franky Laarits
    Franky Laarits Posts: 59 ✭✭

    Wow. Thanks so much!

  • Franky Laarits
    Franky Laarits Posts: 59 ✭✭

    hey <@ULBGHDPNY> - thanks for the help here. Was wondering if I could run another one by you. You seem to be the regex expert on this channel. I have a feed coming in that uses key-value events. The string values are enclosed in quotes ("), which makes them useless in Splunk for the TERM command. I'd like to remove the quotes, if they exist, from each key-value pair. Also, if possible, replace spaces in the value with underscores. Is that possible? Here's an example: ```1.2.3.4 devname="DEVNAME" devid="AZEVTM21000028" eventtime=1681220885920164816 tz="-0400" logid="0000000013" type="traffic" subtype="forward" level="notice" vd="root" srcip=4.5.6.7 srcport=50239 srcintf="port2" srcintfrole="undefined" dstip=7.8.9.0 dstport=10000 dstintf="port5" dstintfrole="undefined" srccountry="Reserved" dstcountry="Reserved" sessionid=1316660221 proto=6 vrf=1 action="timeout" policyid=42 policytype="policy" poluuid="65eAAA79a-88Z1-51ec-87EGF8-61fd6d21915b" policyname="AWS-POLICHY-TO-MADEUP-Subnets" service="TCP-1000" trandisp="noop" duration=10 sentbyte=52 rcvdbyte=0 sentpkt=1 rcvdpkt=0 appcat="unscanned" testfield="SOME TEXT"```

  • Jon Rust
    Jon Rust Posts: 439 mod

    No regex required for the first ask. Just use the Parser function with Operation mode set to Reserialize, Type set to K=V, and source field `_raw`. Boom. Quotes gone.

  • Jon Rust
    Jon Rust Posts: 439 mod

    To add the second ask, change the Parser to Extract mode, saving to a new field; call it `parsed`. Then use the Mask function to replace spaces, and the Serialize function to change back to K=V. Finally, add an Eval to drop the `parsed` field.
    ```
    {
      "conf": {
        "output": "default",
        "streamtags": [],
        "groups": {},
        "asyncFuncTimeout": 1000,
        "functions": [
          {
            "filter": "true",
            "conf": { "mode": "extract", "type": "kvp", "srcField": "_raw", "cleanFields": false, "allowedKeyChars": [], "allowedValueChars": [], "dstField": "parsed" },
            "id": "serde"
          },
          {
            "filter": "true",
            "conf": { "rules": [ { "matchRegex": "/\\s+/g", "replaceExpr": "'_'" } ], "fields": [ "parsed" ], "depth": 5 },
            "id": "mask"
          },
          {
            "filter": "true",
            "conf": { "type": "kvp", "fields": [ "*" ], "dstField": "_raw", "cleanFields": false, "srcField": "parsed" },
            "id": "serialize"
          },
          {
            "filter": "true",
            "conf": { "remove": [ "parsed" ] },
            "id": "eval"
          }
        ]
      },
      "id": "scottB"
    }
    ```
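
For readers who want to see the extract, mask, and serialize steps outside the pipeline, here is a rough Python sketch of the same idea. It is only an illustration under assumptions: the regex, the `clean_kv` helper, and the shortened sample are not from the thread, and tokens without an `=` (such as the leading IP address in the full event) are simply dropped.

```python
import re

# Assumed K=V shape: the value is either double-quoted or a run of
# non-space characters.
KV_RE = re.compile(r'(?P<key>\w+)=(?:"(?P<quoted>[^"]*)"|(?P<bare>\S*))')


def clean_kv(raw):
    """Strip quotes from values and replace whitespace runs with underscores."""
    parts = []
    for m in KV_RE.finditer(raw):
        # "Extract": take the quoted form if present, else the bare value.
        value = m.group("quoted") if m.group("quoted") is not None else m.group("bare")
        # "Mask": collapse each run of whitespace into one underscore.
        value = re.sub(r"\s+", "_", value)
        # "Serialize": write the pair back as key=value without quotes.
        parts.append(f"{m.group('key')}={value}")
    return " ".join(parts)


# Shortened version of the sample event from the question.
sample = 'devname="DEVNAME" tz="-0400" srcip=4.5.6.7 srcport=50239 testfield="SOME TEXT"'
cleaned = clean_kv(sample)
print(cleaned)
# devname=DEVNAME tz=-0400 srcip=4.5.6.7 srcport=50239 testfield=SOME_TEXT
```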

  • Franky Laarits
    Franky Laarits Posts: 59 ✭✭

    awesome. I'll try this out. Thanks so much.

  • Jon Rust
    Jon Rust Posts: 439 mod

    result sample:

  • Franky Laarits
    Franky Laarits Posts: 59 ✭✭

    did the first part. Amazing. Love this thing!

  • Jon Rust
    Jon Rust Posts: 439 mod

    ```
    {
      "conf": {
        "output": "default",
        "streamtags": [],
        "groups": {},
        "asyncFuncTimeout": 1000,
        "functions": [
          {
            "filter": "true",
            "conf": { "mode": "extract", "type": "kvp", "srcField": "_raw", "cleanFields": false, "allowedKeyChars": [], "allowedValueChars": [], "fieldFilterExpr": "value != null && value != 'undefined'", "dstField": "parsed" },
            "id": "serde"
          },
          {
            "filter": "true",
            "conf": { "srcField": "parsed.eventtime", "dstField": "_time", "defaultTimezone": "local", "timeExpression": "time.getTime() / 1000", "offset": 0, "maxLen": 150, "defaultTime": "now", "latestDateAllowed": "+1week", "earliestDateAllowed": "-420weeks" },
            "id": "auto_timestamp"
          },
          {
            "filter": "true",
            "conf": { "rules": [ { "matchRegex": "/\\s+/g", "replaceExpr": "'_'" } ], "fields": [ "parsed" ], "depth": 5 },
            "id": "mask"
          },
          {
            "filter": "true",
            "conf": { "type": "kvp", "fields": [ "!eventtime", "!tz", "*" ], "dstField": "_raw", "cleanFields": false, "srcField": "parsed" },
            "id": "serialize"
          },
          {
            "filter": "true",
            "conf": { "remove": [ "parsed" ] },
            "id": "eval"
          }
        ]
      },
      "id": "scottB"
    }
    ```

  • Jon Rust
    Jon Rust Posts: 439 mod

    ^^^ My simple cleanup recs:
    » Ditch fields that are empty or 'undefined'
    » Use eventtime for _time, but then drop it and the pointless tz field
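
The cleanup recommendations above can be sketched in plain Python as follows. This is a hedged illustration, not the pipeline itself: the `tidy` helper is hypothetical, and the code assumes the 19-digit `eventtime` in the sample is epoch nanoseconds.

```python
def tidy(parsed):
    """Drop empty/'undefined' fields, move eventtime into _time, drop tz."""
    # Keep only fields that carry a real value.
    event = {k: v for k, v in parsed.items() if v not in (None, "", "undefined")}

    # Pull out the fields that become redundant once _time is set.
    eventtime = event.pop("eventtime", None)
    event.pop("tz", None)

    if eventtime is not None:
        # Assumption: eventtime is epoch nanoseconds; _time is epoch seconds.
        event["_time"] = int(eventtime) / 1_000_000_000
    return event


# A few fields taken from the sample event in the question.
parsed = {
    "eventtime": "1681220885920164816",
    "tz": "-0400",
    "srcintf": "port2",
    "srcintfrole": "undefined",
    "action": "timeout",
}
result = tidy(parsed)
print(result)
```

The timestamp is not lost: it moves from the event body into `_time`, which is why the separate `eventtime` and `tz` fields can go.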

  • Jon Rust
    Jon Rust Posts: 439 mod

    (tz doesn't make sense in the context of epoch-style timestamps)

  • Franky Laarits
    Franky Laarits Posts: 59 ✭✭

    yeah, I'm debating dropping the time. Need to run it by the analysts. They'll probably want to keep it until they feel more comfortable.

  • Jon Rust
    Jon Rust Posts: 439 mod

    you're keeping it 🙂 in _time

  • Franky Laarits
    Franky Laarits Posts: 59 ✭✭

    Didn't realize just how many undefined values were in this feed. Dropped. Gotta love it.

  • Jon Rust
    Jon Rust Posts: 439 mod

    Between the quotes, the undefined fields, and the time field, I'd guess you whack 20%+ from the overall volume.

  • Franky Laarits
    Franky Laarits Posts: 59 ✭✭

    hey <@ULBGHDPNY> - sorry to keep bugging you. I have one last one, if that's ok. I have a key<space>value feed and I want to turn it into a key=value format. I tried what you mentioned above and a few other things, but that didn't work. Any suggestions?

  • Jon Rust
    Jon Rust Posts: 439 mod

    sure, give me a few minutes to wrap up a call

  • Jon Rust
    Jon Rust Posts: 439 mod

    can you provide a sample event with this format?

  • Franky Laarits
    Franky Laarits Posts: 59 ✭✭

    sure

  • Franky Laarits
    Franky Laarits Posts: 59 ✭✭

    ```
    Apr 11 17:29:09 host1 shd_logs_bdc1nx: Status: CPULd 3.4 DskUtil 7.4 RAMUtil 17.7 Reqs 171 Band 82695 Latency 68 CacheHit 7 CliConn 19821 SrvConn 20379 MemBuf 98 SwpPgOut 54243 ProxLd 31 Wbrs_WucLd 0.0 LogLd 0.0 RptLd 0.0 WebrootLd 1.1 SophosLd 15.7 McafeeLd 0.0 WTTLd 0.0
    Apr 11 17:29:03 host2 shd_logs_bdc3nx: Status: CPULd 4.9 DskUtil 8.0 RAMUtil 18.4 Reqs 184 Band 401701 Latency 147 CacheHit 4 CliConn 19516 SrvConn 20169 MemBuf 98 SwpPgOut 56092 ProxLd 45 Wbrs_WucLd 0.0 LogLd 0.0 RptLd 0.0 WebrootLd 0.0 SophosLd 19.2 McafeeLd 0.0 WTTLd 0.0
    Apr 11 17:29:00 host3 shd_logs_ndc2nx: Status: CPULd 1.3 DskUtil 5.0 RAMUtil 14.6 Reqs 0 Band 0 Latency 4 CacheHit 0 CliConn 7 SrvConn 10 MemBuf 63 SwpPgOut 0 ProxLd 0 Wbrs_WucLd 0.0 LogLd 0.0 RptLd 0.0 WebrootLd 0.0 SophosLd 0.0 McafeeLd 0.0 WTTLd 0.0
    ```

  • Jon Rust
    Jon Rust Posts: 439 mod

    My take on it:
    » Extract the part that has the KV pairs into `payload`
    » Use Regex Extract on `payload` with _KEY_0 and _VALUE_0 shenanigans
    » Use Serialize to push those extracted fields back into `_raw` as K=V
    » Use Eval to clean up the mess
    ```
    {
      "conf": {
        "output": "default",
        "streamtags": [],
        "groups": {},
        "asyncFuncTimeout": 1000,
        "functions": [
          {
            "filter": "true",
            "conf": { "source": "_raw", "iterations": 100, "overwrite": false, "regex": "/Status: (?<payload>.*)/" },
            "id": "regex_extract"
          },
          {
            "filter": "true",
            "conf": { "source": "payload", "iterations": 100, "overwrite": false, "regex": "/(?<_KEY_0>\\S+)\\s+(?<_VALUE_0>\\S+)/" },
            "id": "regex_extract"
          },
          {
            "filter": "true",
            "conf": { "type": "kvp", "fields": [ "!_*", "!cribl_breaker", "!host", "!source", "!payload", "!index", "*" ], "dstField": "_raw", "cleanFields": false },
            "id": "serialize"
          },
          {
            "filter": "true",
            "conf": { "keep": [ "_raw", "_time", "source", "index" ], "remove": [ "*" ] },
            "id": "eval"
          }
        ]
      },
      "id": "scottb2"
    }
    ```
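
The same two-stage idea (isolate the payload after `Status:`, then pair up alternating tokens) can be sketched in Python. The two regexes mirror the ones in the pipeline above; the `status_to_kv` helper and the shortened sample are assumptions for illustration only.

```python
import re


def status_to_kv(raw):
    """Turn the key<space>value payload after 'Status: ' into key=value text."""
    # Stage 1: capture everything after "Status: " as the payload.
    match = re.search(r"Status: (?P<payload>.*)", raw)
    if match is None:
        return raw  # leave events without a Status payload untouched

    # Stage 2: each pair is a non-space key, whitespace, a non-space value.
    pairs = re.findall(r"(\S+)\s+(\S+)", match.group("payload"))
    return " ".join(f"{key}={value}" for key, value in pairs)


# Shortened version of the first sample event from the thread.
sample = "Apr 11 17:29:09 host1 shd_logs_bdc1nx: Status: CPULd 3.4 DskUtil 7.4 RAMUtil 17.7 Reqs 171"
converted = status_to_kv(sample)
print(converted)
# CPULd=3.4 DskUtil=7.4 RAMUtil=17.7 Reqs=171
```

Isolating the payload first matters: running the pair regex over the whole line would also pair up the syslog header tokens, such as `Apr 11`.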

  • Franky Laarits
    Franky Laarits Posts: 59 ✭✭

    thanks. Perfect. I can learn a lot from this.