JSON formatting for VMWare Log Insight

BargiBargi
BargiBargi Posts: 6
edited July 2023 in Stream

Hi,

I've added some fields (lookups, etc.) to events from a syslog source and am trying to push them into the Log Insight API via a webhook.

But I'm struggling to understand how to get the events into the format Log Insight wants.
I'll say up front, JSON and JavaScript are pretty new to me, so be gentle 🙂

The JSON format Log Insight expects is…

{
  "messages": [
    {
      "fields": [
        {"name": "Field1", "content": "Field1_value"},
        {"name": "Field2", "content": "Field2_value"},
        {"name": "Field_xxx", "content": "Field_Last_xxx"}
      ],
      "text": "original message",
      "timestamp": timestamp
    }
  ]
}

I've tried the Serialize function with the JSON type, but it doesn't seem to have a way to arrange the fields so that the output follows a custom layout.

I've seen a post about using Object.fromEntries, but I still can't get it working. I'm not sure it's even the right way to go.

Converting an array into a simplified JSON object:

Does anyone have a slick way of converting an array where each element is an array of length 2, that contains the key at index 0 and value at index 1, into a JSON object. This is what the original looks like: { "HeadersIn": [ [ "Host", "example.com" ], [ "Accept", "*/*" ], [ "Connection", "keep-alive" ], [ "Cookie", " ], [ …
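
For reference, here is a minimal sketch of what Object.fromEntries does with an array of pairs like the one in that post, plus its inverse, Object.entries. The sample values come from the excerpt above and the variable names are only illustrative. Object.fromEntries builds one flat object, which is the opposite direction from the name/content list Log Insight expects, so Object.entries is probably the more useful half here.

```javascript
// Array of [key, value] pairs, as in the quoted post (sample values only).
const headersIn = [
  ["Host", "example.com"],
  ["Accept", "*/*"],
  ["Connection", "keep-alive"],
];

// Object.fromEntries collapses the pairs into one flat object.
const headers = Object.fromEntries(headersIn);
console.log(JSON.stringify(headers));
// {"Host":"example.com","Accept":"*/*","Connection":"keep-alive"}

// Object.entries goes the other way, which is the starting point for
// building a list of {name, content} items.
const fields = Object.entries(headers).map(([name, content]) => ({ name, content }));
console.log(JSON.stringify(fields));
```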

It's probably something simple, Thank you in advance!

Answers

  • Jon Rust
    Jon Rust Posts: 443 mod
    edited July 2023

    Be sure your target object is actually JSON. If it's text (represented by an 'a' in the preview pane), you'll want to parse it into JSON before it hits the destination. An Eval function that sets _raw to JSON.parse(_raw) would work.
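
    As a rough illustration outside Cribl, this is the difference between a text _raw and a parsed one. The sample string is made up; inside an Eval function only the JSON.parse(_raw) expression would be needed.

    ```javascript
    // Hypothetical _raw value that arrived as text rather than as an object.
    const rawText = '{"text": "original message", "timestamp": 1689000000}';
    console.log(typeof rawText); // string

    // Equivalent of the Eval expression JSON.parse(_raw).
    const rawObject = JSON.parse(rawText);
    console.log(typeof rawObject); // object
    console.log(rawObject.text); // original message
    ```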

    Note the curly braces next to raw here, indicating a JSON object:

  • Brendan Dalpe
    Brendan Dalpe Posts: 201 mod
    edited July 2023

    Hi @BargiBargi,

    Here's how I would build a pipeline to format the messages, shown with an example event that already has some fields extracted. We want to move _raw to the text field, _time to timestamp, and then all remaining fields to fields.

    Let's start with a basic Eval function to build the general structure of an individual message:

    (the expression is {"text": _raw, "timestamp": _time, "fields": []})

    Now we can use the code function to do some magic… We want to take all fields (using the special variable __e) that do not start with an underscore (internal or otherwise not already used) and move their KV pairs to fields.

    __e['_raw']['fields'] = Object.entries(__e)
        .filter(([key, value]) => !key.startsWith('_'))
        .map(([key, value]) => ({"name": key, "content": value}))

    We use the Object.entries function to create a KV array to work with in the filter and map functions. In the map function, the return value reformats the original KV pair into the expected name and content fields.
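
    To try the same logic outside Cribl, here is a minimal sketch in plain JavaScript. The __e object below is only a stand-in for the Cribl event after the Eval step, and the host and severity fields are made-up sample data.

    ```javascript
    // Stand-in for Cribl's __e event object after the Eval step
    // (host and severity are hypothetical sample fields).
    const __e = {
      _raw: { text: "original message", timestamp: 1689000000, fields: [] },
      _time: 1689000000,
      host: "esx01",
      severity: "info",
    };

    // Keep only keys that do not start with an underscore, then reshape
    // each [key, value] pair into a {name, content} item.
    __e["_raw"]["fields"] = Object.entries(__e)
      .filter(([key]) => !key.startsWith("_"))
      .map(([key, value]) => ({ name: key, content: value }));

    console.log(JSON.stringify(__e._raw, null, 2));
    ```

    The underscore filter is what keeps _raw and _time out of fields; swapping the test for something like key.startsWith('NSXT') would restrict the list to a known prefix instead.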

    Finally, we can use the Aggregations function to combine events into a single array.

    The list(_raw) function will generate a new array of individual messages aggregated together. The evaluate fields expression moves the array into the expected messages object key.
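
    In plain JavaScript terms, the aggregation step amounts to roughly the following. This is only a sketch of the resulting shape, not of how the Aggregations function works internally, and the two sample events are made up.

    ```javascript
    // Two already-formatted messages, standing in for the _raw of two events.
    const events = [
      { _raw: { text: "first message", timestamp: 1689000000, fields: [] } },
      { _raw: { text: "second message", timestamp: 1689000001, fields: [] } },
    ];

    // Roughly what list(_raw) produces: an array of the individual _raw values.
    const list = events.map((event) => event._raw);

    // Roughly what the evaluate-fields step does: wrap the array under "messages".
    const payload = { messages: list };
    console.log(JSON.stringify(payload));
    ```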

    Which gives an output that looks like the following:

    Now in the Webhook destination, configure as follows to only emit the _raw field as the payload to the Log Insight collector. Note the URL is static for the destination, but it can be customized per-event by setting the __url field.

    Let me know if this solves your issue!

  • BargiBargi
    BargiBargi Posts: 6
    edited July 2023

    Amazing!
    Thanks so much for the help.

    I've worked through it and I can see it builds the JSON object up as expected.
    But when it sends to the LI API, it fails with the error below.

    {"errorMessage":"Invalid request body.","errorCode":"JSON_FORMAT_ERROR","errorDetails":{"reason":"Unrecognized token 'object': was expecting (JSON String, Number, Array, Object or token 'null', 'true' or 'false')\n at [Source: (String)"[object Object]\n[object Object]\n[object Obj…"[truncated 3915 chars]; line: 1, column: 8]"}}

    The pipeline output looks fine (the only thing that's maybe a bit strange is that content is listed before name, but I assume that's just Cribl sorting alphabetically).

    I changed the code to filter on anything starting with NSXT, as all the fields I need start with that.

    I tried it both with the Aggregations function and with just an Eval, and got the same error.

    Live Data for the Webhook destination again looks fine.

  • Brendan Dalpe
    Brendan Dalpe Posts: 201 mod

    @BargiBargi could you try adding an eval function to the end of your pipeline that turns _raw into JSON.stringify(_raw)?
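
    A quick sketch of why this should help, assuming the object-valued _raw is being coerced to a string somewhere on the way out (which is what the [object Object] in the error suggests). The payload here is sample data.

    ```javascript
    const payload = {
      messages: [{ text: "original message", timestamp: 1689000000, fields: [] }],
    };

    // Coercing an object to a string loses its contents.
    console.log(String(payload)); // [object Object]

    // JSON.stringify produces text that a JSON parser can read back.
    const body = JSON.stringify(payload);
    console.log(body);
    ```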

  • BargiBargi
    BargiBargi Posts: 6
    edited July 2023

    @bdalpe
    Using JSON.stringify(_raw), I can see events come through when using the Aggregations step, which is good, but now I barely see 1 event a second coming through when there's much more than that coming in.

    I thought it might have something to do with the Aggregations step, so I replaced it with the following Eval. It works, but again only at a very low rate.

    There are no dropped events in the Cribl Webhook, which is strange, so I'm not quite sure what's going on. Does JSON.stringify have a limit on the rate it can process?
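
    As a rough check outside Cribl: JSON.stringify is an ordinary synchronous function with no built-in rate limit. Below is a small sketch, using made-up messages and plain Node.js timing, that serializes a batch and reports how long it took. No timing figures are claimed here, and it says nothing about where the slowdown in the pipeline or destination actually is.

    ```javascript
    // Build 10,000 made-up messages in the Log Insight shape.
    const messages = [];
    for (let i = 0; i < 10000; i++) {
      messages.push({
        text: "original message " + i,
        timestamp: 1689000000 + i,
        fields: [{ name: "Field1", content: "Field1_value" }],
      });
    }

    // Serialize each message separately and time the whole loop.
    const start = Date.now();
    const serialized = messages.map((message) => JSON.stringify(message));
    const elapsedMs = Date.now() - start;

    console.log("serialized " + serialized.length + " messages in " + elapsedMs + " ms");
    ```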

  • BargiBargi
    BargiBargi Posts: 6
    edited July 2023

    Hi,

    I've been trying to sort this out on and off and can't figure out what the issue is.
    Without JSON.stringify(_raw), Cribl outputs [object Object], which can be seen in last-failed-buffer.raw.

    With JSON.stringify(_raw) it works, but it's literally 1 message every 2 seconds.

    Any advice would be much appreciated!