
How to skip the default parsing of JSON input to the HTTP(S) Cribl source in Streams?

Santhosh M Posts: 8

Dear all,

I'm back with another question, as I couldn't find a similar one among the existing posts.

I have created the HTTP RAW Source and am able to push data to it and process that data in the pipeline. However, I have observed that when I POST valid JSON (all attributes in double quotes), the Cribl HTTP source parses it into individual fields after the source stage, and if the posted content is not parsable as JSON, the content is available under the key "_raw".

My requirement is to keep the input JSON object as is, because I am trying to validate the whole JSON against a schema stored in Knowledge. When the input is not parsed, my use case runs as expected. In the other case, where the JSON is parsed into individual fields, I couldn't find any internal key that holds all these fields so that I could format them back into a JSON object to validate against the schema.

Hence, I am reaching out for any suggestions on this. Thanks in advance!


NOTE: It would be a good idea to have an option in the HTTP(S) Source settings to enable or disable parsing of JSON-compatible input.
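
The behavior described in the question can be pictured with a small Python sketch. This is only an illustration of what was observed, not Cribl's actual implementation; the function name `ingest` is hypothetical, and how non-object JSON is treated is an assumption made here for completeness.

```python
import json


def ingest(body: str) -> dict:
    """Mimic the reported behavior (illustrative only, not Cribl code)."""
    try:
        parsed = json.loads(body)
    except json.JSONDecodeError:
        # Not parsable as JSON: the payload stays untouched under _raw.
        return {"_raw": body}
    if isinstance(parsed, dict):
        # Valid JSON object: each attribute becomes a top-level field.
        return parsed
    # Assumption: valid JSON that is not an object is also kept as text.
    return {"_raw": body}


# Double-quoted attributes form valid JSON; single-quoted ones do not.
valid = '{"testkey2": "1", "testkey3": "1", "testkey1": "(test) - foo"}'
invalid = "{'testkey2': '1', 'testkey3': '1', 'testkey1': '(test) - foo'}"

print(ingest(valid))    # individual fields, no _raw key
print(ingest(invalid))  # whole payload under _raw
```

In the first call the original JSON text is no longer available as a single value, which is the situation the question asks how to avoid.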

Answers

  • Jon Rust
    Jon Rust Posts: 439 mod

    I'm not sure I follow. You say you're using Raw HTTP, but then you also say Cribl HTTP, which is entirely different. I've tested in my lab using Raw HTTP to validate Cribl does not parse received JSON by default. You could define an Event Breaker to do it, but out of the box everything sent is put into _raw as a string.

    Sent: curl -X POST http://workernode:10082/mycribl -d '{"test":"data"}'

    Captured on the Cribl side (the "a" next to the _raw field means it is a string):

  • Santhosh M
    Santhosh M Posts: 8

    Hi @Jon Rust ,

    Thanks for taking the time to answer. The source I meant was HTTP, not the Cribl HTTP that you mentioned in your explanation. Apologies for the confusion!

    I am surprised that the event data you got is in _raw, because in my situation it was different; I assume it could be because I specified the Content-Type. As I can't share screenshots of the project environment that I was testing, here is a sample similar to the data I was testing:

    Case 1:
    curl -X POST -H "Content-Type: application/json" "http://<host>:<port>/cribl/_bulk" -d '{"testkey2": "1", "testkey3": "1", "testkey1": "(test) - foo"}'

    The data captured after the HTTP source had been parsed into key-value (K-V) fields. I was able to handle this scenario using the "serialize" function to create the JSON with all the required attributes.

    Case 2:

    curl -X POST -H "Content-Type: application/json" "http://<host>:<port>/cribl/_bulk" -d "{'testkey2': '1', 'testkey3': '1', 'testkey1': '(test) - foo'}"

    The data captured in this case was added as a string under the "_raw" key.

    As I mentioned in Case 1, I have found a solution to my question. Thanks again for your time. Appreciate it!

    Cheers!

  • Jon Rust
    Jon Rust Posts: 439 mod

    Good info. Thanks for the followup!

    Are you using a custom event breaker?

    I've included the content-type header as above and get the same results as before. In any case, sounds like you got what you need. Happy Criblin'!

  • Santhosh M
    Santhosh M Posts: 8

    No event breaker. However, I do see the event differently, and I have handled the case to accept both forms, string and parsed JSON. I will try to post if I can recreate the scenario in a sandbox later this week.


    Thanks again!
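
The approach mentioned in the replies, accepting both the string form and the parsed form, could look roughly like the Python sketch below. It is not Cribl's Serialize function; the name `event_to_json`, the rule that internal fields start with "_" or "cribl_", and the sample `_time` value are all assumptions made for illustration.

```python
import json


def event_to_json(event: dict) -> str:
    """Return one JSON string for an event in either shape seen in the thread."""
    raw = event.get("_raw")
    if isinstance(raw, str):
        # Unparsed case: the payload is already a single string.
        return raw
    # Parsed case: rebuild the object from the non-internal fields.
    # Assumption: internal fields start with "_" or "cribl_".
    fields = {
        key: value
        for key, value in event.items()
        if not key.startswith(("_", "cribl_"))
    }
    return json.dumps(fields, sort_keys=True)


# Case 1 shape: attributes arrived as separate fields (_time is a made-up value).
parsed_event = {
    "testkey2": "1",
    "testkey3": "1",
    "testkey1": "(test) - foo",
    "_time": 1700000000,
}
print(event_to_json(parsed_event))
# {"testkey1": "(test) - foo", "testkey2": "1", "testkey3": "1"}

# Case 2 shape: the payload arrived as one string under _raw.
raw_event = {"_raw": "{'testkey2': '1', 'testkey3': '1', 'testkey1': '(test) - foo'}"}
print(event_to_json(raw_event))
```

Either way the result is a single string that can be handed to a schema validator. In the Case 2 shape the string is returned unchanged, so a single-quoted payload would still fail validation as JSON.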