Are you suffering from fear of missing out (FOMO)? It is not uncommon for log outputs to change with vendor updates or internal administrative changes. Data that is parsed incorrectly, or not at all, can lead to missing events in your security and analytics tools. While log format changes are inevitable, Cribl Stream can significantly reduce, or even eliminate, the impact of missed events in your organization. With Stream’s native functionality, you can easily create a process that validates incoming data and splits off events that do not meet your validation criteria. Those events can be directed to a designated repository for further review, while the rest of the data moves on to the destination of your choice.
Use-Case: Create a process to validate whether log output is in an intelligible format for the destination. Use Cribl’s schema library as a framework to validate JSON data post-parsing. If the data does not pass validation, send it to a custom logging repository where it can be reviewed.
Step 1: Generate Data for the Source
First, make sure you have a source configured that emits JSON data.
Manage -> Routing -> Quick Connect -> Add Source -> Datagen
In this example, use Quick Connect and Cribl’s Datagen Source configured with the ‘big_json.log’ sample.
Step 2: Configure Destinations and Output Router
For this process, configure two destinations: One destination for the events that pass validation, like your SIEM, and a second for the review repository.
For the review repository, it is recommended to keep a short retention period (for example, 5-10 days) and to review it frequently. Find out more about configuring destinations in the docs.
Next, configure the Output Router. This feature will be used later in the process to divert output to the correct destination, based on results from the validation pipeline.
Add Destination -> Filter Destinations -> Output Router
Enter an Output Router Name and save it. You will return to configure the router in a later step.
Next, make a connection by clicking the plus icon on the Source and dragging to the Destination.
Source -> Destination -> Passthrough
Since you are working with sample data, the source is the Datagen and the destination is the Output Router. In the configuration pop-up, select Passthrough; you will configure the pipeline in the next step.
Step 3: Create Your Validation Pipeline
Create a route between the source and destination. In the Quick Connect pipeline, you will add:
- Parser Function
- Eval Function
Select the Route -> Pipeline -> Add Pipeline to Connection -> Add Pipeline
Since you are working with Datagen, it’s OK to keep this step very simple. Leaving the ‘List of fields’ blank allows the Parser function to auto-generate a list of fields for you! You will return to configure the Eval function later.
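For reference, a minimal Parser configuration for this step might look like the sketch below (option names follow the Parser function’s standard settings and may vary slightly by Stream version):

Operation mode: Extract
Type: JSON Object
Source field: _raw
List of fields: (leave blank to auto-generate)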
Step 4: Create JSON Validation Schema
Now that a pipeline is established, create the JSON Schema based on the fields and values your security and analysis tools are expecting.
Processing -> Knowledge -> Schemas -> New Schema
Start with the JSON below and save it with the ID ‘schema1’.
{
  "$id": "https://example.com/bigjson.schema.json",
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "bigjson",
  "type": "string",
  "required": [
    "host",
    "source",
    "sourcetype",
    "Dest"
  ],
  "properties": {
    "host": {
      "type": "string",
      "description": "Device where log originates."
    },
    "source": {
      "type": "string",
      "description": "Data source that creates the log."
    },
    "sourcetype": {
      "type": "string",
      "description": "Field identifying the structure of the event."
    },
    "Dest": {
      "type": "string",
      "description": "Endpoint for the intended message."
    }
  }
}
The JSON schema defines the fields host, source, sourcetype, and Dest as string fields. While this is a simple validation, complexity can be added where necessary. For example, you can use the schema to ensure that host equals a specific value or to constrain integer fields to a specific range, as shown in the sketch below.
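As an illustration, the host property could be pinned to known values and a numeric field constrained to a range. The enum values and the port field here are examples only, not part of big_json.log:

"host": {
  "type": "string",
  "enum": ["webserver01", "webserver02"],
  "description": "Device where log originates."
},
"port": {
  "type": "integer",
  "minimum": 1,
  "maximum": 65535,
  "description": "Hypothetical numeric field constrained to a valid port range."
}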
Step 5: Update Eval Function to Validate Data
Return to the pipeline and update the Eval Function to include the JSON Schema you just created.
Manage -> Routing -> Quick Connect -> Pipeline -> Eval
Including a double underscore prefix in the ‘Name’ field (__isvalid) enriches the event with a metadata field. This is an internal field that will not be passed on to your destination, but it can be used later to bifurcate your data.
For the Value Expression, reference ‘schema1’ and pass host as the value to validate. You are validating that ‘host’ is present and contains a string value. You can validate any of the fields (source, sourcetype, Dest) here as well.
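As a sketch, assuming Stream’s built-in C.Schema expression method, the Eval field could look like this (exact expression syntax may differ in your version):

Name: __isvalid
Value Expression: C.Schema('schema1').validate(host)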
Step 6: Test the Output of Your Pipeline
Capture sample data to view and test the results of the pipeline.
Sample Data -> Capture Data -> Capture -> Start -> Save as Sample File -> Simple Preview
By default, internal fields are not shown in the output.
Settings -> Show Internal Fields
In the two logs from the sample, you can see ‘__isvalid: true’. These logs have passed the validation test. Now test ‘Dest’.
Eval -> Add Field
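One way to do this, again assuming the C.Schema helper, is to add a row to the Eval function that validates the Dest field (a sketch, not the only possible approach):

Name: __isvalid
Value Expression: C.Schema('schema1').validate(Dest)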
Save the changes that have been made to the Eval function. Pivot to the Simple Preview and re-run the pipeline against the saved sample file.
Since the field ‘Dest’ has been added to the validator, you will now get ‘__isvalid: false’, because the Dest field does not exist in the big_json.log file.
Pro TIP: While this pipeline is validating data in the route, validation could also be done at the source using a pre-processing pipeline.
Step 7: Send Data to the Right Destination
Return to the Output Router and add the Filter Expression to direct data to the desired location.
Destinations -> Output Router -> Rules -> Filter Expression
Events where ‘__isvalid’ is true are sent to your security and analytics tools. Events where ‘__isvalid’ is false are sent to a separate repository for review.
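A simple rule set for the Output Router might look like the following, where ‘siem’ is a placeholder for your own analytics destination and ‘s3:Review’ is the review repository:

Rule 1: Filter Expression __isvalid === true, Output siem
Rule 2: Filter Expression true (catch-all), Output s3:Review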
Step 8: Check Your Work!
You can validate the output of your pipeline by capturing live data at the destination.
Destination -> Capture -> Show Internal Fields
In the above example, you can see that events with ‘__isvalid: false’ are sent to ‘__outputId: s3:Review’, which is exactly the result we were expecting.
Wrap Up!
This framework is a great way to ensure that your security and analytics destinations receive data in a format they recognize. You can include as many or as few fields as needed to validate the data. The framework can even be extended to similar use-cases, like segmenting data by host, source, or sourcetype and routing each segment to an alternate destination.
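For instance, an Output Router rule with a filter expression such as sourcetype === 'firewall' (a hypothetical value) could steer those events to their own destination while everything else follows the default catch-all rule.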