join
The join
operator merges events from two different data scopes.
You can merge data coming from different datasets and dataset providers, and then process the output for further analysis.
Syntax
Using a let
statement:
let RightScopeName = RightScope;
LeftScope
| join [ JoinOptions ] RightScopeName on JoinConditions
Using an inline subquery:
LeftScope
| join [ JoinOptions ] (
RightScope
)
on JoinConditions
Arguments
- RightScopeName: The name for the RightScope. Spaces (
- RightScope: The data on the right-hand side of the join operation. Needs to be assigned a name, using the
let statement. Can be a single value, an expression, or an entire query, for example:
dataset="cribl_internal_logs" | limit 1000
. - LeftScope: The data on the left-hand side of the join operation. Can be a single value, an expression, or an entire query.
- JoinOptions: The join operation’s optional settings. Syntax:
option1=value1[, option2=value2, ...]
. The following options are available:kind=String
: The type of join to perform. Can beinner
,leftouter
, orleftanti
. Default:inner
.overwrite=Bool
: Applies to fields with the same name. Iftrue
, the RightScope fields overwrite their LeftScope namesakes. Iffalse
, both LeftScope and RightScope namesakes are preserved. Default:false
.caseInsensitive=Bool
: Iftrue
, the join operation is case-insensitive. Default:false
.doNotMerge
: Iftrue
, the events are not merged. Instead, the join operation works only as a filter. Default:false
.
- JoinConditions: The conditions on which the events are merged. Separate multiple conditions with a comma (
,
). Use the following syntax:FieldName
, if the fields you want to match have the same name in both datasets.$left.LeftFieldName == $right.RightFieldName
, if you want to match fields that have different names.
Rules
The RightScope must:
- Either be assigned a name, using a let statement.
- Or be an inline subquery.
The join operation includes the first 50,000 events of the RightScope. The remaining events are ignored.
Types of Joins
By default, Cribl Search performs the so-called inner join, which returns the following:
Scope | What’s included in the output |
---|---|
LeftScope | Events that match the join conditions, including the matching keys |
RightScope | Events that match the join conditions, including the matching keys |
To perform the so-called
leftouter join, use the
kind=leftouter
setting, which returns the following:
Scope | What’s included in the output |
---|---|
LeftScope | All events |
RightScope | Events that match the join conditions, including the matching keys |
To perform the so-called leftanti
join, use the kind=leftanti
setting, which returns the following:
Scope | What’s included in the output |
---|---|
LeftScope | Events that don’t match the join conditions |
RightScope | No events |
Examples
Join two scopes of data, using an inline subquery:
dataset="cribl_search_sample"
| join (
// inline subquery
search in(blacklistIps) hostIp
) on $left.srcaddr == $right.hostIp
Join two scopes of data, based on a single namesake field (commonField
). Returns only those LeftScope
events that
have the commonField
. Those events are also enriched with the commonField
fields from the RightScope
.
let RightScope = dataset="myDataset1"
| where x < 10;
dataset="myDataset2" // LeftScope
| join RightScope on commonField
Join two scopes of data, based on multiple namesake fields. Returns only those LeftScope
events that have the
commonField1
and commonField2
fields. Those events are also enriched with the commonField1
and commonField2
fields from the RightScope
.
let RightScope = dataset="myDataset1"
| where x < 10;
dataset="myDataset2" // LeftScope
| join kind=inner RightScope on commonField1, commonField2
Join two scopes of data, based on fields with different names. Returns only those bills
events that have the name
field. Additionally, those events are enriched with their lastName
counterparts from the employees
dataset.
let RightScope = dataset=employees;
dataset=bills
| join RightScope on $left.name == $right.lastName
Join two scopes of data, based on fields with different names. Returns only those LeftScope
events that have the a1
and b1
fields. Additionally, those events are enriched with their b1
and b2
counterparts from the RightScope
.
let RightScope = dataset="myDataset1"
| where x < 10;
dataset="myDataset2" // LeftScope
| join kind=inner RightScope on $left.a1 == $right.b1, $left.a2 == $right.b2
Perform two joins in a single search. Returns only those trafficData
events that have the targetIP
and hostname
fields. Additionally, those events are enriched with their destinationIP
and hostname
counterparts.
let sourceOne = search in(vpcflowlogs) *
| summarize c=count() by destinationIP
| top 10 by c;
let sourceTwo = search in(hostlist) *
| where host.status == 'blacklisted';
search in(trafficData) *
| join sourceOne on $left.targetIP == $right.destinationIP
| join sourceTwo on $left.hostname == $right.hostname;
Perform a join with the overwrite
option set to true
. Returns only those LeftScope
events that have the name
field, but the RightScope
name
values overwrite the LeftScope
name
values.
let RightScope = search in(myDataset1) 'foo'
| count;
search in (myDataset2) x < y // LeftScope
| join overwrite=true RightScope on name
Perform a left outer join. Returns all events from the bills
dataset. Additionally, those bills
events that have the
name
field are enriched with the lastName
field.
let RightScope = dataset=employees;
dataset=bills // LeftScope
| where isnull(lastName)
| join kind=leftouter RightScope on $left.name == $right.lastName
let codeMap = dataset=$vt_dummy event < 4 | extend x =
case(event == 0, dynamic({"status": 200, "text": 'OK'}),
event == 1, dynamic({"status": 308, "text": 'Redirect'}),
event == 2, dynamic({"status": 404, "text": 'Not found'}),
dynamic({"status": 500, "text": 'Internal Error'}))
| project status=x.status, statusText=x.text;
dataset="cribl_int*" method status | where isnotnull(status) | join codeMap on status | distinct method, status, statusText;