
Creating New Fields from _raw

I want to create new fields from values in _raw (sample below). The goal is a series of folders based on the year, month, day, and hour:

2025 (parent folder) - 01 (subfolder) - 12 (subfolder under 01) - 23 (subfolder under 12).

_raw:"2025-01-12 23:59:56","Another Value","Dummy Data","","Something Really Cool"

I'm using Regex Extraction and I can see the selection in the preview as Group 1, Group 2, etc.

However, I cannot convert these into individual fields. Any suggestions?

Comments

  • Jon Rust
    Jon Rust Posts: 485 mod
    edited January 14

    In the Regex Extraction function, you can use named groups, which you can then refer to as fields. Example:

    source: _raw
    regex: ^(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2}) (?<hour>\d{2})

    This will result in 4 new fields in your event.
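
    As a rough check outside Cribl, the same named groups can be tried in plain JavaScript against the sample event from the question. This is only a sketch: the variable names are illustrative, and the optional leading quote (`^"?`) is an assumption added here because the sample _raw appears to begin with a double quote, which a pattern anchored directly on the digits would not match.

    ```javascript
    // Sample _raw from the question; every value is wrapped in double quotes.
    const raw = '"2025-01-12 23:59:56","Another Value","Dummy Data","","Something Really Cool"';

    // Same named groups as in the reply. The optional leading quote (^"?) is an
    // assumption for this sample, not part of the original pattern.
    const re = /^"?(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2}) (?<hour>\d{2})/;

    // Each named group becomes one key, much like one field per group on the event.
    const match = raw.match(re);
    const fields = match ? { ...match.groups } : {};

    console.log(fields); // { year: '2025', month: '01', day: '12', hour: '23' }
    ```

    If the pattern does not match, `fields` stays empty, which is a quick way to spot an anchoring problem before looking at the pipeline itself.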

    Another option would be to parse the time, which should have happened automatically in this case. The _time field should have the epoch time in it. You can use that to "extract" year, month, day and hour as needed. In an Eval function you can use C.Time.strftime, for example:

    Add field
    mypath = C.Time.strftime(_time,"%Y/%m/%d/%H")
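
    To see what that format string should produce, here is a plain JavaScript stand-in that can run outside Cribl. `hourPath` is a hypothetical helper, not a Cribl function, and it assumes _time holds epoch seconds and that the path should be built in UTC.

    ```javascript
    // Hypothetical stand-in for C.Time.strftime(_time, "%Y/%m/%d/%H").
    // Assumes epoch seconds as input and UTC for the output.
    function hourPath(epochSeconds) {
      const d = new Date(epochSeconds * 1000);
      const pad = (n) => String(n).padStart(2, "0");
      return [
        d.getUTCFullYear(),
        pad(d.getUTCMonth() + 1), // getUTCMonth() is zero-based
        pad(d.getUTCDate()),
        pad(d.getUTCHours()),
      ].join("/");
    }

    // The timestamp from the sample event, 2025-01-12 23:59:56, taken as UTC.
    const sampleTime = Date.UTC(2025, 0, 12, 23, 59, 56) / 1000;

    console.log(hourPath(sampleTime)); // 2025/01/12/23
    ```

    If the source timestamps are in a local time zone rather than UTC, the hour folder can shift, so it is worth comparing one event's path against its _raw timestamp.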

  • Steve Bennett
    Steve Bennett Posts: 6

    I feel like there is a next step, because when I do Full Preview, I still don't see those fields or values. I can see them in the preview of the Regex Advance view but nowhere else.

  • Jon Rust
    Jon Rust Posts: 485 mod
    edited January 14

    What happens if you use Simple Preview? Can you share screencaps?

    Edit: full preview is usually an unneeded complexity

  • Steve Bennett
    Steve Bennett Posts: 6

    Wow… It's in Simple Preview but not Full….

  • Jon Rust
    Jon Rust Posts: 485 mod

    Full Preview takes everything into account: source, routes, filters, pipelines, etc. It requires you to consider all the steps, so it takes a few extra steps and is not usually needed. Simple Preview is where you want to be in most cases.

  • Steve Bennett
    Steve Bennett Posts: 6

    Jon, you are awesome. So the pipeline can now pass these fields on to the destination? In my case, an S3-compatible bucket?

  • Jon Rust
    Jon Rust Posts: 485 mod

    Yep. If you're trying to set the path for the bucket, you may want to do this in the S3 destination configs.

  • Steve Bennett
    Steve Bennett Posts: 6
    edited January 14

    So, I'm parsing DNS logs from Cisco Umbrella. Cisco keeps a copy in a managed S3 bucket, but only for a rolling 30 days. I'm trying to preserve the folder structure as closely as possible, but I wanted to segment it further by adding an hour folder so I don't have as much scrolling. I'm sending one copy to my Splunk Cloud instance and another to some on-premises S3-compatible storage for longer-term retention.

    UPDATE: I forgot to mention that when I tried to do this at the Destination, I got a bunch of unexpected token errors.

  • Jon Rust
    Jon Rust Posts: 485 mod

    Can you share the partition expression you're using? A sample event (post-processing) would be great too.

  • Steve Bennett
    Steve Bennett Posts: 6

    /${event_year}/${event_month}/${event_day}/${event_hour}/

  • Jon Rust
    Jon Rust Posts: 485 mod

    The key prefix field is not the path field. They serve different purposes. From the docs:

    > Key prefix: Root directory to prepend to path before uploading. Enter either a constant, or a JS expression (enclosed in single quotes, double quotes, or backticks) that will be evaluated only at init time

    The setting you have shown in the picture belongs in the Partitioning expression, and it will ONLY work if the event_year, event_month, etc. fields are available in the event. You'd be better off using _time as I showed above. Note that the backticks are absolutely required.

    `/${C.Time.strftime(_time,"%Y/%m/%d/%H")}/`
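
    To illustrate why the backticks matter, here is a plain JavaScript sketch of how the two forms parse. It is not Cribl's evaluator: `evaluate` is a hypothetical helper that runs an expression string with the event's fields in scope, and the event values are made up from the sample timestamp.

    ```javascript
    // Illustrative event carrying the four extracted fields.
    const event = { event_year: "2025", event_month: "01", event_day: "12", event_hour: "23" };

    // Hypothetical helper: evaluate a JS expression string with event fields as variables.
    function evaluate(expr, evt) {
      const names = Object.keys(evt);
      const fn = new Function(...names, "return (" + expr + ");");
      return fn(...names.map((n) => evt[n]));
    }

    // With backticks, the expression is a template literal and the fields are substituted.
    const quoted = "`/${event_year}/${event_month}/${event_day}/${event_hour}/`";
    const partition = evaluate(quoted, event);
    console.log(partition); // /2025/01/12/23/

    // Without backticks, the same text is not valid JavaScript and fails to parse.
    const bare = "/${event_year}/${event_month}/${event_day}/${event_hour}/";
    let bareError = null;
    try {
      evaluate(bare, event);
    } catch (e) {
      bareError = e.name;
    }
    console.log(bareError); // SyntaxError
    ```

    The second case is the kind of parse failure that would plausibly surface as an unexpected token error when the expression is entered without backticks.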

    I'd recommend reading this blog post from Cribler Ahmed Kira, too.

    I have office hours available today at 08:30 PST, and Friday at 10:00 and 10:30 PST. Let me know if you'd like to get on a call to sort this out.