
How are you handling token persistence?

I imagine a lot of people are using the Splunk Load Balanced destination with indexer discovery. How are you handling token persistence? We patch/recycle our Cluster Master monthly, so I'm looking for a method to "restore" the tokens without impacting data delivery. I believe tokens are stored in the kvstore, $SPLUNK_HOME/etc/passwd, and $SPLUNK_HOME/etc/auth/splunk.secret files.

Answers

  • Raanan Dagan
    Raanan Dagan Posts: 101 mod

    <@U02QJ374Z3R> There are many customers using the Splunk LB destination with Indexer Discovery, but I'm not sure how often most of them actually update the token.

  • Raanan Dagan
    Raanan Dagan Posts: 101 mod

    With Cribl there are 2 ways (that I can think of) to update the Indexer Discovery token: manually in the UI, or via the Cribl API.
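    If you go the API route, here's a rough sketch of what that could look like in Python with requests. The leader URL, worker group, output id, endpoint paths, and the indexer-discovery field names are all assumptions for illustration, so GET the output first and match whatever shape your version actually returns; in distributed mode you'd also commit/deploy the change afterwards.

    ```python
    # Hedged sketch: update the Splunk LB destination's indexer-discovery token
    # via the Cribl REST API. Endpoint paths and field names are assumptions --
    # fetch the output first and confirm the real shape in your environment.
    import requests

    LEADER = "https://cribl-leader.example.com:9000"   # hypothetical leader URL
    GROUP = "default"                                  # hypothetical worker group
    OUTPUT_ID = "splunk_lb_prod"                       # hypothetical destination id

    # 1. Log in to get a bearer token (local auth assumed)
    login = requests.post(f"{LEADER}/api/v1/auth/login",
                          json={"username": "admin", "password": "changeme"})
    headers = {"Authorization": f"Bearer {login.json()['token']}"}

    # 2. Fetch the current destination config so only the token field changes
    url = f"{LEADER}/api/v1/m/{GROUP}/system/outputs/{OUTPUT_ID}"
    conf = requests.get(url, headers=headers).json()["items"][0]

    # 3. Swap in the new indexer-discovery auth token (field name is an assumption)
    conf["indexerDiscoveryConfigs"]["authToken"] = "NEW-TOKEN-FROM-CM"

    # 4. Write it back, then commit/deploy so Workers pick it up
    requests.patch(url, json=conf, headers=headers)
    ```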

  • Raanan Dagan
    Raanan Dagan Posts: 101 mod

    Once you update the token, Cribl will not impact data delivery, since (quoting the docs): "Worker Process Rolling Restart: During a restart, to minimize ingestion disruption and increase availability of network ports, Worker Processes on a Worker Node are restarted in a rolling fashion. 20% of running processes – with a minimum of one process – are restarted at a time. A Worker Process must come up and report as started before the next one is restarted. This rolling restart continues until all processes have restarted. If a Worker Process fails to restart, configurations will be rolled back."

  • Paul Dott
    Paul Dott Posts: 33 ✭✭

    I don't use Indexer Discovery, nor do I recommend it to clients. Just say no.

  • Paul Dott
    Paul Dott Posts: 33 ✭✭

    <@U01J549PR6Y> It's not that I want to update the token regularly in Cribl (if at all, unless there's a breach or something). But when I recycle my cluster manager, the auth tokens get wiped on the Splunk side and would therefore be invalidated in Cribl's indexer-discovery config. Granted, this is more of a Splunk question, but I figured enough Cribl customers leverage the Splunk LB destination that someone has a solution for it.

  • Paul Dott
    Paul Dott Posts: 33 ✭✭

    <@UEGNG8MJB> Any particular reason? Curious what better method exists to maintain a working list of online indexers.

  • Jon Rust
    Jon Rust Posts: 475 mod

    i used DNS. one name, all the indexer IPs behind it. I'm curious why a CM restart trashes the token. Never seen that before.
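    For anyone curious what the "one name, all the indexer IPs behind it" part looks like from the resolver's side, here's a tiny sketch; the hostname is made up, and per the thread Cribl will spread load across every address the name returns.

    ```python
    # Hedged sketch: a single DNS name backed by all the indexer IPs.
    # "indexers.example.com" is a made-up name for illustration.
    import socket

    records = socket.getaddrinfo("indexers.example.com", 9997, proto=socket.IPPROTO_TCP)
    ips = sorted({sockaddr[0] for *_, sockaddr in records})
    print(ips)   # every indexer IP behind the one name, e.g. ['10.0.1.11', '10.0.1.12', ...]
    ```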

  • Paul Dott
    Paul Dott Posts: 33 ✭✭

    Not a restart, but swapping out the EC2 instance for updated AMI etc.

  • Paul Dott
    Paul Dott Posts: 33 ✭✭

    We also recycle the indexers for the same reasons. (security compliance)

  • Raanan Dagan
    Raanan Dagan Posts: 101 mod

    As Jon said, the most common alternative I've seen is DNS.

  • Paul Dott
    Paul Dott Posts: 33 ✭✭

    DNS. Cribl handles DNS round robin waaaay better than Splunk. And coming from an old-school network guy who prefers an NLB over any round-robin DNS, that says something. The programmers did an excellent job of leveraging DNS entries that are loaded with IPs, and the Cribl load-balance approach is better than Splunk's.

  • Raanan Dagan
    Raanan Dagan Posts: 101 mod

    I've found DNS is just reliable with Stream.

  • Paul Dott
    Paul Dott Posts: 33 ✭✭

    Thanks. So then the 'discovery' config would look something like this?

  • Jon Rust
    Jon Rust Posts: 475 mod

    correct!

  • Paul Dott
    Paul Dott Posts: 33 ✭✭

    great....stay tuned. :smile:

  • Paul Dott
    Paul Dott Posts: 33 ✭✭

    Worked like a charm. Thanks for the guidance on this, much appreciated.
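    For anyone replicating this later, the DNS-based setup amounts to a Splunk Load Balanced destination whose host list is just the one DNS name on the S2S port, with Indexer Discovery left off. Below is a minimal sketch of that shape as a Python dict; the hostname, port, and key names are illustrative assumptions, not the actual config from this thread.

    ```python
    # Minimal sketch of a DNS-based Splunk Load Balanced destination.
    # Hostname, port, and key names are assumptions -- set the real values
    # in the Cribl UI (or via the API, as sketched above).
    splunk_lb_destination = {
        "type": "splunk_lb",
        "indexerDiscovery": False,      # skip Indexer Discovery entirely
        "hosts": [
            {"host": "indexers.example.com", "port": 9997},  # one name, many IPs behind it
        ],
        "dnsResolvePeriodSec": 600,     # re-resolve periodically to pick up indexer changes
    }
    ```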

  • Paul Dott
    Paul Dott Posts: 33 ✭✭

    One follow-up: which backpressure method best suits the Splunk LB destination? The way I understand this setting, if the destination (i.e., the Splunk indexers) cannot receive data:
    » block - buffer in memory on the Cribl workers
    » drop - /dev/null it
    » PQ - queue it on disk on the Cribl workers until the destination is accepting data again

  • Jon Rust
    Jon Rust Posts: 475 mod

    enable PQ

  • Jon Rust
    Jon Rust Posts: 475 mod

    and set up some constraints around the storage

  • Jon Rust
    Jon Rust Posts: 475 mod

    i would enable compression too (not on by default iirc)
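    Pulling those three suggestions together, the destination-side knobs look roughly like the sketch below. Key names are assumptions based on how Cribl's persistent-queue settings are commonly named, so confirm them against your version's UI and docs.

    ```python
    # Hedged sketch of the persistent-queue knobs on the Splunk LB destination.
    # Key names and defaults are assumptions; set the real values in the UI.
    pq_settings = {
        "onBackpressure": "queue",              # use PQ instead of block or drop
        "pqMaxSize": "5GB",                     # cap per Worker Process (see the sizing note below)
        "pqPath": "$CRIBL_HOME/state/queues",   # where queue files land on the worker
        "pqCompress": "gzip",                   # compression is not on by default
    }
    ```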

  • Paul Dott
    Paul Dott Posts: 33 ✭✭

    Reading this (https://docs.cribl.io/stream/persistent-queues/#persistent-queue-details-and-constraints), it looks like the PQs use the workers' storage. Hard to put a number on the amount of storage a worker can allocate for PQ, since it's relative to the amount of data you'd be sending to the destination, right?

  • Jon Rust
    Jon Rust Posts: 475 mod

    Yes: rate of sending * expected downtime / number of workers * expected compression factor

  • Jon Rust
    Jon Rust Posts: 475 mod

    if you were doing 240 GB per day, that's 10 GB per hour; assuming you want to ride out about an hour of downtime, divided by 2 workers == 5 GB per worker; then factor in compression

  • Jon Rust
    Jon Rust Posts: 475 mod

    and add 50% :slightly_smiling_face:
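    As a quick worked version of that napkin math, using the example figures from the thread:

    ```python
    # Napkin math from the thread: per-worker PQ budget =
    #   rate * expected downtime / number of workers * expected compression factor,
    # plus 50% headroom.
    daily_volume_gb = 240        # example figure from the thread
    expected_downtime_hr = 1     # how long you want to ride out an indexer outage
    workers = 2                  # worker nodes sharing the load
    compression_factor = 1.0     # fraction of original size after compression; well under 1 with gzip

    hourly_gb = daily_volume_gb / 24                                        # 10 GB/hr
    per_worker_gb = hourly_gb * expected_downtime_hr / workers * compression_factor
    per_worker_gb *= 1.5                                                    # add 50% headroom
    print(f"~{per_worker_gb:.1f} GB of PQ storage per worker")              # ~7.5 GB with these inputs
    ```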

  • Paul Dott
    Paul Dott Posts: 33 ✭✭

    Napkin math FTW, thanks a lot Jon. I did set it to 5 GB, since our workers are on Fargate and have 20 GB of ephemeral storage by default. And since PQ is a worst-case sort of scenario, i.e. the entire indexer cluster is :dumpsterfire:, it seems like a safe setting.

  • Jon Rust
    Jon Rust Posts: 475 mod

    eggzactly

  • Jon Rust
    Jon Rust Posts: 475 mod

    don't overthink it

  • Jon Rust
    Jon Rust Posts: 475 mod

    or over-provision it

  • Paul Dott
    Paul Dott Posts: 33 ✭✭

    > don't overthink it
    Too late for that :laughing:

  • Raanan Dagan
    Raanan Dagan Posts: 101 mod

    One more note .. Max queue size is per Worker Process (not per Worker Node).
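    Worth factoring into the sizing above: if a budget like 5 GB was worked out per node, divide it by that node's Worker Process count before setting Max queue size. The process count below is just an example; check how many processes your workers actually run.

    ```python
    # Max queue size applies per Worker Process, not per Worker Node, so divide
    # the per-node budget by the process count (4 here is only an example).
    per_node_budget_gb = 5
    worker_processes_per_node = 4
    max_queue_size_gb = per_node_budget_gb / worker_processes_per_node
    print(f"Max queue size ~= {max_queue_size_gb:.2f} GB per process")   # 1.25 GB
    ```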