Performance question on multiple drops vs regex function

Performance question for the team. Which is more performant:

  • a pipeline with 30 different "drop" functions (each one marked final), each looking for a specific pattern within the `message` field, or
  • a lookup with all "droppable" patterns, a single regex function, and a single drop function that drops messages that matched a row in the lookup?

Answers

  • dritan

    Great question, Paimon. Manageability aside, there is only one way to find out :joy: - test it out with some sample data while profiling the pipeline.

  • Sounds good :smile:. Will give it a shot and see what I can discover.

  • dritan

    :+1: - shouldn't be that bad with the embedded profiler now.

  • For sure, I'll have to wait till we hear back from support. We've got an interesting issue that has been brewing over the last month (Case 6086) where worker procs randomly OOM, even after bumping our system up to 64 GB and increasing worker heap to 4 GB, with proc count at 14. I suspect we have something leaking in one of our pipelines. Rob and Mughda are taking a peek, and we just shot over the memory profile from one of the workers that recently had a JVM OOM in our named Slack channel here.

  • dritan

    Ah interesting. Glad to hear you got engineers engaged!
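
For anyone who wants a rough feel for the two approaches in the question before profiling a real pipeline, here is a minimal standalone sketch in Python. It is not the product's pipeline engine, so the numbers will not transfer directly. The patterns, the sample messages, and the function names `keep_with_separate` and `keep_with_combined` are all made up for illustration. It compares checking 30 patterns one after another (stopping at the first match, similar in spirit to a final drop) against a single combined alternation regex.

```python
import re
import timeit

# Hypothetical "droppable" patterns; real ones would come from your pipeline or lookup.
PATTERNS = [rf"noise-{i:02d}\b" for i in range(30)]

# Hypothetical sample messages: noise-00..29 match a pattern, noise-30..39 match none.
MESSAGES = [f"host=web{i % 5} msg=noise-{i % 40:02d} status=ok" for i in range(2000)]

SEPARATE = [re.compile(p) for p in PATTERNS]
COMBINED = re.compile("|".join(f"(?:{p})" for p in PATTERNS))


def keep_with_separate(messages):
    """Approach 1: test each pattern in turn; any() stops at the first match."""
    return [m for m in messages if not any(rx.search(m) for rx in SEPARATE)]


def keep_with_combined(messages):
    """Approach 2: one alternation regex, one drop decision per message."""
    return [m for m in messages if not COMBINED.search(m)]


if __name__ == "__main__":
    # Both approaches should keep exactly the same messages.
    assert keep_with_separate(MESSAGES) == keep_with_combined(MESSAGES)
    for fn in (keep_with_separate, keep_with_combined):
        seconds = timeit.timeit(lambda: fn(MESSAGES), number=20)
        print(f"{fn.__name__}: {seconds:.3f}s for 20 passes")
```

Treat the timings as a sanity check only. The result depends heavily on the regex engine, the shape of the patterns, and how often messages match early, so profiling the actual pipeline with representative sample data is still the way to get a real answer.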