Data collected using the REST Collector is getting appended into a single file. How do I resolve this?

TL;DR: JSON is collected from an API and parsed through a Pipeline. I expect three JSON files in S3, but I get two, one of which contains two JSON objects appended together. I need help finding the reason.

Hi All,

First, I want to say that, as far as I understand, this is not an event breaker issue. Let me explain.

I am trying to get some data from a REST API endpoint using the REST Collector. I am using discovery with an Item List, and the list contains three values.

In preview, three separate events are registered, but only sometimes. At other times, nothing shows up in the preview at all.

I parse this response using a Pipeline and send the data to an S3 bucket using a Route. I followed the documentation to configure the Destination as well as the Pipeline and Route.

Next, I set up the scheduler to run the REST Collector. After each collection run, when I check my S3 bucket, instead of three separate JSON files for the three separate events, there are two JSON files: one with one JSON object, and another with two JSON objects appended one after another (appended, not comma-separated or anything). Can anyone tell me why this might be happening and how to mitigate it?

PS: I tried both the System Default event breaker and the Cribl no break Event Breaker ruleset, with the same issue.
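
For anyone who needs to read the file in the shape described above (JSON objects appended back to back, with no commas or enclosing array), here is a minimal Python sketch that splits such content into individual objects. It does not explain why the two events end up in one file. It only assumes the file holds valid JSON values written one after another, and the function name and sample values are made up for illustration.

```python
import json


def split_concatenated_json(text):
    """Return every JSON value found back to back in `text`."""
    decoder = json.JSONDecoder()
    values = []
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        # raw_decode returns the parsed value and the index where it ended.
        value, pos = decoder.raw_decode(text, pos)
        values.append(value)
    return values


# Hypothetical sample: two objects appended one after another.
sample = '{"fdcId": 1, "description": "first"}\n{"fdcId": 2, "description": "second"}'
events = split_concatenated_json(sample)
print(len(events))
```

The same function also handles objects written with no separator at all, so it works whether or not the writer put a newline between events.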

Answers

  • Ralph Nowitzki

    Can you share the Pipeline and Output configs, and maybe an example of the JSON?

  • Ralph Nowitzki

    I tried to recreate the problem, but failed. I can't test end to end, because I don't have access to an S3 store right now.

    One thing I noticed when comparing your files with what the Pipeline outputs: you have 2 JSON objects in your "CriblOut_test" file, but the one that has "two JSON Objects" is the same object that comes from the API. It just has a second level of object for "food Attributes".

    There is one object missing from the output (fdcId = 534358 / Nut ' Berry Mix), and I don't see a reason for that within the Pipeline. Or is the "CriblOut_test" file just the one with the 2 events?

  • Swapnamay Halder

    The file with Nut and Berry Mix is a standalone JSON file, and that one is working as intended, so I did not add it here.

  • Ralph Nowitzki

    I don't see a cause in your Cribl configs. I was able to add all your elements and run the Collector against the API, and the events came out as 3 separate events.

    It does not really seem to be related, as the events break fine within Cribl, but just to make sure, I would add a dedicated Event Breaker (ndjson?).

    Sorry, can't help any further as I can't test the S3 part.


  • Swapnamay Halder

    Yeah, but the issue is still happening. I will check some other settings then.