How to integrate Cribl with Databricks (Destination)?

As per the Cribl docs, "the S3 Destination can be adapted to send data to services like Databricks". So, can we use the pre-configured S3 Destination connector to integrate with Databricks? If yes, what changes are required in the S3 Destination?

Answers

  • Kam Amir
    Kam Amir Posts: 21

Yes, you can send the data either as raw JSON or convert it to Parquet for Databricks to ingest.
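To make the "raw JSON" option concrete, here is a minimal sketch of the newline-delimited JSON (NDJSON) layout that object-store destinations typically write, and how a consumer would parse it back. The event field names (`_time`, `host`, `_raw`) are illustrative, not a fixed Cribl schema.

```python
import json

# Illustrative events, loosely shaped like Cribl Stream output
# (field names here are hypothetical examples).
events = [
    {"_time": 1700000000, "host": "web-01", "_raw": "GET /index.html 200"},
    {"_time": 1700000001, "host": "web-02", "_raw": "GET /login 302"},
]

# NDJSON: one JSON object per line, the usual "raw JSON" layout
# for object-store destinations like S3.
ndjson = "\n".join(json.dumps(e) for e in events)

# A downstream reader (e.g. on the Databricks side) parses line by line.
parsed = [json.loads(line) for line in ndjson.splitlines()]
assert parsed == events
```

The same events could instead be written as Parquet, which is the more compact, columnar option discussed below.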

  • Kam Amir
    Kam Amir Posts: 21

<@U03UMRGSKRQ> is working on this internally and he can share the work he's done around this integration.

  • Hey Sheetal - that's right, you can leverage the S3 Destination tile with the format of your choosing (Parquet or JSON); once you load the data into a DataFrame, you can convert it to Delta format.
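  • The load-and-convert step above might look roughly like this on the Databricks side. This is only a sketch for a Databricks notebook with an active Spark session; the bucket path and table name are placeholders, not values from this thread.

    ```python
    # Load the Parquet files Cribl wrote to the object store
    # (path is a hypothetical example).
    df = spark.read.parquet("s3://my-cribl-bucket/events/")

    # Persist the DataFrame in Delta format so Databricks can query it.
    df.write.format("delta").mode("append").saveAsTable("cribl_events")
    ```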

  • And if you have a chance, the docs (https://docs.cribl.io/stream/parquet-schemas) explain Parquet schemas in Stream. Since the Delta format is built on top of Parquet, you have the option of sending the data as Parquet.
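  • For reference, Parquet schemas are commonly expressed in Parquet's own schema definition language. A minimal illustrative schema for events like the ones above (field names are hypothetical, not from the Cribl docs) could look like:

    ```
    message event {
      required int64 _time;
      optional binary host (UTF8);
      required binary _raw (UTF8);
    }
    ```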

  • Neil Parisi
    Neil Parisi Posts: 12 mod

Hi <@U01BVH0E872> and Jamie, thanks very much for the insights. This is for a customer POC; I'd like to put together a story here about how a Cribl integration with Databricks can make a big difference. It would be great if you could share your thoughts/suggestions.

  • Kam Amir
    Kam Amir Posts: 21

Sure, the story is pretty straightforward: Cribl takes data from traditional sources (syslog, agents, etc.), routes it to an object store, and transforms it into Parquet for Databricks to consume.