We want to enable persistent queuing. How large can we make the max file size?
We are enabling persistent queuing (PQ) on our destinations and noticed that the Max File Size defaults to 1 MB. With the amount of data we are processing, that will lead to A LOT of small files, so we were thinking of increasing the Max File Size to 100 MB. However, before we do that, we wanted to know if there is any potential impact on Cribl if the PQ files are too large?
When Cribl is reading these files to send to the destination, does it read the entire file into memory?
Best Answer
I recommend either sticking with 1 MB or compromising at, say, 10 MB, but the smaller the better. The reason is that we can't delete a file until all of its data has been flushed, so the smaller the file, the sooner all of its data is flushed and the sooner we can free the storage it consumes. The app won't flush faster because the file is smaller, but it can remove a file sooner if it's smaller.
Also note that worker processes write to their own files, so any given file is only ever written to by one process.
Offhand, I'm unsure whether each PQ file is loaded into memory when its data is sent to the destination. However, even if it is, keep in mind that Stream reads the files sequentially, so it'll only be reading from one file at a time for any given destination and worker process combination.
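To put rough numbers on the storage point, here is a minimal sketch of the deletion rule described above. It is a simplified model, not Cribl's actual implementation: the function name `reclaimable_mb` is made up for illustration, and it assumes files are flushed oldest-first, every file except possibly the last is full, and a file is deleted only once all of its data has been flushed.

```python
def reclaimable_mb(backlog_mb: int, flushed_mb: int, max_file_mb: int) -> int:
    """Disk space (MB) that can be freed after flushing `flushed_mb` of a backlog.

    Hypothetical model: a queue file is removed only when fully flushed,
    so a partially flushed file still holds its full size on disk.
    """
    flushed_mb = min(flushed_mb, backlog_mb)
    if flushed_mb == backlog_mb:
        # Everything has been flushed, so every file can be deleted.
        return backlog_mb
    # Only completely flushed files can be deleted.
    fully_flushed_files = flushed_mb // max_file_mb
    return fully_flushed_files * max_file_mb


if __name__ == "__main__":
    # Example figures (assumed): 250 MB queued, 95 MB flushed so far.
    for size in (1, 10, 100):
        freed = reclaimable_mb(backlog_mb=250, flushed_mb=95, max_file_mb=size)
        print(f"max file size {size:>3} MB -> {freed} MB reclaimable")
```

Under these assumptions, after flushing 95 MB of a 250 MB backlog, 95 MB is reclaimable with 1 MB files, 90 MB with 10 MB files, and nothing yet with 100 MB files. The flush rate is the same in all three cases; only the point at which disk space comes back differs.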