I want to share something that took me half a day to figure out. I just read the following article https://docs.confluent.io/current/kafka/deployment.html#file-descriptors-and-mmap
and learned that in order to optimize Kafka, you also need to raise the maximum number of open files. That is fine, but our clusters are deployed on Ubuntu and the images are pretty basic. I am not sure whether this applies to every distribution, but on this one it is absolutely needed.
Before trying to set up anything in the limits file,
make sure that your PAM session configuration (on Ubuntu, typically /etc/pam.d/common-session) contains:
session required pam_limits.so
This is needed so that ssh and su sessions pick up the new limits for that user (in our case, kafka).
With that in place, the values you define in the “limits” file will actually take effect. You are now free to set the nofile limit, for example like this:
* soft nofile 10000
* hard nofile 100000
kafka soft nofile 10000
kafka hard nofile 100000
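To sanity-check that the soft and hard limits actually apply, open a fresh login session as the kafka user and query ulimit. A minimal sketch (the example values in the comments are just the numbers from the entries above; your system may report different ones):

```shell
# Run in a NEW login shell for the kafka user: pam_limits.so is only
# applied at session start, so an already-open shell won't see the change.
ulimit -Sn   # soft limit on open files, e.g. 10000
ulimit -Hn   # hard limit on open files, e.g. 100000
```

The soft limit is what a process starts with; it can raise itself up to the hard limit without extra privileges.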
Once that is done, you can restart the cluster and check the value by finding the process with ps -ef | grep kafka and viewing its limits file with cat /proc/[kafka-process]/limits.
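That manual check can also be scripted. A small sketch below uses the current shell's PID as a stand-in so it runs anywhere; on a real broker host you would substitute the Kafka PID (for instance via pgrep -f kafka, which I am assuming matches your broker process name):

```shell
#!/bin/sh
# Print the effective open-files limit for a given process via /proc.
# PID defaults to the current shell ($$) so the script is self-contained;
# on a Kafka host you would use: PID=$(pgrep -f kafka | head -n 1)
PID=${1:-$$}
grep 'Max open files' "/proc/$PID/limits"
```

Reading /proc/PID/limits shows what the running process actually got, which is the real test; checking ulimit in your own shell only tells you about your session, not the broker's.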
I will come back later with a Puppet implementation for this as well.