Wrong Kafka configuration and deployment using Puppet


I just want to share with you one case that we had last week, involving a wrong Kafka deployment from Puppet that filled the filesystems and affected our colleagues who were using it as a transport layer for ELK.

Let's start with the beginning: we have some Puppet code to deploy and configure Kafka machines. To keep it simple, the broker config block from Puppet looks like this:

$broker_config = {
    # broker specific config
    'broker.id'                     => '-1', # always set to -1.
    'zookeeper.connect'             => hiera('::kafka::zookeeper_connect', $zookeeper_connect),
    'inter.broker.protocol.version' => hiera('::kafka::inter_broker_protocol_version', $kafka_version),
    'log.dir'                       => hiera('::kafka::log_dir', '/srv/kafka-logs'),
    'log.dirs'                      => hiera('::kafka::log_dirs', '/srv/kafka-logs'),
    'log.retention.hours'           => hiera('::kafka::log_retention_hours', $days7),
    'log.retention.bytes'           => hiera('::kafka::log_retention_bytes', '-1'),
    # configure availability
    'num.partitions'                => hiera('::kafka::num_partitions', 256),
    'default.replication.factor'    => hiera('::kafka::default_replication_factor', $default_replication_factor),
    # configure administratability (this is a word now)
    'delete.topic.enable'           => hiera('::kafka::delete_topic_enable', 'true'),
}

As you can see, there are two fields that need to be carefully configured: one is log_retention_bytes and the other is num_partitions. I am underlining this for a very simple reason: if we take a look at the Kafka server.properties file which the engine uses to start the broker, we will see something like:

    log.retention.bytes=107374182400
    num.partitions=256

Now, the first one is the default value (and it means the maximum retained size per partition), and if you put it in an online converter you will see that it means 100GB. This shouldn't be a problem if you override it per topic, and even less so if you manually create a topic with one or two partitions and you have the required space. But our case was different: they were using it for Logstash, which created a topic with the default configuration of 256 partitions, and if the maximum size per partition was 100GB then it thought it had 25600GB available, which, if I am not mistaken, translates to about 25TB (yeah, that's usually more than enough for an ELK instance).

Since we were running only three nodes with a 100GB filesystem each, you can imagine that the disks filled up and the servers crashed.
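The mismatch is easy to see with some back-of-the-envelope shell arithmetic (the numbers are the ones from the incident above):

```shell
# Worst-case retention with the default config vs. what we actually had.
partitions=256
per_partition_gb=100                          # log.retention.bytes = 100GB
total_gb=$((partitions * per_partition_gb))
echo "worst case retention: ${total_gb} GB (~$((total_gb / 1024)) TB)"

# Three nodes with 100GB filesystems each:
echo "available storage:    $((3 * 100)) GB"
```

So the brokers could legitimately try to retain roughly 85 times more data than the cluster could hold.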

Now there are two ways to avoid this, depending on the number of consumers in the consumer group and on the retention period. If you have only one consumer and you want to keep data for a longer period of time, you can leave the limit at 100GB (if you have that much storage available) and manually create a topic with 1 partition and a replication factor equal to the number of nodes that you want to use for high availability. Alternatively, you can leave the partition number at 256 and greatly decrease the retention bytes value. The second option gives you an equal distribution of data across multiple consumers, but it comes with a shorter storage period (it is also true that this depends on the amount and type of data that you are transferring).
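For the first option, the topic has to be created by hand before Logstash starts writing to it. A sketch with the ZooKeeper-based admin tooling of that Kafka generation (the topic name and ZooKeeper address here are assumptions, not the real ones from our setup):

```shell
# Hypothetical: create the Logstash topic manually with a single
# partition, replicated across all three brokers.
kafka-topics.sh --create \
  --zookeeper zk1:2181 \
  --topic logstash \
  --partitions 1 \
  --replication-factor 3
```

With auto.create.topics.enable left on, any topic created implicitly will still pick up the 256-partition default, so this only helps if the topic exists before the first write.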

The solution that we adopted was to leave the partition number unchanged and decrease the retained size per partition to 200MB. I will keep you informed on how it works. 🙂
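Given the Puppet block shown earlier, the fix itself is just a Hiera override of the retention key; a sketch of the data entry (the value in bytes is 200MB):

```yaml
# hypothetical hiera data entry; 209715200 bytes = 200MB per partition
::kafka::log_retention_bytes: '209715200'
```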



Install the eyaml module on the Puppet master


Today I will show you how I installed the module used for data encryption, in order to safely include secrets in Hiera yaml files.
It is really simple, as described in the module's documentation. The actual step that I couldn't find explicitly written in the docs, and had to figure out myself, is that you need to modify the config.yaml needed by the module.

  1. gem install hiera-eyaml
  2. puppetserver gem install hiera-eyaml
  3. eyaml createkeys
  4. mv ./keys /etc/puppetlabs/puppet/eyaml
  5. $ chown -R puppet:puppet /etc/puppetlabs/puppet/eyaml
    $ chmod -R 0500 /etc/puppetlabs/puppet/eyaml
    $ chmod 0400 /etc/puppetlabs/puppet/eyaml/*.pem
    $ ls -lha /etc/puppetlabs/puppet/eyaml
    -r-------- 1 puppet puppet 1.7K Sep 24 16:24 private_key.pkcs7.pem
    -r-------- 1 puppet puppet 1.1K Sep 24 16:24 public_key.pkcs7.pem
  6. vim /etc/eyaml/config.yaml and add the following content:
    pkcs7_private_key: '/etc/puppetlabs/puppet/eyaml/private_key.pkcs7.pem'
    pkcs7_public_key: '/etc/puppetlabs/puppet/eyaml/public_key.pkcs7.pem'

If the last step is not executed, you will get the error: [hiera-eyaml-core] No such file or directory - ./keys/public_key.pkcs7.pem

After these configurations you should be able to encrypt files or strings. Short example:

eyaml encrypt -s 'test'
[hiera-eyaml-core] Loaded config from /etc/eyaml/config.yaml
string: ENC[PKCS7,MIIBeQYJKoZIhvcNAQcDoIIBajCCAWYCAQAxggEhMIIBHQIBADAFMAACAQEwDQYJKoZIhvcNAQEBBQAEggEAvWHMltzNiYnp0iG6vl6tsgayYimoFQpCFeA8wdE3k6h2OGZAXHLOI+ueEcv+SXVtOsqbP2LxPHe19zJS9cLV4tHu1rUEAW2gstkImI4FoV1/SoPrXNsBBXuoG3j7R4NGPpkhvOQEYIRTT9ssh9hCrzkEMrZ5pZDhS4lNn01Ax1tX99NdmtXaGvTTML/kV061YyN3FaeztSUc01WwpeuHQ+nLouuoVxUUOy/d/5lD5wLKq9t8BYeFG6ekq/D9iGO6D/SNPB0UpVqdCFraAN7rIRNfVDaRbffCSdE59AZr/+atSdUk9cI0oYpG25tHT9x3eWYNNeCLrVAoVMiZ01uR7zA8BgkqhkiG9w0BBwEwHQYJYIZIAWUDBAEqBBBHO9P8JfkovKLMdtvaIxAzgBAjiu0/l+Hm+Xaezhp2AWjj]
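Once the keys and config.yaml are in place, the same binary can also decrypt and edit values; a quick sketch (the filename and the truncated ciphertext are placeholders):

```shell
# Decrypt a single encrypted string (paste your own ENC[PKCS7,...] blob):
eyaml decrypt -s 'ENC[PKCS7,...]'

# Or edit a whole hiera data file in place; eyaml transparently
# decrypts the values on open and re-encrypts them on save:
eyaml edit secrets.eyaml
```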


eyaml also prints the same ciphertext in block form, introduced by block: >, which you can paste as a multi-line value in your yaml files.

I will write something similar about the Hiera configuration needed to use this as a backend.



Small Vagrant config file for Rancher deploy


I just wanted to post this as well: if the config using a jumpserver is not that nice, we can surely convert it to code (Puppet/Ansible), and you can also use Vagrant. The main issue that I faced when I tried to create my setup is that, for some reason I'm not really sure about, Vagrant on Windows runs very slowly. However, here is a piece of a Vagrantfile for a minimal setup on which you can run the Rancher server framework and also the client containers.

Here it is:

# -*- mode: ruby -*-
# vi: set ft=ruby :
Vagrant.configure("2") do |config|
  config.vm.define "master" do |master|
    master.vm.box      = "centos/7"
    master.vm.hostname = 'master'
    master.vm.network "public_network", bridge: "enp0s25"
  end
  config.vm.define "slave" do |slave|
    slave.vm.box      = "centos/7"
    slave.vm.hostname = 'slave'
    slave.vm.network "public_network", bridge: "enp0s25"
  end
  config.vm.define "swarmmaster" do |swarmmaster|
    swarmmaster.vm.box      = "centos/7"
    swarmmaster.vm.hostname = 'swarmmaster'
    swarmmaster.vm.network "public_network", bridge: "enp0s25"
  end
  config.vm.define "swarmslave" do |swarmclient|
    swarmclient.vm.box      = "centos/7"
    swarmclient.vm.hostname = 'swarmclient'
    swarmclient.vm.network "public_network", bridge: "enp0s25"
  end
end


Do not worry about the naming of the machines; you can change them to whatever you like. The main catch is to bridge the public network on all of them, so that they can communicate with each other and also reach the Docker Hub. Besides that, everything else that I posted regarding registering to the Rancher framework is still valid.
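Once the VMs are up, getting the Rancher 1.x server running is just one container; a minimal sketch, assuming Docker is already installed on whichever machine you pick as the server (the command follows the Rancher quick-start convention):

```shell
# On the machine chosen as the Rancher server; the UI then listens on :8080
docker run -d --restart=unless-stopped -p 8080:8080 rancher/server
```

The agent containers on the other machines are then registered from the UI, as described in my previous post.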

Thank you for your time,