• Docker statistics – a way to investigate performance

    Hi,

    I wish it were mine, but it isn't. Here is a quite good article from this week's newsletter related to container stats from Docker containers:

    Analyzing Docker container performance with native tools

    Wish you an enjoyable read.

    Cheers!

  • Kafka limits implementation using puppet

    Morning,

    As promised, here are the two simple blocks needed to implement the limits we discussed in the article http://log-it.tech/2017/10/16/ubuntu-change-ulimit-kafka-not-ignore/

    For the limits module you can use:
    https://forge.puppet.com/puppetlabs/limits

    As for the actual Puppet implementation, I decided not to restart the service immediately. That said, it's dead simple to do:

    file_line { "add_pamd_record":
      path => '/etc/pam.d/common-session',
      line => 'session required pam_limits.so',
    }
    limits::fragment {
      "*/soft/nofile":
        value => "100000";
      "*/hard/nofile":
        value => "100000";
      "kafka/soft/nofile":
        value => "100000";
      "kafka/hard/nofile":
        value => "100000";
    }
    

    This is all you need.

    Cheers

  • Ubuntu – change ulimit for kafka, do not ignore

    Hi,

    I want to share something that took me half a day to clarify. I read the following article https://docs.confluent.io/current/kafka/deployment.html#file-descriptors-and-mmap
    and learned that in order to optimize Kafka, you also need to raise the maximum number of open files. That is fine, but our clusters are deployed on Ubuntu and the images are pretty basic. I'm not sure this applies to all distributions, but for this one it's absolutely needed.
    Before trying to set up anything in

    /etc/security/limits.conf

    make sure that you have added in

    /etc/pam.d/common-session

    the line

    session required pam_limits.so

    It is needed in order for ssh and su sessions to pick up the new limits for that user (in our case kafka).
    Doing this lets the values defined in the limits file take effect. You are now free to set up the nofile limit, for example like this:

    *        soft    nofile    10000
    *        hard    nofile    100000
    kafka    soft    nofile    10000
    kafka    hard    nofile    100000

    Once that is done, you can restart the cluster and check the value by finding the process with ps -ef | grep kafka and viewing the limits file using cat /proc/[kafka-process]/limits.
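    The check above can also be scripted; here is a minimal sketch (the pgrep pattern "kafka" is just an example, substitute whatever your process is called):

```shell
#!/bin/sh
# Print the open-files limit of a running process, read from /proc.
# Pattern and fallback are illustrative; adapt them to your setup.
pid=$(pgrep -f kafka | head -n 1)
pid=${pid:-$$}   # fall back to the current shell just to demonstrate the output
grep "Max open files" "/proc/$pid/limits"
```

    The "Max open files" line shows the soft and hard limits side by side, so you can confirm both values took effect.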

    I will come back later with a Puppet implementation for this as well.

    Cheers!

  • Kafka implementation using puppet at IMWorld Bucharest 2017

    Hi,

    I recently gave a presentation on how to deploy Kafka using Puppet and what you need as a minimum in order to have success in production.
    Here is the presentation:

    Hope it is useful.

    Cheers!

    Update:

    There is also an official version from IMWorld which you can find here:

    And also the article on medium.com that describes it in more technical detail:

    https://medium.com/@sorin.tudor/messaging-kafka-implementation-using-puppet-5438a0ed275d

  • Definitive guide to Kafka, confluent edition

    Hi,

    No technical details today. I just wanted to share with you the Definitive Guide to Kafka, a book provided by our dear and esteemed colleagues from Confluent:

    https://www.confluent.io/wp-content/uploads/confluent-kafka-definitive-guide-complete.pdf

    Thank you, it should be an interesting read.

    Cheers!

  • Eyaml hiera configuration for puppet, as promised

    Morning,

    We also managed to configure the Hiera backend so that the eyaml module is active. This is related to the following past article http://log-it.tech/2017/05/29/install-eyaml-module-on-puppet-master/. In hiera.yaml you basically need to add the following configuration before the hierarchy:

    :backends:
      - eyaml
      - yaml
      - puppetdb
    

    and

    :eyaml:
        :datadir: /etc/puppetlabs/hieradata
        :pkcs7_private_key: /etc/puppetlabs/puppet/eyaml/private_key.pkcs7.pem
        :pkcs7_public_key:  /etc/puppetlabs/puppet/eyaml/public_key.pkcs7.pem
        :extension: 'yaml'
    

    at the bottom. After this is done, the most essential part is creating the required symlinks so that the backend is enabled.
    This can be done easily with a bash script like:

    #!/bin/bash
    ln -s /opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/hiera-eyaml-2.1.0/lib/hiera/backend/eyaml /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/hiera/backend/eyaml
    ln -s /opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/hiera-eyaml-2.1.0/lib/hiera/backend/eyaml_backend.rb /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/hiera/backend/eyaml_backend.rb
    ln -s /opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/hiera-eyaml-2.1.0/lib/hiera/backend/eyaml.rb /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/hiera/backend/eyaml.rb
    ln -s /opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/highline-1.6.21/lib/highline /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/highline/
    ln -s /opt/puppetlabs/puppet/lib/ruby/gems/2.1.0/gems/highline-1.6.21/lib/highline.rb /opt/puppetlabs/puppet/lib/ruby/vendor_ruby/highline.rb

    After this is done, a puppetdb and puppetserver restart is advised, and you can test it by putting a string in Hiera and seeing whether a notice prints the required output. Something like

    profiles::test::teststring: '[string generated with eyaml encrypt -s 'test']'

    and then creating a small class like:

    
    class profiles::test {
      $teststring = hiera('profiles::test::teststring')
      notice("${teststring}")
    }

    That should be most of what you need in order to do this. Hope it works! 🙂

    Cheers!

  • Python dictionary construction from process list

    Hi,

    This is outside my expertise, but I wanted to share it anyway. A colleague wanted help creating key:value pairs in Python from a command that lists processes. With a little bit of testing I came to the following form:

    
    import subprocess
    from subprocess import PIPE

    # list processes with their pid and owning user
    ps = subprocess.Popen(['/bin/ps', '-eo', 'pid,uname'], stdout=PIPE, stderr=PIPE)
    lines = ps.stdout.read().decode().split('\n')

    processes = {}
    for line in lines[1:]:  # skip the "PID UNAME" header row
        if line != '':
            fields = line.split()
            pid = fields[0]
            user = fields[1]
            processes[pid] = user
    print(processes)
    

    Now, I think there are better ways to write this, but it works this way as well.
    If you find better ways, please leave a message 🙂
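    One slightly more compact alternative, for what it's worth, is a dict comprehension over the same ps output (a sketch of the same idea, not necessarily the best way either):

```python
import subprocess

# Run ps and build a {pid: user} mapping in one pass.
# The column choice (pid,uname) matches the example above.
output = subprocess.check_output(['ps', '-eo', 'pid,uname']).decode()
processes = {
    pid: user
    for pid, user in (line.split() for line in output.splitlines()[1:] if line.strip())
}
print(processes)
```

    check_output raises an exception on a non-zero exit code, which saves you the manual stderr handling.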

    Cheers

  • Kafka problem that wasn’t a problem after all

    Hi,

    Do not make the mistake I made over the last couple of weeks trying to connect to a “secured” Kafka cluster using TLS. I wrote the following article http://log-it.tech/2017/07/27/configure-kafka-truststore-keystore-using-puppet/ some time ago, and I know it's far from bulletproof, but it does the job.
    Now to the subject: once this is activated, you can no longer use localhost to connect to the node. The way I figured it out was by testing the port with the openssl command.
    The config in server.properties is

    'listeners'                     => "PLAINTEXT://${::fqdn}:9092,SSL://${::fqdn}:9093", #both listeners are enabled
    'advertised.listeners'          => "PLAINTEXT://${::fqdn}:9092,SSL://${::fqdn}:9093",

    So please keep in mind that it's configured to listen on the FQDN, which normally means the external interface is the target, not the loopback adapter.
    If you try to test it using localhost, you will surely get this output:

    /opt/kafka/bin# openssl s_client -debug -connect localhost:9093 -tls1
    connect: Connection refused
    connect:errno=111

    Do not waste time checking whether the firewall or port is open; you can easily verify that using iptables -L or netstat -tulpen | grep 9093. The problem is that instead of localhost you should be using the FQDN, as in openssl s_client -debug -connect ${fqdn}:9093 -tls1, and then you will see a lot of keys/certificates.
    Now, if you want to use the standard .sh scripts delivered with the Kafka installation, you should create a file called config.properties (for example) and pass it as a parameter. For a ZooKeeper connection (with the --zookeeper parameter) this is not needed, but if you want to start a console consumer or producer, or check the consumer groups, it is. Let me give you an example:

    /opt/kafka/bin# ./kafka-consumer-groups.sh --command-config /root/config.properties --bootstrap-server ${fqdn}:9093 --list
    Note: This will only show information about consumers that use the Java consumer API (non-ZooKeeper-based consumers).
    
    console-consumer-30514
    KMOffsetCache-kafka2
    KMOffsetCache-kafka0
    KMOffsetCache-kafka1
    

    Otherwise, it will not work. My config file looks like this:

    security.protocol=SSL
    ssl.truststore.location=/home/kafka/kafka.client.truststore.jks
    ssl.truststore.password=password
    ssl.keystore.location=/home/kafka/kafka.client.keystore.jks
    ssl.keystore.password=password
    ssl.key.password=password
    

    I cannot give you all the details for all the commands, but I am confident I have put you on the right track.

    Cheers

  • Configure Jupyter Notebook on Raspberry PI 2 for remote access and scala kernel install

    Hi,

    This is a continuation of the previous article regarding Jupyter Notebook (http://log-it.tech/2017/09/02/installing-jupyter-notebook-raspberry-pi-2/). Let's start with my modifications to allow a remote connection to it. It first needs a password in the form of a password hash. To generate this, run the Python CLI and execute from IPython.lib import passwd; passwd("your_custom_password"). Once you have the password hash, here are the fields I uncommented to activate minimal remote access:
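    If you are curious what passwd() actually produces, the scheme is a salted hash in the form algorithm:salt:digest. Here is an illustrative stdlib-only sketch of that scheme (not the canonical IPython implementation, so prefer the real passwd() when generating the hash you actually deploy):

```python
import hashlib
import random

def notebook_passwd(passphrase, algorithm='sha1'):
    """Build a salted hash in the 'algorithm:salt:digest' form used by
    the notebook's password option. Illustrative sketch only."""
    salt = '%012x' % random.getrandbits(48)  # 12 hex characters of salt
    h = hashlib.new(algorithm)
    h.update(passphrase.encode('utf-8') + salt.encode('ascii'))
    return ':'.join((algorithm, salt, h.hexdigest()))

print(notebook_passwd('your_custom_password'))
```

    The random salt means the same passphrase hashes differently each time, which is why you copy the generated string rather than recompute it.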

    c.NotebookApp.open_browser = False #do not open a browser on notebook start, you will access it remotely via the daemon
    c.NotebookApp.ip = '*' #permit access on every interface of the server
    c.NotebookApp.password = u'[your_pass_hash]' #set a password to access the notebook; otherwise the token from the server is required (you can get the token by running sudo systemctl status jupyter.service)

    You can also just add them at the bottom of the file. For the changes to take effect you will need to restart the service with sudo systemctl restart jupyter.service.

    You now have the basic steps to run Jupyter Notebook with the IPython 2 kernel. Let's get to the next step of installing the Scala kernel (https://www.scala-lang.org).

    The steps are pretty straightforward and are taken from this link https://www.packtpub.com/mapt/book/big_data_and_business_intelligence/9781785884870/9/ch09lvl1sec65/installing-the-scala-kernel ; what I tried is to put them end to end. I am not 100% sure whether you also need Java 8, but I installed it anyway; you will find the steps here https://www.raspinews.com/installing-oracle-java-jdk-8-on-raspberry-pi/ . What you really need to install is sbt.

    The catch here is that you don't need to search for a Raspberry-specific sbt; the default one will do the job. The steps are listed here http://www.scala-sbt.org/release/docs/Installing-sbt-on-Linux.html. Once it is installed, you can return to the link above and just run the steps:

    apt-get install git
    git clone https://github.com/alexarchambault/jupyter-scala.git
    cd jupyter-scala
    sbt cli/packArchive

    sbt will grab a lot of dependencies. If you work behind a proxy I am not aware of the settings you need, but you can search for them and probably find a solution. Have patience, it will take a while, but once it's done you can run ./jupyter-scala to install the kernel and check that it works with jupyter kernelspec list.

    Restart Jupyter Notebook to pick it up, although I am not convinced it's necessary 🙂
    In my case I have a dynamic DNS service from my internet provider, but I think you can do it with a free DNS provider on your router as well. An extra forward or NAT of port 8888 will be needed, but once this is done you should have a playground in your browser that knows Python and Scala. Cool, isn't it?

    Cheers

  • Installing Jupyter Notebook on Raspberry PI 2

    Morning,

    Just want to share that I managed to install Jupyter Notebook (http://jupyter.org) on a Raspberry PI 2 without any real problems. Besides a microSD card and a Raspberry, reading this is all you need.
    You will need a Raspbian image from https://www.raspberrypi.org/downloads/raspbian/ (I selected the lite version without the GUI; you really don't need that). I wrote it to the card from Linux, so I executed a command similar to dd if=[path_to_image]/[image_name] of=[sd_device_name taken from fdisk -l without partition id, usually /dev/mmcblk0] bs=4MB; sync. The sync is added just to be sure that all data is flushed to the card before removing it. We now have a working image, so it's fair to try booting it.
    Once it's booted, log in with user pi and password raspberry. I am a fan of running the resize steps which you can find here https://coderwall.com/p/mhj8jw/raspbian-how-to-resize-the-root-partition-to-fill-sd-card.
    OK, so we are good to go on installing Jupyter Notebook. First we need to check which Python version is installed; in my case it was 2.7.13 (shown by running python --version). We will need pip for this task, and it's not present by default on the image.
    Run sudo apt-get install python-pip, and after it finishes run pip install jupyter. It will take some time, but when it is done you will have a fresh installation in the pi homedir (/home/pi/.local).
    We also need a service; to create it, add the following file at this path:
    /usr/lib/systemd/system/jupyter.service

    [Unit]
    Description=Jupyter Notebook
    
    [Service]
    Type=simple
    PIDFile=/run/jupyter.pid
    ExecStart=/home/pi/.local/bin/jupyter-notebook --config=/home/pi/.jupyter/jupyter_notebook_config.py
    User=pi
    Group=pi
    WorkingDirectory=/home/pi/notebooks
    Restart=always
    RestartSec=10
    #KillMode=mixed
    
    [Install]
    WantedBy=multi-user.target

    You are probably wondering where you get the config file from. That's easy: just run /home/pi/.local/bin/jupyter notebook --generate-config

    After the file is created, enable and start the service with sudo systemctl enable jupyter.service and sudo systemctl start jupyter.service.

    You now have a fresh and auto-managed Jupyter service. It listens only on localhost by default, but in the next article I will describe the modifications needed to access it remotely and also to install the Scala kernel.

    Cheers!