• Order Linux processes by memory usage

    This one is more for me, actually. We have some issues with one Puppet instance on which the processes fail, and I wanted to see if there is a way to order them by memory usage.

    So I searched the net and found this link: https://unix.stackexchange.com/questions/92493/sorting-down-processes-by-memory-usage

    The command is:

    ps aux --sort -rss | head -10

    And it gives you the following output, at least in my case:

    USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
    puppet    6327 70.1 25.5 3585952 1034532 ?     Sl   06:53   7:33 /usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java -XX:OnOutOfMemoryError=kill -9 %p -Djava.security.egd=/dev/urandom -javaagent:/usr/share/java/jolokia-jvm-agent.jar=port=8778 -Xms1024m -Xmx1024m -cp /opt/puppetlabs/server/apps/puppetserver/puppet-server-release.jar clojure.main -m puppetlabs.trapperkeeper.main --config /etc/puppetlabs/puppetserver/conf.d -b /etc/puppetlabs/puppetserver/bootstrap.cfg
    jenkins   6776  9.6 16.6 4648236 671980 ?      Sl   06:55   0:51 /usr/bin/java -Djava.awt.headless=true -javaagent:/usr/share/java/jolokia-jvm-agent.jar=port=8780 -Xms1024m -Xmx1024m -jar /usr/share/jenkins/jenkins.war --webroot=/var/cache/jenkins/war --httpPort=8080 --httpListenAddress=127.0.0.1
    puppetdb  5987 16.8 11.7 3845896 474164 ?      Sl   06:52   2:01 /usr/bin/java -XX:OnOutOfMemoryError=kill -9 %p -Djava.security.egd=/dev/urandom -Xmx192m -javaagent:/usr/share/java/jolokia-jvm-agent.jar=port=8779 -cp /opt/puppetlabs/server/apps/puppetdb/puppetdb.jar clojure.main -m puppetlabs.puppetdb.main --config /etc/puppetlabs/puppetdb/conf.d -b /etc/puppetlabs/puppetdb/bootstrap.cfg
    postgres  1458  0.0  2.1 249512 88656 ?        Ss   Nov21   3:10 postgres: checkpointer process                                                                                              
    postgres  6206  0.0  1.4 253448 57984 ?        Ss   06:53   0:00 postgres: puppetdb puppetdb 127.0.0.1(36882) idle                                                                           
    postgres  6209  0.0  0.7 252580 29820 ?        Ss   06:53   0:00 postgres: puppetdb puppetdb 127.0.0.1(36886) idle                                                                           
    postgres  6210  0.0  0.5 254892 22440 ?        Ss   06:53   0:00 postgres: puppetdb puppetdb 127.0.0.1(36888) idle                                                                           
    postgres  6213  0.0  0.5 254320 21416 ?        Ss   06:53   0:00 postgres: puppetdb puppetdb 127.0.0.1(36894) idle                                                                           
    postgres  6205  0.0  0.5 253524 20324 ?        Ss   06:53   0:00 postgres: puppetdb puppetdb 127.0.0.1(36878) idle                       

    As you can probably see, the three Java components (Puppet Server, Jenkins and PuppetDB) already hold about 54% of the memory between them. They slowly but surely take more and more, and since the machine has only 4 GB allocated, it will probably crash again.

    If this happens, I will manually increase the memory by another 2 GB and see where we go from there.
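
    To see how much resident memory each user holds in total, the same ps data can be summed per user. This is only a minimal sketch, assuming a procps-style ps; the function name rss_by_user is made up for illustration, and it reads "USER RSS" pairs (RSS in KiB) on stdin so it can be fed directly from ps:

```shell
# rss_by_user (hypothetical helper): reads "USER RSS_KIB" pairs on stdin and
# prints the total resident memory per user in MiB, largest first.
rss_by_user() {
    awk '{ sum[$1] += $2 }
         END { for (u in sum) printf "%s %d\n", u, sum[u] / 1024 }' |
        sort -k2,2nr
}

# Usage on a live system (the trailing "=" suppresses the column headers):
#   ps -e -o user= -o rss= | rss_by_user
```

    The MiB values are truncated, not rounded, so treat them as a rough picture of who is using the machine.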

    Cheers!

  • Golang logging using USER profile on Mint 19

    Hi,

    I committed to learning Golang, and as part of this I came to play with logging examples. It seems that if you use syslog.LOG_USER, the messages are stored in /var/log/syslog.

    Here is the code and also the output:

    package main
    import (
    	"io"
    	"log"
    	"log/syslog"
    	"os"
    	"path/filepath"
    )
    func main() {
    	progname := filepath.Base(os.Args[0])
    	sysLog, err := syslog.New(syslog.LOG_INFO|syslog.LOG_USER, progname)
    	if err != nil {
    		log.Fatal(err)
    	} else {
    		log.SetOutput(sysLog)
    	}
    	log.Println("LOG_INFO + LOG_USER: Logging in Go!")
    	io.WriteString(os.Stdout, "Will you see this?")
    }
    

    The second line (Will you see this?) is printed only to the console.

    Oct 29 14:30:25 mintworkstation logging[4835]: 2018/10/29 14:30:25 LOG_INFO + LOG_USER: Logging in Go!
    Oct 29 14:30:25 mintworkstation logging[4835]: 2018/10/29 14:30:25 LOG_INFO + LOG_USER: Logging in Go!
    

    P.S.: I managed to find a config file under /etc/rsyslog.d, called 50-default.conf.
    In this file there is a commented-out line:

    #user.*				-/var/log/user.log
    

    If you uncomment it and restart the service with systemctl restart rsyslog, the output will go to /var/log/user.log:

    Oct 29 14:48:32 mintworkstation NetworkManager[836]:   [1540817312.1683] connectivity: (enp0s31f6) timed out
    Oct 29 14:49:37 mintworkstation gnome-terminal-[2196]: g_menu_insert_item: assertion 'G_IS_MENU_ITEM (item)' failed
    Oct 29 14:49:59 mintworkstation gnome-terminal-[2196]: g_menu_insert_item: assertion 'G_IS_MENU_ITEM (item)' failed
    Oct 29 14:50:28 mintworkstation gnome-terminal-[2196]: g_menu_insert_item: assertion 'G_IS_MENU_ITEM (item)' failed
    Oct 29 14:50:59 mintworkstation logging[5144]: 2018/10/29 14:50:59 LOG_INFO + LOG_USER: Logging in Go!
    Oct 29 14:51:14 mintworkstation gnome-terminal-[2196]: g_menu_insert_item: assertion 'G_IS_MENU_ITEM (item)' failed
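
    The user.* selector in that rsyslog line matches on the facility half of the priority that the Go program passes to syslog.New. Here is a small sketch of how that value breaks down, using the standard syslog numbering (facility user = 1, severity info = 6, priority = facility * 8 + severity). The function names are invented for illustration:

```shell
# Standard syslog numbering: priority = facility * 8 + severity.
syslog_pri()      { echo $(( $1 * 8 + $2 )); }   # args: facility severity
syslog_facility() { echo $(( $1 / 8 )); }        # arg: priority
syslog_severity() { echo $(( $1 % 8 )); }        # arg: priority

# LOG_USER is facility 1 and LOG_INFO is severity 6:
syslog_pri 1 6        # prints 14
syslog_facility 14    # prints 1
syslog_severity 14    # prints 6
```

    To send a test message with the same priority without compiling anything, something like logger -p user.info -t logging "test" should end up in the same file.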
    

    Cheers

  • Small go code example for zookeeper resource editing

    Hi,

    We have the task of “service restart coordination” for our Apache Kafka cluster. It’s still a work in progress, but if you want to use ZooKeeper for some status verification and updates, something like this will work as an example.

    package main
    
    import (
    	"fmt"
    	"io"
    	"launchpad.net/gozk"
    	"os"
    	"strings"
    	"sync"
    	"time"
    )
    
    const (
    	SERVICEPATH = "/servicerestart"
    )
    
    var wg sync.WaitGroup
    
    func main() {
    	conn := "zk1:2181,zk2:2181,zk3:2181"
    	// Dial is given one host:port at a time, so split the list.
    	connSlice := strings.Split(conn, ",")
    	var flag bool
    	args := os.Args
    	if len(args) != 2 {
    		io.WriteString(os.Stdout, "Argument is needed for the script\n")
    		os.Exit(1)
    	}
    	switch args[1] {
    	case "hang":
    		// wg.Done is never called, so the program keeps waiting.
    		flag = false
    	case "nohang":
    		flag = true
    	default:
    		io.WriteString(os.Stdout, "Command unrecognized\n")
    		os.Exit(1)
    	}
    	wg.Add(1)
    	go ModifyZooStat(connSlice, flag)
    	wg.Wait()
    }
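    func ModifyZooStat(strconn []string, flag bool) {
    	var zooReach string
    	// Probe each server and keep the first one that accepts a connection.
    	for _, zoohost := range strconn {
    		zk, _, err := zookeeper.Dial(zoohost, 5e9)
    		if err != nil {
    			fmt.Println("Couldn't connect to " + zoohost)
    			continue
    		}
    		zooReach = zoohost
    		zk.Close()
    		break
    	}
    	if zooReach == "" {
    		fmt.Println("No zookeeper server could be reached")
    		os.Exit(1)
    	}
    	zkf, sessionf, err := zookeeper.Dial(zooReach, 5e9)
    	if err != nil {
    		fmt.Println("Couldn't connect to " + zooReach)
    		os.Exit(1)
    	}
    	// Wait for the first session event before using the connection.
    	event := <-sessionf
    	if event.State != zookeeper.STATE_CONNECTED {
    		fmt.Println("Couldn't connect")
    		os.Exit(1)
    	}
    	acl := []zookeeper.ACL{zookeeper.ACL{Perms: zookeeper.PERM_ALL, Scheme: "world", Id: "anyone"}}
    	host, _ := os.Hostname()
    	t := time.Now()
    	servicerestart, _ := zkf.Exists(SERVICEPATH)
    	if servicerestart == nil {
    		// EPHEMERAL: the node only lives as long as this session.
    		path, err := zkf.Create(SERVICEPATH, host+" "+t.Format(time.Kitchen), zookeeper.EPHEMERAL, acl)
    		if err != nil {
    			fmt.Println("Couldn't create", SERVICEPATH, err)
    		} else {
    			fmt.Println(path)
    		}
    	} else {
    		// Version -1 means: overwrite whatever version is stored.
    		change, err := zkf.Set(SERVICEPATH, host+" "+t.Format(time.Kitchen), -1)
    		if err != nil {
    			fmt.Println("Couldn't update", SERVICEPATH, err)
    		} else {
    			fmt.Println(change.MTime().Format(time.Kitchen))
    		}
    	}
    	// Only close the session when we are allowed to finish. A deferred
    	// Close would also run in "hang" mode and drop the ephemeral node.
    	if flag {
    		zkf.Close()
    		wg.Done()
    	}
    }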
    

    Let me explain what it does. Basically, it takes a ZooKeeper connection string and splits it per server. This was a requirement of the zk module used: it couldn’t take more than one host:2181 entry as an argument.
    After we find a reachable server, we connect to it and store in the /servicerestart path the hostname and the time at which the resource was edited.
    In order to create a resource, you will need an ACL slice that is passed as a parameter.

    acl := []zookeeper.ACL{zookeeper.ACL{Perms: zookeeper.PERM_ALL, Scheme: "world", Id: "anyone"}}

    Once this slice is created, we move to the next step and check if the resource exists. If it doesn’t, we create it; if it does, we just modify it.

    The fmt.Println calls are there for basically two reasons.

    • To see the resource that is created. I wanted that because with the zookeeper.EPHEMERAL flag the resource only exists as long as the connection is active. If you want persistence, pass 0 as the flags value instead; zookeeper.SEQUENCE is a separate flag that appends a unique counter to the resource name.
    • To see the timestamp when the resource was modified.

    Even if you don’t close the ZooKeeper connection with zkf.Close(), it is closed automatically when the program ends, and the ephemeral resource goes away with it. So we still need a way to keep the program alive, and we do that using a WaitGroup.
    We add one goroutine to the group and wait for it to finish. To control this, we use the command-line argument, which is mapped to a flag: with nohang the goroutine calls wg.Done() and the program exits, with hang it never does and the program keeps waiting.
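
    The host selection step described above (split the connection string, keep the first server that answers) can also be tried from the shell before touching the Go code. This is a rough sketch only: first_reachable and zk_ruok are invented names, and the probe assumes that nc is installed and that the servers still answer the "ruok" four-letter command, which newer ZooKeeper versions may restrict.

```shell
# first_reachable (hypothetical helper): walks a comma-separated host:port
# list and prints the first entry for which the probe command succeeds.
# $1 = connection string, $2 = name of a probe command taking host and port.
first_reachable() {
    local conn="$1" probe="$2" entry old_ifs="$IFS"
    IFS=','
    set -- $conn          # split the list on commas
    IFS="$old_ifs"
    for entry in "$@"; do
        if "$probe" "${entry%%:*}" "${entry##*:}"; then
            echo "$entry"
            return 0
        fi
    done
    return 1
}

# Example probe (assumption: "ruok" is answered with "imok").
zk_ruok() {
    [ "$(echo ruok | nc -w 2 "$1" "$2" 2>/dev/null)" = "imok" ]
}

# Usage: first_reachable "zk1:2181,zk2:2181,zk3:2181" zk_ruok
```

    Passing the probe as a parameter keeps the loop itself independent of how reachability is checked.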

    This is just a very small example and I am still a true beginner in the art of Go programming, but I hope it helps 🙂

    Cheers

  • Don’t delete the Kafka GC logs when they are used

    Hi,

    I made a mistake some time ago, and it’s back to haunt me.
    Deleting the GC logs, including the one that is still in use, doesn’t solve anything; it just creates a more difficult situation.
    Here is my example:

    /dev/sda1                        50G   42G  5.2G  90% /
    /opt/kafka/logs# ll
    total 34M
    drwxrwxr-x 2 kafka kafka 4.0K Oct 10 19:34 ./
    drwxr-xr-x 7 kafka kafka 4.0K Mar 14  2018 ../
    -rw-rw-r-- 1 kafka kafka    0 Mar 14  2018 controller.log
    -rw-rw-r-- 1 kafka kafka    0 Mar 14  2018 kafka-authorizer.log
    -rw-rw-r-- 1 kafka kafka    0 Mar 14  2018 kafka-request.log
    -rw-rw-r-- 1 kafka kafka 2.9M Oct 11 04:44 log-cleaner.log
    -rw-rw-r-- 1 kafka kafka 6.1M Oct 11 05:24 server.log
    -rw-rw-r-- 1 kafka kafka  25M Oct  4 14:03 state-change.log
    
    lsof +L1 | grep delete
    init        1     root   13w   REG    8,1         106     0     95 /var/log/upstart/systemd-logind.log.1 (deleted)
    init        1     root   14w   REG    8,1        5794     0   2944 /var/log/upstart/kafka-manager.log.1 (deleted)
    java     1630    kafka    3w   REG    8,1 46836567522     0 524939 /opt/kafka-2.11-0.10.1.1/logs/kafkaServer-gc.log (deleted)
    java     1863 dd-agent    4r   REG    8,1     5750256     0 525428 /opt/datadog-agent/bin/agent/dist/jmx/jmxfetch-0.20.1-jar-with-dependencies.jar (deleted)
    java    10749 dd-agent    4r   REG    8,1     5750216     0 525427 /opt/datadog-agent/bin/agent/dist/jmx/jmxfetch-0.20.0-jar-with-dependencies.jar (deleted)
    bash    10928     root    0u   CHR  136,6         0t0     0      9 /dev/pts/6 (deleted)
    bash    10928     root    1u   CHR  136,6         0t0     0      9 /dev/pts/6 (deleted)
    bash    10928     root    2u   CHR  136,6         0t0     0      9 /dev/pts/6 (deleted)
    bash    10928     root  255u   CHR  136,6         0t0     0      9 /dev/pts/6 (deleted)
    tail    12378     root    0u   CHR  136,6         0t0     0      9 /dev/pts/6 (deleted)
    tail    12378     root    1u   CHR  136,6         0t0     0      9 /dev/pts/6 (deleted)
    tail    12378     root    2u   CHR  136,6         0t0     0      9 /dev/pts/6 (deleted)
    tail    12378     root    3r   REG    8,1    52428909     0 525512 /opt/kafka-2.11-0.10.1.1/logs/server.log.1 (deleted)
    java    14692 dd-agent    4r   REG    8,1     5750256     0 526042 /opt/datadog-agent/bin/agent/dist/jmx/jmxfetch-0.20.1-jar-with-dependencies.jar (deleted)
    java    16574 dd-agent    4r   REG    8,1     5750256     0 526041 /opt/datadog-agent/bin/agent/dist/jmx/jmxfetch-0.20.1-jar-with-dependencies.jar (deleted)
    

    Handling GC logging in versions lower than 1.0.0 is quite tricky. It is best to remove these options from your startup script:

    -XX:+DisableExplicitGC -Djava.awt.headless=true -Xloggc:/opt/kafka/bin/../logs/kafkaServer-gc.log -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps

    But taking into consideration that we use a standard Puppet module that is used by multiple teams, this is still to be fixed. Fortunately, from 1.0.0 on, this should no longer be a problem with the default GC logging settings.

    To fix what I showed above, a restart of the Kafka process is needed, since the space of a deleted file is only released once the process holding it open exits. We will do that.
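
    To get a feeling for how much space is stuck in files like these, the lsof output can be summed up. This is a minimal sketch, assuming the column layout shown in the listing above (TYPE in column 5, SIZE/OFF in column 7); deleted_open_bytes is a made-up name, and for some entries lsof reports an offset instead of a size, so the result is an estimate:

```shell
# deleted_open_bytes (hypothetical helper): reads `lsof +L1` output on stdin
# and prints the summed SIZE/OFF value, in bytes, of regular files that are
# marked as deleted but are still held open by a process.
deleted_open_bytes() {
    awk '$5 == "REG" && /\(deleted\)/ { total += $7 }
         END { printf "%.0f\n", total }'
}

# Usage on a live system:
#   lsof +L1 | deleted_open_bytes
```

    In the listing above, the deleted kafkaServer-gc.log alone shows about 46.8 billion bytes in that column, which is most likely where the space that ll cannot account for went. Emptying the live file instead of deleting it (for example with : > kafkaServer-gc.log) usually avoids getting into this state.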

    Cheers

  • Final version of SSL gen script for kafka

    Hi,

    I wrote a lot about this topic, but it seems that I finally came to the procedure specified by Confluent.
    Here is the right way to do it, at least for now:

    #!/bin/bash
    HOST=<%= @fqdn %>
    PASSWORD=<%= @pass %>
    KEYSTOREPASS=<%= @keystorepass %>
    VALIDITY=365
    
    keytool -keystore kafka.server.keystore.jks -alias ${HOST} -validity $VALIDITY -genkey -dname "CN=${HOST}, OU=MyTeam, O=MyCompany, L=Bucharest, ST=Romania, C=RO" -storepass $KEYSTOREPASS -keypass $KEYSTOREPASS
    openssl req -new -x509 -keyout ca-key -out ca-cert -days $VALIDITY -subj "/CN=${HOST}/OU=MyTeam/O=MyCompany/L=Bucharest/ST=Romania/C=RO" -passout pass:$PASSWORD
    keytool -keystore kafka.server.truststore.jks -alias CARoot -import -file ca-cert -storepass $KEYSTOREPASS -noprompt
    keytool -keystore kafka.server.keystore.jks -alias CARoot -import -file ca-cert -storepass $KEYSTOREPASS -noprompt
    keytool -keystore kafka.server.keystore.jks -alias ${HOST} -certreq -file cert-file-${HOST}.host -storepass $KEYSTOREPASS
    openssl x509 -req -CA ca-cert -CAkey ca-key -in cert-file-${HOST}.host -out cert-signed-${HOST}.host -days $VALIDITY -CAcreateserial -passin pass:$PASSWORD
    keytool -keystore kafka.server.keystore.jks -alias ${HOST} -import -file cert-signed-${HOST}.host -storepass $KEYSTOREPASS -noprompt
    keytool -keystore kafka.client.keystore.jks -alias CARoot -import -file ca-cert -storepass $KEYSTOREPASS -noprompt
    keytool -keystore kafka.client.truststore.jks -alias CARoot -import -file ca-cert -storepass $KEYSTOREPASS -noprompt
    
    <% @servers.each do |server| 
    separate = server.split("."); host = separate[0]-%>
    # <%= server %>
    keytool -keystore <%= host %>.server.keystore.jks -alias <%= server %> -validity $VALIDITY -genkey -dname "CN=<%= server %>, OU=MyTeam, O=MyCompany, L=Bucharest, ST=Romania, C=RO" -storepass $KEYSTOREPASS -keypass $KEYSTOREPASS
    keytool -keystore <%= host %>.server.keystore.jks -alias <%= server %> -certreq -file cert-file-<%= server %>.host -storepass $KEYSTOREPASS
    openssl x509 -req -CA ca-cert -CAkey ca-key -in cert-file-<%= server %>.host -out cert-signed-<%= server %>.host -days $VALIDITY -CAcreateserial -passin pass:$PASSWORD
    keytool -keystore <%= host %>.server.keystore.jks -alias CARoot -import -file ca-cert -storepass $KEYSTOREPASS -noprompt
    keytool -keystore <%= host %>.server.keystore.jks -alias <%= server %> -import -file cert-signed-<%= server %>.host -storepass $KEYSTOREPASS -noprompt
    
    <% end -%>
    
    keytool -keystore kafka.client.keystore.jks -alias 'client' -validity $VALIDITY -genkey -dname "CN=${HOST}, OU=MyTeam, O=MyCompany, L=Bucharest, ST=Romania, C=RO" -storepass $KEYSTOREPASS -keypass $KEYSTOREPASS
    keytool -keystore kafka.client.keystore.jks -alias 'client' -certreq -file cert-file-client.host -storepass $KEYSTOREPASS
    openssl x509 -req -CA ca-cert -CAkey ca-key -in cert-file-client.host -out cert-signed-client.host -days $VALIDITY -CAcreateserial -passin pass:$PASSWORD
    keytool -keystore kafka.client.keystore.jks -alias 'client' -import -file cert-signed-client.host -storepass $KEYSTOREPASS -noprompt
    

    The Puppet code needs to be modified as well. You can find the initial manifest here. The difference is:

    
    if (member($servers, $item[0]) and $item[1] == "disabled") {
        $fqdn_split = split($item[0], '[.]')
        exec { "copy files to ${item[0]}":
            cwd     => '/home/kafka',
            path    => '/usr/bin:/usr/sbin:/bin',
            command => "scp /home/kafka/${fqdn_split[0]}.server.keystore.jks kafka@${item[0]}:/home/kafka/kafka.server.keystore.jks; scp /home/kafka/kafka.server.truststore.jks kafka@${item[0]}:/home/kafka/kafka.server.truststore.jks",
            user    => 'kafka',
        }
    }
    

    Enough on this topic.

    Cheers

  • Wrong again, there is no return code 0 on self signed certs

    Morning,

    It looks like I was wrong again with the SSL generation script. Here is the second article.

    Code 0 is not good after all: it signals that the Kafka broker is closing the connection really fast.

    So:

    • There is no 0 on self-signed certs.
    • Please make sure that you have a certificate in the chain when you test.
    • I will give you just the server side; for the client it’s still not very clear if it works. Once I have the confirmation I will post it.

    #!/bin/bash
    HOST=<%= @fqdn %>
    PASSWORD=<%= @pass %>
    KEYSTOREPASS=<%= @keystorepass %>
    VALIDITY=365
    
    openssl genrsa -out CA.key 2048
    openssl req -new -x509 -keyout CA.key -out ca-cert -days $VALIDITY -subj "/CN=${HOST}/OU=MyTeam/O=MyCompany/L=Bucharest/S=Romania/C=RO" -passout pass:$PASSWORD
    keytool -keystore kafka.server.keystore.jks -alias $HOST -validity $VALIDITY -genkey -dname "CN=${HOST}, OU=MyTeam, O=MyCompany, L=Bucharest S=Romania C=RO" -storepass $KEYSTOREPASS -keypass $KEYSTOREPASS
    keytool -keystore kafka.server.keystore.jks -alias $HOST -certreq -file cert-file-${HOST}.host -storepass $KEYSTOREPASS
    openssl x509 -req -CA ca-cert -CAkey CA.key -in cert-file-${HOST}.host -out cert-signed-${HOST}.host -days $VALIDITY -CAcreateserial -passin pass:$PASSWORD
    keytool -keystore kafka.server.keystore.jks -alias CARoot -import -trustcacerts -file ca-cert -storepass $KEYSTOREPASS -noprompt
    keytool -keystore kafka.server.keystore.jks -alias $HOST -import -file cert-signed-${HOST}.host -storepass $KEYSTOREPASS -noprompt
    
    <% @servers.each do |server| -%>
    # <%= server %>
    keytool -keystore kafka.server.keystore.jks -alias <%= server %> -validity $VALIDITY -genkey -dname "CN=<%= server %>, OU=MyTeam, O=MyCompany, L=Bucharest S=Romania C=RO" -storepass $KEYSTOREPASS -keypass $KEYSTOREPASS
    keytool -keystore kafka.server.keystore.jks -alias <%= server %> -certreq -file cert-file-<%= server %>.host -storepass $KEYSTOREPASS
    openssl x509 -req -CA ca-cert -CAkey CA.key -in cert-file-<%= server %>.host -out cert-signed-<%= server %>.host -days $VALIDITY -CAcreateserial -passin pass:$PASSWORD
    keytool -keystore kafka.server.keystore.jks -alias <%= server %> -import -file cert-signed-<%= server %>.host -storepass $KEYSTOREPASS -noprompt
    <% end -%>
    
    keytool -keystore kafka.server.truststore.jks -alias CARoot -import -trustcacerts -file ca-cert -storepass $KEYSTOREPASS -noprompt
    

    I hope I don’t discover anything else that is wrong. If I do, I will keep you informed.
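
    For testing a broker after the keystores are deployed, the return code discussed above can be pulled out of the openssl s_client output. This sketch assumes that the "code" in this post is the "Verify return code" line printed by s_client; verify_return_code is a made-up name, and the host and port in the usage line are placeholders:

```shell
# verify_return_code (hypothetical helper): reads `openssl s_client` output
# on stdin and prints the number from its first "Verify return code:" line.
verify_return_code() {
    sed -n 's/^ *Verify return code: \([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Usage against a broker listener (placeholder host and port):
#   openssl s_client -connect broker.example.com:9093 -CAfile ca-cert \
#       </dev/null 2>/dev/null | verify_return_code
```

    A value of 0 means that the chain verified against the CA file that was passed; self-signed setups typically report 18 or 19 instead.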

    PS: It seems that I was wrong again 😀 It’s strange that it works with Kafka versions before 2.0 but does not validate on 2.0.
    The final right way to do it is to have in the keystore only CARoot and the alias corresponding to that server.
    I will post it as soon as I have an implementation.

    And here it is.
    Cheers

  • Correct SSL script for Kafka deployment

    Hi,

    Some time ago I wrote a post about certificate generation in order to secure a Kafka cluster.

    Long story short, it was wrong!

    Here is the correct version that returns 0 (keystore is correctly generated and used):

    
    #!/bin/bash
    HOST=<%= @fqdn %>
    PASSWORD=<%= @pass %>
    KEYSTOREPASS=<%= @keystorepass %>
    VALIDITY=365
    
    keytool -keystore kafka.server.temp.keystore.jks -alias $HOST -validity $VALIDITY -genkey -dname "CN=${HOST}, OU=Myteam, O=Mycompany, L=Bucharest S=Romania C=RO" -storepass $KEYSTOREPASS -keypass $KEYSTOREPASS
    openssl req -new -x509 -keyout ca-key -out ca-cert -days $VALIDITY -subj "/CN=${HOST}/OU=Myteam/O=MyCompany/L=Bucharest/S=Romania/C=RO" -passout pass:$PASSWORD
    keytool -keystore kafka.server.temp.keystore.jks -alias $HOST -certreq -file cert-file-${HOST}.host -storepass $KEYSTOREPASS
    openssl x509 -req -CA ca-cert -CAkey ca-key -in cert-file-${HOST}.host -out cert-signed-${HOST}.host -days $VALIDITY -CAcreateserial -passin pass:$PASSWORD
    keytool -keystore kafka.server.keystore.jks -alias $HOST -import -file cert-signed-${HOST}.host -storepass $KEYSTOREPASS -noprompt
    keytool -keystore kafka.server.keystore.jks -alias CARoot -import -file ca-cert -storepass $KEYSTOREPASS -noprompt
    keytool -keystore kafka.server.truststore.jks -alias CARoot -import -file ca-cert -storepass $KEYSTOREPASS -noprompt
    
    
    <% @servers.each do |server| -%>
    # <%= server %>
    keytool -keystore kafka.server.temp.keystore.jks -alias <%= server %> -validity $VALIDITY -genkey -dname "CN=<%= server %>, OU=Myteam, O=MyCompany, L=Bucharest S=Romania C=RO" -storepass $KEYSTOREPASS -keypass $KEYSTOREPASS
    keytool -keystore kafka.server.temp.keystore.jks -alias <%= server %> -certreq -file cert-file-<%= server %>.host -storepass $KEYSTOREPASS
    openssl x509 -req -CA ca-cert -CAkey ca-key -in cert-file-<%= server %>.host -out cert-signed-<%= server %>.host -days $VALIDITY -CAcreateserial -passin pass:$PASSWORD
    keytool -keystore kafka.server.keystore.jks -alias <%= server %> -import -file cert-signed-<%= server %>.host -storepass $KEYSTOREPASS -noprompt
    <% end -%>
    
    keytool -keystore kafka.client.temp.keystore.jks -alias 'client' -validity $VALIDITY -genkey -dname "CN=${HOST}, OU=Myteam, O=MyCompany, L=Bucharest S=Romania C=RO" -storepass $KEYSTOREPASS -keypass $KEYSTOREPASS
    keytool -keystore kafka.client.temp.keystore.jks -alias 'client' -certreq -file cert-file-client.host -storepass $KEYSTOREPASS
    openssl x509 -req -CA ca-cert -CAkey ca-key -in cert-file-client.host -out cert-signed-client.host -days $VALIDITY -CAcreateserial -passin pass:$PASSWORD
    keytool -keystore kafka.client.keystore.jks -alias $HOST -import -file cert-signed-client.host -storepass $KEYSTOREPASS -noprompt
    keytool -keystore kafka.client.truststore.jks -alias CARoot -import -file ca-cert -storepass $KEYSTOREPASS -noprompt
    

    Here is also a link to the old article for comparison: wrong way to do it.

    PS: It seems that this is also wrong. Please check the article.

  • Cgroups management on Linux – first steps

    Hi,

    I didn’t know that much about control groups, but I see that they are a big thing in performance and process optimization.
    For the moment I would like to share two important things that I found.
    First, there are three options that you need to activate if you want to play with control group management:

    DefaultCPUAccounting=yes
    DefaultBlockIOAccounting=yes
    DefaultMemoryAccounting=yes
    

    that you can find in /etc/systemd/system.conf.
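
    A quick way to check which of these three options are really active (and not just present as commented-out defaults) is to read them back from the file. This is a minimal sketch; accounting_status is a made-up name, and it only looks at the one file it is given, not at drop-in directories:

```shell
# accounting_status (hypothetical helper): prints the value of each of the
# three accounting options found in a systemd config file, or "unset" when
# the option is missing or commented out.
accounting_status() {
    local conf="${1:-/etc/systemd/system.conf}" key value
    for key in DefaultCPUAccounting DefaultBlockIOAccounting DefaultMemoryAccounting; do
        value=$(sed -n "s/^[[:space:]]*${key}=//p" "$conf" | tail -n 1)
        echo "${key}=${value:-unset}"
    done
}

# Usage: accounting_status            (reads /etc/systemd/system.conf)
#        accounting_status /some/other/file
```

    After changing the file, systemd has to pick up the new defaults; a reboot is the simplest way to be sure that it did.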

    There is also a command that shows CPU utilization along with other info related to the user/system slices: systemd-cgtop.
    If accounting is not enabled, no details are shown. Once you enable it, you will get info like this:

    Path                                         Tasks   %CPU   Memory  Input/s Output/s

    /                                               66    9.2        -        -        -
    /user.slice                                      -    5.0        -        -        -
    /user.slice/user-1000.slice                      -    5.0        -        -        -
    /user.slice/user-1000.slice/session-1.scope     47    5.0        -        -        -
    /system.slice                                    -    3.8        -        -        -
    /system.slice/lightdm.service                    2    3.5        -        -        -
    /system.slice/docker.service                     2    0.3        -        -        -
    /system.slice/vboxadd-service.service            1    0.0        -        -        -
    /system.slice/ModemManager.service               1      -        -        -        -
    /system.slice/NetworkManager.service             2      -        -        -        -
    /system.slice/accounts-daemon.service            1      -        -        -        -
    /system.slice/acpid.service                      1      -        -        -        -
    /system.slice/atd.service                        1      -        -        -        -
    /system.slice/avahi-daemon.service               2      -        -        -        -
    /system.slice/colord.service                     1      -        -        -        -
    /system.slice/cron.service                       1      -        -        -        -
    /system.slice/cups-browsed.service               1      -        -        -        -
    /system.slice/cups.service                       1      -        -        -        -
    /system.slice/dbus.service

    That is all so far. I will let you know once I discover new info.

    Cheers

  • Kernel not compatible with zookeeper version

    Morning,

    It’s important to share this situation with you. This morning I came to the office to find that a cluster that had been upgraded/restarted had an issue with its Zookeeper instances.

    The symptoms were clear: the instances wouldn’t start completely. But why?

    After a little bit of investigation, I went to /var/log/syslog (/var/log/zookeeper did not contain any information at all) and saw that there was a bad page table in the JVM.

    Java version is:

    java version "1.8.0_111"
    Java(TM) SE Runtime Environment (build 1.8.0_111-b14)
    Java HotSpot(TM) 64-Bit Server VM (build 25.111-b14, mixed mode)
    

    So, the log showed the following lines:

    Aug 16 07:16:04 kafka0 kernel: [  742.349010] init: zookeeper main process ended, respawning
    Aug 16 07:16:04 kafka0 kernel: [  742.925427] java: Corrupted page table at address 7f6a81e5d100
    Aug 16 07:16:05 kafka0 kernel: [  742.926589] PGD 80000000373f4067 PUD b7852067 PMD b1c08067 PTE 80003ffffe17c225
    Aug 16 07:16:05 kafka0 kernel: [  742.928011] Bad pagetable: 000d [#1643] SMP 
    Aug 16 07:16:05 kafka0 kernel: [  742.928011] Modules linked in: dm_crypt serio_raw isofs crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd psmouse floppy
    

    Why would a Java process run into a page table error? The main reason is an incompatibility with the kernel version.

    Let’s take a look in the GRUB config file.

    It looks like we are booting with:

    menuentry 'Ubuntu' --class ubuntu --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-simple-baf292e5-0bb6-4e58-8a71-5b912e0f09b6' {
    	recordfail
    	load_video
    	gfxmode $linux_gfx_mode
    	insmod gzio
    	insmod part_msdos
    	insmod ext2
    	if [ x$feature_platform_search_hint = xy ]; then
    	  search --no-floppy --fs-uuid --set=root  baf292e5-0bb6-4e58-8a71-5b912e0f09b6
    	else
    	  search --no-floppy --fs-uuid --set=root baf292e5-0bb6-4e58-8a71-5b912e0f09b6
    	fi
    	linux	/boot/vmlinuz-3.13.0-155-generic root=UUID=baf292e5-0bb6-4e58-8a71-5b912e0f09b6 ro  console=tty1 console=ttyS0
    	initrd	/boot/initrd.img-3.13.0-155-generic
    

    There was also an older kernel image available, 3.13.0-153.

    The short fix for this is to update the grub.cfg file to boot the old version and reboot the server.
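
    Before editing anything, it helps to see which kernel versions the GRUB config already knows about. This is a small sketch, assuming GNU grep (for -o) and the usual "linux /boot/vmlinuz-..." lines; grub_kernels is a made-up name:

```shell
# grub_kernels (hypothetical helper): prints the distinct kernel versions
# referenced by vmlinuz-... entries in a grub.cfg read from stdin.
grub_kernels() {
    grep -o 'vmlinuz-[^[:space:]]*' | sed 's/^vmlinuz-//' | sort -u
}

# Usage: grub_kernels < /boot/grub/grub.cfg
```

    Keep in mind that grub.cfg is regenerated by update-grub, so a manual edit only lasts until the next kernel package update. Setting GRUB_DEFAULT in /etc/default/grub and running update-grub is probably the more durable route on Ubuntu.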

    A good fix is still in progress. I will post it as soon as I have it.

    P.S: I forgot to mention the Zookeeper version:

    Zookeeper version: 3.4.5--1, built on 06/10/2013 17:26 GMT

    P.S. 2: It seems that the issue is related to Java processes in general, not only Zookeeper.

    Cheers

  • Puppet gems install workaround after TLS 1.0 switchoff

    Hi,

    It seems that since rubygems.org disabled the TLS 1.0 protocol, there is an issue with installing custom gems in the Puppet server.

    If you run puppetserver gem environment you will probably see the following output:

    /opt/puppetlabs/bin/puppetserver gem environment
    RubyGems Environment:
      - RUBYGEMS VERSION: 2.4.8
      - RUBY VERSION: 1.9.3 (2015-06-10 patchlevel 551) [java]
      - INSTALLATION DIRECTORY: /opt/puppetlabs/server/data/puppetserver/jruby-gems
      - RUBY EXECUTABLE: java -jar /opt/puppetlabs/server/apps/puppetserver/puppet-server-release.jar
      - EXECUTABLE DIRECTORY: /opt/puppetlabs/server/data/puppetserver/jruby-gems/bin
      - SPEC CACHE DIRECTORY: /root/.gem/specs
      - SYSTEM CONFIGURATION DIRECTORY: file:/opt/puppetlabs/server/apps/puppetserver/puppet-server-release.jar!/META-INF/jruby.home/etc
      - RUBYGEMS PLATFORMS:
        - ruby
        - universal-java-1.7
      - GEM PATHS:
         - /opt/puppetlabs/server/data/puppetserver/jruby-gems
         - /root/.gem/jruby/1.9
         - file:/opt/puppetlabs/server/apps/puppetserver/puppet-server-release.jar!/META-INF/jruby.home/lib/ruby/gems/shared
      - GEM CONFIGURATION:
         - :update_sources => true
         - :verbose => true
         - :backtrace => false
         - :bulk_threshold => 1000
         - "install" => "--no-rdoc --no-ri --env-shebang"
         - "update" => "--no-rdoc --no-ri --env-shebang"
      - REMOTE SOURCES:
         - https://rubygems.org/
      - SHELL PATH:
         - /usr/local/sbin
         - /usr/local/bin
         - /usr/sbin
         - /usr/bin
         - /sbin
         - /bin
         - /usr/games
         - /usr/local/games
         - /opt/puppetlabs/bin
    

    Also, if you try to install a gem you will receive:

    /opt/puppetlabs/bin/puppetserver gem install toml-rb
    ERROR:  Could not find a valid gem 'toml-rb' (>= 0), here is why:
              Unable to download data from https://rubygems.org/ - Received fatal alert: protocol_version (https://api.rubygems.org/specs.4.8.gz)
    

    A short but unsafe fix for this is:

    /opt/puppetlabs/bin/puppetserver gem install --source "http://rubygems.org/" toml-rb
    Fetching: toml-rb-1.1.1.gem (100%)
    Successfully installed toml-rb-1.1.1
    WARNING:  Unable to pull data from 'https://rubygems.org/': Received fatal alert: protocol_version (https://api.rubygems.org/specs.4.8.gz)
    1 gem installed
    

    It’s not that elegant, but it does the trick. You can also include this in a Puppet exec block.
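
    Since the plain HTTP source is the unsafe part, one option is to use it only when the HTTPS attempt fails with exactly this TLS alert. The sketch below is just one way to do that: both function names and the GEM_CMD variable are invented for illustration, and the default command is the path used in this post.

```shell
# is_tls_protocol_error (hypothetical helper): succeeds when the gem output
# read from stdin contains the TLS alert shown above.
is_tls_protocol_error() {
    grep -q 'fatal alert: protocol_version'
}

# gem_install_with_fallback (hypothetical helper): tries the normal HTTPS
# source first and only falls back to plain HTTP on that specific error.
# GEM_CMD can be overridden, for example to test with a stub.
gem_install_with_fallback() {
    local gem_cmd="${GEM_CMD:-/opt/puppetlabs/bin/puppetserver gem}" output
    if output=$($gem_cmd install "$1" 2>&1); then
        echo "$output"
        return 0
    fi
    echo "$output"
    if echo "$output" | is_tls_protocol_error; then
        # Unsafe: the download is no longer protected by TLS.
        $gem_cmd install --source "http://rubygems.org/" "$1"
    else
        return 1
    fi
}

# Usage: gem_install_with_fallback toml-rb
```

    Any other failure (a typo in the gem name, no network) still returns an error instead of silently switching to HTTP.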

    Cheers