Category: puppet

Datadog and GCP are “friends” up to a point

Hi,

Since I have lately been publishing more on Medium, here is the link to my latest article.

It covers an interesting case in which the combination of automation, Google Cloud Platform and Datadog didn’t go as we expected.

https://medium.com/metrosystemsro/puppet-datadog-google-cloud-platform-recipe-for-a-small-outage-310166e551f1

Hope you enjoy it! I will be back with more interesting topics on this blog as well.

Cheers

March 27, 2020

Overriding OS fact with external one

Hi,

A short-notice article. We had an issue in which the traefik module code was not running because of a wrong os fact. Although the image is Ubuntu 14.04, facter returns it as:

{
  architecture => "amd64",
  family => "Debian",
  hardware => "x86_64",
  name => "Debian",
  release => {
    full => "jessie/sid",
    major => "jessie/sid"
  },
  selinux => {
    enabled => false
  }
}

I honestly don’t know why this happens, since it works fine on the rest of the machines. The quick fix is to define an external fact in /etc/facter/facts.d.

Create a file named os_fact.json, for example, with the following content:

{ 
   "os":{ 
      "architecture":"amd64",
      "distro":{ 
         "codename":"trusty",
         "description":"Ubuntu 14.04.6 LTS",
         "id":"Ubuntu",
         "release":{ 
            "full":"14.04",
            "major":"14.04"
         }
      },
      "family":"Debian",
      "hardware":"x86_64",
      "name":"Ubuntu",
      "release":{ 
         "full":"14.04",
         "major":"14.04"
      },
      "selinux":{
         "enabled":false
      }
   }
}

And it’s fixed.
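
A malformed external fact file is easy to miss, so a quick sanity check before dropping it into /etc/facter/facts.d can help. The sketch below is only an illustration: the helper name check_os_fact and the list of checks are my own assumptions, not something Facter provides.

```ruby
require 'json'

# Hypothetical helper: parse the text of an external fact file and
# return a list of problems found in its "os" structure.
def check_os_fact(json_text)
  os = JSON.parse(json_text).fetch('os', nil)
  return ['missing top-level "os" key'] unless os.is_a?(Hash)

  problems = []
  %w[architecture family hardware name].each do |key|
    problems << "os.#{key} should be a string" unless os[key].is_a?(String)
  end
  release = os['release']
  unless release.is_a?(Hash) && release['full'].is_a?(String) && release['major'].is_a?(String)
    problems << 'os.release needs string "full" and "major" values'
  end
  selinux = os['selinux']
  enabled = selinux.is_a?(Hash) ? selinux['enabled'] : nil
  unless [true, false].include?(enabled)
    problems << 'os.selinux.enabled should be a JSON boolean, not a string'
  end
  problems
end

# Reduced version of the file from this post.
sample = <<~JSON
  { "os": { "architecture": "amd64", "family": "Debian", "hardware": "x86_64",
            "name": "Ubuntu", "release": { "full": "14.04", "major": "14.04" },
            "selinux": { "enabled": false } } }
JSON

problems = check_os_fact(sample)
puts problems.empty? ? 'os fact looks sane' : problems
```

The boolean check is there because a quoted "false" is a non-empty string, which Puppet treats as true in a condition.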

Cheers

February 17, 2020

Duplicate exported resources on puppet by mistake

We had a strange problem in our test environment the other day. An authorized key has to be shared between nodes so that SSH connectivity is available.

The way we shared the file resource was straightforward.
```
  @@file {"/home/kafka/.ssh/authorized_keys":
    ensure => present,
    mode => '0600',
    owner => 'kafka',
    group => 'kafka',
    content => "${::sharedkey}",
    tag => "${::tagvalue}",
  }
```
The tag value variable was a fact unique to each Kafka cluster.

However, each time we executed puppet, the following error was present:
```
08:38:20 Error: Could not retrieve catalog from remote server: Error 500 on SERVER: Server Error: A duplicate resource was found while collecting exported resources, with the type and title File[/home/kafka/.ssh/authorized_keys] on node [node_name]
```
We had a couple of days at our disposal to play with PuppetDB, but nothing relevant came from it.

This behavior started after provisioning a second cluster with a similar name, also with SSL enabled.

After taking a look at the official Puppet documentation (https://puppet.com/docs/puppet/latest/lang_exported.html – check the caution clause), it was clear that exported resources must not share the same title if a node ends up collecting both of them.

The problem had not appeared on any of our clusters until now, so this was strange to say the least.

For whatever reason, the tag was not taken into consideration.

We know that because the resources exported by both clusters were collected everywhere; no filtering took place.

Solution:

The quick fix was done with the following modifications.
```
  @@file {"/home/kafka/.ssh/authorized_keys_${::clusterid}":
    path => "/home/kafka/.ssh/authorized_keys",
    ensure => present,
    mode => '0600',
    owner => 'kafka',
    group => 'kafka',
    content => "${::sharedkey}",
    tag => "${::clusterid}",
  }
```
So now each cluster exports a resource with its own title, while the path on disk stays the same, and we also have a tag that is recognized, so we can filter the shared file that we need on our server.

Filtering is done with File <<| tag == "${::clusterid}" |>>
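
To make the collision easier to picture, here is a toy model in Ruby. It is not how PuppetDB or the compiler works internally; it only mirrors the rule that a node cannot collect two resources with the same type and title. The struct, the collect helper and the node names are made up for the illustration.

```ruby
# Toy model of exported-resource collection. This is NOT the real Puppet
# implementation; it only mirrors the uniqueness rule on type + title.
Exported = Struct.new(:type, :title, :tag, :exported_by)

# Hypothetical helper: keep the resources matching the tag, then fail if two
# of them share the same type and title, as the catalog compilation would.
def collect(resources, tag)
  matching = resources.select { |r| r.tag == tag }
  matching.group_by { |r| [r.type, r.title] }.each do |(type, title), dups|
    raise "duplicate resource #{type}[#{title}]" if dups.size > 1
  end
  matching
end

path = '/home/kafka/.ssh/authorized_keys'

# Same title and effectively the same tag from two clusters: collision.
before = [
  Exported.new('File', path, 'kafka', 'cluster-a-node'),
  Exported.new('File', path, 'kafka', 'cluster-b-node')
]

# Title and tag both carry the cluster id: each collector sees one resource.
after = [
  Exported.new('File', "#{path}_a", 'a', 'cluster-a-node'),
  Exported.new('File', "#{path}_b", 'b', 'cluster-b-node')
]

begin
  collect(before, 'kafka')
rescue RuntimeError => e
  puts "before the fix: #{e.message}"
end
puts "after the fix: #{collect(after, 'a').map(&:title).inspect}"
```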

Cheers!

February 11, 2020

Strange problem in puppet run for Ubuntu

Hi,

A short note about a strange case.

We’ve written a small manifest in order to distribute some python scripts. You can find the reference here: https://medium.com/metrosystemsro/new-ground-automatic-increase-of-kafka-lvm-on-gcp-311633b0816c

When you try to run it on Ubuntu 14.04, there is this very strange error:
```
Error: Failed to apply catalog: [nil, nil, nil, nil, nil, nil]
```
The cause is the Python version on the machine:

Python 3.4.3 (default, Nov 12 2018, 22:25:49)
[GCC 4.8.4] on linux

I believe this is the highest version available by default on trusty.

In order to install the dependencies, you need python3-pip, and a short search returns the following options:
```
apt search python3-pip
Sorting... Done
Full Text Search... Done
python3-pip/trusty-updates,now 1.5.4-1ubuntu4 all [installed]
  alternative Python package installer - Python 3 version of the package

python3-pipeline/trusty 0.1.3-3 all
  iterator pipelines for Python 3
```
If we try to list all the installed modules with pip3 list, guess what, it doesn’t work:
```
Traceback (most recent call last):
  File "/usr/bin/pip3", line 5, in <module>
    from pkg_resources import load_entry_point
  File "/usr/local/lib/python3.4/dist-packages/pkg_resources/__init__.py", line 93, in <module>
    raise RuntimeError("Python 3.5 or later is required")
RuntimeError: Python 3.5 or later is required
```
So, the main conclusion is that the problem is not related to puppet; it is just a version incompatibility on this old distribution.
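
A cheap way to catch this earlier is to compare the interpreter version against the minimum a package needs before the run gets that far. This is a minimal sketch, assuming the version string is already available (for example from a custom fact); the helper name is hypothetical.

```ruby
# Hypothetical guard: decide whether the local Python is new enough
# before trying to install packages that require a newer interpreter.
def python_new_enough?(installed, minimum)
  # Gem::Version compares dotted versions numerically, so 3.10 > 3.5.
  Gem::Version.new(installed) >= Gem::Version.new(minimum)
end

puts python_new_enough?('3.4.3', '3.5') # the trusty interpreter from this post
puts python_new_enough?('3.5.2', '3.5')
```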

Cheers

January 29, 2020

Automatic increase of Kafka LVM on GCP

I wrote an article for my company, published on Medium, on the topic in the title. Please see the link:

https://medium.com/metrosystemsro/new-ground-automatic-increase-of-kafka-lvm-on-gcp-311633b0816c

Thanks

November 27, 2019

Install zookeeper using puppet without module

Hi,

In this post, I describe a task I was given: provide a standalone ZooKeeper cluster with basic auth on the latest version.

The reason is that we are using a very old module on our Kafka clusters, and a new requirement appeared to install the latest version, 3.5.5.

The old module could only install the package from the apt repo, which was not an option since the latest version available on Ubuntu Xenial is at least two years old.

To complete this task, a different method was required: grab the archive with wget and add the rest of the files needed to make it functional.

Let us start with the puppet manifest; from there, I will add the rest.

class zookeeperstd {
  $version = hiera("zookeeperstd::version","3.5.5")
  $authenabled = hiera("zookeeperstd::authenabled",false)
  $server_jvm_flags = hiera('zookeeperstd::jvm_flags', undef)
    group { 'zookeeper':
        ensure => 'present',
    } 
    user {'zookeeper':
        ensure => 'present',
        home => '/var/lib/zookeeper',
        shell => '/bin/false',
        }
    wget::fetch { 'zookeeper':
        source      => "https://www-eu.apache.org/dist/zookeeper/stable/apache-zookeeper-${version}-bin.tar.gz",
        destination => "/opt/apache-zookeeper-${version}-bin.tar.gz",
        } ->
    archive { "/opt/apache-zookeeper-${version}-bin.tar.gz":
        creates      => "/opt/apache-zookeeper-${version}-bin",
        ensure        => present,
        extract       => true,
        extract_path  => '/opt',
        cleanup       => true,
    } ->
    file { "/opt/apache-zookeeper-${version}-bin":
        ensure    => directory,
        owner     => 'zookeeper',
        group      => 'zookeeper',
        require     => [ User['zookeeper'], Group['zookeeper'], ],
        recurse => true,
    } ->
    file { '/opt/zookeeper/':
        ensure    => link,
        target    => "/opt/apache-zookeeper-${version}-bin",
        owner     => 'zookeeper',
        group      => 'zookeeper',
        require     => [ User['zookeeper'], Group['zookeeper'], ],
    }
    file { '/var/lib/zookeeper':
        ensure    => directory,
        owner     => 'zookeeper',
        group      => 'zookeeper',
        require     => [ User['zookeeper'], Group['zookeeper'], ],
        recurse    => true,
    }
    # In order to know which servers are in the cluster, a role fact needs to be defined on each machine
    $hostshash = query_nodes(" v1_role='zookeeperstd'").sort
    # Derive a stable ID between 1 and 254 for each host
    $hosts_hash = $hostshash.map |$value| { [$value, seeded_rand(254, $value)+1] }.hash
    $overide_hosts_hash = hiera_hash('profiles_opqs::kafka_hosts_hash', $hosts_hash)
    $overide_hosts = $overide_hosts_hash.keys.sort
    if $overide_hosts_hash.size() != $overide_hosts_hash.values.unique.size() {
        #notify {"Duplicate IDs detected! ${overide_hosts_hash}": }
        # Fall back to IDs based on the position in the sorted host list
        $overide_hosts_hash2 = $overide_hosts.map |$index, $value| { [$value, $index+1] }.hash
    } else {
        $overide_hosts_hash2 = $overide_hosts_hash
    }
    $hosts = $overide_hosts_hash2
    $data_dir = "/var/lib/zookeeper"
    $tick_time        = 2000
    $init_limit       = 10
    $sync_limit       = 5

    $myid = $hosts[$::fqdn]
    file { '/var/lib/zookeeper/myid':
        content => "${myid}",
    }

    file { '/opt/zookeeper/conf/zoo.cfg':
        content => template("${module_name}/zoo.cfg.erb"),
    }
   if $authenabled {
   
    $superpass        = hiera("zookeeperstd::super_pass", 'super-admin')
    $zoopass          = hiera("zookeeperstd::zookeeper_pass", 'zookeeper-admin')
    $clientpass        = hiera("zookeeperstd::client_pass", 'client-admin')
    
    file { '/opt/zookeeper/conf/zoo_jaas.config':
        content => template("${module_name}/zoo_jaas.config.erb"),
   }
   }
     file { '/opt/zookeeper/conf/java.env':
        content => template("${module_name}/java.zookeeper.env.erb"),
        mode => "0755",
    }
     file { '/opt/zookeeper/conf/log4j.properties':
        content => template("${module_name}/log4j.zookeeper.properties.erb"),
    }
   
    file {'/etc/systemd/system/zookeeper.service':
        source  => 'puppet:///modules/work/zookeeper.service',
        mode => '0644',
        } ->
    service { 'zookeeper':
        ensure   => running,
        enable   => true,
        provider => systemd,
        }
}
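
The myid logic in the manifest is the least obvious part, so here is the same idea as a small Ruby sketch. It is an approximation under stated assumptions: pseudo_seeded_rand is my stand-in for Puppet's seeded_rand and uses a different algorithm, so the actual numbers will not match what Puppet produces, and the host names are invented.

```ruby
require 'digest'

# Stand-in for Puppet's seeded_rand: a deterministic number in 0...max derived
# from the seed string. The real function uses a different algorithm, so the
# actual IDs will differ; only the "same host, same ID" property is kept.
def pseudo_seeded_rand(max, seed)
  Digest::MD5.hexdigest(seed).to_i(16) % max
end

# Assign an ID in 1..254 per host; if two hosts collide, fall back to the
# position of each host in the sorted list (1, 2, 3, ...).
def assign_myids(hosts)
  sorted = hosts.sort
  ids = sorted.map { |h| [h, pseudo_seeded_rand(254, h) + 1] }.to_h
  return ids if ids.values.uniq.size == ids.size

  sorted.each_with_index.map { |h, i| [h, i + 1] }.to_h
end

hosts = %w[zk3.example.com zk1.example.com zk2.example.com]
assign_myids(hosts).each { |host, id| puts "#{host} -> myid #{id}" }
```

One thing to keep in mind with the fallback: index-based IDs shift when a host is added to or removed from the list, while the seeded IDs stay tied to the host name.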

Since I managed to adapt some files from the existing module, here are the remaining pieces, starting with the templates.

#zoo.cfg.erb
# Note: This file is managed by Puppet.

# http://hadoop.apache.org/zookeeper/docs/current/zookeeperAdmin.html

# specify all zookeeper servers
# The first port is used by followers to connect to the leader
# The second one is used for leader election
<%
if @hosts
# sort hosts by myid and output a server config
# for each host and myid.  (sort_by returns an array of key,value tuples)
@hosts.sort_by { |name, id| id }.each do |host_id|
-%>
server.<%= host_id[1] %>=<%= host_id[0] %>:2182:2183
<% if @authenabled -%>
authProvider.<%= host_id[1] %>=org.apache.zookeeper.server.auth.SASLAuthenticationProvider
<% end -%>
<% end -%>
<% end -%>

# the port at which the clients will connect
clientPort=2181

# the directory where the snapshot is stored.
dataDir=<%= @data_dir %>

# Place the dataLogDir to a separate physical disc for better performance
<%= @data_log_dir ? "dataLogDir=#{@data_log_dir}" : '# dataLogDir=/disk2/zookeeper' %>


# The number of milliseconds of each tick.
tickTime=<%= @tick_time %>

# The number of ticks that the initial
# synchronization phase can take.
initLimit=<%= @init_limit %>

# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=<%= @sync_limit %>

# To avoid seeks ZooKeeper allocates space in the transaction log file in
# blocks of preAllocSize kilobytes. The default block size is 64M. One reason
# for changing the size of the blocks is to reduce the block size if snapshots
# are taken more often. (Also, see snapCount).
#preAllocSize=65536

# Clients can submit requests faster than ZooKeeper can process them,
# especially if there are a lot of clients. To prevent ZooKeeper from running
# out of memory due to queued requests, ZooKeeper will throttle clients so that
# there is no more than globalOutstandingLimit outstanding requests in the
# system. The default limit is 1,000. ZooKeeper logs transactions to a
# transaction log. After snapCount transactions are written to a log file a
# snapshot is started and a new transaction log file is started. The default
# snapCount is 10,000.
#snapCount=1000

# If this option is defined, requests will be logged to a trace file named
# traceFile.year.month.day.
#traceFile=

# Leader accepts client connections. Default value is "yes". The leader machine
# coordinates updates. For higher update throughput at the slight expense of
# read throughput the leader can be configured to not accept clients and focus
# on coordination.
#leaderServes=yes

<% if @authenabled -%>

requireClientAuthScheme=sasl
quorum.auth.enableSasl=true
quorum.auth.learnerRequireSasl=true
quorum.auth.serverRequireSasl=true
quorum.auth.learner.loginContext=QuorumLearner
quorum.auth.server.loginContext=QuorumServer
quorum.cnxn.threads.size=20

<% end -%>

#zoo_jaas.config.erb
QuorumServer {
       org.apache.zookeeper.server.auth.DigestLoginModule required
       user_zookeeper="<%= @zoopass %>";
};
 
QuorumLearner {
       org.apache.zookeeper.server.auth.DigestLoginModule required
       username="zookeeper"
       password="<%= @zoopass %>";
};

Server {
       org.apache.zookeeper.server.auth.DigestLoginModule required
       user_super="<%= @superpass %>"
       user_client="<%= @clientpass %>";
};

#java.zookeeper.env.erb
ZOO_LOG4J_PROP="INFO,ROLLINGFILE"
SERVER_JVMFLAGS="<%= @server_jvm_flags %>"

#log4j.zookeeper.properties.erb
# Note: This file is managed by Puppet.

#
# ZooKeeper Logging Configuration
#

# Format is "<default threshold> (, <appender>)+

log4j.rootLogger=${zookeeper.root.logger}, ROLLINGFILE

#
# Log INFO level and above messages to the console
#
log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender
log4j.appender.CONSOLE.Threshold=INFO
log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout
log4j.appender.CONSOLE.layout.ConversionPattern=%d{ISO8601} - %-5p [%t:%C{1}@%L] - %m%n

#
# Add ROLLINGFILE to rootLogger to get log file output
#    Log INFO level and above messages to a log file
log4j.appender.ROLLINGFILE=org.apache.log4j.RollingFileAppender
log4j.appender.ROLLINGFILE.Threshold=INFO
log4j.appender.ROLLINGFILE.File=${zookeeper.log.dir}/zookeeper.log

# Max log file size of 10MB
log4j.appender.ROLLINGFILE.MaxFileSize=10MB
# Keep only 10 files
log4j.appender.ROLLINGFILE.MaxBackupIndex=10
log4j.appender.ROLLINGFILE.layout=org.apache.log4j.PatternLayout
log4j.appender.ROLLINGFILE.layout.ConversionPattern=%d{ISO8601} - %-5p [%t:%C{1}@%L] - %m%n
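
If you want to check what the server loop from zoo.cfg.erb produces without running puppet, plain Ruby can render it. This is a sketch under assumptions: the ZooCfg wrapper class and the host names are invented for the illustration, and only the server lines of the template are reproduced.

```ruby
require 'erb'

# The server loop from zoo.cfg.erb, reduced to the lines that matter here.
TEMPLATE = <<~'TPL'
  <% @hosts.sort_by { |_name, id| id }.each do |host_id| -%>
  server.<%= host_id[1] %>=<%= host_id[0] %>:2182:2183
  <% if @authenabled -%>
  authProvider.<%= host_id[1] %>=org.apache.zookeeper.server.auth.SASLAuthenticationProvider
  <% end -%>
  <% end -%>
TPL

# Hypothetical wrapper exposing the instance variables the template reads.
class ZooCfg
  def initialize(hosts, authenabled)
    @hosts = hosts
    @authenabled = authenabled
  end

  def render
    # trim_mode '-' makes "-%>" swallow the newline, as in Puppet templates.
    ERB.new(TEMPLATE, trim_mode: '-').result(binding)
  end
end

# Example host names are made up for the illustration.
puts ZooCfg.new({ 'zk-b.example.com' => 2, 'zk-a.example.com' => 1 }, true).render
```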

And last but not least, the systemd unit file (zookeeper.service):

[Unit]
Description=ZooKeeper Service
Documentation=http://zookeeper.apache.org
Requires=network.target
After=network.target

[Service]
Type=forking
User=zookeeper
Group=zookeeper
ExecStart=/opt/zookeeper/bin/zkServer.sh start /opt/zookeeper/conf/zoo.cfg
ExecStop=/opt/zookeeper/bin/zkServer.sh stop /opt/zookeeper/conf/zoo.cfg
ExecReload=/opt/zookeeper/bin/zkServer.sh restart /opt/zookeeper/conf/zoo.cfg
WorkingDirectory=/var/lib/zookeeper

[Install]
WantedBy=default.target

Also, if you want to enable simple MD5 authentication, you will need to add the following two lines in hiera.

zookeeperstd::authenabled: true
zookeeperstd::jvm_flags: "-Djava.security.auth.login.config=/opt/zookeeper/conf/zoo_jaas.config"

If there is a simpler approach, feel free to leave me a message on LinkedIn or Twitter.

Cheers

June 18, 2019

Jolokia particular case using custom facts in Hiera

Hi,

This is for me and also for everyone else searching for how to use facts in Hiera.

In my case, I wanted to activate the HTTP endpoint of Jolokia using a custom hostname and the standard port. For that, it was sufficient to add the following line to my host yaml:
```
profiles::kafka::jolokia: "-javaagent:/usr/share/java/jolokia-jvm-agent.jar=port=8778,host=%{::networking.fqdn}"
```
This uses the standard fact called networking, which is a hash, and I am reading its fqdn key.
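
The dotted key simply walks into the structured fact. The toy resolver below illustrates the idea in Ruby; it is not Hiera's implementation (which also handles lookup functions, scopes and escaping), and the fact values are made up.

```ruby
# Toy resolver for "%{...}" tokens with dotted keys. It only illustrates the
# idea of walking into a structured fact; Hiera's real interpolation has more
# features that are ignored here.
def interpolate(value, facts)
  value.gsub(/%\{(?:::)?([^}]+)\}/) do
    path = Regexp.last_match(1).split('.')
    path.reduce(facts) { |node, key| node.fetch(key) }.to_s
  end
end

# Made-up fact values for the illustration.
facts = { 'networking' => { 'fqdn' => 'kafka01.example.com', 'hostname' => 'kafka01' } }
setting = '-javaagent:/usr/share/java/jolokia-jvm-agent.jar=port=8778,host=%{::networking.fqdn}'

puts interpolate(setting, facts)
```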

And it works.

Cheers

April 2, 2019

Fact for kafka_consumer hash…or kind of

Hi,

There is a late requirement to activate the kafka_consumer functionality of Datadog.

Unfortunately, this is a challenge if you don’t have a fixed number of consumer groups and topics (on one client we had a couple of hundred consumer groups).

Here is how it should look in the example file:

  #  consumer_groups:
  #    <CONSUMER_NAME_1>:
  #      <TOPIC_NAME_1>: [0, 1, 4, 12]
  #    <CONSUMER_NAME_2>:
  #      <TOPIC_NAME_2>:
  #    <CONSUMER_NAME_3>

So, if you want to grab the data using the old kafka-consumer-groups.sh, you will have to do it by iterating over every group and topic. I tried it this way, but even though it is done, it should not be an option for larger clusters.

require 'facter'

Facter.add('kafka_consumers_config') do
  setcode do
    base_cmd = '/opt/kafka/bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092'
    # List every consumer group known to the cluster, one per line.
    groups = Facter::Core::Execution.exec("#{base_cmd} --list").to_s.split("\n").map(&:strip).reject(&:empty?)
    group_hash = {}
    groups.each do |groupid|
      describe_cmd = "#{base_cmd} --group #{groupid} --describe | grep -v TOPIC"
      # Unique topic names consumed by this group (first column of --describe).
      topic_list_cmd = "#{describe_cmd} | awk '{print $1}' | sort | uniq"
      topics = Facter::Core::Execution.exec(topic_list_cmd).to_s.split("\n").map(&:strip).reject(&:empty?)
      topic_hash = {}
      topics.each do |topicid|
        # Partition numbers of this topic (second column of --describe).
        partition_cmd = "#{describe_cmd} | grep #{topicid} | awk '{print $2}' | sort"
        partitions = Facter::Core::Execution.exec(partition_cmd).to_s.split("\n").map(&:strip).reject(&:empty?)
        # Store the partitions as one space-separated string, keyed by the stripped topic name.
        topic_hash[topicid] = partitions.join(' ')
      end
      group_hash[groupid] = topic_hash
    end
    group_hash
  end
end

This is posted only as a snapshot and maybe as a source of inspiration. An actual working version should be written in Java or Scala in order to leverage the power of the libraries (from what I know, Python, Go and other languages only have libraries for the consumer/producer part).

If I pursue the task of rewriting this with a single query loop or in Java/Scala, you will see it here.
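
As a hint of what a single pass per group could look like, the sketch below parses the text of one --describe call into a topic to partitions hash, instead of calling the script again for every topic. It rests on an assumption that must be checked against your Kafka version: TOPIC is the first column and PARTITION the second, which is what the awk calls in the fact rely on. The helper name and the sample output are made up.

```ruby
# Parse the text of one `--describe` call into { topic => [partitions] }.
# Assumption: TOPIC is the first column and PARTITION the second; other
# Kafka versions may print the columns in a different order.
def topics_from_describe(output)
  result = Hash.new { |hash, topic| hash[topic] = [] }
  output.each_line do |line|
    topic, partition = line.split
    # Skip blank lines, the header row and rows without a numeric partition.
    next if topic.nil? || topic == 'TOPIC' || partition !~ /\A\d+\z/

    result[topic] << partition.to_i
  end
  result.transform_values(&:sort)
end

# Made-up sample output for the illustration.
sample = <<~OUT
  TOPIC    PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG  CONSUMER-ID  HOST       CLIENT-ID
  orders   1          120             125             5    c1           /10.0.0.5  app
  orders   0          300             300             0    c1           /10.0.0.5  app
  payments 0          42              42              0    c2           /10.0.0.6  app
OUT

p topics_from_describe(sample)
```

Returning the partitions as a list of integers also matches the shape of the Datadog example file shown earlier.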

March 5, 2019

Distributing service conditionally on OS version

Hi,

Since we are in the process of migrating to 16.04, my service restart script needed to be deployed in separate builds.

For that purpose, I found a fact that helps, so my standard file block transformed into this:

case $facts['os']['distro']['codename'] {
    'xenial': {
        file { '/root/servicerestart':
            source  => 'puppet:///modules/profiles/servicerestart-kafka-new',
            mode    => '0755',
            replace => true,
        }
    }
    'trusty': {
        file { '/root/servicerestart':
            source  => 'puppet:///modules/profiles/servicerestart-kafka',
            mode    => '0755',
            replace => true,
        }
    }
}

That should be all for now.

February 8, 2019

Order Linux processes by memory usage

This one is more for me actually. We have some issues with one puppet instance on which the processes fail, and I wanted to see if there is a way to order them by memory usage.

So I searched the net and found this link: https://unix.stackexchange.com/questions/92493/sorting-down-processes-by-memory-usage

The command is:

ps aux --sort -rss | head -10

And it provides the following output, at least in my case:

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
puppet    6327 70.1 25.5 3585952 1034532 ?     Sl   06:53   7:33 /usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java -XX:OnOutOfMemoryError=kill -9 %p -Djava.security.egd=/dev/urandom -javaagent:/usr/share/java/jolokia-jvm-agent.jar=port=8778 -Xms1024m -Xmx1024m -cp /opt/puppetlabs/server/apps/puppetserver/puppet-server-release.jar clojure.main -m puppetlabs.trapperkeeper.main --config /etc/puppetlabs/puppetserver/conf.d -b /etc/puppetlabs/puppetserver/bootstrap.cfg
jenkins   6776  9.6 16.6 4648236 671980 ?      Sl   06:55   0:51 /usr/bin/java -Djava.awt.headless=true -javaagent:/usr/share/java/jolokia-jvm-agent.jar=port=8780 -Xms1024m -Xmx1024m -jar /usr/share/jenkins/jenkins.war --webroot=/var/cache/jenkins/war --httpPort=8080 --httpListenAddress=127.0.0.1
puppetdb  5987 16.8 11.7 3845896 474164 ?      Sl   06:52   2:01 /usr/bin/java -XX:OnOutOfMemoryError=kill -9 %p -Djava.security.egd=/dev/urandom -Xmx192m -javaagent:/usr/share/java/jolokia-jvm-agent.jar=port=8779 -cp /opt/puppetlabs/server/apps/puppetdb/puppetdb.jar clojure.main -m puppetlabs.puppetdb.main --config /etc/puppetlabs/puppetdb/conf.d -b /etc/puppetlabs/puppetdb/bootstrap.cfg
postgres  1458  0.0  2.1 249512 88656 ?        Ss   Nov21   3:10 postgres: checkpointer process                                                                                              
postgres  6206  0.0  1.4 253448 57984 ?        Ss   06:53   0:00 postgres: puppetdb puppetdb 127.0.0.1(36882) idle                                                                           
postgres  6209  0.0  0.7 252580 29820 ?        Ss   06:53   0:00 postgres: puppetdb puppetdb 127.0.0.1(36886) idle                                                                           
postgres  6210  0.0  0.5 254892 22440 ?        Ss   06:53   0:00 postgres: puppetdb puppetdb 127.0.0.1(36888) idle                                                                           
postgres  6213  0.0  0.5 254320 21416 ?        Ss   06:53   0:00 postgres: puppetdb puppetdb 127.0.0.1(36894) idle                                                                           
postgres  6205  0.0  0.5 253524 20324 ?        Ss   06:53   0:00 postgres: puppetdb puppetdb 127.0.0.1(36878) idle

As you can probably see, the components are slowly but surely taking more and more memory, and since the machine has only 4GB allocated, it will probably crash again.

If this happens, I will manually increase the memory by another 2GB and see where we go from there.
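
To follow the trend per service rather than per process, the same listing can be summed by user. The helper below is a sketch of that idea; its name is hypothetical, it assumes the ps aux column layout from the header in the listing, and the sample is a shortened copy of that output.

```ruby
# Hypothetical helper: sum resident memory (RSS, in KiB) per user from the
# text printed by `ps aux`. Column positions follow the usual ps aux header.
def rss_by_user(ps_output)
  totals = Hash.new(0)
  ps_output.each_line.drop(1).each do |line|
    fields = line.split(nil, 11) # COMMAND may contain spaces, keep it whole
    next if fields.size < 11

    totals[fields[0]] += fields[5].to_i
  end
  # Largest consumer first.
  totals.sort_by { |_user, rss| -rss }.to_h
end

# Shortened sample built from the listing in this post.
sample = <<~PS
  USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
  puppet    6327 70.1 25.5 3585952 1034532 ?     Sl   06:53   7:33 /usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java -Xms1024m
  postgres  1458  0.0  2.1 249512 88656 ?        Ss   Nov21   3:10 postgres: checkpointer process
  postgres  6206  0.0  1.4 253448 57984 ?        Ss   06:53   0:00 postgres: puppetdb puppetdb 127.0.0.1(36882) idle
PS

rss_by_user(sample).each { |user, kib| puts format('%-10s %8.1f MiB', user, kib / 1024.0) }
```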

Cheers!

December 14, 2018
