Hi,
Over the last couple of days I worked on deploying a Prometheus server and the JMX exporter agent for Kafka monitoring, so I will share the main steps you need to follow in order to achieve this.
The first thing to do is to grab the prometheus and grafana Puppet modules, which you will find at the following links:
https://forge.puppet.com/puppet/prometheus
https://forge.puppet.com/puppet/grafana
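One way to pull them into your Puppet environment is simply with the module tool (a Puppetfile managed by r10k works just as well); the wget module used later in this post is included here too:

puppet module install puppet-prometheus
puppet module install puppet-grafana
puppet module install leonardothibes-wget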
After these are imported into Puppet, you need to create the following manifest files:
grafana.pp
class profiles::grafana {
  class { '::grafana':
    cfg => {
      app_mode => 'production',
      server   => {
        http_port => 8080,
      },
      database => {
        type     => 'sqlite3',
        host     => '127.0.0.1:3306',
        name     => 'grafana',
        user     => 'root',
        password => 'grafana',
      },
      users    => {
        allow_sign_up => false,
      },
    },
  }
}
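The grafana class above only installs and configures the service itself. If you also want Puppet to register Prometheus as a datasource, the same module ships a grafana_datasource resource type; a sketch along these lines (the URLs and credentials are placeholders, not part of the original setup) could be appended to the profile:

  grafana_datasource { 'prometheus':
    grafana_url      => 'http://localhost:8080',
    grafana_user     => 'admin',
    grafana_password => 'admin',
    type             => 'prometheus',
    url              => 'http://localhost:9090',
    access           => 'proxy',
    is_default       => true,
  }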
prometheusserver.pp
class profiles::prometheusserver {
  $kafka_nodes = hiera('profiles::prometheusserver::nodes', undef)
  if $kafka_nodes {
    class { '::prometheus':
      global_config  => { 'scrape_interval' => '15s', 'evaluation_interval' => '15s', 'external_labels' => { 'monitor' => 'master' } },
      rule_files     => [ '/etc/prometheus/alert.rules' ],
      scrape_configs => [
        { 'job_name' => 'prometheus', 'scrape_interval' => '30s', 'scrape_timeout' => '30s', 'static_configs' => [ { 'targets' => ['localhost:9090'], 'labels' => { 'alias' => 'Prometheus' } } ] },
        { 'job_name' => 'kafka', 'scrape_interval' => '10s', 'scrape_timeout' => '10s', 'static_configs' => [ { 'targets' => $kafka_nodes } ] },
      ],
    }
  } else {
    class { '::prometheus':
      global_config  => { 'scrape_interval' => '15s', 'evaluation_interval' => '15s', 'external_labels' => { 'monitor' => 'master' } },
      rule_files     => [ '/etc/prometheus/alert.rules' ],
      scrape_configs => [
        { 'job_name' => 'prometheus', 'scrape_interval' => '30s', 'scrape_timeout' => '30s', 'static_configs' => [ { 'targets' => ['localhost:9090'], 'labels' => { 'alias' => 'Prometheus' } } ] },
      ],
    }
  }
}
prometheusnode.pp
class profiles::prometheusnode (
  $jmxexporter_dir     = hiera('jmxexporter::dir', '/opt/jmxexporter'),
  $jmxexporter_version = hiera('jmxexporter::version', '0.9'),
) {
  include ::prometheus::node_exporter
  # validate_string($jmxexporter_dir)

  file { $jmxexporter_dir:
    ensure => 'directory',
  }

  file { "${jmxexporter_dir}/prometheus_config.yaml":
    source => 'puppet:///modules/profiles/prometheus_config',
  }

  wget::fetch { "https://repo1.maven.org/maven2/io/prometheus/jmx/jmx_prometheus_javaagent/${jmxexporter_version}/jmx_prometheus_javaagent-${jmxexporter_version}.jar":
    destination => "${jmxexporter_dir}/",
    cache_dir   => '/tmp/',
    timeout     => 0,
    verbose     => false,
    unless      => "test -e ${jmxexporter_dir}/jmx_prometheus_javaagent-${jmxexporter_version}.jar",
  }
}
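Both parameters are looked up through hiera, so the install directory and exporter version can be overridden at any hiera level without touching the class; shown here with the default values:

---
jmxexporter::dir: '/opt/jmxexporter'
jmxexporter::version: '0.9'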
As you can see, I used the wget module to fetch the JMX exporter jar, so I have to share that one as well:
https://forge.puppet.com/leonardothibes/wget
Also required is the JMX exporter configuration file, which translates the JMX data exposed by Kafka into metrics that can be imported into Prometheus:
prometheus_config:
lowercaseOutputName: true
rules:
- pattern : kafka.cluster<type=(.+), name=(.+), topic=(.+), partition=(.+)><>Value
  name: kafka_cluster_$1_$2
  labels:
    topic: "$3"
    partition: "$4"
- pattern : kafka.log<type=Log, name=(.+), topic=(.+), partition=(.+)><>Value
  name: kafka_log_$1
  labels:
    topic: "$2"
    partition: "$3"
- pattern : kafka.controller<type=(.+), name=(.+)><>(Count|Value)
  name: kafka_controller_$1_$2
- pattern : kafka.network<type=(.+), name=(.+)><>Value
  name: kafka_network_$1_$2
- pattern : kafka.network<type=(.+), name=(.+)PerSec, request=(.+)><>Count
  name: kafka_network_$1_$2_total
  labels:
    request: "$3"
- pattern : kafka.network<type=(.+), name=(\w+), networkProcessor=(.+)><>Count
  name: kafka_network_$1_$2
  labels:
    request: "$3"
  type: COUNTER
- pattern : kafka.network<type=(.+), name=(\w+), request=(\w+)><>Count
  name: kafka_network_$1_$2
  labels:
    request: "$3"
- pattern : kafka.network<type=(.+), name=(\w+)><>Count
  name: kafka_network_$1_$2
- pattern : kafka.server<type=(.+), name=(.+)PerSec\w*, topic=(.+)><>Count
  name: kafka_server_$1_$2_total
  labels:
    topic: "$3"
- pattern : kafka.server<type=(.+), name=(.+)PerSec\w*><>Count
  name: kafka_server_$1_$2_total
  type: COUNTER
- pattern : kafka.server<type=(.+), name=(.+), clientId=(.+), topic=(.+), partition=(.*)><>(Count|Value)
  name: kafka_server_$1_$2
  labels:
    clientId: "$3"
    topic: "$4"
    partition: "$5"
- pattern : kafka.server<type=(.+), name=(.+), topic=(.+), partition=(.*)><>(Count|Value)
  name: kafka_server_$1_$2
  labels:
    topic: "$3"
    partition: "$4"
- pattern : kafka.server<type=(.+), name=(.+), topic=(.+)><>(Count|Value)
  name: kafka_server_$1_$2
  labels:
    topic: "$3"
  type: COUNTER
- pattern : kafka.server<type=(.+), name=(.+), clientId=(.+), brokerHost=(.+), brokerPort=(.+)><>(Count|Value)
  name: kafka_server_$1_$2
  labels:
    clientId: "$3"
    broker: "$4:$5"
- pattern : kafka.server<type=(.+), name=(.+), clientId=(.+)><>(Count|Value)
  name: kafka_server_$1_$2
  labels:
    clientId: "$3"
- pattern : kafka.server<type=(.+), name=(.+)><>(Count|Value)
  name: kafka_server_$1_$2
- pattern : kafka.(\w+)<type=(.+), name=(.+)PerSec\w*><>Count
  name: kafka_$1_$2_$3_total
- pattern : kafka.(\w+)<type=(.+), name=(.+)PerSec\w*, topic=(.+)><>Count
  name: kafka_$1_$2_$3_total
  labels:
    topic: "$4"
  type: COUNTER
- pattern : kafka.(\w+)<type=(.+), name=(.+)PerSec\w*, topic=(.+), partition=(.+)><>Count
  name: kafka_$1_$2_$3_total
  labels:
    topic: "$4"
    partition: "$5"
  type: COUNTER
- pattern : kafka.(\w+)<type=(.+), name=(.+)><>(Count|Value)
  name: kafka_$1_$2_$3_$4
  type: COUNTER
- pattern : kafka.(\w+)<type=(.+), name=(.+), (\w+)=(.+)><>(Count|Value)
  name: kafka_$1_$2_$3_$6
  labels:
    "$4": "$5"
OK, so in order to put all of this together we will use plain old hiera :). For the machine on which you want to run the Prometheus server you will need to create a role (or just put it in that host's fqdn.yaml) that looks like this:
prometheus.yaml
---
classes:
  - 'profiles::prometheusserver'
  - 'profiles::grafana'
alertrules:
  -
    name: 'InstanceDown'
    condition: 'up == 0'
    timeduration: '5m'
    labels:
      -
        name: 'severity'
        content: 'critical'
    annotations:
      -
        name: 'summary'
        content: 'Instance {{ $labels.instance }} down'
      -
        name: 'description'
        content: '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes.'
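For reference, assuming these alertrules entries get rendered into the classic pre-2.0 rule syntax implied by /etc/prometheus/alert.rules (the template that does this is part of our custom setup and not shown here), the generated rule would look roughly like this:

ALERT InstanceDown
  IF up == 0
  FOR 5m
  LABELS { severity = "critical" }
  ANNOTATIONS {
    summary = "Instance {{ $labels.instance }} down",
    description = "{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes.",
  }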
This is the default installation. Because it's a "role", on each Prometheus host I also created a specific fqdn.yaml file to tell it which nodes should be scraped for exposed metrics. Here is an example:
---
profiles::prometheusserver::nodes:
  - 'kafka0:7071'
  - 'kafka1:7071'
  - 'kafka2:7071'
The three nodes are just an example; you can list every node on which you include the prometheus node class.
And here is how the node side should look:
---
classes:
  - 'profiles::prometheusnode'
profiles::kafka::jolokia: '-javaagent:/usr/share/java/jolokia-jvm-agent.jar -javaagent:/opt/jmxexporter/jmx_prometheus_javaagent-0.9.jar=7071:/opt/jmxexporter/prometheus_config.yaml'
Now I need to explain that jolokia variable, right? Yeah, it's pretty straightforward. The Kafka installation was already written, and it included the Jolokia agent; our broker definition block looks like this:
class { '::kafka::broker':
  config    => $broker_config,
  opts      => hiera('profiles::kafka::jolokia', '-javaagent:/usr/share/java/jolokia-jvm-agent.jar'),
  heap_opts => "-Xmx${jvm_heap_size}M -Xms${jvm_heap_size}M",
}
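With the hiera override from the node's fqdn.yaml, the opts value above makes the broker JVM start with both agents. On a broker you would then see something along these lines in the Java process arguments (other flags and the exact paths are omitted here and will differ per installation):

java ... -javaagent:/usr/share/java/jolokia-jvm-agent.jar -javaagent:/opt/jmxexporter/jmx_prometheus_javaagent-0.9.jar=7071:/opt/jmxexporter/prometheus_config.yaml ... kafka.Kafka /etc/kafka/server.properties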
So I needed to put the JMX exporter agent beside Jolokia in the Kafka startup options, and once this is deployed you will see the JMX exporter started as a Java agent on each broker. Anyhow, when everything is in place you will have a Prometheus config that should look like this:
---
global:
  scrape_interval: 15s
  evaluation_interval: 15s
  external_labels:
    monitor: master
rule_files:
  - /etc/prometheus/alert.rules
scrape_configs:
  - job_name: prometheus
    scrape_interval: 30s
    scrape_timeout: 30s
    static_configs:
      - targets:
          - localhost:9090
        labels:
          alias: Prometheus
  - job_name: kafka
    scrape_interval: 10s
    scrape_timeout: 10s
    static_configs:
      - targets:
          - kafka0:7071
          - kafka1:7071
          - kafka2:7071
You can also see the scraped nodes under Status -> Targets in the Prometheus UI, and yeah, all the metrics are available per node at http://[kafka-node]:7071/metrics.
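Once the targets are up you can start querying the Kafka metrics from Prometheus or Grafana; as one example (assuming the per-topic BrokerTopicMetrics bean shown earlier), the incoming message rate per topic could be graphed with a query like:

sum(rate(kafka_server_brokertopicmetrics_messagesin_total[5m])) by (topic)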
I think this should be it. I don't know if I covered everything, and there are a lot of details specific to our custom installation, but at least I managed to share the essentials. The article that helped me a lot along the way can be found here:
https://www.robustperception.io/monitoring-kafka-with-prometheus/
Cheers!