Tag: kafka

Small Mirror Maker test between different Kafka clusters
Hi,

Today I want to show you what I have been playing with for the last day. The business case: some colleagues from Analytics wanted to replicate all the data from other systems into their own cluster.

We will start with two independently configured clusters of 3 servers each (each server runs one ZooKeeper and one Kafka node). On both the source and the target I created a topic with five partitions and a replication factor of three. Here is its description:
```
/opt/kafka/bin/kafka-topics.sh --zookeeper localhost:2181 --describe --topic test-topic
Topic:test-topic	PartitionCount:5	ReplicationFactor:3	Configs:
	Topic: test-topic	Partition: 0	Leader: 1002	Replicas: 1002,1003,1001	Isr: 1002,1003,1001
	Topic: test-topic	Partition: 1	Leader: 1003	Replicas: 1003,1001,1002	Isr: 1003,1001,1002
	Topic: test-topic	Partition: 2	Leader: 1001	Replicas: 1001,1002,1003	Isr: 1001,1002,1003
	Topic: test-topic	Partition: 3	Leader: 1002	Replicas: 1002,1001,1003	Isr: 1002,1001,1003
	Topic: test-topic	Partition: 4	Leader: 1003	Replicas: 1003,1002,1001	Isr: 1003,1002,1001
```
The command for creating it is actually pretty simple: `/opt/kafka/bin/kafka-topics.sh --zookeeper localhost:2181 --create --replication-factor 3 --partitions 5 --topic test-topic`

Once the topics are created on both Kafka clusters, we need to start Mirror Maker (Hortonworks recommends running the process on the destination cluster). To do that, we need to create two config files on the destination. You can call them producer.config and consumer.config.

For the consumer.config we have the following structure:
```
bootstrap.servers=source_node0:9092,source_node1:9092,source_node2:9092
exclude.internal.topics=true
group.id=test-consumer-group
client.id=mirror_maker_consumer
```
For the producer.config we have the following structure:
```
bootstrap.servers=destination_node0:9092,destination_node1:9092,destination_node2:9092
acks=1
batch.size=100
client.id=mirror_maker_producer
```
These are the main settings. Also make sure that your consumer.properties contains the line group.id=test-consumer-group.

OK, so far so good. Now let's start Mirror Maker with the command below; once it has started, you can see it running beside Kafka and ZooKeeper using ps -ef | grep java
```
/opt/kafka/bin/kafka-run-class.sh kafka.tools.MirrorMaker --consumer.config ../config/consumer.config --producer.config ../config/producer.config --whitelist test-topic &
```
To check the offsets, in newer versions of Kafka you can use:
```
/opt/kafka/bin# ./kafka-run-class.sh kafka.admin.ConsumerGroupCommand --group test-consumer-group --bootstrap-server localhost:9092 --describe
GROUP                          TOPIC                          PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG             OWNER
test-consumer-group            test-topic                     0          2003            2003            0               test-consumer-group-0_/[dest_ip]
test-consumer-group            test-topic                     1          2002            2002            0               test-consumer-group-0_/[dest_ip]
test-consumer-group            test-topic                     2          2003            2003            0               test-consumer-group-0_/[dest_ip]
test-consumer-group            test-topic                     3          2004            2004            0               test-consumer-group-0_/[dest_ip]
test-consumer-group            test-topic                     4          2002            2002            0               test-consumer-group-0_/[dest_ip]
```
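If you want a single number instead of reading the table by eye, the LAG column can be summed with a small helper. This is only a sketch: `total_lag` is a name I made up, and it assumes the column layout shown above (LAG as the sixth whitespace-separated field), which may differ in other Kafka versions.

```shell
#!/bin/sh
# Sum the LAG column of `ConsumerGroupCommand --describe` output read from stdin.
# Assumed layout: GROUP TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG OWNER
total_lag() {
    awk '
        $1 == "GROUP" { next }            # skip the header row
        $6 ~ /^[0-9]+$/ { sum += $6 }     # only count rows with a numeric LAG value
        END { print sum + 0 }             # print 0 when there were no rows at all
    '
}

# Example usage (needs a reachable broker):
# /opt/kafka/bin/kafka-run-class.sh kafka.admin.ConsumerGroupCommand \
#     --group test-consumer-group --bootstrap-server localhost:9092 --describe | total_lag
```

A total of 0 means Mirror Maker has consumed everything that is currently in the source topic.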
I tested the concept by running a short bash loop that creates 10000 records and writes them to a file: `for i in $(seq 1 10000); do echo $i; done >> test.txt`. The file can then be fed very easily to the source topic with the console producer: `/opt/kafka/bin/kafka-console-producer.sh --broker-list localhost:9092 --topic test-topic < test.txt`

After this is finished, feel free to take a look at the topic using `/opt/kafka/bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test-topic --from-beginning` and you should see a lot of lines 🙂

Thank you for your time, and if there are any parts that I missed, please reply.
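Instead of scrolling through the console consumer output, the two clusters can also be compared by their end offsets. The sketch below is not part of my original test: `sum_offsets` is a hypothetical helper that adds up the `topic:partition:offset` lines printed by `kafka.tools.GetOffsetShell`. End offsets only equal record counts if the topic started empty and nothing has been deleted by retention, and Mirror Maker can deliver duplicates, so treat a match as a sanity check rather than proof.

```shell
#!/bin/sh
# Sum per-partition end offsets from GetOffsetShell output,
# which prints one "topic:partition:offset" line per partition.
sum_offsets() {
    awk -F: 'NF == 3 && $3 ~ /^[0-9]+$/ { sum += $3 } END { print sum + 0 }'
}

# Example usage; the host names are the placeholders from the config files:
# src=$(/opt/kafka/bin/kafka-run-class.sh kafka.tools.GetOffsetShell \
#         --broker-list source_node0:9092 --topic test-topic --time -1 | sum_offsets)
# dst=$(/opt/kafka/bin/kafka-run-class.sh kafka.tools.GetOffsetShell \
#         --broker-list destination_node0:9092 --topic test-topic --time -1 | sum_offsets)
# if [ "$src" -eq "$dst" ]; then echo "offsets match: $src"; else echo "source=$src destination=$dst"; fi
```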

Cheers!
May 18, 2017
Check Kafka JMX node stats using JConsole

Hi,

As you probably know, Kafka already publishes a lot of performance data over JMX, ready to be collected.
To browse it, you will need jconsole (on Windows it is already bundled with the JDK installation; on Linux you can use this article to find which package provides it: https://www.garron.me/en/linux/find-which-package-library-belongs.html). After that, you just have to export the JMX_PORT variable in your environment (for example export JMX_PORT=9999) before you start the Kafka node. When you open JConsole, you will probably see the Kafka node in its connection dialog.

After you select the Kafka node, JConsole will tell you that the connection is not secure, but from my point of view that doesn't matter here, and after that you get an overview of the process. The statistics are available in the MBeans tab; extra info about their meaning can be found in the official documentation and also in the Datadog article.

This is a simple single-node configuration. If needed, I will post some more complex configurations, but those are only required in special cases; for a bigger infrastructure, standard monitoring using Datadog, Prometheus or another solution needs to be implemented.
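The JMX setup described above can be sketched as a small shell function. The function name is made up for illustration, and the /opt/kafka default for KAFKA_HOME is an assumption taken from the paths used in my other Kafka posts; adjust both to your installation.

```shell
#!/bin/sh
# Start a Kafka broker with JMX enabled on the given port (default 9999).
# KAFKA_HOME falling back to /opt/kafka is an assumption, not a Kafka default.
start_kafka_with_jmx() {
    port="${1:-9999}"
    kafka_home="${KAFKA_HOME:-/opt/kafka}"
    # The Kafka start scripts read JMX_PORT and enable remote JMX on that port.
    JMX_PORT="$port" "$kafka_home/bin/kafka-server-start.sh" -daemon "$kafka_home/config/server.properties"
}

# Example usage:
# start_kafka_with_jmx 9999
# jconsole localhost:9999
```

Setting the variable only for the start command keeps it out of the rest of the shell session, so other Kafka tools started from the same terminal do not try to bind the same JMX port.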

Cheers

April 24, 2017
Monitoring Kafka with DataDog

Hi,

Here is a very interesting series of articles, worth checking out, on one option for monitoring Kafka with Datadog:

https://www.datadoghq.com/blog/monitoring-kafka-performance-metrics/

I will have a task related to this in the near future and will post the outcome when it's done.

P.S.: It was done, and you can find the implementation here:

Integrate Kafka with Datadog monitoring using puppet

As for the metrics, we will see if this is really an option; I have tried the same with Prometheus and Grafana and it seems to work better. I will keep you posted.

Cheers

April 13, 2017
Monitoring Kafka node using Docker

Hi,

Today I am just going to point you to a very interesting article on monitoring one or more Kafka nodes using InfluxDB, Grafana and Docker. I hope it is useful; I will surely try it one of these days.

https://softwaremill.com/monitoring-apache-kafka-with-influxdb-grafana/

Now this is not quite standard but nevertheless it is an option.

Cheers!

April 12, 2017