docker linux newtools

List differences between two sftp hosts using golang


This is an intermediate post: I wanted to play a little with golang, so let me show you what I managed to put together in a few days. I created a virtual machine, installed docker on it and grabbed an sftp image. You can try the first two results from Docker Hub; either should work.
So I pulled this image and started two containers, as shown below:

eaf3b93798b5        asavartzeth/sftp    "/ /u..."   21 hours ago        Up About a minute>22/tcp   server4
ec7d7e1d029f        asavartzeth/sftp    "/ /u..."   21 hours ago        Up About a minute>22/tcp   server3

The command to do this looks like:

docker run --name server3 -v /home/sorin/sftp1:/chroot/sorin:rw -e SFTP_USER=sorin -e SFTP_PASS=pass -p 2224:22 -d asavartzeth/sftp
docker run --name server4 -v /home/sorin/sftp2:/chroot/sorin:rw -e SFTP_USER=sorin -e SFTP_PASS=pass -p 2225:22 -d asavartzeth/sftp

The main thing to know about these containers is that they are accessible with user sorin, and that the external directories are mapped to /chroot/sorin.

You can manually test the connection by using a simple command like:

sftp -P 2224 sorin@localhost

If you connect to the container IP address instead, I observed that you use the default port 22. That is expected: the -p 2224:22 mapping only publishes the port on the host, while the ssh daemon inside the container still listens on 22.

Once the servers are up and running, you can list the differences between the two directory trees with the following code:

package main

import (
	"fmt"
	"os"

	"github.com/pkg/sftp"
	"golang.org/x/crypto/ssh"
)

// ServerFiles holds the list of paths found on one server.
type ServerFiles struct {
	Name  string
	files []string
}

func main() {
	server1client := ConnectSftp("localhost:2224", "sorin", "pass")
	server1files := ReadPath(server1client)
	server1struct := BuildStruct("", server1files)
	server2client := ConnectSftp("localhost:2225", "sorin", "pass")
	server2files := ReadPath(server2client)
	server2struct := BuildStruct("", server2files)
	diffilesstruct := CompareStruct(server1struct, server2struct)
	for _, values := range diffilesstruct.files {
		fmt.Printf("%s %s\n", diffilesstruct.Name, values)
	}
	CloseConnection(server1client)
	CloseConnection(server2client)
}

// CheckError stops the program on the first error (assumed behavior).
func CheckError(err error) {
	if err != nil {
		fmt.Println(err)
		os.Exit(1)
	}
}

// ConnectSftp opens an ssh connection with password auth and wraps it in an sftp client.
func ConnectSftp(address string, user string, password string) *sftp.Client {
	config := &ssh.ClientConfig{
		User: user,
		Auth: []ssh.AuthMethod{
			ssh.Password(password),
		},
		// Test setup only: the host key is not verified.
		HostKeyCallback: ssh.InsecureIgnoreHostKey(),
	}
	conn, err := ssh.Dial("tcp", address, config)
	CheckError(err)

	client, err := sftp.NewClient(conn)
	CheckError(err)

	return client
}

// ReadPath walks the whole remote tree and returns every path it finds.
func ReadPath(client *sftp.Client) []string {
	var paths []string
	w := client.Walk("/")
	for w.Step() {
		if w.Err() != nil {
			continue
		}
		paths = append(paths, w.Path())
	}
	return paths
}

func BuildStruct(address string, files []string) *ServerFiles {
	server := new(ServerFiles)
	server.Name = address
	server.files = files

	return server
}

// CompareStruct returns a structure holding only the paths that differ.
func CompareStruct(struct1 *ServerFiles, struct2 *ServerFiles) *ServerFiles {
	diff := difference(struct1.files, struct2.files)
	diffstruct := new(ServerFiles)
	for _, value := range diff {
		for _, valueP := range struct1.files {
			if valueP == value {
				diffstruct.Name = struct1.Name
				diffstruct.files = append(diffstruct.files, valueP)
			}
		}
		for _, valueQ := range struct2.files {
			if valueQ == value {
				diffstruct.Name = struct2.Name
				diffstruct.files = append(diffstruct.files, valueQ)
			}
		}
	}
	return diffstruct
}

func difference(slice1 []string, slice2 []string) []string {
	var diff []string

	// Loop two times, first to find slice1 strings not in slice2,
	// second loop to find slice2 strings not in slice1
	for i := 0; i < 2; i++ {
		for _, s1 := range slice1 {
			found := false
			for _, s2 := range slice2 {
				if s1 == s2 {
					found = true
					break
				}
			}
			// String not found. We add it to return slice
			if !found {
				diff = append(diff, s1)
			}
		}
		// Swap the slices, only if it was the first loop
		if i == 0 {
			slice1, slice2 = slice2, slice1
		}
	}

	return diff
}

func CloseConnection(client *sftp.Client) {
	client.Close()
}

This connects to each server, reads the whole file tree and puts it in a structure. After this is done for both servers, a function compares only the slice part of each struct and returns the differences. A new structure is then built that holds only those differences.
It is true that I took the difference func from stackoverflow, and it's far from good code, but I am working on it. This is the first draft, and I will post new versions as it gets better.

The output, if there are differences, will look like this:

/sorin/subdirectory
/sorin/subdirectory/subtest.file
/sorin/test.file
/sorin/test2

If there are no differences, it will just exit.
I am working on improving my golang experience. I will keep you posted.
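
A shorter way to express the same comparison, sketched here in Python rather than Go, is to treat each listing as a set and take the difference in both directions. The function name and the sample paths below are made up for illustration; this is not the code from the post, only the same idea.

```python
def compare_listings(files1, files2):
    """Return (only_on_first, only_on_second) as sorted lists of paths."""
    set1, set2 = set(files1), set(files2)
    return sorted(set1 - set2), sorted(set2 - set1)


if __name__ == "__main__":
    # Hypothetical listings, shaped like what ReadPath returns for each server.
    server3 = ["/sorin", "/sorin/test.file", "/sorin/test2"]
    server4 = ["/sorin", "/sorin/test.file", "/sorin/subdirectory"]

    only3, only4 = compare_listings(server3, server4)
    for path in only3:
        print("server3", path)
    for path in only4:
        print("server4", path)
```

Keeping the two sides separate also tells you which server a path is missing from, and the set lookups avoid the nested loops.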



How to change root password on Debian – after vacation


Since I had a vacation and completely forgot all my passwords for my Debian VM, I fixed it using this article. Very useful!


kafka linux

Ubuntu – change ulimit for kafka, do not ignore


I want to share something that took me half a day to clarify. I just read in the following article
and learned that in order to optimize kafka, you also need to raise the maximum number of open files. That is fine, but our clusters are deployed on Ubuntu and the images are pretty basic. I am not sure whether this applies to all distributions, but for this one it is absolutely needed.
Before trying to set up anything in


make sure that you have exported in



session required

This is needed so that ssh and su sessions pick up the new limits for that user (in our case kafka).
With this in place, the values you define in the “limits” file are actually applied. You are now free to set the nofile limit, for example:

*               soft    nofile          10000
*               hard    nofile          100000
kafka           soft    nofile          10000
kafka           hard    nofile          100000

After it is done, you can restart the cluster and check the value by finding the process with ps -ef | grep kafka and viewing its limits file with cat /proc/[kafka-process]/limits.

I will come back later with a puppet implementation for this as well.
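
If you want to script that check, here is a minimal Python sketch that reads the "Max open files" row from a limits file. The function names are my own, and it assumes the usual Linux layout of /proc/[pid]/limits (limit name, soft limit, hard limit, units).

```python
def parse_open_files_limit(limits_text):
    """Return (soft, hard) from the 'Max open files' row, or None if absent."""
    label = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(label):
            # Remaining columns: soft limit, hard limit, units.
            fields = line[len(label):].split()
            return fields[0], fields[1]
    return None


def read_open_files_limit(pid):
    """Read /proc/<pid>/limits for a running process (Linux only)."""
    with open("/proc/{}/limits".format(pid)) as handle:
        return parse_open_files_limit(handle.read())
```

The values are returned as strings because a limit can also be reported as "unlimited". Call read_open_files_limit with the PID you got from ps.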



Python dictionary construction from process list


This is outside my expertise but I wanted to share it anyway. A colleague asked me to help him build key:value pairs, in python, from a command that lists the processes. With a little bit of testing I came to the following form:

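import subprocess
from subprocess import PIPE

# List every process as "pid user".
username = subprocess.Popen(['/bin/ps','-eo','pid,uname'], stdout=PIPE, stderr=PIPE)
firstlist = username.communicate()[0].decode().split('\n')
processes = {}
for line in firstlist[1:]:  # the first line is the ps header
  if (line != ''):
    secondlist = line.split()
    key = secondlist[0]
    value = secondlist[1]
    processes[key] = value  # pid -> user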

Now, I think there are better ways to write this, but it works this way too.
If you find a better way, please leave a message 🙂
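
One possible tidier variant, as a sketch only: split the parsing into its own function and build the dictionary with a comprehension. The function names are my own. It passes each column as a separate -o option with a trailing "=", which asks ps to print no header line.

```python
import subprocess


def parse_ps_output(text):
    """Turn 'pid user' lines into a {pid: user} dictionary."""
    pairs = (line.split() for line in text.splitlines())
    return {fields[0]: fields[1] for fields in pairs if len(fields) >= 2}


def process_owners():
    """Run ps and map every PID to the user that owns it."""
    # An empty header text ("pid=", "uname=") suppresses the header row.
    output = subprocess.check_output(
        ["/bin/ps", "-e", "-o", "pid=", "-o", "uname="]
    )
    return parse_ps_output(output.decode())
```

Calling process_owners() returns the same kind of pid-to-user dictionary, and the parsing can be tested without running ps.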


linux newtools

Configure Jupyter Notebook on Raspberry PI 2 for remote access and scala kernel install


This is a continuation of the previous article regarding Jupyter Notebook. Let’s start with the modifications I made in order to have a remote connection to it. It first needs a password in the form of a password hash. To generate it, start the python CLI and run from IPython.lib import passwd; passwd("your_custom_password"). Once you have the password hash, here are the fields that I uncommented to activate minimal remote access:

c.NotebookApp.open_browser = False  # do not open a browser on notebook start; you will access it remotely, as a daemon
c.NotebookApp.ip = '*'  # permit access on every interface of the server
c.NotebookApp.password = u'[your_pass_hash]'  # set a password for the notebook; otherwise the token from the server is required (you can get it by running sudo systemctl status jupyter.service)

You can also add them at the bottom of the file. For the changes to take effect you also need to restart the service with sudo systemctl restart jupyter.service.
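
To show what that password hash looks like, here is a small Python sketch. As far as I understand it, the passwd() helper of that era returned a string of the form algorithm:salt:digest, where the digest is a SHA-1 over the passphrase followed by the salt. The function names below are my own, and this is an illustration of the format under that assumption, not a replacement for the real helper.

```python
import hashlib
import secrets


def legacy_notebook_hash(passphrase, salt=None, algorithm="sha1"):
    """Build an 'algorithm:salt:digest' string (assumed legacy format)."""
    if salt is None:
        salt = secrets.token_hex(6)  # 12 random hex characters
    digest = hashlib.new(algorithm)
    digest.update(passphrase.encode("utf-8") + salt.encode("ascii"))
    return ":".join((algorithm, salt, digest.hexdigest()))


def matches(hashed, passphrase):
    """Check a passphrase against a stored 'algorithm:salt:digest' string."""
    algorithm, salt, _ = hashed.split(":", 2)
    return legacy_notebook_hash(passphrase, salt, algorithm) == hashed
```

The salt is stored in the clear next to the digest, which is why the same password gives a different string on every run.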

You now have the basic steps to run Jupyter Notebook with the IPython 2 kernel. Now let’s get to the next step: installing the scala kernel.

The steps are pretty straightforward and are taken from this link; what I tried to do is put them together end to end. I am not 100% sure whether you also need java 8, but I installed it anyway. You will find the steps here, but what you really need to install is sbt.

The catch here is that you don’t need to search for an sbt build for raspberry; just install the default one, it will do the job. The steps are listed here. Once it is installed you can return to the link listed above and just run the steps:

apt-get install git
git clone
cd jupyter-scala
sbt cli/packArchive

Sbt will grab a lot of dependencies. If you work behind a proxy, I am not aware of the settings you need, but you can search for them and will probably find a solution. Have patience, it will take a while; once it is done you can run ./jupyter-scala to install the kernel, and check that it is registered with jupyter kernelspec list.

Restart the Jupyter Notebook to pick it up, although I am not convinced it’s necessary 🙂
In my case I have a dynamic dns service from my internet provider, but I think you can do it with a free dns provider on your router as well. An extra forward or NAT of port 8888 will be needed, but once this is done you should have a playground in your browser that knows python and scala. Cool, isn’t it?


linux newtools

Installing Jupyter Notebook on Raspberry PI 2


Just want to share that I managed to install Jupyter Notebook on a Raspberry PI 2 without any real problems. Besides a microSD card and a Raspberry, you need to read this and that would be all.
So, you will need an image of Raspbian (I selected the lite version without the GUI; you really don’t need it). I installed it on the card from Linux, so I executed a command similar to dd if=[path_to_image]/[image_name] of=[sd_device_name taken from fdisk -l, without partition id, usually /dev/mmcblk0] bs=4MB; sync. The sync command is added just to be sure that all data is written to the card before removing it. We now have a working image that we can use on the raspberry, so it’s fair to try to boot it.
Once it’s booted, log in with user pi and password raspberry. I am a fan of running the resize steps, which you can find here.
Ok, so we are good to go on installing Jupyter Notebook. First we need to check what Python version is installed; in my case it was 2.7.13 (shown by running python --version). With this version we need pip for the task, and it’s not present by default on the image.
Run sudo apt-get install python-pip and, after this is done, run pip install jupyter. It will take some time, but when it is done you will have a fresh installation in the pi home directory (/home/pi/.local).
We also need a service, and in order to do that, please create following path with following file:

[Unit]
Description=Jupyter Notebook

[Service]
# Step 1 and Step 2 details are here..
# ------------------------------------
ExecStart=/home/pi/.local/bin/jupyter-notebook --config=/home/pi/.jupyter/


You are probably wondering where you get the config file from. This is easy, just run /home/pi/.local/bin/jupyter notebook --generate-config

After the file is created, enable and start the service with sudo systemctl enable jupyter.service and sudo systemctl start jupyter.service

You now have a fresh and auto-managed jupyter service. It listens only on localhost by default, but in the next article I will cover the modifications needed to reach it remotely and to install the scala kernel.
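
A quick way to confirm that the service is really listening is a small Python check like the one below. It is only a sketch: the function name is my own, and it assumes the notebook runs on its default port 8888 on localhost.

```python
import socket


def is_listening(host="127.0.0.1", port=8888, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("jupyter reachable:", is_listening())
```

Run it on the Raspberry itself, since at this stage the notebook is not reachable from other machines.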


kafka linux puppet

Securing kafka-manager endpoints with iptables rules behind traefik


One extra addition to my traefik balancing article: even though we now have the balancing capability, we still need to restrict access to the unsecured endpoint. I wanted all the code to be deployable on all of the nodes. With that in mind, the firewall rules are easily handled with the puppetlabs firewall module, and the code that I included looks like:

$hosts_count = $kafka_hosts.count

package { 'iptables-persistent':
  ensure => installed,
  name   => 'iptables-persistent',
}

# Remove any firewall rule that is not managed by puppet
resources { 'firewall':
  purge => true,
}

# One accept rule per kafka host
$kafka_hosts.each | Integer $index, String $host | {
  firewall { "10${index} adding tcp rule kafka manager node ${index}":
    proto       => 'tcp',
    dport       => 9000,
    source      => "${host}",
    destination => "${fqdn}",
    action      => 'accept',
  }
}

# Everything else that targets port 9000 is dropped
firewall { "10${hosts_count} droping rest of kafka manager calls":
  proto       => 'tcp',
  dport       => 9000,
  destination => "${fqdn}",
  action      => 'drop',
}

This should add rules that allow traffic on port 9000 only between the kafka hosts that have kafka manager installed.
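
To make the numbering easier to follow, here is a small Python sketch that builds the same list of rules the Puppet loop declares: one accept rule per host, then a final drop rule numbered after the last host. The function name and the host names in the example are made up for illustration.

```python
def kafka_manager_rules(kafka_hosts, fqdn, port=9000):
    """Return the rules as dictionaries, in the order they are declared."""
    rules = []
    for index, host in enumerate(kafka_hosts):
        rules.append({
            "name": "10{0} adding tcp rule kafka manager node {0}".format(index),
            "proto": "tcp",
            "dport": port,
            "source": host,
            "destination": fqdn,
            "action": "accept",
        })
    # Catch-all rule: no source, so it matches every other caller.
    rules.append({
        "name": "10{} droping rest of kafka manager calls".format(len(kafka_hosts)),
        "proto": "tcp",
        "dport": port,
        "source": None,
        "destination": fqdn,
        "action": "drop",
    })
    return rules


if __name__ == "__main__":
    # Hypothetical host names.
    for rule in kafka_manager_rules(["kafka1", "kafka2", "kafka3"], "kafka1"):
        print(rule["name"], "->", rule["action"])
```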


List paths created by package install on Ubuntu


I was searching this morning for a way to see what paths and files were created by a package installed with puppet, and I found this:

root@test:~# apt list --installed | grep goss

WARNING: apt does not have a stable CLI interface yet. Use with caution in scripts.

goss/trusty,now 0.3.0-3 amd64 [installed]
root@test:~# dpkg-query -L goss

No other things to add.

cloud docker linux

Sysdig container isolation case debugged on kubernetes


I didn’t get to actually test anything related to this, but I managed to find a very interesting article that might be missed if you are not a sysdig fan. You can find it at the following link.

To put it into perspective, this tool is used for some very interesting debugging situations. I have played with it for a short period of time, and I think I will put it on my list so that I can show you what it can do.



Memory check by process in Linux


I wanted to post this since it might be useful in some situations. On a Linux machine, one way to check memory usage by the top processes is ps aux --sort -rss (this orders the output by Resident Set Size). Once executed, it returns output similar to this:

sorin 3673 0.6 27.3 3626020 563964 pts/1 Sl+ 02:24 1:09 java -Xmx1G -Xms1G -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+Disa
sorin 1708 2.0 9.2 1835288 189692 ? Sl 02:11 3:56 /usr/bin/gnome-shell
sorin 1967 0.6 8.0 1642280 166160 ? Sl 02:12 1:11 firefox-esr
sorin 3413 0.1 3.7 2000252 77016 pts/0 Sl+ 02:21 0:19 java -Xmx512M -Xms512M -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+
root 576 0.5 2.6 263688 54172 tty7 Ssl+ 02:11 1:07 /usr/bin/Xorg :0 -novtswitch -background none -noreset -verbose 3 -auth /var/run/gdm3/auth-for-Debian-gdm-Bu1jB
sorin 1813 0.0 2.2 1175504 47196 ? Sl 02:11 0:00 /usr/lib/evolution/evolution-calendar-factory
root 486 0.1 1.2 377568 26584 ? Ssl 02:11 0:21 /usr/bin/dockerd -H fd://

If you want more detail about a PID, you can go to /proc/[pid]/status, where you can find a lot of other information. For example, the top process on my Linux machine has the following header:

sorin@debian:/proc/3673$ cat status
Name: java
State: S (sleeping)
Tgid: 3673
Ngid: 0
Pid: 3673
PPid: 3660
TracerPid: 0
Uid: 1000 1000 1000 1000
Gid: 1000 1000 1000 1000
FDSize: 256
Groups: 24 25 29 30 44 46 108 111 116 1000
VmPeak: 3626024 kB
VmSize: 3626020 kB

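As you can see, VmSize (3626020 kB) is the same as the VSZ column in the ps output. The RSS column (563964) corresponds to the VmRSS field, which appears further down in the same file.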
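
If you only need those numbers, a short Python sketch can pull them out of the status file. The function names are my own; VmSize corresponds to the VSZ column of ps and VmRSS to the RSS column, both reported in kB.

```python
def parse_status(status_text, fields=("VmSize", "VmRSS")):
    """Extract selected 'Key: value kB' rows from /proc/<pid>/status text."""
    wanted = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in fields:
            wanted[key] = int(value.split()[0])  # first token is the size in kB
    return wanted


def memory_of(pid):
    """Read VmSize and VmRSS for a running process (Linux only)."""
    with open("/proc/{}/status".format(pid)) as handle:
        return parse_status(handle.read())
```

For example, memory_of(3673) would return a dictionary with both values for the java process shown above.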