Tag: system

  • Don’t delete the Kafka GC logs while they are in use

    Hi,

    I made a mistake some time ago, and it has come back to haunt me.
    Deleting the GC logs, including the one that is still in use, doesn’t solve anything; it just creates a more difficult situation. The disk space is not released while the process keeps the deleted file open.
    Here is my example:

    /dev/sda1                        50G   42G  5.2G  90% /
    /opt/kafka/logs# ll
    total 34M
    drwxrwxr-x 2 kafka kafka 4.0K Oct 10 19:34 ./
    drwxr-xr-x 7 kafka kafka 4.0K Mar 14  2018 ../
    -rw-rw-r-- 1 kafka kafka    0 Mar 14  2018 controller.log
    -rw-rw-r-- 1 kafka kafka    0 Mar 14  2018 kafka-authorizer.log
    -rw-rw-r-- 1 kafka kafka    0 Mar 14  2018 kafka-request.log
    -rw-rw-r-- 1 kafka kafka 2.9M Oct 11 04:44 log-cleaner.log
    -rw-rw-r-- 1 kafka kafka 6.1M Oct 11 05:24 server.log
    -rw-rw-r-- 1 kafka kafka  25M Oct  4 14:03 state-change.log
    
    lsof +L1 | grep delete
    init        1     root   13w   REG    8,1         106     0     95 /var/log/upstart/systemd-logind.log.1 (deleted)
    init        1     root   14w   REG    8,1        5794     0   2944 /var/log/upstart/kafka-manager.log.1 (deleted)
    java     1630    kafka    3w   REG    8,1 46836567522     0 524939 /opt/kafka-2.11-0.10.1.1/logs/kafkaServer-gc.log (deleted)
    java     1863 dd-agent    4r   REG    8,1     5750256     0 525428 /opt/datadog-agent/bin/agent/dist/jmx/jmxfetch-0.20.1-jar-with-dependencies.jar (deleted)
    java    10749 dd-agent    4r   REG    8,1     5750216     0 525427 /opt/datadog-agent/bin/agent/dist/jmx/jmxfetch-0.20.0-jar-with-dependencies.jar (deleted)
    bash    10928     root    0u   CHR  136,6         0t0     0      9 /dev/pts/6 (deleted)
    bash    10928     root    1u   CHR  136,6         0t0     0      9 /dev/pts/6 (deleted)
    bash    10928     root    2u   CHR  136,6         0t0     0      9 /dev/pts/6 (deleted)
    bash    10928     root  255u   CHR  136,6         0t0     0      9 /dev/pts/6 (deleted)
    tail    12378     root    0u   CHR  136,6         0t0     0      9 /dev/pts/6 (deleted)
    tail    12378     root    1u   CHR  136,6         0t0     0      9 /dev/pts/6 (deleted)
    tail    12378     root    2u   CHR  136,6         0t0     0      9 /dev/pts/6 (deleted)
    tail    12378     root    3r   REG    8,1    52428909     0 525512 /opt/kafka-2.11-0.10.1.1/logs/server.log.1 (deleted)
    java    14692 dd-agent    4r   REG    8,1     5750256     0 526042 /opt/datadog-agent/bin/agent/dist/jmx/jmxfetch-0.20.1-jar-with-dependencies.jar (deleted)
    java    16574 dd-agent    4r   REG    8,1     5750256     0 526041 /opt/datadog-agent/bin/agent/dist/jmx/jmxfetch-0.20.1-jar-with-dependencies.jar (deleted)
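
    To see at a glance which process is pinning the most space, the REG rows of that listing can be summed per PID. This is only a rough sketch: deleted_bytes_by_pid is a name I made up, it assumes the default lsof column order shown above, and it will miscount if a command name contains spaces or if one process holds the same file on several descriptors.

    ```shell
    # Sum the size of deleted-but-still-open regular files per process.
    # Reads `lsof +L1` output on stdin; assumes the default column order:
    # COMMAND PID USER FD TYPE DEVICE SIZE/OFF NLINK NODE NAME
    deleted_bytes_by_pid() {
      awk '$5 == "REG" && $NF == "(deleted)" {
             bytes[$2 " " $1] += $7          # key: "PID COMMAND"
           }
           END {
             for (k in bytes) printf "%s %.0f\n", k, bytes[k]
           }' | sort -k3,3nr
    }

    # Usage (as root, so that every process is visible):
    #   lsof +L1 | deleted_bytes_by_pid
    ```

    On the listing above, the first line of the result would be the Kafka JVM (PID 1630) with roughly 46 GB held by the deleted kafkaServer-gc.log.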
    

    Handling GC logging in Kafka versions lower than 1.0.0 is quite tricky. It is best to remove these options from your startup script:

    -XX:+DisableExplicitGC -Djava.awt.headless=true -Xloggc:/opt/kafka/bin/../logs/kafkaServer-gc.log -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps

    But taking into consideration that we use a standard Puppet module that is shared by multiple teams, this is still to be fixed. Fortunately, from 1.0.0 on, GC logging is disabled by default.

    In order to fix what I showed you above, a process restart is needed, and we will do that.
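
    The restart is the clean fix. If it has to wait for a maintenance window, one stopgap that is often used on Linux is to truncate the deleted file through its /proc/[pid]/fd entry, which releases the blocks while the process keeps running. The helper below is a sketch under that assumption; truncate_deleted_fd is a hypothetical name, and the PID and descriptor in the usage line are the ones from my lsof output (1630 and 3w). If the process did not open the file in append mode, it keeps writing at its old offset and the file turns sparse, so the apparent size comes back while the freed blocks stay free.

    ```shell
    # Free the blocks held by a deleted-but-open file without restarting its owner.
    # Hypothetical helper; $1 = PID, $2 = descriptor number (from the lsof FD column).
    truncate_deleted_fd() {
      target="/proc/$1/fd/$2"
      case "$(readlink "$target")" in
        *" (deleted)")
          : > "$target"      # reopens the still-open inode and truncates it to 0 bytes
          ;;
        *)
          echo "refusing: $target does not point to a deleted file" >&2
          return 1
          ;;
      esac
    }

    # Usage, as root or as the user that owns the process:
    #   truncate_deleted_fd 1630 3
    ```

    The readlink check is there so that a typo in the PID or descriptor cannot empty a file that still exists on disk.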

    Cheers

  • Memory check by process in Linux

    Hi,

    I wanted to post this since it might be useful in some situations. On a Linux machine, one way to check memory usage by the top processes is ps aux --sort -rss (that is, ordered by Resident Set Size, largest first). Once executed, it returns output similar to this:

    USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    sorin 3673 0.6 27.3 3626020 563964 pts/1 Sl+ 02:24 1:09 java -Xmx1G -Xms1G -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+Disa
    sorin 1708 2.0 9.2 1835288 189692 ? Sl 02:11 3:56 /usr/bin/gnome-shell
    sorin 1967 0.6 8.0 1642280 166160 ? Sl 02:12 1:11 firefox-esr
    sorin 3413 0.1 3.7 2000252 77016 pts/0 Sl+ 02:21 0:19 java -Xmx512M -Xms512M -server -XX:+UseG1GC -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 -XX:+
    root 576 0.5 2.6 263688 54172 tty7 Ssl+ 02:11 1:07 /usr/bin/Xorg :0 -novtswitch -background none -noreset -verbose 3 -auth /var/run/gdm3/auth-for-Debian-gdm-Bu1jB
    sorin 1813 0.0 2.2 1175504 47196 ? Sl 02:11 0:00 /usr/lib/evolution/evolution-calendar-factory
    root 486 0.1 1.2 377568 26584 ? Ssl 02:11 0:21 /usr/bin/dockerd -H fd://
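
    The RSS column is in KiB and the command lines get long, so a small filter can make the list easier to read. This is just a sketch: top_rss_mib is a name I picked, and it assumes the standard ps aux column order (PID in column 2, RSS in column 6, command in column 11).

    ```shell
    # Print "PID RSS_MiB COMMAND" for the first N data rows of `ps aux` output on stdin.
    # Assumes the standard ps aux column order (RSS is column 6, in KiB).
    top_rss_mib() {
      awk -v n="${1:-5}" 'NR > 1 && NR <= n + 1 { printf "%s %d %s\n", $2, int($6 / 1024), $11 }'
    }

    # Usage:
    #   ps aux --sort=-rss | top_rss_mib 3
    ```

    For the output above, the first row would read 3673 550 java, i.e. about 550 MiB resident for the top Java process.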

    If you want more detail on the status of a PID, you can look at /proc/[pid]/status, where you can find a lot of other information. For example, the top process on my Linux machine has the following header:

    sorin@debian:/proc/3673$ cat status
    Name: java
    State: S (sleeping)
    Tgid: 3673
    Ngid: 0
    Pid: 3673
    PPid: 3660
    TracerPid: 0
    Uid: 1000 1000 1000 1000
    Gid: 1000 1000 1000 1000
    FDSize: 256
    Groups: 24 25 29 30 44 46 108 111 116 1000
    VmPeak: 3626024 kB
    VmSize: 3626020 kB

    As you can see, VmSize (3626020 kB) is the same as the VSZ column reported by ps. The RSS value appears further down in the same file, as VmRSS.
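
    To pull only the memory figures out of a status file, something like the following should do. It is a minimal sketch; vm_fields is a hypothetical name, and it relies on the VmSize and VmRSS field names that the Linux kernel uses in /proc/[pid]/status.

    ```shell
    # Print the VmSize and VmRSS lines of a /proc/[pid]/status file read from stdin.
    vm_fields() {
      awk '$1 == "VmSize:" || $1 == "VmRSS:" { print $1, $2, $3 }'
    }

    # Usage:
    #   vm_fields < /proc/3673/status
    ```

    VmSize corresponds to the VSZ column of ps and VmRSS to the RSS column, both in kB.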

    Cheers!