Categories
linux

Enable time sync on Manjaro

I have wanted to try and learn Manjaro for a while, so I grabbed the Cinnamon 21.1.0 edition.

The installation process is pretty straightforward; I set up the correct time zone and installed all of the default packages.

Guess what: after rebooting the laptop, the time zone was set correctly but the actual time was way off.

I tried to find a post explaining how this is done, but the standard GUI way didn’t work.

The actual solution is in the output below:

[sorin-20fjs3dr01 ~]# timedatectl
               Local time: Sb 2021-08-28 13:07:40 EEST
           Universal time: Sb 2021-08-28 10:07:40 UTC
                 RTC time: Sb 2021-08-28 10:07:40
                Time zone: Europe/Bucharest (EEST, +0300)
System clock synchronized: no
              NTP service: inactive
          RTC in local TZ: no
[sorin-20fjs3dr01 ~]# systemctl status ntpd.service
○ ntpd.service - Network Time Service
     Loaded: loaded (/usr/lib/systemd/system/ntpd.service; disabled; vendor preset: disabled)
     Active: inactive (dead)
[sorin-20fjs3dr01 ~]#  systemctl status systemd-timesyncd.service
○ systemd-timesyncd.service - Network Time Synchronization
     Loaded: loaded (/usr/lib/systemd/system/systemd-timesyncd.service; disabled; vendor preset: enabled)
     Active: inactive (dead)
       Docs: man:systemd-timesyncd.service(8)
[sorin-20fjs3dr01 ~]# systemctl start systemd-timesyncd.service
[sorin-20fjs3dr01 ~]# ^C
[sorin-20fjs3dr01 ~]# systemctl status systemd-timesyncd.service
● systemd-timesyncd.service - Network Time Synchronization
     Loaded: loaded (/usr/lib/systemd/system/systemd-timesyncd.service; disabled; vendor preset: enabled)
     Active: active (running) since Sat 2021-08-28 13:09:09 EEST; 2h 59min left
       Docs: man:systemd-timesyncd.service(8)
   Main PID: 2080 (systemd-timesyn)
     Status: "Initial synchronization to time server 195.135.194.3:123 (0.manjaro.pool.ntp.org)."
      Tasks: 2 (limit: 19010)
     Memory: 1.3M
        CPU: 51ms
     CGroup: /system.slice/systemd-timesyncd.service
             └─2080 /usr/lib/systemd/systemd-timesyncd

aug 28 13:09:09 sorin-20fjs3dr01 systemd[1]: Starting Network Time Synchronization...
aug 28 13:09:09 sorin-20fjs3dr01 systemd[1]: Started Network Time Synchronization.
aug 28 10:09:10 sorin-20fjs3dr01 systemd-timesyncd[2080]: Initial synchronization to time server 195.135.194.3:123 (0.manjaro.pool.ntp.org).
[sorin-20fjs3dr01 ~]# systemctl enable systemd-timesyncd.service
Created symlink /etc/systemd/system/dbus-org.freedesktop.timesync1.service → /usr/lib/systemd/system/systemd-timesyncd.service.
Created symlink /etc/systemd/system/sysinit.target.wants/systemd-timesyncd.service → /usr/lib/systemd/system/systemd-timesyncd.service.
[sorin-20fjs3dr01 ~]# 

It turns out that both ntpd and systemd-timesyncd are inactive and do not start by default, so the actual fix is to start and enable systemd-timesyncd.
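Condensed, the fix comes down to the commands already shown in the output above (run as root or with sudo):

systemctl start systemd-timesyncd.service
systemctl enable systemd-timesyncd.service
timedatectl

After this, timedatectl should report "System clock synchronized: yes" and "NTP service: active".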

Cheers,

Sorin

Categories
Uncategorized

Getting stocks basic data using yfinance

Hi,

If you are thinking of investing, and also want a good opportunity to play with data in pandas, here is the use case I am working on.

Basically, from what I understood, if you want to value invest there are two main parameters to look at before doing any other in-depth research: P/B and P/E. Both of them show whether the company has the potential to grow.

How can we retrieve these parameters from, for example, Yahoo Finance using Python? The code that worked for me is as follows:

import yfinance as yf
import pandas as pd

# the first table on the Wikipedia page holds the current S&P 500 constituents
payload = pd.read_html('https://en.wikipedia.org/wiki/List_of_S%26P_500_companies')
first_table = payload[0]
second_table = payload[1]   # not used below
df_sp = first_table

# append the values to a local CSV so the API only has to be queried once
with open('stats.csv', 'a') as statscsv:
    for value in df_sp['Symbol']:
        stock = yf.Ticker(value)
        info = stock.info
        # not every ticker exposes these fields, so guard against missing keys;
        # note that 'priceToSalesTrailing12Months' is price-to-sales, not P/E;
        # if you want P/E proper, the 'trailingPE' key is usually available
        if 'priceToBook' in info and 'priceToSalesTrailing12Months' in info:
            statscsv.write(value + "," + str(info['priceToBook']) + ","
                           + str(info['priceToSalesTrailing12Months']) + "\n")

I tried quite a bit to put the info directly into a pandas DataFrame and it did not work, so, since I want to query the API only once, it makes a lot of sense to store the data in a CSV file saved locally.

After it is saved locally, you can load it into a DataFrame object with the line below (for my usage I manually added a header line with the column names, Symbol,PB,PE, at the beginning of the file):

df_pb = pd.read_csv("stats.csv")

From what I saw, in some cases the P/B data is not available in the output, so the value is written as ‘None’.

You can manually change that by replacing it with 0 and storing the result in a different DataFrame, like this:

df_pb_clean = df_pb.replace({"None":"0"})

After you have done this, you also need to convert the column types from object to float64 so that you can query specific values:

df_pb_clean['PB'] = df_pb_clean['PB'].astype(float)
df_pb_clean['PE'] = df_pb_clean['PE'].astype(float)

After all of this is done, you can query it as easily as:

df_pb_green = df_pb_clean.query('0.0 < PB < 2.0')

And after that, maybe also filter on P/E for your use case.
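For example, a follow-up filter on P/E could look like this (the thresholds are only illustrative, not a recommendation):

df_final = df_pb_green.query('0.0 < PE < 15.0')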

The main goal is to filter only the companies with growth potential, so that we can then retrieve historical data and look at the main methods of analysis.

Cheers

Categories
kafka

SASL config issue on latest Kafka versions

Hello,

Today I want to share with you a problem that we needed to fix when we decided to activate SASL.

Normally the steps are pretty straightforward and you can follow either the Confluent documentation or the general Apache Kafka one.

The main catch is that if you have a certain property in your config file, the following error will appear in a loop:

[2021-01-11 09:17:28,052] ERROR Processor [0..n] closed connection from null (kafka.network.Processor)
java.io.IOException: Channel could not be created for socket java.nio.channels.SocketChannel[closed]
	at org.apache.kafka.common.network.Selector.buildAndAttachKafkaChannel(Selector.java:348)
	at org.apache.kafka.common.network.Selector.registerChannel(Selector.java:329)
	at org.apache.kafka.common.network.Selector.register(Selector.java:311)
	at kafka.network.Processor.configureNewConnections(SocketServer.scala:1024)
	at kafka.network.Processor.run(SocketServer.scala:757)
	at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: org.apache.kafka.common.KafkaException: java.lang.NullPointerException
	at org.apache.kafka.common.network.SaslChannelBuilder.buildChannel(SaslChannelBuilder.java:228)
	at org.apache.kafka.common.network.Selector.buildAndAttachKafkaChannel(Selector.java:338)
	... 5 more
Caused by: java.lang.NullPointerException
	at java.base/java.util.Objects.requireNonNull(Objects.java:221)
	at org.apache.kafka.common.security.authenticator.DefaultKafkaPrincipalBuilder.fromOldPrincipalBuilder(DefaultKafkaPrincipalBuilder.java:77)
	at org.apache.kafka.common.network.ChannelBuilders.createPrincipalBuilder(ChannelBuilders.java:216)
	at org.apache.kafka.common.security.authenticator.SaslServerAuthenticator.<init>(SaslServerAuthenticator.java:183)
	at org.apache.kafka.common.network.SaslChannelBuilder.buildServerAuthenticator(SaslChannelBuilder.java:262)
	at org.apache.kafka.common.network.SaslChannelBuilder.lambda$buildChannel$0(SaslChannelBuilder.java:207)
	at org.apache.kafka.common.network.KafkaChannel.<init>(KafkaChannel.java:143)
	at org.apache.kafka.common.network.SaslChannelBuilder.buildChannel(SaslChannelBuilder.java:224)
	... 6 more

The cause for this is the property:

principal.builder.class=org.apache.kafka.common.security.auth.DefaultPrincipalBuilder

For the latest versions of Apache Kafka, like 2.x.x, it should not be set at all, so that when the process starts it resolves to:

principal.builder.class=null
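For reference, a minimal sketch of the SASL-related fragment of server.properties (assuming SASL/PLAIN; the listener address is a placeholder) simply leaves the builder property out:

listeners=SASL_PLAINTEXT://0.0.0.0:9092
security.inter.broker.protocol=SASL_PLAINTEXT
sasl.mechanism.inter.broker.protocol=PLAIN
sasl.enabled.mechanisms=PLAIN
# principal.builder.class is intentionally not set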
Categories
machine learning

Plot a math function in Python

Hi,

I just started a recap of calculus and wanted to see how hard it is to plot functions in a programming language.

Searching this topic I found this article, which gives an elegant approach:

https://scriptverse.academy/tutorials/python-matplotlib-plot-function.html

After trying the code, here is the result.

Surely there are more complex cases, but at least it is a start for adapting the code.
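For reference, a minimal sketch along the lines of that tutorial looks like this (the function being plotted is just an example I picked):

import numpy as np
import matplotlib.pyplot as plt

# sample the function on an interval and plot it
x = np.linspace(-2 * np.pi, 2 * np.pi, 400)
y = np.sin(x)

plt.plot(x, y)
plt.xlabel('x')
plt.ylabel('sin(x)')
plt.title('y = sin(x)')
plt.grid(True)
plt.show()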

Cheers

Categories
machine learning

No workpath update on Jupyter admin started instance

Hi,

Just a very small update. I saw that when you run CMD with Administrator rights under Windows, the Jupyter working directory is automatically set to C:\Windows\System32… which is not great at all.

I tried the standard method which is listed here, but it does not work. Even after I save the config file, it is not taken into consideration and it gets overwritten at the next export.
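For context, the standard method is roughly the following (this is the usual documented approach; the exact config key can differ between Jupyter versions, so treat it as a sketch):

jupyter lab --generate-config

and then, in the generated jupyter_lab_config.py, set the desired start directory, for example:

c.ServerApp.root_dir = 'C:/Users/<your_user>'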

The workaround: just start a normal command prompt and run jupyter lab, and it will use your local user directory as the working directory.

Cheers

Categories
linux

Recover swap file in vim

Hi,

This is a problem that I had because my virtual machine was not stopped properly and my SSH connection ended prematurely.

https://superuser.com/questions/204209/how-can-i-recover-the-original-file-from-a-swp-file/205131

If you have a file.swp and you want to recover it, do as they say: open the original file in Vim and then type :recover.
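A minimal sketch of the flow (the file name is just an example):

vim myfile.txt        # Vim warns that a swap file .myfile.txt.swp exists
:recover              # inside Vim, load the content recovered from the swap file
:wq                   # write the recovered file

rm .myfile.txt.swp    # once you are happy with the result, remove the swap file

Alternatively, vim -r myfile.txt opens the file directly in recovery mode.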

Cheers

Categories
cloud newtools

Exclusive SASL on Zookeeper connections

Something related to the following article: it seems that even if SASL is configured, up to version 3.6.1 ZooKeeper will still allow anonymous connections and actions.

There is now a new configuration option that restricts such events; you can find it documented in the official Apache ZooKeeper administration guide (zookeeper.sessionRequireClientSASLAuth).

The main catch is that it is not supposed to be configured in the zoo.cfg file, but added as a parameter in java.env, as part of the SERVER_JVMFLAGS variable.

The old variable, which was:

zookeeperstd::jvm_flags: "-Djava.security.auth.login.config=/opt/zookeeper/conf/zoo_jaas.config"

will become

zookeeperstd::jvm_flags: "-Djava.security.auth.login.config=/opt/zookeeper/conf/zoo_jaas.config -Dzookeeper.allowSaslFailedClients=false -Dzookeeper.sessionRequireClientSASLAuth=true"
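For context, the zoo_jaas.config referenced by these flags is the server-side JAAS file; it typically has a structure along these lines (usernames and passwords are placeholders):

Server {
       org.apache.zookeeper.server.auth.DigestLoginModule required
       user_super="[super_password]"
       user_[client_username]="[client_password]";
};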

After this is implemented, when you try to connect using zkCli.sh it will let you in, but when you try to list the root node of the resource tree it won’t work.

Example:

Connecting to localhost:2181
Welcome to ZooKeeper!
JLine support is enabled

WATCHER::

WatchedEvent state:SyncConnected type:None path:null
[zk: localhost:2181(CONNECTED) 0] ls /
KeeperErrorCode = Session closed because client failed to authenticate for /
[zk: localhost:2181(CONNECTED) 1] 

The same thing happens if you use zkCli.sh -server [hostname]:2181

In order to connect you will have to add to java.env a line with:

CLIENT_JVMFLAGS="-Djava.security.auth.login.config=/opt/zookeeper/conf/client_jaas.config"

The client JAAS file has the following structure:

Client {
       org.apache.zookeeper.server.auth.DigestLoginModule required
       username="[client_username]"
       password="[client_password]";
};

Cheers

Categories
python

Unique value on columns – pandas

Hi,

Today, a short example for the case of column names that contain spaces.

For example, I have a DataFrame that has the following columns:

I have read in some sources that you can use the construct wine_new.[column name].unique() to get the unique values.

If you have a one-word column name it will work, but if the column name consists of multiple words you cannot use a construct like wine_new.'Page ID'.unique(), because it will give a syntax error.

Good, so you try to rename it. Why Page ID and not pageid? OK, that should be easy:

wine_new = wine_new.rename(columns={"Page ID": "pageid"}, errors="raise")

And it now looks “better”.

But if you need to keep the original column name, you can just as easily use wine_new['Page ID'].unique(). (If you want to count the number of unique values, you can also use wine_new['Page ID'].nunique().)
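To put the two versions side by side, here is a small sketch with made-up data (only the column name mirrors the post, the values are hypothetical):

import pandas as pd

wine_new = pd.DataFrame({"Page ID": [1, 2, 2, 3], "Price": [10.0, 12.5, 12.5, 9.0]})

# bracket access works even with a space in the column name
print(wine_new['Page ID'].unique())    # [1 2 3]
print(wine_new['Page ID'].nunique())   # 3

# attribute access only works once the name is a single word
wine_new = wine_new.rename(columns={"Page ID": "pageid"}, errors="raise")
print(wine_new.pageid.unique())        # [1 2 3]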

There are multiple resources on this topic, but most of them do not explain both versions of the approach.

Cheers

Categories
cloud machine learning python

Prometheus metrics to Pandas data frame

Hi,

We are trying to implement a decision tree algorithm in order to see if our resource usage can be used to classify our servers into different categories.

The first step in that process is to query Prometheus from Python and create some data frames with basic information that can then be aggregated.

For that purpose, you can use the following lines of code:

import requests
import copy
import pandas as pd

# range query over the last day for the metric we are interested in
URL = "http://[node_hostname]:9090/api/v1/query?query=metric_to_be_queried[1d]"

r = requests.get(url=URL)
data = r.json()

metric_list = []
for i in data['data']['result']:
    for j in i['values']:
        # start from the label set of the series and add one row per sample;
        # a fresh copy is needed so each appended row keeps its own time/value
        data_dict = copy.deepcopy(i['metric'])
        data_dict['time'] = j[0]
        data_dict['value'] = j[1]
        metric_list.append(data_dict)

df_metric = pd.DataFrame(metric_list)
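As a small follow-up (my assumption about the next step), the API returns the samples as epoch timestamps and string values, so converting them makes the frame easier to aggregate:

df_metric['time'] = pd.to_datetime(df_metric['time'], unit='s')
df_metric['value'] = df_metric['value'].astype(float)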

Other pieces will follow.

Cheers

Categories
linux

Renice until cgroup implementation for process of Yahoo CMAK

Hi,

We saw that the former Kafka Manager, now called Yahoo CMAK, was using quite a lot of CPU in some cases, generally related to a bad SSL client config.

It is not really clear whether the CPU usage was real or just wait time for resources like memory or I/O (I don’t have an example to post right now), but there are multiple fixes for this.

The easiest one is to change the nice value of the process. What I observed is that it normally starts with a nice value of 0, which I guess is the default. A general check for this works with:

ps ax -o ni,cmd | grep cmak | grep -v grep

In order to change this, you can add a crontab line with the following command:

pid=`ps ax -o pid,cmd | grep cmak | grep -v grep | awk '{print $1}'`; ni=`ps ax -o ni,cmd | grep cmak | grep -v grep | awk '{print $1}'`; if [ "$ni" = "0" ]; then renice 10 $pid; fi

Or, even easier than that, add a Nice= value under the [Service] section in /etc/systemd/system/multi-user.target.wants/kafka-manager.service.
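A minimal sketch of that change (only the Nice= line is the addition, the rest of the unit stays as it is):

[Service]
Nice=10

followed by systemctl daemon-reload and a restart of the service so the new value is picked up.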

It does the trick until further cgroup policies are applied.