Category: cloud

  • Small script to retrieve the OS version from GCP compute engine

    I'm adding here a small script that was drafted with AI, then tested and modified by me, so that it returns the OS version for all of the VM instances under a project in GCP.

    #!/bin/bash
    
    # Prompt for the project ID
    read -p "Enter your Google Cloud Project ID: " PROJECT_ID
    
    if [ -z "$PROJECT_ID" ]; then
      echo "Project ID cannot be empty."
      exit 1
    fi
    
    echo "Fetching VM instances and their OS versions for project: $PROJECT_ID"
    echo "--------------------------------------------------------------------"
    echo "Instance Name        | Zone                 | OS (from inventory)"
    echo "---------------------|----------------------|---------------------"
    
    # Get all instances (name and zone) in the project
    # The `instances list` command can list instances across all zones if no --zones flag is specified.
    # However, `instances describe` requires a specific zone.
    # So, we list name and zone, then describe each.
    gcloud compute instances list --project="$PROJECT_ID" --format="value(name,zone)" | while read -r INSTANCE_NAME ZONE; do
      # Query the OS inventory data for the current instance.
      # This requires the OS Config agent to be running on the VM;
      # without it, no inventory data is returned.
      OS_INFO=$(gcloud compute instances os-inventory describe "$INSTANCE_NAME" \
        --zone="$ZONE" \
        --project="$PROJECT_ID" \
        --format="value(SystemInformation.LongName)" 2>/dev/null)
    
      # If no inventory info was found, display a placeholder
      if [ -z "$OS_INFO" ] || [ "$OS_INFO" = "None" ]; then
        OS_VERSION="N/A or Custom"
      else
        # Use the long name reported by the OS inventory as-is
        OS_VERSION="$OS_INFO"
      fi
    
      # Print the instance name, zone, and OS version in a formatted way
      printf "%-20s | %-20s | %s\n" "$INSTANCE_NAME" "$ZONE" "$OS_VERSION"
    done
    
    echo "--------------------------------------------------------------------"
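    For post-processing in Python, the same kind of table can be built from `gcloud compute instances list --format=json` output. A minimal sketch below; the sample record and the `format_rows` helper are my own illustration (made-up values), not part of the original script:

```python
import json

# Made-up sample in the shape returned by
# `gcloud compute instances list --format=json` (heavily trimmed).
sample = json.loads("""
[
  {"name": "kafka-node-1",
   "zone": "https://www.googleapis.com/compute/v1/projects/p/zones/europe-west1-b"}
]
""")

def format_rows(instances, os_lookup):
    """Render 'name | zone | OS' rows; os_lookup maps instance name -> OS long name."""
    rows = []
    for inst in instances:
        zone = inst["zone"].rsplit("/", 1)[-1]  # keep only the zone name from the URL
        os_name = os_lookup.get(inst["name"], "N/A or Custom")
        rows.append("%-20s | %-20s | %s" % (inst["name"], zone, os_name))
    return rows

for line in format_rows(sample, {"kafka-node-1": "Ubuntu 18.04.3 LTS"}):
    print(line)
```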

    I don’t think any other details are needed here.

    Cheers,

    Sorin

  • Exclusive SASL on Zookeeper connections

    Something related to the following article. It seems that, up to version 3.6.1, even if SASL is configured, ZooKeeper will still allow anonymous connections and actions.

    There is now a new configuration option that restricts such events; you can find it documented in the official Apache ZooKeeper administration guide (zookeeper.sessionRequireClientSASLAuth).

    The main catch is that it’s not supposed to be configured in the zoo.cfg file, but added as a parameter in java.env as part of the SERVER_JVMFLAGS variable.

    The old variable which was

    zookeeperstd::jvm_flags: "-Djava.security.auth.login.config=/opt/zookeeper/conf/zoo_jaas.config"

    will become

    zookeeperstd::jvm_flags: "-Djava.security.auth.login.config=/opt/zookeeper/conf/zoo_jaas.config -Dzookeeper.allowSaslFailedClients=false -Dzookeeper.sessionRequireClientSASLAuth=true"

    After this is implemented, when you try to connect using zkCli.sh, it will let you in, but trying to list the root node of the resource tree won’t work.

    Example:

    Connecting to localhost:2181
    Welcome to ZooKeeper!
    JLine support is enabled
    
    WATCHER::
    
    WatchedEvent state:SyncConnected type:None path:null
    [zk: localhost:2181(CONNECTED) 0] ls /
    KeeperErrorCode = Session closed because client failed to authenticate for /
    [zk: localhost:2181(CONNECTED) 1] 
    

    The same thing happens if you use zkCli.sh -server [hostname]:2181

    In order to connect, you will have to add a line to java.env:

    CLIENT_JVMFLAGS="-Djava.security.auth.login.config=/opt/zookeeper/conf/client_jaas.config"

    The client JAAS file has the following structure:

    Client {
           org.apache.zookeeper.server.auth.DigestLoginModule required
           username="[client_username]"
           password="[client_password]";
    };

    Cheers

  • Prometheus metrics to Pandas data frame

    Hi,

    We are trying to implement a decision tree algorithm in order to see if our resource usage can be used to classify our servers into different categories.

    The first step in that process is querying Prometheus from Python and creating data frames with some basic information, in order to aggregate it later.

    To that purpose, you can also use the following lines of code:

    import requests
    import copy
    import pandas as pd

    URL = "http://[node_hostname]:9090/api/v1/query?query=metric_to_be_queried[1d]"

    r = requests.get(url=URL)

    data = r.json()

    metric_list = []
    for i in data['data']['result']:
        labels = i['metric']
        for j in i['values']:
            # Copy the labels for every sample, so the appended rows
            # don't all point to the same dict object
            row = copy.deepcopy(labels)
            row['time'] = j[0]
            row['value'] = j[1]
            metric_list.append(row)

    df_metric = pd.DataFrame(metric_list)
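    As a first step toward aggregation, the resulting frame can be grouped per label. A minimal sketch on made-up data, assuming the frame has the shape produced above (Prometheus label columns plus `time` and `value`):

```python
import pandas as pd

# Made-up sample in the same shape as df_metric above.
df_metric = pd.DataFrame([
    {"instance": "node-1", "time": 1, "value": "10.0"},
    {"instance": "node-1", "time": 2, "value": "20.0"},
    {"instance": "node-2", "time": 1, "value": "5.0"},
])

# Prometheus returns sample values as strings, so cast before aggregating.
df_metric["value"] = df_metric["value"].astype(float)
mean_per_instance = df_metric.groupby("instance")["value"].mean()
print(mean_per_instance)
```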

    Other pieces will follow.

    Cheers

  • Datadog and GCP are “friends” up to a point

    Hi,

    Since lately I have preferred to publish more on Medium, let me give you the link to the latest article.

    There is an interesting case in which the combination of automation, Google Cloud Platform and Datadog didn’t go as we expected.

    https://medium.com/metrosystemsro/puppet-datadog-google-cloud-platform-recipe-for-a-small-outage-310166e551f1

    Hope you enjoy it! I will also get back with more interesting topics on this blog.

    Cheers

  • Overriding OS fact with external one

    Hi,

    Short notice article. We had an issue in which the traefik module code was not running because of a wrong os fact. Although the image is Ubuntu 14.04, facter returns it as:

    {
      architecture => "amd64",
      family => "Debian",
      hardware => "x86_64",
      name => "Debian",
      release => {
        full => "jessie/sid",
        major => "jessie/sid"
      },
      selinux => {
        enabled => false
      }
    }

    I honestly don’t know why this happens, since it works fine on the rest of the machines. The fast way to fix it is by defining an external fact in /etc/facter/facts.d.

    Create a file named os_fact.json, for example, that will contain this content:

    { 
       "os":{ 
          "architecture":"amd64",
          "distro":{ 
             "codename":"trusty",
             "description":"Ubuntu 14.04.6 LTS",
             "id":"Ubuntu",
             "release":{ 
                "full":"14.04",
                "major":"14.04"
             }
          },
          "family":"Debian",
          "hardware":"x86_64",
          "name":"Ubuntu",
          "release":{ 
             "full":"14.04",
             "major":"14.04"
          },
          "selinux":{ 
             "enabled":"false"
          }
       }
    }
    

    And it’s fixed.
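    If more machines are affected, the fact file can also be dropped from a small script. A minimal sketch; the `write_os_fact` helper and the parameterized directory are my own illustration (on a real node the directory would be /etc/facter/facts.d, and only part of the fact content is shown):

```python
import json
import os

# Trimmed version of the os fact from above.
OS_FACT = {
    "os": {
        "architecture": "amd64",
        "family": "Debian",
        "hardware": "x86_64",
        "name": "Ubuntu",
        "release": {"full": "14.04", "major": "14.04"},
    }
}

def write_os_fact(directory):
    """Write os_fact.json into the given facts.d directory and return its path."""
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, "os_fact.json")
    with open(path, "w") as f:
        json.dump(OS_FACT, f, indent=2)
    return path

# On a real node this would be: write_os_fact("/etc/facter/facts.d")
```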

    Cheers

  • Strange problem in puppet run for Ubuntu

    Hi,

    Short sharing of a strange case.

    We’ve written a small manifest in order to distribute some python scripts. You can find the reference here: https://medium.com/metrosystemsro/new-ground-automatic-increase-of-kafka-lvm-on-gcp-311633b0816c

    When you try to run it on Ubuntu 14.04, there is this very strange error:

    Error: Failed to apply catalog: [nil, nil, nil, nil, nil, nil]

    The cause for this is as follows:

    Python 3.4.3 (default, Nov 12 2018, 22:25:49)
    [GCC 4.8.4] on linux (and I believe this is the default max version on trusty)

    In order to install the dependencies, you need python3-pip, so a short search returns the following options:

    apt search python3-pip
    Sorting... Done
    Full Text Search... Done
    python3-pip/trusty-updates,now 1.5.4-1ubuntu4 all [installed]
      alternative Python package installer - Python 3 version of the package
    
    python3-pipeline/trusty 0.1.3-3 all
      iterator pipelines for Python 3

    If we want to list all the installed modules with pip3 list, guess what, it’s not working:

    Traceback (most recent call last):
      File "/usr/bin/pip3", line 5, in <module>
        from pkg_resources import load_entry_point
      File "/usr/local/lib/python3.4/dist-packages/pkg_resources/__init__.py", line 93, in <module>
        raise RuntimeError("Python 3.5 or later is required")
    RuntimeError: Python 3.5 or later is required

    So, the main conclusion is that it’s not related to Puppet, just a version incompatibility on this old distribution.
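    The check that fails inside pkg_resources is essentially a version guard like the one below; replicating it is handy if you want your own scripts to fail early on trusty. The `require_python` helper is my own sketch, not the actual pkg_resources code:

```python
import sys

def require_python(min_version=(3, 5)):
    """Raise a RuntimeError, like pkg_resources does, on interpreters that are too old."""
    if sys.version_info[:2] < min_version:
        raise RuntimeError("Python %d.%d or later is required" % min_version)

# On the trusty default (Python 3.4) a (3, 5) requirement raises RuntimeError.
```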

    Cheers

  • Automatic increase of Kafka LVM on GCP

    I wrote an article for my company that was published on Medium regarding the topic in the subject. Please see the link

    https://medium.com/metrosystemsro/new-ground-automatic-increase-of-kafka-lvm-on-gcp-311633b0816c

    Thanks

  • Using GCP recommender API for Compute engine

    Let’s keep it short. If you want to use Python libraries for Recommender API, this is how you connect to your project.

    from google.cloud.recommender_v1beta1 import RecommenderClient
    from google.oauth2 import service_account

    def main():
        credential = service_account.Credentials.from_service_account_file('account.json')
        project = "internal-project"
        location = "europe-west1-b"
        recommender = 'google.compute.instance.MachineTypeRecommender'
        client = RecommenderClient(credentials=credential)
        name = client.recommender_path(project, location, recommender)

        elements = client.list_recommendations(name, page_size=4)
        for i in elements:
            print(i)

    main()

    Note that credential = RecommenderClient.from_service_account_file('account.json') will not return any error, it will just hang.

    That’s all folks!

  • ELK query using Python with time range

    Short post. Sharing how you make an ELK query from Python, also using a timestamp range:

    from elasticsearch import Elasticsearch
    from pandas import json_normalize

    es = Elasticsearch([{'host': '[elk_host]', 'port': elk_port}])

    query_body_mem = {
        "query": {
            "bool": {
                "must": [
                    {
                        "query_string": {
                            "query": "metricset.module:system metricset.name:memory AND tags:test AND host.name:[hostname]"
                        }
                    },
                    {
                        "range": {
                            "@timestamp": {
                                "gte": "now-2d",
                                "lt": "now"
                            }
                        }
                    }
                ]
            }
        }
    }

    res_mem = es.search(index="metricbeat-*", body=query_body_mem, size=500)
    df_mem = json_normalize(res_mem['hits']['hits'])
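    json_normalize flattens the nested `_source` documents into dotted column names. A minimal sketch on a made-up response, just to show the resulting shape (the field names follow metricbeat conventions, but the values are illustrative only):

```python
from pandas import json_normalize

# Made-up response in the same shape as res_mem['hits']['hits'] above.
hits = [
    {"_index": "metricbeat-x",
     "_source": {"@timestamp": "2020-01-01T00:00:00Z",
                 "system": {"memory": {"used": {"pct": 0.42}}}}},
]

# Nested dicts become dot-separated columns, e.g. _source.system.memory.used.pct
df_mem = json_normalize(hits)
print(list(df_mem.columns))
```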
    

    And that’s all!

    Cheers