Category: cloud

  • Adding schema to BigQuery table

    I am currently working on a bigger article for Medium, but until I can put it into words in a way that is really worth sharing, I just wanted to write down one of the steps I learned along the way.

    Basically, I wanted to load some info that was stored in a CSV file into BigQuery for analysis and filtering.

    I thought that you could just add the header to the CSV file and BigQuery would automatically recognize it and load it.

    Turns out that it’s a little bit more complicated.

    Normally it should work, and since it did with my first CSV, I couldn’t really understand what was wrong.

    Now, there are two parts to this story:

    • How you can add the table schema manually (this will reveal the actual issue)
    • What you need to be aware of and why this happens

    How you can add the table schema manually

    The data that is written to the CSV is actually a DataFrame, so you have info about the column types directly from code:

    import json

    # df is the DataFrame that you are about to write to the CSV file
    dtype_mapping = {
        'object': 'STRING',
        'int64': 'FLOAT',
        'float64': 'FLOAT'
    }

    schema = []
    for column, dtype in df.dtypes.items():
        schema.append({
            'name': column,
            'type': dtype_mapping.get(str(dtype), 'STRING')  # Default to STRING if type is not in map
        })

    print(json.dumps(schema, indent=2))

    Yes, I know, int64 should be mapped to INTEGER, but it turns out that in my case some columns, even if marked as int64 in Python, need to be FLOAT in BigQuery. I know more memory is allocated, but the dataset is quite small.

    So you can easily write the CSV in a way that excludes the header:

    df.to_csv("df.csv", index=False, header=False, mode='a')

    The first snippet above will help you create a schema definition from a DataFrame’s column types.
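
    As a follow-up, here is a minimal sketch of how the generated schema could be used when loading the header-less CSV with the google-cloud-bigquery client (the table id and file name are made up for the example):

    from google.cloud import bigquery

    # hypothetical table id, replace with your own project.dataset.table
    table_id = "my-project.my_dataset.my_table"

    client = bigquery.Client()

    # reuse the schema list built from df.dtypes above
    job_config = bigquery.LoadJobConfig(
        schema=[bigquery.SchemaField(field['name'], field['type']) for field in schema],
        source_format=bigquery.SourceFormat.CSV,
        skip_leading_rows=0,  # the CSV was written without a header
    )

    with open("df.csv", "rb") as source_file:
        load_job = client.load_table_from_file(source_file, table_id, job_config=job_config)

    load_job.result()  # wait for the load job to finish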

    What you need to be aware of and why this happens

    The actual reason this happened is that I was not aware that a line with the header definition still remained somewhere in my CSV file (yes, I actually wrote multiple dataframes with the header and tried to filter it out with a Linux command, and it did not work).

    And when the file loaded, everything was typed as STRING because of that rogue header line.

    Normally, if that line had been missing and there had been no header at all, the columns would have been loaded with the types defined in the schema.

    Things to be learned from this exercise:

    • If you have to write multiple dataframes to one CSV file, don’t add the header and use the code above to generate a specific schema definition
    • Properly check that the CSV has no rogue lines that don’t match the rest of the structure of the data, otherwise you will find out that everything is converted to STRING and you won’t understand why (see the sketch after this list)
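
    Below is a small sketch of how such rogue lines could be spotted before loading (it assumes the combined file is df.csv and that a hypothetical first column named "column_a" never contains its own name as a value):

    import pandas as pd

    # read everything as string so a stray header row does not break parsing
    df_check = pd.read_csv("df.csv", header=None, dtype=str)

    # rows where the first field equals the column name are leftover headers
    first_column_name = "column_a"  # hypothetical, use your real first column name
    rogue_rows = df_check[df_check[0] == first_column_name]
    print(f"Found {len(rogue_rows)} rogue header-like rows")
    print(rogue_rows)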

    And now you know.

    Sorin

  • Small script to retrieve the OS version from GCP compute engine

    I’m adding here a small script that was drafted with AI but tested and modified by me, so that it returns the OS version for all of the VM instances under a project in GCP.

    #!/bin/bash
    
    # Prompt for the project ID
    read -p "Enter your Google Cloud Project ID: " PROJECT_ID
    
    if [ -z "$PROJECT_ID" ]; then
      echo "Project ID cannot be empty."
      exit 1
    fi
    
    echo "Fetching VM instances and their OS versions for project: $PROJECT_ID"
    echo "--------------------------------------------------------------------"
    echo "Instance Name        | Zone                 | OS (from inventory)"
    echo "---------------------|----------------------|---------------------"
    
    # Get all instances (name and zone) in the project
    # The `instances list` command can list instances across all zones if no --zones flag is specified.
    # However, `os-inventory describe` requires a specific zone.
    # So, we list name and zone, then describe each instance.
    gcloud compute instances list --project="$PROJECT_ID" --format="value(name,zone)" | while read -r INSTANCE_NAME ZONE; do
      # Query the OS inventory of the current instance and extract the
      # long OS name (distribution name and version).
      OS_INFO=$(gcloud compute instances os-inventory describe "$INSTANCE_NAME" \
        --zone="$ZONE" \
        --project="$PROJECT_ID" \
        --format="value(SystemInformation.LongName)" 2>/dev/null)
    
      # If no OS inventory info was found, display a placeholder
      if [ -z "$OS_INFO" ] || [ "$OS_INFO" = "None" ]; then
        OS_VERSION="N/A or Custom"
      else
        # The command above already returns the full OS name, so keep it as-is
        OS_VERSION="$OS_INFO"
      fi
    
      # Print the instance name, zone, and OS version in a formatted way
      printf "%-20s | %-20s | %s\n" "$INSTANCE_NAME" "$ZONE" "$OS_VERSION"
    done
    
    echo "--------------------------------------------------------------------"

    One thing to note (and this is why the script prints “N/A or Custom” for some instances): os-inventory data is only available when the OS Config agent is running on the instance and the OS Config API is enabled for the project. I don’t think any other details are needed in regard to this.

    Cheers,

    Sorin

  • Exclusive SASL on Zookeeper connections

    Something related to the following article: it seems that, up to version 3.6.1, even if SASL is configured, Zookeeper will still allow anonymous connections and actions.

    There is now a new configuration option that restricts such connections, and you can find it documented in the official Apache Zookeeper administration guide (zookeeper.sessionRequireClientSASLAuth).

    The main catch is that it’s not supposed to be configured in the zoo.cfg file, but added as a parameter in java.env as part of the SERVER_JVMFLAGS variable.

    The old variable which was

    zookeeperstd::jvm_flags: "-Djava.security.auth.login.config=/opt/zookeeper/conf/zoo_jaas.config"

    will become

    zookeeperstd::jvm_flags: "-Djava.security.auth.login.config=/opt/zookeeper/conf/zoo_jaas.config -Dzookeeper.allowSaslFailedClients=false -Dzookeeper.sessionRequireClientSASLAuth=true"
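
    For reference, the zoo_jaas.config file referenced in SERVER_JVMFLAGS normally also contains a Server section similar to the one below (the usernames and passwords are placeholders, adapt them to your setup):

    Server {
           org.apache.zookeeper.server.auth.DigestLoginModule required
           user_super="[super_password]"
           user_[client_username]="[client_password]";
    };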

    After this is implemented, when you try to connect using zkCli.sh it will let you, but when you try to list the root node of the resource tree it won’t work.

    Example:

    Connecting to localhost:2181
    Welcome to ZooKeeper!
    JLine support is enabled
    
    WATCHER::
    
    WatchedEvent state:SyncConnected type:None path:null
    [zk: localhost:2181(CONNECTED) 0] ls /
    KeeperErrorCode = Session closed because client failed to authenticate for /
    [zk: localhost:2181(CONNECTED) 1] 
    

    The same thing happens if you use zkCli.sh -server [hostname]:2181

    In order to connect you will have to add to java.env a line with:

    CLIENT_JVMFLAGS="-Djava.security.auth.login.config=/opt/zookeeper/conf/client_jaas.config"

    The client JAAS file referenced above has the following structure:

    Client {
           org.apache.zookeeper.server.auth.DigestLoginModule required
           username="[client_username]"
           password="[client_password]";
    };

    Cheers

  • Prometheus metrics to Pandas data frame

    Hi,

    We are trying to implement a decision tree algorithm in order to see if our resource usage can classify our servers into different categories.

    The first step in that process is querying Prometheus from Python and creating some data frames with basic information, in order to aggregate it.

    For that purpose, you can use the following lines of code:

    import copy

    import pandas as pd
    import requests

    URL = "http://[node_hostname]:9090/api/v1/query?query=metric_to_be_queried[1d]"

    r = requests.get(url=URL)

    data = r.json()

    metric_list = []
    for i in data['data']['result']:
        for j in i['values']:
            # copy the label set for every sample, otherwise every appended row
            # would reference the same dict and keep only the last time/value
            data_dict = copy.deepcopy(i['metric'])
            data_dict['time'] = j[0]
            data_dict['value'] = j[1]
            metric_list.append(data_dict)

    df_metric = pd.DataFrame(metric_list)
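
    The Prometheus HTTP API returns timestamps as Unix seconds and sample values as strings, so a small conversion step is usually needed before any aggregation (a minimal sketch, using the df_metric frame built above):

    # convert the raw API output into proper dtypes
    df_metric['time'] = pd.to_datetime(df_metric['time'], unit='s')
    df_metric['value'] = df_metric['value'].astype(float)

    print(df_metric.dtypes)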

    Other pieces will follow.

    Cheers

  • Datadog and GCP are “friends” up to a point

    Hi,

    Since lately I have preferred to publish more on Medium, let me give you the link to the latest article.

    There is an interesting case in which the combination of automation, Google Cloud Platform and Datadog didn’t go as we expected.

    https://medium.com/metrosystemsro/puppet-datadog-google-cloud-platform-recipe-for-a-small-outage-310166e551f1

    Hope you enjoy it! I will also come back with more interesting topics on this blog.

    Cheers

  • Overriding OS fact with external one

    Hi,

    Short notice article. We had an issue in which the traefik module code was not running because of a wrong os fact. Although the image is Ubuntu 14.04, facter returns it like this:

    {
      architecture => "amd64",
      family => "Debian",
      hardware => "x86_64",
      name => "Debian",
      release => {
        full => "jessie/sid",
        major => "jessie/sid"
      },
      selinux => {
        enabled => false
      }
    }

    I honestly don’t know why this happens, since on the rest of the machines it works fine. The fast way to fix it is by defining an external fact in /etc/facter/facts.d.

    Create a file named os_fact.json, for example, that will contain this content:

    { 
       "os":{ 
          "architecture":"amd64",
          "distro":{ 
             "codename":"trusty",
             "description":"Ubuntu 14.04.6 LTS",
             "id":"Ubuntu",
             "release":{ 
                "full":"14.04",
                "major":"14.04"
             }
          },
          "family":"Debian",
          "hardware":"x86_64",
          "name":"Ubuntu",
          "release":{ 
             "full":"14.04",
             "major":"14.04"
          },
          "selinux":{ 
             "enabled":"false"
          }
       }
    }
    

    And it’s fixed.

    Cheers

  • Strange problem in puppet run for Ubuntu

    Hi,

    Short sharing of a strange case.

    We’ve written a small manifest in order to distribute some python scripts. You can find the reference here: https://medium.com/metrosystemsro/new-ground-automatic-increase-of-kafka-lvm-on-gcp-311633b0816c

    When you try to run it on Ubuntu 14.04, there is this very strange error:

    Error: Failed to apply catalog: [nil, nil, nil, nil, nil, nil]

    The cause for this is the Python version (and I believe this is the default maximum version on trusty):

    Python 3.4.3 (default, Nov 12 2018, 22:25:49)
    [GCC 4.8.4] on linux

    In order to install the dependencies, you need python3-pip, so a short search returns the following options:

    apt search python3-pip
    Sorting... Done
    Full Text Search... Done
    python3-pip/trusty-updates,now 1.5.4-1ubuntu4 all [installed]
      alternative Python package installer - Python 3 version of the package
    
    python3-pipeline/trusty 0.1.3-3 all
      iterator pipelines for Python 3

    If we want to list all the installed modules with pip3 list, guess what, it doesn’t work:

    Traceback (most recent call last):
       File "/usr/bin/pip3", line 5, in <module>
         from pkg_resources import load_entry_point
       File "/usr/local/lib/python3.4/dist-packages/pkg_resources/__init__.py", line 93, in <module>
         raise RuntimeError("Python 3.5 or later is required")
     RuntimeError: Python 3.5 or later is required

    So, the main conclusion is that it’s not related to Puppet, just a version incompatibility on this old distribution.

    Cheers

  • Automatic increase of Kafka LVM on GCP

    I wrote an article for my company that was published on Medium regarding the topic in the subject. Please see the link:

    https://medium.com/metrosystemsro/new-ground-automatic-increase-of-kafka-lvm-on-gcp-311633b0816c

    Thanks

  • Using GCP recommender API for Compute engine

    Let’s keep it short. If you want to use the Python libraries for the Recommender API, this is how you connect to your project.

    from google.cloud.recommender_v1beta1 import RecommenderClient
    from google.oauth2 import service_account

    def main():
        credential = service_account.Credentials.from_service_account_file('account.json')
        project = "internal-project"
        location = "europe-west1-b"
        recommender = 'google.compute.instance.MachineTypeRecommender'
        client = RecommenderClient(credentials=credential)
        name = client.recommender_path(project, location, recommender)

        elements = client.list_recommendations(name, page_size=4)
        for i in elements:
            print(i)

    main()

    Note that credential = RecommenderClient.from_service_account_file('account.json') will not return any error, it will just hang.
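
    If the full protobuf output is too verbose, a slight variation of the loop inside main() (a sketch, assuming the Recommendation objects expose the name and description fields) prints only a short summary:

    for recommendation in client.list_recommendations(name, page_size=4):
        # print only the resource name and the human-readable description
        print(recommendation.name)
        print(recommendation.description)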

    That’s all folks!