Kubernetes Containers: Logging and Monitoring support

In this post we will:

  • Introduce Kubernetes concepts and motivation for Kubernetes-aware monitoring and logging tooling
  • Show how to deploy the Sematext Docker Agent to each Kubernetes node with DaemonSet
  • Point out key Kubernetes metrics and log elements to help you troubleshoot and tune Docker and Kubernetes

Managing microservices in containers is typically done with cluster managers and orchestration tools such as Google Kubernetes, Apache Mesos, Docker Swarm, Docker Cloud, Amazon ECS, and HashiCorp Nomad, to mention just a few. However, each platform has a slightly different set of options for deploying containers or scheduling tasks on each cluster node. This is why we started a series of blog posts with Docker Swarm Monitoring, and continue today with a quick tutorial on container monitoring and log collection on Kubernetes.

Kubernetes Core Concepts

Kubernetes is one of the most popular and stable management platforms for Docker containers – it powers Google Container Engine (GKE) on the Google Cloud Platform.

In Kubernetes, a group of one or more containers is called a pod. Containers in a pod are deployed together, and are started, stopped, and replicated as a group. A pod could represent, for example, a web server and a database that run together as a microservice and share network and storage resources.

Replication controllers manage the deployment of pods to the cluster nodes and are responsible for the creation, scaling and termination of pods. For example, when a node shuts down, the replication controller starts replacement pods on other nodes to ensure the desired number of replicas for this pod is available.

Kubernetes services provide connectivity through a load balancing proxy for the multiple pods that belong to a service. This way clients don’t need to know which node runs a pod for the current service request.

Each pod can have multiple labels. These labels are used to select resources for operations. For example, replication controllers and services discover pods by label selectors.
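To make label selection concrete, here is a minimal sketch of equality-based matching, the simplest form of label selector. The pod names, labels, and helper functions are hypothetical and exist only for illustration; this is not Kubernetes code, just the matching rule written out.

```python
from typing import Dict, List

def matches(selector: Dict[str, str], labels: Dict[str, str]) -> bool:
    """True if every key/value pair in the selector is present in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())

def select_pods(selector: Dict[str, str], pods: Dict[str, Dict[str, str]]) -> List[str]:
    """Return the sorted names of all pods whose labels satisfy the selector."""
    return sorted(name for name, labels in pods.items() if matches(selector, labels))

# Hypothetical pods with their labels.
pods = {
    "web-1": {"app": "web", "tier": "frontend"},
    "web-2": {"app": "web", "tier": "frontend"},
    "db-1": {"app": "db", "tier": "backend"},
}

# A service selecting on app=web would route to both web pods.
print(select_pods({"app": "web"}, pods))
```

A selector with several pairs requires all of them to match, which is why adding a label to a selector can only narrow the set of selected pods.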

Dynamic Deployments Require Dynamic Monitoring

The high level of automation for the container and microservice lifecycle makes the monitoring of Kubernetes more challenging than in more traditional, more static deployments. Any static setup to monitor specific application containers would not work because Kubernetes makes its own decisions according to the defined deployment rules. It is not only the deployed microservices that need to be monitored. It is equally important to watch metrics and logs for the Kubernetes core services themselves, such as the Kubernetes Master running etcd, controller-manager, scheduler and apiserver, and the Kubernetes Workers (aka Minions) running kubelet and the proxy service. Having a centralized place to keep an eye on all these services, their metrics and logs helps one spot problems in the cluster infrastructure. Kubernetes core services can be installed on bare metal, in virtual machines or as containers using Docker. Deploying Kubernetes core services in containers can simplify deployment and monitoring operations, because tools for container monitoring then cover both core services and application containers. So how does one monitor such a complex and dynamic environment?

Agent for Kubernetes Metrics and Logs

There are a number of open source Docker monitoring tools one can cobble together to build a monitoring and log collection system (or systems). The advantage is that the code is all free. The downside is that this takes time, both initially when setting it up and later when maintaining it. That’s why we built Sematext Docker Agent, a modern, Docker-aware metrics, events, and log collection agent. It runs as a tiny container on every Docker host and collects logs, metrics and events for all cluster nodes and all containers. It discovers all containers (one pod might contain multiple containers), including containers for Kubernetes core services if core services are deployed in Docker containers. After its deployment, all logs and metrics are immediately available out of the box. Why is this valuable? Because it means you don’t have to spend the next N hours or days figuring out which data to collect and how to chart it, plus you don’t need the resources to maintain your own logging and monitoring infrastructure. Let’s see how to deploy this agent.

Deploying Agent to all Kubernetes Nodes

Kubernetes provides DaemonSets, which ensure pods are added to nodes as nodes are added to the cluster. We can use this to easily deploy Sematext Agent to each cluster node!

Configure Sematext Docker Agent for Kubernetes

Sematext Docker Agent is configured via environment variables.

The Sematext Docker Agent GitHub page lists all options (e.g. filter for specific pods/images/containers), but we’ll keep it simple here:

  1. Get a free account at apps.sematext.com, if you don’t have one already.
  2. Create an SPM App of type “Docker” to obtain the SPM App Token.
    SPM App will hold your Kubernetes performance metrics and events.
  3. Create a Logsene App to obtain the Logsene App Token.
    Logsene App will hold your Kubernetes logs.
  4. Edit values of LOGSENE_TOKEN and SPM_TOKEN in the DaemonSet definition as shown below.
    1. Grab the latest sematext-agent-daemonset.yml (raw plain-text) template (also shown below)
    2. Store it somewhere on disk
    3. Replace the SPM_TOKEN and LOGSENE_TOKEN placeholders with your SPM and Logsene App tokens

apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: sematext-agent
spec:
  template:
    metadata:
      labels:
        app: sematext-agent
    spec:
      selector: {}
      dnsPolicy: "ClusterFirst"
      restartPolicy: "Always"
      containers:
      - name: sematext-agent
        image: sematext/sematext-agent-docker:latest
        imagePullPolicy: "Always"
        env:
        - name: SPM_TOKEN
          value: "REPLACE THIS WITH YOUR SPM TOKEN"
        - name: LOGSENE_TOKEN
          value: "REPLACE THIS WITH YOUR LOGSENE TOKEN"
        - name: KUBERNETES
          value: "1"
        volumeMounts:
          - mountPath: /var/run/docker.sock
            name: docker-sock
          - mountPath: /etc/localtime
            name: localtime
      volumes:
        - name: docker-sock
          hostPath:
            path: /var/run/docker.sock
        - name: localtime
          hostPath:
            path: /etc/localtime
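
If you prefer to script the placeholder replacement instead of editing the file by hand, the step can be sketched as below. This is only an illustration: the `fill_tokens` helper and the token values `my-spm-token` and `my-logsene-token` are hypothetical, and the excerpt stands in for the full template text.

```python
# Placeholder strings exactly as they appear in the DaemonSet template.
SPM_PLACEHOLDER = "REPLACE THIS WITH YOUR SPM TOKEN"
LOGSENE_PLACEHOLDER = "REPLACE THIS WITH YOUR LOGSENE TOKEN"

def fill_tokens(template: str, spm_token: str, logsene_token: str) -> str:
    """Return the template text with both token placeholders replaced."""
    for placeholder in (SPM_PLACEHOLDER, LOGSENE_PLACEHOLDER):
        if placeholder not in template:
            raise ValueError(f"placeholder not found: {placeholder!r}")
    return (template
            .replace(SPM_PLACEHOLDER, spm_token)
            .replace(LOGSENE_PLACEHOLDER, logsene_token))

# A short excerpt of the env section, standing in for the full file.
excerpt = (
    '        - name: SPM_TOKEN\n'
    '          value: "REPLACE THIS WITH YOUR SPM TOKEN"\n'
    '        - name: LOGSENE_TOKEN\n'
    '          value: "REPLACE THIS WITH YOUR LOGSENE TOKEN"\n'
)

# Hypothetical token values; use the tokens of your own SPM and Logsene Apps.
filled = fill_tokens(excerpt, "my-spm-token", "my-logsene-token")
print(filled)
```

Raising an error when a placeholder is missing guards against silently deploying a template that was already edited or has a different layout.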

Run Agent as DaemonSet

Activate Sematext Docker Agent with kubectl:

> kubectl create -f sematext-agent-daemonset.yml
daemonset "sematext-agent" created

Now let’s check if the agent got deployed to all nodes:

> kubectl get pods
NAME                   READY     STATUS              RESTARTS   AGE
sematext-agent-nh4ez   0/1       ContainerCreating   0          6s
sematext-agent-s47vz   0/1       ImageNotReady       0          6s

The status “ImageNotReady” or “ContainerCreating” might be visible for a short time because Kubernetes must first download the sematext/sematext-agent-docker image. The setting imagePullPolicy: “Always” specified in sematext-agent-daemonset.yml tells Kubernetes to pull the image from Docker Hub every time an agent container starts, so a restarted agent picks up the latest published version.

If we check again we’ll see Sematext Docker Agent got deployed to (all) cluster nodes:

> kubectl get pods -l app=sematext-agent
NAME                   READY     STATUS    RESTARTS   AGE
sematext-agent-nh4ez   1/1       Running   0          8s
sematext-agent-s47vz   1/1       Running   0          8s

Less than a minute after the deployment you should see your Kubernetes metrics and logs! Below are screenshots of various out of the box reports and explanations of various metrics’ meanings!

Interpreting Kubernetes Metrics

The metrics from all Kubernetes nodes are collected in a single SPM App, which aggregates metrics on several levels:

  • Cluster – metrics aggregated over all nodes displayed in SPM overview
  • Host / node level – metrics aggregated per node
  • Docker Image level – metrics aggregated by image name, e.g. all nginx webserver containers
  • Docker Container level – metrics aggregated for a single container

Figure: Host and Container Metrics from the Kubernetes Cluster

Each detailed chart has filter options for Node, Docker Image, and Docker Container. Because Kubernetes uses the pod name in the names of the Docker containers, a search by pod name in the Docker Container filter makes it easy to select all containers for a specific pod.

Let’s have a look at a few Kubernetes (and Docker) key metrics provided by SPM:

  • Host Metrics such as CPU, Memory and Disk space usage. Docker images and containers consume more disk space than regular processes installed on a host. For example, an application image might include a Linux operating system and might have a size of 150-700 MB depending on the size of the base image and the tools installed in the container. Data containers consume disk space on the host as well. In our experience, watching the disk space and using cleanup tools is essential for continuous operations of Docker hosts.

  • Container count. This represents the number of running containers per host.

Figure: Container Counters per Kubernetes Node over time

 

  • Container Memory and Memory Fail Counters. These metrics are important to watch and essential for tuning applications. Memory limits should fit the footprint of the deployed pod (application) to avoid situations where Kubernetes uses default limits (e.g. defined for a namespace), which could lead to OOM kills of containers. Memory fail counters reflect the number of failed memory allocations in a container, and in case of OOM kills a Docker Event is triggered. This event is then displayed in SPM because Sematext Docker Agent collects all Docker Events. The best practice is to tune memory settings in a few iterations:
    1. Monitor memory usage of the application container
    2. Set memory limits according to the observations
    3. Continue monitoring memory, memory fail counters, and Out-Of-Memory events. If OOM events happen, the container memory limits may need to be increased, or debugging is required to find the reason for the high memory consumption.

Figure: Container memory usage, limits and fail counters
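
Step 2 of the memory tuning iterations, deriving a limit from observed usage, could be sketched as follows. The 25% headroom and the rounding to 64 MiB steps are illustrative assumptions, not values recommended by Kubernetes or Sematext, and the sample numbers are made up.

```python
import math
from typing import Sequence

def suggest_memory_limit_mib(samples_mib: Sequence[float],
                             headroom: float = 0.25,
                             step_mib: int = 64) -> int:
    """Suggest a memory limit: observed peak plus headroom, rounded up to a step."""
    if not samples_mib:
        raise ValueError("need at least one memory usage sample")
    peak = max(samples_mib)
    target = peak * (1 + headroom)
    # Round up so the suggested limit never falls below peak plus headroom.
    return int(math.ceil(target / step_mib) * step_mib)

# Hypothetical memory usage samples (MiB) observed for one container.
observed = [180, 210, 245, 230]
print(suggest_memory_limit_mib(observed))
```

If OOM events still show up after applying such a limit, the observation window was probably too short to capture the real peak, so repeat the measurement before simply raising the headroom.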

 

  • Container CPU usage and throttled CPU time. CPU usage can be limited by CPU shares. Unlike a memory limit, a CPU share is not a hard limit: containers might use more CPU as long as the resource is available, but when other containers need the CPU, the limits apply and the CPU gets throttled to the limit.
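
To illustrate why CPU shares only matter under contention, the following sketch computes how CPU time would be divided when all containers are busy at the same time. It is a simplified model under stated assumptions: the container names and share values are hypothetical, and idle containers (whose unused share would be available to the others) are ignored.

```python
from typing import Dict

def cpu_split_under_contention(shares: Dict[str, int],
                               total_cpus: float = 1.0) -> Dict[str, float]:
    """Divide the available CPUs among busy containers in proportion to their shares."""
    total_shares = sum(shares.values())
    if total_shares <= 0:
        raise ValueError("total shares must be positive")
    return {name: total_cpus * value / total_shares
            for name, value in shares.items()}

# Hypothetical containers competing for a single CPU.
split = cpu_split_under_contention({"web": 1024, "worker": 512, "batch": 512})
print(split)
```

When only one of these containers is busy, it may use the whole CPU, which is the "not a hard limit" behaviour described above.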

 

There are more metrics to watch, like disk I/O throughput, network throughput and network errors for containers, but let’s continue by looking at Kubernetes logs next.

Understanding Kubernetes Logs

Kubernetes containers’ logs are not much different from Docker container logs. However, Kubernetes users need to view logs for the deployed pods. That’s why it is very useful to have Kubernetes-specific information available for log search, such as:

  • Kubernetes namespace
  • Kubernetes pod name
  • Kubernetes container name
  • Docker image name
  • Kubernetes UID

Sematext Docker Agent extracts this information from the Docker container names and tags all logs with the information mentioned above. Having these data extracted into individual fields makes it very easy to watch logs of deployed pods, build reports from logs, quickly narrow down to problematic pods while troubleshooting, and so on! If Kubernetes core components (such as kubelet, proxy, api server) are deployed via Docker, the Sematext Docker Agent will collect their logs as well.

Figure: All logs from Kubernetes containers in Logsene
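
The extraction of Kubernetes fields from a Docker container name can be sketched as below. This is not the agent's actual implementation. It assumes the kubelet names containers as `k8s_<container>_<pod>_<namespace>_<uid>_<attempt>`, optionally with a `.<hash>` suffix on the container part; the exact layout differs between Kubernetes versions, and the sample names are made up.

```python
from typing import Dict, Optional

def parse_k8s_container_name(name: str) -> Optional[Dict[str, str]]:
    """Split a kubelet-generated Docker container name into Kubernetes fields.

    Returns None for names that do not follow the assumed layout.
    """
    # The Docker API may report names with a leading slash.
    parts = name.lstrip("/").split("_")
    if len(parts) != 6 or parts[0] != "k8s":
        return None
    _prefix, container, pod, namespace, uid, _attempt = parts
    return {
        # Drop an optional ".<hash>" suffix from the container name.
        "container": container.split(".")[0],
        "pod": pod,
        "namespace": namespace,
        "uid": uid,
    }

# Hypothetical container name following the assumed layout.
fields = parse_k8s_container_name(
    "/k8s_nginx_web-2013657156-x7k2p_default_0b8f0c3e-c4a1-11e6-a0f2-42010a800002_0"
)
print(fields)
```

Splitting on underscores works here because Kubernetes names for pods, containers and namespaces may not contain underscores. Containers started outside Kubernetes yield None and can be logged without the extra fields.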

There are many other useful features Logsene and Sematext Docker Agent give you out of the box, such as:

  • Automatic format detection and parsing of logs
    • Sematext Docker Agent includes patterns to recognize and parse many log formats
  • Custom pattern definitions for specific images and application types
  • Automatic Geo-IP enrichment for container logs
  • Filtering logs e.g. to exclude noisy services
  • Masking of sensitive data in specific log fields (phone numbers, payment information, authentication tokens, …)
  • Alerts and scheduled reports based on logs
  • Analytics for structured logs e.g. in Kibana or Grafana

Most of those topics are described in our post Innovative Docker Log Management and are relevant for Kubernetes log management as well. If you want to learn more about Docker monitoring, we’ve described that in Docker Key Metrics. If you’d like to learn even more about Docker monitoring and logging, stay tuned on sematext.com/blog or follow @sematext. To monitor your Docker and Kubernetes environments, any of Sematext’s other pre-built integrations, your app’s custom metrics, or your Docker logs, sign up for a free 30-day trial account.


More Stories By Sematext Blog

Sematext is a globally distributed organization that builds innovative Cloud and On Premises solutions for performance monitoring, alerting and anomaly detection (SPM), log management and analytics (Logsene), and search analytics (SSA). We also provide Search and Big Data consulting services and offer 24/7 production support for Solr and Elasticsearch.
