Blog Feed Post

Docker Log Management & Enrichment

Over the last several months we’ve made all kinds of improvements to Sematext Docker Agent (SDA).  If you’re not familiar with SDA yet, here it is in a nutshell

Sematext Docker Agent is a modern, open-source, Docker-native monitoring and log collection agent. It runs as a tiny container on each Docker host and provides automatic collection and processing of Docker Metrics, Events and Logs for all cluster nodes and all auto-discovered containers. It works with Kubernetes, Mesos, Docker Swarm, Docker Datacenter, Docker Cloud, as well as Amazon EC2, RancherOS, and CoreOS.

By default SDA collects all logs from all containers, but we’ve recently added LOGSENE_ENABLED_DEFAULT, a new flat that lets you control if log collection should be enabled by default or not. When set to false,  SDA will collect logs only from containers labelled with a LOGSENE_TOKEN=<Logsene app token> that, in addition to enabling log collection, also specifies where logs should be shipped. This is really useful because it:

  • Gives each team in an organization full control over log collection for their containers and the routing to their Logsene Apps or Elasticsearch indices. As you can imagine, this is very handy for larger organisations where e.g. one Swarm or Kubernetes cluster is shared by several teams, and each team needs to enable/disable log collection however they see fit and ship logs to different destinations.
  • Lets one exclude collection of logs from infrastructure containers whose logs may not contain enough value to be worth collecting.

The figure below illustrates this.  There are 3 containers running Nginx, Rancher agent, and MySQL.  We want to have explicit control over which containers logs are collected and where they are shipped.  We want to collect only Nginx logs, and we want to ship it to a specific Logsene app, not the default one set in SDA. We can accomplish that by doing the following

  • Set LOGSENE_ENABLED_DEFAULT=false flag in SDA config
  • Set Docker label LOGSENE_ENABLED=true for the Nginx container to enable its log collection
  • Set Docker label LOGSENE_TOKEN=…. for the Nginx container to specify which Logsene app we want to ship logs

https://sematext.com/wp-content/uploads/2017/05/LOGSENE_ENABLED_DEFAULT_... 300w, https://sematext.com/wp-content/uploads/2017/05/LOGSENE_ENABLED_DEFAULT_... 768w" sizes="(max-width: 960px) 100vw, 960px" />

Sematext Docker Agent log routing via container labels – disable log collection by default and enable it only for the Nginx container


We can also inverse the setup and enable log collection for all containers by default, as illustrated in the following figure.

https://sematext.com/wp-content/uploads/2017/05/LOGSENE_ENABLED_DEFAULT_... 300w, https://sematext.com/wp-content/uploads/2017/05/LOGSENE_ENABLED_DEFAULT_... 768w" sizes="(max-width: 960px) 100vw, 960px" />

Sematext Docker Agent log routing via container labels (default settings) – collect and ship all logs, either to default Logsene app or to container specific apps.

We’ve also introduced TAGGING_LABELS.  This flag lets you enrich your container logs with data extracted from your existing Docker environment variables or labels.  Just specify patterns for values to extract, e.g. TAGGING_LABELS=”com.docker.,com.myorg.,role*”. This will add extracted values to your log events as additional fields (you can also think of this as meta-data) and let you easily slice and dice your logs by using values from labels or environment variables.  You can, of course, also build custom reports using this data, thus extracting more value and operational insight from your existing data in Sematext Cloud.

https://sematext.com/wp-content/uploads/2017/05/LOGS_WITH_DOCKER_LABELS-... 300w, https://sematext.com/wp-content/uploads/2017/05/LOGS_WITH_DOCKER_LABELS-... 768w, https://sematext.com/wp-content/uploads/2017/05/LOGS_WITH_DOCKER_LABELS-... 1024w" sizes="(max-width: 2419px) 100vw, 2419px" />Enriched logs with Docker labels

Need a tool that collects your containers Metrics + Events + Logs?  Try Sematext Docker Agent, it’s all open-source and can send your data to Sematext Cloud so you don’t have to manage or build out the backend for storing all the monitoring data, alerting, etc.  For feature requests, bugs, or PRs, see sematext/sematext-agent-docker.


Read the original blog entry...

More Stories By Sematext Blog

Sematext is a globally distributed organization that builds innovative Cloud and On Premises solutions for performance monitoring, alerting and anomaly detection (SPM), log management and analytics (Logsene), and search analytics (SSA). We also provide Search and Big Data consulting services and offer 24/7 production support for Solr and Elasticsearch.

Latest Stories
Dion Hinchcliffe is an internationally recognized digital expert, bestselling book author, frequent keynote speaker, analyst, futurist, and transformation expert based in Washington, DC. He is currently Chief Strategy Officer at the industry-leading digital strategy and online community solutions firm, 7Summits.
DXWorldEXPO LLC announced today that Dez Blanchfield joined the faculty of CloudEXPO's "10-Year Anniversary Event" which will take place on November 11-13, 2018 in New York City. Dez is a strategic leader in business and digital transformation with 25 years of experience in the IT and telecommunications industries developing strategies and implementing business initiatives. He has a breadth of expertise spanning technologies such as cloud computing, big data and analytics, cognitive computing, m...
Digital Transformation and Disruption, Amazon Style - What You Can Learn. Chris Kocher is a co-founder of Grey Heron, a management and strategic marketing consulting firm. He has 25+ years in both strategic and hands-on operating experience helping executives and investors build revenues and shareholder value. He has consulted with over 130 companies on innovating with new business models, product strategies and monetization. Chris has held management positions at HP and Symantec in addition to ...
Cloud-enabled transformation has evolved from cost saving measure to business innovation strategy -- one that combines the cloud with cognitive capabilities to drive market disruption. Learn how you can achieve the insight and agility you need to gain a competitive advantage. Industry-acclaimed CTO and cloud expert, Shankar Kalyana presents. Only the most exceptional IBMers are appointed with the rare distinction of IBM Fellow, the highest technical honor in the company. Shankar has also receive...
DXWorldEXPO LLC announced today that Kevin Jackson joined the faculty of CloudEXPO's "10-Year Anniversary Event" which will take place on November 11-13, 2018 in New York City. Kevin L. Jackson is a globally recognized cloud computing expert and Founder/Author of the award winning "Cloud Musings" blog. Mr. Jackson has also been recognized as a "Top 100 Cybersecurity Influencer and Brand" by Onalytica (2015), a Huffington Post "Top 100 Cloud Computing Experts on Twitter" (2013) and a "Top 50 C...
There is a huge demand for responsive, real-time mobile and web experiences, but current architectural patterns do not easily accommodate applications that respond to events in real time. Common solutions using message queues or HTTP long-polling quickly lead to resiliency, scalability and development velocity challenges. In his session at 21st Cloud Expo, Ryland Degnan, a Senior Software Engineer on the Netflix Edge Platform team, will discuss how by leveraging a reactive stream-based protocol,...
Enterprises have taken advantage of IoT to achieve important revenue and cost advantages. What is less apparent is how incumbent enterprises operating at scale have, following success with IoT, built analytic, operations management and software development capabilities - ranging from autonomous vehicles to manageable robotics installations. They have embraced these capabilities as if they were Silicon Valley startups.
Daniel Jones is CTO of EngineerBetter, helping enterprises deliver value faster. Previously he was an IT consultant, indie video games developer, head of web development in the finance sector, and an award-winning martial artist. Continuous Delivery makes it possible to exploit findings of cognitive psychology and neuroscience to increase the productivity and happiness of our teams.
Poor data quality and analytics drive down business value. In fact, Gartner estimated that the average financial impact of poor data quality on organizations is $9.7 million per year. But bad data is much more than a cost center. By eroding trust in information, analytics and the business decisions based on these, it is a serious impediment to digital transformation.
The standardization of container runtimes and images has sparked the creation of an almost overwhelming number of new open source projects that build on and otherwise work with these specifications. Of course, there's Kubernetes, which orchestrates and manages collections of containers. It was one of the first and best-known examples of projects that make containers truly useful for production use. However, more recently, the container ecosystem has truly exploded. A service mesh like Istio addr...
As DevOps methodologies expand their reach across the enterprise, organizations face the daunting challenge of adapting related cloud strategies to ensure optimal alignment, from managing complexity to ensuring proper governance. How can culture, automation, legacy apps and even budget be reexamined to enable this ongoing shift within the modern software factory? In her Day 2 Keynote at @DevOpsSummit at 21st Cloud Expo, Aruna Ravichandran, VP, DevOps Solutions Marketing, CA Technologies, was jo...
Predicting the future has never been more challenging - not because of the lack of data but because of the flood of ungoverned and risk laden information. Microsoft states that 2.5 exabytes of data are created every day. Expectations and reliance on data are being pushed to the limits, as demands around hybrid options continue to grow.
Business professionals no longer wonder if they'll migrate to the cloud; it's now a matter of when. The cloud environment has proved to be a major force in transitioning to an agile business model that enables quick decisions and fast implementation that solidify customer relationships. And when the cloud is combined with the power of cognitive computing, it drives innovation and transformation that achieves astounding competitive advantage.
Digital Transformation: Preparing Cloud & IoT Security for the Age of Artificial Intelligence. As automation and artificial intelligence (AI) power solution development and delivery, many businesses need to build backend cloud capabilities. Well-poised organizations, marketing smart devices with AI and BlockChain capabilities prepare to refine compliance and regulatory capabilities in 2018. Volumes of health, financial, technical and privacy data, along with tightening compliance requirements by...
As IoT continues to increase momentum, so does the associated risk. Secure Device Lifecycle Management (DLM) is ranked as one of the most important technology areas of IoT. Driving this trend is the realization that secure support for IoT devices provides companies the ability to deliver high-quality, reliable, secure offerings faster, create new revenue streams, and reduce support costs, all while building a competitive advantage in their markets. In this session, we will use customer use cases...