
What is Syslog: Daemons, Message Formats and Protocols

Pretty much everyone’s heard of syslog: with its roots in the 80s, it’s still behind a lot of the logging done today. Mostly because of its long history, syslog is quite a vague concept that refers to many things, which is why you’ve probably heard:

  • Check syslog, maybe it says something about the problem – referring to /var/log/messages.
  • Syslog doesn’t support messages longer than 1K – about message format restrictions.
  • Syslog is unreliable – referring to the UDP protocol.

In this post, we’ll explain the different facets by being specific: instead of saying “syslog”, you’ll read about syslog daemons, about syslog message formats and about syslog protocols.

Note the plurals: there are multiple options for each. We’ll show the important ones here to shed some light over the vague (and surprisingly rich) concept. Along the way, we’ll debunk some of the myths surrounding syslog. For example, you can choose to limit messages to 1K and you can choose to send them via UDP, but you don’t have to – it’s not even a default in modern syslog daemons.

Syslog daemons

A syslog daemon is a program that:

  • can receive local syslog messages. Traditionally, these come from the /dev/log UNIX socket and from the kernel.
  • can write them to a file. Traditionally, /var/log/messages or /var/log/syslog receives everything, while some categories of messages go to specific files, like /var/log/mail.
  • can forward them to the network or other destinations. Traditionally via UDP, and the daemon usually also implements the equivalent network listeners (a UDP listener, in this case).
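To make the first bullet concrete, here is a minimal, self-contained sketch of the local logging path. We bind our own UNIX datagram socket to stand in for the daemon’s /dev/log listener (the socket path is ours; a real daemon owns /dev/log), then point Python’s standard SysLogHandler at it:

```python
import logging
import logging.handlers
import os
import socket
import tempfile

# Our stand-in for the daemon's /dev/log listener: a UNIX datagram socket
# bound at a temporary path of our choosing.
sock_path = os.path.join(tempfile.mkdtemp(), "log")
listener = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
listener.bind(sock_path)

# The "application" side: Python's stock syslog handler, pointed at our path.
handler = logging.handlers.SysLogHandler(
    address=sock_path,
    facility=logging.handlers.SysLogHandler.LOG_AUTH,
)
logger = logging.getLogger("su")
logger.addHandler(handler)
logger.critical("'su root' failed for lonvick on /dev/pts/8")

# The listener receives a priority-prefixed message: facility 4 (Auth) * 8
# plus severity 2 (Critical) gives the <34> prefix.
data = listener.recv(1024)
print(data.decode())
```

This only sketches the transport; a real daemon would go on to route the message to files or to the network.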

This is where “syslog” often refers to syslogd (or sysklogd), the original BSD syslog daemon. Its development for Linux stopped in 2007, but continued for the BSDs and OS X. There are alternatives, most notably:
  • rsyslog. Originally a fork of syslogd, it can still be used as a drop-in replacement for it. Over the years, it evolved into a performance-oriented, multipurpose logging tool that can read data from multiple sources, parse and enrich logs in various ways, and ship them to various destinations.
  • syslog-ng. Unlike rsyslog, it used a different configuration format from the start (rsyslog eventually came to the same conclusion, but still supports the BSD syslog config syntax as well – which can be confusing at times). You’ll find a similar feature set to rsyslog’s, like parsing unstructured data and shipping it to Elasticsearch or Kafka. It’s still fast and light, and while it may not match rsyslog’s ultimate performance (it depends on the use-case; see the comments section), it has better documentation and is more portable.
  • nxlog. Yet another syslog daemon that evolved into a multi-purpose log shipper, it sets itself apart by working well on Windows.

In essence, a modern syslog daemon is a log shipper that works with various syslog message formats and protocols. If you want to learn more about log shippers in general, we wrote a side-by-side comparison of Logstash and 5 other popular shippers, including rsyslog and syslog-ng.

Myths about syslog daemons

The one we come across most often is that syslog daemons are no good if you log to files or if you want to parse unstructured data. This used to be true years ago, but then so was Y2K. Things changed in the meantime. In the myth’s defense, some distributions ship with old versions of rsyslog and syslog-ng. Plus, the default configuration often only listens for /dev/log and kernel messages (it doesn’t need more), so it’s easy to generalize.

Syslog message formats

You’ll normally find syslog messages in two major formats:

RFC3164 a.k.a. “the old format”

Although the RFC label suggests a standard, RFC3164 was more a collection of what was found in the wild at the time (2001) than a spec implementations would adhere to. As a result, you’ll find slight variations of it. That said, most messages will look like the RFC3164 example:

<34>Oct 11 22:14:15 mymachine su: 'su root' failed for lonvick on /dev/pts/8

This is how the application should log to /dev/log, and you can see some structure:

  • <34> is the priority number. It encodes the facility number multiplied by 8, plus the severity. In this case, facility=4 (Auth) and severity=2 (Critical).
  • Oct 11 22:14:15 is commonly known as the syslog timestamp. It lacks the year and the time zone, and has no sub-second precision. For those reasons, rsyslog also parses RFC3164-formatted messages that carry an ISO-8601 timestamp instead.
  • mymachine is the host name where the message was written.
  • su: is the tag. Typically this is the process name – sometimes with a PID (like su[1234]:). The tag typically ends in a colon, but it may end with just the square brackets or with a space.
  • the message (MSG) is everything after the tag. In this example, since the colon separates the tag from the message, the message actually starts with a space. This tiny detail often causes a lot of headaches when parsing.
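The structure above can be pulled apart with a few lines of code. This is a hedged sketch that handles only the example message; real-world RFC3164 traffic varies, so production parsers are far more forgiving than this regex:

```python
import re

# A strict parser for the RFC3164 example above. Note the named groups map
# one-to-one to the fields described in the bullet list.
RFC3164 = re.compile(
    r"<(?P<pri>\d{1,3})>"                             # priority number
    r"(?P<timestamp>\w{3} [ \d]\d \d{2}:\d{2}:\d{2}) " # syslog timestamp
    r"(?P<host>\S+) "                                  # host name
    r"(?P<tag>\S+?):"                                  # tag, ending in a colon
    r"(?P<msg>.*)"                                     # MSG, keeps its leading space
)

line = "<34>Oct 11 22:14:15 mymachine su: 'su root' failed for lonvick on /dev/pts/8"
m = RFC3164.match(line)

# priority = facility * 8 + severity, so divmod recovers both
pri = int(m.group("pri"))
facility, severity = divmod(pri, 8)
print(facility, severity)    # 4 (Auth), 2 (Critical)
print(repr(m.group("msg")))  # note the leading space in the message
```

The `repr()` on the last line makes the leading-space headache visible: the message is `" 'su root' failed..."`, space included.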

In /var/log/messages, you’ll often see something like this:

Oct 11 22:14:15 su: 'su root' failed for lonvick on /dev/pts/8

This isn’t a syslog message format, it’s just how most syslog daemons write messages to files by default. Usually, you can choose what the output looks like; rsyslog, for example, has templates.

RFC5424 a.k.a. “the new format”

RFC5424 came out in 2009 to address the problems of RFC3164. First of all, it’s an actual standard that daemons and libraries chose to implement. Here’s an example message:

<34>1 2003-10-11T22:14:15.003Z mymachine.example.com su - - - 'su root' failed for lonvick on /dev/pts/8

Now we get an ISO-8601 timestamp, among other improvements. We also get more structure: the dashes you see there are placeholders for the PID, the message ID and any other structured data you may have. That said, RFC5424 structured data never really took off, as people preferred to put JSON in the syslog message (whether the old or the new format). Finally, the new format supports UTF-8 and other encodings, not just ASCII, and it’s easier to extend because it has a version number (in this example, the 1 after the priority number).
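Because the RFC5424 header is space-delimited with fixed positions, splitting it is straightforward. A rough sketch for the example above, where all three structured fields carry the nil value “-” (a full parser would also have to handle the bracketed SD-ELEMENT syntax):

```python
# The RFC5424 example message from above.
line = ("<34>1 2003-10-11T22:14:15.003Z mymachine.example.com su - - - "
        "'su root' failed for lonvick on /dev/pts/8")

# Split off the priority, then split the remaining header on single spaces.
pri_version, rest = line.split(">", 1)
pri = int(pri_version[1:])  # drop the leading "<"
version, timestamp, host, app, procid, msgid, structured, msg = rest.split(" ", 7)

print(version, timestamp, host)  # 1 2003-10-11T22:14:15.003Z mymachine.example.com
print(procid, msgid, structured) # the three "-" nil placeholders
```

Note how the priority calculation is unchanged from the old format; only the header layout after it differs.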

Myths around the syslog message formats

The one we see most often is that syslog messages are limited to 1K. As mentioned above, you can choose such a limit, but modern syslog daemons don’t impose it by default.

Syslog protocols

Originally, syslog messages were sent over the wire via UDP, which was also mentioned in RFC3164. The UDP transport was later standardized in RFC5426, after the new message format (RFC5424) was published.

Modern syslog daemons support other protocols as well. Most notably:

  • TCP. Just like UDP, it was first used in the wild and then documented. That documentation finally came with RFC6587, which describes two flavors:
    • messages are delimited by a trailer character, typically a newline
    • messages are framed based on an octet count
  • TLS. Standardized in RFC5425, which allows for encryption and certificate-based authorization
  • RELP. Unlike plain TCP, RELP adds application-level acknowledgements, which provides at-least-once delivery guarantees. You can also combine RELP with TLS if you need encryption and authorization.

Besides writing to files and communicating with each other, modern syslog daemons can also write to other destinations: datastores like MySQL or Elasticsearch, or queuing systems such as Kafka and RabbitMQ. Each such destination often comes with its own protocol and message format. For example, Elasticsearch accepts JSON over HTTP (though you can also secure it and send syslog messages over HTTPS).
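As a rough idea of what such a destination-specific format looks like, here is the kind of JSON document a syslog daemon’s Elasticsearch output might build from the parsed message fields. The field names below are our own approximation, not any daemon’s fixed schema:

```python
import json

# One JSON document per log line, assembled from the syslog fields we
# parsed earlier. A daemon's Elasticsearch output would POST this over HTTP.
doc = {
    "timestamp": "2003-10-11T22:14:15.003Z",
    "host": "mymachine.example.com",
    "facility": "auth",
    "severity": "crit",
    "syslog-tag": "su:",
    "message": "'su root' failed for lonvick on /dev/pts/8",
}
body = json.dumps(doc)
print(body)
```

The point is that once the daemon has parsed the syslog message into fields, re-serializing it for any structured destination is trivial.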

Myths around syslog protocols

The ones we hear most come from the assumption that UDP is the only option, implying there’s no reliability, authorization or encryption.

The other frequent one is that you can’t send multiline messages, like stack traces. This is only true for TCP syslog when newlines are used as delimiters: a stack trace will then end up as multiple messages at the destination – unless its newlines are escaped at the source and restored at the destination. With UDP, multiline logs work out of the box, because there’s one message per datagram. The other protocols (TLS, RELP and octet-counted TCP) also handle multiline logs well, because they frame messages.
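The escape-at-source/restore-at-destination idea can be sketched in a few lines. The escaping scheme below is our own choice for illustration (rsyslog, for instance, has its own control-character escaping options):

```python
def escape_newlines(s: str) -> str:
    # Escape backslashes first, then newlines, so restoring is unambiguous.
    return s.replace("\\", "\\\\").replace("\n", "\\n")

def restore_newlines(s: str) -> str:
    # Walk the string and undo the two escape sequences above.
    out, i = [], 0
    while i < len(s):
        if s[i] == "\\" and i + 1 < len(s):
            out.append("\n" if s[i + 1] == "n" else s[i + 1])
            i += 2
        else:
            out.append(s[i])
            i += 1
    return "".join(out)

trace = "Exception in thread main\n  at com.example.Foo.bar(Foo.java:42)"
escaped = escape_newlines(trace)
print(escaped)  # a single line, safe to send over newline-delimited TCP
assert restore_newlines(escaped) == trace
```

Escaping the backslash itself (the first `replace`) is what keeps the round trip lossless; skipping it would corrupt messages that already contain the literal characters `\n`.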

What’s next?

Hopefully this post helped clear the fog around syslog. If you’re looking for tips on how to configure your syslog daemon, you can find a lot of them on this blog. We especially love the topic of centralizing logs with Elasticsearch. That’s because we run Logsene, our logging SaaS that exposes the Elasticsearch API and supports everything we discussed here in terms of message formats (including JSON over syslog) and protocols (UDP, TCP, TLS, RELP).

You can sign up and get a free Logsene trial here. You’ll find configuration samples for all major log shippers (including the syslog daemons discussed in this post).

If you’re looking to build a log-centralization solution for yourself, we can help: either through logging consulting or trainings on Elasticsearch and its logging ecosystem. If you’re into logging in general and want to build such solutions for others, we’re hiring worldwide.



More Stories By Sematext Blog

Sematext is a globally distributed organization that builds innovative Cloud and On Premises solutions for performance monitoring, alerting and anomaly detection (SPM), log management and analytics (Logsene), and search analytics (SSA). We also provide Search and Big Data consulting services and offer 24/7 production support for Solr and Elasticsearch.
