new rsyslog devel branch: 7.5

After the successful and important release of 7.4 stable, we are working hard on further improving rsyslog. Today, we release rsyslog 7.5.0, which opens up the new development branch.

With that branch, we’ll further focus on security. Right with the first release, we provide a much-often demanded feature: TLS support for the Reliable Event Logging Protocol (RELP). Even better, we also support the compression feature that is included in GnuTLS, so you can use im/omrelp not only with TLS, but also turn on compression if you like so.

There is already a guide for TLS-secured RELP with rsyslog available. It was written for the experimental 7.3.16 release, which never was officially announced. So the guide contains some (now-unnecessary) build steps, but also a full example of client and server configurations.

Note that the current TLS support supports anonymous authentication, only (via Diffie-Hellman key exchange). This is due to the current librelp implementation. However, librelp is scheduled to become more feature-rich and rsyslog will support this, once it is available. In general, you can expect some more enhancements (and more fine-grained config control for those in the need) for rsyslog’s RELP subsystem.

There are more things to come, especially in the security context. So stay tuned to what the next version of rsyslog will provide!

rsyslog: How can I see which field contains which data item?

A topic that comes up on the rsyslog mailing list or support forum very often is that folks do not know exactly which values are contained on which fields (or properties, like they are called in rsyslog, e.g. TAG, MSG and so on).

So I thought I write a quick block post on how to do that. I admit, I do this mostly to save me some time typing and having it at hand for reference in the future.

This is such a common case, that rsyslog contains a template which will output all fields. The basic idea is to use that template and write all message into a special file. A user can than look up that file, find the message in question, and see exactly which field/property is populated with which data.

It’s very easy to do. All you need to place this at the top of your rsyslog.conf:


*.* /var/log/debugfmt;RSYSLOG_DebugFormat

and restart rsyslog. After that, all messages will be written to /var/log/debugfmt. Note that each message is written to multiple lines, with each of its properties. This is a very rough example of how this may look:


Debug line with all properties:
FROMHOST: 'linux', fromhost-ip: '127.0.0.1', HOSTNAME: 'linux', PRI: 46,
syslogtag 'rsyslogd:', programname: 'rsyslogd', APP-NAME: 'rsyslogd', PROCID: '-', MSGID: '-',
TIMESTAMP: 'Jun 10 18:56:18', STRUCTURED-DATA: '-',
msg: ' [origin software="rsyslogd" swVersion="7.4.0" x-pid="4800" x-info="http://www.rsyslog.com"] start'
escaped msg: ' [origin software="rsyslogd" swVersion="7.4.0" x-pid="4800" x-info="http://www.rsyslog.com"] start'
inputname: rsyslogd rawmsg: ' [origin software="rsyslogd" swVersion="7.4.0" x-pid="4800" x-info="http://www.rsyslog.com"] start'

Note that I lied a bit: it’s not actually all properties, but the most important ones (among others, the JSON properties are missing, as this is a large amoutn of data – but I may add it in later versions).

If you know what to know what, for example, APP-NAME is, you can simply look it up (here it is “rsyslogd”).

Note that I have deliberately given this example in legacy configuration language, so that it can be used by all users of rsyslog. You can of course also make use of the more advanced versions.

PRISM – and Google and Facebook Intelligence

While almost everyone (including me) is talking about the PRISM system, it may be worth stepping a little back and looking at one of the underlying problems.

Looking closely, PRISM drew it’s data from just a couple of large sources, including Google and Facebook. But, again, it’s very few sources. While I guess they combine this data with intelligence they have gathered elsewhere, the fact remains that only two handful of sites seem to store quite a bit of information about everyone one the Internet.

So the question probably is what do privately owned corporations like Google and Facebook do with our data? Privacy advocates are never tired to warn about privacy concerns in regard to the Internet giant’s services. In fact, the big guys probably have much more insight into their user’s habits than intelligence agencies knew in the past about high-profile people.

And it’s not only their users: think about all those Facebook “Like” buttons, Google “+1” buttons or even Google Analytics, which is used on an extremely large number of sites. Not to mention blogger (that I am using right now ;)) or youtube, or… If if you are not logged in to your G+ or FB account – or even don’t have one – they gather information about you based on your IP address or geographical location. With IPv6, this problem will become even worse, as it is much easier to continuously identify someone surfing the web, especially if the ISP isn’t looking too much into privacy-enhancements.

And it’s undeniable those giants not only have but make use of the wealth of personal profile information: just look at how targeted ads are nowadays and how customized search results are, based on what the companies know about each and every one of us. Actually, this is what is the actual (astronomical) market value of these companies: not their service per sé but rather their intelligence capabilities over the users base – and how they make use of it.

The best example in my opinion is probably Google Now: it is absolutely astonishing and fascinating how well Google predicts what is of interest for me. Of course it is also pretty frightening, if you just think a couple of seconds about it. Plus, users even give away their roaming profile to use this service which “simple is to good to turn off for privacy concerns”. I have to admit I use it, too, at last every now and then – knowing that only continuous use makes it really valuable to the user – and Google.

Of course, there are good reasons to use services like Facebook and Google – we all do it routinely. In some situations it’s even unavoidable to use them. But all of that is probably a story for a different blog posting.

Let me conclude that PRISM should also be a wakeup call about how much power about our personal lifes we put into the hands of a very small set of large commercial companies. After all, this is what the government is after, trying to even better understand out actions by combining various data sources. In that light, does one really wonder why governments are not so eager about new and better privacy laws and enforcing them?

Bottom line? There is not only Web 2.0, there is also Privacy 2.0. While the first got great features, the latter got pretty feature-less, at least from the personal privacy point of view.

new rsyslog 7.4 stable branch released

We just released rsyslog 7.4.0, a new stable release which replaces the 7.2 branch. After nine month of hard work, there are many exciting enhancements compared to 7.2, and I thought I give you a quick rundown of the more important new features. Note that while I list some “minor features” at the end of this posting, the list is not complete. I left out those things that are primarily of interest for smaller groups of users. So if you look for a specific feature not mentioned here, it may pay to look at the ChangeLog or post a question to the rsyslog mailing list.

With this release, the rsyslog project officially supports version 7.4. Support for 7.2 will gradually be phased out. If support for older versions is required, utilizing Adiscon’s professional services for rsyslog is recommended.

Note that I only list main headlines for each of the features. Follow links (where provided) to gain more in-depth information.

Security Package

 

Improved Rate-Limiters

  • introduction of Linux-like rate-limiting for all inputs
  • “Last message repeated n times” now done on a per-input basis; makes it much more useful AND increases processing speed.

Systemd Journal support

  • omjournal writes messages to the journal
  • imjournal obtains messages including journal-specific meta data items from the journal

Performance Improvements

  • Disk Queue Performance greatly improved
  • DNS cache greatly improved and optimized
  • output file writer (omfile) now fully supports background writing
  • script optimizer now changes frequently-used long-running filters into equivalent quick ones when possible (this even affects some distros default configs and is a great performance saver)

Minor Features

  • various plugins now support the new RainerScript based config language
  • omlibdbi improvements, among them support for transactions
  • ommysql now supports transactions
  • improved omfile zip writing speed
  • performance optimizations
  • omelasticsearch improvements (bulk processing mode, local error file)
  • omudpspoof now supports packets larger than 1472 by using fragmentation
  • omrelp now supports session timeout
  • contributed module (not project-supported) omrabbitmq was added

rsyslog journal support

We expect that rsyslog and the systemd journal will be found together in quite some szenarios (if you are curios on what exactly we mean, check the “rsyslog vs. journal?” posting).

As such, it makes a lot of sense to think about providing integration facilities. Thanks to rsyslog’s modular architecture, it wasn’t very hard to provide the necessary building blocks. In the 7.3 experimental branch, two new modules (omjournal and imjournal) have been developed. They provide the capability to write to the journal as well as pull data out of it. Usually, the latter is not really necessary, as journald still provides log messages to the system log socket. But unfortunately, journal developers have decided only to pass on a subset of the logging information. They exclude the structured data content. However, such data is only available if their own logging API is being used by applications, and this is currently not really the case. So right now using just the regular system log socket input should be sufficient in almost all cases. Howerver, should structured data become more prominent in the journal, using imjournal gains rsyslog access to it.

For some more background information on the integration, you can also watch a quick presentation that I recorded:

rsyslog logfile encryption

Starting with version 7.3.11, rsyslog provides native support for log file encryption. In order to keep things highly flexible, both an encryption provider interface as well as an provider have been designed and implemented.

The provider interface enables to use different encryption method and tools. All that is need to be done is to write the actual provider to that interface. The encryption provider (called a “crypto provider”) then implements everything necessary for the actual encryption. Note that decryption is not part of the rsyslog core. However, a new user tool – rscryutil – is provided by the project to handle this task. This tool is currently being considered to be part of the crypto provider. Consequently, there is no specific interface for it. The reasoning behind that decision is that there is very little provider-independet plumbing in this tool, so abstracting things looks a bit like over-engineering (we may change that view in the future based on how things evolve).

The initial crypto provider is based on libgcrypt, which looks like an excellent choice for (almost?) all use cases. Note that we support symmetric cryptography, only, inside the crypto provider. This is simply due to the fact that public/private key cryptography is too slow to be used for mass encryption (and this is what we do with log files!). Keep in your mind that even TLS uses symmetric cryptography for the actual session data (for the same reason), and uses public/private key cryptography only for the exchange of the symmetric key. In any case, folks using that functionality in high-secure environments are advised to check the exact security requirements. Periodic re-keying of the encrypted log files may be necessary.

No on to “how does it work?“: The encryption functionality, at the action level, is enabled by specifying a crypto provider, for example as follows:

action(type=”omfile” file=”/var/log/logfile”
       cry.provider=”gcry” 
       )

This tells the action to load the crypto provider. What then happens, is up to the crypto provider. For obvious reasons, just loading it is not sufficient, you need at least to provide a crypto key. How this is done depends on the crypto provider. It is assumed that all crypto providers user the “cry.” config parameter name space. With the gcry provider, a full action may look like this:

action(type=”omfile” file=”/var/log/logfile”
       cry.provider=”gcry” cry.key=”testtesttesttest”
       )

This is the bare minimal set of parameters for gcry – it’ll use defaults for the algorithm and use the key directly as specified from the configuration file.

Specifying the key directly in the configuration is usually a bad idea: in most setups, many or even all users can read the config file, so the key is far from being secret. We even thought of not permitting this option, as it is so insecure. We left it in, however, as it is a great help in initial testing.

For production use, there are two other modes: one is to read the key from a side-file (which needs to be sufficiently secured!) or obtain it via a special program that can obtain the key via any method it likes. The latter is meant to be used for higher security use cases. We assume that the (user-written) program can do all those “interesting things” that you can do to manage crypto keys in a secure manner. Access to the keyprogram, of course, needs to be properly secured in that case.

I hope this gives you a glimpse at how rsyslog log file encryption works. For details on the crypto provider parameters, see the official rsyslog grcy crypto provider documentation.

TLS for librelp

If you followed librelp’s git, you have probably already noticed that there is increased activity. This is due to the fact that TLS support is finally being added! Thanks to some unnamed sponsor, we could invest “a bit” of time to make this happen.

We have decided to base TLS support on GnuTLS, which has matured very much, is preferred by Debian and fully supported by Red Hat and has no licensing issues with GPL like openssl has (plus the sponsor also preferred it). We build TLS support directly into librelp, as we assume it will get very popular, so an abstraction layer would not make that much sense, especially given the fact the GnuTLS nowadays is almost already installed by default. And remember that an abstraction layer always adds code complexity and an (albeit limited) runtime overhead.

Librelp 1.1.0 will be the first version with basic TLS support. With “basic”, we mean that this is a full TLS implementation, but there are some useful additional features not yet present. Most importantly, this version will not support certifiates but rather work with anonymous Diffie-Hellmann key exchange. This means that while the integrity and privacy of the session can be guaranteed as far as the network is concerned, this version does not guard against man-in-the-middle attacks. The reason simply is that there is no way to mutually authenticate peers without certificates. We still think it makes a lot of sense to release that version, as it greatly improves the situation.

Obviously, we have plans to add certificate support in the very near future. And this also means we will add ways for mutual authentication, much like in rsyslog’s RFC 5425 implementation. It’s not finally decided if we will support all authentication options RFC 5425 offers (some may not be very relevant in practice), but that’s so far undecided. We currently strongly consider to start with fingerprint-based authentication, as this permits the ability to do mutual authentication without the need to setup a full-blown PKI. Also, most folks know fingerprint authentication: this is what ssh does when it connects to a remote machine.

So stay tuned to librelp development, many more exciting things are coming up. Please note that rsyslog 7.5.0 will be the first version to utilize the new librelp features – but that’s something for a different blog posting.

[This is also cross-posted to the librelp site]

rsyslog vs. systemd journal?

I gave an invited talk on this topic at LinuxTag 2013 in Berlin. I was originally asked to talk about “rsyslog vs. journal”, but requested that a question mark is added: “rsyslog vs. journal?”. This title much better reflects our current thinking in regard to the journal project.

Rather than eloborating on what’s our position, I thought it is easier if I just share the slide deck – and the full paper I have written on it. In a nutshell, both answer the question what we currently think of the journal, where we see which technology deployed and which cool things rsyslog can do to enhance enterprise logging. There is also a very intereting history lesson included. But enough of that, on to the real things:

The paper should definitely have all the details you ever want to know (well… ;)) and is a good read if you want to dig deeper:

Rsyslog vs Systemd Journal (Paper) from Rainer Gerhards

Note: the PDF can be downloaded directly from slideshare (use the “Save” button right on top of the paper).

LinuxTag 2013

I gave a talk on “rsyslog vs journal?” at LinuxTag 2013 in Berlin (slides an paper now available at “rsyslog vs. journal?” blog post). It was a great event, and I had quite some good discussions with rsyslog users. As it looks, the v7 config is very well received and many folks are moving toward that version.

Of course, I also learned (not surprisingly) that there is desire for better doc. In some discussions, the idea of small video tuturials came up, and I have to admit that I like this idea. It looks like it is quicker to do for me than writing full-blown tutorials and yet is probably very useful especially for folks who look for a very specific target. So I hope to find time to do some experimenting. I’ll probably start with some extracts from my talk, first doing the theoretic thing and then showing how things actually work – in 5 minute shots. So stay tuned.

In the mean time, here is a quick glimpse at the LinuxTag social event, which I also enjoyed very much (it’s actually rather short, because I wasn’t so much into just filming ;)).

Moving to github?

I am re-evaluating my development environment. One idea that pops up is if I should move the rsyslog project over to github. Initially, I was rather sceptic about using a third-party for the git repository (after all, a git server is not rocket science…), but github seems to have gotten momentum in the past years. But so far it is more or less my gut feeling that migrating over to it may make sense.

So I am looking for feedback from my users and fellow developers: what are the pros and cons on moving to github in your opinion? Please be subjective, that’s what I am looking for. So there is no need to be shy.

Please comment and let me know your thoughts!
Rainer