systemd journal endlessly sends data to imjournal

When Fedora updated to rsyslog 7.4.0 in Fedora 19, they changed the default way how they obtain local log messages. In all previous releases, imuxsock, the traditional Unix/Linux input was used. That also works in conjunction with journal, at least as far as traditional syslog is involved. This is because journal provides these messages via the log socket it passes to rsyslog. Unfortunately, journal hides information if structured content is available. In order to obtain that, Red Hat has contributed a new module, imjournal, which reads the journal database via native journal APIs and as such gets hold of all data. With 7.4.0, the Fedora maintainers switched the default input in the desire to get full logging information into the system.

Unfortunately, rapidly many users begun to report serious problems with duplicate messages exhausting their system resources. As can be seen in this systemd journal database corruption Red Hat bug tracker, the root cause of this problem is that the journal database is corrupted and the native journal API thus endlessly returns the same messages ever and ever again.

The authors of imjournal did not use rsyslog’s ratelimiting capability in this module. Actually, there was good reason not to do it, as journal itself provides some rate-limiting, so doing it twice doesn’t sound like a good idea.

If you dig deeper into the journal bug, you’ll notice that this is not a new problem. It occurs with journal tools as well, and does so obviously for quite a while. In my opinion, it now surfaced as a big issue simply because imjournal was the first application ever to make serious use of the journal database.

After I received reports on this problem, I acted immediately. First and foremost, I put a warning message into imjournal telling about the risk on relying on the journal database. I can obviously not solve that problem, but then I mitigated it by adding rate-limiting to the module. The current git version by default does not permit more than 20,000 messages within a 10 minute interval, so even with a broken journal, no more than 120,000 messages per hour should be emitted. That’s as far as I can go handling this situation. Note that there is already a fix for the current problem in journal, it is also included in the bug tracker I mentioned above. However, in the past couple of weeks I talked to quite some folks on rsyslog/journal integration, and journal database corruption was frequently mentioned. Nobody told me about loops, but as I guard against future problems, I will keep the ratelimiting default active in imjournal.

Note that the rsyslog project “officially” always suggested to use imjournal only in cases where there is hard need for it (this was a result of talking to the end users). Thankfully I published a presentation on rsyslog and systemd journal integration including this fact just last week, so I can even proof that recommendation ;)

How will we go forward? If you need a quick patch, you should rebuild rsyslog from the git v7-stable branch. For a single patch that addresses the ratelimiting, you can use this one. The full patch set will be included in 7.4.1. We are working with Red Hat to release it ASAP, hopefully tomorrow.

Another viable solution is to simply disable imjournal in rsyslog.conf and replace it with the former imuxsock configuration. Please note that this problem does not affect any version of rsyslog installed from Adiscon’s repositories. This is because we never use imjournal by default.

I hope this clarifies the root cause of this problem, how to mitigate it and how the rsyslog team intends to prevent further harm from potential systemd journal database corruption.

rsyslog journal support

We expect that rsyslog and the systemd journal will be found together in quite some szenarios (if you are curios on what exactly we mean, check the “rsyslog vs. journal?” posting).

As such, it makes a lot of sense to think about providing integration facilities. Thanks to rsyslog’s modular architecture, it wasn’t very hard to provide the necessary building blocks. In the 7.3 experimental branch, two new modules (omjournal and imjournal) have been developed. They provide the capability to write to the journal as well as pull data out of it. Usually, the latter is not really necessary, as journald still provides log messages to the system log socket. But unfortunately, journal developers have decided only to pass on a subset of the logging information. They exclude the structured data content. However, such data is only available if their own logging API is being used by applications, and this is currently not really the case. So right now using just the regular system log socket input should be sufficient in almost all cases. Howerver, should structured data become more prominent in the journal, using imjournal gains rsyslog access to it.

For some more background information on the integration, you can also watch a quick presentation that I recorded:

rsyslog vs. systemd journal?

I gave an invited talk on this topic at LinuxTag 2013 in Berlin. I was originally asked to talk about “rsyslog vs. journal”, but requested that a question mark is added: “rsyslog vs. journal?”. This title much better reflects our current thinking in regard to the journal project.

Rather than eloborating on what’s our position, I thought it is easier if I just share the slide deck – and the full paper I have written on it. In a nutshell, both answer the question what we currently think of the journal, where we see which technology deployed and which cool things rsyslog can do to enhance enterprise logging. There is also a very intereting history lesson included. But enough of that, on to the real things:

The paper should definitely have all the details you ever want to know (well… ;)) and is a good read if you want to dig deeper:

Rsyslog vs Systemd Journal (Paper) from Rainer Gerhards

Note: the PDF can be downloaded directly from slideshare (use the “Save” button right on top of the paper).

Disappointing press reactions…

Friday, 13th… Systemd v38 went out recently and it includes a journald test release. That’s great and congrats to the systemd team to that release. What me saddens is the that the press still conveys only Lennart’s position on how bad syslog is. The counter arguments (http://blog.gerhards.net/2011/11/serious-syslog-problems.html) are mentioned nowhere. Lesson learned? All it takes is some interesting-sounding claims and everyone believes them. Does anybody still wonder why I think journald will become standard [and would do so even if it were totally evil]? ;-)

But don’t get me wrong: while I think the journald proposal and the way it is getting pushed has some flaws, I also see good in it. My initial reaction still stands: http://blog.gerhards.net/2011/11/journald-and-rsyslog.html Unfortunately, the world, and us, the masses, seems not to be prepared for weighted arguments… On to the weekend, have a great one :-)

What I don’t like about journald / Linux Journal

I heard of journald only a couple of hours ago (Tuesday?) and since then some intense discussion is going on. I had a chance to look at the journald material in more depth. I also had a quick look at journald’s source, but (as I know) Lennart and I are on the strictly opposite sides in regard to the amount of comment lines in source files (I put half the spec in, Lennart nothing at all ;-)). So I did not try too hard to make sense of the code and my impression is still primarily based on the initial paper (though I have to admit his code is probably simpler as he does not need to carry any legacy nor consider platforms other than recent Linux). 

The contra-syslog arguments can be classified in two classes: vaporware and correct fact. In the vaporware camp are things like the hash chaining “security urban legend”, the timezone argument (though he is right in regard to current practice inside distros), syslog network transport and compression (this list is not conclusive). Technically correct is the current store format, different log sources, and free-formedness of messages (I prefer the term “semi-structuredness”). This list is also not conclusive.

I think Lennart makes some good points, but discredits the paper somewhat by going overboard at times. It looks like he really needed some hard selling points (I also got this impression by his usage of the kernel.org breakage to promote this effort…). I think his paper would have been more useful if he had argued only those problems that actually exist. I am in full agreement that there are some spots that really deserve to be changed and addressed. Unfortunately, the paper is phrased in such terms that people not at least at the medium expert level on logging tend to believe everything that is stated.

The question is how the actual problems can best be fixed. Is it necessary to create a totally new infrastructure and throw away everything that exists? Maybe. I still prefer the alternative approach: why not extend existing technology? I modeled rsyslog specifically for this reason to be a highly modular system where extensions can easily be added. As far as I understand, syslog-ng has also moved to such a design in the recent v3 version. In rsyslog, I have accepted even experimental technology inside the source tree quickly. Getting a new log store was on my agenda for quite some month (the syslog-ng commercial fork already has it). I unfortunately had not enough time to implement it – and nobody else helped out with it. Wouldn’t it have been a good idea to contribute something to rsyslog instead of crafting something totally new?

Another thing that I strongly doubt is if the Linux journal idea will actually manage to solve the logging format dilemma. Microsoft’s event log is in place for 15+ years, and app developer still don’t use it correctly (as I initially wrote, the Linux Journal looks quite similar to the Windows Event Log). While I think the UUID idea is actually not a bad one, I seriously doubt all developers will understand and use it (correctly). This is a problem with the Windows Event log as well. One needs to know that a lot of high-profile folks are working for several years (10+) on solving this dilemma. Lennart may be a genius, but I have concerns that he over-promises (but I really wish he has success, that would be a very, very big advantage for the community).

One thing, I have to admit, that disappoints me is that Lennart never approached me with his proposal. He knows me (even personally) and we have worked together on systemd/rsyslog integration. I heard about journald first on a Google Alert and quickly after from some folks who asked me what went on. Then  I found out that the systemd development mailing list also never mentioned any work on journald. So, to me, it looks the idea was well hidden for a surprise at Kernel Summit. Well done, but not my style ;-) This missing openness concerns me. My decisions in regard to rsyslog were controversial at times and dictatorial at others (and for sure sometimes wrong). And we currently have some big and controversial discussion on rsyslog going on (partly fueled by the arrival of journald). But I have always played very open, communicated what I had on my mind (in advance), discussed and did never try to hide something. This, to be honest, is how I expect work to be carried out on an important system component. I also never met Lennart at any of the standard bodies work on logging. Not everyone runs Linux and probably not even everyone on Linux will run journald. So standards matter.