How often is rsyslog installed?

Of course, how often a piece of software is actually installed is an interesting number for any project. So it is for rsyslog. And, of course, we do not have any data. While you meet some users on the forum and mailing list, they are only a very tiny subset of the user base. With infrastructure projects like rsyslog, people often do not even know that they run it (what a shame…). Anyhow, it is motivational (and useful for promotions) to know how often it is installed.

So I started to search for some metrics. A good starting point is the observation that, beginning with Fedora 8, rsyslog is the default syslogd for Fedora. So, basically, each instance of Fedora 8+ means an instance of rsyslog. Thankfully, I was able to find some metrics directly from the Fedora project. If I sum up the metrics for F8 to F10, I have around 8 million systems. I guess this includes upgrades and now-dead systems. So we are probably down to 5 million (or is this too optimistic?). As a side-note, I agree that some folks may remove rsyslog in favor of some other logging system. But those will probably be special cases, so I don’t think it is useful to try too hard to find a decent number for them (aka “I ignore that” ;)).

The next major source of installations is probably Debian Lenny. For a few months now, rsyslog has been the default syslogd there, too. I have not yet found any metrics for Lenny (do you know of any? – if so, please mail me). I think the number will be way lower than for current Fedora (given that Lenny is not yet flagged as stable). So it will probably not add a big number of systems, maybe half a million?

Another source may be several smaller distributions (like CentOS) where rsyslog is the default. They add some more installations.

Finally, we have the cases where folks intentionally install rsyslog. Sadly, these are the fewest cases, but as I said: this is what you expect from an infrastructure project. And logging is definitely a niche. Few folks have big interest in it. So, if looking just for numbers, these cases are almost irrelevant (of course, from any other aspect these are the most important ones for the project, they really drive it!).

Not having any real Debian metrics, I think a reasonable conclusion is that we have around 5 million systems running rsyslog as of today (January 2009). I’ll probably refer to that number if someone asks (and some folks are beginning to ask). If you have a different opinion, metrics, ideas – please comment on this post or email me.

Carnival of Logging!

This is a proposal to start a “Carnival of Logging”. Huh… what? I guess by now everybody is participating in (or at least has written for) one or more blog carnivals. But a Carnival of Logging – am I mad? Who the heck is interested in such a thing?

Yes, I know… Logging is not sexy. Indeed, most folks don’t even know there is something like logging (except, of course, if they burn wood ;)). However, some folks blog about (computer) logging. Really. Every now and then I find a few interesting posts about logging and things related to it – like what role does logging play in compliance? What about forensics? What about security, … So there is some potential interest.

I think there is a place for a “Carnival of Logging” and I would like to organize it :) I guess we could argue for weeks about whether it makes sense to actually do it – or we can simply try. I prefer the latter (… and may be proven wrong “the hard way”).

This is what I propose:

  • If you are blogging about logging or areas that are closely related to logging (so that you have something log-related at least every now and then), please email me. Let me know which blog post you would like to see highlighted in the first carnival. Please also let me know if you would be interested in hosting future carnivals.
  • I will compile the first carnival of logging out of the messages I receive. It will be hosted on this site. I have no definite schedule for it yet, because I do not know the volume of posts coming in. My goal, however, is to have this done by February 11th.
  • After the initial post has been done, I’ll email all those that submitted carnival entries and ask for new posts ;) I’ll pick a host from the list of those that have opted to host and forward entries I have received to that host. The host will then compile a new carnival and post it on his blog.
  • Once a carnival is out, all participants should link to it from their blog.
  • I’ll keep an archive of all carnival posts here on this site.
  • The carnival of logging should be on a fixed schedule. That probably depends a bit on the volume, but I suggest that the carnival is written at least bi-weekly.

Well – and now it is up to you! Please send in your carnival entries! I think a Carnival of Logging would be a useful addition to the logging world and hope that others agree. Also, please spread the word, so that we get more exposure and more participants! Also, feel free to post any questions and comments you have.

SyslogAppliance 0.0.6 out

Finally, I made a new version of the syslog appliance. It is not a really big release. On Friday, I intended to do just a refresh, but then I ended up integrating a capability to discard messages older than 60 days (obviously an optional feature). It still looked like a quick job, but phpLogCon gave me a somewhat hard time. In the end, I even discovered a bug and was able to fix it.

Probably the next milestone for the appliance is SMP support. I’ve done some preliminary work, but on the other hand there is so much more to do. Let’s see…

phpLogCon now in FreeBSD ports tree

Good news, folks: phpLogCon is now available in the FreeBSD ports tree. I’ve just read the confirmation in a forum thread. This is obviously very good news, and thanks to everyone for making it happen. Given that there is also a port for rsyslog, both components can now work together and benefit from each other.

Of course, phpLogCon does not require rsyslog. It can work perfectly with any other syslogd, as long as it is pointed to the right files or databases. But having rsyslog’s ability at hand in addition to phpLogCon is quite handy – and vice versa.

Good news to start a day ;)

NASA list server compromised?

As a space geek, I am subscribed to NASA’s HSFNEWS mailing list. When I looked at my mailbox this morning, a spam message that claimed to have been posted via the NASA list server caught my attention. Obviously, it is quite easy to forge email and so I thought that this might be a fake, too. However, closer examination reveals headers that make me think this could be the real thing.

Of course, HSFNEWS is just one of the many mailing lists NASA offers, and of course it runs on an auxiliary system. Still, invalid messages slipping through can have quite bad effects. Of course, a message with subject

“[HSFNEWS] She’ll always want to give head now”

will hopefully be immediately classified as spam by anyone (or do you think the message is about alien encounters? ;)). But what if the message were much more carefully crafted to carry out something evil? After all, the message could look much like it comes from an official NASA source. Just think about the various Obama hoaxes and scams that we have seen lately.

I am still not 100% convinced that the mail actually originated from the NASA list server (I have tried to contact someone in charge over there and hope to get some results). To help you get an idea yourself, here is the complete message source, except a few things on my local delivery record as well as valid mail addresses that do not need to be posted here.

If someone has an opinion on whether the mail actually went through NASA’s server, please post a comment or drop me a mail.


MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="----_=_NextPart_001_01C97B42.7A7F7080"
Received: from jsc-listserv-01.jsc.nasa.gov (jsc-listserv-01.jsc.nasa.gov
[128.157.5.25]) by mailin.adiscon.com (Postfix) with ESMTP id 06205241C002
for ; Tue, 20 Jan 2009 21:52:51 +0100 (CET)
X-MimeOLE: Produced By Microsoft Exchange V6.5
Received: from jsc-listserv-01.jsc.nasa.gov (jsc-listserv-01
[128.157.5.25]) by jsc-listserv-01.jsc.nasa.gov (8.13.1/8.13.1) with ESMTP
id n0K7cgeV024815; Tue, 20 Jan 2009 15:01:22 -0600
Received: by JSC-LISTSERV-01.JSC.NASA.GOV (LISTSERV-TCP/IP release 15.0)
with spool id 553828 for HSFNEWS@JSC-LISTSERV-01.JSC.NASA.GOV;
Tue, 20 Jan 2009 15:01:20 -0600
Received: from 200-127-202-12.cab.prima.net.ar
(200-127-202-12.cab.prima.net.ar [200.127.202.12]) by
jsc-listserv-01.jsc.nasa.gov (8.13.1/8.13.1) with ESMTP id
n0KKPY2D029413 for ; Tue, 20 Jan
2009 14:25:35 -0600
Return-Path:
X-OriginalArrivalTime: 20 Jan 2009 21:03:01.0983 (UTC)
FILETIME=[7B156EF0:01C97B42]
List-Owner:
Approved-By: {removed}@NASA.GOV
Content-class: urn:content-classes:message
Subject: [HSFNEWS] She’ll always want to give head now
Date: Tue, 20 Jan 2009 21:25:34 +0100
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
Thread-Topic: [HSFNEWS] She’ll always want to give head now
Thread-Index: Acl7Qns+aZRN9mnKS56dl4osL2myOw==
List-Help: ,

List-Subscribe:

List-Unsubscribe:

From: "joynt"
To:
Reply-To: "hsfnews"

------_=_NextPart_001_01C97B42.7A7F7080
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

Can’t see images?
To view this email as a web page, go here =
{actual spam removed}
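
For anyone who wants to check the hop chain themselves, here is a minimal sketch in Python. It assumes the raw message is available as text; the sample below is trimmed from the Received headers quoted above, and received_hops is a hypothetical helper name, not part of any mail tool. Keep in mind that it only reports what each header claims: anything below the first Received line added by a server you trust can be forged.

```python
import re
from email import message_from_string

# Trimmed sample built from two of the Received headers quoted above.
RAW = """\
Received: from jsc-listserv-01.jsc.nasa.gov (jsc-listserv-01.jsc.nasa.gov
 [128.157.5.25]) by mailin.adiscon.com (Postfix) with ESMTP id 06205241C002;
 Tue, 20 Jan 2009 21:52:51 +0100 (CET)
Received: from 200-127-202-12.cab.prima.net.ar
 (200-127-202-12.cab.prima.net.ar [200.127.202.12]) by
 jsc-listserv-01.jsc.nasa.gov (8.13.1/8.13.1) with ESMTP id
 n0KKPY2D029413; Tue, 20 Jan 2009 14:25:35 -0600
Subject: [HSFNEWS] sample

body
"""

# Claimed sender name, sender IP in brackets, and the receiving host.
HOP = re.compile(r"from\s+(\S+).*?\[([0-9.]+)\].*?\bby\s+(\S+)", re.S)

def received_hops(raw):
    """Return (from_host, from_ip, by_host) tuples, oldest hop first."""
    msg = message_from_string(raw)
    hops = []
    for value in msg.get_all("Received", []):
        match = HOP.search(value)
        if match:
            hops.append(match.groups())
    # Each server prepends its header, so the newest hop comes first in the file.
    return list(reversed(hops))

hops = received_hops(RAW)
for from_host, from_ip, by_host in hops:
    print(f"{from_host} [{from_ip}] -> {by_host}")
```

The interesting hop is the oldest one: a host in a consumer address range handing the message directly to the list server.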

RFC 3195 back in the game?

RFC 3195 was thought to be the solution for reliable syslog. It is based on TCP and the BEEP protocol. It was written in November 2001 but never gained much attention. The primary reason everyone gives is the complexity of BEEP (and the lack of toolkits). A few years ago, I wrote my own logging-related RFC 3195 toolkit, liblogging. It, too, did not get much momentum.

Anyhow, I used a modified version of liblogging to offer RFC 3195 support under Windows as part of the MonitorWare product line. Again, we never heard much about this feature. In rsyslog, I created an input plugin for RFC 3195. At that time, however, I already had the feeling 3195 was a failure. So I was hesitant to implement an output plugin, too. And, as expected: nobody ever cared, except for some folks building packages. And these not for practical needs but for the sake of getting packages for everything…

So up until now, I would conclude that 3195 is indeed a failure. However, there seems to be some increasing interest. At least, I got a couple of questions on RFC 3195 over the past weeks, and Adiscon, my company, just got a not-so-small order for its EventReporter product which explicitly lists RFC 3195 in the requirements. Is this a sign of increasing interest? Or is somebody just ticking off checklist items? This remains to be seen.

So, there seems to be a slight chance that RFC 3195 is getting revived. Maybe it just took some years for the idea to ripen. In any case, I am prepared for RFC 3195 demand. Maybe doing all that work finally begins to pay off…

Theoretical vs. Practical Performance

I found an interesting post “Algorithm Performance in Academic Papers and Practice” by fellow security blogger Steve Burnett on the SecurityBloggersNetwork.

Steve questions whether theoretical performance gains, often given based on big O notation, can really be realized in practice. He claims they often cannot.

I tend to agree with his point, at least in broad terms. One must first remind oneself that big O notation is quite coarse: it tells you about the potential extremes. In practice, however, these extreme cases may rarely, if ever, occur. Even further, I think it depends very much on what the algorithm in question is actually doing. If it is “just” doing calculations on its own, theoretical performance benefits can be predicted much better than if there is any external reference.

The obvious case is code that needs to do some library or operating system calls inside the algorithm. A good example is the performance optimization David Lang and I did on rsyslog in fall of 2008. Here, and I have to admit partly to my surprise, it turned out that optimizing rsyslog algorithms actually had almost no effect in boosting the daemon’s performance. Why? Simply because a handful of system calls, many time-related, used up the majority of execution time. So rather than optimizing the algorithms used, we optimized out OS calls and that had a very big effect (and even after that initial optimization, there is still much room for improvement just by looking at the APIs). Of course, this is an extreme sample, because usual syslog server processing is not at all computational and the frequent context switches themselves are performance-intensive. The picture for a graphics application is probably totally different.
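
To make the point concrete, here is a small Python sketch. It is not rsyslog code; CountingClock, stamp_each and stamp_batch are hypothetical names, and time.time() only stands in for a time-related OS call (on some platforms it may not even be a real system call). Both functions are O(n) in the number of messages, so big O notation cannot tell them apart, yet one asks the clock once per message and the other once per batch.

```python
import time

class CountingClock:
    """Wraps time.time() and counts how often it is called."""

    def __init__(self):
        self.calls = 0

    def now(self):
        self.calls += 1
        return time.time()

def stamp_each(messages, clock):
    # One clock call per message.
    return [(clock.now(), m) for m in messages]

def stamp_batch(messages, clock):
    # One clock call per batch; all messages share that timestamp.
    ts = clock.now()
    return [(ts, m) for m in messages]

msgs = [f"msg {i}" for i in range(1000)]
per_msg, per_batch = CountingClock(), CountingClock()
stamp_each(msgs, per_msg)
stamp_batch(msgs, per_batch)
print(per_msg.calls, per_batch.calls)  # 1000 1
```

The batch variant trades timestamp precision for fewer calls, which is exactly the kind of trade-off that only measurement on a real workload can justify.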

However, many less obvious cases exist, and I guess lots of them have to do with the fact that multiple processes and/or threads are executed. So resources are shared. On such a system, for example, theoretical performance gains may be lost due to task switches which purge vital data off the CPU cache. Even worse, a theoretically optimized algorithm may require additional main memory, which may, in practice, force cache purges because the cache size now is insufficient. Funny, eh?

Wrap-up: big O notation is useful, but for practical needs, it needs to be taken with a grain of salt. For real world deployments, actual performance testing is quite important. And as a side-note, test results on a dedicated system may be quite different from practical performance on a system where other things are also executed…

WinSyslog German Site

We have been selling a Windows syslog daemon (WinSyslog) for many, many years now (since 1995 if I remember correctly). What is interesting is the “language issue”. Back in the late 90s, we had English and German pages for that product. Some time later, we dropped the German pages because almost nobody ever accessed them (funny, ain’t it?).

Now we are giving it another shot. While talking with some peers, they claimed there is more demand for German language in IT security today than there was 10 years ago. Really? If so, I have to admit I am surprised. I thought that the IT world speaks English and the IT security/auditing world even more so. Anyhow, I always like to experiment. So we at Adiscon agreed to translate some important content of the WinSyslog pages into German and see what happens.

As a side-note, the discussion with my peers started another experiment which did not require discussions inside the company. Rsyslog got a German language support forum in October 2008. Guess what? There is only a single user post in it, and that post says that the poster thinks it is unnecessary to have a German language forum. So far, it looks like I was right – but let’s see what a product site brings ;) (It sounds somewhat logical that an open source support forum has different metrics than a commercial software product site, so I think there really can be different results).

Use of application-level acks in RELP

I received a very well crafted question about RELP reliability via the rsyslog mailing list this morning. I think it makes perfect sense to highlight this question here in the blog instead of letting it die unread and hard to find in the mailing list archives. Before reading this post, it would be useful to read my rant on “On the unreliability of plain tcp syslog” if you have not already done so. It will greatly help understand the fine details of what the message talks about.

Here we go, original poster’s text in italics, my replies in between:

In my research of rsyslog to determine its suitability for a
particular situation I have some questions left unanswered. I need
relatively-guaranteed delivery. I will continue to review the
available info including source code to see if I can answer the
questions, but I hope it may be productive to ask questions here.

In the documentation, you describe the situation where syslog silently
loses tcp messages, not because the tcp protocol permits it but
because the send function returns after delivering the message to a
local buffer before it is actually delivered.

But there is a more-fundamental reason an application-level ack is
required. An application can fail (someone trips over the power cord)
between when the application receives the data and when it records it.

1. Does rsyslog send the ack in the RELP protocol occur after the
message has been safely recorded in whatever queue has been configured
or forwarded on so its delivery status is as safe as it will get (of
course how safe depends upon options chosen), or was it only intended
to solve the case of TCP buffering-based unreliability?


RELP is designed to provide end-to-end reliability. The TCP buffering issue is just highlighted because it is so subtle that most people tend to overlook it. An application abort seems to be more obvious and RELP handles that.

HOWEVER, that does not mean messages are necessarily recorded when the ACK is sent. It depends on the configuration. In RELP, the acknowledgment is sent after the reception callback has been called. This can be seen in the relevant RELP module. For rsyslog’s imrelp, this means the callback returns after the message has been enqueued in the main message queue.

It now depends on how that queue is configured. By default, messages are buffered in main memory. So when rsyslog aborts for some reason (or is terminated by user request) before this message has been processed, it is lost – while the sender still got a positive ACK. This is how things are done by default, and it is useful for many scenarios. Of course, it does not provide the audit-grade reliability that RELP aims for. But the default config needs to take care of the usual use case and this is not audit-grade reliability (just think of the numerous home systems that run rsyslog and should do so in the least intrusive way).
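
The gap between “acknowledged” and “recorded” can be shown with a toy model in Python. This is only a sketch of the behaviour described above, not rsyslog or librelp code; the Receiver class and its methods are hypothetical, and the in-memory deque stands in for the default main message queue.

```python
from collections import deque

class Receiver:
    """Toy receiver that acks as soon as a message is enqueued in memory."""

    def __init__(self):
        self.queue = deque()  # in-memory main queue (the default case)
        self.stored = []      # stands in for the file or database

    def receive(self, msg):
        self.queue.append(msg)
        return "ACK"          # sent once the message is enqueued

    def process_one(self):
        # Move one message from the queue to its final destination.
        self.stored.append(self.queue.popleft())

    def crash(self):
        # Everything still in memory is gone after an abort.
        lost = list(self.queue)
        self.queue.clear()
        return lost

rx = Receiver()
acks = [rx.receive(m) for m in ("a", "b", "c")]
rx.process_one()
lost = rx.crash()
print(acks, rx.stored, lost)
```

All three messages were acknowledged, but only the first one reached the store. Closing that gap is what a persistent queue configuration is for.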

If you are serious about your logs, you need to configure the engine to be fully reliable. The most important thing is a good understanding of the queue engine. You need to read and understand the rsyslog queue docs, as they form the basis on which reliability can be built.

The other thing you need to know is your exact requirements. Asking for reliability is easy, implementing it is not. The closer you get to 100% reliability (which you will never reach for one reason or the other), the more complex scenarios get. I am sure the original poster knows quite well what he wants, but I am often approached by people who just want to have it “totally reliable” … but don’t want to spend the fortune it requires (really – ever thought about the redundant data centers, power plants, satellite and sea links et al. you need for that?). So it is absolutely vital to have good requirements, which also includes when loss is acceptable, and at what cost this comes.

Once you have these requirements, a rsyslog configuration that matches them can be designed.

At this point, I’d like to note that it may also be useful to consider rsyslog professional services as it provides valuable aid during design and probably deployment of a solution (I can’t go into the full depth of enterprise requirements here).

To go back to the original question: RELP has almost everything that is needed, but configuring the whole system in an audit-grade way requires (ample) work.

2. Presumably there is a client API that speaks RELP. Can it be
configured to return an error to the client if there is no ACK (i.e.
if the log it sent did not make it into the configured safe location
which could be on a disk-based queue), or does it only retry? Where is
this API?


The API is in librelp. But actually this is not what you are looking for. In rsyslog, an output module (here: omrelp) provides the status back to the caller. Then, configuration decides what happens. Messages may be discarded, sent to a different destination or retried.

With omrelp, I think we have some hardcoded ways to preserve the message, but I have not yet had time to look this up in detail. In any case, RELP will not lose messages but may duplicate a few of them (within the current unacked window) if the remote peer simply dies. Again, this requires proper configuration of the rsyslog components.
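
Why a dying peer leads to duplicates rather than loss can be sketched in a few lines of Python. This is a simplified simulation under my own assumptions, not the librelp implementation; send_with_retry and its acks_lost_from parameter are hypothetical.

```python
def send_with_retry(messages, acks_lost_from):
    """Simulate a sender that replays every unacked message after a reconnect.

    acks_lost_from is the index of the first message whose ack never
    reached the sender because the peer died, although the peer had
    already recorded that message.
    """
    received = []  # what the peer recorded, in order
    unacked = []   # sender-side window of messages without an ack
    for i, msg in enumerate(messages):
        received.append(msg)     # the peer records the message
        if i >= acks_lost_from:
            unacked.append(msg)  # ack lost: message stays in the window
    # The peer comes back and the sender replays its unacked window.
    received.extend(unacked)
    return received

messages = ["m1", "m2", "m3", "m4"]
received = send_with_retry(messages, acks_lost_from=2)
print(received)  # ['m1', 'm2', 'm3', 'm4', 'm3', 'm4']
```

Nothing is missing on the receiving side, but everything inside the unacked window shows up twice, so a consumer that cannot tolerate duplicates has to detect them itself.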

Even with that, you may lose messages if the local rsyslogd dies (not terminates, but dies for some unexpected reason, e.g. a segfault, kill -9 or whatever) but still has messages in a queue that is not persisted. Again, this can be mitigated by proper configuration, but that must be designed. Also, it is very costly in terms of performance. A good read on the subtleties can be found in the rsyslog mailing list archive. I suggest having a look at it.

Certainly the TCP caching case you mention in your pages is one a user
is more likely to be able to reproduce, but that is all the more
reason for me to be concerned that the less-reproducible situations
that could cause a message to occasionally become lost are handled
correctly.


I don’t think app-abort is less reproducible – kill -9 `cat /var/run/rsyslog.pid` will do nicely. Actually, from feedback I received, many users seem to understand the implications of a program/system abort. But far fewer understand the issues inherent in TCP. That is why I am focusing so much on the latter. But of course, everything needs to be considered. Read the thread about the reliable queue (really!). It goes to great lengths, but still does not offer a full solution. Getting things reliable (or secure) is very, very challenging and requires in-depth knowledge.

So I am glad you asked and provided an opportunity for this to be written :)

Rainer

Strong passwords? Forbidden!

American Express, as a bank and card issuer, should be a fairly security-sensitive company. Right? Well, it looks like they have not yet learned their lesson. Occasionally, I log in to my AmEx account to gain access to membership rewards (these nice gimmicks that shall trick you into charging to AmEx as much as possible). I tend to have my credentials not at hand when doing so, but thankfully AmEx has a quite secure system to recover your credentials.

What really bugs me is their password requirement. A password can have a maximum of 8 characters and consist only of letters and numbers! Ouch… what about strong passwords? They are simply forbidden by AmEx. The funny thing is that the web site doesn’t even complain when you enter a too-strong (aka longer or non-alphanumeric) password. It simply ignores the extra characters. Some time last year this drove me crazy as I could not log in after changing my password. Guess what, I used a too-strong one and of course it didn’t match what the system had generated from it. I called customer service and also complained about being forced to use insecure passwords. That was several months ago.
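
My best guess at what happened can be written down as a short Python sketch. Everything in it is an assumption for illustration: store_password and login_ok are hypothetical, and I do not know how the site is actually implemented. The sketch assumes the change-password form silently cuts the input to 8 characters while the login form compares the full string.

```python
import string

MAX_LEN = 8
ALLOWED = set(string.ascii_letters + string.digits)

def store_password(pw):
    # Assumed behaviour: silently keep only the first MAX_LEN characters.
    return pw[:MAX_LEN]

def login_ok(stored, entered):
    # Assumed behaviour: the login form compares the full entered string.
    return stored == entered

chosen = "correct-horse-battery"
stored = store_password(chosen)
print(login_ok(stored, chosen))            # False
print(login_ok(stored, chosen[:MAX_LEN]))  # True

# Number of distinct passwords of exactly MAX_LEN characters, assuming
# letters are case-sensitive (also an assumption).
keyspace = len(ALLOWED) ** MAX_LEN
print(keyspace)
```

Whatever the real mechanism is, silently changing what the user typed is the core problem: the user believes one password is set while the system holds another.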

New year, new try – old problem… Nothing learned, still 8 chars max and only letters and numbers. Frankly, AmEx, who is advising you on security? I really wonder if under US law AmEx is responsible if someone breaks into my account. I think they should be…