The typical logging problem as viewed from syslog

I run into different syslog use cases from time to time. So I thought it a good idea to express what I think the typical logging problem is. As I consider it the typical problem, syslog (and WinSyslog and rsyslog in particular) addresses most needs very well. What it leaves out is the analysis and correlation part, but other members of the family (like our log analyzer) and third parties cover that well.

So the typical logging problem, as seen from the syslog perspective, is:

  1. there exist events that need to be logged
  2. a single “higher-level” event E may consist of a
    number of fine-grained lower level events e_i
  3. each of the e_i’s may be on different
    systems / proxies
  4. each e_i consists of a subset of properties
    p_j from a set of all possible common properties P
  5. in order to gain higher-level knowledge, the
    high-level event E must be reconstructed from
    e_i’s obtained from *various* sources
  6. a transport mechanism must exist to move event
    e_i records from one system to another, e.g., to
    a central correlator
  7. systems from many different suppliers may be involved,
    resulting in different syntax and semantics of
    the higher-level objects
  8. there is potentially a massive volume of events
  9. events potentially need to be stored for
    an extended period of time
  10. quick review of at least the current event data
    (today, past week) is often desired
  11. there is a lot of noise data
  12. the data needs to be fed into backend processes,
    like billing systems

on leap seconds and syslog

I was recently asked how syslog handles leap seconds. I thought it would be useful to reproduce my thoughts, initially expressed via private mail, here in the blog.

RFC5424 specifically forbids leap seconds, because during our discussions we found many cases where leap seconds caused grief. I also think the international timekeeping bodies are considering abandoning leap seconds for this very reason. To the best of my knowledge, GPS time also does not use leap seconds. The ultimate reason to abandon UTC leap seconds in syslog was that we failed to identify an operating system that would expose leap seconds to a user process. So a syslogd, or any other syslog sender, would not even be able to see that one was introduced. From the syslog perspective, a leap second is just like any other second, but time flows “somewhat slower”. I guess we are in the same boat as many operating systems with this perspective.

In RFC5424 we didn’t explicitly state what timestamp should be written during a leap second – because we thought it could actually never happen (why? explained above!). But I would say that “Leap seconds MUST NOT be used” to me means that it should be expressed as the 59th second of said minute. But even if you bump the minute and use second 0, I cannot see how this would be problematic. On a single system, time should still evolve serially. For correlating events from multiple systems, the timestamp alone is insufficient in any case. You cannot synchronize the different real-time clocks closely enough. So you need a different mechanism (like Lamport clocks) for this in any case.
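As a sketch of that reading (my own illustration, not text from the RFC): for the leap second inserted at the end of 2008, a sender would report either the last regular second of the old minute or the first second of the new one:

actual UTC instant:         2008-12-31 23:59:60 (the inserted leap second)
RFC5424 representation:     2008-12-31T23:59:59.999999Z   (stay within second 59)
or, after the rollover:     2009-01-01T00:00:00.000000Z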

Helping to find major issues in rsyslog…

Rsyslog has a very rapid development process and complex capabilities, and it now gradually gets more and more exposure. While we are happy about this, it also has some bad effects: some deployment scenarios have probably never been tested, and it may be impossible for the development team to test them because of the resources required. So while we try to avoid it, one may see a serious problem during deployments in demanding, non-standard environments (hopefully not with a stable version, but chances are good you’ll run into trouble with the development versions).

Active support from the user base is very important to help us track down these things. Most often, serious problems are the result of some memory misaddressing. During development, we routinely use valgrind, a very capable memory debugger. This helps us create pretty clean code. But valgrind cannot detect everything, most importantly not code paths that are never executed. So the most useful information for us is about aborts and abort locations.

Unfortunately, faults rooted in addressing errors typically show up only later, so the actual abort location is in an unrelated spot. To help track down the original spot, libc versions later than 5.4.23 offer support for finding such errors, and possibly for gaining temporary relief from them, by means of the MALLOC_CHECK_ environment variable. Setting it to 2 is a useful troubleshooting aid for us. It makes the program abort as soon as the check routines detect anything suspicious (unfortunately, this may still not be the root cause, but hopefully it is closer to it). Setting it to 0 may even make some problems disappear (but it will NOT fix them!). With functionality comes cost, and so exporting MALLOC_CHECK_ without need comes at a performance penalty. However, we strongly recommend adding this instrumentation to your test environment should you see any serious problems. Chances are good it will help us interpret a dump better, and thus craft a fix more quickly.
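A minimal sketch of such a troubleshooting run (the rsyslogd path and the -n foreground switch are assumptions; adjust them to your installation):

```shell
# Enable glibc's malloc consistency checks:
#   MALLOC_CHECK_=0  silently ignore detected corruption (may mask, never fixes, a bug)
#   MALLOC_CHECK_=2  print a diagnostic and abort as soon as corruption is detected
export MALLOC_CHECK_=2
echo "MALLOC_CHECK_ is set to: $MALLOC_CHECK_"

# Then start the daemon from this shell so it inherits the variable, e.g.:
#   /usr/sbin/rsyslogd -n
```

Remember to remove the setting again once the problem is tracked down, since the checks cost performance.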

In order to get useful information, we need some backtrace of the abort. First, you need to make sure that a core file is created. Under Fedora, for example, that means you need to have an “ulimit -c unlimited” in place.

Now let’s assume you have a core file (e.g. in /core.1234). So what to do next? Sending a core file to us is most often pointless – we would need to have the exact same system configuration in order to interpret it correctly, and chances of that are extremely slim. So we would appreciate it if you could extract the most important information. This is done as follows:

  • $ gdb /path/to/rsyslogd /core.1234
  • $ info threads
  • you’ll see a number of threads (numbered 1 to n, with n being
    the highest number). For each of them, do the following (let’s assume
    that i is the thread number):

    • $ thread i (e.g. thread 1, thread 2, …)
    • $ bt

  • then you can quit gdb with “q”
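If you prefer a non-interactive way to collect the same data, gdb’s batch mode can gather a backtrace of every thread in one command (the binary and core file paths here are placeholders; substitute your own):

gdb --batch -ex "info threads" -ex "thread apply all bt" /path/to/rsyslogd /core.1234 > backtrace.txt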

Then please send all the information gdb emitted to the development team. It is best to first ask on the forum or mailing list about how to do that. The developers will keep in contact with you and, I fear, will probably ask for other things as well ;)

Note that we strive for highest reliability of the engine even in unusual deployment scenarios. Unfortunately, this is hard to achieve, especially with limited resources. So we are depending on cooperation from users. This is your chance to make a big contribution to the project without the need to program or do anything else except get a problem solved ;)

Tools to detect stack addressing problems?

Since I began to use the valgrind memory debugger routinely in rsyslog development (some two years ago), the quality of the source has improved considerably. Unfortunately, however, valgrind is not able to detect problems related to misaddressing variables on the stack. The 5.3.6 bug I hunted for almost a week is a good example of this. Valgrind also provides only limited support for global data, as far as I know (and see from testing results).

This becomes an even more important restriction since I moved a lot of former heap memory use to the stack for performance reasons. I remember at least one other major bug that was hard to find because it affected only stack space.

So I am currently looking for tools that could complement valgrind by providing good stack-checking capabilities. As one option, mudflap was suggested to me. It sounds interesting, but gives me a very hard time: very hard to read debug output (no symbolic names for dlopen’ed modules), (false?) reports for areas where I cannot see anything wrong, as well as frequent (threading-related?) crashes when running under instrumentation. Maybe I am just misinterpreting the output…

In short: I would highly appreciate suggestions for tools that can help with debugging stack memory access (global data would be a plus) – and/or instructions on how to interpret mudflap, if that is considered to be *the* tool for that use case.

Thanks,
Rainer

rsyslog – feature “schedule”

Every now and then somebody asks what “release schedule” I have in mind for future rsyslog releases. Today, it happened again, so I’ll do my usual blog post ;). Long-time readers of this blog will know that this is a snapshot of what I have on my mind – open source development is quite dynamic, and so what I actually implement can be quite different – and has been in the past. It may be a good time to read my blog post describing how I assign priorities to rsyslog work.

For the foreseeable future, I have two primary things on my mind: one is a set of tools to gain insight into rsyslog’s inner workings while it is running. This includes statistics, but goes beyond them (still, a recent forum post on rsyslog statistics may be an interesting read). This feature is of interest to the community at large, but it is also something I need for some in-depth performance analysis, plus it is a really great debugging helper. As such, I intend not only to provide the glue inside rsyslog, but also to create a full-blown GUI so that the power can actually be used. If nothing gets in the way, this is my top priority for new work (I intended to begin with it during the summer, but then more important things got in the way – now it is becoming really pressing…).

The next feature I have on my mind is a change to the configuration language, which may also include some core changes. The community rightfully complains that rsyslog’s configuration is a real pain. It is extremely hard to configure some of the most advanced features – even I need to think hard about how to create some desired results. This is a result of the growth of rsyslog. When the current config system was invented (some three years ago?), we had a handful of low-power commands. This has changed dramatically. For some time, I intended to replace the config language entirely with a scripting language. I no longer believe in that. A full-blown scripting language would be a very desirable enhancement, but the base configuration must be possible without it (this is also a performance issue). Redoing the config language includes untangling some of the inner workings and adding more flexibility. I have been working towards that goal for roughly two and a half months now, and that part went well. Now I need to do the next step. I expect that a new config format requires at least a month, more realistically two, to materialize. But adding more features under the current config system is of limited use, because only “expert experts” could configure them. So while the config is important, it is in second spot on my todo list, right after the GUI and diagnostics tools.

For the GUI and diagnostics, I expect it will take at least another two months to get to something decent. Adding up these numbers, I really cannot tell what the next larger features I intend to implement could be. If all goes well, I can think about this in spring.

Also, I am currently quite busy with some other, paid, projects. So the time I can spend on rsyslog at the moment is limited. I devote much of this time to fixing bugs, with the primary goal of finally getting v5 ready for prime time (it looks good, but we are not yet fully there).

Also, I notice that the adoption rate is increasing. I see that in a large growth in support requests, both on the mailing list and in the forum. This is good news, but the bad news is that there are only a few frequent contributors. So there are a lot of things that I need to take care of myself, and this needs increasingly more time – time that I obviously do not have for bug-fixing or developing new features. To keep things a bit balanced, I have stopped responding to some questions: those where I think a little google search can provide the answer, or those that obviously have a primarily commercial background. I’d like to respond to everything – but unfortunately, I simply do not have the time (if I did, rsyslog development would be totally stalled).

As I said, this is just a snapshot of how things look. Maybe tomorrow a sponsor shows up who changes my todo list considerably (we have had only very few occurrences of that, but thankfully we have had some ;)). Even with a sponsor, I am tied up with work for the rest of this year, then I have a little vacation and some more paid work, so I think I can begin working on larger features mid-January, maybe a bit later. Bottom line: don’t take any “schedule” for granted, but I hope you get an overall idea of how things evolve. And: please continue to send in bug reports and feature requests. Feature requests are very important – I use them (and their occurrence rate!) to judge how much demand there is for a feature in the community.

Happy syslogging!

rsyslog multithreading

From time to time, I receive questions on how many cores rsyslog can run on in a highly parallel system. Rsyslog is massively multi-threaded, but that does not necessarily mean that each configuration, or even each use case, can actually benefit from it.

The most important requirement for gaining a speedup from parallelism is the ability to break down the workload (this is called “partitioning”) and distribute it to a set of threads, which can then work on each part in parallel.

For the partitioning to work well, the workload, and the configuration, must be “partitionable”. Let me give a counter-example. If you have a single sender and a single action (yes, this sometimes is the case!), there cannot be much parallelism. Such a config looks like this (using imtcp as an example here):

$TCPServerRun 10514
*.* /path/to/file

This cannot gain much, because we have one thread for the TCP receiver, one thread for the filtering and one for the output. With the queue engine, we can increase the number of threads that work on the filters in parallel, but these have almost nothing to do in any case. We cannot, however, walk into the output action in parallel, because a) the output plugin interface guarantees that only one thread hits a plugin at any one time and b) it wouldn’t make much sense here in any case: what would it help if we hit the output twice and then needed to synchronize the file access? Not much…

So the bottom line is that a configuration like the one above is highly sequential in nature, and consequently there is almost no gain from running some of the tasks concurrently. So, out of the box, rsyslog gains speedup from parallel processing in more complex cases, with more complex rules, and many of them.

We are working to provide excellent speedup even for sequential configurations. But this is a long and complex road. For example, in v5 we have now de-coupled message parsing from the receiver thread, resulting in somewhat improved speedup for sequential configs like the one above. Also, we have added batching support in v5, which reduces some of the overhead involved with multiple threads (and thus reduces the gain we could potentially get). And in late v4 builds we introduced the ability to do double-buffered block i/o for output files, which can considerably reduce i/o overhead on high-end systems and also runs in pipeline mode, squeezing a bit more parallelism out of the sequential job.

So with the newer engines, we have been able to apply a basic processing pipeline that looks like

input -> parse & filter -> generate file data -> write

which can be done in parallel. Of course, the file write is action-specific, but I guess you get the idea. What you need to do, however, is configure all of that. And even then, you cannot expect a fourfold speedup on a quad-core system. I’d say you can be happy if the speedup is around 2, depending on a lot of factors.

To get to higher speedups, the job must be made more parallel. One idea is to spread the input, e.g. run it on four ports, and then create four rulesets with ruleset queues, one for each of the inputs. Ideally, to solve the file bottleneck, these should write into four different files. While I have not had the opportunity to test this in an actual deployment, it should provide a much larger speedup, because now we have four of these pipelines running in parallel, on partitioned data, with no need to synchronize between them.
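As a sketch of that idea in the legacy config syntax (the ports, ruleset names and file names are made up for illustration; a real setup would repeat the pattern for all four inputs):

$ModLoad imtcp

$RuleSet remote1
$RulesetCreateMainQueue on        # give this ruleset its own queue
*.* /var/log/remote1.log
$InputTCPServerBindRuleset remote1
$InputTCPServerRun 10514

$RuleSet remote2
$RulesetCreateMainQueue on
*.* /var/log/remote2.log
$InputTCPServerBindRuleset remote2
$InputTCPServerRun 10515

# ...and likewise remote3/remote4 on ports 10516 and 10517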

Well, almost… The bad news is that the current code base (5.5.0 as of this writing) unfortunately does not yet provide the ability to run the input on more than one thread. So if you have 1000 tcp connections, all of them need to be processed by a single thread (even though they may use different ports, that doesn’t matter…). It is not as bad as it sounds, because the input now is *very* quick (remember that parsing is done concurrently in a different thread [pool!]). But it still causes some loss of parallel processing where it is not strictly needed. My thinking is that we should either do a “one thread per connection” server (no longer such a big problem on 64bit machines) or (better, but even more effort) use a thread pool for pulling data from the connections. Unfortunately, I do not have time to tackle that beast, but maybe someone is interested in sponsoring that work (it would be *really* useful)?

As you can see, a full speedup by using multiple cores is perfectly doable, but getting the maximum requires a lot of careful thinking. And, of course, I have to admit that the best features are present in the newest releases (somewhat naturally…). Obviously, there is some stability risk involved with them, but on the other hand I have had some very good success reports from some high-end sites; at least one of them already has v5 deployed in large-scale production.

I could only touch on the issue here, but I hope the information is useful. For further reading, I recommend both the doc on queues, as well as my explanation of how messages are processed in rsyslog. These documents are somewhat older and do not cover all details of pipeline processing (which simply did not exist at that time), but I think they are very useful to read. And, yes, updating them is another item on my too-long todo list…

rsyslog internal messages

I had an interesting conversation with someone who runs multiple instances of rsyslog on a machine for remote reception only and, for some reason, another syslogd for local messages. The question arose where rsyslog error messages are emitted to.

It was expected that they would show up in the other syslogd. However, that is not the case, and for good reason. So I thought it would be good to provide some general advice on how internal messages are emitted.

First of all, internal messages are messages generated by rsyslog itself. The vast majority of them are error messages (like config errors, resource errors, unauthorized connects etc…), but there are also some status-like messages (like rsyslogd startup and shutdown, unexpectedly dropped tcp connections, …). Traditionally, rsyslog does not make a distinction between status and error messages (we could change that over time, but so far nobody has asked, which means it is not worth the hassle).

Rsyslogd is a syslogd, so all messages it emits internally are syslog messages. For obvious reasons, they use the “syslog” facility. And as all of them are flagged as error messages, the resulting priority is “syslog.err”. The internal message source is implicitly bound to the default ruleset.

It now depends on how that ruleset is defined where these messages show up. I strongly encourage everyone to include a rule that logs these messages. If there are e.g. some config issues, they can be easily solved by looking at the emitted error messages. But if you do not have them, it can take you ages to sort out what is wrong.

So you should always make sure that “syslog.err” (or probably better “syslog.*”) is logged somewhere.
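For example, a minimal catch-all rule for rsyslog’s own messages could look like this (the file name is my choice for illustration, not a default):

syslog.* /var/log/rsyslog-internal.log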

If you would now like to use another syslogd, but not rsyslog itself, to log these messages, you do what you usually do in this situation: first of all, make sure that no local rule logs syslog.* messages. Then, include a rule that forwards syslog.* to the recipient that you want to receive it. You have the full flexibility of the rule engine at hand to limit or reformat those messages. An elegant solution that does both is to include the following 2 lines at the top of rsyslog.conf (I assume you use UDP forwarding to another syslogd running on the same host machine):

syslog.* @127.0.0.1
& ~

Note that the tilde character is the discard action, and the leading ampersand applies it to the messages selected by the previous line.

disk assisted mode performance in rsyslog

I would just like to clarify one thing: rsyslog supports disk-assisted mode, in which messages are written to disk if the in-memory queue becomes full.

However, it is generally bad if the system needs to go to disk during normal operations. Disk assistance is primarily meant for things like output targets going offline. If it happens during normal operations, one is probably lost. In the v3 & v4 engines, when disk mode is enabled, the in-memory worker threads are shut down. So all processing then takes place via the disk. That means processing will be slower than before. So if the system was incapable of handling the workload when running on a pure in-memory queue, it will definitely be incapable of handling it in disk mode.

Note that things are different in recent v5 engines: starting with 5.3.5, the disk worker runs concurrently to the in-memory workers and as such the performance is similar to what it was in non-disk mode. Still, overall processing is probably slower, so going to disk is not a cure for a system that can not handle the overall workload. In v5, however, it may be a way to handle excess bursts.
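For reference, a disk-assisted main message queue is configured by simply assigning a queue file name to the in-memory queue. A sketch using the legacy directives (the watermark values are illustrative only, not recommendations):

$WorkDirectory /var/spool/rsyslog     # where queue files will be created
$MainMsgQueueType LinkedList          # an in-memory queue...
$MainMsgQueueFileName mainq           # ...becomes disk-assisted once it has a file name
$MainMsgQueueHighWatermark 80000      # start writing to disk at this queue size
$MainMsgQueueLowWatermark 20000       # return to in-memory operation below this mark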

Priorities for rsyslog Work

I receive requests for support and code additions to rsyslog every day and I am grateful so many people express their interest and see rsyslog as a useful tool.

The bottom line, unfortunately, is that I cannot do everything, and I also cannot do many things as quickly as I would like to. Also, I have to admit, there are some things that I do not like to do, at least as a cost-free activity. The typical example is work that benefits only a single commercial organization or a small subset of them.

I suggest that you read a bit about my philosophy on how open source projects are “paid”. Note that “payment” includes far more things than money, for example good suggestions and bug reports.

I tend to follow this priority scheme, with some variations:

  1. security-related issues
  2. serious problems in the current release
  3. serious problems in previous releases
  4. paid work
  5. things useful to the community at large
  6. things useful to smaller parts of the community (with descending priority)
  7. support for non-commercial folks
  8. bugs in older releases already fixed in newer ones
  9. activities aiding only commercial organizations

The term “things useful” is deliberately vague. Among others, it involves fixing bugs, adding new features and handling support requests. However, support requests usually fall into that category only if either a bug is involved or I can gain some more insight into things that need to be changed (like better doc, general user needs, etc…).

Note that, as per my philosophy, I try to avoid doing work for free that only benefits a commercial party, but neither me personally nor the project. If you find this harsh, read my in-depth explanation of that philosophy.

Paying for Open Source Projects…

I selected the word “paying” in this post’s title deliberately. Of course, open source software usually is (and should be) cost-free to all interested parties, but that does not mean it comes with no price tag whatsoever.

As an open source author, I have to admit that it is virtually impossible to give away everything without any price. “Price”, in my perception, does not necessarily mean “money”. There are many benefits you may gain from working on software, and money is only one of them.

But first of all, let me re-iterate the FSF’s “freedom vs. free beer” argument, in which I fully believe:

“Free software” is a matter of liberty, not price. To understand the concept, you should think of “free” as in “free speech,” not as in “free beer.”

This is very true. In my personal mind, I would really love to give away any work I create to those who need it. But that thought involves some subtle issues. One prominent problem is that other people may think differently. For example, my landlord doesn’t like this idea. Nor does my bakery. Not even the computer manufacturer on whose systems I develop my software! What a shame! So if I gave away everything for free, I would, thanks to the social security system, probably not die, but I would definitely not have a machine on which to create those things I would like to provide for free.

So it looks like I need to make a compromise: give away certain things and charge for others. One approach would be to quit computing as a profession and become a gardener instead. In my spare time I could then program and give away everything for free. The bottom line is that I could program much less than I can currently. Also, I prefer programming over gardening. So this does not look like a good approach – neither for me personally (the then-unhappy gardener) nor for society at large (which could no longer gain the full benefit of my work: believe me, I am far more productive as a programmer than as a gardener…).

So this seems to be the wrong approach. It naturally follows that I need to charge for some of the computing work I do.

Then, look at my motivation as an open source developer. I’d like to make the world a little bit better by providing useful tools. And, if I am honest, I may even like to get a little bit of fame as a recognized open source developer. I guess that motivates many, but few admit to it ;) This hits a sweet spot of “payment”: being recognized feels good and thus keeps me motivated. Seeing the project grow and spread also motivates me. Projects where there is no feedback and which do not grow are usually quickly abandoned. Why? Because not even the most basic “payment” is provided in exchange for the work done.

So a very important form of “payment” to open source authors, at least in my view, is contributions to the project, help in spreading news about it, and (very, very valuable) good bug reports – everything that helps push a project forward and makes it evolve. Of course, contributions in any form are also happily accepted (be it software, hardware, books, …, and of course money). Money is not evil. It pays the electricity to run my machine, among other things.

Taking the arguments together, there is no ideal world where I can give away everything and receive in exchange whatever I need (and, as I barely remember, experiments in trying this failed miserably…).

With that in mind, I begin to divide the world into “friends” and “foes”. Friends are those who provide me with some form of “payment”, that is, anything that is useful to me. Good examples are the folks who write the open source software *I* use (aha, this is cyclic!), folks who provide good bug reports and try out development versions, etc. Any activity that I can also use to my benefit makes you my friend.

Then, there are “foes”. That word probably is too harsh and maybe should be re-phrased as “non-friends”. But the term and the idea are well known.

If you are not my friend, you do not contribute anything that I can use for my benefit. This doesn’t mean you are a bad guy. If you and I do not have anything in common, why should you do something that benefits me? There are far more people to whom I have never provided any benefit than there are people to whom I have. I guess that is true for almost all of us, except a few outstanding people (who then usually receive admiration as a sort of “payment”).

But if you are not my friend, you should not expect me to do anything for free for you. Envision that a stranger comes to your home and asks you to help him build his house. I guess you will be astonished and probably ask “Why should I do that?”. Now assume the sole answer is “Because it is good for me, the stranger – but you need to bring your own clothes and tools and pay for the gas to get to my place”. Would you be willing to help that guy out? I guess the answer would be “no” in almost all cases.

So why should I as an open source developer create software for or otherwise help a non-friend? Why am I supposed to say “yes, of course” if a stranger asks me “Can you implement this and that, but you need to pay for your own hardware and the electricity used and also for…”? The answer is: I am not! So don’t silently expect me to do that.

Of course, the question itself may have made you my friend. How come? Simple: the idea you propose may be a very useful one for my project. If it gets implemented, it will help many of my existing friends and it will eventually help spread the project. So by providing the idea itself, you did me a big favor, which one may consider a form of “payment”. Consequently, I often implement things asked for by complete strangers. And I often try to help out complete strangers on the mailing list and other support channels. Here, I often learn a real lot about what is good and bad about my projects. This is a very valuable form of “payment” for me.

HOWEVER, and this is my personal limit, whenever I am asked to do something for free, I evaluate *my* benefit in doing so. Of course, this includes the benefit to the project and the benefit to the community at large, but this all goes into the picture of “my” benefit as the sum of all that.

So if a complete stranger asks me to do something, I check for immediate benefits in doing it. Sometimes there are cases where I can see a benefit, but only for that stranger. Usually, these are things corporate guys need, and they are very special and non-generic. If there is no benefit at all, I simply do not look any further. Of course, the proper solution here is that those folks can actually pay money to make me implement whatever they need. The logic behind this is that when they pay money, they help fund activities that also benefit the project at large. And if they are corporate guys and do not get any money approved for what they (think they) need, they don’t really need it at all! Because if it were really useful to their corporation, they would have received the money grant (corporations are very good at making these trade-offs, though they occasionally fail ;)). So in short, the money even acts as a filter that prevents me from spending time on things that nobody really needs!

If a friend comes along and asks me to do something, I still need to evaluate the request. But I am much more likely to implement the functionality requested (it’s a game of “give and take”). Of course, I need to evaluate the overall priority for my project here, too. But friends definitely receive a priority boost if at all possible. And I think this is only fair.

In general, I favor requests that are useful to the community at large over those that are only useful to a small subset of it. I tend not to implement anything that is only useful to a single commercial organization without some form of “hard” payment (hardware, money, a nice vacation in Hawaii… ;)). For example, why should I provide free services to a company that expects me to pay, e.g., my utility bill? If you do not give your services to me for free, don’t expect me to give my time for free for just your benefit (think about the “stranger asking for my help building his house” analogy).

My thoughts may sound very materialistic, but in fact they just describe what I think is fair in the non-perfect world we live in. Remember that most non-profit organizations are my friends, because they offer useful services to “me” (as part of the community). And think about my thoughts in the intro of this blog post about my inability to do any useful work at all if I did NOT have a somewhat materialistic point of view. So, honestly, I think my philosophy here is not actually “materialistic” but rather a result of how life is…

Edit: it may also be useful to have a look at my blog post “work, friends and personality”, which looks at a very similar issue from a slightly different angle.

This philosophy also influences priority decisions in my open source projects, as outlined for example in “rsyslog work priorities”.