How Software gets stable…

Over the past few days, I have received a couple of questions asking whether this or that rsyslog feature can be introduced into the stable branch soon. So I thought it was time to blog about what makes software stable – and what does not…

But let me first start with something apparently unrelated: let me confess that, from time to time, I like to enjoy some good wine (Californian Merlot and Cabernet especially – ask me for my mailing address if you would like to contribute some! ;)). And on some special occasions, I spend way too much money just to get the “old stuff”: those nice wines that have aged in oak barriques. To cut a long story short, those wines are kept in barrels not only for storage, but because the exposure to the oak, as well as some properties of the storage container, interact with the wine and make it taste better. Wikipedia has the full story, and also this interesting quote:

The length of time that a wine spends in the barrel is dependent on the varietal and style of wine that the winemaker wishes to make. The majority of oak flavoring is imparted in the first few months that the wine is in contact with oak but a longer term exposure can affect the wine through the light aeration that the barrel allows which helps to precipitate the phenolic compounds and quickens the aging process of the wine.[8] New World Pinot noir may spend less than a year in oak. Premium Cabernet Sauvignon may spend two years. The very tannic Nebbiolo grape may spend four or more years in oak. High end Rioja producers will sometimes age their wines up to ten years in American oak to get a desired earthy, vanilla character.

Read it again: “High end Rioja producers will sometimes age their wines up to ten years in American oak to get a desired earthy, vanilla character.”

So what would the Riojan winemaker probably say if you asked him for a great 2008 wine (we are in early 2009 currently, just for the record)? How about “Be patient, my friend – wait another 9 years, and you can enjoy it!” And what if you begged him that you need it now, immediately? “I am sorry, but I can’t accelerate time…”. And if you told him you really, really need it because otherwise you cannot close an important business deal? Maybe he says “Listen my friend. Some things simply need time. You can’t hurry them. But if you need to have something that can’t really exist, I can get you a bottle of that wine and label it as ‘Famous Riojan 10-year aged Wine from 2008’ – but we both know what is in the bottle!”. Technically speaking, the winemaker is not even cheating – he claims that the wine is from 2008, so how can it be aged 10 years? If anyone buys that (today), the buyer is probably very much at fault.

As a side-note, all too often our society works in that way: someone requests something that is impossible to do, someone begs long enough until someone else cheats, everybody knows – and we all are happy (at least up to the point where the cheat gets us into real trouble… – pick your favorite economic crisis to elaborate).
The moral of the story? Some things need time. And you can’t replace time with anything else. If you want to have the real taste of a wine aged 10 years in oak… you need 10 years.

By now you probably wonder what all of this has to do with software. A lot! Have you ever thought about what makes software stable? In closed source, you hopefully have a large testing department that helps you nail down bugs. In open source, you usually do not have many of these folks, but you have something much better: a community of loyal users eager to break their systems with the latest and greatest of what you happen to have thrown together ;)

In either case, you start with a relatively unstable program and with each bug report (assuming you fix it), the software gets more stable. While fixing bugs, however, you may introduce new instabilities. The larger the fix, the larger the risk. So the more you change, the larger the need to re-test and the larger the probability that while one issue is fixed, one (or more!) new issues have been created. For very large fixes, you may even end up with a much worse version of the software than you had before.

Thankfully, a patch to fix a bug is usually much smaller than what was fixed. Often, it is just a few lines of code, so the risk to worsen things is low. Why is the patch usually just a few lines long? Simply because you fix some larger thing that usually works quite well. So you need to change some details which were not properly thought out and thus resulted in wrong behavior (if you made a design error, that’s a different story…).

So the more bug reports you get, and the more of them you fix, the more stable the software gets. You may have seen some formal verifications in computer science, but in practice, for most applications, this is the simple truth of how things work.

Now to new features: features are usually the opposite of a bug fix: introducing a new feature tends to be a larger effort, touching much more code and adding code where code never has been ;) If you add new features, chances are great that you introduce new bugs. So with each feature added, you should expect that the stability of your code decreases (and, oh boy, it does!). So how to iron out these newly introduced bugs? Simply wait for bug reports, fix them, wait for more – until you have reached at least a decent level of stability (aka “no new/serious bug reports received for a period of n days”, whatever you have defined n to be).

And what if you then introduce a new feature? I guess by now you know: that’ll decrease stability so you need to iterate through the bugfixing process … and so on.

But, hey, we are doing open source. I *love* to add features every day! Umm… I guess my program will never reach a decent level of stability. Bad…

What to do? Taking a long vacation (tempting…) is not a real solution. Who will fix bugs while I am away (shame on me for mentioning this…)? But a pattern appears if you follow this thought: what you need to do to make a program stable is fix bugs for a period of time but refrain from adding new features!

Thanks to git, this can easily be done: you simply create one code branch for a version that shall become stable, and create another branch for the version where you create new features (the development branch). With a bit of git voodoo, you can even import fixes from your stabilizing branch to the development branch. Once you are happy with the stability of your code (in the stabilizing branch), you are ready to declare it to be stable! For that, you’ll probably have a separate branch. Then, you can start the game again: copy the state of your development branch to the stabilizing branch, do not touch that branch except for bug fixes and continue adding new features to the development branch. Iterate this as long as you are interested in your project.

This, in short form, is how rsyslog is created. Currently, there are four main branches, plus a number of utility branches that aid the development of specific features (let’s ignore them in this context here): we have the development (also called “master”) branch which equates to the … yes… development branch from the sample above ;). The stabilizing branch is called “beta” in rsyslog terms. Then, we have a v2-stable and a v3-stable branch. Both are actually stable, but v2-stable is probably even more stable because it has – except for bug fixes – not been touched for many more months. It also has the fewest features, so it is probably the best choice if you are primarily interested in stability and do not need any of the new features. As rsyslog is further developed, we will add extra stable branches (e.g. there will probably be a v4- and v5-stable branch – but we may also no longer maintain v2-stable at that point because nobody uses it any longer [just like dinosaurs are no longer maintained ;)]).

Did you read carefully? Did you get the message? So let me ask:
What makes software stable?

Bug fixes? Testing? Money (yes, yes, please throw at me!)?

REALLY? Let me repeat:
WHAT MAKES SOFTWARE STABLE?

There is only one real ingredient and that is: TIME! Just like good wine, software needs to age. Thankfully, age, for software, is measured in the number of different test cases. So money can accelerate the aging of software (as some chemistry guru may be able to do for wine, probably with the same side-effects…). But for the typical open source project, stability simply goes along with the rate at which the community adopts new releases, tests them AND submits bug reports, so that the authors can work on fixing broken things.

And what is the moral of the story? Finally, I am coming back to the opening questions: there is nothing but time that makes rsyslog stable. So if you ask me to add a feature today, and I do, you cannot expect it to be immediately stable – simply because this is not how things work (thanks, btw, for trusting so much in my programming abilities ;)). The new feature needs to go through all the stages, that is, it must first be applied to the current development build (otherwise we would de-stabilize the current beta, which is not desirable). Then, it is migrated to the beta build over time, where it can finally fully stabilize and, whenever the bug rate seems to justify this, it can move on to the stable build. For rsyslog, this typically means that three to four months, sometimes more, are needed before a new feature hits the stable branches. And there is little you can do about that.

“But… hey, I need a stable version of that cool feature now! My manager demands it. Hey, I’ll also pay you for it…” Guess what? I can do the same as the winemaker did. Of course, and if you ask really nicely, I can create a v3-stable-cool version for you, which is a version with the cool feature that I have declared immediately stable (btw, it’s mostly the same thing that all others just call “the beta”). If that satisfies your boss, I’ll be happy to do so. But we both know what you have gotten… ;)

Of course, I am exaggerating a bit here: in software, we can somewhat increase the speed of stabilizing by adding testers. Money (and even more motivation) can do that. We can also backport single new features to so-far stable branches (note the fine print!). This reduces the stability a bit, but obviously not as much as for the development version. However, this requires effort (read: time and/or money) and it may be impractical for many features. Some features simply rely on others that were newly introduced in that development version and if you backport the whole bunch of them, you’ll have something as much changed as the development version, but in an environment where the component integration is not as well tested and understood. Of course, some company policies (seem to) force you to do that. If so, the end result is that you have a system that is much less stable than the development version, but has a seemingly “stable” label. Wow, how cool! As common sense says: “everyone gets what one asks for” ;)

So what is the bottom line? Good software and good wine have something in common: time to ripen! Think about this the next time you ask me to offer a new feature as part of a stable branch. It’s simply impossible. But, of course, you can bribe me to stick that “stable” label onto a mangled-with version…

Platform importance for rsyslog

If you follow my blog or the rsyslog mailing list, you probably already know that rsyslog is available on a number of platforms. Thanks to contributors, rsyslog runs on BSD and is seen on Solaris and HP-UX too. The latter two are not real ports yet and each of them has its restrictions. Also, I’d like to see support for AIX, but I have not even been able to obtain a compile platform yet.

HOWEVER… as much as I desire multi-platform support, it is the truth that rsyslog stems from and is fueled by the Linux community. This is where the major contributions come from and this is also where the major interest originates. Plus, this is the only truly free platform, so it lives up to the same spirit that rsyslog has.

When it comes to putting effort into the project, I have limited resources. Naturally, I put those resources where they create the most effect. For that reason, most of the development is focused on Linux (followed by BSD, where there is also an active community). Solaris and friends live mostly in the corporate world and so questions asking for rsyslog on these platforms mostly come from for-profit organizations. And there are very few of these requests. So I cannot give them priority, because they do not benefit the project sufficiently. HOWEVER, if the corporations put some money up and sponsor development, that is definitely in the interest of the project, because it allows us to grow and the sponsorship will probably allow us to do other things as well. Everyone benefits.

Once a platform is implemented, it must be maintained. Obviously, there is little point in orphaning a platform that we already run on. But for platforms with little interest, it is probably not justified to test each and every new release (just think of the testing time required). I’d call those platforms “tier 2” platforms and think I can look at them only in response to a problem report. Of course, we offer rsyslog support contracts and if a sufficiently large number of users decide to purchase these contracts (extremely low numbers today, to phrase it politely) and these purchasers are interested in e.g. Solaris, we will most probably change priorities and all of a sudden Solaris will become “tier 1”. Of course, this may push away some community-requested work, but again I think this is in the overall interest of the project: if we can secure continuous funding, not only from one source (Adiscon), but from many, we can be much more sure we can implement more and more cool things in the future.

I hope this clarifies my position on the importance of the various platforms for rsyslog and how I will handle them.

Oh, and one final note: if a platform requires me to even purchase hardware (Solaris/Sparc for example), I will not do that unless someone donates a machine (NOT LEND it, but donate, so that at least for the next three years I can ensure maintaining rsyslog on it – a virtual machine, of course, is sufficient if you happen to have some inside a cloud ;)). It would be just plainly silly to put real money at supporting a community that does not contribute back ;)

rsyslog video tutorials…

I started thinking about video tutorials a few days ago. Videos are cool and more and more people use them. So why not create a couple of them for rsyslog?

The idea is simple and I think it will work equally well for teaching both conceptual topics and practical “how to” types of problems. The latter probably works even better…

I could investigate, design and build my tutorial in a perfect way. The result would obviously be very useful and perfect – but most probably there never would be any result due to time constraints and priorities. With this on my mind, I created a very first trial tutorial this morning, all in all in less than an hour. It took me some more minutes to get it up on the web site, but this effort will never again be required.

The question this trial shall answer is: is it possible to create something useful (not perfect) in little time? My personal feeling is mixed. I think one notices quickly that the material is not as well organized as you would expect from a talk. Also, some additional slides would definitely have enhanced the usefulness – but also increased production time very much. On the other hand, I think some information is conveyed by the presentation. And, even better, information that you cannot obtain with reasonable effort from any other place.

So: is it useful or not? What could improve the usefulness without causing a large increase in production time? Does it make sense to create sub-optimal content if that means it can be created quickly? If so, which other topics would you like to see covered?

Please have a look at the rsyslog message flow video tutorial and let me know your thoughts!

rsyslog and solaris

This week, I had the opportunity to work a bit on rsyslog on Solaris. Most importantly, I could set up a compile and test environment (*not* that easy if you don’t know your way around Solaris…) and have integrated those patches that folks have sent over time (unfortunately I have lost many of the contributor names, so if you are among them please let me know for proper credits!).

I was able to integrate those patches and make sure that they don’t break the Linux build (I am still a bit in the verification process, but it looks good). I have created a solaris branch in git and will in the future keep Solaris-specific additions in that branch. I will merge that branch back into the master branches every time I am confident enough that it doesn’t break anything in the mainstream build.

I was satisfied to see that not that many changes were required for a Solaris build. So the initial effort, some months ago, seems to have paid off well. I have seen that the solaris git branch compiles, but I have not done any serious testing on Solaris. Still, I am short on time and I have to admit I have spent more time on it this week than I should have. So testing is off-limits for now…

However, I got a good impression of what it takes to make rsyslog really run on Solaris. First of all, even gcc4 does not provide the atomic instructions that it usually provides on Linux. This case is not really handled in the code, so the end result is that the binary will be racy. I guess it will run, but it will have subtle issues on high-volume log servers and/or servers that run asynchronous action queues. Especially if the latter are used, I’d expect rsyslogd to segfault every now and then (but without async actions it should not be that bad, at least I think).
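To illustrate what handling the missing atomics could look like, here is a minimal sketch. It is not rsyslog's actual code: it assumes GCC-style `__sync` builtins, uses the compiler-defined macro `__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4` to detect them, and `struct counter`, `counter_init` and `counter_inc` are made-up names. The point is that, when the builtins are missing, the code falls back to a mutex instead of silently doing a racy plain increment.

```c
#include <stddef.h>
#include <pthread.h>

/* GCC predefines this macro when 4-byte atomic builtins are usable. */
#if defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_4)
#  define HAVE_ATOMIC_BUILTINS 1
#else
#  define HAVE_ATOMIC_BUILTINS 0
#endif

/* Hypothetical shared counter (e.g. a reference count). */
struct counter {
    int value;
    pthread_mutex_t mut;   /* only needed by the fallback path */
};

static void counter_init(struct counter *c)
{
    c->value = 0;
    pthread_mutex_init(&c->mut, NULL);
}

/* Increment the counter and return the new value, safely in both builds. */
static int counter_inc(struct counter *c)
{
#if HAVE_ATOMIC_BUILTINS
    return __sync_add_and_fetch(&c->value, 1);
#else
    int v;
    pthread_mutex_lock(&c->mut);   /* slower, but not racy */
    v = ++c->value;
    pthread_mutex_unlock(&c->mut);
    return v;
#endif
}
```

The mutex path costs performance, but it trades the "segfault every now and then" kind of problem for a predictable slowdown.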

There is also still no kernel input plugin (or imklog driver). I also guess there may be issues with the local log socket. I’d still caution everybody to be very, very careful when experimenting with the local log socket. I remember earlier testing where rsyslogd simply destroyed the socket but never was able to re-create it. Some other tweaks are probably required to core and runtime files. Some compiler messages point in that direction (and part of that may even be nasty).

I have compiled only the bare essentials, without TLS, database drivers or anything else fancy. I expect some mild to moderate problems with them, too.

So in short, the current code base can probably be used to run a relatively stable syslog relay or file-only receiver. I wouldn’t put it in too much production, though. For folks interested in rsyslog on Solaris, we now at least have a version again that can be built and serve as a basis for extension. I am glad I could do that.

As a side-note, I am still looking for sponsors of a full rsyslog Solaris porting effort. If you would like to sponsor (or know someone who does), just mail me and I’ll help settle the dirty details ;)

I hope this update – and the progress made – on rsyslog on Solaris is useful for a couple of folks.

rsyslog doc – state of the art…

Most people agree that rsyslog is a decent and useful piece of software. However, most people (including me) also agree that the rsyslog documentation is, ahem, sub-optimal.

When I code, I always think “I’ll do the doc soon”. But when “soon” arrives, something else is in the way. Yet another (justified) feature request, articles and other projects (yes, they exist ;)). At least I try to convey the important concepts and backgrounds here in the blog, but you have a hard time if you intend to extract a specific feature from the blog. So: the doc is in a bad shape.

I just got an offer from a volunteer who would like to help with the doc. That may even be the start of an rsyslog doc team. In any case, that’s a fantastic opportunity. First of all, more doc means more and happier users. Secondly, I think it is very useful when someone other than me writes user doc. I can’t even envision the questions that a regular user may ask, and this is a problem for any manual I write.

I hope this collaboration materializes. In order to aid it, let me briefly describe what currently exists: www.rsyslog.com is driven by Postnuke for various reasons, the most important one being that I have a Postnuke wiz at hand, so I do not need to dig into any dirty details if I need something extra ;) Postnuke is a CMS, so dynamic content can be added and is easy to edit by anyone else. So far, we use the web site itself primarily for news announcements.

The real doc set is kept as HTML. We use a Postnuke module to integrate that static HTML into the CMS. The HTML doc set exists only once, right inside the rsyslog git tree. When I make changes, they automatically go into git, go into the tarball and I also copy them over to the web site. All of this is without any effort, which is good. The bottom line is that the HTML doc set needs to be modified by patches or by me pulling from someone else’s git archive (both of which I will happily do). I think it is good to have the HTML pages available in the tarball; previous discussion on the rsyslog mailing list showed that package maintainers think so, too.

There exist two man pages. They are extremely bad. They need to be hand-synced with the HTML pages and I almost always forget to do so. Man pages do not go onto the web (besides some very old copies I produced in a clumsy way). But they live in git and the tarball, too.

A partial effort was done to internationalize the doc set, based on the usage of docbook. I think this is a good approach and the work done so far is kept in the rsyslog docbook branch. However, the approach currently focuses on the man pages. I do not know if it will work for the HTML doc, too.

I find docbook a very interesting concept, but the learning curve is steep. I simply had not enough time yet to dig deeply into it to start any serious work with it (html and LaTeX are still king for me ;)).

We have also a few places of obviously user-contributed content, the most important one being the rsyslog wiki. It contains many useful things, among others config samples. The bad thing about the wiki is that there is only a single one. So it probably is not the place to describe things that are very version dependent. Or is it and I have just the wrong approach – correct me!

Worth mentioning is also the rsyslog knowledge base, which primarily focuses on dynamic content and discussions. But the search function is a very useful tool. Also, part of the larger knowledge base is devoted to gathering information on how to configure syslog devices, how to best react to messages and how to consolidate e.g. Windows events. This obviously is not direct rsyslog documentation, but I hope it is useful and will continue to grow even more useful.

Finally, there is the mailing list and most importantly the mailing list archive. While this is definitely not considered a documentation resource, the archive has a lot of valuable information and it may even be a starting point for creating “real” doc.

I hope this is a good and complete wrap-up of the doc situation. If I have forgotten anything or you’d like to tell me your thoughts: just use the comment function! :)

rsyslog now default on stable Debian

Hi all,

good news today. Actually, the good news already happened last Saturday. The Debian project announced the new stable Debian 5.0 release.

Finally having a new stable Debian is very good news in itself – congrats, Debian team. Your work is much appreciated!

But this time, this was even better news for me. Have a look at the detailed release notes and you know why: Debian now comes with a new syslogd, finally replacing sysklogd. And, guess what – rsyslog is the daemon of choice! So it is time to celebrate for the rsyslog community, too.

There were a couple of good reasons for Debian to switch to rsyslog. Among others, an “active upstream” was part of the success, thanks for that, folks (though I tend to think that after the more or less unmaintained sysklogd package it did not take much to be considered “active and responsive” ;)).

Special thanks go to Michael Biebl, who worked really hard to make rsyslog available on Debian. It is one thing to write a great syslogd, it is a totally different one to integrate it into a distro’s infrastructure. Michael has done a tremendous job, and I think this is his success at least as much as it is mine. He is very eager to get all the details right and has provided excellent advice to me very often. Michael, thanks for all of this and I hope you’ll share a virtual bottle of Champagne with me ;)

Also, the rsyslog community needs sincere thanks. Without folks that spread word and help others get rsyslog going this project wouldn’t see the success it experiences today.

I am very happy to have rsyslog now running by default on Fedora and Debian, as well as a myriad of derivatives. Thanks to everyone who helped make this happen. So on to a nice, little celebration!

Thanks again,
Rainer

PS: promise: we’ll keep rsyslog in excellent shape and continue in our quest for a world-class syslog and event processing subsystem!

When does rsyslog close output files?

I had an interesting question on the rsyslog mailing list that boils down to when rsyslog closes output files. So I thought I would talk a bit about it in my blog, too.

What we need to look at is when a file is closed.
It is closed when there is a need to. So, when is there a need? There are currently three cases where the need arises:

a) HUP or restart
b) output channel max size logic
c) change in filename (for dynafiles, only)

I think a) needs no further explanation. Case b) should also be self-explanatory: if an output channel is set to a maximum size, and that size is reached, the file is closed and a new one is opened. So for the time being let’s focus on case c):

I simplified a bit. Actually, the file is not closed immediately when the file name changes. The file is kept open, in a kind of cache. So when the very same file name is used again, the file descriptor is taken from the cache and there is no need to call open and close APIs (very time consuming). The usual case is that something like HOSTNAME or TAG is used in dynamic filename generation. In these cases, it is quite common that a small set of different filenames is written to. So with the cache logic, we can ensure that we have good performance no matter in what order messages come in (generally, they appear random and thus there is a large probability that the next message will go to a different file on a sufficiently busy system). A file is actually closed only if the cache runs out of space (or cases a) or b) above happen).

Let’s look at how this works. We have the following message sequence:


Host Msg
A M1
A M2
B Ma
A M3
B Mb

and we have a filename template that, for simplicity, consists of only %HOSTNAME%. What now happens is that with the first message the file “A” is opened. Obviously, messages M1 and M2 are written to file “A”. Now, Ma comes in from host B. The name is newly evaluated, so Ma is written to file “B”. Then, M3 goes to file “A” again and Mb to file “B”.

As you can see, the messages are put into the right files, and these files are only opened once. So far, they have not been closed (and will not be until case a) happens), because we have just two file descriptors and those can easily be kept in the cache (the current default for the cache size is, I think, 100).
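The cache logic described above can be sketched roughly as follows. This is only an illustration, not rsyslog's implementation: `struct file_cache`, `cache_init` and `cache_get` are made-up names, the `opens` and `closes` counters stand in for real open() and close() calls, and evicting the least recently used entry is my assumption for this sketch (the text above only says a file is closed when the cache runs out of space).

```c
#include <string.h>

#define CACHE_SIZE 4          /* tiny on purpose, to make eviction easy to see */
#define NAME_MAX_LEN 64

struct cache_entry {
    char name[NAME_MAX_LEN];  /* generated file name, "" if the slot is free */
    unsigned long last_used;  /* logical clock value of the last access */
};

struct file_cache {
    struct cache_entry slot[CACHE_SIZE];
    unsigned long clock;
    int opens;                /* how often open() would have been called */
    int closes;               /* how often close() would have been called */
};

static void cache_init(struct file_cache *c)
{
    memset(c, 0, sizeof(*c));
}

/* Return the slot for `name`; the file is "opened" only on a cache miss. */
static int cache_get(struct file_cache *c, const char *name)
{
    int i, victim = 0;

    c->clock++;
    for (i = 0; i < CACHE_SIZE; i++) {
        if (c->slot[i].name[0] != '\0' && strcmp(c->slot[i].name, name) == 0) {
            c->slot[i].last_used = c->clock;   /* hit: reuse the open file */
            return i;
        }
    }
    /* Miss: take a free slot if there is one, else the least recently used. */
    for (i = 0; i < CACHE_SIZE; i++) {
        if (c->slot[i].name[0] == '\0') { victim = i; break; }
        if (c->slot[i].last_used < c->slot[victim].last_used)
            victim = i;
    }
    if (c->slot[victim].name[0] != '\0')
        c->closes++;                           /* cache full: close the old file */
    strncpy(c->slot[victim].name, name, NAME_MAX_LEN - 1);
    c->slot[victim].name[NAME_MAX_LEN - 1] = '\0';
    c->slot[victim].last_used = c->clock;
    c->opens++;                                /* open the new file */
    return victim;
}
```

Feeding the host sequence A, A, B, A, B from the example through `cache_get()` leaves `opens` at 2 and `closes` at 0, which matches the walk-through: each file is opened once and none is closed.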

I hope this is useful information.

On the reliable plain tcp syslog issue … again

Today, I thought hard about the reliable plain TCP syslog issue. Remember? I have ranted numerous times on why “plain tcp syslog is not reliable” (this link points to the initial entry), and I have shown that by design it is not possible to build a 100% reliable logging system without application level acks.

However, it hit me during my morning shower (when else?) that we can at least reduce the issue we have with the plain TCP syslog protocol. At the core of the issue is the local TCP stack’s send buffer. It enhances performance but also causes our app to not know exactly what has been transmitted and what not. The larger the send buffer, the larger our “window of uncertainty” (WoU) about which messages made it to the remote end. So if we are prepared to sacrifice some performance, we can shrink this WoU. And we can simply do that by shrinking the send buffer. It’s so simple that I wonder why a shower was required…

In any case, I’ll follow that route in rsyslog in the next few days. But please don’t get me wrong: plain TCP syslog will not be reliable even if the idea works. It will just be less unreliable – but much less ;)
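For the curious, shrinking the send buffer is a single socket option on the sending socket. The following is a minimal sketch, not what rsyslog will necessarily end up doing; `shrink_send_buffer` is a made-up helper name. It uses the standard `SO_SNDBUF` option and reads the value back, because the kernel is free to adjust the request (Linux, for example, doubles the requested size and enforces a minimum).

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Ask the kernel for a smaller send buffer on `fd`.
 * Returns the buffer size actually in effect afterwards, or -1 on error. */
static int shrink_send_buffer(int fd, int bytes)
{
    int actual = 0;
    socklen_t len = sizeof(actual);

    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof(bytes)) != 0)
        return -1;
    /* Read the value back: the kernel may round up or clamp the request. */
    if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &actual, &len) != 0)
        return -1;
    return actual;
}
```

The value returned is an upper bound on how much data can sit unsent in the local stack, so it gives a rough feel for the size of the remaining WoU on the sender side.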

Begun to roll out race patches…

I have now begun to roll out the rsyslog race patches. Before the weekend, I rolled out the patch for the debian_lenny and development branches and today the beta branch followed. I am now looking forward to feedback from the field. The patch for v3-stable is ready (and available via git), but I’d like to get at least a bit more feedback before I do another stable release.

rsyslog: optimizing exception handling

The recent analysis of rsyslog’s race condition has fueled some related and some not-so-related discussions. Among them is an old-time favorite, that is performance enhancement. I have finally taken the time to write about rsyslog’s “exception handling” and what I do not like about it.

I am reproducing a forum post here, in the hope that it will be easier to find – and attract more attention – if it is available via the blog. I would appreciate comments via the forum, so that I can keep track of them in a single location. With that said, here we go:

In rsyslog, a kind of exception handling is done by the “iRet” mechanism. In short, there exists an integer data type that conveys a universal return code. This code ranges from “all OK” over “all OK, but this and that information”, “we had a warning” to “something went wrong”. States are encoded as integer numbers. By calling convention, almost all functions return such an iRet value (named after its variable name). More importantly, every caller checks the outcome and employs a kind of exception handling when something unexpected happened (like doing resource cleanup). As an aid to the developer, most of the inner workings are encapsulated in easy to use macros.

For example, the return code checking is done via the CHKiRet(f(x)) macro, which expands to something like

if((iRet = f(x)) != RS_RET_OK)
    goto abort_finalize;

As such, the innocent-looking (and frequently found) sequence

CHKiRet(f(x));
CHKiRet(g(x));
CHKiRet(h(x));

results in lots of conditional branches. Such code places a big burden on a CPU’s speculative execution resources. For example, it may need a lot of space in the branch pattern table, ejecting other, potentially useful entries from the cache. Given the fact that the quality of speculative execution affects execution speed considerably on modern CPUs, pressing the speculative system to its max is probably not a wise idea.

One performance enhancement approach is to find ways that enable the code to be executed in larger linear blocks. The most important observation is that in almost all cases, the if() condition is never true, because typically the outcome of the function called is an OK state.
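To make the pattern concrete, here is a small self-contained sketch of the iRet style. It is a simplification for illustration, not a copy of the rsyslog sources: the `ret_t` type, the `RS_RET_ERR` value, the `fail_at` parameter and the `steps_done` counter are made up, while `CHKiRet`, `iRet`, `RS_RET_OK` and the `abort_finalize` label follow the description above.

```c
#include <stdlib.h>

/* ret_t and RS_RET_ERR are made-up names for this sketch. */
typedef enum { RS_RET_OK = 0, RS_RET_ERR = -1 } ret_t;

/* Check a call; on any non-OK result jump to the cleanup label. */
#define CHKiRet(call) \
    do { if ((iRet = (call)) != RS_RET_OK) goto abort_finalize; } while (0)

/* Hypothetical worker steps: step n fails when fail_at == n. */
static ret_t f(int fail_at) { return fail_at == 1 ? RS_RET_ERR : RS_RET_OK; }
static ret_t g(int fail_at) { return fail_at == 2 ? RS_RET_ERR : RS_RET_OK; }
static ret_t h(int fail_at) { return fail_at == 3 ? RS_RET_ERR : RS_RET_OK; }

static int steps_done;   /* demo only: how far the last run got */

static ret_t process(int fail_at)
{
    ret_t iRet = RS_RET_OK;
    char *buf = malloc(64);          /* a resource that needs cleanup */

    steps_done = 0;
    if (buf == NULL)
        return RS_RET_ERR;

    /* one conditional branch per call, almost never taken */
    CHKiRet(f(fail_at)); steps_done++;
    CHKiRet(g(fail_at)); steps_done++;
    CHKiRet(h(fail_at)); steps_done++;

abort_finalize:
    free(buf);                       /* runs on success and on every error path */
    return iRet;
}
```

Each `CHKiRet` line compiles to a call, a compare and a conditional jump, which is exactly the branch density discussed above.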

I thought about using longjmp to provide the necessary functionality, but the setup effort for longjmp, on a *quick* look, seems to be too high, especially given the number of small functions that are present in rsyslog (and inlining does not help with this issue). The answer is probably to look at how the C++ exception mechanism is implemented and build a solution similar to that (just like many of the object callbacks are inspired by the C++ method call tables).
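For comparison, this is roughly what a longjmp-based variant could look like. It is a sketch under simplifying assumptions, not a proposal for rsyslog: `on_error`, `failed_step`, `step` and `process` are made-up names, and the single global `jmp_buf` makes it unsuitable for multi-threaded code as written. It shows both the attraction (the caller runs the steps as straight-line code) and the cost (every entry into `process()` pays for a `setjmp()`).

```c
#include <setjmp.h>

static jmp_buf on_error;          /* one recovery point (not thread-safe) */
static volatile int failed_step;  /* which step "threw", 0 if none */
static int steps_done;            /* demo only: how far the last run got */

/* Hypothetical step: "throws" via longjmp instead of returning an error. */
static void step(int n, int fail_at)
{
    if (n == fail_at) {
        failed_step = n;
        longjmp(on_error, 1);     /* unwinds straight back to setjmp() */
    }
    steps_done++;
}

/* Returns 0 on success, or the number of the failing step. */
static int process(int fail_at)
{
    steps_done = 0;
    failed_step = 0;

    /* setup cost: the execution context is saved on every call */
    if (setjmp(on_error) != 0)
        return failed_step;       /* the "catch" block */

    /* the caller's path is now linear, without per-call checks */
    step(1, fail_at);
    step(2, fail_at);
    step(3, fail_at);
    return 0;
}
```

With many small functions, each needing its own recovery point for cleanup, the `setjmp()` cost is paid over and over, which is the setup effort mentioned above.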

I have not yet begun to dig seriously into this optimization, as there are plenty of other things that can be improved and that promise to have much more effect (like the reduction of the overall number of system calls needed on a per message basis).

However, I would appreciate feedback on this issue. Please post to the forum thread, so that I have the information at hand when I finally can turn to optimizing that code area.