rsyslog work log and future directions

Hi folks, probably the last rsyslog work log post for 2007. Thanks for sticking around – and hopefully I’ll see you again in 2008. It’ll become a very exciting year, with a lot of new features. I am eager to implement what is right now on my head, and I’ll most probably will start with modifying the message queue, an endeavor that will ultimately lead to store-and-forward capability just like in syslog-ng’s premium edition. And the good news is that I hope to finish that in January 2008 ;) — what also means that I have made up my priorities. Was not an easy job, and I hope I got it right. So store-and-forward with enhanced output threading is first and the other things will follow later. To me, the hardest decision was to put off expressions, another feature at least I would like to see the sooner the better.

But now back to the work log:
2007-12-27
– added $UDPServerAddress config directive
– added capability to have multiple UDP listeners running concurrently
– applied cross-platform patch from darix to facilitate GSS-API compile
on more platforms
– some cleanup
– internal restructuring in omfwd.c – stage work for further modularization
I think I also fixed a bug as a side-effect – but not looked to much at it
2007-12-28
– took TCPSend() apart and made it generic via function pointers
– moved TCPSend() and frame building code to tcpsyslog.c
– omgssapi created
– removed gss-api code from omfwd.c

gss-api and rsyslog v2

I initially sent this message only to the mailing list. But now I think it make sense to reproduce it here. So there we go:

I am working on the modular structure of rsyslog v3. I am currently revisiting gss-api support. I notice that with the current omfwd, it will be extremely hard to separate gss-api support into its own module. Doing so will break backward compatibility to the configuration file.

GSS-API has been out only for a few days, and mostly over the holiday period. So it is much less of a concern if we introduce now some changes that will case rsyslog.conf format modifications. Much less trouble than when we release v2, a release expected to be in wide use for at least half a year, if not much longer. V2 released with the current syntax would require me to do some tricks in v3 to keep compatibility. Quite complex.

So I decided to create a omgssapi for v3 and extract the gss-api code from omfwd. It looks like this can be done without too much code duplication. There will be some duplicate code, but it will shrink as v3 continues to be developed. Once I have a good working version, which I expect very soon, I will backport that to the v1/2 source tree. I’ll then do a new v1 release with a slightly incompatible gss-api config file syntax. After this is out for a few days, I hope I can than finally push out that version as v2.

I hope this is a good decision. I think it will save us major future trouble at the expense of a relatively slight disturbance in the late v1 timeline. I guess most user’s won’t even notice there is a change.

As always, Feedback is appreciated.

Things to do in rsyslog…

I have made good progress with rsyslog‘s input modules. As it looks, the basic things are done and the input module interface has been proven to be both quite stable as well as very simple. It doesn’t yet support different instances, but I begin to think that I do not even need them – also not in the long term.

Of course, most of the current input modules are not clean modules. They have a lot of dependencies to other parts of the code, which is not yet able to be dynamically loaded. But at least there is a foundation on which additional modules could be build. Getting the current input modules to be real clean modules will require further stage work. Many thanks need to be done.

So what to do next? It now comes down to both a matter of priorities and dependencies. I am writing this note here mostly for myself. It helps my clear up my thoughts and will also probably serve as a reference for quite a while. My thoughts may be hard to understand – sorry for that. But I thought I make them public when I write them down – even if they are not really targeted toward others. I still hope they may help you get some more background info.

So what’s to do:


- find a way to handle global settings
- multi-threaded output modules
a perquisite for
- create queued outputs (write to queue if action fails and
restart when it resumed)
- re-write way config file is read
probably perquisite for:
- create expression support
- in templates
- in selector filters
- create interface for (loadable) user function modules
- create a system to allow loading "library" loadable modules
(e.g. network library for imudp, imtcp, ...)
- separate GSSAPI from plain TCP (requires libs and lib extension system)

xmas rsyslog work log

I’ve been a bit busy with rsyslog over the xmas break. Here are the changes:

2007-12-25
– moved some more net functionality out of syslogd.c – stage work
– fixed duplicate license text in syslogd.c – made it ambigious
– moved udp net code – again, stage work
– moved some of the udp input code to its right place
2007-12-26
– moved cross-platform define for AI_NUMERICSERV to net.h
– made udp code somewhat less dependable on global variables — stage work
– removed omfwd code dependency on “finet”
– removed imudp code dependency on “finet”
– removed active INET code from syslogd.c – still some auxiliary things
remain
– fixed socket leak in omfwd.c
– removed global variable LogPort
– removed gloable variable AcceptRemote and external def of bFinished

Seasons Greetings to Everyone

My best wishes to everyone! Let me share this lovely impression:

fog and cool temperatures create a christmas wonderland - seen 2007 in GermanyI thought this image conveys much of the beauty of our planet earth and the hopefully peaceful holiday season. I wish all of you great holidays, nice gifts and time with your beloved ones.

In 2007, we’ve gone a long way. Both from an Adiscon perspective, with lots of new product releases and great features and also from the rsyslog point of view. And, of course, there were private highlights as well, for example my unforgettable trip to view space shuttle Discovery’s sts-120 launch. Thanks everyone for your support and all the kind words I received!

Once again, a great holiday season to all of you!

PS: if you enjoyed the image above, you may want to have a look at my xmas 2007 impressions gallery.

rsyslog work log for 2007-12-21

It has been a good day today! Finally, the alarm() call has been deleted! :) That was something long on my agenda, but I couldn’t do it without the redesign of the inputs. The alarm() was not really a big issue, but it became an annoyance to me because it was so hard to remove.

I would also like to mention that I will do only occasional work during the holiday period. So I do not expect more serious changes until early January. Some releases, however, are due next week (maybe 2.0.0).

Here is the detailed rsyslog worklog for today:

– removed no longer needed mutex from omfwd
– released a preview of 3.0.0 “as is” to mailing list – just to get the idea
– begun work on imtcp
– created first version of imtcp (still very much depending on syslogd.c for
configuration and a lot of other things)
– cleaned up code (resulting in some shuffeling from syslogd.c to the
“right” module)
– prepared for imudp
– created an initial version of imudp.c. The majority of UDP reception code
is now in that module and it is dynamically loadable. HOWEVER, that doesn’t
mean it is a proper module. There are still many, many dependencies on
global variables, cross-module calls and such. However, havin the code base
separated allows me to carry out some other cleanup before I return to
create a really clean implementation of these modules. So it is kind of a
stage work. Just don’t mistake it with “the real thing”…
– removed code no longer needed
– finally, alarm() has gone away :) — this is now done by the main thread
– some cleanup

rsyslog work log for 2007-12-20

Yesterday was a very busy day for rsyslog. I am on a good path to input modularization, but the hardest part needs still be done ;)

Here is the log:

– bugfix: fixing memory leak when message queue is full and during
parsing. Thanks to varmojfekoj for the patch.
– working on a potential race condition on the new input module
interface. See newsgroup posting for details on the issue:
http://groups.google.com/group/comp.programming.threads/msg/330b9675f17a1ad6
I tried some mutex operations but came to the conclusion that this
does not really help. So I have now switched to plain thread
cancellation, which so far seems to be OK. Need more practical
experience with other input modules to make a final decision. Thus
I leave all code in and have just disabled the problematic
code.
– implemented $klogUseSyscallInterface config directive
– implemented $klogSymbolLookup config directive
– moved unix socket code to its own module (imuxsock)
– implemented $OmitLocalLogging config directive
– bugfix: memory leak in cfsysline.c/doGetWord() fixed
– implemented $SystemLogSocketName config directive
– implemented $AddUnixListenSocket config directive
– MILESTONE reached: imuxsock initial version done
– removed single-threading support for sending TCP messages; caused
simplyfication of output module interface as well as core syslog
processing.
– moved udp send code to its own function

rsyslog work log…

Here is my recent rsyslog work log:

2007-12-18
– removed files from cvs that not belong there (thanks to Michael Biebl for
pointing that out)
– restructured #include’s somewhat thanks to Michael Biebl
– code cleanups thanks to Michael Biebl
– applied Michael Biebl’s patch to enhance $includeconfig to support
wildcard filenames
2007-12-19
– applied some more cleanup provided by Michael Biebl
– applied enhanced gss-api functionality provided by varmojfekoj
– GSS-API support for syslog/TCP connections was added. Thanks to
varmojfekoj for providing the patch with this functionality
– release 1.21.0
– added the -c option
– enhanced -c option support (some basics)
– bugfix: llDestroy() left the list with invalid root/last pointers

modules, core functionality and rsyslog v3…

As I have written, I have begun to work on rsyslog v3 (and so far I am pleased to say that I have made quite good progress on it). One of the things with rsyslog v3 is that it will have an even more module architecture, utilizing loadable modules for almost everything. I asked on the mailing list about backward compatibility and I received this very good response by Michael Biebl:

One thing I was wondering:

If you intend to shift all (even core) functionality into loadable modules, how do do you handle things like –help or available command line options like -m?

Do you want to hardcode it or will you provide an interface, where rsyslog will query the module about its help message and available options.

I’m also still a bit uncertain, if moving everything resp. core functionality to modules is a good idea (for problems you already mentioned). Imho having all core functionality in a single binary is simply much more robust and fool proof. For things like the SQL db output plugin, the module interface is great, because it avoids to pull in large library and package dependencies and allows to install them on a as need basis. For other functionality I still need to recognize the benefits.

Rainer, could you roughly sketch, how you envision to break rsyslog into loadable modules in v3. Which kind of functionality would be loadable as module, which functionality do you plan to keep in the rsyslogd binary. A listing of all (planned) modules + the provided functionality and requirements would really help.

Another thing: Say you move the regexp support into a separate module. If a regexp is then used in rsyslog.conf, will you bail out with an error, simply print a warning (which could go unnoticed and the poor administrator doesn’t know why his regexp doesn’t know) or load modules on demand.

For the latter you’d need some kind of interface to query the *.so files for their supported functionality. I.e. the modules would export a list of config directives it supports and rsyslog could upon startup query each available module and create a map.

So, e.g. the ommysql module would export its support for the :ommysql: config directive. Whenever rsyslog finds such a config directive it could/would load the modules on demand.

Same could be done for the command line parameters. The imklog module would export, that it supports the -m command line parameter. Whenever that commandline parameter is used, rsyslog would know which module to load.

There are only rough ideas and there is certainly still much to consider. But what do you think about the basic idea?

This is a great response – it not only asks questions but offers some good solutions, too. It comes at a perfect time, too, because there is much that is not yet finalized for v3. For sure I have (hopefully good ;)) ideas, but all of them need to be proven in practice. The issues that come up here are a good example.

So, now let me go into the rough sketch about I envision what v3 will do. Note that it is what I envision *today* – it may change if I get good reasoning for change and/or smarter solutions.

First, let me introduce two blog posts which you may want to read before continuing here:

And, most importantly, this post already has the root reasoning for pushing things out of the syslogd core:

Let me highlight the two most important parts from that later post:

This is exactly the way rsyslog is heading: we will try to provide an ultry-slim framework which offers just the basic things needed to orchestrate the plug-ins. Most of the functionality will indeed be available via plug-ins, dynamically loaded as needed.

With that design philosophy, we can make rsyslog really universally available, even on low-powered devices (loading just a few plug-ins). At the high end, systems with a lot of plug-ins loaded will be able to handle the most demanding tasks.

And this is actually what the v3 effort is all about: rsyslog should become as modular as possible, with the least amount of code in the core linked binary and everything else provided via plugins. I still do not know exactly how that will happen, I am approaching it incrementally. I am now at the input plugins and trying to set them right.

In the longer term, there will be at least three different types of plugins: output, input and “filter”. I think I do not need to elaborate about the first to. Filter plugins will provide work together with expressions, another feature to come. It will enhance the template and filter system to provide a rich expression capability supporting function calls. For example, a template may look like this in a future release:

$Template MyTemplate, substr(MSG, 5, 10) + “/” + tolower(FROMHOST) + “/”

and a filter condition may be

:expr:substr(MSG, 5, 10) == “error” /var/log/errorlog

Don’t bash me for the config format shown above, that will also change ;)

Regexpt functionality will then be provided by something like a regexp() function. Functions will be defined in loadable modules. Pretty no function will be in the core. A module may contain multiple functions.

Bottom line: almost everything will be a loadable module. If you do not load modules, rsyslog will not do anything useful.

Now a quick look at the command line options: I don’t like them. Take -r, for example. Sure, it allows you to specify a listener port and also allows to convey that a listener should be started at all. But how about multiple instances? How about advanced configuration parameters? I think command line options are good for simple cases but rsyslog will provide much more than can be done with simple cases. I favor to replace all command line options with configuration file directives. This is the right place for them to be. Except, of course, such things like where to look for the master configuration file.

Which brings up backward compatibility. As you know, I begin to be puzzled about that. After all, rsyslog is meant to be a drop-in replacement for sysklogd. That means it should run with the same options like sysklogd – and should also enable administrators to build on their knowledge with sysklogd. Tough call.

Thankfully, sur5r introduced the idea of having a compatibility mode. He suggested to look at the absence of a rsyslog.conf file and then conclude that we need to run in that mode. That probably is a good suggestion that I will pick up. It can also be extended: how about a, for example, “-c” command line switch. If absent it tells rsyslog to use compatibility mode. And it should absent in previous versions as well as sysklogd, because it was not defined there.

Now let’s think. If we know we need to provide compatibility, we can load a plugin implementing compatibility settings (again, moving that out of the core functionality). Once loaded, it could analyze the rest of the command line and load whatever modules are necessary to make rsyslogd correctly interpret a post v3 configuration file. That way we have a somewhat larger then necessary memory footprint, but all works well.

Then back to native mode. Here, indeed, I’d expect that the user loads each and every module needed. I assume, however, that for any typical package the maintainer will probably load all “core” functionality (like write to file, user message, several inputs, common filter functions, …) right there in the default rsyslog.conf. This make sense for today’s hardware. It also will make the config quite foolproof. A good way to implement that would work on the semantics of $IncludeConfig. How about:

$ModLoad /whereever/necessrayplugins/

which would load all plugins in that directory.

The key point, however, is that in a limited environment, the very same binaries can be used. No recompilation required. This would be scenarios with e.g. embedded devices – or security sensitive environments where only those components that are absolutely vital should run (which is good practice because it protects you from bugs in the not-loaded code).

I personally find it OK to handle the situation as described above. I don’t like magic autoloading of modules.

This modular approach has also great advantages when it comes to maintaining the code and making sure it is as bugfree as possible. Modules tend to be small, modules should be independent of each other. So testing and finding/fixing bugs that escaped testing should be considerably easier than with the v2 code base. There are also numerous other advantages, but I think that goes to far for this post…

Comments are appreciated. Especially if you do not like what I intend to do. Now is the time to speak up. In a few weeks from now, things have probably evolved too far to change some of the basics.

rsyslog changes for 2007-12-17

Yesterday’s rsyslog changes:

2007-12-17
– fixed a potential race condition with enqueueMsg() – thanks to mildew
for making me aware of this issue
– created thread-class internal wrapper for calling user supplied thread
main function
– solved an issue when compiling immark.c on some platforms. LARGEFILE
preprocessor defines are changed in rsyslog.h, which causes grief
for zlib. As a temporary solution, I have moved rsyslog.h right at the
beginnng of the include order. It’s somewhat dirty, but it works. I think
the real solution will be inside the autoconf files.
– moved thread termination code out to threads.c
– implemented $MarkMessagePeriod config directive
– command $ResetConfigVariables implemented for immark.c
– begun imklog, replacing klogd.c (finally we get rid of it…)
– implemented $DebugPrintKernelSymbols
– implemented afterRun input module interface function
– implemented $klogSymbolsTwice config directive

As you can see, it was quite a busy day. The input module interface has already materialized for the most part.