new rsyslog config system materializes…

The past weeks I have worked pretty hard on the new rsyslog config system. The legacy system was quite different from what you would expect of modern software. Most importantly, the legacy system applied directives as the config file was read, which made it extremely hard to restructure the config file format. That also prevented features like privilege drop from working fully correctly.

I have now basically changed the config system so that there is a clear separation between loading the config and applying it. Most importantly, this means privilege drop now works correctly in all cases (but I bet some users who took advantage of oddities of the old system will complain soon ;)). Other than that, there are no user-visible enhancements at the moment. However, the internal plumbing has changed dramatically and enables future changes. Most importantly, this finally creates a path to a new config language, as we now have a clear interface as part of the in-memory representation of the config, which is config language agnostic.

With this initial release, there are still some things inside the core that can be optimized. Right now, the system aims at the capability to have multiple config objects loaded (but not active) at the same time. However, there are a few data instances where this is not cleanly followed, in order to reuse some existing code. That is not a problem, because the rest of the rsyslog engine does not support dynamic config reload (and thus multiple configs at runtime) at all.

Also, it must be noted that the current code is quite experimental, so there is some risk involved in running the initial 6.3.0 version. However, all the dramatic changes were made to the config system. That means if the system initializes correctly, it will most probably run without any issues. The risk window is constrained to the initial startup, which should be quite controllable. Users of privilege drop are advised to check that their configurations work as expected. The previous system did some initialization with full privileges; this is no longer the case, except for modules that actually require full privileges (e.g. imtcp to bind privileged ports). Most importantly, files are now created with dropped privileges right from the beginning. I expect that some (unclean) configurations will run into trouble with that. The good news is that they would run into trouble with older releases as well, but only after a HUP. Now things break immediately, which makes them much easier to diagnose.

So what’s next in regard to the config? It depends a bit on the overall workload. I will probably try to have a look at the config language next, which is another non-trivial task. Also, past discussions tell me that it is extremely hard to find a format that satisfies all needs. I have already reviewed the last detailed discussion (June and July 2010 – search for “conf” on these pages) and have begun to reconsider some of the options. But this is probably a topic for a separate blog posting…

rsyslog config reload – random thoughts

This blog post is more or less a think tank, maybe even a utility to clear my mind. Please note that I am not talking about anything that is present in rsyslog right now. I am not even saying that it will be there in the near future. But I’d like to think a bit about alternatives on the route there.

Let’s assume rsyslog shall have two abilities:

  • use different config languages
  • dynamically reload a config without a full restart (thus applying a delta between new and old config)

In any case, the usual approach is to have an object representing a full configuration. This is an in-memory object. Usually, it is created while parsing the configuration file(s). During that parsing, nothing of the new config is actually carried out; just the in-memory representation is built. In that model, it would also be possible to have several fully populated config objects in core at the same time. The important thing to note is that none of them actually affects the current system – they are just loaded and ready to use.

Usually, in such a design, there is one thing that is called the “running conf”. This configuration is the one the system actually uses for processing.

So how is a new configuration activated? In a first step, the config parsers create an in-memory object. Once this is done, that object is a candidate config, one that could be activated. To actually activate it, the candidate config is loaded as the running config. During that process, all the settings are applied and services are started. Please note that a dynamic config reload can be done by first creating a delta between the candidate config and the current running config. This delta can then be used to keep the currently existing config running, but modify it so that it is equivalent to the candidate config. This process can be less intrusive than shutting down the running config and restarting it based on the candidate config. As an example, a rsyslog system may have several hundred incoming TCP connections open. If the delta is just the addition of a new output file, there is no need to shut down these TCP connections as part of the (delta-driven) candidate config activation. If no delta were used, by contrast, all connections would need to be shut down and re-established after the restart. There is obvious benefit in delta-based config activation. However, it should be noted that there are many subtle issues associated with creating and applying the delta.
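
To make the two phases more tangible, here is a minimal sketch in C of the load/candidate/running life cycle described above. All names are illustrative only and do not reflect rsyslog’s actual internal API.

  #include <stdlib.h>

  typedef struct config {
      /* in-memory representation: rules, actions, module instances, ... */
      int placeholder;
  } config_t;

  static config_t *runningConf = NULL;  /* the config the engine actually uses */

  /* Phase 1: parse the file(s) and build an in-memory object.
   * Nothing of the new config is carried out here. */
  config_t *loadConf(const char *file)
  {
      config_t *cnf = calloc(1, sizeof(config_t));
      (void)file;  /* parsing omitted in this sketch */
      return cnf;  /* now a candidate config: loaded, but not active */
  }

  /* Phase 2: activate a candidate config. With delta support, only the
   * differences to the running config would be applied instead of a full
   * teardown and restart. */
  void activateConf(config_t *candidate)
  {
      /* apply settings, start services, open listeners, ... */
      runningConf = candidate;
  }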

Thinking about such a loaded/candidate/running config system for rsyslog, there is one overall issue that complicates things: loadable modules! Each module not only provides extra functionality, it also provides a set of configuration settings to modify its functionality. As such, we have a problem with config parsing: in order to fully parse a config file, we actually need to load the module as part of config processing. Or, more precisely, we need to load its configuration file processor. However, it does not make sense to split each module into a config file processor and the actual module, at least I think so. Splitting them up would make things over-complex IMHO. However, we must demand that a module, in its config processing code, must not do anything other than creating configuration objects. Most importantly, it must not start any service or carry out any non-config-related activity. If it did, it would possibly affect the current running configuration. Also, config processing does not necessarily mean that the config will actually be activated, so the module must not assume that the config it helps to build will ever be put into action. As a side note, this is one of the issues with the current legacy configuration system (as seen in all versions prior to v6 and early v6 versions).
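
To illustrate that constraint, here is a small, purely hypothetical sketch (the names are made up and not the real module interface): the config-parsing hook only fills in a module-specific config object, while anything that touches the live system belongs into a separate activation hook that runs only if that config actually becomes the running config.

  #include <stdlib.h>
  #include <string.h>

  /* module-specific config instance, built during config parsing */
  typedef struct modConfData {
      int listenPort;
  } modConfData_t;

  /* allowed during config parsing: only build up the config object */
  modConfData_t *parseModParam(modConfData_t *cnf, const char *name, const char *value)
  {
      if (cnf == NULL)
          cnf = calloc(1, sizeof(modConfData_t));
      if (strcmp(name, "port") == 0)
          cnf->listenPort = atoi(value);
      /* NOT allowed here: binding sockets, starting threads, writing files */
      return cnf;
  }

  /* called only if (and when) the config this instance belongs to is activated */
  void activateMod(modConfData_t *cnf)
  {
      /* only now bind the listener on cnf->listenPort, start threads, ... */
      (void)cnf;
  }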

Rsyslog traditionally keeps a list of loaded modules. Modules are added to that list when they are loaded inside the configuration system. During system startup, that very same list is also used to activate services inside the module. So that single list serves both as

  • a registry of loaded modules (e.g. to know what is already available and what needs to be unloaded)
  • a registry of modules required for the configuration

Both functions are tied together into a single list because rsyslog currently has only the concept of a single configuration, and not a candidate/running (multi-)config system. For the latter, it is necessary to differentiate between the two cases:

As we need to load modules during config parsing, we still need a single global list that keeps track of the modules already present inside the system. Please note that with a multi-config system, a module that is first “loaded” inside a configuration file currently being parsed may actually already be loaded inside the system. In that case, a duplicate load must be avoided and the already existing module must be used. The global, config-independent list is required to support this functionality.

On the other hand, such a global list can no longer be used to activate services for a specific config. This is easy to see when we have a config A which uses e.g. a TCP listener and we have a config B which does not. If B is activated, the TCP listener shall obviously not be activated. As such we need a dedicated, config-specific list of modules that are part of the current configuration. Let’s call this one the “config module list” and the other one the “loaded module list”.

The loaded module list is then just used to keep track of which modules are loaded. Also, it will/can be used to locate a module, so that global functions (like the config parser) can be found and invoked. Note that I call the config parser global, not config-specific. The reason is that the config parser does not take a config instance as input; it has no input other than the config language itself, and it produces the config instance as output. As such, it emits config-specific data, but does not require it for processing. So it is global.

The config module list, in contrast, must hold all config-specific data elements for the module (most importantly the module-specific config instance itself). The config module list is to be used for all config-specific actions. For example, it will be used to activate a module’s services when the candidate config becomes the running config (maybe via a delta-apply process).

Note that on-demand module unload can be done via reference counting, which is already implemented in rsyslog. When a module is put onto a config module list, the count is incremented. If it is removed from such a list (usually because the in-memory config is destroyed), the count is decremented. An unload happens when the reference count reaches zero. If the module is later required while processing another config, that triggers a reload just as if the module had never been loaded before.
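
A rough sketch of how the two lists and the reference counting could fit together (again, illustrative names only; the actual rsyslog code differs):

  #include <stdlib.h>
  #include <string.h>

  typedef struct module {
      char *name;
      int   refCount;      /* number of config module lists referencing us */
      struct module *next; /* chaining inside the global loaded module list */
  } module_t;

  static module_t *loadedModules = NULL;  /* global, config-independent list */

  /* obtain a module for a config being parsed: reuse it if already loaded,
   * otherwise load it exactly once and add it to the global list */
  module_t *moduleObtain(const char *name)
  {
      module_t *m;
      for (m = loadedModules; m != NULL; m = m->next)
          if (strcmp(m->name, name) == 0)
              break;
      if (m == NULL) {
          /* dlopen() and symbol lookup would happen here */
          m = calloc(1, sizeof(module_t));
          m->name = strdup(name);
          m->next = loadedModules;
          loadedModules = m;
      }
      m->refCount++;  /* now referenced by one more config module list */
      return m;
  }

  /* release a module when a config object (and its config module list) is destroyed */
  void moduleRelease(module_t *m)
  {
      if (--m->refCount == 0) {
          /* unlink from the global list and unload (dlclose()) here */
      }
  }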

Note that a clearly defined and implemented split between global and configuration-specific functionality is of vital importance for a multi-config system. This probably has some subtle issues as well. Right off the top of my head, I can think of the problem of some potentially global configuration settings (a term that seems to contradict itself in this context – just think a bit about it…). For example, we have the module search path, which tells us from where to load modules. With different configs, we can potentially have different module search paths. That, in turn, can lead to modules with the same name being loaded from different locations. That means we could potentially have different functionality, including different sets of config parameters (!), in the system at the same time. This could lead to some hard-to-diagnose issues. So it looks necessary to have the ability to load the same module via different paths concurrently, and apply only the “right” module to the config in question. Looking at the current code base, implementing this would be even harder than just splitting out the global/config-specific lists. Maybe this is something that should be added at a later stage, *if* we take that road at all. Also, there may be other issues along the way that I do not currently envision…

coding new config format begun

After a long discussion about potential new config formats for rsyslog, we came to the conclusion that the current format is not as bad as I thought and just needs scoping (OK, the whole story is longer, but I can’t repeat that lengthy discussion in a single blog post, see mailing list archive for details).

After some thinking, I finally started coding today.

I have begun to implement action scoping. A snapshot with partial functionality is available at

http://git.adiscon.com/?p=rsyslog.git;a=shortlog;h=refs/heads/newconf

It does NOT yet include the necessary output plugin interface change (or updated plugins), but it implements

$Begin action
$End action
$StrictScoping [on/off] (default: off)

So if you want to play a bit with it, feel free to do so. Note that it disallows global statements within action scope and overall has somewhat better detection of user errors (these are easier to detect when scoping is used).

Note that scoping is purely optional: if not enabled, it will not be used. So any current rsyslog.conf remains valid.
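
For illustration, a scoped configuration snippet could look roughly like this (the action content is just a placeholder; the point is the structure):

  # enable strict scoping (optional; the default is off)
  $StrictScoping on

  $Begin action
  # only action-scoped statements are valid in here; a global statement
  # at this point is now flagged as a configuration error
  ...
  $End action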

I will see that I change the project’s output plugins next week, and will possibly then make an experimental release.

Why I think Lua is no solution for rsyslog

During our discussion about the new rsyslog config format, it was suggested a couple of times to use Lua inside rsyslog. So far, I reject this idea for a couple of reasons, and I thought it is time to write them up.

But first of all, let me explain that I do not dislike Lua. I think it is a very good tool, and it can be extremely useful to embed it. What I question is whether Lua is the right thing for rsyslog.

Let’s first think about what rsyslog is: it is a very high-speed, very scalable message processor that handles message processing via fine-tuned algorithms and with a lot of concurrency. As one important detail, the underlying engine can be seen as a specialized SIMD (single instruction, multiple data) computer, that is, a machine that is able to execute the same instruction on multiple data elements concurrently. Speaking in rsyslog terms, this means that a single rule will process multiple messages “at once” in a concurrent operation (to be precise, the level of concurrency depends on a large number of factors, but let’s stick with the simplified view). Also, rsyslog is able to execute various parts of a ruleset concurrently to other parts. Some of these work units can be delayed for a very long time.

So a rsyslog configuration is a partially ordered set of filters and actions, which are executed in parallel based on this partial ordering (thus the concurrency). Each of the parallel execution threads then works in a SIMD manner, with the notable fact that we have a variable number of data elements. All communication between the various parts is via a message passing mechanism, which provides very high speed, very high reliability and different storage drivers (like memory and disk). Finally, speed is gained by state-of-the-art non-blocking synchronization and proper partitioning (quite honestly, not much of this exists yet, but the work is scheduled).
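
To make the SIMD analogy a bit more concrete, here is a heavily simplified, purely illustrative C sketch: one “instruction” (a filter plus its action) is applied to a whole batch of messages, the batch being the “multiple data”. The real engine is far more involved, but this is the basic shape of the idea.

  #include <stddef.h>

  typedef struct msg   { const char *text; } msg_t;
  typedef struct batch { msg_t *msgs; size_t nMsgs; } batch_t;

  typedef int  (*filterFunc)(msg_t *m); /* does the message match? */
  typedef void (*actionFunc)(msg_t *m); /* what to do with a match  */

  /* process one rule against the whole batch; a real engine can additionally
   * run several such steps concurrently on independent parts of the rule set */
  void processRule(batch_t *b, filterFunc filter, actionFunc action)
  {
      for (size_t i = 0; i < b->nMsgs; ++i)
          if (filter(&b->msgs[i]))
              action(&b->msgs[i]);
  }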

Then, there is Lua. Reading about the implementation of Lua 5.0, I learn that Lua employs a virtual machine and all code is interpreted. Also, unlike rsyslog’s engine, Lua’s VM is a VM for a programming language, not a message processor (not surprisingly…). Thus, its optimized statements are control-of-flow instructions. I don’t see anything that permits SIMD execution. That alone, based on rsyslog experience, means a 400 to 800% drop in performance (just look at the rsyslog v3 engine, which did not have this mode). Note that this is the difference between being able to process e.g. 200,000 messages per second (mps) and 50,000 mps. That in turn is already argument enough for me to reject the idea of Lua inside rsyslog.

But it gets even worse: not surprisingly for something that claims to be simple and easy, Lua’s threading is limited and makes it somewhat hard to integrate with threaded code (which in itself would not be an issue, as the core part of rsyslog would be replaced by Lua if I were to use it). So coroutines would probably be the way to go. Reading the coroutines tutorial, I learn:

“Coroutines in Lua are not operating system threads or processes. Coroutines are blocks of Lua code which are created within Lua, and have their own flow of control like threads. Only one coroutine ever runs at a time,”

Which is really bad news for rsyslog, which tries to fully utilize highly parallel hardware.

“and it runs until it activates another coroutine, or yields (returns to the coroutine that invoked it). Coroutines are a way to express multiple cooperating threads of control in a convenient and natural way, but do not execute in parallel, and thus gain no performance benefit from multiple CPU’s.”

That actually does not need a comment ;)

“However, since coroutines switch much faster than operating system threads and do not typically require complex and sometimes expensive locking mechanisms, using coroutines is typically faster than the equivalent program using full OS threads.”

Well, with a lot of effort going into threading, rsyslog is much faster on multiple CPUs than on a single one.

So weighing this, we are back to how the rsyslog v1 engine worked. I don’t even have performance numbers for that at hand, it was too long ago.

Looking at the Lua docs, I do not find any reference to mapping the partial order of execution onto something that gains concurrency (and given the single-threadedness, there is no point in doing so…).

I have not dug deeper, but I am sure I will also find no concept of queues, message passing, etc. If at all, I would expect such concepts in an object-oriented language, which Lua claims not to be primarily.

So using Lua inside rsyslog means that I would remove almost all of those things that helped me build a high-performance syslogd, and it would also remove lots of abilities from the product. The only solution then would be to heavily modify Lua. And even if I did, I wonder whether its maintainers would like the direction I would need to take, as this would add a lot of extra code to Lua, code that supposedly is not needed for the typical simple (read: not highly parallel) applications that use Lua.

This is why I reject Lua, as well as other off-the-shelf script interpreters, for rsyslog. They do not reflect the needs that a high-speed, high-concurrency message processor has.

Of course, I’d like to be proven wrong. If you can, please do. But please do not state generalities; rather, tell me how exactly I can gain SIMD capability and lock-free multi-threaded synchronization with the embedded language of your choice. Note that I am not trying to discourage you. The problem is that I have so often received suggestions that “this and that” is such a great replacement for the config, invested quite some time in the evaluation, and always ran into the same problem. So before I do that again, I’d like to have some evidence that the time to evaluate the solution is well spent. And support for rsyslog’s concurrency requirements is definitely something we need.

I would also like to add some notes from one (of the many) mailing list posts about Lua. These comments, I hope, provide some additional information about the concerns I have:

David Lang stated:
> If speed or security are not major issues, having a config language be
> a
> snippet of code is definantly convienient and lets the person do a huge
> number of things that the program author never thought of (see simple
> event correltator for an example of this), but in rsyslog speed is a
> significant issue (processing multiple hundreds of thousands of logs
> per
> second doesn’t leave much time) and I don’t think that an interpreter
> is
> up to the task. Interpreted languages also usually don’t support
> multi-threaded operation well.

And I replied:

This is a *very* important point. And it is the single reason why I
re-thought about RainerScript and tend not to use it. While (in design) it
can do anything I ever need, the interpretation is too slow — at least as
far as the current implementation is concerned. I have read up on Lua, and
there seem to be large similarities between how Lua works and how
RainerScript actually (in filters!) works. Let me assume that Lua is far
more optimized than RainerScript. Even then, it is a generic engine and
running that engine to actually process syslog data is simply too slow.

That matters for the high data rates we need. Using my test lab as an
example, we are currently at ~250,000 mps. The goal for my next performance
tuning step will be to double that value (I don’t know yet when I will start
with that work). Overall, the design shall be that rsyslog almost linearly
scales with the number of CPUs (and network cards) added. I’ve made a couple
of design errors in that regard in the past, but now I am through with that, have
done a lot of research and think that I can achieve this nearly-linear
speedup in the future. That means there will no longer be an actual upper
limit on the number of messages per second rsyslog can process. Of course,
even on a single processor, we need *excellent* performance.

For the single-processor case, this means we need highly optimized, up-to-the-task
algorithms that don’t do many things generically.

For the multi-processor case, that means we need to run as many of these tasks
as possible truly concurrently.

For example, in the last performance tuning step, I radically changed the way
rules are processed. Rather than thinking in terms of messages and steps to
be done on these, I now have an implementation that works, semi-parallel, on
the batch as a whole and (logically) computes sub-steps of message processing
concurrently to each other (to truly elaborate on this would take a day or
more to write, thus I spare the details).

I don’t think any general language can provide the functionality I need to do
these sorts of things. This was also an important reason that led to
RainerScript, a language where I could define the level of granularity
myself. The idea is still not dead, but the implementation was approached
wrongly. But I have become skeptical about whether a language is the right
approach at all.

Also note the difference between the config and the runtime engine. Whatever
library / script / format / language we use for the config will, for the reasons given
above, NOT be used during execution. It can only be used as a meta-language
to specify what the actual engine will do.

So if we go for Lua (for example), we could use Lua to build the rsyslog
config objects. But during actual execution, we will definitely not use Lua.
So we would need a way to express rsyslog control flow in Lua, which probably
would stretch the spec too far. Note that a Lua “if then” would not be
something that the engine uses, but rather be used to build a config object.
So we still have the issue of how to specify an “rsyslog engine if then” inside
a Lua script. Except, of course, if you think that Lua can do regular
processing, which I ruled out with the arguments above.

And a bit later I added:
The order of execution of the tasks inside a given configuration
can be viewed as a partially ordered set. Some of the tasks need to be
preceded by others, but a (large) number of tasks have no relationship. To
gain speed and scalability, the rsyslog engine tries to identify this partial
order and tries to run those tasks in parallel that have no dependency on each
other. Also, one must note that a config file is written with the assumption
of a single message traversing the engine, which is a gross simplification.
In practice, we now have batches (multiple messages at once) traversing
through the engine, where a lot of things are done concurrently and far
different from what one would expect when looking at the config file (but in
a functionally equivalent way). It is this transformation from the in-sequence,
single-message view to a partial-execution-order, parallel view that provides
the necessary speedup to be able to serve demanding environments.