Internet Security … back in 2000

Wanna travel back in time? I just dug out one of my Internet security presentations from May 2000. It was aimed at the executive level, so there is not much technical detail in it. It is unfortunately in German, but non-German-speakers can probably still grasp the ideas.

Looking at it 13 years later, I have the feeling we have not so much evolved. Many of the issues still persist, even fake mail is still used – but to a far-less extent than what it was in 2000. At least business users (and mail systems) handle this decently … most of the time ;)

 

Log Anonymization with rsyslog

Starting with version 7.3.7, rsyslog natively support anonymizing log records. This is done with the help of a new module called mmanon. In short words, the module inspects the message and replaces IP-Addresses with an anonymized alternative representation.

This functionality is implemented via the (message modification module) action interface. Thus, the user has full control over when the anonymization happens. While it generally is desirable to anonymize as soon as possible, there can be situations where some data must be collected or processed un-anonymized. In that case, the anonymization can be deferred until after that processing.

The mmanon module basically has two operation modes: a faster but somewhat less secure and flexible mode (“simple”), in which specific octets are overwritten by user-configurable characters. For example, in this mode the IP address “10.1.65.123” could be replaced to “10.1.xx.xxx”. Note that the size of the specific octets is preserved.

In the other mode (“rewrite”), which is default, parts of the IP address are always zeroed out and the result is written in normalized form. The previous example would become “10.1.0.0” in this mode. This also means that the message size may shrink. In rewrite mode, the exact number of bits that shall be anonymized can be specified. The default is 16, but any other value may be selected. If, in the above example, 12 bits would have been selected, the output would have been “10.1.64.0”. This provides great flexibility to meet corporate and legal requirements. Note that this form still permits to use some coarse-grained analysis tools, like for example Geo-IP lookups (of course, depending on the number of removed bits).

Currently, mmanon does support IPv4, only. However, support for IPv6 is planned, we are just waiting for some feedback before going further. The new module is available immediately and can both be found in the source tarball as well as Adiscon-provided rsyslog RPMs and rsyslog Ubuntu Packages.

rsyslog TCP stream compression

I have begun to work on a way to “stream-compress” syslog messages over plain TCP syslog protocol, with the intent to support it over standard syslog as well if the idea works out.

Traditionally, rsyslog does message-level compression. That is each single message is compressed and if there is sufficient compression gain, the message is transmitted in compressed form. This works perfectly with UDP and TCP syslog, but the compression ratio is limited. The problem is that a single message does not offer much repetition to be shrinked. This mode still works surprisingly well.

However, we are now doing one step further: for TCP, we have a session, and so we are able to not only compress single messages but rather the full stream of them. That offers considerably larger compression potential. In its extreme end, it can be compared to gzip’ing a log file. Those of you who already did this note that we usually have very high compression ratios 5-to-1 or even 10-to-1 are not uncommon.

To gain these ratios, we need to run the compressor in a mode where it outputs data only when it decides it is ready to do so. This means that upon transaction completion, we may still have some data unsent (possibly even all data!). At the expense of compression ratio, this can be “solved” but forcing the compressor to flush at transaction end. This will degrade compression.

I have now done a first PoC to check the validity of the idea. It is implemented in omfwd and imptcp (NOT imtcp) only. Flushing at transaction end is currently not supported. We are right now practice testing this, and I hope to have some results when I am back from my trip to Tallinn.

rsyslog output plugin wrangling…

For some hours, I am fighting with parts of rsyslog design around 2006 (or so): initially, we thought that all actions will be terminated by canceling their thread WHEN they not shutdown within the queue shutdown timeout. Then, we saw that it was better to at least try it cooperatively (cancellation is still required if that does not work). Now, with imrelp, I have a situation where I need to pass some information down to librelp when it comes to termination. Supposedly a very simple thing to do (a single call). … unfortunately, the interface does not provide access to the action in question.

Hopefully, I finally found a work-around, via a “terminate immediately” pointer so far in use internally for the action engine. We’ll see…

Should I use rsyslog’s new or old config style?

I got a very interesting question on the rsyslog support forums, and I thought I share it, together with the answer, here at a more prominent spot:

After over a decade of using stock bsd syslog, I finally have a need to do some more complicated processing of logs (splitting off Postgres query logs from general Postgres logs), and after looking at other options (basically syslog-ng), I think rsyslog looks like a better fit. I’m mainly in it so I can use regex matching, but thinks like the log queueing and being able to easily move to db storage in the future look good.
Since I’m new, I’d considered that I might get a jump on things by sticking with the newest config syntax. But after doing some googling for examples and looking at the examples in the rsyslog wiki, it seems like maybe the newest syntax might be a bit too new for a beginner – I learn best by example.
Are there any serious downsides to NOT going with the most current syntax?

The answer is that the old syntax is still fully supported by the versions and will probably remain for quite some while (except for some very few exceptions, which we couldn’t carry over for good reasons – this is documented in the compatibility docs on the web site). Some parts of it are considered so important that they most probably never will go away. Actually, if you want to do simple things, the old syntax has its advantages. The more complex your processing gets, the more you benefit from the new syntax. But you can mix and match new and old style in almost all cases.

So my suggestion would be to get started using the old syntax and as soon as you begin to do more complex things, you can switch over to the new style. That’s actually the way it is designed ;) A good indicator of when it would be benefitial to move to new style is when you begin to use a lot of directives beginning with $, especially if they modify an action. Also, if you move to action queues, I would strongly suggest to use new style. It is far more intuitive an less error-prone.

To provide a bit more background information, there is an important non-technical reason why the classical syntax is remain for a long time: basic syslog.conf format is extremely well known, covered in a lot of text books, taught in numerous courses and used in a myriad of Internet tutorials. So if we would abandon it, we would thrash a lot of people’s knowledge and help resources. In short: we would make it much harder for folks that it would actually need to be. This has never been rsyslog philosophy. Providing the ability to changed gradually and with growing needs is a core goal.

multi-character field delimiters

On the rsyslog mailing list, the ability to use multiple characters as field delimiters had been requested recently. Today, I took some time off the my schedule and implemented that functionality. It is probably very useful for a number of cases. An important one is probably in combination with control character escaping, where rsyslog by default expands a single character into a four-byte escape “#ooo” with o being the octal character code (so  e.g. US ASCII HT [horizontal tab] becomes “#011”).

The new functionality is available for the RainerScript field() function. I do not intend to add it to template strings.

Some quick usage sample:

The following is the traditional way of single-byte delimiters, here with the comma character (US ASCII decimal code 44):

set $!usr!field2 = field($msg, 44, 2);
template (name=”fld” type=”string” string=”‘%$!usr!field2%’ — msg: %msg%n”)
action(type=”omfile” file=”/path/to/logfile” template=”fld”)

And this is the same with the string “#011” as delimiter:

set $!usr!field2 = field($msg, “#011”, 2);
template (name=”fld” type=”string” string=”‘%$!usr!field2%’ — msg: %msg%n”)
action(type=”omfile” file=”/path/to/logfile” template=”fld”)

Note that the field number (index) need not necessarily to be fixed. It can be derived from an appropriately formatted message. Here the first field contains the actual field to extract, delimiter is “#011” again:

set $!usr!idx = field($msg, “#011”, 1);
set $!usr!field = field($msg, “#011”, $!usr!idx);
template (name=”fld” type=”string” string=”‘%$!usr!field%’ — msg: %msg%n”)
action(type=”omfile” file=”/path/to/logfile” template=”fld”)

In that last sample the $msg of

“3#011val 1#011val 2#011val 32#val 4”

would return

“val 2”

Keep in mind that the first field is the field index, so the actual data fields start at 2 (field 1 is “3”, field 2 is “val 1”, field 3 “val 2” and so on).

This functionality is already present in git master head and will be released as part of 7.3.7 in the not so distant future. Some more details can be found inside the RainerScript documentation page.

performance of liblognorm/rsyslog parse tree

We are currently considering a new table lookup feature for rsyslog, and as usual, performance is a priority. As such, we intend to base the implementation on the liblognorm parse tree, which is highly efficient. In fact, the adaption (maybe a partial rewrite) I have in mind for rsyslog actually has O(1) time complexity. This drew some discussion, as parse trees are usually considered  to be O(log n). So I thought I quickly lay out why constant time complexity is possible.

To set some ground, let’s define the use case where this applies.We are talking about string matches, not regular expressions. In the case of liblognorm, we are also talking about exact string matches, and let me assume this for rsyslog’s lookup table functionality as well (at least in “string” mode). Note that this may be extended, and should be no problem, but I have yet to properly evaluate the effect on the parse tree. So let’s assume exact matches in this blog post. So basically, we have a set of strings and would like to see if there is an exact match (and pick up some value associated with that match). For example, we may have the strings “Berlin”, “Boston”, “Bogota” and “London”. That example is obviously extremely simplified, yet it will help us understand.

Whenever we talk about time complexity, we need to make sure what exact n we are talking about. When it comes to matching values, n is usually the number of different values. In our example, this n would be four. Of course, one could also look at string lengths, as the length obviously influences the complexity of the comparison operation. Let’s call this m, then in the above sample m would be six (length of longest string). If we take the unusual step to include that m, a linear list would have time complexity O(n*m), a binary tree O(log n*m) and rsyslog’s parse tree O(m). HOWEVER, there usually is an upper bound on string length, and for that reason m is taken as a constant in almost all cases. By definition, it will then no longer be mention in O notation, and so we end up with the usual view that a linear list is O(n), a binary tree O(log n) … and rsyslog’s parse tree O(1).

This is possible because rsyslog’s parse tree is based on the radix tree idea and kind of “compresses” the strings rather than including them in every node. The conceptual in-memory representation of a tree based on above values is as follows:

[root] - B --- O --- GOTA
  |      |     + --- STON
  |      +---- ERLIN
  + ---- LONDON

As you can see, the tree is actually based on single characters that are different between the strings. So we do not have each string at the tree level, but only the rest of what is different. On this works a simple state machine. On a conceptual level, the search string is processed character by character, so each character represents a state transition. For obvious reasons, there can be only as many state transitions as there are characters inside the longest string. This is what we called m above. However, the number of different strings is totally irrelevant for the search, because it does not influence the number of state transitions required. That number is what we called n. So in conclusion, the time complexity is only depending on m, which is kind of constant. Consequently, we have O(1) complexity.

Just to stress the point once again: we do not iterate over the individual strings in a tree-like manner, instead we iterate over the index string characters themselves. This creates a very broad and shallow tree, even for a large number of index values.

On real hardware, the number of strings of course has some slight effect, especially when it comes to cache hit performance. While a small tree may fit into the available cache, a much larger tree obviously will not. However, from a theoretical point of view, this does not matter, we still have constant time for the lookup. From a practical perspective, this also does not matter so much: any larger tree will have a larger memory requirement and thus a lower cache hit rate. In fact, rsyslog’s parse tree will probably perform better because of its compressed structure, which makes cache presence of at least top-level elements more likely.

For rsyslog, that’s all I need to tell. For liblognorm, the story is a bit longer. It is only “almost” O(1). While the string matches themselves can be carried out in constant time, liblognorm also supports parsers to pull out specific parts of a log message. The current parsers are all O(1) as well, so this does not change the picture. However, state transitions involving a parser are no longer simple operations where the transition is immediately clear. If, for a given prefix, multiple parsers may apply, liblognorm must iterate through them until a matching one is found. This is no longer O(1), but depends on the number of parsers. Of course, we can claim that the number of parsers is still a (even small) constant, and so we can still say all of this is O(1) without violating the definition. While this is true, there unfortunately comes another complexity with parsers: With constant string, backtracking is never required, as we immediately know the exact state transition. With parsers, we have dynamic elements, and later text matches may turn out to be a non-match. In that case, we need to go back to parser detection and see if another parser also fits. If so, we need to give it a try and need to re-evaluate the parse tree from that point on. This process is called backtracking and can ruin the runtime to become as worse as O(n^2), just like regular expressions. Obviously, the problem exists primarily of very generic parsers are used, like “word”. If, for example, parsers as specific as IPv4 are used, there is no chance for more than one parser matching. So the problem cannot occur. This is also why I caution against using too-generic parsers. Thankfully, very generic parsers are not used too frequently, and so we usually do not see this problem in practice. This is why I call liblognorm to be “almost” O(1), even though it may degrade to O(n^2) in extreme cases. Heuristically said, these extreme cases will not occur in practice.

The rsyslog parse tree does not use parsers inside the parse tree, just the fixed state transitions. As such, it actually is O(1). Oh, and yes: we should probably call the rsyslog “parse tree” better a “lookup tree” or “search tree” but for historical reasons it currently is named the way it is…

On backporting patches

Every now and then, I get (polite) requests to backport some patch to older versions. First of all, let me say that I usually patch things in the oldest affected (and supported) version and then “upport” these to the newer releases. So in those cases, everything is already covered (it pays to look into the ChangeLog file in the project root directory). However, “supported” is a keyword. I usually do not backport patches to older, unsupported versions (except if there exists good reasons and the version is in widespread use).

Let me elaborate a bit why I don’t do that: backporting all patches to the many (unsupported) versions of rsyslog we have is a very time-consuming task (even though it may be rather simple, as probably in this case). So the more I backport, the less new stuff I get done. I went through this last year and it resulted in a *very considerable* productivity reduction. I am glad I got rid of all that in the later half of 2012, and so I am hesitant to go back to that business again. Just for the records: we of course support *all* versions for support contract customers, but then it is of benefit to the project because it provides funding.  Whereas it probably is even counter-productive in other cases: the more I backport, the less the interest in moving to new releases and so we stay with all that outdated stuff.   Also, if it is very simple for package maintainers to stick to outdated stuff, they probably have few incentive to move to new ones (to the disadvantage of many users). So if they want to stick with the old stuff, let them *they* pay the price for that and not *the project* ;) There are exceptions of course, and the maintainers I work with know that very well ;)

rsyslog disk queues: refactor or redesign? – part I

I am currently thinking about the design of rsyslog’s disk queues. There are some things that I really like about them, but there are also lots of things I dislike. This blog post is meant to save as a basis for discussing whether the disk queue system should better be refactored or completely redesigned. The actual discussion should please be kept on the mailing list. I just blogged because this makes it easier to have a stable reference on where that whole thing started. This is the first in a series of ? blog posts. It describes the current state as of now (rsyslog 7.3.1 and below).

First of all, it must be noted that both serious refactoring as well as redesign are rathter costly. So I don’t know when I can actually do that – it probably depends on the fact if there are some sponsors at least for parts of the work available. With that said, let’s get down to technology:

The disk queue system is designed to be used primarily in exceptions (like remote destination being down for a longer period of time) or for high-reliability environments. A main design goal was that they should not take too much disk space, at least not more than necessary at any point in time. Also, human operators should be able to fix things (or at least understand them), when things go really wild.

To solve these needs, queue data is stored in pure US-ASCII text form, without any kind of compression or binary storage. The encoding used is rather simple it’s basically name, type and value of any property, one per line (the details can be found in the relevant source files). The “not too much space” requirement is solved by using a set of rotating sequential files, where each file has a configured maximum size (10MB by default) [to be technically precise, the maximum is not totally enforced, a complete message is persisted to the file, even if that goes slightly above the maximum]. There is a kind of head and tail pointer into that rotating file set: the head is used by new messages coming in. They are always appended to the most recent file. The tail pointer is where messages are dequeued from, it alwass points to the oldest file inside the file set. Originally, there were just these two pointers, and a file was deleted as soon as the tail pointer moved past the end of the current file. However, this was modified to cover the ultra-reliable use case in v5: The problem with two pointers was that when messages were dequeued at the end of the tail file, that file was immediately deleted. When now something went wrong, the potentially unprocessed messages were lost. To prevent this, v5+ has actually three pointers: the head pointer (as before) a read pointer (former tail) and a delete pointer (kind of “real tail”). The read pointer is now used to dequeue the messages, and the delete pointer is used to re-read messages from the file after they have been processed and delete queue files, if they go out of the current file set while doing so (this actually is exactly the processin that the former tail pointer did). The bottom line is that each message is written once but read twice during queue processing.  The queue also maintains a state (or “queue info” – .qi) file, which among others contains these pointers. The user can configure how often data shall be synced to disk. The more often, the more robust the queue is, but it also gets slower as part of that process. To flush, both the current write needs to be flushed as well as the state file. For the state file, this currently means it is opened, written and closed again. At the extreme, this happens for each record being written (but keep in mind that the queue file itself is always kept open). It is also worth mentioning that all records are of variable size, so there is no random access possible to messages that reside inside the queue file.

From the threading point of view, queue operations are naturally serial (as sequential files are involved). As such, a queue usues exactly one worker, no matter how many worker threads the user has configured. In case of a disk assisted (DA) queue, there are as many workers as configured for the in-memory part of the queue and one worker for the disk part of the queue. Note that all of these workers run in parallel. That also means that message sequence will change if disk and in-memory mode run at the same time But this comes at no surprise: for obvious reasons, this is always the case when multiple workers process queue messages in parallel (otherwise, the parallel workers would need to serialize the, effectively running a single worker thread…). To keep the code simple later versions of rsyslg (late v5+) keep the disk queue open once it has been started. First of all, this removes some complex synchronization issues inside the rsyslog core, which would otherwise not only complicate the code but slow things down, even when no disk queue is active. Secondly, there is a good chance that disk mode is needed again if it was initiated once, so it probably is even very smart to make this as painless as possible. Note that during rsyslog shutdown, the disk queue files are of course fully be deleted.