rsyslog template plugins

As I have written yesterday, I am evaluating the use of “template modules” in rsyslog.

In that post, I mentioned that I’d expect a 5% speedup as proof that the new plugin type was worth considering. As it turns out, this method seems to provide a speedup of 5 to 6 percent, so it seems to be useful in its own right.

After I had written yesterday’s post, I checked what it would take to create a test environment. It turned out that it was not too hard to change the engine so that I could hardcode one of the default templates AND provide a vehicle to activate that code via the configuration file. Of course, we do not yet have full loadable modules, but I was able to create a proof of concept in a couple of hours and do some (mild) performance testing on it. The current code provides a vehicle to use a C-function-based template generator. It is activated by saying

$template tpl,=generator

where the equal sign indicates to use a C generator instead of the usual template string. The name that follows the equal sign will probably later become the actual module name, but is irrelevant right now. I then implemented a generator for the default file format in a very crude way, but I would expect that a real loadable module will not take considerably more processing time (just a very small amount of calling overhead after the initial config parsing stage). So with that experimental code, I could switch between the template-based default file format and the generator-based format, with the outcome being exactly the same.
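To illustrate what such a C generator might look like, here is a minimal sketch. The function name, the msg_t layout and the buffer-based calling convention are all assumptions for illustration; the real rsyslog interfaces differ:

```c
#include <stdio.h>

/* Hypothetical message object -- the real rsyslog msg_t is more complex. */
typedef struct {
    const char *timestamp;
    const char *hostname;
    const char *tag;
    const char *msg;    /* traditionally starts with a space */
} msg_t;

/* Sketch of a C generator for the traditional file format,
 * "TIMESTAMP HOSTNAME TAGMSG\n". It writes into a caller-supplied
 * buffer and returns the number of bytes written, or -1 on overflow. */
int tplgen_default_file(const msg_t *m, char *buf, size_t len)
{
    int n = snprintf(buf, len, "%s %s %s%s\n",
                     m->timestamp, m->hostname, m->tag, m->msg);
    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

The point is that the whole output line is emitted in one pass, with no template-string interpretation happening at runtime.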

Having that capability, I ran a couple of performance tests. I have to admit I did not go to a real test environment, but rather used my (virtualized) standard development machine. Also, I ran the load generator inside the same box. So there were a lot of factors that influenced the performance, and this was by no means a fully valid test. To make up for that, I ran several incarnations of the same test, with 1 to 10 million test messages. The results quite consistently reported a speedup between 5 and 6 percent achieved by the C template generator. Even though the test was crude, this consistently seen speedup is sufficient proof for me that native template generators actually have value. I have to admit that I had expected improvements in the 1 to 2 percent area, so 5 and more percent is considerable.

I committed the experimental branch to git, so everyone is free to review and test it themselves.

Now that I am convinced this is a useful addition, my next step will be to add proper code for template plugins (and, along the way, decide if they will actually be called template plugins — I guess library plugins could be used as well, with somewhat less effort and greater flexibility). Then, I will convert the canned templates into such generators and include them statically inside rsyslog (just like omfile and a couple of other modules are statically included inside rsyslog). I hope that in practice we will also see this potential speedup.

Another benefit is that any third party can write new generator functions. Of course, there is some code duplication inside such functions. But that should not be a big issue, especially as generator functions are usually expected to be rather small (but of course need not be). If someone intends to write a set of complex generator functions, these can be built around a common core module whose utility functions are accessed by each of the generators. But this is not my concern as of now.

Note that I will probably use very simple list data structures to keep track of the available generators. The reason is that after the initial config file parsing, access to these structures is no longer required, so there is no point in using a more advanced method.
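A sketch of such a registry, assuming a hypothetical tplgen_f function-pointer type (all names here are made up for illustration). Because lookups happen only while $template lines are parsed at startup, a linear scan of a singly-linked list is perfectly adequate:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical generator function-pointer type: a generator renders the
 * whole template string for one message into a caller-supplied buffer. */
typedef int (*tplgen_f)(const void *msg, char *buf, size_t len);

/* Registry entry. A plain singly-linked list suffices: it is walked
 * only during config parsing, never on the message hot path. */
struct tplgen_entry {
    const char *name;           /* name used after '=' in $template */
    tplgen_f func;
    struct tplgen_entry *next;
};

static struct tplgen_entry *tplgen_head = NULL;

/* Caller provides storage (e.g. a static in the module), so no malloc. */
void tplgen_register(struct tplgen_entry *e, const char *name, tplgen_f f)
{
    e->name = name;
    e->func = f;
    e->next = tplgen_head;
    tplgen_head = e;
}

/* Linear scan; returns NULL if no generator has that name. */
struct tplgen_entry *tplgen_find(const char *name)
{
    struct tplgen_entry *e;
    for (e = tplgen_head; e != NULL; e = e->next)
        if (strcmp(e->name, name) == 0)
            return e;
    return NULL;
}
```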

I expect my effort to take a couple of days at most, but beware that Thursday is a public holiday over here in Germany and I may not work on the project on Thursday and Friday (depending, I have to admit, a little bit on the weather ;)).

rsyslog speed & output formatting

I’d like to reproduce a mailing list post here, both because I would like to retain it for easy reference and because I consider it important enough to deserve better visibility. It originated from a question about output formatting options, but took us down to important roadmap questions for rsyslog.

For simplicity, I simply copy over the text of my relevant posting:

> On Mon, 31 May 2010, david@lang.hm wrote:
>
> > On Mon, 31 May 2010, Rainer Gerhards wrote:
> >
> >>> I agree that doing it in the output would be far better in many ways,
> >>> but since there isn’t a way to do a plugin there (at least not as far
> >>> as I know, it would be good to get confirmation or a better idea)
> >>
> >> David, can you tell me what you have on your mind for this
> >> functionality? I have thought a bit about it, and I probably have one
> >> approach myself. But I would prefer to hear your idea before I push you
> >> into a direction.
> >
> > two options
> >
> > 1. something that would work similar to the existing format string, but
> > would call a C subroutine that could read the existing properties and
> > would create the output string in a buffer
> >
> > 2. something that could also modify the existing properties (more
> > powerful, but also more dangerous and could involve locking to prevent
> > other things from trying to read properties at the same time)
> >
> > we haven’t gone too far down the road of researching the output
> > performance (since the input and queue locking has dominated so far),
> > but it is clear that the output currently takes significantly more CPU
> > time than input. It may be that being able to use C to define the output
> > format instead of interpreting the format string may be a noticeable
> > improvement. Is there a relatively easy way to test this? (say,
> > hard-code a format or two and test writes to file and network with the
> > hard-coded format vs a format string that produces the same output?)
>
> for the traditional output formats the difference may not be that much,
> but if there is extensive parsing involved (as the initial poster is
> doing, or what I would expect is common for specific log types into a
> database) the difference can be much more significant, since it can
> replace multiple regex statements with a much faster single pass that
> looks for word breaks and inserts standard filler in those spots.
>
> With the new syslog format, where the data is ‘supposed to be’ in a
> series of name=value tuples, something like this would be a pretty
> efficient way of extracting particular portions of the data to be output
> (although the properties could be extended to do this sort of thing by
> providing something similar to a perl hash)

You are looking in the same direction I am, and I think this is good news ;)

The current engine supports functions coded in C, but not yet as real plugins
nor in an easily visible way. It is done via a crude function interface library
module, and only within the script engine. My original plan (over a year, or
even two, ago) was to generalize these library plugins, so that it is easy to
add new code and load it as plugins. Actually, making them available as
plugins should not be too much work given the already existing
infrastructure. There already exists a handful of “function modules”; the
control structure is just statically created at compile time, much as
some of the output plugins are statically linked.

Then the original plan was to enable templates to call scripts and enable
scripts to define templates (kind of). Unfortunately, I got distracted by
more important things before I could complete all of this.

HOWEVER, at that time performance was not a major concern. With what has
evolved in the meantime, I do not like the original approach that much any
longer. At least the script engine must become much faster before I can take
a real look at that capability. Right now, scripts generate an interim code
that is then interpreted by a (kind of) virtual machine. A script invocation
inside a template would mean that a VM must be instantiated, the script
interpreted and the resulting string used as template contents. Clearly,
this is not for high-performance use. Still, it may be useful to
have that capability for those cases where performance is not the #1
consideration. But given that everything would need to be implemented, it
makes limited sense to look into something known to be too slow in the
long run. BTW, this is one reason that I have not yet continued to work on
the script engine, knowing that some larger redesign is due to fit it into
the now much tighter runtime constraints.

On the performance of the output system: I think the system in general is
quite fast and efficient, with only ONE important exception: when
multiple replacements need to happen. Even then the algorithm is quite
efficient, but it is generic and needs to run through a number of steps. Of
course, it is definitely faster to permit a C plugin to look at the message
and then format, in an “atomic” way, the resulting custom string. You
need to write a C module for each format instead of using a generic engine,
but can do so in a much higher-performance way. I would assume, however, that
this approach cannot beat the simple templates we usually use by much (maybe
by less than 5%, and, of course, there may be cases where this matters).

As you know, my current focus is speed, together with some functional
enhancements. I was looking at queue operations improvements, but the
potential output speed improvements may be more interesting than the queue
mode improvements (and apply to more use cases). So it may make sense to look
into these first. My challenge here is to find something that is

a) generic enough to be useful in various (usual) cases
b) specific enough to be rather fast

and it should also be possible to implement within a few weeks at most, because
I can probably not spend much more time on a single feature/refactoring.

One solution may be to create “template modules”. I could envision a template
module to be something that generates the template string *as a whole* from
the input message.

That is, we would have

$template current-style,”%msg%n”

but also (**)

$modload tplcustom
$template custom,tplcustom

where tplcustom generates the template string.

While this sounds promising, we have some issues. One immediately pops into
my mind: we will probably be able to use the same template for file writing
and forwarding, but for file writing we need a LF at the end, while for
forwarding we do not.

So the most natural way would be to have the ability to embed a “custom
template” into a regular template, as suggested by this syntax:

$template both,”%=tplcustom%n”

however, this brings us down the slippery slope of the original design. As
the next thing to be requested, one could ask for using not the msg object
(with its fixed, unmodified properties), but rather a transformation of the
message object. So we would end up with something like this:

$template cmplx,”%=tplcustom(syslogtag & msg)%”

which would require much more complex logic working behind the scenes.

Of course, depending on the format used, the engine could select different
processing algorithms. Doing this on the fly seems possible, but requires
more work than I can commit in one sequence.

Also, it would be useful to have the ability to persist already-generated
properties with the message while it continues to be processed in the rule
engine. So far, we do not have this ability, and the reason is processing
time (plus, as usual, implementation effort): for that, we would need to
maintain a list (or hash, …) of name/value pairs, store them to disk for
disk queues and shuffle them through the rule engine as processing is carried
out. As I said, quite doable, but another big addition.

So I am somewhat stuck with things that sound interesting, but are a bit
interdependent. Doing them all together is too big to be useful, and it will
probably fail because I can probably not keep focus on all of them for the,
say, 9 to 12 months that it would require to complete everything.

So I am again down to picking what is most useful. Out of this discussion, it
looks like the idea I marked with (**), the plain C template generator, could
be a useful route to take. I am saying this under the assumption that it
would be relatively easy to implement and cause at least some speedup in
standard cases (contrary to what I expect, I have to admit…). But that
approach is highly specialized, requiring a C module for each custom format.
So does it really serve the rsyslog community well – or just some very
isolated use cases?

Thinking more about it, it would probably be useful if it is both

a) relatively easy to implement and
b) causes some speedup in standard cases

But b) cannot be proven without actually implementing the interface. So, in
practice, the question boils down to what we *expect* about the usefulness
of this utility.

Having said that, I’d appreciate feedback, both on the concrete question of
the usefulness of this feature as well as any and all comments on the
situation at large. I am trying to put my development resources, which
thankfully have been somewhat increased nowadays :), into the area where they
provide the greatest benefit.

rsyslog now available on Solaris

Rsyslog has become the de-facto standard on modern Linux operating systems. Its high-performance log processing, database integration, modularity and support for multiple logging protocols make it the sysadmin’s logging daemon of choice. The project was started in 2004 and has evolved rapidly since then.

Starting today, rsyslog is available not only on Linux and BSD, but also on Sun Solaris. Both Intel and Sparc machines are fully supported under Solaris. Depending on operator need, rsyslog can replace the stock Solaris syslogd or be used in conjunction with it. The latter case provides enhanced rsyslog functionality without the need to change the system infrastructure.

Solaris is now a tier-one target platform. That means that all testing for major releases will be carried out on Solaris as well as on other platforms. The Solaris port was done very carefully, taking into account Sun’s somewhat specific syslogd handling via door files and preserving the full power of rsyslog. So rsyslog not only compiles and runs on Solaris, it is also a good citizen in the Solaris environment.

As per usual rsyslog project policy, the project does not make installation packages other than the source distribution available. However, we work closely with the Solaris community to be able to provide them. We expect additional announcements soon.

The versions with initial solid Solaris support are 4.7.2 and 5.5.4. Rsyslog’s Solaris port was made possible by a generous contribution of hardware and some development funding by a sponsor who preferred to remain anonymous. We from the rsyslog project would like to express our sincere appreciation. Contributions of any kind are always very welcome.

v4-devel is back again

I intended to focus new development exclusively on rsyslog v5, but as it turns out, I am announcing a new v4-devel build today: version 4.7.0.

So what is happening? Do I no longer believe in v5? Not at all! But we have thankfully received a grant to make rsyslog available on Solaris, and I wanted to have this killer feature for v4 as well. It turned out to be impractical and somewhat confusing (for end users) to do that in v4-stable. After all, some code areas need to be touched, and the Solaris-specific code is obviously brand new and far from being stable.

So I concluded that the best approach is to revive v4-devel, but in a controlled way. I will not put every new feature request into that version; most will go into v5 only. However, there may be some exceptions. Already, with 4.7.0, I have included some things that I consider either very useful or easy to integrate. Note that much of this ease is related to the fact that in autumn 2009 the original plan was to release 4.7.0, and thus some new features were already integrated. I have not removed them, and now v4 users can gain the benefits. Note, however, that I did not maintain these changes. So there may be bugs inside them that I have already fixed in v5. Quite honestly, I did not do any extra testing to rule that out. However, the testbench runs without any problems on 4.7.0.

Note that the new v4-devel branch will bring some select enhancements in the future. One thing I have in mind is a small, but potentially troublesome, enhancement to the background log writer that could boost performance considerably.

However, the majority of new work will go into v5, exclusively. But you will not see that much happen there in the next couple of days/weeks as my next milestone is strongly focused on getting full stable Solaris support into both v4 and v5 (note that v5-devel, the master branch, also already supports Solaris).

syslog normalization

I have been working on syslog normalization for quite some years now. A couple of days ago, David Lang talked to me about syslog-ng’s patterndb, an approach to classify log messages and extract properties from them.

I have looked at this approach, and it is indeed promising. One ingredient, though, is missing: a directory of standard properties (like bytes sent and received in traffic logs). I know this missing ingredient very well, because we also forgot it until recently.

The aim to normalize log data is far from new. Actually, I think it is one of the main concerns in log analysis. Probably one of the first folks who thought seriously about it was Marcus Ranum, who coined the concept of “artificial ignorance”, meaning that we can remove those messages from a big pile of logs that we know to be uninteresting. But in order to do that correctly, you need to know exactly how they look. And this is where log normalization comes in. I wrote an in-depth paper in 2004, titled “On the nature of syslog data”. The version officially published claims “work in progress”, but it still has all the juicy details.

Internally, we implemented this approach in our MonitorWare products a little bit later. For example, it is used inside the “Post Process Action” in WinSyslog (Michael also wrote a nice article on how to parse log messages with this action). While this was a great addition (and is used with great success), I failed to get enough community momentum to build a larger database of log messages that could be used as a basis for large scale log normalization. One such – largely failed for syslog – approach is the event knowledge base.

However, I did not give up on the general idea and proposed it wherever appropriate. The latest outcome of this approach is the soon-to-be-released Adiscon LogAnalyzer v3, which uses so-called message parsers to obtain useful information from log entries. Here, I hope we will be able to gain more community involvement. We have already had two message parsers contributed. Granted, that’s not much, but the ability to have them is so far little known. With the release of v3, I hope we gain more and more momentum.

The syslog-ng patterndb approach brings an interesting idea to this space: as far as I have heard (I generally do NOT look at competing code, to prevent polluting my code with things that I should not use), they use radix trees to parse the log messages. That is a clever approach, as it permits matching a message against a large set of parse templates much more quickly. This makes the approach suitable for real-time normalization of an incoming stream of syslog data.
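To illustrate the core idea (not patterndb’s actual implementation, which I have not looked at), here is a toy literal-prefix trie in C: all patterns are merged into one tree, so classifying a message costs a single pass over its characters rather than one comparison per pattern. A real radix tree would additionally collapse single-child chains, and patterndb adds typed parser nodes (numbers, IP addresses, …):

```c
#include <stdlib.h>

/* Minimal literal-prefix trie. Each node holds one character; siblings
 * are alternatives at the same depth, children continue the prefix. */
struct trie {
    char ch;
    int class_id;               /* 0 = no pattern ends here */
    struct trie *child, *sibling;
};

static struct trie *node_new(char ch)
{
    struct trie *n = calloc(1, sizeof(*n));
    n->ch = ch;
    return n;
}

/* Merge a literal pattern into the tree, tagging its end node. */
void trie_add(struct trie *root, const char *pattern, int class_id)
{
    for (; *pattern; pattern++) {
        struct trie *c = root->child;
        while (c && c->ch != *pattern)
            c = c->sibling;
        if (!c) {
            c = node_new(*pattern);
            c->sibling = root->child;
            root->child = c;
        }
        root = c;
    }
    root->class_id = class_id;
}

/* One pass over the message: returns the class of the longest
 * matching pattern prefix, or 0 if nothing matches. */
int trie_classify(const struct trie *root, const char *msg)
{
    int best = 0;
    for (; *msg; msg++) {
        const struct trie *c = root->child;
        while (c && c->ch != *msg)
            c = c->sibling;
        if (!c)
            break;
        if (c->class_id)
            best = c->class_id;
        root = c;
    }
    return best;
}
```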

Adiscon LogAnalyzer, by contrast, uses a regex-based approach, but that is primarily for simplicity, in an effort to invite more contributions (WinSyslog has a far more sophisticated approach). In Adiscon LogAnalyzer we began to get serious about identifying what a property actually means. While we have a fixed set of properties, with fixed semantics, in WinSyslog, MonitorWare Agent and rsyslog, this set is rather limited. The Windows product line supports easy extension of the properties, but does not provide standard IDs for them.

In Adiscon LogAnalyzer, we have fixed IDs for a larger set of properties, currently about 50 or so. Still, that set is very small. But we created it with the intention of being able to map various “semantic objects” from different log entries to a single identity. For example, most firewall logs will contain a source and a destination IP address, but almost all firewalls use different log message formats to convey them. So we need different analyzers to support these native formats, for example in reports. In Adiscon LogAnalyzer, we can now have a message parser “normalize” these syslog entries and map the vendor-specific format to the generic “semantic object”. Thus the upper layers (like views and reports) work on these normalized semantic objects and do not need to be adapted to each firewall. This needs to be done only at the parser level.
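As a toy illustration of such a normalization step, the following sketch maps a netfilter-style firewall line onto generic source/destination IP semantic objects (the function and structure names are made up for this example; Adiscon LogAnalyzer’s actual parsers are written in PHP):

```c
#include <stdio.h>
#include <string.h>

/* Normalized "semantic objects" for one firewall event. The buffers
 * are sized to hold a textual IPv6 address plus NUL. */
struct fw_event {
    char src_ip[46];
    char dst_ip[46];
};

/* Parses netfilter-style "... SRC=a.b.c.d ... DST=e.f.g.h ..." lines
 * into the generic event. A parser for another vendor's format would
 * fill the very same structure. Returns 0 on success, -1 if either
 * field is missing. */
int normalize_netfilter(const char *line, struct fw_event *ev)
{
    const char *s = strstr(line, "SRC=");
    const char *d = strstr(line, "DST=");
    if (!s || !d)
        return -1;
    if (sscanf(s, "SRC=%45s", ev->src_ip) != 1)
        return -1;
    if (sscanf(d, "DST=%45s", ev->dst_ip) != 1)
        return -1;
    return 0;
}
```

Reports and views would then only ever consult fw_event, never the vendor format.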

Such a directory of semantic objects would be very useful, in my humble opinion. We are currently working on making it publicly available, all this in the hope that a community will involve itself ;) If we manage to get a large enough number of log and/or parser contributions, we may be able to make Adiscon LogAnalyzer an even better free tool for system administrators.

And as there is hope that this will finally succeed, I have begun to think about a potential implementation inside rsyslog. It doesn’t sound very hard, but still requires careful thinking. One thing I would like to see is a unified approach that covers at least rsyslog and Adiscon LogAnalyzer, and hopefully the Windows tools as well.

Another very good thing is that there already is a standard for conveying standard semantic objects: during the IETF syslog standardization effort, I pressed hard for so-called structured data elements, and I managed to get them into the final RFC. These structured data elements are now the key to conveying the log information once it is normalized: the corresponding name/value pairs can easily be encoded with them.
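For illustration, normalized properties map naturally onto an RFC 5424 SD-ELEMENT. In this hypothetical example (the SD-ID and property names are made up, with 32473 being the enterprise number reserved for documentation), a firewall event carries its normalized name/value pairs as structured data:

```
<165>1 2010-05-31T22:14:15.003Z fw1.example.com fwlog - - [normalized@32473 src-ip="192.0.2.7" dst-ip="198.51.100.9" bytes-sent="1024"] connection closed
```

Any consumer that understands the directory of standard properties can then process the event without knowing the vendor’s free-form message text.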

I hope we will finally be able to succeed on this road, because I think it would be of tremendous benefit for the syslog community.

phpLogCon becomes Adiscon LogAnalyzer

I have blogged over the past days about Adiscon LogAnalyzer. We are now gradually rolling out the new site. So I thought it would be a good idea to reproduce my “official announcement” on the blog as well:

As in all things, there is a certain fashion in open source project names as well. For a long time, “php*” was a great name for php-based open source solutions. However, nowadays these somewhat bulky names have been replaced by “more streamlined” names.

I personally think that dropping the “php” part makes it somewhat easier to speak and write about these projects. So we decided it was right to drop “php” from “phpLogCon”. But was “LogCon” the ultimate name for a tool to search, analyze and (starting with v3) report on network event logs? A quick discussion within our group as well as with some external buddies made clear that “LogCon” is probably pretty meaningless. Even if one deciphers “Con” as “Console” – what does it mean to be a “Console” in this context? Not an easy question to answer. Bottom line: “LogCon” is pretty meaningless.

So we thought we would do “the right thing” and rename the project before it becomes even more widely spread. The later you do a name change, the more painful it is. That made us think about good names. We ended up with “LogAnalyzer”, because analysis is the dominant use case for this tool (especially if you think of reports as being part of the analysis ;) ). Another quick search made us aware that there are (of course) lots of “LogAnalyzers”. And, of course as well, all second-level domains were taken.

Lacking an expensive legal adviser, we made the decision to boldly name the project “Adiscon LogAnalyzer”, a.k.a. “the log analyzer (primarily) written by Adiscon”. With that approach we use our company name (which obviously legally belongs to us) together with the generic term “LogAnalyzer”. That is done in the hope that it will avoid any legal friction that might otherwise occur. For the very same reason you will see us consistently referring to “Adiscon LogAnalyzer”.

We are aware, however, that this implies another cost: a project with a company name inside it does sound a bit like a purely commercial project. On the other hand, that seems to be no problem for the big players, like “Red Hat Linux” or “SuSE Linux”. So we hope that the company part inside the name will not have too negative an effect on this project.

We pledge that Adiscon LogAnalyzer will always be a free, open source project. And the GPLv3 we use is your guarantee for that.

In addition to the core Adiscon LogAnalyzer, Adiscon will also provide some non-GPLed components in the future. And we hope that others will do that as well. Our sincere hope is that Adiscon LogAnalyzer will evolve to a framework where many third parties can plug in specific functionality. Consequently, we have added a plugin directory to the new site, and some third-party written message parsers already populate it.

So – phpLogCon has not only a new name and a new site, it is also more active than ever and eager to solve the log analysis and reporting needs for a growing community. Please help spread the word!

Why is writing good user doc such a problem…?

… for me, I should add. Today, I ran across a post on the rsyslog mailing list where a user (rightfully!) complains that rsyslog documentation is confusing.

I really don’t like the idea that users are having a hard time because they cannot get pretty basic things done. Unfortunately, there are a number of reasons for this: one, of course, is lack of time. I am rather busy developing new functionality, and besides rsyslog I also have other duties at Adiscon, like helping with the next and really great release of Adiscon LogAnalyzer, a free and open source solution for searching, analyzing and reporting on network event data and syslog (yeah, and creating buzzwords, of course…). But there is a more subtle issue:

I have been doing logging and syslog for over 10 years now (close to 15, if I remember correctly). I have seen so much in the logging world that I can hardly think of the time when I did not know what PRI or TAG or even MSG was, what the (dis)advantages of simplex vs. duplex comm modes are, and what makes 3195 better (or worse) than 3164 or 5424 ;)

In short: it is pretty hard for me to go back to the roots and envision what somebody new to syslog needs to know AND in what order! I am trying my best, but writing basic-level articles (and documentation) requires considerable effort. A good, well-thought-out article (like a 4-page journal article) can easily take 4 to 5 days to create. Even then, I need help from other folks when I write for entry-level folks (and there is nothing wrong with being entry-level: everybody is at some point in time). Here, of course, the time resource problem hits again: I usually cannot afford this effort “just” to create doc.

With the rsyslog cookbook I started another approach: there I focus on very specific environments. I don’t really like this idea, because it does not tell people what exactly they are doing. But still, the past weeks have proven this to be a useful approach. However, I also notice that the cookbook is only useful if the configuration matches exactly what the user wants – otherwise users are lost. I guess that’s due to not really understanding what happens. The good thing about the cookbook is that it requires relatively little effort. Most samples were created within an hour, which seems acceptable for something that can be reused.

The ultimate solution would be for users to write content themselves. The rsyslog knowledge base (or forum, as you may call it) is most successful in this regard. But it is hard to navigate and hard to find a solution – you often need to wade through various posts before you get to the (often simple) solution. The rsyslog mailing list is another excellent resource, especially as other folks actively help support rsyslog. This is very important for me and the project, and I appreciate it very much. Unfortunately, the con again is that the mailing list makes it hard for new users to find already existing solutions (that it is being mirrored to various aggregators helps a bit, but only so much…).

The ultimate solution, I thought, was the rsyslog wiki, and we see some very nice articles inside it. Unfortunately, very few users contribute to the wiki. Just think what an enormous knowledge reservoir this could be if only every fifth user who got help would take a few minutes of his time to craft a quick wiki article describing what he does, why, and how it works. Unfortunately, most users seem not to have this time. I can understand that; I guess they have pressing schedules as well. And those schedules may already be stressed by the extra time they needed to find the solution for an obviously simple thing…

So this is not a good situation, but I can currently not do much more than keep working on the cookbook and ask everyone to contribute documentation. For long-term success, I think it is vital for rsyslog to make its power available to all users. Good doc is one necessity, a better config format another (but I won’t elaborate on that in today’s post ;)).

syslog data modeling capabilities

As part of the IETF discussions on a common logging format for SIP, I explained some syslog concepts to the sip-clf working group.

Traditionally, syslog messages contain free-form text only, aimed at human observers. Of course, today most of the logging information is processed automatically, and the free-form text creates ample problems in that regard.

The recent syslog RFC series has gone to great lengths to improve the situation. Most importantly, it introduced a concept called “structured data”, which permits expressing information in a well-structured way. Actually, it provides a dual-layer approach, with a coarse designator at the upper layer and name/value pairs at the lower layer.

However, the syslog RFCs do NOT provide any data/information modeling capabilities to go with these structured data elements. Their syntax and semantics are to be defined in separate RFCs. So far, only a few examples exist. One of them is the base RFC5424 itself, which describes some common properties that can be contained in any syslog message. Beyond that, there are RFC5674, which describes a mapping to the Alarm MIB and ITU perceived severities, and RFC5675, which describes a mapping to SNMP traps. All of them are rather small. The IHE community, to the best of my knowledge, is currently considering using syslog structured data as an information container, but has not yet reached any conclusion.

Clearly, it would be of advantage to have more advanced data modeling capabilities inside the syslog base RFCs, at least some basic syntax definitions. So why is that not present?

One needs to remember that the syslog standardization effort was a very hard one. There were many different views, “thanks” to the broad variety of legacy syslog, and it was extremely hard to reach consensus (thus it took some years to complete the work…). Next, one needs to remember that there is such an immense variety in message content and objects that it is a much larger effort to try to define generic syntaxes and semantics (I don’t say it cannot be done, but it is far from easy). In order to get the basics done, the syslog WG decided not to dig down into these dirty details but rather lay out the foundation so that we can build on it in the future.

I still think this was a good compromise. It would be good if we could complement this foundation with some already existing technology. SNMP MIB encoding is not the right way to go, because it follows a different paradigm (syslog is still meant to be primarily clear text). One interesting alternative, which I saw and am now evaluating, is the IPFIX data modeling approach. Ideally, we could reuse it inside structured data, saving us the work of defining a syslog-specific model of our own.

The most important task, however, is to think about, and specify, some common “information building blocks”. By these, I mean standard properties like source and destination ID, mail message ID, bytes sent and received, and so on. These, together with some standard syntaxes, could greatly relieve the problems we face when consolidating and analyzing logs. Obviously, this is an area I will be looking into in the near future as well.
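To illustrate what such building blocks might look like inside structured data, here is a hypothetical element (the SD-ID `netinfo@12345` and all field names are invented for illustration; no RFC defines them):

```
[netinfo@12345 srcHost="mail01.example.net" dstHost="relay.example.com"
 bytesSent="1024" bytesReceived="2048"]
```

If every implementation used the same names and syntaxes for such common properties, a log analyzer could correlate “bytes sent” across products without per-product parsers.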

It may be worth noting that I wrote a paper about syslog parsing back in 2004. It was, and has remained, work in progress. However, Adiscon did implement the concept in MonitorWare Console, which unfortunately never got wider exposure. Thinking about it, that work would benefit greatly from the availability of standardized syslog data models.

new phplogcon site

Today, I received a first link to a more or less complete version of what will become the new phpLogCon site. The site is not yet live, but it already shows some of the new features.

If you look at it, you’ll probably notice a couple of things. First of all, the name “phpLogCon” is no longer spelled out. The reason is that we considered it a bit bulky and meaningless. “LogAnalyzer” is exactly what the tool is about. But, of course, there are a myriad of (trademark) problems related to that name. So we try to avoid all confusion by calling it “Adiscon LogAnalyzer”, hoping that the company name as the dominant part of the product name will rule out any problems. For that very reason, you’ll also see me refer to Adiscon LogAnalyzer in the future. If you wondered why I stress the “Adiscon” part, now you know.

Secondly, you will notice the fresh design. While I am not a visual guy, I have to say that I like it very much. I think it removes much of the clutter and makes it easier to find the information you need quickly. We have also changed the content management system in the background. The new site uses WordPress, which seems to be highly appropriate for what the site needs. Of course, the wiki and forum will remain as they are – they have proven to work quite well.

If you look more closely, you will also note that Adiscon LogAnalyzer gets an important new component: a reporting module. I managed to convince my peers at Adiscon to move some of our MonitorWare Console closed-source technology into Adiscon LogAnalyzer. My long-term vision is that reporting capabilities will greatly enhance the utility of this tool. In order for Adiscon to get something back, we will begin to develop some enhanced reports, which will be non-free for commercial users. However, the base product, as well as some base reports, will always remain free!

I hope you consider this to be good news, just as I do! Thanks to everyone who made this possible.

Some thoughts on reliability…

When talking syslog, we often talk about audit or other important data. A frequent question I get is whether syslog (and rsyslog specifically) can provide a reliable transport.

When this happens, I first need to ask what level of reliability is needed. There are several flavors of reliability, and usually some level of message loss is acceptable.

For example, let’s assume the process writes out log messages to a text file. Under (almost?) all modern operating systems, and by default, this means the OS accepts the information to write, acknowledges it, does NOT persist it to storage, and lets the application continue. The actual data block is usually written a short while later. Obviously, this is not reliable: you can lose log data if an unrecoverable I/O error occurs or something else goes fatally wrong.

This can be solved by instructing the operating system to actually persist the information to durable storage before returning from the API. However, you pay a big performance toll for that. This is also a frequent question for syslog data, and many operators do NOT sync, accepting a small risk of message loss to save themselves from needing roughly ten times as many servers as they do now.
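The difference between the two modes can be sketched in a few lines of Python. The function name and path below are illustrative; the key point is the extra `os.fsync()` call, which forces the OS to persist the data before the write returns, and which is exactly where the performance toll comes from.

```python
import os

def write_log_durable(path, line):
    """Append a log line and force it to durable storage before returning."""
    with open(path, "a") as f:
        f.write(line + "\n")
        f.flush()             # push Python's userspace buffer to the OS
        os.fsync(f.fileno())  # force the OS to persist the data to disk

# Without the fsync(), the OS would typically just buffer the write,
# acknowledge it, and schedule the actual disk I/O for later.
write_log_durable("/tmp/durable_demo.log", "audit: user logged in")
```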

But even if writes are synchronous, how does the application react? For example: what shall the application do if log data cannot be written? If one really needs reliable logging, the only choice is to shut down the application when it can no longer log. I know of very few systems that actually do that, even though “reliability” is highly demanded. Here, the cost of shutting down the application may be so high (or even fatal) that the limited risk of log data loss is accepted.
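The “shut down rather than run unlogged” policy described above can be sketched as follows. This is a hypothetical illustration, not an API from rsyslog or any real logging library:

```python
import sys

def log_or_die(path, line):
    """Append a log line; terminate the process if logging is impossible."""
    try:
        with open(path, "a") as f:
            f.write(line + "\n")
            f.flush()
    except OSError as e:
        # Cannot log -> refuse to continue running unaudited.
        sys.exit("fatal: audit log unavailable: %s" % e)

log_or_die("/tmp/audit_demo.log", "transaction committed")
```

Most real systems instead catch the error, perhaps retry or alert, and keep running – which is precisely the trade-off between availability and log reliability discussed here.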

There are a myriad of things to consider when thinking about reliability. So I think it is important to define, in detail, the level of reliability a solution requires. To the best of my knowledge, this is also important for operators who are required by law to do “reliable” logging. If they have a risk matrix, they can define where it is “impossible” (for technical or financial reasons) to achieve full reliability, and as I understand it, this is the information auditors are looking for.

So in all cases, I strongly recommend thinking about which level of reliability is actually needed. But to provide an answer for the rsyslog case: it can provide very high reliability and will most probably fulfill all needs you may have. There is, however, a toll in both performance and system uptime (as said above) for going to “full” reliability.