Some cleanup in upcoming rsyslog v8

Historically, the rsyslog source tree has contained a lot of seldom-used and exotic modules. Some of them don’t even work at the moment. I kept them inside the tree so that they could serve as a sample for folks trying similar things. However, there has been discussion on the rsyslog mailing list that all of this clutters up rsyslog and makes it a bit hard to understand which modules are well maintained, which are not, and which do not actually work or just serve an exotic border case.

I think these concerns are valid. As a consequence, I will go through the codebase and remove what is not in actual use. I will keep contributed modules which are only occasionally maintained, but I will move them to their own directory (./contrib) so that folks can more easily see that these are not project-maintained plugins. We gain clarity from this move, but we don’t lose anything: if someone decides to base some new code on the then-removed code, it’s still available in older git versions, so it can still be used as a template. Besides clarity, getting rid of the cruft also eases the work of maintaining the source tree and hopefully also reduces the work of distro packagers.

To give you an idea of what kind of things I will remove: there are some Java programs inside the code, which were used in early versions of the testbench (around v5). They are no longer in any use at all. There is omoracle, which has been orphaned for quite a while and has not worked since the days of v6. There is obviously no interest in this plugin, otherwise folks would have stepped up and maintained it during the past 3 or 4 years in which it has not worked. There is sm_cust_bindcdr, which was done as part of a custom project. While we asked for permission to include it in the project (and got it ;)), the actual module is so specific that it is extremely unlikely someone else can use it. We just integrated it as an example. These are the kinds of things we will remove.

Note that this step probably also helps us in moving rsyslog as a whole over to ASL 2.0, which has long been our goal. Some of the things now being removed (omoracle, for example) would be problematic, as they are under GPL and we can no longer contact the author. This is a nice additional benefit of the cleanup.

rsyslog under ASL 2.0: why I can’t simply do that

The ASL 2.0 topic boiled up again due to a much-appreciated IBM contribution to make rsyslog 5.8.6 work on AIX. Unfortunately, this contribution was made under GPLv3+. I tried to work with IBM to have it released under ASL 2.0, but their legal department is of the opinion that this is not possible. This resulted in some restrictions, which can be found in the git branch’s README file. Most importantly, it’s a dead-end branch which cannot be merged up to new versions.

As an option, IBM said that if I released rsyslog 5.8.6 under ASL 2.0, they could release their patch under ASL 2.0 as well. Unfortunately, I cannot do this just by declaring it so.

You need to keep in mind that I do not own the complete copyright! Actually, there are a couple of hundred contributors that I can find in git history … and there are even more in the original sysklogd, not all of whom I can even identify. As such, it is simply impossible for me to change the license “on my own”.

To reach the current state, I did quite intense research some time ago (two years maybe?) on who contributed what. I concentrated on “easy” files (new ones without sysklogd code) and asked all contributors whether they agree to ASL 2.0. Those files that I got agreement on are now under ASL 2.0. For the “hard” files, I even did some research into which of them were still from the original syslogd, which was released under BSD. I replaced these files/code sequences with the BSD version (which is identical), and so could put that under ASL 2.0. But there is still a notable body of code left that needs to remain under GPLv3+. Some can probably be changed by contacting more authors (especially Red Hat contributions, where I have a general “go” for most things), but some code definitely needs to be rewritten. I guess the additional research takes at least some weeks, with the rewrite taking maybe another month or so.

Bottom line: it’s far from being easy and there is no pressing need. But I don’t want to set back the relicensing effort just because of the IBM contribution. I would need to rewrite that contribution in any case, so there is no point in merging it into the mainstream code.

rsyslog will remain GPLv3 licensed

Licensing is a tough topic. I tried to explain some of the upcoming rsyslog license changes with yesterday’s blog post. While I tried to cover all aspects, I have probably managed to create some confusion. I will try to clean up this mess today. In doing so, I will leave out some of the fine details but focus on the prime visible facts.

First of all, the rsyslog project as a whole will stay under GPLv3. If you dig down into the details, the current version is licensed under GPLv3, with a large body of code (the rsyslog runtime) being licensed under LGPLv3. In plain words, this means that rsyslog as a whole cannot be included in a commercial product while the rsyslog runtime can be. So if someone intends to provide rsyslog-like functionality, he or she can use the runtime and rewrite the rest of the system (but not copy it). Also, it is possible to create commercial rsyslog plugins with the current system, but then one either relies on a “creative” interpretation of the GPL or uses message passing, e.g. via pipes, between core and plugin.

With the intended changes, these basic facts, in regard to rsyslog as a whole, do not change at all. However, the details change: the body of code that is licensed under a permissive license (allowing use inside commercial applications) will increase. Still, some key files will remain under GPLv3, and so will the overall rsyslog project. Also, in the future the permissive license used by rsyslog will probably be the Apache Software License (ASL 2.0). This open source license is used by a myriad of well-known software products, with the Apache HTTP Server being a prime example. It is not certain whether ASL 2.0 will totally replace LGPLv3 inside the rsyslog runtime; this depends on contributor reactions. For much the same reasons, it is also not yet clear what exactly the GPLv3 core of rsyslog will be.

Why is this change beneficial to the project? Maintaining and developing rsyslog is costly. In the typical open source business model, these costs are covered by the sales of project-related services. Unfortunately, the rsyslog project has received relatively little funding this way and is still heavily sponsored by Adiscon, which cannot bear the majority of the cost for all time to come.

One problem with receiving funding is that some potential customers, especially large ones who could contribute considerably to funding, do not like to license under GPLv3 for one reason or another. One “solution” to that problem would have been to dual-license rsyslog in its current form. We actually considered that (blog posting) but stepped back from the initial approach after discussion with key community members. As described in the mentioned blog posting, it would not have been very hard for Adiscon as the main copyright holder to change the licensing model. However, this would probably have meant that a commercial and a non-commercial fork of rsyslog would have been created, with potentially large differences in the code base.

The move of more code under a permissive license prevents this problem. With the new model, only relatively few key files would need to be dual-licensed. This prevents large diversity inside the code base just for licensing reasons. Also, the ASL is far more appealing to many large users, so we hope to gain additional deployments – and thus potential customers – from that move.

Finally, this model makes it easier to provide commercial plugins. Commercial plugins were always OK with the project and, as said above, can even be written and distributed under the current licensing scheme. The new licensing scheme makes it easier to support such plugins and encourages technically superior solutions. How exactly the licensing will work in this regard is not yet fully thought out. One solution might be to add a special exemption to the then-smaller GPLv3 core that explicitly permits plugins (getting the wording right may be somewhat tricky). Another one is that someone who intends to ship commercial plugins must rewrite the rsyslog GPLv3 core, which is no longer that hard even for external entities as the GPLv3 core is smaller (one may argue that Adiscon as the main contributor has an advantage here over others; I can’t deny it, but I don’t find it unfair either: after all, Adiscon has spent considerable effort on rsyslog, so why not reap a small benefit in this situation?). These solutions are the extreme ends of the solution spectrum; it could probably also be anything in between.

Why is this useful to the community? Obviously, the changes help fund the project, and thus help to keep it not only maintained but well-enhanced as well (there are many cool things I have on my mind, but the current time constraints do not permit me to work on them). This is even more important as the journald project will create a new, mostly commercial, environment for rsyslog. In the future, rsyslog needs to compete with syslog-ng primarily in that part of the Linux ecosystem. Balabit, which traditionally dual-licenses syslog-ng under a proprietary license, has a big advantage with this customer base from that fact alone. So far, rsyslog could make up for its disadvantage by the fact that it was installed on each system, an advantage that journald will very probably take away from rsyslog. The other benefit is that the more permissive licensing model will probably attract additional software vendors and maybe additional parties, especially if we can move the full rsyslog runtime to ASL. There seems to be a movement inside the industry towards this type of license, at least for projects playing in the enterprise environment. If all works out well, we may even get some more contributors and thus the ability to include additional features in the project.

In short, the licensing change will not affect that much of what actually can be done with rsyslog code, but it provides rsyslog with some additional options that benefit both the project and the community.

I hope I have expressed myself clearly enough this time. Again, it has become a long post, even though I omitted some of the details given in yesterday’s post. Licensing obviously is tough ;) I hope to soon be able to use my time for more productive things again…

Funding rsyslog development

To be honest, funding the rsyslog project is not easy these days. It never was, but it has taken an extra hit from the current economic crisis. Rsyslog, in its initial phase, was sponsored exclusively by Adiscon as part of its open source involvement. In 2007, we added rsyslog professional services with things like support contracts or custom development. While some customers used these services, Adiscon was still required to sponsor the project and still does so today. Unfortunately, professional services are not doing extremely well (to phrase it politely) and the global crisis is hitting Adiscon’s customers. As a consequence, I have been more involved with paid work during the past weeks and could not work as much on rsyslog as I would have liked to. The shift in Linux logging that will probably be brought by journald (read blog posting) doesn’t strengthen my position inside Adiscon either and works as an accelerator for change…

We have been discussing for quite a while how to improve this situation. While I don’t like the idea, we probably need to think about a dual licensing approach for rsyslog. Please keep reading, you can be upset when I have made the rest of my argument ;-). First of all, I really don’t like dual-licensing. In fact, syslog-ng’s dual licensing approach was one reason that made me start working on rsyslog (blog post). I also know that rsyslog’s simple GPL license was one of the major “buying points” that made rsyslog become the default syslogd on Fedora and later many other distributions. In order to permit reuse of rsyslog technology in some other tools, in 2008 we created a licensing model that puts the so-called runtime, a large part of rsyslog, under LGPL (see “licensing rsyslog” and a previous blog post outlining the change). Syslog-ng later cloned this licensing model, but it seems like they put a couple more things under LGPL than we did (so there seems to be a rather weak “product driver” with most of the “real meat” being under LGPL; in rsyslog, larger parts are GPL only). There is an interesting article that describes this development, and does so from a syslog-ng point of view. The most interesting fact I got from this article was that syslog-ng faced quite the same problems we have with rsyslog, and could not solve them without a commercial fork. Barring other options, it looks like this is a path that rsyslog needs to take, too. If so, of course this needs to be done as carefully as possible.

After dual-licensing finally surfaced as something hard to avoid yesterday evening, I did a git log review today. I have to admit it was a bit scary: we have had some excellent and larger code contributions by Fedora folks in rsyslog’s infancy (and continuous support since then), we have had some larger chunks of code contributed in the form of modules, and there is Michael Biebl, who not only creates great Debian packages but always helps with autotools and smoothing some edges. Finally, we have a couple of folks who sent in very specific patches. But I have to admit that the vast majority of code was written by myself ;) As of today, we have 2819 git commits. Out of them, 2676 were made by me (and another 50 or so by other Adiscon folks). These numbers need to be taken with a grain of salt: rsyslog was initially kept in a CVS archive, and all contributions at that time were logged with my user account. The early Fedora patches were in that timeframe; there were around 20 or so of them. Also, my commit count is a bit higher due to automatic merges. On the other hand, the difference in code lines is probably even a bit higher than the difference in commit count. I have not done any in-depth analysis, but an educated guess is that more than 98% of code lines were written by me (after all, I have worked a couple of years on this project…).
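For readers who want to see what the commit figures above mean as shares, here is a minimal Python sketch. It only uses the three numbers quoted in the post (2819 total, 2676 by me, roughly 50 by other Adiscon folks); the constant and function names are made up for illustration, and it says nothing about code lines, where the 98% figure remains an educated guess.

```python
# Commit counts quoted in the post. The names below are illustrative only.
TOTAL_COMMITS = 2819
AUTHOR_COMMITS = 2676
OTHER_ADISCON_COMMITS = 50  # "another 50 or so", so this is an approximation


def commit_share(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


if __name__ == "__main__":
    adiscon = AUTHOR_COMMITS + OTHER_ADISCON_COMMITS
    external = TOTAL_COMMITS - adiscon
    print(f"main author:  {commit_share(AUTHOR_COMMITS, TOTAL_COMMITS):.1f}%")
    print(f"all Adiscon:  {commit_share(adiscon, TOTAL_COMMITS):.1f}%")
    print(f"external:     {commit_share(external, TOTAL_COMMITS):.1f}%")
```

With these inputs the shares come out at roughly 95% for the main author and 97% for Adiscon as a whole, leaving about 3% for everyone else, subject to the CVS and merge caveats mentioned above.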

I am now tasked with actually looking at the code. I will try to differentiate add-on user contributions (like omoracle) from core files. This is useful anyway, because it makes it clearer to users what is directly supported by the project and what is not. Then, I will probably look into contributions and see which code remains at which locations. After that is known, I need to have another set of talks with my peers at Adiscon (and probably the top contributors) and see where we can head from here.

This is, honestly, the current state of affairs in regard to the rsyslog project. Most probably we need to move to some commercial licensing model. I know this is not ideal. I know many of you will not really like it. On the other hand, it is a plain fact that many for-profit organizations greatly benefit from rsyslog without ever contributing anything. While they can continue to do so, it is probably a good idea to help them find an offering that funds the project. As a final remark for today, let me introduce you to a blog post that IMHO very nicely describes the problems, and needs, around dual licensing. I am not affiliated with the author and do not even know him.

I hope that the ideas described here will enable us to keep pushing forward with rsyslog technology, something I would really like to do!