more on syslog TLS, policies and IETF efforts…

I am still working hard on TLS support, but on the design level. Here, I’d like to reproduce a message I sent to the IETF syslog WG’s mailing list. It outlines a number of important points when it comes to practical use of TLS with syslog.

I am quite happy that syslog-transport-tls gained new momentum right at the time when I finished my TLS implementation in rsyslog and turned to fine-tuning it. The IETF discussion on authentication and policies touches exactly those places where it really hurts in practice. For the initial TLS implementation, I decided to let rsyslog work in anonymous mode only. (It was clear that -transport-tls section 4 as in version 11 would not survive, just as we have now seen.)

The next step in rsyslog is to enable certificate-based access policies, and this is exactly what the IETF discussion is focusing on. Of course, I am trying to finish the design and to influence the standard in a positive way, so that the rsyslog implementation can be both standards-compliant and useful in practice.

And now – have an interesting read with my mailing list post. Feedback is highly appreciated.

Rainer


Hi all,

I agree with Robert, policy decisions need to be separated. I CC Pasi because my comment is directly related to IESG requirements, which IMHO cannot be delivered by *any* syslog TLS document without compromise [comments directly related to IESG are somewhat later, I need to level the ground first].

Let me tell the story from my implementor’s POV. This is necessarily tied to rsyslog, but I still think there is a lot of general truth in it. So I consider it useful as an example.

I took some time yesterday to include the rules laid out in 4.2 into rsyslog design. I quickly came to the conclusion that 4.2. is talking about at least two things:

a) low-level handshake validation
b) access control

In a) we deal with the session setup. Here, I see certificate exchange and basic certificate validation (for example checking the validity dates). In my current POV, this phase ends when the remote peer can positively be identified.

Once we have positive identification, b) kicks in. In that phase, we need to check (via ACLs) if the remote peer is permitted to talk to us (or we are permitted to talk to it). Please note that from an architectural POV, this can be abstracted to a higher layer (and in rsyslog it probably will). For that layer, it is quite irrelevant if the remote peer’s identity was obtained via a certificate (in the case of transport-tls), a simple reverse lookup (UDP syslog), SASL (RFC 3195) or whatever. What matters is that the ACL engine gets a trusted identity from the transport layer and verifies that identity [level of trust varies, obviously]. Most policy decisions happen on that level.
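To make the layering a bit more concrete, here is a minimal Python sketch of an ACL engine that only ever sees an identity handed up by the transport layer. All names (`IdentitySource`, `PeerIdentity`, `AclEngine`) are made up for illustration and are not part of rsyslog or the draft. The only assumption is that the transport tells the ACL layer who the peer is and how that was learned.

```python
from dataclasses import dataclass, field
from enum import Enum


class IdentitySource(Enum):
    """How the transport layer learned the peer's identity (hypothetical)."""
    REVERSE_LOOKUP = "reverse-lookup"    # plain UDP syslog
    SASL = "sasl"                        # RFC 3195
    TLS_CERTIFICATE = "tls-certificate"  # syslog-transport-tls


@dataclass(frozen=True)
class PeerIdentity:
    """Result of phase a): a name plus the way it was established."""
    name: str
    source: IdentitySource


@dataclass
class AclEngine:
    """Phase b): decides whether an already identified peer may talk to us."""
    # peer name -> identity sources we are willing to trust for that peer
    rules: dict = field(default_factory=dict)

    def permit(self, name, *sources):
        self.rules.setdefault(name, set()).update(sources)

    def is_permitted(self, peer):
        # The engine does not care how the transport works internally,
        # only which peer it reported and how trustworthy the method is.
        return peer.source in self.rules.get(peer.name, set())
```

With a rule such as `acl.permit("client.example.net", IdentitySource.TLS_CERTIFICATE)`, the same peer name would be accepted over TLS but rejected when it was only obtained by reverse lookup, which is one way to express that the level of trust varies.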

There is some gray area between a) and b). For example, I can envision that if there is a syslog.conf rule (forward everything to server.example.net)

*.* @@server.example.net

The certificate name check for server.example.net (using dNSName extension) could probably be part of a) – others may think it is part of b).
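As a rough illustration of that name check, the following Python sketch takes the forwarding rule and a list of dNSName entries (assumed to be already extracted from the peer certificate) and tells whether the configured target is covered. The helper names are hypothetical, the wildcard handling (one leftmost label only) is a simplifying assumption, and IPv6 literals are not handled.

```python
def dns_name_matches(pattern, hostname):
    """Case-insensitive match of one dNSName entry against a host name."""
    pattern = pattern.lower().rstrip(".")
    hostname = hostname.lower().rstrip(".")
    if pattern == hostname:
        return True
    if pattern.startswith("*."):
        # Assumption: a wildcard covers exactly one leftmost label.
        head, sep, tail = hostname.partition(".")
        return bool(head) and bool(sep) and tail == pattern[2:]
    return False


def target_from_rule(rule):
    """Extract the host from a rule like '*.* @@server.example.net'."""
    _selector, action = rule.split(None, 1)
    if not action.startswith("@@"):
        raise ValueError("not a TCP forwarding action: %r" % action)
    host = action[2:].strip()
    return host.split(":", 1)[0]  # drop an optional ':port' suffix


def cert_matches_rule(rule, cert_dns_names):
    """True if any dNSName in the certificate covers the rule's target."""
    target = target_from_rule(rule)
    return any(dns_name_matches(name, target) for name in cert_dns_names)
```

Whether this check lives in a) or b) does not change the code much. It mostly changes who owns the list of acceptable names.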

Also, even doing a) places some burden onto the system, like the need to have trust anchors configured in order to do the validation. This hints at at least another sub-layer.

I think it would be useful to spell out these different entities in the draft.

Coming back to policy decisions, one must keep in mind that the IESG explicitly asked for those inside the document. This was done based on the -correct- assumption that today’s Internet is no longer a friendly place. So the IESG would like to see a default policy implemented that provides at least a minimum acceptable security standard. Unfortunately, this is not easy to do in the world of syslog. For the home users, we cannot rely on any ability to configure something. For the enterprise folks, we need to have defaults that do not get into their way of doing things [aka “can be easily turned off”]. There is obviously much in between these poles, so it largely depends on the use case. I have begun a wiki page with use cases and hope people will contribute to it. It could lead us to a much better understanding of the needs (and the design decisions that need to be made to deliver these). It is available at

http://wiki.rsyslog.com/index.php/TLS_for_syslog_use_cases

After close consideration, I think the draft currently fails to address the two use cases defined above properly. Partly it fails because it is not possible under the current IESG requirement to be safe by default. We cannot be fully safe by default without configuration, so whatever we specify will fail for the home user.

A compromise may be to provide “good enough” security in the default policy. I see two ways of doing that: one is to NOT address the Masquerade and Modification threats in the default policy, just the Disclosure threat. That leads us to unauthenticated syslog being the default (contrary to what is currently implemented) [Disclosure is addressed in this scenario as long as the client configs are not compromised, which I find sufficient: someone who can compromise the client config can find other ways to get hold of the syslog message content].

An alternative is to use the way HTTPS works: we only authenticate the server. To authenticate, we need to have a trusted certificate inside the server. As we can see in HTTPS, this doesn’t really require PKI. It is sufficient to have the server cert signed by one of a few globally trusted CAs and have these root certificates distributed with all client installations as part of their setup procedure. This is quite doable. In that scenario, a client can verify a server’s identity and the above sample (*.* @@server.example.net) could be verified with sufficient trust. The client, however, is still not authenticated. Even so, the threats we intended to address are almost all addressed, except for the access control issue which is defined as part of the Masquerade threat (which I think is even a different beast and deserves its own threat definition now that I think about it). In short we just have an access control issue in that scenario. Nothing else.
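For illustration, this is roughly what the HTTPS-style, server-only authentication looks like on the client side when sketched with Python’s standard `ssl` module. It is not rsyslog code. The function names and the `ca_file` parameter are assumptions, and the port is left to the caller because the text above does not fix one.

```python
import socket
import ssl


def make_client_context(ca_file=None):
    """Build a TLS client context that authenticates the server only.

    With ca_file=None the root certificates shipped with the system are
    used, which mirrors the idea of distributing roots with the client.
    """
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    # create_default_context() already requires a valid server certificate
    # and checks the server name. No client certificate is loaded, so the
    # client itself stays unauthenticated.
    return ctx


def connect(host, port, ctx):
    """Open a TLS session and let the library verify the server name."""
    raw = socket.create_connection((host, port))
    return ctx.wrap_socket(raw, server_hostname=host)
```

A forwarder configured for server.example.net would call `connect("server.example.net", port, make_client_context())`. The handshake fails if the certificate does not chain to a known root or does not carry that name.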

The problem, however, is that the server still needs a certificate and now even one that, for a home user, is prohibitively expensive. The end result will be that people turn off TLS, because they neither know how to obtain the certificate nor are willing to trade in a weekend vacation for a certificate ;) In the end, even that mode will be less useful than anonymous authentication.

The fingerprint idea is probably a smart solution to the problem. It depends on the ability to auto-generate a certificate [I expressed that I don’t like that idea yesterday, but my thinking has evolved ;)] OR to ship every device/syslogd with a unique certificate. In this case, only minimal interaction is required. The idea obviously is like with SSH: if the remote peer is unknown, the user is queried if the connection request is permitted and if the certificate should be accepted in the future. If so, it is added permanently to the valid certificate store and used in the future to authenticate requests from the same peer. This limits the security weakness to the first session. HOWEVER, the problem with syslog is that the user typically cannot be prompted when the initial connection happens (everything is background activity). So the request must actually be logged and an interface be developed that provides for user notification and the ability to authorize the request.

This requires some kind of “unapproved certificate store” plus a management interface for it. Done well, this may indeed enable a home user to gain protection from all three threats without even knowing what he really does. It “just” requires some care in approving new fingerprints, but that’s a general problem with human nature that we may tackle by good user interface design but can’t solve from a protocol point of view.
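The store described above could be sketched as follows in Python. This is only an illustration under assumptions: the class and method names are invented, SHA-256 over the DER-encoded certificate is my choice of fingerprint (the discussion does not fix a hash), and a real implementation would persist both lists and log every new request.

```python
import hashlib


def fingerprint(cert_der):
    """Colon-separated hex fingerprint of a DER-encoded certificate.

    Assumption: SHA-256 is used; the text above does not name a hash.
    """
    digest = hashlib.sha256(cert_der).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


class FingerprintStore:
    """Approved fingerprints plus an 'unapproved' queue for the admin."""

    def __init__(self):
        self.approved = set()
        self.pending = {}  # fingerprint -> address it was first seen from

    def check(self, cert_der, peer_addr):
        """Return True for known peers; queue unknown ones for approval."""
        fp = fingerprint(cert_der)
        if fp in self.approved:
            return True
        # Nobody can be prompted in a background daemon, so the request
        # is remembered until someone reviews it.
        self.pending.setdefault(fp, peer_addr)
        return False

    def approve(self, fp):
        """Called from the management interface after human review."""
        if fp not in self.pending:
            raise KeyError(fp)
        del self.pending[fp]
        self.approved.add(fp)
```

The first connection from an unknown peer is refused and lands in `pending`. Once approved, later connections with the same certificate pass, so the weak spot is limited to that first review.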

The bad thing is that it requires much more change to existing syslogd technology. That, I fear, reduces acceptance rate. Keep in mind that we already have a technically good solution (RFC 3195) which miserably failed in practice due to the fact it required too much change.

If I look at *nix implementations, syslogd implementers are probably tempted to “just” log a message saying “could not accept remote connection due to invalid fingerprint xx:xx:…” and leave it to the user to add it to syslog.conf. However, I fear that for most home setups even that would be too much. So in the end, in order to avoid user hassle, most vendors would probably default back to UDP syslog and enable TLS only on user request.

From my practical perspective this even sounds reasonable (given the needs and imperfections of the real world…). If that assessment is true, we would probably be better off using anonymous TLS as the default policy, with the next priority on fingerprint authentication as laid out above. A single big switch could change between these two in actual implementations. Those users that “just want to get it running” would never find that switch but still be somewhat protected, while the (slightly) more technically aware can turn it to fingerprint authentication and then will hopefully be able to do the remaining few configuration steps. Another policy is the certificate chain based policy, where using public CAs would make sense to me.
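The “single big switch” could look something like the Python sketch below. The policy names and the `peer_accepted` helper are hypothetical. The two boolean inputs stand for the results of a fingerprint lookup and a chain validation done elsewhere.

```python
from enum import Enum


class AuthPolicy(Enum):
    ANONYMOUS = "anon"           # encryption only, peer is not authenticated
    FINGERPRINT = "fingerprint"  # peer certificate must be on the approved list
    CERT_CHAIN = "chain"         # peer certificate must chain to a trusted CA


# Users who never find the switch stay on anonymous TLS.
DEFAULT_POLICY = AuthPolicy.ANONYMOUS


def peer_accepted(policy, fingerprint_approved=False, chain_valid=False):
    """Map the configured policy to an accept/reject decision."""
    if policy is AuthPolicy.ANONYMOUS:
        return True
    if policy is AuthPolicy.FINGERPRINT:
        return fingerprint_approved
    if policy is AuthPolicy.CERT_CHAIN:
        return chain_valid
    raise ValueError("unknown policy: %r" % (policy,))
```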

To wrap it up:

1. I propose to lower the default level of security for the reasons given. My humble view is that lower default security will result in higher overall security.

2. We should split authentication policies from the protocol itself, just as suggested by Robert and John. We should define a core set of policies (I think I described the most relevant simple cases above, Robert described some complex ones) and leave it to others to define additional policies based on their demand.

Policies should go either into their own section OR into their own documents. I strongly favor putting them into their own documents if that enables us to finally finish/publish -transport-tls and the new syslog RFC series. If that is not an option, I’d prefer to spend some more work on -transport-tls, even if it delays things further, instead of producing something that does not meet the needs found in practice.

Rainer

> -----Original Message-----
> From: syslog-bounces@ietf.org [mailto:syslog-bounces@ietf.org] On
> Behalf Of robert.horn@agfa.com
> Sent: Thursday, May 08, 2008 5:53 PM
> To: Joseph Salowey (jsalowey); syslog@ietf.org
> Subject: Re: [Syslog] I-D Action:draft-ietf-syslog-transport-tls-12.txt
>
> Section 4.2 is better, but it still needs work to separate the policy
> decisions from the protocol definition. Policy decisions are driven by
> risk analysis of the assets, threats, and environment (among other
> things). These are not uniform over all uses of syslog. That makes it
> important to separate the policy from the protocol, in both the
> specifications and in the products.
>
> In the healthcare environment we use TLS to protect many of our
> connections. This is both an authentication protection and a
> confidentiality protection. The policy decisions regarding key
> management and verification will be very similar for a healthcare use
> of syslog. Some healthcare sites would reach the same policy decision
> as is in 4.2, but here are three other policy decisions that are also
> appropriate:
>
> Policy A:
> The clients are provided with their private keys and the public
> certificates for their authorized servers by means of physical media,
> delivered by hand from the security office to the client machine
> support staff. (The media is often CD-R because it’s cheap, easy to
> create, easy to destroy, and easy to use.) During TLS establishment
> the clients use their assigned private key and the server confirms
> that the connection is from a machine with one of the assigned private
> keys. The client confirms that the server matches one of the provided
> public certificates by direct matching. This is similar to the
> fingerprint method, but not the same. My most recent experience was
> with an installation using this method. We had two hours to install
> over 100 systems, including the network facilities. This can only be
> done by removing as many installation schedule dependencies as
> possible. The media method removed the certificate management
> dependencies.
>
> Policy B:
> These client systems require safety and functional certification
> before they are made operational. This is done by inspection by an
> acceptance team. The acceptance team has a “CA on a laptop”. After
> accepting safety and function, they establish a direct isolated
> physical connection between the client and the laptop. Then using
> standard key management tools, the client generates a private key and
> has the corresponding public certificate generated and signed by the
> laptop. The client is also provided with a public certificate for the
> CA that must sign the certs for all incoming connections.
>
> During a connection setup the client confirms that the server key has
> been signed by that CA. This is similar to a trusted anchor, but not
> the same. There is no chain of trust permitted. The key must have been
> directly signed by the CA. During connection setup the server confirms
> that the client cert was signed by the “CA on a laptop”. Again, no
> chain of trust is permitted. This policy is incorporating the extra
> aspect of “has been inspected by the acceptance team” as part of the
> authentication meaning. They decided on a policy-risk basis that there
> was not a need to confirm re-inspection, but the “CA on a laptop” did
> have a revocation server that was kept available to the servers, so
> that the acceptance team could revoke at will.
>
> Policy C:
> This system was for a server that accepted connections from several
> independent organizations. Each organization managed certificates
> differently, but ensured that the organization-CA had signed all certs
> used for external communications by that organization. All of the
> client machines were provided with the certs for the shared servers
> (by a method similar to the fingerprint method). During TLS connection
> the clients confirmed that the server cert matched one of the certs on
> their list. The server confirmed that the client cert had been signed
> by the CA responsible for that IP subnet. The server was configured
> with a list of organization CA certs and their corresponding IP
> subnets.
>
> I do not expect any single policy choice to be appropriate for all
> syslog uses. I think it will be better to encourage a separation of
> function in products. There is more likely to be a commonality of
> configuration needs for all users of TLS on a particular system than
> to find a commonality of needs for all users of syslog. The policy
> decisions implicit in section 4.2 make good sense for many uses. They
> are not a complete set. So a phrasing that explains the kinds of
> maintenance and verification needs that are likely is more
> appropriate. The mandatory verifications can be separated from the key
> management system and kept as part of the protocol definition. The
> policy decisions should be left as important examples.
>
> Kind Regards,
>
> Robert Horn | Agfa HealthCare
> Research Scientist | HE/Technology Office
> T +1 978 897 4860
>
> Agfa HealthCare Corporation, 100 Challenger Road, Ridgefield Park, NJ,
> 07660-2199, United States
> http://www.agfa.com/healthcare/

The MonitorWare Knowledge Base

There’s currently a lot brewing over here. With the release of phpLogCon, I finally got to a stage where we have a decent web front-end that enables us to support admins with troubleshooting right while they look at their log data. Of course, such a system not only deserves careful design, but it requires a knowledge base so that people troubleshooting can find solutions.

If you look at our web sites, you will notice there are lots of troubleshooting resources already available. Just have a look at forum.adiscon.com and you will see what I mean. This forum not only offers product-specific support, but also has a number of quite generic discussion forums (covering Windows Events and syslog messages). We also have other troubleshooting databases. Then, each product site (like rsyslog) has its own support forum.

While all of this is great, it does not play well with the idea of the central, one-stop troubleshooting resource we intend to build. So our first step towards this resource is putting the existing resources under a single umbrella and placing them into a single system. That, of course, requires some redesign and won’t be perfect from day one.

The initial step is to consolidate all forums into a single one. My friend Andre is right now doing that. He has set up the new site kb.monitorware.com which in the future will provide all troubleshooting resources in an easy to find way. That site will be highly integrated with phpLogCon, which will be able to pull troubleshooting info from the central repository while looking at the local logs. I am very excited to see this become a reality (though I have to admit we are several months away from the ultimate goal).

The knowledge base site is currently in experimental operation and being finalized for production use. I hope to be able to go officially online next week.

rsyslog work log 8

Past day’s rsyslog work log:
2008-05-06
– added documentation for TLS
– merged TLS branch into main devel branch
– released 3.19.0 (MILESTONE, we now have TLS! :-D )
– some cleanup (gotten rid of some more plain chars)
– fixed some issues thanks to darix
– fixed problem with man pages thanks to Michael Biebl’s help
2008-05-07
– fixed some issues in liblogging (thanks to darix for pointing out)
– limited number of unavoidable compiler warnings when compiling with
GnuTLS
– released 3.19.1
2008-05-08
– bugfix: gtls netstream driver did not specify threading model (can
possibly lead to “interesting effects” ;))
– server’s X509 cert fingerprint is obtained by client on connect
– thought hard about syslog-transport-tls-12 (obviously, this does
not manifest in code ;))

rsyslog 3.19.0 released – world’s first syslog-transport-tls implementation

I am very pleased to announce rsyslog 3.19.0.

It is the first release that natively supports TLS for plain TCP syslog. Actually, it is the world’s first implementation of the upcoming syslog-transport-tls IETF standard. As this standard is not yet finished, the implementation is obviously experimental.

Native TLS is a big improvement over existing functionality. For example, rsyslog can now be used without the help of stunnel, which relieves us of some problems from those configurations. To the best of my knowledge, rsyslog is the first open-source syslogd offering TLS support natively.

The current TLS functionality is limited to the bare minimum. During the next few weeks, I will improve it based on my own spec and feedback (hopefully received). My hope is to have a production-grade implementation by summer at the latest. Please note rsyslog’s premium and ultra-reliable RELP protocol does not yet support TLS (but can be used with stunnel without the real problems legacy tcp had with it). My plan is to let TLS mature with legacy syslog and then move it to RELP. Thus I can limit my development to one major use case, which I think will facilitate things.

There is some documentation on how to use the new TLS mode:

http://www.rsyslog.com/doc-rsyslog_tls.html

Currently, TLS is provided via GnuTLS. As I outlined earlier on the list, GnuTLS offered much more support for getting started (documentation and sample-code wise). I will focus on GnuTLS until I am fully satisfied with the TLS implementation. I’ll then see whether I can also integrate NSS. Advice in this regard would be highly welcome, so if you have some knowledge in this area, please contribute.

In order to support TLS (and multiple libraries!), a major rewrite of the networking components has been done. Rsyslog now supports a so-called “network streams” (netstreams) driver interface. This interface enables app-level functionality (like the legacy tcp syslog sender and receiver) to work with dynamically selectable netstream drivers (like plain, unencrypted TCP and TLS). This interface will enable rsyslog to utilize other TLS drivers (and even other protocols) in the future. Different drivers can even be used concurrently.
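The real interface is C code inside rsyslog, but the idea of a netstream driver layer can be illustrated with a small Python sketch. Everything here is an assumption made for the example: the class names, the driver names "plain" and "tls", the registry, and the LF-terminated framing. Python’s `ssl` module merely stands in for a TLS library.

```python
import ssl
from abc import ABC, abstractmethod


class NetstreamDriver(ABC):
    """What the app-level sender needs from any transport driver."""
    name = ""

    @abstractmethod
    def wrap_client(self, sock, server_name):
        """Turn a connected socket into a stream ready for sending."""


class PlainTcpDriver(NetstreamDriver):
    name = "plain"

    def wrap_client(self, sock, server_name):
        return sock  # unencrypted: use the socket as it is


class TlsDriver(NetstreamDriver):
    name = "tls"

    def __init__(self):
        self.ctx = ssl.create_default_context()

    def wrap_client(self, sock, server_name):
        return self.ctx.wrap_socket(sock, server_hostname=server_name)


DRIVERS = {}


def register(driver):
    DRIVERS[driver.name] = driver


def send_message(driver_name, sock, server_name, msg):
    """App-level sender: identical code whatever driver is selected."""
    stream = DRIVERS[driver_name].wrap_client(sock, server_name)
    stream.sendall(msg + b"\n")  # assumption: LF-terminated framing
    return stream


register(PlainTcpDriver())
register(TlsDriver())
```

The sender only knows the driver by name, so two actions can use different drivers at the same time, and adding another TLS library would mean registering one more driver rather than touching the sender.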

Rsyslog now has been split into a runtime system and tools (with currently rsyslogd being the only tool). This design will further strengthen modularization and help make rsyslog functionality available in small parts.

Finally, the RFC 3195 input has been rewritten in the form of an input plugin. It can now be built as part of the normal build procedure.

Please note that there were a couple of major changes. I expect the initial 3.19.0 to be quite unstable. I recommend it for testing environments only. Even those parts that were not directly touched may have become a bit destabilized due to the runtime split. So please use it with care. Feedback, however, would be more than welcome, because I need to start somewhere to stabilize this release. That can only be done with your help. So please use it on test systems, try to break it and file bug reports when it fails.

Download:

http://www.rsyslog.com/Downloads-req-viewdownloaddetails-lid-102.phtml

Changelog:

http://www.rsyslog.com/Article221.phtml

File your bug reports here ;) :

http://bugzilla.adiscon.com/rsyslog-bugs.html

I hope this release is useful. Feedback is much appreciated.

rsyslog work log 9

Yesterday’s rsyslog work log:
2008-05-05
– made imgssapi work with new netstrm driver model
there were a couple of things where imgssapi was not compatible
with the new encapsulation. I did a somewhat dirty fix. The real
solution would be to turn gssapi functionality into a netstream
driver, which is too much for now (after all, we want to release
some time AND we need to have the code mature in practice)
– added $DefaultNetstreamDriverCAFile config directive
– added $DefaultNetstreamDriverCertFile config directive
– added $DefaultNetstreamDriverKeyFile config directive
– added $ActionSendStreamDriver config directive

rsyslog work log 26

Yesterday’s rsyslog work log:
2008-03-04
– changed module interface to support querying obj interface (stage work)
– changed module interface version, as the interface change is quite large
– tweak omsnmp doc a bit (to cover Andre’s changed oid)
– did some portability changes to make rsyslog compile on HP UX
I couldn’t resist: I finally found a suitable HP UX machine on HP’s
testdrive system. So I looked at what it takes to make rsyslog
compile. Got this going after a relatively short while. The core
engine also seems to run, but there seem to be some issues. So far,
rsyslog seems to compile but it is questionable if it can actually
run. I’ll look into this later (or as need arises), but will now
focus again on new features (portability, as a side-effect, also
often shows code that can be improved, so it is useful to look
at different platforms even if we do not eagerly need to support
them).