Following up on my post on rsyslog’s new native email capability, an interesting conversation arose. I’d like to share it with you:
> > I promise to listen very carefully and try to implement anything that is
> > doable and makes sense in the rsyslog context.
> >
> One thing springs to mind – I think “sendmail” support is more important
> than you give it credit.
>
> What if you’ve got an alert rule in rsyslog to email you when your
> network link fails – but your SMTP server is at the other end of the
> link? :-) If you used sendmail – you get requeuing and retrying for free
> – I don’t think you want to have to add that to your SMTP support…
Well, that’s actually not an issue at all in rsyslog. The rsyslog core
engine is reliable [to be precise: can be configured to be reliable,
it’s not by default] in a way that exactly handles this situation. In
rsyslog, any action, including now mail, can run on its own queue. When
an action fails, it tells the rsyslog core that it could not
successfully complete. Then, the rsyslog core schedules retries until it
finally succeeds. While doing so, the messages are kept inside a queue.
This queue is in memory as long as that’s sufficient and is moved to
disk if there is demand (e.g. rsyslog shutdown, running out of
configured in-memory queue space). A sample of such a configuration
(this time with the database writer) can be found at:
http://www.rsyslog.com/doc-rsyslog_high_database_rate.html
Bottom line – rsyslog is designed to work with failing destinations and
automatically recover these. So there is nothing special needed to make
it handle a failing smtp connection.
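As a rough, untested sketch of what this could look like for the mail action: the queue and retry directives below are the same kind used in the database sample linked above, applied here to ommail. The spool directory and the queue file name "mailq" are assumptions chosen for illustration, and the server and addresses are placeholders.

```
$ModLoad ommail
# spool directory for queue files (path is an assumption)
$WorkDirectory /var/spool/rsyslog
$template mailSubject,"disk problem on %hostname%"
$template mailBody,"RSYSLOG Alert\r\nmsg='%msg%'"
# give the next action its own in-memory queue
$ActionQueueType LinkedList
# a file name makes the queue disk-assisted ("mailq" is an assumption)
$ActionQueueFileName mailq
# write pending messages to disk when rsyslog shuts down
$ActionQueueSaveOnShutdown on
# retry indefinitely while the SMTP server cannot be reached
$ActionResumeRetryCount -1
$ActionMailSMTPServer mail.example.net
$ActionMailFrom rsyslog@example.net
$ActionMailTo operator@example.net
$ActionMailSubject mailSubject
# the if ... then ... mailBody must be on one line!
if $msg contains 'hard disk fatal failure' then :ommail:;mailBody
```

With an unlimited retry count the action never gives up on its own, so this variant suits alerts that must eventually arrive rather than alerts that should fail over to another channel quickly.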
In fact, I consider the SMTP direct mode more reliable than the sendmail
mode, exactly because of that feature. With sendmail, I hand over the
message to an external entity but do not know if delivery succeeded.
With SMTP direct, I know at least it made its way to the SMTP server.
Granted, I don’t know if the SMTP server will ultimately deliver it, but
I have a bit more control over what’s going on.
For example: rsyslog also has a mode where it can use backup actions if
things fail (after n retries). So let’s consider the example above.
Let’s say we have an urgent alert, but the smtp server is down. With
sendmail, I hand the message over to sendmail but do not know that
sendmail actually queues it. With smtp direct, I *know* that the smtp
server is unresponsive. Depending on the urgency, I may either do a few
retries or I may immediately switch to another delivery method. For
example, I may then go on to try SNMP. Or I may do another email action in
this case and try to contact an email-to-sms gateway so that this can be
delivered.
Please note that in rsyslog one can have multiple actions chained
together. So a probable scenario to handle such a case could be:
1. try to email via the corporate server
2. if that fails, try to email via a public gateway
3. if that fails, start a program to do some automagic action
All of this is possible because I do not use sendmail. But, again, I
of course do not know if the mail server I used with rsyslog succeeds in
its delivery attempt. One weak spot always remains ;)
To use yesterday’s sample, one could use a backup SMTP server with just
a little bit of configuration as follows:
$ModLoad ommail
$template mailSubject,"disk problem on %hostname%"
$template mailBody,"RSYSLOG Alert\r\nmsg='%msg%'"
# primary action
$ActionMailSMTPServer mail.example.net
$ActionMailFrom rsyslog@example.net
$ActionMailTo operator@example.net
$ActionMailSubject mailSubject
# make sure we receive a mail only once in six
# hours (21,600 seconds ;))
$ActionExecOnlyOnceEveryInterval 21600
# the if ... then ... mailBody must be on one line!
if $msg contains 'hard disk fatal failure' then :ommail:;mailBody
# begin backup action, carried out if primary fails
$ActionExecOnlyWhenPreviousIsSuspended on
$ActionMailSMTPServer mail2.example.net
$ActionMailFrom rsyslog@example.net
$ActionMailTo operator@example.net
$ActionMailSubject mailSubject
$ActionExecOnlyOnceEveryInterval 21600
& :ommail:;mailBody
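
The third step of the chain described earlier (starting a program when both mail servers fail) could probably be appended to this sample along the following lines. This is an untested sketch: the script path is a hypothetical placeholder, and it assumes the legacy shell-execute action (a selector target starting with "^") is available in your rsyslog version.

```
# third step: run a program if the backup mail action is suspended, too
# (setting this again is harmless if it is still in effect)
$ActionExecOnlyWhenPreviousIsSuspended on
# /usr/local/bin/disk-alert.sh is a hypothetical script; the rendered
# mailBody template is handed to it as its command line parameter
& ^/usr/local/bin/disk-alert.sh;mailBody
# switch the setting off again so that later, unrelated actions
# are not treated as backups
$ActionExecOnlyWhenPreviousIsSuspended off
```

Note that a backup action only runs once the previous action is actually suspended, so how quickly the chain falls through depends on the retry settings (for example $ActionResumeRetryCount) of the actions before it.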