Martin Schütte posted an interesting approach to solving the syslog/plain TCP unreliability issue in his blog:
Reliable TCP Reconnect made Easy
In short, he does a non-blocking recv() on the connection to see if the remote peer has shut it down. This may work, and I will give it a try. However, as far as I understand it, it will NOT solve the unreliability caused by broken connections: a recv() can only report a shutdown that has already reached the client. I have to admit that I also think there is still a race condition (what if the server closes the connection after the client has done the recv() but before the send()?).
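To make the idea concrete, here is a minimal sketch of such a check in Python. It is my own illustration, not Martin's code: the helper names `peer_has_closed` and `send_checked` are hypothetical, and the use of `MSG_PEEK` (so that no data is consumed) is an assumption on my side. The comment in `send_checked` marks the race window mentioned above.

```python
import socket


def peer_has_closed(sock: socket.socket) -> bool:
    """Return True if the remote peer appears to have shut down the connection.

    Hypothetical helper for illustration; not taken from the original post.
    """
    previous_timeout = sock.gettimeout()
    sock.setblocking(False)
    try:
        # MSG_PEEK (an assumption here) leaves any pending data in the buffer.
        data = sock.recv(1, socket.MSG_PEEK)
    except BlockingIOError:
        # Nothing to read: no shutdown has been seen so far.
        return False
    except (ConnectionResetError, ConnectionAbortedError):
        # The connection was reset or aborted.
        return True
    finally:
        # Restore the blocking mode the caller was using.
        sock.settimeout(previous_timeout)
    # An empty read means an orderly shutdown by the peer.
    return data == b""


def send_checked(sock: socket.socket, payload: bytes) -> bool:
    """Send payload only if the peer has not been seen closing the connection."""
    if peer_has_closed(sock):
        return False  # caller should reconnect and retry
    # Race window: the peer may still close between the check and the send.
    sock.sendall(payload)
    return True
```

A return value of True from `send_checked` only means the data was handed to the local TCP stack, not that the server received it, which is why the check alone cannot make plain TCP syslog reliable.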
I’ll report back as soon as I have some real-life data. It’s an interesting approach in any case and good to know somebody else is working on the same issues. That will hopefully bring us to a better overall solution :)