Why we use Static Code Analysis

We use static code analysis for two reasons. Both of them should probably be well-known, but discussions show that this is not always the case. So I thought writing a small blog post makes sense.

The first reason is obvious: static analyzers help us catch code problems at an early stage, and they do so without any special effort from test engineers. The analyzer “thinks” about many cases a human being does not think about and so can catch errors that are sometimes embarrassingly obvious, yet that you would still have overlooked. Detecting these things early saves a lot of time. So we try to run the analyzers early and often (they are also part of our CI for that reason).

The benefits come at an expense, and this expense is named “false positive”. They happen, and I always get asked whether I can make an exception to cover such a case. Unfortunately, I cannot. If I allowed a single static analyzer failure into the QA system, all further builds would fail, rendering static analysis unusable. So, sorry, if you run into a false positive, you need to find a way to work around it. In my experience the “const” keyword in C is a little gem that not only helps secure against accidental variable modification but also gets you a long way with static analyzers. But, granted, sometimes it’s hard to work around false positives. It’s worth it, so just do it ;-)
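As a rough illustration of the “const” remark, here is a minimal sketch. The function name count_char and its parameters are made up for this example and are not taken from rsyslog; whether const actually silences a particular false positive depends on the analyzer and the code in question.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only.
 * "const char *buf" promises that the function only reads the data:
 * the compiler rejects accidental writes through buf, and an analyzer
 * can usually assume the caller's buffer is unchanged after the call. */
size_t count_char(const char *const buf, const size_t len, const char needle)
{
    assert(buf != NULL);
    size_t hits = 0;
    for (size_t i = 0; i < len; ++i) {
        if (buf[i] == needle)
            ++hits;
    }
    return hits;
}
```

A call such as count_char("a,b,c", 5, ',') counts the two commas. An accidental assignment like buf[i] = 0 inside the loop would no longer compile.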

The second reason for using static analysis also seems obvious, but in my experience is often overlooked: humans tend to forget some important test cases. It is well-known and accepted that tests should be crafted by QA engineers instead of the folks that wrote the code (because the developer would otherwise only test what he had thought about in the first place). For smaller projects, that’s not always possible, but even more importantly, QA folks can also overlook necessary test cases. The risk is reduced by using a specific test crafting methodology, but it still exists. This is especially true as, due to combinatorial explosion, not all configuration setting interactions can be tested. So picking dynamic tests is always a compromise between what is a) seen at all, b) desirable, and c) possible. Static analysis helps with these problems. While it obviously can also fail, I have seen static analysis more than once detect things that we did not cover in dynamic tests. That way it introduces an additional layer of protection. It also sometimes brought up the need for additional dynamic tests.
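To make the “untested combination” point a bit more concrete, here is a small sketch. The struct demo_conf, its fields and the function header_len are invented for this example; they are not rsyslog code. The idea is that a test suite may only ever set both options together, while an analyzer may also look at the path where the pointer is still NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical configuration with two independent settings. */
struct demo_conf {
    int use_prefix;      /* setting A: a prefix string is wanted   */
    int add_separator;   /* setting B: reserve room for ": "       */
    const char *prefix;  /* may still be NULL if it was never set  */
};

/* Returns the number of bytes the header would occupy. */
size_t header_len(const struct demo_conf *conf)
{
    assert(conf != NULL);
    size_t len = 0;
    /* Check the pointer itself, not only the flag: with use_prefix set
     * but prefix never assigned, strlen() would otherwise receive NULL.
     * This is the kind of path a static analyzer may report even when
     * no dynamic test happens to exercise it. */
    if (conf->use_prefix && conf->prefix != NULL)
        len += strlen(conf->prefix);
    if (conf->add_separator)
        len += 2;  /* room for ": " */
    return len;
}
```

With two settings there are only four combinations, so testing all of them is easy. The problem starts when the number of settings grows.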

It should be mentioned that fuzzing is also a great thing to have inside a QA system, but I unfortunately have not yet had the opportunity to deploy it on a real project. But even fuzzing as done by Google is limited by the same combinatorial explosion problem in regard to configuration settings. For example, rsyslog has many more than 250 config settings, so even if each of them were a simple on/off switch we would have more than 2^250 ≈ 1.8 × 10^75 configurations to fuzz, which is simply impossible [yes, an approach would be to fuzz the tuple (config,data), but that’s a different topic ;-)].
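The arithmetic behind that number can be sketched in a few lines of C. This assumes, as a lower bound, that every setting is just an on/off switch; the function names and the rate of configurations per second are made up for illustration.

```c
#include <assert.h>

/* Number of distinct configurations for n independent on/off settings.
 * A double is used because the value overflows every built-in integer
 * type for n = 250; repeated doubling is exact in binary floating
 * point as long as the result stays in range. */
double config_count(unsigned n_settings)
{
    double count = 1.0;
    for (unsigned i = 0; i < n_settings; ++i)
        count *= 2.0;
    return count;
}

/* Years needed to try every configuration at a given, purely
 * hypothetical, rate of configurations per second. */
double years_to_cover(unsigned n_settings, double configs_per_second)
{
    assert(configs_per_second > 0.0);
    const double seconds_per_year = 365.25 * 24.0 * 3600.0;
    return config_count(n_settings) / configs_per_second / seconds_per_year;
}
```

config_count(250) comes out at roughly 1.8e75. Even at an assumed one billion configurations per second, years_to_cover(250, 1e9) is in the order of 10^58 years.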

Static analysis is not the answer to the software QA problem. But it is an extremely valuable building block!

The clang thread sanitizer

Finding threading bugs is hard. Clang thread sanitizer makes it easier. The thread sanitizer instruments the to-be-tested code and emits useful information about actions that look suspicious (most importantly data races). This is a great aid in development and for QA. Thread sanitizer is faster than valgrind’s helgrind, which makes it applicable to more use cases. Note however that helgrind and thread sanitizer sometimes each detect issues that the other one does not.
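For readers who have not seen a data race up close, here is a minimal sketch of the kind of code thread sanitizer is meant for. It is not rsyslog code; the names counter, worker and run_workers as well as the DEMO_RACE macro are invented for this example. By default the counter is protected by a mutex. Defining DEMO_RACE removes the locking, which leaves an unsynchronized read-modify-write on a shared variable.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define N_THREADS 4
#define N_INCREMENTS 10000

static long counter = 0;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* Each worker increments the shared counter N_INCREMENTS times. */
static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < N_INCREMENTS; ++i) {
#ifndef DEMO_RACE
        pthread_mutex_lock(&counter_lock);
#endif
        ++counter;  /* a data race if DEMO_RACE is defined */
#ifndef DEMO_RACE
        pthread_mutex_unlock(&counter_lock);
#endif
    }
    return NULL;
}

/* Starts the workers, waits for all of them, returns the final count. */
long run_workers(void)
{
    pthread_t tid[N_THREADS];
    counter = 0;
    for (int i = 0; i < N_THREADS; ++i) {
        int rc = pthread_create(&tid[i], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;  /* avoid an unused warning when NDEBUG is set */
    }
    for (int i = 0; i < N_THREADS; ++i)
        pthread_join(tid[i], NULL);
    return counter;
}
```

Calling run_workers() from a small main() and building with -fsanitize=thread plus -DDEMO_RACE should make thread sanitizer report a data race on counter. Without DEMO_RACE the run should stay silent and the result is N_THREADS * N_INCREMENTS.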
This is how thread sanitizer can be activated:
  • install clang package (the OS package is usually good enough, but if you want to use clang 5.0, you can obtain packages from http://apt.llvm.org/)
  • export CC=clang # or CC=clang-5.0 for the LLVM packages
  • export CFLAGS="-g -fsanitize=thread -fno-omit-frame-pointer"
  • re-run configure (very important, else CFLAGS is not picked up!)
  • make clean (important, else make does not detect that it needs to build some files due to change of CFLAGS)
  • make
  • install as usual
If you came to this page trying to debug a rsyslog problem, we strongly suggest running your instrumented version interactively. To do so:
  • stop the rsyslog system service
  • sudo -i (you usually need root privileges for a typical rsyslogd configuration)
  • execute /path/to/rsyslogd -n …other options…
    here “/path/to” may not be required and often is just “/sbin” (so “/sbin/rsyslogd”)
    “other options” is whatever is specified in your OS startup scripts, most often nothing
  • let rsyslog run; thread sanitizer will spit out messages to stdout/stderr (or nothing if all is well)
  • press Ctrl-C to terminate the rsyslog run

Note that the thread sanitizer will display some false positives at the start (related to pthread_cancel, maybe localtime_r). The stack trace should contain exact location info. If it does not, the ASAN_SYMBOLIZER is not correctly set, but usually it “just works”.
Doc on thread sanitizer is available here: https://clang.llvm.org/docs/ThreadSanitizer.html