The Real Scope Behind the rsyslog Documentation Overhaul

For a concise computer science summary of this effort, see the final section of this article.

When I began the current documentation overhaul, the objective was never limited to cleaning up a few pages. From the beginning, the plan was to prepare rsyslog for the AI era. And the truth is simple: without modern AI tooling, this work would not have been feasible at this depth or speed.

Symbolic illustration showing documentation, an AI head, and a graph structure representing RAG.

The rsyslog documentation had grown into a dense, highly technical reference manual over many years. It reflected historical layering, incremental contributions, and corner cases. Experts could navigate it; newcomers often could not. And because rsyslog has a non-trivial architecture, the documentation effectively is the user interface to the system. If the docs are opaque, the system is opaque. That alone meant this could not remain a simple clean-up project.

The real shift came once AI was able to meaningfully assist with restructuring. In my July 2025 article I wrote that AI helped turn “chaos into a plan”, and that is still accurate. AI lets me analyze, reorganize, rewrite, and evaluate documentation at a scale that was not feasible by hand. It also helps detect missing context, inconsistent terminology, and broken conceptual boundaries. The project would not have reached its current momentum without AI support.

Core Ideas Behind the Documentation Restructuring

From the start, the restructuring has been tied to retrieval-augmented generation and machine ingestion. If AI is going to act as an alternative user interface to rsyslog — and it clearly will — the documentation must be prepared accordingly: predictable chunking, clear section boundaries, stable anchors, consistent naming, and useful metadata. JSON-LD and structured headings are part of this.
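To make "predictable chunking" and "stable anchors" more concrete, here is a minimal Python sketch. It is not the tooling used in the rsyslog project. It assumes reStructuredText-style underlined headings, derives an anchor from the page path plus a slug of the heading (so the anchor stays the same when the body text changes), and wraps each chunk in a small JSON-LD-like record. The function names, the sample text, and the choice of schema.org's TechArticle type are all assumptions made for illustration.

```python
import json
import re

# A heading underline: one punctuation character repeated at least three times.
UNDERLINE = re.compile(r"^([=\-~^#*+])\1{2,}\s*$")


def slugify(title):
    """Lowercase the title and collapse non-alphanumerics into single dashes."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return slug or "section"


def make_chunk(page, title, body_lines):
    """Build one JSON-LD-like record (hypothetical shape, for illustration)."""
    anchor = slugify(title) if title else "preamble"
    return {
        "@context": "https://schema.org",
        "@type": "TechArticle",
        "@id": f"{page}#{anchor}",  # depends on page and heading only
        "headline": title or "",
        "text": "\n".join(body_lines).strip(),
    }


def chunk_by_heading(text, page):
    """Split text at underlined headings; every section becomes one chunk."""
    lines = text.splitlines()
    chunks, title, body = [], None, []
    i = 0
    while i < len(lines):
        nxt = lines[i + 1] if i + 1 < len(lines) else ""
        is_heading = (
            lines[i].strip()
            and UNDERLINE.match(nxt)
            and len(nxt.rstrip()) >= len(lines[i].rstrip())
        )
        if is_heading:
            # Close the previous section before starting a new one.
            if title is not None or any(l.strip() for l in body):
                chunks.append(make_chunk(page, title, body))
            title, body = lines[i].strip(), []
            i += 2  # skip the heading and its underline
            continue
        body.append(lines[i])
        i += 1
    if title is not None or any(l.strip() for l in body):
        chunks.append(make_chunk(page, title, body))
    return chunks


# Placeholder text, not taken from the real documentation.
SAMPLE = """\
Queues
======

Sample paragraph about queues.

Queue Types
-----------

Sample paragraph about queue types.
"""

if __name__ == "__main__":
    for chunk in chunk_by_heading(SAMPLE, "concepts/queues"):
        print(json.dumps(chunk, indent=2))
```

The design point is that chunk boundaries and identifiers come from the document structure rather than from character counts, so a retrieval index can be rebuilt after an edit without invalidating links to untouched sections. A real pipeline would also need to handle duplicate headings on one page, which this sketch ignores.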

We have also begun some exploratory work around graph-based RAG. This is very early-stage internal evaluation, and nothing from it appears in the documentation. One limitation is that JSON-LD alone is not expressive enough for all internal relationships in rsyslog. Still, we are testing whether partial or imperfect mappings could help later. For now, this work has no user-facing impact and no timeline.

We also tied the docs closer to the actual code. Code agents — manually triggered at this stage — help verify key examples and parameter semantics. They serve two roles:

  • They validate the documentation by executing or simulating the examples.
  • They consume the documentation as input. When a code agent tries to generate code based on documented behavior and detects mismatches, it can reveal subtle bugs or inconsistencies in both the code and the documentation.

This dual role is new compared to traditional documentation workflows.
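The validation role can be sketched roughly as follows. This is an illustrative Python outline, not the agents described above: it assumes the documentation marks examples with reStructuredText "code-block" directives, pulls each example out together with its line number, and hands it to a pluggable checker. The checker used here only tests for balanced braces and parentheses and is a deliberately simple stand-in. A real setup might instead write each snippet to a file and run a configuration check such as rsyslogd's -N option, which is left out so the sketch runs anywhere. All function names and the sample text are hypothetical.

```python
import re
import textwrap

DIRECTIVE = re.compile(r"^(\s*)\.\. code-block::\s*(\S*)\s*$")


def extract_examples(rst_text):
    """Return (line_number, language, code) for each code-block directive."""
    lines = rst_text.splitlines()
    examples = []
    i = 0
    while i < len(lines):
        m = DIRECTIVE.match(lines[i])
        if not m:
            i += 1
            continue
        base_indent = len(m.group(1))
        block = []
        j = i + 1
        # The directive body is everything indented deeper than the directive.
        while j < len(lines):
            line = lines[j]
            indent = len(line) - len(line.lstrip())
            if line.strip() and indent <= base_indent:
                break
            block.append(line)
            j += 1
        # Drop directive options (e.g. ":linenos:") that precede the body.
        while block and block[0].strip().startswith(":"):
            block.pop(0)
        code = textwrap.dedent("\n".join(block)).strip()
        examples.append((i + 1, m.group(2) or "none", code))
        i = j
    return examples


def balanced_delimiters(code):
    """Toy checker: return an error message, or None if the snippet passes."""
    for opener, closer in (("{", "}"), ("(", ")")):
        if code.count(opener) != code.count(closer):
            return f"unbalanced {opener}{closer}"
    return None


def check_examples(examples, checker):
    """Run the checker on every example; collect (line_number, message)."""
    failures = []
    for line_no, _lang, code in examples:
        message = checker(code)
        if message is not None:
            failures.append((line_no, message))
    return failures


# Illustrative page; the second example is deliberately broken.
SAMPLE = """\
Example
=======

.. code-block:: none

   action(type="omfile" file="/var/log/demo.log")

.. code-block:: none

   if $msg contains "error" then {
       action(type="omfile" file="/var/log/errors.log"
"""

if __name__ == "__main__":
    for line_no, message in check_examples(
        extract_examples(SAMPLE), balanced_delimiters
    ):
        print(f"line {line_no}: {message}")
```

Keeping the checker as a plain callable means the same extraction step can feed a syntax check, a simulated run, or an agent that tries to reproduce the documented behavior, and each failure points back to a line in the source page.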

A Multi-Disciplinary Approach

As expected, more disciplines became involved along the way. Documentation is a human-computer interaction (HCI) topic because it is how users form their mental model of the system. Reducing cognitive load brings in user experience (UX) design and cognitive psychology. Structured metadata touches knowledge representation. Designing text and layout that work for both vector retrieval and human readers touches information retrieval and AI interaction design. None of this was planned as an academic exercise; it simply emerged because the problem demanded it.

The effort is ongoing. We have not reached a complete overhaul, and that is fine. The direction is clear, and the progress is visible. AI tools now generate better answers based on the improved structure. Even Google search snippets seem more coherent. This is anecdotal and difficult to measure automatically, but it matches what we observe day-to-day. We keep improving, and AI is part of that feedback loop.

Underneath everything sits rsyslog’s core: a deterministic, high-throughput pipeline engine with layered semantics, queueing models, a configuration DSL (RainerScript), and distributed-systems concepts embedded in its design. Documenting this accurately involves systems programming, parsing, reliability models, concurrency, and structured data processing. Adding AI creates a hybrid architecture: deterministic hot path here, asynchronous learning loops there.

This project spans HCI, UX, cognitive psychology, technical communication, knowledge representation, semantic web engineering, information retrieval, systems architecture, and AI-assisted tooling. Not because we set out to “use many disciplines”, but because this is what it takes to make a system like rsyslog understandable to both humans and machines. And realistically, only modern AI made this level of restructuring possible.

The docs are not finished and probably never will be in the traditional sense. But they are becoming clearer, more structured, more consistent — and increasingly suitable for the AI era. That has been the plan all along, and now the foundation is taking shape.


Concise CS Summary

Core goal: reshape rsyslog documentation into a structured, machine-navigable knowledge base suitable for both humans and AI.

Key technical components:

  • HCI/UX: documentation as the primary user interface; reduce cognitive load.
  • Knowledge representation: conceptual anchors, controlled vocabulary, structured linking.
  • Semantic Web: JSON-LD, entity modeling; exploratory graph-RAG evaluation (no user-facing impact).
  • Information retrieval: chunking, predictable boundaries, vector-friendly structure.
  • AI interaction: RAG alignment, LLM-ready layouts, hybrid semantic mapping.
  • Systems architecture: accurate description of pipeline design, queue semantics, delivery guarantees.
  • Tooling: manually triggered code agents verifying examples and detecting code-doc inconsistencies.

Outcome: documentation becomes a multi-purpose interface for users, developers, AI systems, and code agents.
Enabler: modern AI tools made this restructuring feasible at the required scale and quality.