Living documentation is one of the most important things to set up in a project. There are a couple of areas where it is very efficient, like ensuring the documentation of your application configuration is accurate, that the artifacts (jar, war, docker image, ...) are referenced with their latest version, etc.

However, the first step is to be able to update the documentation during the build to ensure it is always up to date, and potentially to push documentation changes with your commit if you don't fully generate it at documentation build time - more on that later.

Living Documentation and Maven

A living documentation task is generally a very custom task. For some particular tools you can find an existing plugin you can smoothly integrate into your build, but such a task generally has two dependencies:

  • How you document your project (it can be a simple README.adoc, a JBake, Docusaurus or Antora site, or any other technology),
  • How you extract from your code/project the information you want to inject into the documentation (MicroProfile Config annotations, custom logic, etc.).

It is rare that a plugin handles both dimensions properly, so you often end up writing a script.

This is where this post gets interesting. Put simply, the script generally has these requirements:

  • It can reuse the application code - but we don't want to play with classloaders to rebuild the project context,
  • It can be configured with Maven variables (project.version for example),
  • It can access the documentation to update,
  • It can be integrated into the Maven build as a plugin.

Write your first living documentation task

If you forget Maven for one second and assume you can do whatever you want in the project, what would you do? Probably write a plain Java main(String[]) which would enable you to run the application code and generate a String or write a file.

The question is then: how to make the link with Maven? The answer is quite simple: with exec-maven-plugin. This plugin enables you to configure any main(String[]) and pass it arguments which can be Maven variables.

Let's put it in practice to update a README.adoc. The goal will be to instantiate an application registry to list all the available entries in the readme - but you can also scan the application code to extract all @ConfigProperty usages with the same pattern.

The plugin configuration will be as simple as that:

<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>exec-maven-plugin</artifactId>
  <version>1.6.0</version>
  <executions>
    <execution>
      <id>update-README.adoc</id>
      <phase>prepare-package</phase>
      <goals>
        <goal>java</goal>
      </goals>
      <configuration>
        <mainClass>com.github.rmannibucau.sample.build.ReadmeUpdater</mainClass>
        <includeProjectDependencies>true</includeProjectDependencies>
        <classpathScope>compile</classpathScope>
        <arguments>
          <argument>${project.basedir}/README.adoc</argument>
        </arguments>
      </configuration>
    </execution>
  </executions>
</plugin>

Here the important parts - besides the binding of the goal to a phase executed during the local build - are the following entries:

  • mainClass: the fully qualified name of the living documentation main(String[]),
  • includeProjectDependencies: we want to be able to use application classes,
  • classpathScope: the default is runtime but it skips provided dependencies, which can prevent your application code from running, so we set it to compile to include them,
  • arguments: here we pass the path to the README.adoc file, but we can pass any Maven variable, which makes our main completely configurable.
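
With this configuration, the generator runs on any build reaching the prepare-package phase (mvn package or mvn install for instance). Note that since Maven 3.3.1 you can also trigger this single execution directly with mvn exec:java@update-README.adoc if you want to refresh the documentation without a full build.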

Then our main will face another challenge: how to update the README.adoc? There are multiple options. The most obvious is to replace it completely, but this has the drawback of preventing manual edits. Personally I prefer to inject into the README.adoc some markers identifying blocks which will be fully managed by the build. In the context of this post it can be as simple as a start and an end comment:

= My Project

Some nice description

== Some subpart

Some nice text.

== List of available tasks

//begin:list_of_available_tasks

=== Task 1

The task 1 is cool

=== Task 2

This is another cool task

//end:list_of_available_tasks

== Some other subpart

We can see that //begin:list_of_available_tasks and //end:list_of_available_tasks will not be rendered but will be usable in our generation logic to replace what is in between.

At that stage we just need to write our generator to replace the block delimited by the two comments with the new content:

import static java.util.stream.Collectors.joining;
import static lombok.AccessLevel.PRIVATE;

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

import lombok.NoArgsConstructor;
import lombok.extern.slf4j.Slf4j;

@Slf4j // Lombok generates the log field used below
@NoArgsConstructor(access = PRIVATE)
public final class ReadmeUpdater {
    public static void main(final String[] args) throws IOException {
        final Path path = Paths.get(args[0]); // 1
        final String content = Files.lines(path).collect(joining("\n")); // 2
        final String output = replace(content, "list_of_available_tasks", findTasks()); // 3
        if (!output.equals(content)) { // 4
            Files.write(path, output.getBytes(StandardCharsets.UTF_8), StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING);
            log.info("Updated '{}'", path);
        } else {
            log.info("'{}' didn't change", path);
        }
    }

    // 5
    private static String replace(final String document, final String marker, final String newContent) {
        final String startText = "//begin:" + marker + '\n';
        final int start = document.indexOf(startText);
        final int end = document.indexOf("//end:" + marker + '\n');
        if (start < 0 || end <= start) {
            throw new IllegalStateException("No block found for '" + marker + "'");
        }
        return document.substring(0, start) + startText + '\n' + newContent + '\n' + document.substring(end);
    }

    // 6
    private static String findTasks() {
        return new TaskRegistry()
                .findAll()
                .map(m -> "=== " + m.getIdentifier() + "\n\n" + m.getDescription() + '\n')
                .sorted()
                .collect(joining("\n"));
    }
}
  1. We simply read the program arguments; thanks to the plugin setup we did, Maven variables are already resolved,
  2. We load in memory the file we want to update,
  3. We replace the block we want in the file (this is why we used the //[begin|end]:<name> syntax: it makes it possible to identify multiple blocks and reuse the same replace() method),
  4. We update the original file only - and only if - the content changed,
  5. The replace() method is as simple as extracting the block, copying the outside parts and replacing the inside part with the new value,
  6. The findTasks() method is the one making the link with the application code (here, TaskRegistry would be an application class).
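
To make the snippet above self-contained, here is a minimal sketch of what such a TaskRegistry could look like - its name and shape are purely hypothetical, in a real project it would be one of your existing application classes:

import java.util.stream.Stream;

// Hypothetical application class, only here to make the example compile;
// the generator just needs a stream of objects exposing an identifier and a description
public class TaskRegistry {
    public Stream<Task> findAll() {
        return Stream.of(
                new Task("Task 1", "The task 1 is cool"),
                new Task("Task 2", "This is another cool task"));
    }

    public static class Task {
        private final String identifier;
        private final String description;

        public Task(final String identifier, final String description) {
            this.identifier = identifier;
            this.description = description;
        }

        public String getIdentifier() {
            return identifier;
        }

        public String getDescription() {
            return description;
        }
    }
}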

TIP: if you don't know how to handle the blocks, or don't want to recode that part, you can rely on the swizzle library.

Now when you build your project, the documentation is auto-updated :).

There are a few last tips which can be important to note:

  • If you need a library only for documentation purposes, you can add it to the project in provided scope. Note however that the more libraries you add, the more conflicts you can get with the application code. It will still be covered by your tests, but make sure not to impact your application too much with huge conflicting dependency stacks.
  • You can add as many tasks of that kind as you want, you don't need to use a single main(String[]).
  • Living documentation works better with a declarative code style (annotations rather than a programmatic style, to caricature it). For instance, for MicroProfile Config, prefer @ConfigProperty over config.getValue() - see the sketch after this list.
  • You can ensure the documentation classes are not delivered in the final binary by putting them all under the same pattern (personally I use a dedicated package called build, but nested classes ending with Documentation would work too) and adding an exclusion in maven-jar-plugin and/or maven-war-plugin:
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-jar-plugin</artifactId>
  <version>3.1.2</version>
  <configuration>
    <excludes>
      <exclude>**/build/*</exclude>
    </excludes>
  </configuration>
</plugin>
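
To illustrate the @ConfigProperty tip above, here is a sketch of what the extraction side could look like, assuming the configuration is centralized in a single class - AppConfiguration and findConfigEntries() are hypothetical names, adapt them to your project:

import static java.util.stream.Collectors.joining;

import java.util.stream.Stream;

import javax.inject.Inject;

import org.eclipse.microprofile.config.inject.ConfigProperty;

// Hypothetical configuration holder, as you would find in the application code
class AppConfiguration {
    @Inject
    @ConfigProperty(name = "app.timeout", defaultValue = "30")
    private Integer timeout;
}

// Sketch: list the @ConfigProperty fields through reflection,
// following exactly the same pattern as findTasks()
public final class ConfigurationDoc {
    static String findConfigEntries() {
        return Stream.of(AppConfiguration.class.getDeclaredFields())
                .filter(field -> field.isAnnotationPresent(ConfigProperty.class))
                .map(field -> field.getAnnotation(ConfigProperty.class))
                .map(config -> "=== " + config.name() + "\n\nDefault value: " + config.defaultValue() + '\n')
                .sorted()
                .collect(joining("\n"));
    }
}

The resulting string can then be passed to the replace() method seen before, with its own marker (//begin:configuration for instance).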

Commit generated files or not?

This is a good question and it mainly depends on the requirements of your website and how it is generated.

Typically, if you use a static generator which handles a single source and version, then you can generate into a temporary folder and skip committing the files.

However, if you handle multiple versions, you want to update your "current" version and only that version, so it is saner to commit the file in the right version folder. The same applies if you use a multi-source generator like Antora, which clones multiple git repositories (or branches/tags) to generate a single website.

Finally, the last case is when you want to expose your documentation on GitHub - like our README.adoc. You will also need to commit the file, otherwise people will have to go directly to the deployed version (generated with volatile/temporary files), which is not user friendly.

Conclusion

This post showed that with very few tips you can reuse your application code to quickly generate up-to-date documentation for anything which can be derived from your code.

The next steps are generally to launch the generation of the website on a CI server and push it to GitHub Pages or any other hosting (including the Jenkins publishHTML feature ;)) to expose it, letting you and your users get the information immediately, without any risk of it being out of date.

 
