Mímir

The Maven workstation and LAN cache

Maveniverse Mímir started as a solution to my own problem: workstation hopping, and a LOT of downloading when picking up where I left off.

It is a Maven 3/4 extension that offers a “global cache” on the workstation and shares that cache over the LAN with other nodes running Mímir.

Mímir is on GitHub: https://github.com/maveniverse/mimir

1 - What is it?

Mímir is a Maven 3 and Maven 4 extension that offers global caching on workstations. More precisely, Mímir is a Resolver 1.x and 2.x extension (a RepositoryConnector) that is loaded via the Maven extensions mechanism and extends Resolver.

As you may know, Maven historically uses the “local repository” as a mixed bag: it stores cached artifacts fetched from remote repositories along with locally built and installed artifacts. Hence, your local repository usually contains both kinds of artifacts.

Mímir alleviates this by introducing a new workstation-wide read-through cache (by default in ~/.mimir/local) and placing hardlinks in the Maven local repository that point to Mímir cache entries. This implies several important things:

  • you have a separate, pure cache (of immutable artifacts; Mímir handles release artifacts only), unlike the existing local repository, which is a mixed bag on your disk.
  • because of hardlinks, ideally you have only one copy of any cached artifact on your disk (as opposed to one copy per local repository you use).
  • it is more compatible than the “split local repository”, as it is in reality “invisible” to Maven and Maven goals.
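
To illustrate the hardlink idea, here is a minimal Java sketch. It is not Mímir's actual code: the class name HardlinkSketch, the method linkIntoRepository and the directory names are hypothetical. Files.createLink is the standard JDK call; it needs a filesystem that supports hard links, and both paths must be on the same volume.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative only: names and layout are assumptions, not Mímir's implementation.
final class HardlinkSketch {
    // Links a cached file into a (stand-in) local repository instead of copying it.
    static Path linkIntoRepository(Path cacheEntry, Path repositoryDir) throws IOException {
        Files.createDirectories(repositoryDir);
        Path link = repositoryDir.resolve(cacheEntry.getFileName());
        Files.deleteIfExists(link);
        // Both names now refer to the same bytes on disk; no second copy is made.
        return Files.createLink(link, cacheEntry);
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("mimir-sketch");
        Path cache = Files.createDirectories(base.resolve("cache"));
        Path entry = Files.write(
                cache.resolve("lib-1.0.jar"), "artifact bytes".getBytes(StandardCharsets.UTF_8));

        // Two "local repositories" share the single cached copy.
        Path linkA = linkIntoRepository(entry, base.resolve("repoA"));
        Path linkB = linkIntoRepository(entry, base.resolve("repoB"));

        System.out.println("repoA shares cache entry: " + Files.isSameFile(entry, linkA));
        System.out.println("repoB shares cache entry: " + Files.isSameFile(entry, linkB));
    }
}
```

Deleting one of the stand-in repositories in this sketch only removes a name for the file; the content stays reachable through the cache entry, which is why wiping a local repository becomes cheap.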

Some further consequences:

  • you can easily adhere to “best practices” and delete your local repository often, as you still have everything locally (in Mímir caches). You will not lose your precious time waiting for the local repository to be populated.
  • backup or caching (as in the CI case) is also simple: instead of tiptoeing around and doing trickery with your local repository, just store and restore the Mímir caches, and you may forget about the local repository.

An advanced feature of Mímir is LAN-wide cache sharing: if you hop from workstation to workstation on the same LAN, you don’t need to pull everything again, as your build will get it from the workstation that already has it. Nowadays, with Gigabit LANs combined with modern WiFi networks, doing this is usually faster than going to Maven Central.

Architecture

Mimir is composed of multiple components, and works in the following way:

classDiagram
    Node <|-- LocalNode
    Node <|-- RemoteNode
    LocalNode <|-- SystemNode

    class Node{
      locate(URI)
    }

    class LocalNode{
      store(Path)
    }

    class SystemNode{
      store(Entry)
    }

Here the node name represents the content “locality”: a local node is “local” to the user workstation and hence has access to the OS filesystem. A remote node, on the other hand, needs to retrieve the content from somewhere else (remote).

In other words: every node is able to “locate”, but only a local node is able to “store” content backed by the local filesystem, and a system node is additionally able to “store” Entry instances, usually originating from other Node instances.
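
The hierarchy from the diagram can be sketched in Java roughly as follows. The interface names come from the diagram, but everything else is an assumption made for illustration: the return types, the extra URI key passed to store, the shape of Entry and the toy InMemorySystemNode are hypothetical and are not Mímir's real API.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical shapes derived from the class diagram; not Mímir's real API.
interface Entry {
    byte[] content() throws IOException;
}

// Every node can locate content.
interface Node {
    Optional<Entry> locate(URI key) throws IOException;
}

// A local node can also store content that lives on the local filesystem.
interface LocalNode extends Node {
    void store(URI key, Path file) throws IOException;
}

// A system node can additionally store entries handed over by other nodes.
interface SystemNode extends LocalNode {
    void store(URI key, Entry entry) throws IOException;
}

// Toy in-memory system node, only to show how the three levels fit together.
final class InMemorySystemNode implements SystemNode {
    private final Map<URI, byte[]> contents = new ConcurrentHashMap<>();

    @Override
    public Optional<Entry> locate(URI key) {
        byte[] bytes = contents.get(key);
        if (bytes == null) {
            return Optional.empty();
        }
        Entry entry = () -> bytes.clone();
        return Optional.of(entry);
    }

    @Override
    public void store(URI key, Path file) throws IOException {
        contents.put(key, Files.readAllBytes(file));
    }

    @Override
    public void store(URI key, Entry entry) throws IOException {
        contents.put(key, entry.content());
    }
}
```

Because SystemNode extends LocalNode, any code that only needs a LocalNode can be handed a system node, which is what makes the daemon-less setup described later possible.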

One peculiarity of system nodes is that they can be “published” as well. Mímir supports two LAN publishers out of the box: http, which uses Java’s built-in HTTP server, and socket, which uses a plain ServerSocket.

The most important Node implementations are (note: there are more, but they are usually aggregating nodes):

Node name | LocalNode | SystemNode | RemoteNode | Description
file      | yes       | yes        | no         | Uses OS filesystem
minio     | yes       | yes        | no         | Uses S3 storage
bundle    | yes       | no         | no         | Read-only local node, backed by “bundle” ZIP files
daemon    | yes       | no         | no         | Communicates via UDS to Mimir Daemon process
jgroups   | no        | no         | yes        | Uses JGroups messaging w/ publishers to LAN share cache content
ipfs      | no        | no         | yes        | Uses IPFS to WAN share cache content (not implemented yet)

Mimir Session

MimirSession (which shares its lifecycle with MavenSession and lives within the Maven process) requires one LocalNode and exposes two methods: locate(RemoteRepository, Artifact) and store(RemoteRepository, Artifact), where the Artifact must be resolved, that is, have its backing file set.

Using MimirSession, the MimirConnector, which wraps the original RepositoryConnector that Resolver would otherwise use, implements caching: it asks MimirSession to locate the required artifact. If the local node has the artifact, the request is “short-circuited” and the artifact with its content is returned to Resolver. If not, the MimirConnector falls back to the original connector, and if the artifact is successfully retrieved, it is stored (cached) for future use and then returned to Resolver.
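
The read-through behaviour described above can be sketched as follows. This is a simplified model, not the real MimirConnector: the class ReadThroughSketch, the string artifact keys and the Function that stands in for the original connector are all hypothetical, and the real code works with Resolver's RepositoryConnector and Artifact types.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical stand-in for the caching connector; not Mímir's implementation.
final class ReadThroughSketch {
    private final Map<String, byte[]> cache = new HashMap<>();
    // Plays the role of the original connector that talks to the remote repository.
    private final Function<String, Optional<byte[]>> delegate;
    // Counts fallbacks, only so the behaviour can be observed.
    int delegateCalls = 0;

    ReadThroughSketch(Function<String, Optional<byte[]>> delegate) {
        this.delegate = delegate;
    }

    Optional<byte[]> get(String artifactKey) {
        byte[] cached = cache.get(artifactKey);
        if (cached != null) {
            // Short-circuit: the delegate is not consulted at all.
            return Optional.of(cached);
        }
        // Cache miss: fall back to the original connector.
        delegateCalls++;
        Optional<byte[]> fetched = delegate.apply(artifactKey);
        // Only successful retrievals are stored for future use.
        fetched.ifPresent(bytes -> cache.put(artifactKey, bytes));
        return fetched;
    }
}
```

In this sketch a second request for the same artifact never reaches the delegate, while a failed retrieval is not remembered, so it is retried on the next request.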

Of the local nodes, the interesting one is DaemonNode: this node in reality starts a Mímir Daemon process (unless one is already running) and uses Unix Domain Sockets to communicate with it. On a workstation there is only one Mímir Daemon running, and it is shared by all Maven processes (by the DaemonNode in each separate Maven session).

Mimir Daemon process

Mímir Daemon requires one SystemNode and zero or more RemoteNodes. It implements a “round-robin” strategy to locate the requested artifact: it first tries to locate it in the SystemNode (which is local to the workstation), and if it is not found there, it tries the RemoteNodes (which are remote, hence on the LAN or WAN). If the artifact is found in a RemoteNode, it is retrieved and cached in the SystemNode for future use.
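
As a rough model of that lookup order, the sketch below checks a system node first and then walks a list of remote nodes, caching the first hit locally. All names here (DaemonLookupSketch, RemoteLookup, isCachedLocally) are hypothetical, and the order in which remote nodes are consulted is an assumption of this sketch.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical model of the daemon's lookup; not Mímir's implementation.
final class DaemonLookupSketch {
    // Stand-in for a RemoteNode.
    interface RemoteLookup {
        Optional<byte[]> locate(String key);
    }

    // Stand-in for the single SystemNode.
    private final Map<String, byte[]> systemNode = new HashMap<>();
    private final List<RemoteLookup> remoteNodes;

    DaemonLookupSketch(List<RemoteLookup> remoteNodes) {
        this.remoteNodes = remoteNodes;
    }

    Optional<byte[]> locate(String key) {
        // 1. The workstation-local system node is asked first.
        byte[] local = systemNode.get(key);
        if (local != null) {
            return Optional.of(local);
        }
        // 2. Otherwise the remote nodes are tried one after another.
        for (RemoteLookup remote : remoteNodes) {
            Optional<byte[]> found = remote.locate(key);
            if (found.isPresent()) {
                // 3. A remote hit is cached in the system node for future use.
                systemNode.put(key, found.get());
                return found;
            }
        }
        return Optional.empty();
    }

    boolean isCachedLocally(String key) {
        return systemNode.containsKey(key);
    }
}
```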

Mimir without daemon process

In CI usage, the daemon is usually unwanted overhead, as there is typically only one Maven process, hence only one “client” of the Mímir cache. In such cases, the Mímir session can be configured to use a local node other than DaemonNode. To do so, create ~/.mimir/session.properties with the content mimir.session.localNode=<node name>, for example file (remember, SystemNode extends LocalNode).

Mimir LAN cache sharing

LAN cache sharing is currently implemented using JGroups cluster messaging. In short, when the JGroups-backed RemoteNode is asked for a URI, a message is broadcast in the cluster. Receiving cluster nodes respond whether they have the wanted content or not. Each node that has it creates a once-usable “token” and sends it back in the response, together with an access URI. Finally, the originator picks one of the positively responding nodes at random and fetches the wanted content, identified by the “token”, from the URI received from that node.
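
The once-usable token is the part of this exchange that is easiest to show in isolation. The sketch below models only the responding node and uses no JGroups API at all: TokenPublisherSketch, Offer, offer and redeem are hypothetical names, and using a random UUID as the token is an assumption of this example.

```java
import java.net.URI;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of the responder side; not Mímir's implementation.
final class TokenPublisherSketch {
    // The response sent back to the originator: a token plus where to fetch from.
    static final class Offer {
        final String token;
        final URI accessUri;

        Offer(String token, URI accessUri) {
            this.token = token;
            this.accessUri = accessUri;
        }
    }

    private final Map<String, byte[]> content = new ConcurrentHashMap<>();
    private final Map<String, String> pendingTokens = new ConcurrentHashMap<>(); // token -> content key
    private final URI accessUri;

    TokenPublisherSketch(URI accessUri) {
        this.accessUri = accessUri;
    }

    void put(String key, byte[] bytes) {
        content.put(key, bytes);
    }

    // Answers a cluster query: empty if the content is not here, otherwise a fresh token.
    Optional<Offer> offer(String key) {
        if (!content.containsKey(key)) {
            return Optional.empty();
        }
        String token = UUID.randomUUID().toString();
        pendingTokens.put(token, key);
        return Optional.of(new Offer(token, accessUri));
    }

    // Called when the originator fetches; remove() makes the token usable only once.
    Optional<byte[]> redeem(String token) {
        String key = pendingTokens.remove(token);
        if (key == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(content.get(key));
    }
}
```

In this model a token that was never redeemed simply stays in the map; a real publisher would presumably also expire unused tokens, which the sketch leaves out.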

Mimir WAN cache sharing

Not implemented yet

Most probably will use IPFS.

2 - Why would I use it?

In short: you want to use it for proper local repository maintenance, but it also helps with disk space usage and gives you real workstation-wide caching, regardless of how many local repositories you use.

Finally, if you workstation-hop a lot (like I do) on the same LAN, it makes a lot of sense to pick up on the new workstation where you left off on the old one.

On CI-like setups it also simplifies caching between jobs: all you need is to store the Mímir cache after a job finishes and restore it on subsequent job runs.

3 - How to use it?

The simplest way to use Mímir is with Maven 4, as it supports user-wide extensions. Just create ~/.m2/extensions.xml with the following content (adjust the Mímir version as needed):

<?xml version="1.0" encoding="UTF-8"?>
<extensions>
    <extension>
        <groupId>eu.maveniverse.maven.mimir</groupId>
        <artifactId>extension3</artifactId>
        <version>${mimirVersion}</version>
    </extension>
</extensions>

You can also add it to your (parent) POM as a build extension:

    <extensions>
      <extension>
        <groupId>eu.maveniverse.maven.mimir</groupId>
        <artifactId>extension3</artifactId>
        <version>${mimirVersion}</version>
      </extension>
    </extensions>

Using it with Maven 3 is also possible and completely fine and compatible, but there you will need to set up per-project extensions in the .mvn/extensions.xml file instead of a single user-wide one.

One extra step is needed if you have non-trivial networking (Docker, Tailscale or the like): you need to “help” Mímir a bit to figure out which network interface belongs to your LAN. To do that, create a ~/.mimir/daemon.properties file with the following content (use your LAN IP address):

mimir.localHostHint=match-address\:192.168.1.*

This helps JGroups and the Mímir system node publishers to bind properly to the interface that is used on your LAN.
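
A note on the backslash in that line: assuming the file is read with the standard java.util.Properties loader (an assumption, as the loading code is not shown here), \: is just an escaped colon, so the value Mímir sees is match-address:192.168.1.*. The small sketch below demonstrates this with plain JDK calls; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Illustrative only: shows how java.util.Properties decodes the escaped colon.
final class DaemonPropertiesSketch {
    // Parses properties text and returns the value of mimir.localHostHint.
    static String localHostHint(String propertiesText) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(propertiesText));
        return props.getProperty("mimir.localHostHint");
    }

    public static void main(String[] args) throws IOException {
        // In Java source the backslash itself must be doubled.
        String escaped = "mimir.localHostHint=match-address\\:192.168.1.*\n";
        String plain = "mimir.localHostHint=match-address:192.168.1.*\n";

        System.out.println(localHostHint(escaped));
        System.out.println(localHostHint(plain));
    }
}
```

Both lines load to the same value, because the key already ends at the first = sign and a colon inside the value needs no escaping for this loader.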

With these, you are fully set up. Now just go and fire up a Maven or Maven Daemon build.

4 - Use case: Maven CI

TBD: explain how Mímir is used in Maven GH CI