What is it?

The Maven workstation and LAN cache

Mímir is a Maven 3 and Maven 4 extension that offers global caching on workstations. More precisely, Mímir is a Resolver 1.x and 2.x extension (a RepositoryConnector) that is loaded via the Maven extensions mechanism and extends Resolver.

As you may know, Maven historically uses the “local repository” as a mixed bag: it stores cached artifacts fetched from remote repositories along with locally built and installed artifacts. Hence, your local repository usually contains both kinds of artifacts.

Mímir alleviates this by introducing a new workstation-wide read-through cache (by default in ~/.mimir/local) and placing hardlinks in the Maven local repository that point to Mímir cache entries. This implies several important things:

you have a separate, pure cache (of immutable artifacts; Mímir handles release artifacts only), unlike the existing local repository, which is a mixed bag on your disk.
because of hardlinks, you ideally have only one copy of any cached artifact on your disk (as opposed to one copy per local repository you use).
it is more compatible than a “split local repository”, as it is in reality “invisible” to Maven and Maven goals.
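
The hardlink idea can be illustrated with plain `java.nio.file` calls. This is only a sketch under stated assumptions: the directory and file names are invented for the demo and do not reflect Mímir's real cache layout or code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustration only: "cache" and "local-repo" are made-up directories, not Mímir's layout.
class HardlinkDemo {
    /**
     * Writes one file into a cache directory, hardlinks it into a "local repository",
     * then deletes the link. Returns true if both names referred to the same file
     * and the cache entry survived the deletion of the link.
     */
    static boolean cacheSurvivesLocalRepoCleanup() {
        try {
            Path base = Files.createTempDirectory("hardlink-demo");
            Path cache = Files.createDirectories(base.resolve("cache"));
            Path localRepo = Files.createDirectories(base.resolve("local-repo"));

            // The single real copy of the content lives in the cache.
            Path cached = cache.resolve("example-1.0.jar");
            Files.write(cached, "artifact bytes".getBytes(StandardCharsets.UTF_8));

            // The local repository only gets a second directory entry for the same file.
            Path linked = localRepo.resolve("example-1.0.jar");
            Files.createLink(linked, cached);
            boolean sameFile = Files.isSameFile(cached, linked);

            // Removing the link (for example, wiping the local repository) keeps the cache entry.
            Files.delete(linked);
            return sameFile && Files.exists(cached);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("cache intact after cleanup: " + cacheSurvivesLocalRepoCleanup());
    }
}
```

Hardlinks require both paths to be on the same filesystem, which is why the demo keeps both directories under one temporary directory.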

Some further consequences:

you can easily adhere to “best practices” and delete your local repository often, as you still have everything locally (in Mímir caches). You will not lose your precious time waiting for the local repository to be populated.
backup or caching (as in the CI case) is also simple: instead of tip-toeing and doing trickery with your local repository, just store and restore the Mímir caches, and forget about the local repository.

An advanced feature of Mímir is LAN-wide cache sharing: if you hop from workstation to workstation on the same LAN, you don’t need to pull everything again, as your build will get it from the workstation that already has it. Nowadays, with Gigabit LANs combined with modern WiFi networks, this is usually faster than going to Maven Central.

Architecture

Mimir is composed of multiple components, and works in the following way:

classDiagram
    Node <|-- LocalNode
    Node <|-- RemoteNode
    LocalNode <|-- SystemNode

    class Node{
      locate(URI)
    }

    class LocalNode{
      store(Path)
    }

    class SystemNode{
      store(Entry)
    }

The node name represents the content “locality”: a local node is “local” to the user workstation and hence has access to the OS filesystem. A remote node, on the other hand, needs to retrieve the content from somewhere else (remote).

In other words: every node is able to “locate”, but a local node is also able to “store” content backed by the local filesystem, and a system node is able to “store” Entry instances, usually originating from other Node instances.
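
Expressed as Java types, the hierarchy might look roughly like the sketch below. Only the type names and the `locate`/`store` method names come from the diagram; the parameter lists, return types, the `Entry` fields, and the `InMemorySystemNode` class are assumptions made for illustration and are not Mímir's actual API.

```java
import java.net.URI;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical value type: a located piece of content and where it lives.
final class Entry {
    final URI key;
    final Path content;

    Entry(URI key, Path content) {
        this.key = key;
        this.content = content;
    }
}

// Every node can locate content.
interface Node {
    Optional<Entry> locate(URI key);
}

// A local node can additionally store content backed by a local file.
interface LocalNode extends Node {
    Entry store(URI key, Path file);
}

// A system node can additionally store entries obtained from other nodes.
interface SystemNode extends LocalNode {
    Entry store(Entry entry);
}

// A remote node only locates; it has no store operation.
interface RemoteNode extends Node {}

// Toy implementation (assumption): keeps entries in a map instead of on disk.
class InMemorySystemNode implements SystemNode {
    private final Map<URI, Entry> entries = new HashMap<>();

    @Override
    public Optional<Entry> locate(URI key) {
        return Optional.ofNullable(entries.get(key));
    }

    @Override
    public Entry store(URI key, Path file) {
        return store(new Entry(key, file));
    }

    @Override
    public Entry store(Entry entry) {
        entries.put(entry.key, entry);
        return entry;
    }
}
```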

One peculiarity of system nodes is that they can be “published” as well. Mimir supports two LAN publishers out of the box: http, which uses Java’s built-in HTTP server, and socket, which uses a plain ServerSocket.

The most important Node implementations are (note: there are more, but they are usually aggregating nodes):

| Node name | LocalNode | SystemNode | RemoteNode | Description |
|-----------|-----------|------------|------------|-------------|
| `file`    | yes       | yes        | no         | Uses OS filesystem |
| `minio`   | yes       | yes        | no         | Uses S3 storage |
| `bundle`  | yes       | no         | no         | Read-only local node, backed by “bundle” ZIP files |
| `daemon`  | yes       | no         | no         | Communicates via UDS to Mimir Daemon process |
| `jgroups` | no        | no         | yes        | Uses JGroups messaging w/ publishers to LAN share cache content |
| `ipfs`    | no        | no         | yes        | Uses IPFS to WAN share cache content (not implemented yet) |

Mimir Session

MimirSession (which shares its lifecycle with MavenSession and lives within the Maven process) requires one LocalNode and exposes two methods: locate(RemoteRepository, Artifact) and store(RemoteRepository, Artifact) (where the Artifact must be resolved, that is, it must have its backing file set).

Using MimirSession, the MimirConnector, which wraps the original RepositoryConnector that Resolver would otherwise use, implements caching: it asks MimirSession to locate the required artifact. If the local node has the artifact, the request is “short-circuited” and the artifact with its content is returned to Resolver. If not, the MimirConnector falls back to the original connector, and if the artifact is successfully retrieved, it is stored (cached) for future use and then returned to Resolver.

Among the local nodes, the interesting one is DaemonNode: this node starts a Mímir Daemon process (unless one is already running) and uses Unix Domain Sockets to communicate with it. On a workstation there is only one Mímir Daemon running, and it is shared by all Maven processes (the DaemonNodes in each separate Maven session).
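
The locate, fall back, then store flow is a classic read-through cache. The sketch below shows that pattern in isolation; it does not use the Resolver or Mímir APIs. The `ReadThroughCache` class, the string keys and contents, and the `delegateCalls` counter are all hypothetical stand-ins.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical stand-in for a caching connector: artifacts and content are plain strings.
class ReadThroughCache {
    private final Map<String, String> localNode = new HashMap<>();
    // Plays the role of the wrapped "original connector".
    private final Function<String, Optional<String>> delegate;
    // Counts how often the delegate was consulted (for demonstration only).
    int delegateCalls = 0;

    ReadThroughCache(Function<String, Optional<String>> delegate) {
        this.delegate = delegate;
    }

    Optional<String> get(String artifact) {
        String cached = localNode.get(artifact);
        if (cached != null) {
            // Cache hit: short-circuit, the delegate is never asked.
            return Optional.of(cached);
        }
        // Cache miss: fall back to the delegate.
        delegateCalls++;
        Optional<String> fetched = delegate.apply(artifact);
        // Only successful retrievals are stored for future use.
        fetched.ifPresent(content -> localNode.put(artifact, content));
        return fetched;
    }
}
```

Asking twice for the same artifact should reach the delegate only once; a failed retrieval is not cached, so a later request tries the delegate again.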

Mimir Daemon process

Mímir Daemon requires one SystemNode and zero or more RemoteNodes. It implements a “round-robin” strategy to locate the requested artifact: it first tries to locate it in the SystemNode (which is local to the workstation), and if it is not found there, it tries to locate it in the RemoteNodes (which are remote, hence on the LAN or WAN). If the artifact is found in a RemoteNode, it is retrieved and cached in the SystemNode for future use.
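
A minimal sketch of that lookup order follows, assuming the remote nodes are simply tried one after another in list order (the text does not specify the order). `DaemonLookup`, `remoteWith`, and the string-based keys are hypothetical and not taken from the Mímir code base.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical model: the system node is a map, each remote node is a lookup function.
class DaemonLookup {
    private final Map<String, String> systemNode = new HashMap<>();
    private final List<Function<String, Optional<String>>> remoteNodes;

    DaemonLookup(List<Function<String, Optional<String>>> remoteNodes) {
        this.remoteNodes = remoteNodes;
    }

    // Helper for demos: a remote node that knows exactly one key.
    static Function<String, Optional<String>> remoteWith(String key, String content) {
        return k -> k.equals(key) ? Optional.of(content) : Optional.<String>empty();
    }

    Optional<String> locate(String key) {
        // 1. Try the system node, which is local to the workstation.
        String local = systemNode.get(key);
        if (local != null) {
            return Optional.of(local);
        }
        // 2. Try the remote nodes in order (ordering is an assumption of this sketch).
        for (Function<String, Optional<String>> remote : remoteNodes) {
            Optional<String> found = remote.apply(key);
            if (found.isPresent()) {
                // 3. Cache the remote hit in the system node for future use.
                systemNode.put(key, found.get());
                return found;
            }
        }
        return Optional.empty();
    }

    boolean cachedInSystemNode(String key) {
        return systemNode.containsKey(key);
    }
}
```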

Mimir without daemon process

In CI usage, the daemon is usually unwanted overhead, as there is typically only one Maven process, hence only one “client” of the Mimir cache. In such cases, the Mimir session can be configured to use a local node other than DaemonNode: create a ~/.mimir/session.properties file containing mimir.session.localNode=<node name>, for example file (remember, SystemNode extends LocalNode).
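
For illustration, the snippet below parses such a properties file with `java.util.Properties`. The key `mimir.session.localNode` and the value `file` come from the text above; the `SessionConfigDemo` class and the fallback to `daemon` when the key is absent are assumptions of this sketch, not a description of how Mímir itself reads its configuration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Illustration only: shows the shape of the file, not Mímir's own configuration loading.
class SessionConfigDemo {
    /** Returns the configured local node name, or "daemon" (assumed default) if unset. */
    static String localNodeName(String propertiesContent) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesContent));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("mimir.session.localNode", "daemon");
    }

    public static void main(String[] args) {
        // What ~/.mimir/session.properties could contain on a CI machine:
        String ciConfig = "mimir.session.localNode=file\n";
        System.out.println(localNodeName(ciConfig));
    }
}
```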

Mimir LAN cache sharing

LAN cache sharing is currently implemented using JGroups cluster messaging. In short, when a JGroups-backed RemoteNode is asked for a URI, a message is broadcast in the cluster. The receiving cluster nodes respond whether or not they have the wanted content. Each node that has it creates a single-use “token” and an access URI and sends them back as its response. Finally, the originator picks one of the positively responding nodes at random and fetches the wanted content, identified by the “token”, from the URI it received.
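
The single-use token part of this exchange can be sketched without any JGroups code. In the hypothetical `TokenRegistry` below, a publishing node issues a token for a piece of content and hands the content out at most once per token; the class name, the use of random UUIDs as tokens, and the URI-valued content are assumptions for illustration.

```java
import java.net.URI;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of single-use tokens on the publishing side.
class TokenRegistry {
    private final ConcurrentHashMap<String, URI> pending = new ConcurrentHashMap<>();

    /** Issues a fresh token that grants one fetch of the given content. */
    String issue(URI content) {
        String token = UUID.randomUUID().toString();
        pending.put(token, content);
        return token;
    }

    /** Redeems a token; the atomic remove guarantees it works at most once. */
    Optional<URI> redeem(String token) {
        return Optional.ofNullable(pending.remove(token));
    }
}
```

Using `remove` as the redeem step means that two concurrent requests with the same token cannot both succeed.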

Mimir WAN cache sharing

Not implemented yet. Most probably it will use IPFS.
