ratarmount Archive Mount Tool

Faster goes the archive mount.

I previously wrote about the use of archivemount, which mounts compressed archives created with common tools such as tar and gzip, so everything for a user project is contained in a single file. I’ve found these archives to be very handy in keeping projects together and saving storage space, but in my research for that article, I ran across another tool for mounting archives named ratarmount, which I like to call Reptar.

The author of ratarmount found that archivemount was too slow for large archives that contained lots of files – especially when various files are read in a way that looks like random access to the mounted archive. However, ratarmount goes well beyond just faster file access, with the inclusion of other really useful features, such as:

  • multiple cores for parallel compression tools (you can do this with archivemount, but you have to use an alias);
  • recursive mounting (“it's full of TARs”);
  • read-only mounts;
  • union mounts, which allow you to combine multiple archives and bind-mounted folders or directories with the same mountpoint (very cool capability if you need it); and
  • accommodating remote files and directories by FTP, HTTP, HTTPS, SFTP, SSH, Git, GitHub, S3, Samba v2 and v3, Dropbox, and possibly others.

One aspect of ratarmount that I don’t like is that it mounts the archive as read-only and then keeps any changes or new files in a separate file or directory. You have to keep both to have a complete view of the archive as it should be. If you lose or corrupt either one, you no longer have the complete data set. Alternatively, you have to uncompress the archive, run through several ratarmount commands to commit the changes and additions to the original archive, and then compress the archive again (a long and tedious task).

I prefer the archivemount approach that just rebuilds the complete archive for you when you unmount it. It might be slower than ratarmount, but I find it an easier solution. Nonetheless, given these features plus improved speed for large numbers of files in an archive, I thought I would give ratarmount a try. However, I don’t really have archives or projects with a large number of files, so I can’t test the performance in this article.

Installation

Installation is easy with pip or conda, or, if things don’t go well, you can build ratarmount from source with a few package requirements. For this article, I go the conda route. Listing 1 shows some of the output from the installation.

Listing 1: Installing ratarmount

$ conda install -c conda-forge ratarmount
Retrieving notices: done
Channels:
- conda-forge
- defaults
Platform: linux-64
Collecting package metadata (repodata.json): done
Solving environment: done

## Package Plan ##

environment location: /home/laytonjb/miniconda3

added / updated specs:
- ratarmount

The following packages will be downloaded:
...

Proceed ([y]/n)? y

Downloading and Extracting Packages:
Preparing transaction: done

Verifying transaction: done

Executing transaction: done

To mount the archive, I use:

$ ratarmount data1_08022025.tar.gz /home/laytonjb/DATA_STORE/DATA1

This command works, but the archive is mounted read-only (not shown). To change or add files, you must specify a directory where the changes can be stored. For this example, I create a new directory named /home/laytonjb/DATA_STORE/DATA1_WRITE. You can put this directory anywhere in the filesystem, as long as you can write to it, but be careful: You don’t want to write or change anything in that directory directly, only through the ratarmount mountpoint. After creating the directory, you can mount the archive with the ratarmount command in Listing 2.

Listing 2: Mounting the Archive Locally

(base) laytonjb@laytonjb-MINI-S:~/DATA_STORE$ mkdir DATA1_WRITE
(base) laytonjb@laytonjb-MINI-S:~/DATA_STORE$ ratarmount -w /home/laytonjb/DATA_STORE/DATA1_WRITE data1_08022025.tar.gz /home/laytonjb/DATA_STORE/DATA1
Creating new SQLite index database at :memory:
Creating offset dictionary for /home/laytonjb/DATA_STORE/data1_08022025.tar.gz ...
Creating new SQLite index database at /home/laytonjb/DATA_STORE/data1_08022025.tar.gz.index.sqlite
Creating offset dictionary for /home/laytonjb/DATA_STORE/data1_08022025.tar.gz took 0.05s
Writing out TAR index to /home/laytonjb/DATA_STORE/data1_08022025.tar.gz.index.sqlite took 0s and is sized 245760 B
Building cache for union mount (timeout after 60s)...
Cached mount sources for 5 folders up to a depth of 3 in 0.0264s for faster union mount.
Created mount point at: /home/laytonjb/DATA_STORE/DATA1

The mounting process reports what it has just done. Notice that it creates an SQLite index database, which is an index of the files in the archive that makes jumping to individual files much faster. It then builds a union of the archive and the writable directory and mounts that union at the mountpoint.
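To make the idea of an archive index concrete, here is a minimal Python sketch of the general technique: record where each file's data starts inside a tar file so a later read can seek straight to it. This is not ratarmount's code or its database schema. The table layout and the function names build_index and read_member are my own assumptions, and the sketch only handles uncompressed tar files.

```python
import io
import os
import sqlite3
import tarfile
import tempfile


def build_index(tar_path, db_path=":memory:"):
    """Record name, data offset, and size of every regular file in a tar archive."""
    db = sqlite3.connect(db_path)
    # Hypothetical schema for illustration only (not ratarmount's).
    db.execute(
        "CREATE TABLE IF NOT EXISTS files "
        "(name TEXT PRIMARY KEY, offset INTEGER, size INTEGER)"
    )
    with tarfile.open(tar_path, "r:") as tar:  # "r:" accepts uncompressed tar only
        rows = [(m.name, m.offset_data, m.size) for m in tar if m.isfile()]
    db.executemany("INSERT OR REPLACE INTO files VALUES (?, ?, ?)", rows)
    db.commit()
    return db


def read_member(tar_path, db, name):
    """Seek directly to a member's data by index lookup instead of scanning the archive."""
    row = db.execute(
        "SELECT offset, size FROM files WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise FileNotFoundError(name)
    offset, size = row
    with open(tar_path, "rb") as fh:
        fh.seek(offset)
        return fh.read(size)


if __name__ == "__main__":
    # Build a small throwaway archive so the sketch runs on its own.
    workdir = tempfile.mkdtemp()
    tar_path = os.path.join(workdir, "demo.tar")
    with tarfile.open(tar_path, "w") as tar:
        for name, data in [("a.txt", b"hello"), ("sub/b.txt", b"world")]:
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))

    index = build_index(tar_path)
    print(read_member(tar_path, index, "sub/b.txt"))
```

The one-time scan happens in build_index. After that, each lookup is a single database query plus a seek, which is why an index pays off when many files are read in no particular order.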

To see if the files are there, take a look (note that I work from (base) laytonjb@laytonjb-MINI-S:~/DATA_STORE/DATA1/DATA1 for the following commands):

$ ls -s
total 0
0 cats_dogs_light
$ cd cats_dogs_light/
cats_dogs_light$ ls -s
total 0
0 test 0 train
cats_dogs_light$ cd test/
cats_dogs_light/test$ ls -s
total 9222
31 cat.9818.jpg 28 cat.9892.jpg 23 cat.9965.jpg 43 dog.9815.jpg 13 dog.9888.jpg
...

Everything seems to be there. As an experiment, add a zero-length file to the cats_dogs_light directory:

cats_dogs_light$ ls -s
total 0
0 test 0 train
cats_dogs_light$ touch test.py
cats_dogs_light$ ls -s
total 0
0 test 0 test.py 0 train

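The output shows that the new file has been created successfully in the mounted archive. Because the archive itself is read-only, the file is actually stored in the DATA1_WRITE directory.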

To unmount a ratarmount archive, you can use either fusermount -u or just umount (from the (base) laytonjb@laytonjb-MINI-S:~/DATA_STORE directory). I personally stick to umount because it is the universal command:

$ umount /home/laytonjb/DATA_STORE/DATA1/
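If you unmount from a script, it can help to confirm that the mountpoint is really gone before touching the archive or the write directory. The following Python sketch polls os.path.ismount from the standard library. The function name wait_until_unmounted and the timeout values are hypothetical choices of mine, not part of ratarmount.

```python
import os
import time


def wait_until_unmounted(mountpoint, timeout=5.0, poll=0.1):
    """Return True once mountpoint is no longer a mount point, False on timeout."""
    deadline = time.monotonic() + timeout
    while os.path.ismount(mountpoint):
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
    return True
```

For the example in this article, you would call wait_until_unmounted("/home/laytonjb/DATA_STORE/DATA1") after running umount.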

After unmounting, look at the files to see what's there (Listing 3). The first thing to notice is the file ending in sqlite, which is the archive index that ratarmount uses to improve performance when accessing various files in the archive. It has the same name as the archive but adds .index.sqlite to the end. When you remount the archive, this file will be read rather than going through the archive to build the index.

Listing 3: Archive File Content

(base) laytonjb@laytonjb-MINI-S:~/DATA_STORE$ ls -s
total 31400
4 DATA1 240 data1_08022025.tar.gz.index.sqlite
31152 data1_08022025.tar.gz 4 DATA1_WRITE

The second thing to notice is the directory DATA1_WRITE, which was created to store the additions and changes to the archive. It is just a directory, so its contents are easy to find and inspect.

If you want to make the additions and changes part of the original archive, you have to follow several steps that are outlined in the README file in the GitHub repository. You could create a script to do this for you, but I would recommend thoroughly testing it before putting it into production.
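To illustrate what such a script has to accomplish, the following Python sketch writes a new compressed archive that contains the original members plus the files from the write directory. It is not the procedure from the ratarmount README. The function merge_overlay and its exclude parameter are hypothetical, it assumes the paths in the write directory match the member names in the archive, and it does not handle deletions or any bookkeeping files the tool might keep in that directory (pass those names in exclude).

```python
import os
import tarfile


def merge_overlay(src_archive, overlay_dir, dst_archive, exclude=()):
    """Write dst_archive with the members of src_archive plus the files in overlay_dir.

    Overlay files replace archive members that have the same path.
    Deletions recorded by the mounting tool are NOT handled here.
    """
    # Map archive-style relative paths to the real files in the overlay.
    overlay = {}
    for root, _dirs, files in os.walk(overlay_dir):
        for fname in files:
            full = os.path.join(root, fname)
            rel = os.path.relpath(full, overlay_dir).replace(os.sep, "/")
            if rel not in exclude:
                overlay[rel] = full

    with tarfile.open(src_archive, "r:*") as src, \
            tarfile.open(dst_archive, "w:gz") as dst:
        for member in src:
            if member.name in overlay:
                continue  # superseded by the overlay copy
            fobj = src.extractfile(member) if member.isfile() else None
            dst.addfile(member, fobj)
        for rel, full in sorted(overlay.items()):
            dst.add(full, arcname=rel)
    return dst_archive
```

Writing to a separate destination file leaves the original archive untouched until you have checked the result, which fits the advice to test thoroughly before putting anything into production.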

Summary

The author of ratarmount found a different way to mount an archive as a user, with an eye to performance, specifically for archives with a large number of files. Although I prefer having additions and changes written back to the archive automatically when I unmount it, as archivemount does, ratarmount serves other objectives as well (see the useful features mentioned at the top of this article).
