Mounting Archives with ratarmount

I previously wrote about archivemount, which mounts compressed archives created with common tools such as tar and gzip, so everything for a user project is contained in a single file. I’ve found these archives to be very handy in keeping projects together and saving storage space, but in my research for that article, I ran across another tool for mounting archives named ratarmount, which I like to call Reptar.

The author of ratarmount found that archivemount was too slow for large archives that contained lots of files – especially when files are read in a pattern that looks like random access to the mounted archive. However, ratarmount goes well beyond faster file access, adding other really useful features, such as:

  • use of multiple cores by parallel compression tools (you can do this with archivemount, but you have to use an alias);
  • recursive mounting (“it's full of TARs”);
  • read-only mountings;
  • union mounts, which let you combine multiple archives and bind-mounted directories under the same mountpoint (a very cool capability if you need it); and
  • accommodating remote files and directories by FTP, HTTP, HTTPS, SFTP, SSH, Git, GitHub, S3, Samba v2 and v3, Dropbox, and possibly others.
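
The alias workaround mentioned for archivemount typically swaps a parallel compressor such as pigz in for gzip. A minimal sketch of the underlying mechanism with tar's --use-compress-program option (plain gzip is used here so the example runs anywhere; substitute pigz to get the multicore behavior):

```shell
# Create a small scratch project to archive.
mkdir -p /tmp/ratar_demo/project
echo "hello" > /tmp/ratar_demo/project/notes.txt

# Pack with an explicit compression program. Swapping gzip for pigz
# here is the usual way to get multicore compression with tar.
tar --use-compress-program=gzip -cf /tmp/ratar_demo/project.tar.gz \
    -C /tmp/ratar_demo project

# Confirm the archive contents.
tar -tzf /tmp/ratar_demo/project.tar.gz
```

How the alias trick is wired up for archivemount specifically isn't shown in the article, so treat the paths and directory names above as illustrative only.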

One aspect of ratarmount that I don’t like is that it mounts the archive read-only and then keeps any changes or new files in a separate directory. You have to keep both the archive and that directory to have a complete view of the archive as it should be; if you lose or corrupt either one, you lose data. Alternatively, you have to uncompress the archive and run through several manual steps to commit the changes and additions to the original archive, followed by compressing the archive again (a long and tedious task).

I prefer the archivemount approach that just rebuilds the complete archive for you when you unmount it. It might be slower than ratarmount, but I find it an easier solution. Nonetheless, given these features plus improved speed for large numbers of files in an archive, I thought I would give ratarmount a try. However, I don’t really have archives or projects with a large number of files, so I can’t test the performance in this article.

Installation

Installing can be very easy with pip or conda or, if things don’t go well, you can build it from source with a few package requirements. For this article, I go the conda route. Listing 1 shows some of the output from the installation.

Listing 1: Installing ratarmount

$ conda install -c conda-forge ratarmount
Retrieving notices: done
Channels:
- conda-forge
- defaults
Platform: linux-64
Collecting package metadata (repodata.json): done
Solving environment: done

## Package Plan ##

environment location: /home/laytonjb/miniconda3

added / updated specs:
- ratarmount

The following packages will be downloaded:
...

Proceed ([y]/n)? y

Downloading and Extracting Packages:
Preparing transaction: done
Verifying transaction: done
Executing transaction: done

To mount the archive, I use:

$ ratarmount data1_08022025.tar.gz /home/laytonjb/DATA_STORE/DATA1

This command works fine, but the archive is mounted read-only (I haven’t shown this). To be able to change or add files to the archive, you must specify a directory where those files can be stored. For this example, I create a new directory named /home/laytonjb/DATA_STORE/DATA1_WRITE. You can put this directory anywhere you want in the filesystem, as long as you can write to it, but be careful: you don’t want to write or change anything in that directory except through the ratarmount mountpoint. After this, you can mount the archive locally with the ratarmount command in Listing 2.

Listing 2: Mounting the Archive Locally

(base) laytonjb@laytonjb-MINI-S:~/DATA_STORE$ mkdir DATA1_WRITE
(base) laytonjb@laytonjb-MINI-S:~/DATA_STORE$ ratarmount -w /home/laytonjb/DATA_STORE/DATA1_WRITE data1_08022025.tar.gz /home/laytonjb/DATA_STORE/DATA1
Creating new SQLite index database at :memory:
Creating offset dictionary for /home/laytonjb/DATA_STORE/data1_08022025.tar.gz ...
Creating new SQLite index database at /home/laytonjb/DATA_STORE/data1_08022025.tar.gz.index.sqlite
Creating offset dictionary for /home/laytonjb/DATA_STORE/data1_08022025.tar.gz took 0.05s
Writing out TAR index to /home/laytonjb/DATA_STORE/data1_08022025.tar.gz.index.sqlite took 0s and is sized 245760 B
Building cache for union mount (timeout after 60s)...
Cached mount sources for 5 folders up to a depth of 3 in 0.0264s for faster union mount.
Created mount point at: /home/laytonjb/DATA_STORE/DATA1

The mounting process gives you some information about what it’s just done. Notice that it creates an SQLite database – an index of the files in the archive – to make jumping to files much faster. It then creates a union mount of the archive and the writable directory and mounts it at the mountpoint.

To see if the files are there, take a look (note that I work from (base) laytonjb@laytonjb-MINI-S:~/DATA_STORE/DATA1/DATA1  for the following commands):

$ ls -s
total 0
0 cats_dogs_light
$ cd cats_dogs_light/
cats_dogs_light$ ls -s
total 0
0 test 0 train
cats_dogs_light$ cd test/
cats_dogs_light/test$ ls -s
total 9222
31 cat.9818.jpg 28 cat.9892.jpg 23 cat.9965.jpg 43 dog.9815.jpg 13 dog.9888.jpg
...

Everything seems to be there. As an experiment, add a zero-length file to the cats_dogs_light  directory:

cats_dogs_light$ ls -s
total 0
0 test 0 train
cats_dogs_light$ touch test.py
cats_dogs_light$ ls -s
total 0
0 test 0 test.py 0 train

The output shows that a new file has been created successfully in the mounted archive (the file itself lands in the writable directory, not in the archive).

To unmount a ratarmount archive, you can use a couple of commands (from the (base) laytonjb@laytonjb-MINI-S:~/DATA_STORE directory), such as fusermount -u or just umount. I personally stick to umount because it is the universal command:

$ umount /home/laytonjb/DATA_STORE/DATA1/

After unmounting, look at the files to see what's there (Listing 3). The first thing to notice is the file ending in .index.sqlite, which is the archive index that ratarmount uses to improve performance when accessing various files in the archive. It has the same name as the archive with .index.sqlite appended. When you remount the archive, this file is read rather than scanning the archive to rebuild the index.

Listing 3: Archive File Content

(base) laytonjb@laytonjb-MINI-S:~/DATA_STORE$ ls -s
total 31400
4 DATA1 240 data1_08022025.tar.gz.index.sqlite
31152 data1_08022025.tar.gz 4 DATA1_WRITE

The second thing to notice is the directory DATA1_WRITE, which was created to store the additions and changes to the archive. It is just a directory, so it’s easy to find and examine its contents.

If you want to make the additions and changes part of the original archive, you have to follow several steps that are outlined in the README file in the GitHub repository. You could create a script to do this for you, but I would recommend thoroughly testing it before putting it into production.
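
At its core, that procedure boils down to unpack, overlay, and repack. A rough sketch of the flow with plain tar follows (all paths are hypothetical stand-ins for the article's files, and the sketch ignores deletions, which the real overlay also tracks, so treat it as an illustration rather than a substitute for the documented steps):

```shell
# Hypothetical stand-ins for the article's archive and overlay directory;
# the setup lines fake an existing archive and one overlay addition.
DEMO=/tmp/commit_demo
mkdir -p "$DEMO/orig" "$DEMO/DATA1_WRITE" "$DEMO/work"
echo "original" > "$DEMO/orig/old.txt"
tar -czf "$DEMO/data1.tar.gz" -C "$DEMO/orig" .
echo "print('new')" > "$DEMO/DATA1_WRITE/test.py"

# 1. Unpack the original archive into a work directory.
tar -xzf "$DEMO/data1.tar.gz" -C "$DEMO/work"

# 2. Copy the overlay's new and changed files over the unpacked tree.
cp -a "$DEMO/DATA1_WRITE/." "$DEMO/work/"

# 3. Repack the merged tree, replacing the original archive.
tar -czf "$DEMO/data1.tar.gz" -C "$DEMO/work" .
```

After step 3, the rebuilt data1.tar.gz contains both the original files and the overlay's additions, which is the end state the README's commit procedure is after.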

Summary

The author of ratarmount found a different way to mount an archive as a user, with an eye to performance for archives with a large number of files. Although I like archivemount's ability to fold additions and changes into the archive automatically at unmount time, ratarmount achieves other objectives as well (see the useful features mentioned at the top of this article).

Related content

  • Compressed Archives for User Projects

    The archivemount tool lets you mount, code, test, and unmount compressed archives by project.

  • Filesystem Encryption

    The revelation of wide-spread government snooping has sparked a renewed interest in data storage security via encryption. In this article, we review some options for encrypting files, directories, and filesystems on Linux.

  • Using rsync for Backups

    Although commercial Linux backup tools are available, many people prefer open source to better understand and control the backup process. One open source tool that can do both full and incremental backups is rsync.

  • Read-only file compression with SquashFS

    If you are an intensive, or even a typical, computer user, you store an amazing amount of data on your personal computers, servers, and HPC systems that you rarely touch. SquashFS is an underestimated filesystem that can address that needed, but little used, data.
