Using SquashFS

Using SquashFS is not difficult and involves only two steps. The first step is to create a filesystem image using the SquashFS tools. You can create an image of an entire filesystem, a directory, or even a single file. The second step is to mount that image, either directly (if it was written to a block device) or through a loopback device (if it is a file).

The tool that creates the image is called mksquashfs. It has a number of options that allow control over virtually all aspects of the image. The man page is not very long, and it is definitely worth looking through the various options. Any user can create an image of any data they are able to read. However, mounting it requires root access (or at least sudo access).

As an example, I'll take a directory (/home/laytonjb/20170502) on my desktop where I have stored PDFs, ZIP files, and other bits of information and articles that I collect throughout the month (I’m a digital hoarder). I want to compress this directory and all its subdirectories and files. Then, I want to mount it read-only so I can access the information but still save some space.

Before compression the directory was about 358MB:

$ du -sh
358M    .

The first step is to create the image file, which can be done by the user as long as the resulting image is stored somewhere the user has write permission:

$ time mksquashfs /home/laytonjb/20170502 /home/laytonjb/squashfs/20170502.sqsh
Parallel mksquashfs: Using 4 processors
Creating 4.0 filesystem on /home/laytonjb/squashfs/20170502.sqsh, block size 131072.
[================================================-] 2904/2904 100%
Exportable Squashfs 4.0 filesystem, gzip compressed, data block size 131072
        compressed data, compressed metadata, compressed fragments, compressed xattrs
        duplicates are removed
Filesystem size 335196.73 Kbytes (327.34 Mbytes)
        91.53% of uncompressed filesystem size (366234.01 Kbytes)
Inode table size 8424 bytes (8.23 Kbytes)
        50.01% of uncompressed inode table size (16846 bytes)
Directory table size 2199 bytes (2.15 Kbytes)
        63.72% of uncompressed directory table size (3451 bytes)
Xattr table size 54 bytes (0.05 Kbytes)
        100.00% of uncompressed xattr table size (54 bytes)
Number of duplicate files found 1
Number of inodes 94
Number of files 93
Number of fragments 5
Number of symbolic links  0
Number of device nodes 0
Number of fifo nodes 0
Number of socket nodes 0
Number of directories 1
Number of ids (unique uids + gids) 1
Number of uids 1
        laytonjb (1000)
Number of gids 1
        laytonjb (1000)

Notice that the command gives a reasonable amount of output without being too verbose.

I used the command defaults, which means a block size of 128KiB (131,072 bytes) and the use of gzip to compress the data. In the output, mksquashfs reports that the image is 91.53% of the uncompressed size, or about 327MB (327.34 Mbytes).
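
The percentage and the size in Mbytes can be reproduced from the two Kbyte figures in the output above. The following is a minimal sketch; the function names ratio_pct and kb_to_mb are hypothetical helpers introduced here for illustration, not part of the SquashFS tools.

```shell
# Hypothetical helpers for checking the figures printed by mksquashfs.

# Compressed size as a percentage of the uncompressed size.
# $1 = compressed Kbytes, $2 = uncompressed Kbytes
ratio_pct() {
    awk -v c="$1" -v u="$2" 'BEGIN { printf "%.2f\n", 100 * c / u }'
}

# Convert Kbytes to Mbytes (1 Mbyte = 1024 Kbytes, as mksquashfs uses).
kb_to_mb() {
    awk -v k="$1" 'BEGIN { printf "%.2f\n", k / 1024 }'
}

ratio_pct 335196.73 366234.01   # values copied from the output above
kb_to_mb 335196.73
```

The two calls print 91.53 and 327.34, matching the mksquashfs summary.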

I also used the time command to measure how long mksquashfs took to run. The results were:

real    0m7.675s
user   0m29.074s
sys     0m1.002s

Less than 8 seconds of wall-clock time is pretty fast for compressing 358MB of data (on an SSD).
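
The user time is larger than the real time because mksquashfs ran in parallel on four processors, and user time is summed over all of them. As a rough sketch, dividing user time by real time estimates how many processors were kept busy on average. The helper names to_seconds and cpu_ratio are hypothetical, and system time is ignored for simplicity.

```shell
# Hypothetical helpers for interpreting the output of the time command.

# Convert the bash time format (for example 0m29.074s) to seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Ratio of user time to real time: roughly the average number of busy processors.
# $1 = user time, $2 = real time
cpu_ratio() {
    awk -v u="$(to_seconds "$1")" -v r="$(to_seconds "$2")" \
        'BEGIN { printf "%.2f\n", u / r }'
}

cpu_ratio 0m29.074s 0m7.675s    # values copied from the timing above
```

The result is 3.79, so on average almost all four processors were busy during the run.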

The next step is to mount the SquashFS image as you would any other filesystem. Out of the box, root needs to do this (directly or through sudo), because ordinary users are not allowed to mount filesystems.

# mount -t squashfs /home/laytonjb/squashfs/20170502.sqsh /home/laytonjb/20170502_new -o loop
$ mount
/home/laytonjb/squashfs/20170502.sqsh on /home/laytonjb/20170502_new type squashfs (ro,relatime,seclabel)

It all looks good. The next check is to look in /home/laytonjb/20170502_new to make sure everything is there and the permissions are as expected:

$ ls -lsat
  830 -rw-r--r--.  1 laytonjb laytonjb   848854 Jun 10 13:58 mesos.pdf
  535 -rw-r--r--.  1 laytonjb laytonjb   546505 Jun 10 13:58 Martins2003CSD.pdf
 8803 -rw-r--r--.  1 laytonjb laytonjb  9013307 Jun 10 13:58 Hwang2012c.pdf

I can look at the files, and they are owned by me.
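
If root access is not available, the contents of an image can still be checked without mounting it, because the unsquashfs tool only needs read access to the image file. The following is a small sketch that assumes the SquashFS tools are installed; inspect_image is a hypothetical wrapper name, not a command that ships with them.

```shell
# Hypothetical helper: inspect a SquashFS image file without mounting it.
inspect_image() {
    img=$1
    if ! command -v unsquashfs >/dev/null 2>&1; then
        echo "unsquashfs not found (install the SquashFS tools)" >&2
        return 1
    fi
    if [ ! -f "$img" ]; then
        echo "no such image: $img" >&2
        return 1
    fi
    unsquashfs -s "$img"            # superblock summary (compressor, block size, ...)
    unsquashfs -l "$img" | head     # first few paths stored in the image
}
```

For the example here, the call would be inspect_image /home/laytonjb/squashfs/20170502.sqsh. When the mounted copy is no longer needed, root can detach it with umount /home/laytonjb/20170502_new.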

Optimization Study

The two major options you are likely to use are -comp [comp] and -b [bsize]. The first option allows you to specify the compression algorithm used (from the current options listed earlier). The second option allows you to control the block size (from the default of 128KiB to the maximum of 1MiB). Larger block sizes can help improve the amount of compression.

A simple command that uses lzma compression and a 1MiB block size would be:

$ mksquashfs /home/laytonjb/20170502 /home/laytonjb/squashfs/20170502.sqsh -comp lzma -b 1048576
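
To compare several compressors and block sizes, the same command can be generated in a loop. The sketch below only prints the commands so they can be reviewed before being run (for example, prefixed with time). The per-combination output file names are an assumption made for illustration. The -noappend option is included because mksquashfs appends to an existing image by default, and whether a given compressor is available depends on how the SquashFS tools were built.

```shell
# Sketch: print one mksquashfs command per compressor and block size.
src=/home/laytonjb/20170502
outdir=/home/laytonjb/squashfs

# $1 = compressor, $2 = block size in bytes
# The output file name pattern is hypothetical.
squash_cmd() {
    echo "mksquashfs $src $outdir/20170502-$1-$2.sqsh -comp $1 -b $2 -noappend"
}

for comp in gzip lzo xz lzma; do
    for bsize in 131072 1048576; do     # 128KiB and 1MiB
        squash_cmd "$comp" "$bsize"
    done
done
```

Printing rather than executing keeps the sketch safe to try; piping the output to sh would run all eight builds in sequence.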

The directory I’ve used in the examples is full of PDF and ZIP files. I didn’t expect it to compress too much, but I did get some compression. As an experiment, I tried all four compression techniques with the default block size, 128KiB, and the maximum block size, 1MiB. The results are shown in the table.

Compression Technique   Block Size   User Time (mm:ss.sss)   Compressed Size (% of original)
gzip                    128KiB       00:29.074               91.53%
lzo                     128KiB       01:36.262               92.31%
xz                      128KiB       03:14.064               90.49%
lzma                    128KiB       03:10.494               90.48%
gzip                    1MiB         00:31.050               91.35%
lzo                     1MiB         01:47.967               92.08%
xz                      1MiB         03:47.730               88.71%
lzma                    1MiB         03:44.004               88.78%

Pretty obviously, the fastest compression technique is gzip, with little difference in user time between the two block sizes (a two-second difference, or about 7%). The larger block size gave only a tiny bit of extra compression.

The xz and lzma algorithms result in the most compression and take the longest, much longer than gzip, but even at the default block size they compress the data by about 10%. Using the largest block size, they get a little more compression: a little over 11%.
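
The trade-off between time and space is easier to see when each result is expressed as space saved and as a slowdown relative to the default gzip run. The following sketch does that arithmetic on the measurements from the table; summarize is a hypothetical helper name, and the first row is treated as the baseline.

```shell
# Columns: technique, block size, user time (mm:ss.sss), compressed size (% of original)
# Values are copied from the results table.
results='gzip 128KiB 00:29.074 91.53
lzo 128KiB 01:36.262 92.31
xz 128KiB 03:14.064 90.49
lzma 128KiB 03:10.494 90.48
gzip 1MiB 00:31.050 91.35
lzo 1MiB 01:47.967 92.08
xz 1MiB 03:47.730 88.71
lzma 1MiB 03:44.004 88.78'

# Hypothetical helper: report space saved and slowdown versus the first row.
summarize() {
    awk '{
        split($3, t, ":")
        secs = t[1] * 60 + t[2]
        if (NR == 1) base = secs      # first row (gzip, 128KiB) is the baseline
        printf "%s %s saved=%.2f%% slowdown=%.1fx\n", $1, $2, 100 - $4, secs / base
    }'
}

echo "$results" | summarize
```

For example, xz with a 1MiB block size saves 11.29% of the space but uses about 7.8 times the user time of the default gzip run.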

You might scoff at 10%, but remember that these are binary files (PDFs and ZIP archives), many of which are already compressed. If you have 100TB of data, 10% is 10TB. Not too bad. If you have 1PB, then 10% is 100TB, which is quite a bit of space.
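
The same estimate can be made for any dataset size. This is a minimal sketch that assumes the savings measured on this one directory carry over to other data, which will not hold in general; space_saved is a hypothetical helper name.

```shell
# Hypothetical helper: absolute space saved for a given dataset size.
# $1 = dataset size (any unit), $2 = percent saved; result is in the same unit as $1
space_saved() {
    awk -v size="$1" -v pct="$2" 'BEGIN { printf "%g\n", size * pct / 100 }'
}

space_saved 100 10      # 100TB of data at 10% savings, in TB
space_saved 1000 10     # 1PB expressed as 1000TB, in TB
```

The two calls print 10 and 100, that is, 10TB and 100TB.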