SSHFS is often overlooked as an HPC shared filesystem solution.

SSHFS for Shared Storage

HPC systems typically have some sort of shared filesystem (e.g., NFS, Lustre, BeeGFS), each with pros and cons. One solution often overlooked is sshfs, which belongs to a class of filesystems that work in userspace. For Linux, these filesystems are based on FUSE (Filesystem in Userspace) and have advantages and disadvantages, on which I won’t elaborate here.

The sshfs userspace client mounts and interacts with a remote filesystem as though it were local (i.e., shared storage). It uses the SSH File Transfer Protocol (SFTP) between hosts, so it’s as secure as SSH. (I’m not a security expert, so I can’t comment on the security of SSH and various other tools in the family.) SSHFS can be very handy when working with remote filesystems, especially if you only have SSH access to the remote system: You just need SSH active on both systems. Almost all firewalls are set up to allow port 22 access or have mapped port 22 to a different port that can accommodate SSHFS. All the other ports can be blocked by the firewall. Moreover, SSHFS can also be run by users without root or sudo access.

Installing SSHFS on Linux

Virtually all Linux distributions include FUSE and SSHFS. You can use your distribution package tool(s) to see whether the FUSE package, sometimes labeled libfuse-dev, and sshfs are installed. However, if you have to build it, you will need a couple of tools – Meson and Ninja – which come with almost all distributions. The SSHFS website has good instructions on how to do the build and install.

The simplest way to check whether sshfs is installed is to run the command,

$ sshfs -V
sshfs version 2.8
FUSE library version: 2.9.7
fusermount version: 2.9.7
using FUSE kernel interface version 7.19

(e.g., on Ubuntu 18.04).
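If the version check fails, it helps to know which piece is missing. The following is a minimal sketch, assuming a POSIX shell; check_tool is a hypothetical helper name of my own, and fusermount is the mount helper that ships with FUSE 2 (systems based on FUSE 3 typically ship fusermount3 instead):

```shell
# check_tool NAME: report whether NAME is on the current PATH.
# (check_tool is a hypothetical helper name, not part of sshfs or FUSE.)
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: installed"
    else
        echo "$1: not found"
    fi
}

# Check the client-side tools that an sshfs mount relies on.
for tool in ssh sshfs fusermount; do
    check_tool "$tool"
done
```

Anything reported as not found is a candidate for your distribution's package tool or, failing that, a build from source.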

Initial SSHFS Test

For these initial tests, I’m exchanging data between two servers in my very heterogeneous cluster and using passwordless SSH between these nodes to make testing easier. Configuring SSH not to use passwords is covered in a number of articles on the web, so I won’t cover it here. Before proceeding to the next step, be sure you can use SSH from one node to the next without passwords. Note that you don’t have to be passwordless: You can type in a password if you want (it won't change the functionality).

In addition to the previously mentioned system running Ubuntu 18.04, another system in my setup runs Ubuntu 20.04. I’ll just refer to them as the 18.04 system and the 20.04 system for convenience.

For the initial test, I’ll mount a subdirectory, NOTES, from the 18.04 system onto the 20.04 system. Remember that because SSHFS is in userspace, you can mount and unmount filesystems as a user whenever you want. The general form of the SSHFS mount command is:

$ sshfs user@home:[dir] [local dir]

The form of the command looks very much like mounting other filesystems.

The first thing to note in this command is that [local dir] has to exist and you must be able to access it; otherwise, you will get a mount error. Second, you must have access to the remote directory, user@home:[dir]. The section user@home refers to the user and the name or IP of the remote system.
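Because a missing mount point is the most common cause of a failed mount, you might wrap the two steps together. This is only a sketch under my own assumptions: mount_remote and the DRY_RUN variable are hypothetical names, and user@home:NOTES stands in for whatever remote directory you can reach over SSH.

```shell
# mount_remote REMOTE LOCAL_DIR: create the mount point if needed, then mount.
# Set DRY_RUN=1 to print the sshfs command instead of running it.
# (mount_remote and DRY_RUN are hypothetical names used for illustration.)
mount_remote() {
    remote=$1       # for example, user@home:NOTES
    local_dir=$2    # local mount point; must exist before sshfs runs
    mkdir -p "$local_dir" || return 1
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "sshfs $remote $local_dir"
    else
        sshfs "$remote" "$local_dir"
    fi
}
```

Calling the function with DRY_RUN=1 first lets you confirm the command line before anything is mounted.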

Although SSHFS can be employed a number of ways, most users typically use it for their /home directory or other directories to which they have access – that is, not system directories.

For my example, the command I use on the 20.04 system is:

$ sshfs /home/laytonjb/NOTES2

As with mounting other filesystems, when you mount a remote filesystem, the directory has to exist. If the local directory NOTES2 does not exist on the 20.04 system, you get an error. Also, any files that are in the directory where you mount the remote filesystem are then “hidden” while the filesystem is mounted.

To check whether the filesystem is mounted, use the mount command:

$ mount
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
/dev/nvme1n1p1 on /home type ext4 (rw,relatime)
/dev/sda1 on /data type ext4 (rw,relatime)
tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=6579384k,mode=700,uid=1000,gid=1000)
gvfsd-fuse on /run/user/1000/gvfs type fuse.gvfsd-fuse (rw,nosuid,nodev,relatime,user_id=1000,group_id=1000)
 on /home/laytonjb/NOTES2 type fuse.sshfs (rw,nosuid,nodev,relatime,user_id=1000,group_id=1000)

Notice that the last line shows that the remote filesystem is mounted by sshfs.
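Scanning the full mount output by eye gets tedious in scripts. A small sketch of an automated check follows, assuming the usual /proc/mounts layout (source, mount point, filesystem type, options); is_sshfs_mounted is a hypothetical helper name, and the optional second argument exists only so the function can be pointed at a different file.

```shell
# is_sshfs_mounted DIR [MOUNTS_FILE]
# Succeeds if DIR is listed as a fuse.sshfs mount point. MOUNTS_FILE
# defaults to /proc/mounts, whose fields are: source, mount point, type, ...
# (is_sshfs_mounted is a hypothetical helper name.)
is_sshfs_mounted() {
    awk -v dir="$1" '$2 == dir && $3 == "fuse.sshfs" { found = 1 }
                     END { exit !found }' "${2:-/proc/mounts}"
}

# Example use (commented out so that nothing runs on load):
# is_sshfs_mounted /home/laytonjb/NOTES2 && echo "mounted"
```

For a quick interactive check, mount -t fuse.sshfs limits the listing to sshfs mounts.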

After I checked that the filesystem was mounted, I then simply listed the files in the mounted directory:

$ ls -s
total 60
4 anaconda_notes.txt                4 continuum_notes.txt
4 git-notes.txt                     4 nv-docker-installation-notes.txt
4 run_firefox                       4 conda_notes.txt
4 cuda-notes.txt                    4 jeffy1.txt
4 nvidia-docker-notes.txt           4 ZFS_Lustre_notes.txt.gz
4 container_notes.txt               4 docker-installation-notes.txt
8 new_system_notes.txt              4 quotes_1.txt

These are the files on the remote system, so the remote filesystem has been mounted successfully.

Although this procedure might seem dull, mundane, and boring, it is quite powerful. All of this happens as a user. The root user does not need to be involved, nor do you need sudo permissions. In fact, you can install sshfs in your own account, and if port 22 is open and SSH is working, you can use SSHFS.

Moreover, all of the data traffic between servers is done over SFTP. Unless you change the standard SSH configuration, this means the data will be encrypted when in transit. If you used an encrypted filesystem for the remote directory and for the local directory, you would have end-to-end data encryption that you, the user, set up. Again, no involvement of the root user beyond just configuring user filesystems to be encrypted.

Another subtle thing to notice is that you don’t have to rely on a single server to host all of the data for “clients” as you would expect in an NFS configuration. One system, creatively named system1, could provide data to several other systems; a second server named system2 could provide a different set of data to another set of systems, and so on. The difficult task is remembering which system “owns” what data. Of course, don’t forget to back up your data – wherever it resides.

Unmounting the filesystem is very simple. For my example, I use the umount command on the 20.04 system; then I look at the mount output to make sure the filesystem is unmounted:

$ umount /home/laytonjb/NOTES2
$ mount
sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
/dev/nvme0n1p1 on /boot/efi type vfat (rw,relatime,fmask=0077,dmask=0077,codepage=437,iocharset=iso8859-1,shortname=mixed,errors=remount-ro)
/dev/nvme1n1p1 on /home type ext4 (rw,relatime)
/dev/sda1 on /data type ext4 (rw,relatime)
tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=6579384k,mode=700,uid=1000,gid=1000)

One additional aspect you need to pay attention to is that if you have not modified your SSH settings and the SSH session is idle, it will automatically log out and unmount your SSHFS filesystems. If this is not the behavior you want, you can keep the connection active by adding the following line to your ~/.ssh/config file:

ServerAliveInterval 5

Setting this option sends a “keep alive” signal every five seconds, so the connection appears to be active. You can use a longer time period if you like, but be careful that it’s not too long.

An alternative to modifying your ~/.ssh/config file is to have the system administrator modify the /etc/ssh/ssh_config file in the same manner. This action will affect all users on the system.
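If you prefer to limit the keep-alive setting to the hosts you mount with SSHFS rather than every SSH connection, a Host block does that. The sketch below appends such a block to a config file; add_keepalive is a hypothetical helper name, the 15-second interval is an arbitrary illustration, and ServerAliveCountMax (the number of unanswered keep-alives tolerated before ssh disconnects) is a standard ssh_config option that I added for completeness.

```shell
# add_keepalive CONFIG_FILE HOST INTERVAL
# Append a Host block with keep-alive settings to an ssh client config file.
# (add_keepalive is a hypothetical helper name.)
add_keepalive() {
    config_file=$1
    host=$2
    interval=$3
    cat >> "$config_file" <<EOF

Host $host
    ServerAliveInterval $interval
    ServerAliveCountMax 3
EOF
}

# Example use (commented out so that nothing is modified on load):
# add_keepalive "$HOME/.ssh/config" system1 15
```

Note that ssh uses the first value it finds for each option, so a block appended after an earlier entry that matches the same host will not override it.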

Tuning SSHFS

SSHFS has a number of options that tune how the remote filesystem is mounted. A quick look at the man pages or the online man pages shows that some options appear to influence performance. (Some options can influence security, but I do not cover those in this article.) Specifically, the following options could influence performance:

  • -o sshfs_sync – synchronous writes
  • -o no_readahead – synchronous reads (no speculative readahead)
  • -o cache=BOOL – enables caching {yes,no} (default: yes)
  • -o cache_timeout=N – sets timeout for caches in seconds (default: 20)
  • -o cache_X_timeout=N – sets timeout for {stat,dir,link} caches
  • -o compression=BOOL – enables data compression {yes, no}
  • -o direct_io – uses direct I/O
  • -o kernel_cache – caches files in kernel
  • -o [no]auto_cache – enables caching on the basis of modification times
  • -o max_readahead=N – sets the maximum readahead
  • -o async_read – performs reads asynchronously (default)
  • -o sync_read – performs reads synchronously

Previously, you could specify the SSH cipher directly on the sshfs mount command line. In later versions of SSHFS, this option was removed, but you can still choose a cipher with the ssh_command= option, which replaces the ssh command that sshfs runs and so lets you pass ssh client options (the ones documented for ssh and ssh_config, not sshd_config) to it.

The first option in the list, sshfs_sync, tells SSHFS that any write operation is synchronous, so the write operation doesn’t return success until the data is on disk. This option probably would not improve performance. In some situations, the disk will respond that the data is on disk when it isn’t, but the operating system can’t do anything about it.

Looking through the list, the following options can help performance:

  • The readahead options can help improve performance, particularly if the data is being read sequentially.
  • Caching can help performance.
  • Related to caching, timeouts allow you to adjust the cache to achieve good performance while reducing memory usage.
  • The kernel_cache option can help performance.
  • Data compression can help, but it involves the use of CPU resources at either end. It can also hurt performance if the amount of time to compress the data takes longer than it would to transmit the uncompressed data. However, typical processors have lots of cores, so using one to compress and uncompress data might not noticeably affect performance.
  • If direct_io is important for your application, it can have an effect on performance, so be sure you test this option.
  • The async_read flag can really improve read performance, because the read operation immediately returns even if the data is not yet returned to the requesting host, thus allowing system resources to be used while waiting for the data. However, you might experience resource conflicts and I/O failure.
  • The sync_read flag is similar to the sshfs_sync flag, but it affects reads.
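To compare these options, it helps to generate the candidate command lines in one place. The sketch below only prints commands; it does not mount anything. It assumes the option names listed above (from the sshfs 2.x man page; newer releases may name the cache options differently), and sshfs_cmd, the remote user@home:NOTES, and the 60-second timeout are placeholders of my own rather than recommendations.

```shell
# sshfs_cmd OPTS REMOTE LOCAL_DIR: print an sshfs command line that passes
# OPTS (a comma-separated option list) through a single -o flag.
# (sshfs_cmd is a hypothetical helper name.)
sshfs_cmd() {
    echo "sshfs -o $1 $2 $3"
}

# Three illustrative variants to benchmark against each other.
sshfs_cmd "cache=yes,cache_timeout=60,kernel_cache" user@home:NOTES "$HOME/NOTES2"
sshfs_cmd "compression=yes" user@home:NOTES "$HOME/NOTES2"
sshfs_cmd "no_readahead,sshfs_sync" user@home:NOTES "$HOME/NOTES2"
```

Unmount between runs so that each variant starts with the same cache state.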

OpenSSH has a number of ciphers, which are algorithms that do the encryption or decryption of data. The ciphers supported in OpenSSH 7.3 include:

  • 3des-cbc
  • aes128-cbc
  • aes192-cbc
  • aes256-cbc
  • aes128-ctr
  • aes192-ctr
  • aes256-ctr
  • arcfour
  • arcfour128
  • arcfour256
  • blowfish-cbc
  • cast128-cbc

As of OpenSSH 7.6, the arcfour, blowfish, and cast ciphers have been removed.

Some of these ciphers take more computational resources and more time than others. The configurability of SSH allows you to choose a cipher that meets your needs. For example, arcfour is extremely fast – almost as fast as no encryption – but it provides weak encryption because of known attacks, and, as noted, OpenSSH 7.6 and later no longer include it. If your OpenSSH version still offers arcfour, you are very confident that the systems using the cipher won’t be compromised, and you are interested in the fastest possible data transfer, then arcfour may be something you want to try.
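Because the available ciphers differ between OpenSSH versions, it is safer to pick one from what your client actually reports. The sketch below assumes an OpenSSH client recent enough to support ssh -Q cipher, which prints one supported cipher name per line; pick_cipher is a hypothetical helper name, and the preference order in the example is only an illustration.

```shell
# pick_cipher PREFERRED...: read supported cipher names on stdin (one per
# line, as printed by `ssh -Q cipher`) and print the first preferred name
# that appears in that list. Returns non-zero if none is supported.
# (pick_cipher is a hypothetical helper name.)
pick_cipher() {
    supported=$(cat)
    for want in "$@"; do
        if printf '%s\n' "$supported" | grep -qxF -- "$want"; then
            echo "$want"
            return 0
        fi
    done
    return 1
}

# Example use (commented out; it needs an OpenSSH client on this host):
# cipher=$(ssh -Q cipher | pick_cipher arcfour aes128-ctr aes256-ctr)
# sshfs -o ssh_command="ssh -c $cipher" user@home:NOTES "$HOME/NOTES2"
```

The list reflects only the local client; the server must offer the same cipher, or the connection will fail during negotiation.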

In addition to the sshfs and sshd_config options, you can also tune TCP for better SSHFS performance. By default, TCP configurations in Linux are fairly conservative. You can change parameters, such as the MTU size and buffer sizes for reads and writes. Several articles about this can be found online, including one on the ADMIN HPC website for NFS tuning that you can use as a start.
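Before changing any TCP settings, record what the system currently uses. The sketch below reads the common buffer limits from /proc/sys, the same values that sysctl reports; show_tcp_buffers is a hypothetical helper name, and the optional argument exists only so the function can be pointed at a different directory.

```shell
# show_tcp_buffers [PROC_SYS_DIR]: print the current TCP buffer limits.
# Keys that are missing or unreadable are skipped silently.
# (show_tcp_buffers is a hypothetical helper name.)
show_tcp_buffers() {
    root=${1:-/proc/sys}
    for key in net/core/rmem_max net/core/wmem_max \
               net/ipv4/tcp_rmem net/ipv4/tcp_wmem; do
        if [ -r "$root/$key" ]; then
            printf '%s = %s\n' "$key" "$(cat "$root/$key")"
        fi
    done
}

show_tcp_buffers
```

Keeping a copy of this output makes it easy to return to the defaults if a change hurts performance.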

In a previous article, I tested some of the sshfs, sshd_config, and TCP configuration options, as well as NFS, to get a sense of how SSHFS compared. You’ll find that some of the options can have a huge effect on performance. The choice of cipher and data compression can also have very large effects on performance. Although not tested in the article, it is likely that asynchronous reads and writes will have a large effect on performance, as well.