Lead Image © Oleksiy Mark, 123RF.com

Shared Storage with NFS and SSHFS

File Sharing

Article from ADMIN 47/2018
By
HPC systems require shared filesystems to function effectively. Two really good choices for both small and large systems are NFS and SSHFS.

Up to this point, my series on HPC fundamentals has covered PDSH, to run commands in parallel across a cluster's nodes, and Lmod, to allow users to manage their environment so they can specify various versions of compilers, libraries, and tools for building and executing applications. One missing piece is how to share files across a cluster's nodes.

File sharing is one of the cornerstones of client-server computing, HPC, and many other architectures. You can perhaps get away without it, but life won't be easy, whether the cluster has two nodes or thousands. A shared filesystem allows every node to "see" exactly the same data as all the other nodes. For example, if a file is updated on cluster node03, the updates show up on all of the other cluster nodes, as well.

Fundamentally, being able to share the same data with a number of clients is very appealing because it saves space (capacity), ensures that every client has the latest data, improves data management, and, overall, makes your work a lot easier. The price, however, is that you now have to administer and manage a central file server, as well as the client tools that allow the data to be accessed.

Although you can find many shared filesystem solutions, I like to keep things simple until something more complex is needed. A great way to set up file sharing uses one of two solutions: the Network File System (NFS) or SSH File System (SSHFS).

NFS

NFS, the most widely used HPC filesystem, is very easy to set up and performs reasonably well as the primary storage for small to medium-sized clusters. You can even use it on larger clusters for directories that your applications don't use for heavy reading and writing (e.g., /home).

The classic NFS approach to a shared directory is to export a directory or directories from the NFS server to compute nodes (clients). In general, any directory or directories can be exported. At a minimum, you should share /home. A special directory, such as /shared, might also be exported to the nodes. Given that I have already installed software to /usr/local/, I tend to export that directory in addition to /home.

A bonus of sharing /home is that the user's home directories include SSH keys. The cluster can be configured to use passwordless SSH to make running multinode applications much, much easier.
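
As a rough sketch of how passwordless SSH can be set up when /home is shared, the function below creates a key pair without a passphrase and authorizes it for the same user. The function name setup_cluster_key, the ed25519 key type, and the empty passphrase are my assumptions for illustration; check them against your site's security policy before use.

```shell
# Create a passphrase-less key pair in SSH_DIR and authorize it for logins.
# With /home shared over NFS, every node then sees the same key files.
# Usage: setup_cluster_key [SSH_DIR]   (SSH_DIR defaults to ~/.ssh)
setup_cluster_key() {
    ssh_dir="${1:-$HOME/.ssh}"
    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir" || return 1
    # Generate the key only if one does not exist yet
    [ -f "$ssh_dir/id_ed25519" ] ||
        ssh-keygen -q -t ed25519 -N "" -f "$ssh_dir/id_ed25519" || return 1
    pub=$(cat "$ssh_dir/id_ed25519.pub")
    # Append the public key once, even if the function is run again
    grep -qxF "$pub" "$ssh_dir/authorized_keys" 2>/dev/null ||
        printf '%s\n' "$pub" >> "$ssh_dir/authorized_keys"
    chmod 600 "$ssh_dir/authorized_keys"
}
```

Each user would run setup_cluster_key once on any node; because the home directory is shared, the key and the authorized_keys file are then visible cluster-wide.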

Installing NFS on your system varies by distribution. If it isn't installed by default, you can google for instructions. The compute nodes can be configured just as NFS clients, but the server that holds the filesystems to be exported should be configured as an NFS server.

On the NFS server, the first step is to specify the filesystems (directories) that are to be exported to the compute nodes. The /etc/exports file lists the filesystems and the permissions, such as:

/usr/local   192.168.0.1(ro) 192.168.0.2(ro)
/home        192.168.0.1(rw) 192.168.0.2(rw)

In this example, two filesystems are shared (first entry on each line), each to only two nodes (192.168.0.1 and 192.168.0.2), with a blank space separating the host entries. Note that no space may appear between a host and its parenthesized options. Also, /usr/local is shared (exported) as a read-only filesystem to the nodes, and /home is shared as read-write. You can use IP addresses (which are best for static addresses) or hostnames.
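
With more than a handful of nodes, typing these entries by hand gets tedious. The following is a minimal sketch of a helper that builds one /etc/exports line from a directory, an option string, and a list of hosts; the function name exports_line is hypothetical, and the script only prints the lines rather than touching /etc/exports.

```shell
# Print one /etc/exports line: the directory, then each host with its options.
# Usage: exports_line DIR OPTIONS HOST [HOST ...]
exports_line() {
    dir="$1"
    opts="$2"
    shift 2
    line="$dir"
    for host in "$@"; do
        # No space between the host and its options
        line="$line $host($opts)"
    done
    printf '%s\n' "$line"
}

# Reproduce the two entries from the example
exports_line /usr/local ro 192.168.0.1 192.168.0.2
exports_line /home rw 192.168.0.1 192.168.0.2
```

Review the output before appending it to /etc/exports on the NFS server.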

A more advanced option is to export the filesystems to a range of IP addresses or to all IP addresses:

/usr/local   192.168.0.0/255.255.255.0(ro)
/home        192.168.*.*(rw,sync,no_root_squash) 192.168.0.2(rw,sync,no_root_squash)

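In this case, the first line uses an address/netmask pair to export /usr/local to all machines with IP addresses between 192.168.0.0 and 192.168.0.255.

The second line uses wildcards in the IP addresses, along with some extra options. Be aware that the exports(5) man page says wildcard characters should not be used with IP addresses, because they might work only by accident; an address/netmask pair like the one in the first line is the more reliable form. The options are: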

  • ro: The clients can mount the exported filesystem as read only.
  • rw: The clients can mount the exported filesystem as read-write.
  • sync: Forces the data from the clients to be stored on the NFS server before the acknowledgment is sent.
  • no_subtree_check: Prevents subtree checking. If the shared directory is a subdirectory, NFS performs a scan of every directory above it to verify its permissions and details. Disabling the subtree check might increase the reliability of NFS, but reduce security.
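  • no_root_squash: Stops the server from mapping requests from a client's root user to the anonymous user, so root on a client has root access to the exported directory. This option is useful if root access is needed on the clients.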

The filesystems will be exported automatically if the NFS server is rebooted, but you can use the exportfs command to export or unexport the filesystems listed in /etc/exports manually.
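
As a small sketch of the manual route, the two functions below wrap the exportfs calls I would typically reach for after editing /etc/exports. The function names are hypothetical; the exportfs flags (-r, -a, -u, -v) are the standard ones, and both functions need to be run as root on the NFS server.

```shell
# Re-read /etc/exports, apply any changes, and show the result.
reexport_all() {
    exportfs -ra || return 1   # -r: resync with /etc/exports, -a: all entries
    exportfs -v                # list the active exports with their options
}

# Unexport everything (e.g., before maintenance on the server).
unexport_all() {
    exportfs -ua               # -u: unexport, -a: all entries
}
```

Calling reexport_all after each change to /etc/exports avoids having to restart the NFS server.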

On the NFS clients, you mount the exported filesystems with the mount command, as you would on any other filesystem. You can also list the filesystems that you want to mount in the /etc/fstab file. The format of the /etc/fstab entry for NFS filesystems is well documented.
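
To make the client side concrete, here is a minimal sketch that prints /etc/fstab entries for NFS mounts. The server address 192.168.0.100 and the function name nfs_fstab_line are assumptions for illustration; the field order (device, mountpoint, type, options, dump, pass) is the usual fstab layout.

```shell
# Print an /etc/fstab entry for an NFS mount.
# Usage: nfs_fstab_line SERVER EXPORT MOUNTPOINT [OPTIONS]
nfs_fstab_line() {
    printf '%s:%s\t%s\tnfs\t%s\t0 0\n' "$1" "$2" "$3" "${4:-defaults}"
}

# A one-off mount for testing (as root) before making it permanent:
#   mount -t nfs 192.168.0.100:/home /home

# Entries for the two exported filesystems
nfs_fstab_line 192.168.0.100 /home /home
nfs_fstab_line 192.168.0.100 /usr/local /usr/local ro
```

The printed lines can be appended to /etc/fstab on each compute node once the one-off mount works.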

Using /etc/fstab allows you to tune how the node mounts the filesystem, which means you can tune the filesystem on the client for performance. The list of NFS options [1] is fairly extensive, so I won't cover it here. Additionally, the fairly recent article Optimizing Your NFS Filesystem [2] covers many of these options.

One command that is very useful for managing and monitoring NFS filesystems is showmount, which can list the hostname or IP address of each client along with the directory it has mounted, in host:dir format. The command

showmount -e [host]

tells you which filesystems the specified host is exporting, which is useful when run from NFS clients.
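
On a client, the export list can also be checked from a script before attempting a mount. The sketch below filters showmount -e output for a given directory; the function name has_export and the server address in the usage comment are hypothetical.

```shell
# Succeeds if DIR appears in `showmount -e` output read from stdin.
# The first line ("Export list for ...") is a header and is skipped.
# Usage: showmount -e 192.168.0.100 | has_export /home && echo "/home is exported"
has_export() {
    awk -v dir="$1" 'NR > 1 && $1 == dir { found = 1 } END { exit !found }'
}
```

Reading from stdin keeps the function easy to try against saved output when the server is not reachable.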

NFS has been around a long time, has known failure modes, is a standard in the *nix world, and is easy to manage. It is very useful for small, medium, or even large clusters. You have a great deal of control over how the filesystems are exported to the clients. However, NFS versions before version 4 had no real security. Even NFSv4 requires security outside the NFS protocol. If you are concerned about data being intercepted and about security in general, which all of us should be, then perhaps NFS isn't the best option. The next section presents an alternative that has more security.

SSHFS

Filesystems built on Filesystem in Userspace (FUSE) [3] offer several attractive features. Such filesystems are easy to code because they are in user space (have you tried getting a filesystem into the kernel?) and provide lots of flexibility. SSHFS [4] uses FUSE to create a filesystem that can be shared by transmitting the data via SSH.

The SSHFS FUSE-based userspace client mounts and interacts with a remote filesystem as though the filesystem were local (i.e., shared storage). It uses SSH File Transfer Protocol (SFTP), so it's only as secure as SFTP. (I'm not a security expert nor do I play one on TV, so I can't comment on the security of SSH.) SSHFS can be very handy for working with remote filesystems, especially if you only have SSH access to the remote system. Moreover, you don't need to add or run a special server tool on the storage node; SSH just needs to be active on your system. Most firewalls already allow SSH traffic on port 22, so, unlike NFS or CIFS, you don't have to open anything extra: With port 22 open on the firewall, all the other ports can be blocked.

SSHFS is not part of the typical installation, so you need to install it and then check that both FUSE and SSHFS are working.

FUSE and SSHFS on Linux

The first step for any FUSE-based filesystem is to make sure that FUSE itself is included in the kernel or supplied as a module. Many distributions already come with FUSE built in. A simple way to find out whether FUSE is available as a module on your system is to check whether the module is loaded into memory:

$ lsmod
Module                  Size  Used by
fuse                   73530  0
...
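
If FUSE is compiled into the kernel rather than built as a module, it will not show up in the lsmod output. A rough alternative check, sketched below, looks for the fuse filesystem type in /proc/filesystems; the function name fuse_registered is hypothetical.

```shell
# Succeeds if "fuse" is a registered filesystem type, whether FUSE is a
# loaded module or compiled into the kernel.
# Usage: fuse_registered [FILE]   (FILE defaults to /proc/filesystems)
# Example: fuse_registered && [ -c /dev/fuse ] && echo "FUSE is available"
fuse_registered() {
    awk '$NF == "fuse" { found = 1 } END { exit !found }' "${1:-/proc/filesystems}"
}
```

The extra test for the /dev/fuse character device in the example confirms that the device node FUSE clients talk to is present.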

The next step is to install SSHFS, either with your distribution's package management tools or from source. If you need to build it from source, the same simple build steps used by many open source applications apply,

$ ./configure --prefix=/usr
$ make
# make install

where the last step is performed as root. Just be sure to read the SSHFS page for package prerequisites. Notice that I installed SSHFS into /usr. The SSHFS binary is installed into /usr/bin, which is in the standard path (handy tip).

To make sure everything is installed correctly, you can simply try the sshfs -V command as a user (no root needed):

$ sshfs -V
SSHFS version 2.5
FUSE library version: 2.8.3
fusermount version: 2.8.3
using FUSE kernel interface version 7.12

This example is on an old desktop system, so the software versions are also old.
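
Once the version check succeeds, mounting a remote directory takes a single command as an ordinary user. The functions below are a minimal sketch, not a tuned setup: the function names are hypothetical, the remote specification in the usage comment (user@192.168.0.100:/home/user) is a placeholder, and the reconnect option is just one reasonable choice.

```shell
# Mount REMOTE (user@host:/path) on MOUNTPOINT via SSHFS; no root needed.
# Usage: sshfs_mount user@192.168.0.100:/home/user "$HOME/remote"
sshfs_mount() {
    mkdir -p "$2" || return 1
    sshfs "$1" "$2" -o reconnect   # reconnect: re-establish a dropped SSH session
}

# Unmount a FUSE filesystem as the user who mounted it.
# Usage: sshfs_umount "$HOME/remote"
sshfs_umount() {
    fusermount -u "$1"
}
```

After a successful mount, the remote files appear under the mountpoint and can be used like local files.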
