Shared Storage with NFS and SSHFS

Summary

Shared filesystems are all but mandatory for HPC. Although you can run applications without one, it is not pleasant. A shared filesystem gives every node in the cluster a consistent view of user data and can make administrative tasks easier.

Two main options, particularly if you are just starting out, are NFS and SSHFS. NFS has been around a long time, and its failure modes are well understood. It also comes with every Linux distribution. NFSv4 is the latest version and brings features that are useful for clusters. It is also fairly simple to tune NFS for your workloads and your configuration.
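
To give a sense of how little it takes to get started, here is a minimal sketch of exporting a directory from a head node and mounting it on the compute nodes. The hostname, network address, and paths are placeholders, and the NFS service name varies by distribution:

    # /etc/exports on the NFS server (hypothetical path and cluster network)
    /home    10.0.0.0/24(rw,sync,no_subtree_check)

    # On the server: apply the export and make sure NFS is running
    sudo exportfs -ra
    sudo systemctl enable --now nfs-server   # service name varies by distribution

    # On each compute node: mount the share over NFSv4
    sudo mount -t nfs4 headnode:/home /home

Tuning usually starts with the mount options, such as rsize and wsize, and the number of NFS server threads.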

SSHFS is a bit of a dark horse for clusters. However, it offers the possibility of a shared filesystem using SSH, which can help with security because only port 22 needs to be open (which you need for MPI application communications, anyway). SSHFS also uses SFTP encryption from one node to another. Combined with encrypted devices, you can have an end-to-end encrypted shared data service.
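
For a sense of what this looks like in practice, a minimal sketch follows; the hostname and paths are placeholders, and it assumes the sshfs package is installed and key-based SSH logins work between nodes:

    # On each compute node: mount /home from the data server over SSH
    sudo mkdir -p /home
    sudo sshfs user@datanode:/home /home -o allow_other,reconnect

    # To unmount later
    sudo fusermount -u /home

The allow_other option lets users other than the one performing the mount see the filesystem; when mounting as a regular user, it requires user_allow_other in /etc/fuse.conf.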

SSHFS is very tunable, and you can attain performance very close, if not equal, to NFS. I have not tested SSHFS at any scale larger than 32 clients and one data server, but at that size it was quite stable and worked very well. If you need to go to a larger node count, I recommend testing SSHFS at that scale before committing to it.
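
As a starting point for that testing, the sketch below shows the kinds of options typically adjusted: client-side caching, plus the SSH cipher and compression settings that control CPU overhead. The host and paths are placeholders, and option availability varies with the sshfs and FUSE versions your distribution ships:

    # A tuned SSHFS mount (hypothetical host and paths)
    sudo sshfs user@datanode:/home /home \
        -o reconnect \
        -o cache=yes \
        -o kernel_cache \
        -o Ciphers=aes128-ctr \
        -o Compression=no

Turning off SSH compression and choosing a lightweight cipher usually helps on a fast, local cluster network, where the CPU, not the wire, tends to be the bottleneck.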

The Author

Jeff Layton has been in the HPC business for almost 25 years (starting when he was 4 years old). He can be found lounging around at a nearby Fry's, enjoying the coffee and waiting for sales.