Simple HDF5 in Python and Fortran

Summary

HDF5 has many features that make it probably the most widely used standard file format in HPC today: It is flexible, it runs on many platforms, it has interfaces for a large number of languages, and it is easy to use.

In this article, I showed how easy HDF5 is to use with two languages: Python and Fortran. Python is not a language the HDF Group directly supports in its distribution, so I used a third-party interface, h5py, a Pythonic interface to HDF5 that is very simple to use, as demonstrated in the quick example.

I used Fortran to represent compiled languages. Fortran is one of the languages that the HDF Group supports in its distribution. Writing data to and reading data from an HDF5 file in Fortran is not difficult, although it takes a bit more work than in Python.

The hierarchical nature of the HDF5 format makes writing and reading data very natural in both Python and Fortran. Although the examples here are simple, it is possible to manipulate complicated data structures with either language and perform I/O very efficiently to and from an HDF5 file.
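As a compact reminder of what that hierarchy looks like in practice, the following is a minimal h5py sketch, not the example from the article itself. The file name, the group path simulation/run1, the dataset name temperature, and the units attribute are all hypothetical names chosen for illustration. It assumes h5py and NumPy are installed.

```python
import os
import tempfile

import h5py
import numpy as np


def write_and_read(path):
    """Write a small hierarchy to an HDF5 file, then read it back."""
    # Illustrative data only; the values have no physical meaning.
    temperature = np.linspace(280.0, 300.0, 5)

    with h5py.File(path, "w") as f:
        # Intermediate groups ("simulation" and "run1") are created
        # automatically from the slash-separated dataset path.
        dset = f.create_dataset("simulation/run1/temperature", data=temperature)
        # Attributes attach small pieces of metadata to a dataset or group.
        dset.attrs["units"] = "K"

    with h5py.File(path, "r") as f:
        dset = f["simulation/run1/temperature"]
        values = dset[...]  # read the whole dataset into a NumPy array
        units = dset.attrs["units"]
        groups = list(f["simulation"].keys())  # names directly under /simulation

    return values, units, groups


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        values, units, groups = write_and_read(os.path.join(tmp, "example.h5"))
        print(groups, units, values)
```

Groups behave much like directories and datasets much like files, so the path in the read step mirrors the path used in the write step. The Fortran interface follows the same model, but with explicit calls to open and close each file, group, and dataset.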

One key capability of HDF5 that I didn't cover is Parallel HDF5, which is included in the source code and is enabled through a configure option. With Parallel HDF5, several processes, either on the same node or on different nodes, can write to the same file at the same time. This capability can reduce the time an application spends on I/O, because each process performs a portion of the I/O. In the next article, I want to talk about using HDF5 with parallel applications.