Building a Virtual NVMe Drive

An economical and high-performing hybrid NVMe SSD is exported to host servers that use it as a locally attached NVMe device.

Profiling Python Code

Profiling Python code – as a whole or by function – shows where you should spend time speeding up your programs.

Linux device mapper writecache

Kicking write I/O operations into overdrive with the Linux device mapper writecache.

Porting CUDA to HIP

Give your proprietary CUDA code new life with an open platform for HPC.

High-Performance Python – Distributed Python

Scale Python GPU code to distributed systems and your laptop.

High-Performance Python – GPUs

Python finally has interoperable tools for programming the GPU – with or without CUDA.

High-Performance Python – Compiled Code and Fortran Interface

Fortran functions called from Python bring complex computations to a scriptable language.

High-Performance Python – Compiled Code and C Interface

Although Python is a popular language, in the high-performance world, it is not known for being fast. A number of tactics have been employed to make Python faster. We look at three: Numba, Cython, and ctypes.

OpenMP – Coding Habits and GPUs

In this third and last article on OpenMP, we look at good OpenMP coding habits and present a short introduction to employing OpenMP with GPUs.

In the Loop

Diving deeper into OpenMP loop directives for parallel code.