AMD’s ROCm platform brings new freedom and portability to the GPU space.

Exploring AMD’s Ambitious ROCm Initiative

The open source ecosystem is a vast world of free tools and interacting components: drivers, APIs, compilers, and programming languages. The story of Linux and the open source movement has always been about building this ecosystem and completing the pieces of the puzzle to create a flexible, versatile, and all-free computing environment.

For many common scenarios, this constellation of free components is nearly complete; however, the crucial area of high-performance computing (HPC) has had to contend with some non-free software components. In particular, the rise of the Graphics Processing Unit (GPU) has complicated a toolchain that had previously been focused on traditional CPU-based computing. Languages like CUDA evolved as a means of integrating the GPU into conventional C++ programming; however, CUDA was never really envisioned as a universal solution, and it is designed to support the GPU hardware of a single vendor, NVIDIA. Other solutions, such as OpenCL, embrace the concept of open source but do not support a full range of programming alternatives.

As a leading vendor of both GPU and CPU technologies, AMD has taken up the challenge of bringing free, flexible, cross-platform, and language-independent computing to the GPU-accelerated HPC space. The result of this ambitious effort is the Radeon Open Compute Ecosystem (ROCm) platform, which AMD describes as “the first open-source HPC/Hyperscale-class platform for GPU computing that’s also programming-language independent.”

What is ROCm?

The ROCm developers wanted a platform that supports a number of different programming languages and is flexible enough to interface with different GPU-based hardware environments (Figure 1). As you will learn later in this article, ROCm provides direct support for OpenCL, Python, and several common C++ variants. One of the most innovative features of the platform is the Heterogeneous-Compute Interface for Portability (HIP) tool, which offers a vendor-neutral dialect of C++ that is ready to compile for either the AMD or CUDA/NVIDIA GPU environment.

Figure 1: ROCm is designed as a universal platform, supporting multiple languages and GPU technologies.

Lower in the stack, ROCm provides the Heterogeneous Computing Platform, a Linux driver, and a runtime stack optimized for “HPC and ultra-scale class computing.” ROCm’s modular design means the programming stack is easily ported to other environments.


At the heart of the ROCm platform is the Heterogeneous Compute Compiler (HCC). The open source HCC is based on the LLVM compiler infrastructure with the Clang C++ front end. HCC supports several versions of standard C++, including C++11, C++14, and some C++17 features. HCC also supports GPU-based acceleration and other parallel programming features, providing a path for programmers to access the advanced capabilities of AMD GPUs in the same way that the proprietary NVCC CUDA compiler provides access to NVIDIA hardware. AMD says it invested heavily in HCC because integrating GPU acceleration features directly into the compiler represents a chance to “approach computation holistically, on a system level, rather than as a discrete GPU artifact.”

C++ was not created for GPU-based parallel computing, and the standard forms of the language do not have the features necessary to capitalize on all the benefits of AMD’s GPU environment. A programmer who wants to engage the full range of parallel programming options needs to use some form of C++ language extension. In addition to its support for standards-based C++, HCC supports a pair of important parallel programming extensions:

  • C++ AMP (Accelerated Massive Parallelism) – Microsoft’s extension for HPC programming and GPU support.
  • HC (Heterogeneous Computing) – AMD’s own GPU-ready API.

Support for C++ AMP provides an easy transition for programmers who are accustomed to the Microsoft Visual Studio programming environment. Code written for C++ AMP can compile on HCC without the need to adapt.
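The idea behind such parallel extensions can be sketched in standard C++ without any GPU toolchain. Extensions in the C++ AMP family let the programmer write a small function (a kernel) that is invoked once per element of an index space, for example through a parallel_for_each construct, and the runtime then spreads those invocations across GPU threads. The following is a minimal CPU-only sketch of that per-index model, not code for HCC, C++ AMP, or HC: the names for_each_index and saxpy are hypothetical and chosen only for illustration, and the loop runs sequentially where a GPU runtime would run the invocations in parallel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (NOT part of C++ AMP, HC, or HCC): invokes `kernel`
// once for every index in [0, extent). A GPU runtime would distribute these
// invocations across many hardware threads; this sketch runs them one after
// another on the CPU, which gives the same result only because each
// invocation is independent of all the others.
template <typename Kernel>
void for_each_index(std::size_t extent, Kernel kernel) {
    for (std::size_t i = 0; i < extent; ++i) {
        kernel(i);
    }
}

// SAXPY (y = a * x + y) written in the per-index kernel style.
// `y` is taken by value so the caller's vector is left unchanged.
std::vector<float> saxpy(float a, const std::vector<float>& x,
                         std::vector<float> y) {
    assert(x.size() == y.size());  // both inputs must cover the same index space
    for_each_index(x.size(), [&](std::size_t i) {
        // The kernel body touches only element i, so invocations never
        // depend on one another and could safely run in any order.
        y[i] = a * x[i] + y[i];
    });
    return y;
}
```

Calling saxpy(2.0f, x, y) on two equally sized vectors returns a new vector with each element computed independently. That independence between invocations is the property a GPU compiler relies on when it maps a kernel onto thousands of threads.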

According to AMD, the native HC API is “inspired” by AMP; however, “HC has some important differences from C++ AMP, including removing the ‘restrict’ keyword, supporting additional data types in kernels, providing more control over synchronization and data movement, and providing pointer-based memory allocation.”

HCC with the HC and AMP extensions provides a complete solution for GPU-accelerated programming in AMD’s native hardware environment, but one piece of the puzzle remains. The goal of ROCm is to build a language- and vendor-neutral programming environment, and to do so, the ROCm developers knew they would need to build a bridge to the CUDA environment and the hundreds of programs and frameworks designed to work with CUDA. What they really wanted was a way to write code once and then compile it for either the CUDA environment or the HCC/AMD environment. The solution to this problem is perhaps the most innovative part of the ROCm stack: Heterogeneous-Compute Interface for Portability, also known as HIP.