Exploring AMD’s Ambitious ROCm Initiative


AMD calls HIP “… a C++ runtime API and kernel language that allows developers to create portable applications that can run on AMD’s and other GPUs.” Although the HIP dialect of C++ is not the same as either CUDA or AMD’s native HC, the HIP format was carefully created to allow easy conversion to either format with little or no intervention. As shown in Figure 2, code written once for HIP can pass to either the NVIDIA or the AMD development stack, and the resulting program delivers performance similar to what you would get from coding directly for the native APIs. HIP delivers architecture-specific optimizations through conditional compilation for either platform.


Figure 2: ROCm’s HIP format lets the developer write the code once and compile it for different hardware environments.


HIP lets the programmer code once and compile for either the NVIDIA or AMD GPU environment. And, because HIP (and the rest of the ROCm stack) is all open source, another vendor could theoretically add their own stack to the environment shown in Figure 2.
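The write-once idea and the conditional compilation mentioned above can be sketched in a few lines. The following is a minimal illustration, not AMD sample code: the names vector_add, add_kernel, check, and kBlockSize are hypothetical, and the block sizes are illustrative assumptions rather than tuned values. The platform macros __HIP_PLATFORM_AMD__ (with the older spelling __HIP_PLATFORM_HCC__) come from the HIP headers, and the sketch assumes the NVIDIA path is whatever remains when neither is defined. So that the example can be tried without a GPU, it falls back to a plain host loop when built with an ordinary C++ compiler instead of hipcc.

```cpp
// Portable vector addition: a sketch under stated assumptions, not AMD sample code.
// Build with hipcc to target a GPU; an ordinary C++ compiler takes the host fallback.
#include <cassert>
#include <cstddef>
#include <vector>

#if defined(__HIPCC__)
#include <hip/hip_runtime.h>
#include <cstdio>
#include <cstdlib>

// Architecture-specific tuning selected by conditional compilation.
// The block sizes are illustrative assumptions, not measured optima.
#if defined(__HIP_PLATFORM_AMD__) || defined(__HIP_PLATFORM_HCC__)
constexpr unsigned kBlockSize = 256;  // multiple of the 64-wide wavefront on GCN-class AMD GPUs
#else
constexpr unsigned kBlockSize = 128;  // multiple of the 32-wide NVIDIA warp
#endif

// Abort with a readable message if a HIP runtime call fails.
static void check(hipError_t err) {
    if (err != hipSuccess) {
        std::fprintf(stderr, "HIP error: %s\n", hipGetErrorString(err));
        std::abort();
    }
}

// Device kernel: each thread adds one pair of elements.
__global__ void add_kernel(const float* a, const float* b, float* c, std::size_t n) {
    std::size_t i = static_cast<std::size_t>(blockIdx.x) * blockDim.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}
#endif

// Returns the element-wise sum of a and b (hypothetical helper for this sketch).
std::vector<float> vector_add(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    const std::size_t n = a.size();
    std::vector<float> c(n);
    if (n == 0) return c;
#if defined(__HIPCC__)
    const std::size_t bytes = n * sizeof(float);
    float *da = nullptr, *db = nullptr, *dc = nullptr;
    check(hipMalloc(reinterpret_cast<void**>(&da), bytes));
    check(hipMalloc(reinterpret_cast<void**>(&db), bytes));
    check(hipMalloc(reinterpret_cast<void**>(&dc), bytes));
    check(hipMemcpy(da, a.data(), bytes, hipMemcpyHostToDevice));
    check(hipMemcpy(db, b.data(), bytes, hipMemcpyHostToDevice));
    const unsigned blocks = static_cast<unsigned>((n + kBlockSize - 1) / kBlockSize);
    hipLaunchKernelGGL(add_kernel, dim3(blocks), dim3(kBlockSize), 0, 0, da, db, dc, n);
    check(hipGetLastError());
    // The blocking copy back to the host also waits for the kernel to finish.
    check(hipMemcpy(c.data(), dc, bytes, hipMemcpyDeviceToHost));
    check(hipFree(da));
    check(hipFree(db));
    check(hipFree(dc));
#else
    // Host fallback so the sketch runs without a GPU toolchain.
    for (std::size_t i = 0; i < n; ++i) c[i] = a[i] + b[i];
#endif
    return c;
}
```

Calling vector_add({1, 2, 3}, {10, 20, 30}) yields a three-element vector holding the pairwise sums. Only the block-size constant differs between platforms here; real code would typically guard larger, hardware-specific sections the same way.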

The HIP format is ideal for new code that you only have to write once, but what about all that existing CUDA code? That’s the other ingenious thing about HIP. The inherent compatibility that makes it possible to easily transition HIP to CUDA (as shown in Figure 2) also makes it easy to transform CUDA code to HIP. AMD provides a tool that automatically converts CUDA code into HIP format (Figure 3). This “hipify” script will convert up to 99% of the code to HIP format automatically.


Figure 3: The hipify script converts proprietary CUDA code to the vendor-neutral HIP format. The HIP code can then compile for either the NVIDIA or AMD GPU environment.
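To give a feel for why most of the conversion can be automated, the sketch below performs the kind of name-for-name substitution that a CUDA-to-HIP translation relies on. It is a toy illustration only and is not the hipify tool or its implementation: toy_hipify and replace_all are hypothetical helpers, and the table covers just a handful of well-known runtime calls whose HIP names mirror their CUDA counterparts. It also makes no attempt to respect identifier boundaries, comments, or string literals.

```cpp
// Toy CUDA-to-HIP renamer: an illustration only, NOT the real hipify tool.
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Replace every occurrence of `from` (must be non-empty) in `text` with `to`.
std::string replace_all(std::string text, const std::string& from, const std::string& to) {
    assert(!from.empty());
    for (std::size_t pos = text.find(from); pos != std::string::npos;
         pos = text.find(from, pos + to.size())) {
        text.replace(pos, from.size(), to);
    }
    return text;
}

// Apply a small, hand-picked table of CUDA-to-HIP renames to one piece of source text.
std::string toy_hipify(std::string source) {
    // Longer names come first so that enum values are handled before
    // the shorter function names they contain.
    static const std::vector<std::pair<std::string, std::string>> renames = {
        {"cudaMemcpyHostToDevice", "hipMemcpyHostToDevice"},
        {"cudaMemcpyDeviceToHost", "hipMemcpyDeviceToHost"},
        {"cudaDeviceSynchronize", "hipDeviceSynchronize"},
        {"cuda_runtime.h", "hip/hip_runtime.h"},
        {"cudaMalloc", "hipMalloc"},
        {"cudaMemcpy", "hipMemcpy"},
        {"cudaFree", "hipFree"},
    };
    for (const auto& rename : renames) {
        source = replace_all(std::move(source), rename.first, rename.second);
    }
    return source;
}
```

For instance, toy_hipify("cudaMalloc(&p, n);") returns "hipMalloc(&p, n);". The small share of code that still needs manual attention after an automated pass is the part that has no such one-to-one mapping.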


As a proof of concept, the ROCm team ported the whole Caffe machine-learning framework (with around 55,000 lines of code) from CUDA to HIP: 99.6% of the code went unmodified or was automatically converted, and the remaining 0.4% took less than a week of developer time to tie up loose ends. And, now that HIP is officially recognized as a target for the Clang compiler, users can generate HIP code conveniently using tools of the standard C++ programming environment.

Support for Other Languages

The versatile LLVM infrastructure means that HCC can work with a wide range of programmer preferences and expertise within the C/C++ language family, from standard C, to standard C++, to STL parallel extensions, to the turbo-charged GPU-based features embodied in C++ AMP and HC. HIP and the hipify conversion tool bring CUDA into the mix. Beyond C and C++, the ROCm platform also supports Anaconda, a distribution of Python tailored for scientific computing and large-scale data processing.

ROCm also provides native support for the OpenCL framework. OpenCL (Open Computing Language) is an open standard maintained by the non-profit Khronos Group that was originally envisioned as a heterogeneous framework for supporting CPU-based and GPU-based computing in parallel programming environments. In other words, OpenCL has some goals that are very similar to those of ROCm. AMD is a member of the Khronos Group and has invested heavily over the years in OpenCL as a framework for GPU-accelerated programming.

When they started to envision the ROCm platform, AMD’s engineers knew they would need something more than OpenCL, which was originally intended for C programming and does not have the broad scope and versatility needed for ROCm. However, instead of replacing OpenCL, the developers integrated it into the ROCm infrastructure. OpenCL is fully supported as part of the ROCm platform. If you are accustomed to programming in OpenCL, you can continue to use it to access the resources available through the ROCm runtime. AMD doesn’t just support OpenCL in ROCm – they are actually working on improving it.

The OpenMP parallel programming API supports offloading to Radeon GPUs through Clang, so developers can access the advanced capabilities of Radeon GPUs from within OpenMP.
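As a rough illustration of what OpenMP offloading looks like in source code, the sketch below marks a loop with the standard OpenMP target directive. The function name saxpy and its parameters are hypothetical. The directive and map clauses are standard OpenMP; whether the loop actually runs on a Radeon GPU depends on building with a Clang that has offload support for the GPU in question, and the exact compiler flags vary by Clang and ROCm version. Without OpenMP enabled, or without an available device, the loop simply runs on the host.

```cpp
// OpenMP target offload sketch; runs on the host if no offload device is available.
#include <cassert>
#include <cstddef>
#include <vector>

// Computes alpha * x[i] + y[i] for every element (hypothetical helper for this sketch).
std::vector<float> saxpy(float alpha, const std::vector<float>& x, const std::vector<float>& y) {
    assert(x.size() == y.size());
    const std::size_t n = x.size();
    std::vector<float> out(n);
    if (n == 0) return out;

    // Raw pointers are used because map clauses describe array sections.
    const float* px = x.data();
    const float* py = y.data();
    float* pout = out.data();

    // Copy the inputs to the device, run the loop there, and copy the result back.
    #pragma omp target teams distribute parallel for map(to: px[0:n], py[0:n]) map(from: pout[0:n])
    for (std::size_t i = 0; i < n; ++i) {
        pout[i] = alpha * px[i] + py[i];
    }
    return out;
}
```

A call such as saxpy(2.0f, {1, 2, 3}, {10, 20, 30}) returns the scaled sums for each position. The same source serves both the CPU and the GPU build, so no separate kernel code is needed.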