    Exploring AMD’s Ambitious ROCm Initiative

    Other Tools

    The ROCm environment comes with many additional tools for system maintenance and application support. The ROCm-docker repository, for instance, contains a framework for building the ROCm software layers into portable Docker container images. If you work within a containerized environment, ROCm's Docker tools will let you integrate ROCm easily into your existing container infrastructure. (Note that a Docker container does not include the Linux kernel itself, so you'll need to be sure a ROCm-ready kernel is running on the host system.) The ROCm environment also comes with a collection of debugging tools, including a HIP debugger and ROCm-GDB, a version of the GDB debugger modified for the ROCm platform.
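
    Because a container shares the host's kernel, a launch script can verify the host kernel before starting a ROCm container. The following Python sketch illustrates the idea; the minimum version used here (MIN_KERNEL) is a placeholder for illustration, not an official ROCm requirement, so check the ROCm release notes for the kernels your release actually supports:

    ```python
    import platform
    import re

    # Placeholder minimum kernel version -- consult the ROCm release
    # notes for the versions actually supported by your ROCm release.
    MIN_KERNEL = (4, 13)

    def kernel_version(release=None):
        """Extract (major, minor) from a release string like '4.15.0-45-generic'."""
        release = release or platform.release()
        match = re.match(r"(\d+)\.(\d+)", release)
        if not match:
            raise ValueError(f"unrecognized kernel release: {release!r}")
        return int(match.group(1)), int(match.group(2))

    def rocm_ready(release=None):
        """Return True if the host kernel meets the placeholder minimum."""
        return kernel_version(release) >= MIN_KERNEL
    ```

    A container entry-point script could call rocm_ready() and refuse to start if the host kernel is too old, rather than failing later with an obscure driver error.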

    A system management interface (ROCm-SMI) supports a number of functions related to system clock and temperature settings. In GPU environments, clock speed is an important consideration, and AMD GPUs can operate at a variety of different clock levels to optimize speed and energy usage. As with all high-performance environments, clock speed affects energy use, which in turn affects the temperature of the system. ROCm-SMI has options for measuring temperature, controlling voltage, and managing the fan speed. You can integrate the commands of the system management interface into programs and scripts to build speed and temperature controls directly into the programming environment. See the ROCm documentation (Figure 4) for more information on ROCm management and development tools.
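
    As a minimal sketch of scripting around the system management interface, the Python snippet below runs rocm-smi and extracts per-GPU temperatures. The exact text rocm-smi prints varies between ROCm releases, so the parsing pattern here is an assumption tuned to output of the form "GPU[0] : Temperature: 34.0c"; adjust it to match your installed version:

    ```python
    import re
    import subprocess

    # Assumed line format, e.g. "GPU[0] : Temperature: 34.0c"; adapt the
    # pattern to whatever your rocm-smi release actually prints.
    TEMP_PATTERN = re.compile(r"GPU\[(\d+)\].*?([\d.]+)\s*c", re.IGNORECASE)

    def parse_temps(text):
        """Map GPU index -> temperature (Celsius) from rocm-smi output text."""
        return {int(gpu): float(temp) for gpu, temp in TEMP_PATTERN.findall(text)}

    def read_temps():
        """Invoke rocm-smi (must be installed on the host) and parse its output."""
        out = subprocess.run(["rocm-smi", "--showtemp"],
                             capture_output=True, text=True, check=True)
        return parse_temps(out.stdout)
    ```

    A monitoring script could poll read_temps() in a loop and throttle clock levels or raise the fan speed when a GPU runs hot, which is exactly the kind of integration the article describes.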

    Figure 4: See the ROCm documentation at rocm-documentation.readthedocs.io for more on ROCm tools, utilities, and supporting components.

    Conclusion

    AMD’s ROCm platform is a bold step toward portability and heterogeneous computing in the HPC space. With ROCm, AMD’s GPU product line gains benefits equivalent to those NVIDIA’s GPUs offer through the CUDA framework, but ROCm goes a step further by creating a complete language-independent and hardware-independent path for GPU-accelerated programming. A developer can write code once and then compile it for either the CUDA/NVIDIA or the ROCm/AMD environment.

    Porting math tools and machine learning frameworks like TensorFlow and Caffe to the ROCm platform ensures immediate relevance for some important subject areas in the HPC space. ROCm’s vision for a GPU-based, all-Free Software programming stack could affect the whole high-performance computing industry. The free and modular architecture means other vendors can easily integrate their own technologies into the ROCm stack, and the easy path for porting existing languages and frameworks to the vendor-neutral heterogeneous computing format will flatten the learning curve for programmers who want to stay within their preferred coding environment.
