Lead Image © Lucy Baldwin, 123RF.com

Finding Your Way Around a GPU-Accelerated Cloud Environment

Speed Racer

Article from ADMIN 63/2021

By Federico Lucifredi

We look at the tools needed to discover, configure, and monitor an accelerated cloud instance, employing the simplest possible tool to get the job done.

Raw compute horsepower has migrated from the central processing unit into dedicated chips over the last decade. Starting with specialized graphics processing units (GPUs), it has evolved into ever more specialized options for artificial intelligence use, such as the tensor processing unit (TPU). Some emerging applications even make use of user-programmed field-programmable gate arrays (FPGAs) to execute customized in-silicon logic. These enhanced computing capabilities require adopting domain-specific, data-parallel programming models, of which NVIDIA's CUDA [1] is the most widely used.

The rise of the cloud has made access to the latest hardware cost effective even for individual engineers, because coders can purchase time on accelerated cloud instances from Amazon Web Services (AWS), Microsoft Azure, Google, or Linode, to name but a few options. This month I look at the tools needed to discover, configure, and monitor an accelerated cloud instance in my trademark style, employing the simplest possible tool that will get the job done.

Knock, Knock. Who's There?

On logging in to an environment configured by someone else (or by yourself a few weeks prior), the first question you would pose is just what acceleration capabilities, if any, are available. This is quickly discovered with the command:

$ ec2metadata | grep instance-type
instance-type: p3.2xlarge
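
Where no ec2metadata tool is installed, the same value can usually be read from the instance metadata service with curl. What follows is a minimal sketch, not a definitive implementation: the helper names instance_type and instance_family are hypothetical, the two-second timeout and 60-second token lifetime are arbitrary choices, and the code assumes curl is available and that the instance exposes the metadata service at its standard link-local address.

```shell
#!/bin/sh
# Link-local address of the EC2 instance metadata service.
IMDS=http://169.254.169.254/latest

# Hypothetical helper: print the instance type, or "unknown" when the
# metadata service cannot be reached (for example, when not on EC2).
instance_type() {
    # IMDSv2 expects a short-lived session token obtained with a PUT request.
    token=$(curl -sf -m 2 -X PUT "$IMDS/api/token" \
        -H "X-aws-ec2-metadata-token-ttl-seconds: 60" 2>/dev/null)
    if [ -n "$token" ]; then
        itype=$(curl -sf -m 2 -H "X-aws-ec2-metadata-token: $token" \
            "$IMDS/meta-data/instance-type" 2>/dev/null)
    else
        # No token: try an unauthenticated (IMDSv1-style) request instead.
        itype=$(curl -sf -m 2 "$IMDS/meta-data/instance-type" 2>/dev/null)
    fi
    echo "${itype:-unknown}"
}

# Hypothetical helper: strip the size suffix, leaving the instance family
# (the part before the first dot).
instance_family() {
    echo "${1%%.*}"
}
```

Calling instance_type prints the type string reported by the service, and instance_family p3.2xlarge prints p3, which is handy when a script only needs to know which family of accelerated instance it is running on.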

Variations of the ec2metadata tool query the AWS metadata service, helping you identify the instance's type. Alon Swartz's original ec2metadata [2] is found in Ubuntu releases like Bionic (18.04), on which the Deep Learning Amazon Machine Image (DLAMI) is currently based [3]. It has been replaced since by

...
