Lead Image © ktsdesign, 123RF.com

Getting the most from your cores

Strong Core

Article from ADMIN 35/2016

By Jeff Layton

CPU utilization metrics tell you how well your applications are using your processing resources.

In a general sense, high-performance computing (HPC) means getting the most out of your resources, which translates to utilizing the CPUs (cores) as much as possible. Consequently, CPU utilization becomes a very important metric for determining how well an application is using the cores. On today's systems, with multiple cores per socket and various cache levels that may or may not be shared across cores, determining CPU utilization might not be simple.

To explain this, a definition of CPU utilization is needed. As a starting point, I'll use the definition from Techopedia [1], which states:

CPU utilization refers to a computer's usage of processing resources, or the amount of work handled by a CPU. Actual CPU utilization varies depending on the amount and type of managed computing tasks. Certain tasks require heavy CPU time, while others require less because of non-CPU resource requirements.

The definition goes on to state:

CPU utilization should not be confused with CPU load.

This is a very important point in the quest for measuring CPU utilization of HPC applications – don't confuse CPU load and CPU utilization.

In the days of single-core CPUs, CPU utilization was fairly straightforward. If a processor was operating at a fixed frequency of 2.0GHz, CPU utilization was the percentage of time the processor spent doing work. (Time spent not doing work is idle time.) At 50% utilization, the processor performed about 1 billion cycles' worth of work in one second. Current processors have multiple cores, hardware multithreading, shared caches, and even dynamically changing frequencies. Moreover, the exact details of these components vary from processor to processor, making comparisons of CPU utilization difficult.
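The single-core arithmetic above can be sketched in a few lines of Python. This is only an illustration under the fixed-frequency assumption in the text. The function names `utilization` and `cycles_of_work` are hypothetical and come from no particular tool, and the busy and idle times are made-up inputs rather than measurements.

```python
def utilization(busy_seconds, idle_seconds):
    """Fraction of the interval the processor spent doing work."""
    total = busy_seconds + idle_seconds
    if total <= 0:
        raise ValueError("measurement interval must be positive")
    return busy_seconds / total


def cycles_of_work(frequency_hz, utilization_fraction, interval_seconds=1.0):
    """Cycles spent on work, assuming a fixed clock frequency."""
    return frequency_hz * utilization_fraction * interval_seconds


# Illustrative inputs: a 2.0GHz single-core CPU that is busy for half
# of a one-second interval.
u = utilization(busy_seconds=0.5, idle_seconds=0.5)
work = cycles_of_work(2.0e9, u)
print(f"{u:.0%} utilization -> {work:.1e} cycles")
# prints "50% utilization -> 1.0e+09 cycles"
```

The calculation only holds while the frequency is constant and there is a single core. Once the clock frequency changes dynamically or work is spread across cores and hardware threads, a single percentage no longer maps cleanly to a cycle count.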

Current processors can have shared L3 caches across all cores or shared L2 and L1 caches across cores. Sometimes

...
