- Page: 3 » ADMIN Magazine

40%

08.07.2018

Home » HPC » Articles »

The pdsh parallel shell tool lets you run a command across multiple nodes in a cluster. ... to run a command across a number of nodes in a cluster. A parallel shell is a simple but powerful tool that allows you to do so on designated (or all) nodes in the cluster, so you do not have to log ...

37%

HPC fundamentals

16.08.2018

Home » Archive » 2018 » Issue 46: CMS S... »

to run a command across multiple nodes in a cluster. A parallel shell is a simple but powerful tool that allows you to do just that on designated (or all) nodes, so you do not have to log in to each node ... The pdsh parallel shell is a fundamental HPC tool that lets you run a command across multiple nodes in a cluster.

37%

Building a HPC cluster with Warewulf 4

04.04.2023

Home » Archive » 2023 » Issue 74: The F... »

A Warewulf-configured cluster head node with bootable, stateless compute nodes [1] is a first step in building a cluster. Although you can run jobs at this point, some additions need to be made ... Warewulf installed with a compute node is not really an HPC cluster; you need to ensure precise time keeping and add a resource manager.

37%

CPU affinity in OpenMP and MPI applications

03.02.2022

Home » Archive » 2022 » Issue 67: syst... »

socket: 32 Socket(s): 1 NUMA node(s): 1 Vendor ID: AuthenticAMD CPU family: 23 Model ... Get better performance from your nodes by binding processes and associating memory to specific cores.

36%

Unleashing Accelerated Speeds with RAM Drives

02.08.2021

Home » Archive » 2021 » Issue 64: Bare... »

that other compute nodes can take advantage of the high speed. In the following example, I rely on the NVMe over Fabrics concept and, more specifically, the NVMe target modules provided by the Linux kernel ... Enable and share performant block devices across a network of compute nodes with the RapidDisk kernel RAM drive module.

36%

Remora – Resource Monitoring for Users

08.12.2020

Home » HPC » Articles »

Remora provides per-node and per-job resource utilization data that can be used to understand how an application performs on the system through a combination of profiling and system monitoring. ... whether applications were running correctly or incorrectly, I thought it would be great to get a snapshot of what was happening on all of the nodes involved in the job. I like to think ...

36%

Processor and Memory Metrics

12.02.2014

Home » HPC » Articles »

One goal of HPC administration is effective monitoring of clusters. In this article, we talk about writing code that measures processor and memory metrics on each node. ... to measure the 15-minute node load average every few seconds? If the application takes 12 hours to run, do you need to measure the CPU load every two to three seconds? The main result of frequent monitoring ...

36%

Look for Bottlenecks with Open|SpeedShop

21.12.2011

Home » HPC » Articles »

Open|SpeedShop is an open source multiplatform Linux performance tool targeted at performance analysis of applications running on both a single-node and on large-scale platforms. ... Open|SpeedShop is an open source multiplatform Linux performance tool targeted at performance analysis of applications running on both a single node and on large-scale IA64, IA32, EM64T, AMD64, IBM

35%

Monitoring NFS Storage with nfsiostat

12.03.2013

Home » HPC » Articles »

Previously we talked about using iostat to monitor local storage on your server or compute nodes, but what if you use NFS in your compute nodes to run jobs? The nfsiostat tool can help you ... In my last article, Monitoring Storage Devices with iostat, I wrote about using iostat to monitor the local storage devices in servers or compute nodes. The iostat tool is part of the sysstat family ...

34%

Correctly integrating containers

09.10.2017

Home » Archive » 2017 » Issue 41: Kuber... »

communication, the Kubernetes [1] network model does not use Network Address Translation (NAT). All containers receive an IP address for communication with nodes and with each other, without the use of NAT ... If you run microservices in containers, they are forced to communicate with each other – and with the outside world. We explain how to network pods and nodes in Kubernetes.
