Lead Image © Author, 123RF.com

Small-board computers

Think Small

Article from ADMIN 25/2015
Single-board computers, such as the Raspberry Pi, are very low cost and low power, yet are complete systems suitable for personal and educational projects. But are they HPC-worthy?

The Raspberry Pi [1], a simple, small, but complete system for about $35, has caught the attention of the world. Some people think it's just a cute system not suitable for serious applications, whereas others think the Raspberry Pi, or at least the same type of system, could be the next wave of HPC.

The Raspberry Pi was designed to excite the imagination of children in the field of computer science and electronics (see the "Rasp Pi Specs" box). The credit-card sized single-board computer (SBC) [2] has the basic components of any server, typically with everything on a single circuit board and few to no expansion slots built-in. Today's SBCs typically (but not always) come with the CPU, as well as the memory and other additions, soldered onto the board.

Rasp Pi Specs

The Raspberry Pi B+ has the following basic characteristics:

  • Broadcom BCM2835 system on a chip (SoC)
  • Single-core ARM1176JZF-S processor (32-bit processor) at 700MHz
  • Broadcom VideoCore IV GPU
  • 512MB of memory
  • Fast Ethernet (100Mbps)
  • MicroSD slot for local storage
  • Two USB ports
  • HDMI port
  • General-purpose I/O pins (GPIO)

This little SBC only consumes about 3W under load, but the performance isn't anything to write home about. A quick run of Linpack [3] shows that a Raspberry Pi Model B achieved about 0.065 giga-floating point operations per second (GFLOPS) for single precision and 0.041GFLOPS for double precision. This level of performance won't get a cluster of Raspberry Pis on the TOP500 any time soon, but that's not the real point of the Rasp Pi. The point is to get fully featured systems into the hands of people to help teach them computer science.
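To put those Linpack numbers in perspective, the following minimal sketch works out performance per watt and how many boards you would need to reach 1GFLOPS. It assumes that the roughly 3W figure applies to the board that produced the Linpack results and that performance scales perfectly with the number of boards, which a real cluster never achieves. The function names are my own, chosen for illustration.

```python
import math

# Figures quoted in the text for the Raspberry Pi
SP_GFLOPS = 0.065  # single-precision Linpack result
DP_GFLOPS = 0.041  # double-precision Linpack result
WATTS = 3.0        # approximate power draw under load (assumed to apply here)


def gflops_per_watt(gflops, watts):
    """Energy efficiency as GFLOPS delivered per watt consumed."""
    return gflops / watts


def nodes_for_target(target_gflops, per_node_gflops):
    """Boards needed to reach a target, assuming ideal (perfect) scaling."""
    return math.ceil(target_gflops / per_node_gflops)


print(f"Single precision: {gflops_per_watt(SP_GFLOPS, WATTS):.4f} GFLOPS/W")
print(f"Double precision: {gflops_per_watt(DP_GFLOPS, WATTS):.4f} GFLOPS/W")
print("Boards for 1 GFLOPS (double precision, ideal scaling):",
      nodes_for_target(1.0, DP_GFLOPS))
```

Under these assumptions, it takes 25 boards to reach a single double-precision GFLOPS, which is why the Rasp Pi is better viewed as a teaching tool than a compute engine.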

Although SBCs have been around awhile, it was the Raspberry Pi that really got people excited, and many new SBCs have come out since then. This renewed interest created a market that was quickly dominated by ARM processors [4]. It has also helped push Intel to develop lower power versions of x86 processors. Intel now has the single-core Quark processor [5], which has been incorporated into the Intel Galileo board [6]. Under load it uses about 15W, which is still pretty low. The price is a bit more than the Rasp Pi, costing about $60-$65 (EUR67-70), but it is pretty close.

Intel has also been developing the Intel Atom processor for a range of systems, including SoCs (systems on a chip) and a family of small form factor systems called Intel NUC (Next Unit of Computing) [7]. Many of these processors use less than 10W and come in dual-core and quad-core versions. Intel even has an eight-core server SoC called Avoton [8] that uses only 20W under load. These systems are more expensive than a Raspberry Pi, but they run faster and use just a little more power.

Raspberry Pis have been used in all sorts of projects from simple web servers [9], to robotics [10], to underwater ROVs [11], and yes, even clusters [12]. The appeal of the Rasp Pi is that it is cheap, uses almost no power, is easy to program, and is really small.

Wonderful World of SBCs

A plethora of SBCs cover a wide range of systems and price ranges. A couple of good articles give a summary of SBCs running Linux [13] and compare various systems [14].

For the sake of completeness, I've listed a range of SBC systems in Table 1 that might be of interest. The majority are 32-bit systems, particularly those that are ARM based, but some are x86 compatible. Both AMD and Intel have 64-bit SBCs.

Table 1

Single-Board Computer Specifications

Name | OS | Processor | Cores | GPU | Memory | Ports | Power | Price | URL
A10-OLinuXino-Lime | Linux | Allwinner A10 | Single ARM Cortex-A8 @1GHz | Mali-400 | 512MB DDR3 | SATA connector, 2 USB, Fast Ethernet, USB OTG, HDMI | 1.9W | $44/EUR30 | https://www.olimex.com/wiki/A10-OLinuXino-LIME
A20-OLinuXino-Micro | Linux | Allwinner A20 | Dual ARM Cortex-A7 @1GHz | Mali-400 | 1GB DDR3 | SATA connector, USB, USB OTG, Fast Ethernet, HDMI, VGA | 3W | $67/EUR55 | https://www.olimex.com/wiki/A20-OLinuXino-MICRO
Arndale Octa Board | Android 4.3 Jelly Bean | Samsung Exynos 5420 Octa | Quad ARM Cortex-A15 (32KB instruction/32KB data/2MB L2) @1.8GHz, Quad ARM Cortex-A7 (32KB/32KB/512KB) @1.3GHz | Mali T-628 MP6 | 3GB LPDDR3e RAM (14.9GBps memory bandwidth) | Fast Ethernet, USB 2.0, USB 3.0, HDMI | 3-4W | $199 | http://www.arndaleboard.org/wiki/
Creator CI20 | Android 4.4 KitKat, Linux | Ingenic JZ4780 | Dual XBurst MIPS32 @1.2GHz (32KB/32KB/512KB) | PowerVR SGX540 | 1GB DDR3, 4GB flash | Fast Ethernet, 2 USB, USB OTG, HDMI | 4W | $65/EUR50 | http://www.elinux.org/MIPS_Creator_CI20
Cubieboard2 | Android 4.2, Cubieez Linux | Allwinner A20 | Dual ARM Cortex-A7 @1GHz (512KB L2) | Mali-400 | 1GB DDR3 @480MHz, 4GB NAND flash | Fast Ethernet, 2 USB, 1 SATA, HDMI, IR | 5-6W | $59 | http://cubieboard.org/2013/06/19/cubieboard2-is-here/
CuBox-i4Pro | Android 4.3/4.4, Linux | Freescale i.MX6 ARM | Quad ARM Cortex-A9 @1GHz | Vivante GC2000 | 2GB DDR3 | GigE Ethernet, eSATA II, 2 USB, MicroUSB, HDMI, IR | 3W | $139.99 | http://www.solid-run.com/products/
Nvidia Jetson TK1 | Linux 3.10.40 | Nvidia Tegra K1 | 4-Plus-1 Quad ARM Cortex-A15 @2.3GHz | 192-core Nvidia Kepler GK20A @950MHz (128KB L2) for 365GFLOPS with FP16 and FP32 | 2GB DDR3L (930MHz memory clock, 14.9GBps bandwidth), 16GB NAND flash (eMMC) | SATA half-mini-PCIe, USB 2.0, USB 3.0, GigE Ethernet, HDMI, RS-232, GigE LAN | 7-10W | $192 | https://developer.nvidia.com/jetson-tk1
Odroid-XU3 | Android 4.4, Linux | Samsung Exynos5422 | Quad ARM Cortex-A15 @2.0GHz (32KB/32KB/2MB), Quad ARM Cortex-A7 @1.4GHz (32KB/32KB/512KB) | Mali-T628 MP6 | 2GB LPDDR3 RAM (14.9GBps bandwidth) | eMMC5.0 HS400 flash, Fast Ethernet (optional USB3.0 to GigE adapter), 4 USB 2.0, USB 3.0, USB 3.0 OTG, micro-HDMI, DisplayPort, MicroSD | 10-20W | $179.00/EUR119 | http://www.hardkernel.com/
Gizmo 2 | Linux, Windows Embedded 8 | AMD G-series GX210HA | Dual x86 @1GHz (1MB shared L2) for 85GFLOPS | AMD Radeon HD 8210E discrete-class graphics (300MHz) | 1GB DDR3 | GigE Ethernet, 2 USB 2.0, 2 USB 3.0, HDMI | 9W | $199/EUR160 | http://www.gizmosphere.org/products/gizmo-2/
Intel Galileo | Yocto Linux, VxWorks (RTOS), Windows | Intel Quark X1000 | Single 32-bit Intel Pentium (x86) @400MHz | Integrated Intel GPU | 256MB DDR3, 512KB embedded SRAM, 8MB NOR flash | Fast Ethernet, mPCIe, USB 2.0, MicroUSB 2.0, MicroSD, other ports provided by add-on shields | 2.5-4W | $60/EUR57 | https://www.sparkfun.com/products/12720
MinnowBoard Max | Linux, Windows 8.1 | Intel E3825 | Dual x86 Atom, 64-bit @1.33GHz (1MB L2) | Intel Graphics @533MHz | 2GB DDR3L | GigE Ethernet, USB 2.0, USB 3.0, SATA2, MicroSD | 6W+ | $145/EUR149 | http://www.minnowboard.org/meet-minnowboard-max/
ODROID-C1 | Android 4.4 KitKat, Linux | Amlogic S805 | Quad ARM Cortex-A5 @1.5GHz | Mali-450 MP2 @600MHz | 1GB DDR3 | GigE Ethernet, 4 USB 2.0, USB OTG, micro-HDMI, IR | 10W | $35/EUR44 | http://www.hardkernel.com/
Parallella | Linux | Xilinx Zynq-7020 or -7010 | Dual ARM Cortex-A9 @667MHz plus FPGA, 16-core Epiphany RISC coprocessor (32-bit) | (none listed) | 1GB DDR3 | GigE Ethernet, USB 2.0, micro-HDMI | 1.9W + 2W | $126/EUR119 | https://www.parallella.org/board/
pcDuino3 Nano | Android 4.2, Linux | Allwinner A20 | Dual ARM Cortex-A7 @1GHz | Mali-400 | 1GB DRAM, 4GB flash | GigE Ethernet, SATA, 2 USB 2.0, USB OTG, HDMI, IR | 10W | $40 | http://store.linksprite.com/pcduino3-nano/
Udoo Quad | Android, Linux | Freescale i.MX6Quad | Quad ARM Cortex-A9 @1GHz | Vivante GC 2000 + Vivante GC 355 + Vivante GC 320 | 1GB DDR3 | GigE Ethernet, SATA, 2 USB 2.0, MicroUSB serial, USB OTG, HDMI | 3.7W idle | $135/EUR99 | http://shop.udoo.org/usa/product/udoo-quad.html

Processors now range from single, to dual, to quad, and even to octa-core processors, which combine two different quad-core clusters in the big.LITTLE architecture [15]. All of the SBCs have GPUs for graphics, but Nvidia's Jetson TK1 system has 192 CUDA cores, so you can run HPC applications on the GPU as well as on the CPUs. The Parallella board currently has 16 cores connected by an on-board mesh topology that can be used to run parallel applications.

Ethernet is found throughout, including some Gigabit Ethernet, and all SBCs boot from some sort of flash storage (i.e., the root disk of the system), such as an eMMC card or an SD card, and have USB ports of some type, so you can plug an input device into the system or add external storage. Some SBCs have ports for SATA devices, and some have a mini-PCIe (mPCIe) slot, so you can add Gigabit Ethernet cards or even small SSDs.

The amount of memory varies widely from a low of about 256MB per core up to 1GB per core. The low memory capacity is attributable to the 32-bit processors, which have limited total addressable memory. Some of the systems also have a pretty good memory bandwidth of about 14.9GBps.
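The ceiling imposed by a 32-bit processor is easy to work out. This short sketch assumes a flat, byte-addressed space with no address extensions; the function name is mine, for illustration only.

```python
def addressable_gib(address_bits):
    """Size of a flat, byte-addressed memory space in GiB."""
    return 2 ** address_bits / 2 ** 30


# A 32-bit address reaches 2^32 bytes
print(addressable_gib(32), "GiB")
```

Under that assumption, a 32-bit system tops out at 4GiB in total, shared among all the cores on the board.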

Two factors common across all of these SBCs are important: the low cost ($35-$199 per SBC in this list) and the low power usage, with even the hungriest board drawing no more than about 20W.

Low Cost

Although price isn't always the most important objective in designing and building clusters, it's not to be ignored, especially when you are building your own system for personal use, or for education or research. When you factor in having to learn how to build and use clusters, buying conventional new or older used hardware often isn't an option. Consequently, inexpensive SBC-class hardware could be your best option. Moreover, a large number of inexpensive systems might turn out to be faster than a single more expensive system.

An SBC setup starting with one system, a simple network switch, and a boot flash card is fairly inexpensive (assuming you have a monitor and keyboard/mouse). For example, you can get a quad-core 32-bit ARM system with all of this for around $90 ($35 for an ODROID-C1, $20 for a simple GigE switch, and a few more dollars for an SD card, powered USB hub, power supply, etc.). Such a setup would allow you to start learning about parallel applications using threads (OpenMP) and MPI programming. After writing some new code or porting existing code, you could then over time inexpensively add more SBCs and continue working on the code.
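To give a flavor of the kind of parallel code you might start with on a quad-core board, here is a minimal sketch that splits a numerical integration for pi across four workers. It uses Python's standard-library thread pool as a stand-in for OpenMP or MPI, which is my choice for a self-contained example rather than anything specific to these boards. In standard CPython, threads won't actually speed up this loop, so treat it as an illustration of how work is divided and the partial results combined. The function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_pi(worker, n_workers, n_steps):
    """Midpoint-rule slice of the integral of 4/(1+x^2) over [0, 1]."""
    h = 1.0 / n_steps
    total = 0.0
    # Worker k handles steps k, k+n_workers, k+2*n_workers, ... (cyclic split)
    for i in range(worker, n_steps, n_workers):
        x = (i + 0.5) * h
        total += 4.0 / (1.0 + x * x)
    return total * h


def estimate_pi(n_workers=4, n_steps=100_000):
    """Farm the slices out to a pool and add up the partial sums."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        parts = pool.map(partial_pi,
                         range(n_workers),
                         [n_workers] * n_workers,
                         [n_steps] * n_workers)
        return sum(parts)


if __name__ == "__main__":
    print(f"pi is approximately {estimate_pi():.8f}")
```

The same pattern (each worker computes a partial sum over its share of the steps, then the partial sums are reduced to one value) carries over directly to an OpenMP reduction or an MPI reduce across several SBCs.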

Although SBCs might offer low performance at low cost with limited memory, which can limit the size of problems that can be addressed, an SBC cluster can minimize the money spent on hardware until a proof of concept or theory is proven or at least demonstrated.

Low Power

Power usage also can be a roadblock in building a cluster. When processors use 80W+ and you need to add memory and possibly a network card, you start worrying about power consumption. Most of these SBCs use only 10W or less when under full load. I think I have Christmas tree ornaments that use more power than that. If I had 10 of these SBCs running under full load, the cluster would only use about 100W of total power, the equivalent of one "low-power" system.

Lower power consumption can also be important for various scenarios when using a cluster. For example, you might not have a great deal of extra power at your disposal for running higher powered systems, with no capacity to add more circuits. School environments, for example, typically don't have extra circuits for building clusters, especially if the building was constructed pre-1980s.

Power issues are also magnified in other areas of the world. Not everyone has access to stable, inexpensive power. Being able to build a cluster with a total power draw that is less than 50W is pretty advantageous. You can get a 50W solar panel for around $100. If you have enough sun, you could run a four-node cluster (you might need a bit more with an older CRT monitor). The total price for four nodes with quad cores, a GigE switch, the solar panel, cables, and various other bits needed, would come to about $450.
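The solar sizing is simple arithmetic, sketched below. It assumes the panel actually delivers its rated 50W, that each node draws 10W under load, and that the network switch takes about 5W; the switch figure is my own guess for illustration, not a measured value.

```python
def max_nodes(panel_watts, node_watts, overhead_watts=0.0):
    """Largest node count whose total draw fits within the panel's rated output."""
    usable = panel_watts - overhead_watts
    if usable <= 0:
        return 0
    return int(usable // node_watts)


PANEL_W = 50   # rated panel output from the text
NODE_W = 10    # per-node draw under full load
SWITCH_W = 5   # hypothetical draw of a small GigE switch

nodes = max_nodes(PANEL_W, NODE_W, SWITCH_W)
print(f"{nodes} nodes fit, using {nodes * NODE_W + SWITCH_W}W of {PANEL_W}W")
```

With these numbers, four nodes plus the switch use 45W, leaving very little headroom for a monitor or a cloudy day.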
