Lead Image © artqu, 123RF.com

Scalable network infrastructure in Layer 3 with BGP

Growth Spurt

Article from ADMIN 40/2017
Large environments such as clouds pose demands on the network, some of which cannot be met with Layer 2 solutions. The Border Gateway Protocol jumps into the breach in Layer 3 and ensures seamlessly scalable networks.

Large-scale virtualization environments have ousted typical small setups. Whereas a company previously purchased a few physical servers to deploy an application, today, the entire workload of a new setup ends up on virtual machines running on a cloud service provider's platform.

A physical layout often is based on a tree structure (Figure 1), with the admin connecting all the servers to one or two central switches and adding more switches if the number of ports on a switch is not sufficient. Together, the switches and network adapters form a large physical segment in OSI Layer 2.

Figure 1: Classic network architectures that follow a tree structure are not suitable for scale-out platforms.

In this article, I describe how you can build an almost arbitrarily scalable network for your environments with Layer 3 tools. As long as two hosts have any kind of physical communication path, communication on Layer 3 works, even if the hosts in question reside in different Layer 2 segments. The Border Gateway Protocol (BGP) makes this possible by providing a way to let each server know how to reach other servers; "IP fabric" describes data center interconnectivity via IP connections.
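The idea that each server learns how to reach every other server can be illustrated with a toy path-vector exchange. This is only a sketch of the general principle, not of BGP itself: real BGP adds sessions, autonomous system numbers, policies, and a multi-step best-path selection. The node names, the prefixes, and the function `propagate_routes` are hypothetical and chosen purely for illustration.

```python
from collections import deque


def propagate_routes(links, prefixes):
    """Toy path-vector exchange (not real BGP).

    links    -- iterable of (node_a, node_b) pairs, each a bidirectional link
    prefixes -- dict mapping an originating node to the prefix it announces
    Returns a dict: node -> {prefix: [node, ..., originator]}.
    """
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    tables = {node: {} for node in neighbors}
    queue = deque()

    # Every originator starts out knowing only its own prefix.
    for node, prefix in prefixes.items():
        tables.setdefault(node, {})[prefix] = [node]
        queue.append((node, prefix))

    while queue:
        node, prefix = queue.popleft()
        path = tables[node][prefix]
        for peer in sorted(neighbors.get(node, ())):
            if peer in path:
                continue  # loop prevention: never accept a path containing yourself
            candidate = [peer] + path
            known = tables[peer].get(prefix)
            # Keep the shortest path; re-announce only when something improved.
            if known is None or len(candidate) < len(known):
                tables[peer][prefix] = candidate
                queue.append((peer, prefix))
    return tables


# Two servers in different Layer 2 segments, joined by three routers.
links = [
    ("server-a", "router-1"),
    ("router-1", "router-2"),
    ("router-2", "router-3"),
    ("router-3", "server-b"),
]
prefixes = {"server-a": "10.0.1.10/32", "server-b": "10.0.2.20/32"}
tables = propagate_routes(links, prefixes)
print(tables["server-a"]["10.0.2.20/32"])
```

After the exchange converges, `server-a` holds a complete path to the prefix announced by `server-b`, although the two never share a Layer 2 segment.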


New setups in virtual environments deliver much shorter time to market for the customer: Admins no longer need to order the hardware and suffer annoying waits for delivery, installation, and roll-outs. Many benefits also arise for cloud operators: Virtual environments such as public clouds are far more uniform than a variety of individual setups and can be managed more efficiently. Also, horizontal scaling is easier because these platforms can be expanded almost at will.

The changes also affect planning in the IT environment. Previously, IT designed a single setup, built it, and operated it until a new solution replaced the old one. In contrast, massively scalable environments are designed not only for the next five years, but well into the future.

Add the size factor: A cloud environment starts life as a basic setup and grows continuously as user demand increases. When planning a public cloud, the planners do not know the target size and must be suitably cautious. If a company makes a design error, the consequences that appear later in everyday business can be severe, forcing the company to put considerable effort into building workarounds that compensate for the flaw.

Conventional wisdom says the earlier a design flaw is identified while planning a platform, the cheaper it is to remedy. According to a speech by Barry Boehm at EQUITY 2007 [1], the cost of working around design bugs after the requirements have been specified increases non-linearly as the project moves through design (5x), coding (10x), development testing (20x), acceptance testing (50x), and production (>150x). If the design bug is identified and removed in the design phase, the costs are manageable; if the fault only becomes apparent when the platform is in production, the costs multiply (but see [2] for a dissenting opinion).
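To make the quoted multipliers tangible, the following minimal sketch applies them to an assumed baseline cost. The baseline of 1,000 currency units and the function name `rework_cost` are hypothetical; the production figure is quoted as ">150x", so the value computed for that phase is a lower bound.

```python
# Cost multipliers per project phase, as quoted in the text.
# "production" is given as ">150x", so 150 is only a lower bound.
PHASE_MULTIPLIER = {
    "design": 5,
    "coding": 10,
    "development testing": 20,
    "acceptance testing": 50,
    "production": 150,
}


def rework_cost(base_cost, phase):
    """Return the estimated cost of fixing a design bug found in `phase`.

    base_cost is the (assumed) cost of fixing it at requirements time.
    """
    return base_cost * PHASE_MULTIPLIER[phase]


for phase in PHASE_MULTIPLIER:
    print(f"{phase:20s} {rework_cost(1000, phase):>8d}")
```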


On the software side, admins can now access a toolkit to help build large environments. Clouds like OpenStack or container-based solutions such as Kubernetes are factory-built for scalability. In many cases, software also integrates off-the-shelf hardware that was not designed for horizontal scaling into a scale-out setup: Ceph, for example, turns ordinary servers into a scalable object store that can provide multiple petabytes of capacity.

Scaling, however, still has one major challenge: the network. Clouds like OpenStack make demands on both the logical network and the physical network on the hardware side that are virtually unsolvable with conventional network designs. Whereas software-defined networking (SDN) has long since asserted itself in several variants for logical networks, the physical level can be a tight squeeze for several reasons.

Conventional Tree Structure

Typical network layouts do not work in massively scalable environments. When an enterprise plans the network for a classic standalone setup, the maximum target size is known and usually limited to a certain number of servers. If more ports are required, the switch cascade continues on the underlying switch levels, which is where the disadvantages of the tree structure become apparent. On the one hand, the admin is confronted sooner or later with the Spanning Tree Protocol (STP), about which long-suffering networkers can tell many a tale; on the other hand, only a fraction of the performance that the main switch could provide actually reaches the final members of such a cascade.
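The bandwidth loss along a switch cascade can be estimated with a rough worst-case calculation. The sketch below assumes a simplified model that is not taken from the article: every cascaded switch has a single uplink of equal capacity, servers hang only off the lowest switch level, and all servers transmit toward the main switch at the same time. The function name and all figures are hypothetical.

```python
def worst_case_share_gbps(nic_gbps, uplink_gbps, servers_per_switch, fan_out, depth):
    """Worst-case bandwidth per server toward the main switch, in Gbps.

    nic_gbps           -- speed of each server's network adapter
    uplink_gbps        -- capacity of each switch-to-switch uplink
    servers_per_switch -- servers attached to each lowest-level switch
    fan_out            -- child switches attached to each cascaded switch
    depth              -- number of cascaded switch levels below the main switch
    """
    if depth < 1:
        raise ValueError("depth must be at least 1")
    # The uplink closest to the main switch carries all servers beneath it.
    servers_behind_top_uplink = servers_per_switch * fan_out ** (depth - 1)
    return min(nic_gbps, uplink_gbps / servers_behind_top_uplink)


# 40 servers with 10Gbps adapters per switch, 40Gbps uplinks, 4 child switches each.
print(worst_case_share_gbps(10, 40, 40, 4, depth=1))  # one cascade level
print(worst_case_share_gbps(10, 40, 40, 4, depth=2))  # two cascade levels
```

Under these assumptions, each additional cascade level divides the worst-case share by the fan-out, which is why the servers at the bottom of a deep tree see only a fraction of the main switch's capacity.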

In massively scalable environments, the central premise on which the tree approach described here is founded falls away – the target scale-out is completely unknown. A new customer might want to launch 600 virtual machines on the fly. Depending on the configuration, for the provider, this means they need to add dozens of servers to the racks virtually overnight because the customer will otherwise lease from Amazon, Microsoft, or Google.
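How many servers such a request translates into depends on the consolidation ratio. The following sketch shows the arithmetic under assumed values; the ratios of 20 and 25 virtual machines per server and the helper name `servers_needed` are hypothetical.

```python
def servers_needed(vm_count, vms_per_server):
    """Return the number of physical servers required to host vm_count VMs."""
    if vms_per_server < 1:
        raise ValueError("vms_per_server must be at least 1")
    # Ceiling division without floating point: a partly filled server still counts.
    return -(-vm_count // vms_per_server)


for ratio in (20, 25):
    print(f"600 VMs at {ratio} VMs per server: {servers_needed(600, ratio)} servers")
```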

In a classic setup, at least the total number of required ports is a known value on which planners can base their calculations. If dozens of servers suddenly find their way into the data center, however, the network infrastructure needs to grow at the same rate, which cannot be done with tree-like setups and switch cascades.

Admins face another problem: Three years after the original setup, it is by no means certain that you will still be able to buy the network hardware on which you initially relied. Even with devices by the same manufacturer, later models are not guaranteed to be compatible with their predecessors. Detailed tests are therefore needed in such scale-out cases, just as they are when admins are looking to replace legacy network hardware with newer, more powerful components.

If you need to install devices from other manufacturers for a later scale-out, you risk a total meltdown: Although all the relevant network protocols are standardized, if you have ever tried to combine devices from different manufacturers, you are well aware that the interesting thing about standards is that there are so many of them.
