Successful protocol analysis in modern network structures
Hunting the Invisible
Virtualization projects tend to focus on simplifying processes, which can mean that the validation and monitoring of critical parameters receive too little attention. Virtualization also has a drawback: Because much of the data is no longer visible, the analysis tools you used in the past no longer work as before. In this article, I show how to establish meaningful monitoring and analysis functions on the network, even when virtualization is involved.
A legacy protocol analysis tool (e.g., the open source utility Wireshark) is a standalone device or a piece of software on a PC that identifies problems, errors, and events relating to the network. These tools also help determine the reasons for poor network performance by visualizing protocol information and the corresponding network activities.
Measuring Methods for Networks
Unfortunately, network and data center virtualization creates blind spots in your server infrastructure, as well as invisible networks. Because a major part of the traffic is routed via cloud infrastructure (in the form of virtual tunnel endpoints), this traffic does not even touch the physical networks in many cases. This means that administrators lose visibility into their data and, consequently, control over communication flows. For this reason, the data on the computer systems and networks need to be made visible again – and various tools are available for doing this.
On switched networks, the data required for analysis is not transferred to every port. The switch only forwards broadcasts and packets with unknown destination addresses to all ports. If the switch has the MAC address of the destination in its MAC address table, the packets in question are only sent to the port to which the target device is connected.
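The forwarding behavior described above can be illustrated with a minimal sketch of a learning switch. The class and method names are hypothetical, and the model makes two assumptions that reflect common switch behavior but are not spelled out in the text: The switch learns addresses from the source MAC of incoming frames, and it does not send a flooded frame back out of the port on which it arrived.

```python
from typing import Dict, List


class LearningSwitch:
    """Toy model of a learning switch (hypothetical, for illustration only)."""

    BROADCAST = "ff:ff:ff:ff:ff:ff"

    def __init__(self, ports: List[int]) -> None:
        self.ports = list(ports)
        # MAC address table: MAC address -> port on which it was last seen
        self.mac_table: Dict[str, int] = {}

    def forward(self, in_port: int, src_mac: str, dst_mac: str) -> List[int]:
        """Return the list of ports on which the frame is sent out."""
        # Assumption: the switch learns the sender's location from the source MAC
        self.mac_table[src_mac] = in_port

        # Broadcasts and unknown destinations are flooded to all other ports
        if dst_mac == self.BROADCAST or dst_mac not in self.mac_table:
            return [p for p in self.ports if p != in_port]

        # Known destination: the frame goes to exactly one port
        out_port = self.mac_table[dst_mac]
        return [] if out_port == in_port else [out_port]


if __name__ == "__main__":
    switch = LearningSwitch(ports=[1, 2, 3])  # analyzer attached to port 3
    print(switch.forward(1, "aa:aa:aa:aa:aa:aa", "bb:bb:bb:bb:bb:bb"))
    print(switch.forward(2, "bb:bb:bb:bb:bb:bb", "aa:aa:aa:aa:aa:aa"))
    print(switch.forward(1, "aa:aa:aa:aa:aa:aa", "bb:bb:bb:bb:bb:bb"))
```

In this sketch, an analyzer on port 3 only receives the first frame, which is flooded because the destination is still unknown. Once both stations have been learned, their conversation bypasses port 3 entirely.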
This behavior calls for different troubleshooting strategies. Most switches therefore offer a port mirroring function, which copies the traffic of the link under investigation to another switch port to which the analysis device is connected. Some manufacturers even let you output the traffic from multiple switch ports to a single mirror port, also known as the SPAN (Switched Port Analyzer) port or maintenance port (Figure 1).
You should only forward data for analysis to a mirror port if that port can handle the data volume of the mirrored ports; otherwise, packets will be dropped. To stay on the safe side, the mirror port should have at least the same bandwidth as the source port. Bear in mind that a full-duplex source port carries traffic in both directions, so the combined Rx and Tx volume can exceed what a mirror port of the same speed can transmit. Mirroring also affects switch performance, because the switch has to duplicate every mirrored packet. The mirrored port can suffer performance hits, too, which means that troubleshooting in this way can cause more problems than it solves. Moreover, port mirroring distorts the results because the switch automatically drops defective packets, so they never reach the analyzer. Thus, in practice, SPAN ports are only rarely used as supplementary measuring points for ad hoc analysis.
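The capacity rule can be checked with simple arithmetic. The following sketch uses hypothetical function names and made-up load figures. It assumes that the mirror port has to transmit the Rx and the Tx traffic of every mirrored port, which is why a busy full-duplex port can overload a mirror port of the same nominal speed.

```python
from typing import Iterable, Tuple


def mirrored_load_mbps(source_ports: Iterable[Tuple[int, int]]) -> int:
    """Sum the Rx and Tx load (in Mbps) of all mirrored source ports."""
    return sum(rx + tx for rx, tx in source_ports)


def mirror_port_overloaded(
    source_ports: Iterable[Tuple[int, int]], mirror_capacity_mbps: int
) -> bool:
    """Return True if the mirrored traffic exceeds the mirror port's capacity."""
    return mirrored_load_mbps(source_ports) > mirror_capacity_mbps


if __name__ == "__main__":
    # One 1Gbps full-duplex port carrying 700Mbps Rx and 600Mbps Tx
    busy_port = [(700, 600)]
    print(mirror_port_overloaded(busy_port, 1000))  # 1300 > 1000

    # Three lightly loaded ports mirrored to the same 1Gbps port
    quiet_ports = [(100, 50), (200, 100), (150, 150)]
    print(mirror_port_overloaded(quiet_ports, 1000))  # 750 <= 1000
```

The figures are illustrative only; in practice you would take the load values from the switch's interface counters.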
To ensure precise acquisition of the measurement data, Test Access Points (TAPs, also known as link splitters) are now used for testing. These devices are connected directly to the network link to be monitored. TAPs operate completely passively; they do not generate any errors and keep working in case of a power outage. A TAP (a high-impedance connection to the line) duplicates all packets and splits a full-duplex link into two half-duplex data streams, one with the Rx and one with the Tx traffic. For this reason, the network analysis device needs two network interface cards. The analysis software then merges the two streams to create a single trace.
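How analysis software might merge the two half-duplex streams into a single trace can be sketched as follows. This is not the method of any particular product; the `Packet` type and the `merge_streams` function are hypothetical. The sketch assumes that each stream is already sorted by capture time and that both network interface cards timestamp packets against the same clock.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Packet(NamedTuple):
    """Hypothetical captured packet with a capture timestamp in seconds."""
    timestamp: float
    direction: str  # "rx" or "tx", depending on the capturing interface
    payload: bytes


def merge_streams(rx: Iterable[Packet], tx: Iterable[Packet]) -> Iterator[Packet]:
    """Merge two time-sorted streams into one trace ordered by timestamp."""
    # heapq.merge reads both inputs lazily, so long captures need not fit in memory
    return heapq.merge(rx, tx, key=lambda pkt: pkt.timestamp)


if __name__ == "__main__":
    rx_stream = [Packet(0.001, "rx", b"request"), Packet(0.004, "rx", b"ack")]
    tx_stream = [Packet(0.002, "tx", b"response")]
    for pkt in merge_streams(rx_stream, tx_stream):
        print(pkt.timestamp, pkt.direction, pkt.payload)
```

If the two capture interfaces do not share a clock, the merged order is only as reliable as the clock synchronization between them.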
TAPs are inserted inline into the network medium (e.g., copper or fiber). This ensures that all packets, including defective ones, reach the analysis or monitoring system. Complex filtering functions help improve the performance of these tools substantially by forwarding only the traffic flows that are genuinely relevant. This matters in particular on high-speed connections (e.g., 10Gbps, 40Gbps, or higher), where the cost of traffic analysis can otherwise skyrocket. If an error occurs on the TAP, a relay bridges the TAP in its idle state so that the connection being analyzed remains in place. As a result, either the main route sees no interruption at all, or any brief interruption is compensated for on the transport layer.
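The idea of forwarding only relevant traffic flows to the analyzer can be sketched as a simple filter. The `PacketInfo` fields and the `filter_relevant` function are hypothetical and do not reflect the filter syntax of any real TAP; the sketch assumes that relevance is defined by protocol and destination port.

```python
from typing import Iterable, Iterator, NamedTuple, Set


class PacketInfo(NamedTuple):
    """Hypothetical summary of a packet's header fields."""
    src_ip: str
    dst_ip: str
    protocol: str  # e.g., "tcp" or "udp"
    dst_port: int


def filter_relevant(
    packets: Iterable[PacketInfo], protocols: Set[str], dst_ports: Set[int]
) -> Iterator[PacketInfo]:
    """Yield only packets whose protocol and destination port are of interest."""
    for pkt in packets:
        if pkt.protocol in protocols and pkt.dst_port in dst_ports:
            yield pkt


if __name__ == "__main__":
    traffic = [
        PacketInfo("10.0.0.1", "10.0.0.9", "tcp", 443),
        PacketInfo("10.0.0.2", "10.0.0.9", "udp", 53),
        PacketInfo("10.0.0.3", "10.0.0.9", "tcp", 22),
    ]
    # Forward only HTTPS traffic to the analyzer
    for pkt in filter_relevant(traffic, {"tcp"}, {443}):
        print(pkt)
```

Discarding irrelevant packets before they reach the analyzer reduces the volume the analysis system has to capture and store.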