Tracking down problems with Jaeger


Understanding Open Telemetry

If Open Telemetry is to be used, the developers of an application need to take this into account as early as the programming phase. Open Telemetry offers bindings for Go, for example, and Go is used in a large number of applications in today's cloud-ready universe. The first step on the way to effective tracing is to import the Open Telemetry bindings into the components of your own application.

Now is a good time to take a closer look at the terms used by Open Telemetry. The documentation for the standard uses them so heavily that sooner or later you can no longer see the forest for the trees among all the spans and traces. However, the topic is not as complex as the documentation makes it seem. The two basic terms, "spans" and "traces," are enough to illustrate the fundamentals of Open Telemetry.

Spans and Traces

In Open Telemetry-speak, a span is a set of data produced by running an arbitrary operation. If an application has been instrumented with an Open Telemetry client library, each instrumented function generates a span every time it is called. The span carries a unique ID, a name (typically the name of the function that was called), the exact time of the call, and the time taken to complete the task. Spans can be connected or nested within each other. Thus, if function A calls function B, two spans are created and logically connected by nesting, with the span for B as a child of the span for A.
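The nesting of spans can be illustrated with a small, self-contained Go sketch. It deliberately does not use the Open Telemetry bindings; the Span type, its fields, and the helper functions startSpan, functionA, and functionB are hypothetical names chosen for illustration, and a real client library records considerably more detail.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"time"
)

// Span is a simplified, hypothetical model of the data a span carries.
// It is NOT the type provided by the Open Telemetry Go bindings.
type Span struct {
	TraceID  string        // shared by all spans of one trace
	SpanID   string        // unique per span
	ParentID string        // empty for a root span
	Name     string        // typically the operation or function name
	Start    time.Time     // when the operation began
	Duration time.Duration // how long the operation took
}

// newID returns n random bytes encoded as a hex string.
func newID(n int) string {
	b := make([]byte, n)
	if _, err := rand.Read(b); err != nil {
		panic(err)
	}
	return hex.EncodeToString(b)
}

// startSpan creates a span. A child inherits the trace ID of its parent
// and remembers the parent's span ID; a root span opens a new trace.
func startSpan(name string, parent *Span) *Span {
	s := &Span{SpanID: newID(8), Name: name, Start: time.Now()}
	if parent != nil {
		s.TraceID = parent.TraceID
		s.ParentID = parent.SpanID
	} else {
		s.TraceID = newID(16)
	}
	return s
}

// end records the time taken since the span was started.
func (s *Span) end() { s.Duration = time.Since(s.Start) }

// functionB does some (simulated) work inside a child span.
func functionB(parent *Span) *Span {
	s := startSpan("functionB", parent)
	defer s.end()
	time.Sleep(time.Millisecond)
	return s
}

// functionA opens a root span and calls functionB, which nests inside it.
func functionA() (*Span, *Span) {
	a := startSpan("functionA", nil)
	defer a.end()
	b := functionB(a)
	return a, b
}

func main() {
	a, b := functionA()
	// B's span points at A's span and belongs to the same trace.
	fmt.Println(b.ParentID == a.SpanID, b.TraceID == a.TraceID) // prints "true true"
}
```

In this sketch, the parent span ID is what expresses the nesting: A's span has no parent, whereas B's span names A's span as its parent.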

Traces, on the other hand, describe a group of spans that are directly causally related. You can probably already see why Open Telemetry is practically indispensable for debugging microservice architecture apps. Traces are not limited to individual components of an application. If event 1 in application A causes event 2 to occur in application B, several spans are created, depending on the application type and function, but all of them belong to a single trace.
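How a trace can span two applications is sketched below in plain Go. The sketch assumes that the trace identifiers travel between the applications in the W3C Trace Context traceparent format (version, trace ID, parent span ID, and flags separated by hyphens), which Open Telemetry implementations commonly use for propagation. The SpanContext type and the inject and extract functions are hypothetical names, not the API of the Open Telemetry bindings, and the IDs are fixed example values.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// SpanContext holds the identifiers that travel from one application
// to the next. Hypothetical type for illustration only.
type SpanContext struct {
	TraceID string // 32 hex characters
	SpanID  string // 16 hex characters
}

// inject renders the context as a traceparent header value:
// version "00", trace ID, span ID, and flags "01" (sampled).
func inject(sc SpanContext) string {
	return fmt.Sprintf("00-%s-%s-01", sc.TraceID, sc.SpanID)
}

// extract parses a traceparent header value received from a caller.
// It only checks the overall shape, not every rule of the format.
func extract(header string) (SpanContext, error) {
	parts := strings.Split(header, "-")
	if len(parts) != 4 || len(parts[1]) != 32 || len(parts[2]) != 16 {
		return SpanContext{}, errors.New("malformed traceparent header")
	}
	return SpanContext{TraceID: parts[1], SpanID: parts[2]}, nil
}

func main() {
	// Application A: event 1 runs inside a span of some trace.
	spanA := SpanContext{
		TraceID: "4bf92f3577b34da6a3ce929d0e0e4736", // example value
		SpanID:  "00f067aa0ba902b7",                 // example value
	}
	header := inject(spanA) // sent along with the request to application B

	// Application B: event 2 continues the trace it received.
	remote, err := extract(header)
	if err != nil {
		panic(err)
	}
	spanB := SpanContext{
		TraceID: remote.TraceID,     // same trace as application A
		SpanID:  "b7ad6b7169203331", // new span ID, example value
	}

	fmt.Println(header)
	fmt.Println("same trace:", spanA.TraceID == spanB.TraceID)
}
```

Both applications create their own spans with their own span IDs, but because application B reuses the trace ID it received, a tracing back end can assemble all of the spans into one trace.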

The practical benefits of Open Telemetry become apparent when you use a tracing framework to relate spans and traces visually, making it possible to see which applications called which functions where and when. You can also see the resulting events, the available data, and what happened to the data.

Expanding the Focus

Today, anyone looking at Open Telemetry and Jaeger for the first time will find significantly more functionality than was the case just a few months ago. The Open Telemetry developers have continuously expanded the scope of their standard. While the focus was originally on tracing information, today the standard also specifies formats for collecting metrics data and logs. Tracing frameworks based on the Open Telemetry standard can therefore draw on a broader pool of data for debugging. Having the traces available alongside the corresponding error messages logged by the application simplifies troubleshooting further.

Open Telemetry also fills the monitoring and logging gap for microservice architecture apps described earlier. For both types of data, the standard comes with formats that allow applications to generate suitable data. The tools needed for analysis have also grown in number and become more diverse. Some of the data generated by Open Telemetry can be processed, for example, by Prometheus (metrics data) or Loki (log data).
