OPNids: Suricata with built-in machine learning

Packet Checker

Verbose Logging

As mentioned earlier, rigid rules are a problem for both IPS and IDS. On the one hand, such a system forces you to update the signatures continuously; on the other, attacks may already be taking place in the wild that have not yet been systematically described and for which no signatures are available.

It would be practical if Suricata could teach itself to recognize attacks on the basis of incoming data. The idea is not absurd: It already works for spam. One of the central features of tools like SpamAssassin is that they adapt to incoming messages and adjust their rules to new situations on the basis of statistics and user feedback. Nobody would call a spam filter AI or machine learning, but the principle is the same.

The bad news is that Suricata does not currently provide such functions, although it does provide the basis on which to build such a system. The magic word is EVE (Extensible Event Format) [5], Suricata's extremely flexible logging facility.

EVE Lists Most Details

As a rule, when you enable EVE output in Suricata (Figure 2), the log entries end up in JSON format in a central file named eve.json. EVE is thorough: For each type of incoming traffic, it records the event type and central details, such as the size of a request, in JSON fields. Suricata is also able to identify and interpret the most important protocols. If someone tries to take control of a website with an HTTP request, Suricata extracts this information from the packet stream and records it in the logfile.

Figure 2: An EVE interface lets you transfer Suricata events to other systems. © Elastic
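To make the JSON format more tangible, the following Python sketch reads EVE records line by line and picks out the HTTP requests. The field names (event_type, src_ip, http.hostname, and so on) follow Suricata's EVE output, but the sample records and their values are invented for illustration, and the helper names parse_eve and http_requests are hypothetical.

```python
import json

# Illustrative records only: the field names follow Suricata's EVE JSON
# output, but the values are invented for this sketch.
SAMPLE_EVE_LINES = [
    '{"timestamp": "2019-01-01T12:00:00.000000+0000", "event_type": "http", '
    '"src_ip": "192.0.2.10", "dest_ip": "198.51.100.5", '
    '"http": {"hostname": "www.example.com", "url": "/admin.php", '
    '"http_method": "POST", "length": 512}}',
    '{"timestamp": "2019-01-01T12:00:01.000000+0000", "event_type": "dns", '
    '"src_ip": "192.0.2.10", "dest_ip": "198.51.100.53", '
    '"dns": {"type": "query", "rrname": "www.example.com"}}',
]


def parse_eve(lines):
    """Yield one dict per non-empty EVE line, skipping malformed lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            continue  # for example, a partially written line


def http_requests(events):
    """Return (source IP, method, host, URL) for every HTTP event."""
    result = []
    for event in events:
        if event.get("event_type") != "http":
            continue
        http = event.get("http", {})
        result.append((event.get("src_ip"), http.get("http_method"),
                       http.get("hostname"), http.get("url")))
    return result


if __name__ == "__main__":
    for row in http_requests(parse_eve(SAMPLE_EVE_LINES)):
        print(row)
```

On a live system, you would pass an open handle on eve.json to parse_eve instead of the sample list; the location of that file depends on your Suricata configuration.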

However, EVE is by no means the only logging function Suricata offers. The program can also record specific information about packet flows in a flow logfile, and it maintains a separate file containing DNS requests. In other words, it has all the details that a machine learning engine (MLE) needs to train Suricata with defined parameters. This is where Dragonfly MLE for Suricata comes into play.
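As a rough idea of what this data offers an analysis engine, the Python sketch below condenses flow and DNS records into simple statistics. It assumes the records are available in EVE-style JSON; the sample values are invented, the exact layout of the dns and flow objects depends on the Suricata version, and the function name summarize is hypothetical.

```python
import json
from collections import Counter

# Invented sample data in the style of Suricata's EVE flow and DNS records.
SAMPLE_EVE_LINES = [
    '{"event_type": "flow", "src_ip": "192.0.2.10", "dest_ip": "198.51.100.5", '
    '"flow": {"bytes_toserver": 1200, "bytes_toclient": 5400}}',
    '{"event_type": "dns", "src_ip": "192.0.2.10", '
    '"dns": {"type": "query", "rrname": "www.example.com"}}',
    '{"event_type": "dns", "src_ip": "192.0.2.11", '
    '"dns": {"type": "query", "rrname": "www.example.com"}}',
    '{"event_type": "flow", "src_ip": "192.0.2.11", "dest_ip": "198.51.100.5", '
    '"flow": {"bytes_toserver": 300, "bytes_toclient": 700}}',
]


def summarize(lines):
    """Count event types, queried DNS names, and bytes per source IP."""
    event_types = Counter()
    dns_names = Counter()
    flow_bytes = Counter()
    for line in lines:
        if not line.strip():
            continue
        event = json.loads(line)
        kind = event.get("event_type", "unknown")
        event_types[kind] += 1
        if kind == "dns":
            dns = event.get("dns", {})
            if dns.get("type") == "query" and "rrname" in dns:
                dns_names[dns["rrname"]] += 1
        elif kind == "flow":
            flow = event.get("flow", {})
            total = flow.get("bytes_toserver", 0) + flow.get("bytes_toclient", 0)
            flow_bytes[event.get("src_ip")] += total
    return event_types, dns_names, flow_bytes


if __name__ == "__main__":
    types, names, volume = summarize(SAMPLE_EVE_LINES)
    print(dict(types))
    print(dict(names))
    print(dict(volume))
```

Aggregates of this kind (how often a name is queried, how much data a host transfers) are the sort of baseline against which unusual events can later be weighted.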

Dragonfly MLE as Quasi-AI

Dragonfly MLE was created by the authors of OPNids. It works in three phases: First, it attaches itself to one or more data sources, such as EVE logs from Suricata. Second, it analyzes the events found in these logs and correlates them. Third, it outputs the result of its deliberations to a sink, which can be a simple file or socket in Lua. The trick is that other tools that also use Lua can tap into those sinks and derive their sets of rules from them. This functionality is exactly what Suricata offers. If you point Suricata at a Lua sink [6], it automatically adopts it, including the ruleset.
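The three phases can be pictured as a small pipeline. The following Python sketch is a conceptual model only: It does not use Dragonfly MLE's actual interfaces or configuration, and the function names (read_source, correlate, write_sink, run_pipeline) as well as the correlation rule (three events from the same source address) are assumptions chosen for illustration.

```python
import io
import json
from collections import Counter


def read_source(stream):
    """Phase 1: read JSON events line by line from a file-like source."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


def correlate(events, threshold=3):
    """Phase 2: emit a finding once a source IP has produced `threshold` events."""
    seen = Counter()
    for event in events:
        src = event.get("src_ip")
        if src is None:
            continue
        seen[src] += 1
        if seen[src] == threshold:
            yield {"src_ip": src, "events": threshold}


def write_sink(findings, out):
    """Phase 3: write each finding as one JSON line; return the number written."""
    count = 0
    for finding in findings:
        out.write(json.dumps(finding) + "\n")
        count += 1
    return count


def run_pipeline(stream, out, threshold=3):
    """Chain source, analyzer, and sink."""
    return write_sink(correlate(read_source(stream), threshold), out)


if __name__ == "__main__":
    # Invented input: three events from one address, one from another.
    source = io.StringIO(
        '{"src_ip": "192.0.2.1"}\n{"src_ip": "192.0.2.2"}\n'
        '{"src_ip": "192.0.2.1"}\n{"src_ip": "192.0.2.1"}\n'
    )
    sink = io.StringIO()
    run_pipeline(source, sink)
    print(sink.getvalue(), end="")
```

Because every phase is a generator or a plain function, the stages stay decoupled: A different source or sink can be swapped in without touching the analyzer.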

The process I just described sounds a bit theoretical, but a practical example will clarify what I mean. To begin, assume that you have a running Suricata system that writes incoming events to an EVE log at regular intervals. At the same time, you install an instance of the Dragonfly MLE system, which runs as a daemon in the background, taps directly into the EVE stream in the Suricata log, and discovers everything Suricata is doing.
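How a background process might tap into a growing logfile can be sketched in a few lines of Python. This is not how Dragonfly MLE is implemented; it is a minimal polling reader in the spirit of tail -f, and the function name follow and its parameters poll_interval and max_idle_polls are hypothetical. The sketch starts at the beginning of the file and ignores log rotation.

```python
import json
import time


def follow(path, poll_interval=0.5, max_idle_polls=None):
    """Yield JSON events from a growing logfile.

    Polls for new data when the end of the file is reached. If
    max_idle_polls is set, the generator stops after that many
    consecutive polls without new data; otherwise it runs forever.
    """
    with open(path, "r", encoding="utf-8") as handle:
        buffer = ""
        idle = 0
        while True:
            chunk = handle.readline()
            if not chunk:
                idle += 1
                if max_idle_polls is not None and idle >= max_idle_polls:
                    return
                time.sleep(poll_interval)
                continue
            idle = 0
            buffer += chunk
            if not buffer.endswith("\n"):
                continue  # line not fully written yet; wait for the rest
            line, buffer = buffer.strip(), ""
            if not line:
                continue
            try:
                yield json.loads(line)
            except json.JSONDecodeError:
                continue  # skip malformed lines rather than abort


if __name__ == "__main__":
    import sys
    # Usage: python follow.py /path/to/eve.json
    for event in follow(sys.argv[1]):
        print(event.get("event_type"))
```

A daemon built along these lines would hand each event to its analyzers as soon as the event appears in the log.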

Once Dragonfly MLE has assimilated the incoming data, the data passes through the analyzer layer. The analyzer's central task is to investigate the information provided by Suricata and weight it according to the results. Dragonfly MLE already comes with some analyzers, but admins and users are encouraged to write and use their own variants. The factory-supplied analyzers flag traffic from unusual countries of origin or with strange timestamps.

In principle, it works like SpamAssassin. A specific number of points is assigned for individual criteria. If an event has a sufficiently high score at the end of the calculation, Dragonfly forwards this information to the Lua sinks mentioned above. If you configure Suricata in the intended way, it retrieves the information from the sink provided by Dragonfly MLE, creating a feedback loop that results in actions and extended rules on the Suricata side.
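The point system can be illustrated with a short Python sketch. The criteria, point values, and threshold below are invented for illustration and are not Dragonfly MLE's real analyzers or defaults. The sketch also assumes that each event carries a country field, for example added by a GeoIP lookup, which is a hypothetical enrichment here.

```python
from datetime import datetime

# All values below are assumptions chosen for this sketch.
EXPECTED_COUNTRIES = {"DE", "US"}   # countries considered normal
USUAL_HOURS = range(7, 20)          # 07:00 to 19:59 counts as normal
THRESHOLD = 5                       # score at which an event is forwarded


def score_event(event):
    """Return (score, reasons) for a single EVE-style event."""
    score = 0
    reasons = []
    country = event.get("country")  # hypothetical enrichment field
    if country is not None and country not in EXPECTED_COUNTRIES:
        score += 3
        reasons.append("unusual country")
    stamp = datetime.strptime(event["timestamp"], "%Y-%m-%dT%H:%M:%S.%f%z")
    if stamp.hour not in USUAL_HOURS:
        score += 2
        reasons.append("unusual time")
    if event.get("event_type") == "alert":
        score += 2
        reasons.append("signature alert")
    return score, reasons


def should_forward(event, threshold=THRESHOLD):
    """True if the event's score reaches the threshold."""
    return score_event(event)[0] >= threshold


if __name__ == "__main__":
    event = {
        "timestamp": "2019-01-01T03:12:45.000000+0000",
        "event_type": "alert",
        "country": "XX",
    }
    print(score_event(event), should_forward(event))
```

As with a spam filter, no single criterion decides the outcome; only the sum of several weak indicators pushes an event over the threshold.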

Referring to this process as machine learning would be inaccurate. Unlike complex neural networks, the system does not improve its results autonomously from feedback, which is the core of a learning process. It is more about filters that work a bit more intelligently because the user weights them.


