Coming to grips with grep


Master of Many Forms

The various incarnations of grep have taken many forms. Back in the day, grep was actually a function of the somewhat old-school editor ed. The editor will feel familiar to users of vi and vim, whose colon-based commands descend from ed's command set. From within this clever editor, you could run a command starting with g (global) that enabled you to print (with p) all the lines matching a regular expression, hence the name g/re/p:

# g/chrisbinnie/p

From these humble beginnings, a standalone utility called grep surfaced, to which you could pipe any data.
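To see how closely the two are related, the following sketch runs the same search once through the standalone grep and once through ed's g command. The sample file and its contents are made up for illustration, and the ed step is simply skipped if ed is not installed.

```shell
# Build a throwaway sample file (hypothetical contents).
sample=$(mktemp)
printf 'alice\nchrisbinnie\nbob\n' > "$sample"

# Standalone grep: print every line that matches the pattern.
grep_out=$(grep 'chrisbinnie' "$sample")
echo "grep: $grep_out"

# The same search from inside ed, fed non-interactively on stdin.
# -s suppresses ed's byte counts; q quits without saving anything.
if command -v ed >/dev/null 2>&1; then
  ed_out=$(printf 'g/chrisbinnie/p\nq\n' | ed -s "$sample")
  echo "ed:   $ed_out"
fi

rm -f "$sample"
```

Both commands should report the same matching line, which is all the original ed command ever did.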

Thanks to grep's pervasive acceptance, the sys admin's vernacular even includes some highly amusing references to it. For example, are you familiar with vgrep, or visual grep [2], which takes place when you quickly check something with your eyes instead of using software to perform the task?

Back in the realm of the command line, you might have come across the excellent and regular-expression-friendly network grep, or ngrep [3], which offers features similar to tcpdump, as well as a few additional features on top.

The agrep, or approximate grep, command uses fuzzy matching to find results that are close to, but not necessarily identical to, the pattern. If you employ the slick regex matching library TRE, you can install a more powerful version of agrep called tre-agrep. To install tre-agrep and the libtre5 library on Debian or Ubuntu, use:

# apt-get install tre-agrep

You could try out tre-agrep with the word "gold" on the simplefile example file. A numeric switch between 0 and 9 tells tre-agrep how many errors to accept, so adding -2

# tre-agrep -2 -i "gold" simplefile

means it will accept matches within two errors of the original pattern. Therefore, it would also return near misses like "göld" with a diacritical. Figure 10 shows the output from this case-insensitive (-i) command. The first line is within one error of "gold" (good), the second line is within two errors (volu), and the third line matches exactly (gold).

Figure 10: The tre-agrep command outputs the three lines in simplefile that match the word "gold" within a distance of two (-2), with case ignored (-i).
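To make the idea of "errors" more concrete, here is a rough sketch that counts the single-character insertions, deletions, and substitutions separating two whole words (the Levenshtein distance). It is only an illustration: the edit_distance function is a hypothetical helper, not part of tre-agrep, and tre-agrep itself looks for the pattern anywhere within a line rather than comparing whole words.

```shell
# edit_distance WORD1 WORD2
# Hypothetical helper: print the Levenshtein distance between two ASCII words.
edit_distance() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    la = length(a); lb = length(b)
    # Distance from an empty prefix is just the number of characters.
    for (i = 0; i <= la; i++) d[i, 0] = i
    for (j = 0; j <= lb; j++) d[0, j] = j
    for (i = 1; i <= la; i++) {
      for (j = 1; j <= lb; j++) {
        # A substitution costs nothing if the characters already agree.
        cost = (substr(a, i, 1) == substr(b, j, 1)) ? 0 : 1
        best = d[i-1, j] + 1                                      # deletion
        if (d[i, j-1] + 1 < best) best = d[i, j-1] + 1            # insertion
        if (d[i-1, j-1] + cost < best) best = d[i-1, j-1] + cost  # substitution
        d[i, j] = best
      }
    }
    print d[la, lb]
  }'
}

# Compare the pattern against the three words discussed above.
for word in good volu gold; do
  printf '%s: %s\n' "$word" "$(edit_distance gold "$word")"
done
```

Words that score 2 or less here are the kind of near misses that the -2 switch tolerates.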

Not Stable

As if all that information about grep isn't enough, a Debian-specific package offers even more grep derivatives in debian-goodies. To install, enter:

# apt-get install debian-goodies

If you rummage deeper into this bag of goodies, you will discover a tiny utility called which-pkg-broke, which lets you delve into the innards of a package's dependencies and when they were updated. This utility along with the mighty dpkg can help you solve intricate problems with packages that aren't behaving properly. To run the utility, all you do is enter

# which-pkg-broke binutils

where binutils is the package you are investigating.

Another goodie is dgrep, which allows you to search all files in an installed package using regular expressions. In a similar grep-like format, dglob dutifully generates a list of package names that match a pattern. (The apt-cache search command has related functionality.) Both goodies are highly useful for performing maintenance on servers.
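As a rough sketch of how these two goodies might be called: the run_if_present helper below is a hypothetical wrapper (not part of debian-goodies) that skips a command when it is not installed, and the search pattern and package name are placeholders. The sketch assumes that dgrep takes a pattern followed by one or more installed package names, and that dglob takes a single pattern.

```shell
# Hypothetical helper: run a command only if it exists on this system.
run_if_present() {
  if command -v "$1" >/dev/null 2>&1; then
    "$@"
  else
    echo "skipped: $1 is not installed"
  fi
}

# List the files shipped by the installed binutils package that mention "objdump".
# dgrep hands its options (here -l) on to grep.
run_if_present dgrep -l 'objdump' binutils

# List the package names that match "binutils".
run_if_present dglob binutils
```

On a system without debian-goodies, both calls just print a "skipped" message, so the sketch is safe to paste into a terminal.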

All Clear

My intention in this article was to provide enough information about grep to help you build useful command lines and create invaluable time-saving shell scripts. There's nothing like getting back to basics sometimes.

The clever grep has many other applications to explore in much more detail, such as back-references [4], but I hope your appetite has been sufficiently whetted for you to invest more time in brushing up on the basics and less time referencing instruction manuals. As you can imagine, you can launch all sorts of weird and wonderful quests by combining the functionality that these utilities provide.
