Everybody loves grep, but sometimes we need to go beyond just looking for a string inside a file.

POSIX grep Can Have Many Options

Plain ol’ POSIX grep can be pretty boring to use, especially if you’re looking for a string in a source code tree. It doesn’t even support recursion into subdirectories, which would be the bare minimum for this activity (unless one is willing to pair it with find, of course). It also has a lot of other restrictions, but it’s useful to know it’s there.
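
The find route mentioned above can be sketched as follows. This is a minimal illustration, not a recommendation: the directory layout and file names are made up for the demo, and mktemp is assumed to be available even though it is not part of POSIX itself.

```shell
# Build a tiny throwaway tree to search in (hypothetical file names).
demo=$(mktemp -d)
mkdir -p "$demo/lib/sub"
printf 'sub foo {\n}\n' > "$demo/lib/sub/a.pl"
printf 'nothing here\n' > "$demo/lib/b.pl"

# Let find walk the tree and hand the files over to grep.
# The extra /dev/null argument makes grep print file names even when
# find happens to pass it a single file.
result=$(cd "$demo" && find . -type f -exec grep -n 'sub foo' /dev/null {} +)
printf '%s\n' "$result"

rm -rf "$demo"
```

It works, but having to type all of this for every search is a good part of what makes plain grep tedious on a source tree.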

GNU grep Is A De-Facto Standard

With GNU grep we are already taking a step forward, because it provides not one but two different ways of recursing into a tree of files. From the help:

  -d, --directories=ACTION     how to handle directories;
                               ACTION is 'read', 'recurse', or 'skip'
  -r, --recursive              like --directories=recurse
  -R, --dereference-recursive  likewise, but follow all symlinks
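
To see the difference between the two recursive options, here is a small sketch. It assumes a reasonably recent GNU grep, where -r does not follow symlinks met while descending; the tree, the file names, and the word needle are all invented for the demo.

```shell
# Throwaway tree with a symlink pointing outside of it (hypothetical names).
demo=$(mktemp -d)
mkdir -p "$demo/tree" "$demo/outside"
printf 'needle\n' > "$demo/tree/a.txt"
printf 'needle\n' > "$demo/outside/b.txt"
ln -s ../outside "$demo/tree/link"

# -r: recurse, but skip symlinks found along the way.
plain=$(cd "$demo" && grep -rl needle tree | sort)
# -R: recurse and follow every symlink.
deref=$(cd "$demo" && grep -Rl needle tree | sort)

printf 'with -r:\n%s\n' "$plain"
printf 'with -R:\n%s\n' "$deref"

rm -rf "$demo"
```

Other grep implementations may treat the two flags differently, so it is worth checking the local man page before relying on this.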

It also helps that it provides a lot of additional features, such as showing some context around the match (option -B for lines before, option -A for lines after), using Perl-compatible regular expressions (option -P), and printing only the text that actually matched (option -o).
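
A quick sketch of these options at work, on a made-up file. It assumes a GNU grep built with PCRE support, which is what -P needs; some builds leave it out.

```shell
# One small file to search in (hypothetical content).
demo=$(mktemp -d)
mkdir -p "$demo/src"
printf '# helper\nsub foo {\n    return 1;\n}\n' > "$demo/src/a.pl"

# -r recurses, -n adds line numbers, -B1 and -A1 add one line of
# context before and after each match.
context=$(cd "$demo" && grep -rn -B1 -A1 'sub foo' .)
printf '%s\n' "$context"

# -o prints only the matched text and -h drops the file name.
# -P turns on Perl-compatible regexes: \K discards what was matched so
# far, so only the name after "sub" is printed.
names=$(cd "$demo" && grep -rhoP 'sub\s+\K\w+' .)
printf '%s\n' "$names"

rm -rf "$demo"
```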

Alas, if you have a big project in a language that compiles source code into some kind of target (e.g. C, C++, or Java), GNU grep will happily search through the compiled artifacts as well, potentially consuming a lot of time looking for a string in binary files that you would otherwise skip. So, it’s time to go beyond that, too.
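
The problem is easy to reproduce. The sketch below fakes a compiled artifact with a file containing a NUL byte (the build directory and the file names are invented) and then shows how GNU grep can be narrowed down by hand with -I, --include and --exclude-dir. It can be done, but the burden of remembering the flags is on you at every invocation.

```shell
# Throwaway project with one source file and one fake "object" file.
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/build"
printf 'int foo(void);\n' > "$demo/src/a.c"
# The NUL byte makes grep treat this file as binary.
printf 'foo\000bar\n' > "$demo/build/a.o"

# Plain recursive search: the binary file is searched and reported too.
all=$(cd "$demo" && grep -rl 'foo' . | sort)
printf 'plain -r:\n%s\n' "$all"

# -I ignores binary files, --include keeps only *.c files,
# --exclude-dir prunes the whole build directory.
narrowed=$(cd "$demo" && grep -rlI --include='*.c' --exclude-dir=build 'foo' .)
printf 'narrowed:\n%s\n' "$narrowed"

rm -rf "$demo"
```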

Enter ack

Ack was created by Andy Lester to get past a few of grep’s shortcomings, GNU grep’s included. In particular, looking only in the right files was a driving principle; the author also lists the Top 10 reasons to use ack for source code, which sheds light on the other nine.

I usually don’t need anything fancy, just looking for a pattern like this:

$ ack '(?mxs: \A\s* sub \s+ (?: foo | bar) \b)'

Ack is written in Perl, which makes it very portable and also explains why the regular expressions are expected to be Perl-compatible.

Ack can be installed by downloading a single file and setting it as executable. As such, it’s a wonderful candidate for the #toolbox.

Beyond ack

If you judge a piece of software by the number of its imitators, I would say that ack is a huge success. Many people found it useful, although lacking in one respect or another, and hence decided to rewrite it with different goals in mind. You can find a few of the alternatives here.

Among them we find ag, the Silver Searcher, which the author describes as 5-10x faster than Ack in typical usage. The command line options are mostly compatible with ack’s, although the two have diverged a bit over time.

For simple pattern searches you can just use ag in place of ack, although I’m not 100% sure that it supports the full gamut of options and syntax you would expect in a Perl regex; for this reason I’ll stick to a more portable example:

$ ag '^\s*sub\s+(foo|bar)\b'
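
If you want to convince yourself that the compact pattern selects the same lines as the verbose one used with ack, here is a rough check. It uses GNU grep -P as a stand-in for both tools, on the assumption that its PCRE engine is close enough for these two patterns; it says nothing about the rest of ag's regex support. The sample lines are made up.

```shell
# Four sample lines: two should match, two should not (hypothetical content).
demo=$(mktemp -d)
printf 'sub foo {\n  sub bar {\nmy $sub = foo;\nsub foobar {\n' > "$demo/a.pl"

# The verbose pattern from the ack example (whitespace is ignored
# because of the x flag).
verbose=$(grep -nP '(?mxs: \A\s* sub \s+ (?: foo | bar) \b)' "$demo/a.pl")
# The compact pattern from the ag example.
compact=$(grep -nP '^\s*sub\s+(foo|bar)\b' "$demo/a.pl")

printf 'verbose:\n%s\n' "$verbose"
printf 'compact:\n%s\n' "$compact"

rm -rf "$demo"
```

Both select the first two lines only: the third does not start with sub, and in the fourth foo is not followed by a word boundary.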

One interesting aspect of ag is that it’s possible to compile it as a static binary, which makes it extremely portable and a good component of a #toolbox. As an example, you can find a binary rendition of ag, compiled for x86_64, so it’s definitely possible.