8

Say I want to execute cmd on all *.cpp and *.hpp files that contain the word FOO.

As far as just finding those files goes, I know I can do,

find /path/to/dir -name '*.[hc]pp' -exec grep -l 'FOO' {} +

but what is the proper way to extend the processing so that I can execute, say cmd on each of those files?

I know I could do -exec bash -c '...' and write the "if file content contains FOO, then run cmd on the file" logic in the ..., but that feels like a cannon to shoot a fly.

1
  • 1
    Oh, it seems like changing + to \; | xargs cmd is enough. I should learn how to use the xargs beast :/
    – Enlico
    Commented Apr 12, 2024 at 6:30

5 Answers

22

-exec … \; is also a test: it succeeds if and only if the command inside returns exit status 0. -exec … + can process many pathnames with one command and it always succeeds, so it's not a useful test.

grep -q plays nicely with -exec … \; because it returns exit status 0 when there is a match (even if an error was detected), and a non-zero status otherwise (1 for no match, greater than 1 for an error).

Turn your -exec grep … + into -exec grep -q … \; to test files one by one, so you can add another -exec that will conditionally run your desired command:

find /path/to/dir -name '*.[hc]pp' -exec grep -q 'FOO' {} \; -exec …

In general, the fact that -exec … \; is a test allows you to build custom tests where you can test for virtually anything, especially because you can run sh -c and therefore use pipelines, variables you can manipulate, shell conditionals and such to implement a test (but mind this: Is it possible to use find -exec sh -c safely?).
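As a minimal sketch of such a custom test, the following keeps only files that contain FOO but not BAR. The scratch directory, the file names, the word BAR and the use of echo as a stand-in for cmd are all assumptions made for illustration. The pathname is handed to sh as a positional parameter instead of being embedded in the script text:

```shell
# Scratch tree with three hypothetical files.
dir=$(mktemp -d)
printf 'FOO\n'     > "$dir/a.cpp"
printf 'FOO BAR\n' > "$dir/b.hpp"
printf 'nothing\n' > "$dir/c.cpp"

# The sh -c script acts as the test: exit status 0 lets find continue
# to the next -exec, any other status stops processing of that file.
# {} is passed as "$1"; it is never pasted into the script itself.
matched=$(
  find "$dir" -name '*.[hc]pp' -type f \
    -exec sh -c 'grep -q FOO "$1" && ! grep -q BAR "$1"' sh {} \; \
    -exec echo {} \;
)
echo "$matched"
rm -rf "$dir"
```

Replacing the final -exec echo {} \; with -exec cmd {} + would batch the surviving files into as few cmd invocations as possible.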

11

With GNU implementations of those utilities (and a shell with support for ksh-style¹ process substitution), you could just do:

xargs -r0a <(grep -rlZ --include='*.[hc]pp' FOO) cmd --

Above, with -r (aka --recursive) grep does find's job (beware in current versions of grep, it behaves as if -type f was passed to the equivalent find command), and passes the list of file paths NUL-delimited (-Z) to xargs via a pipe whose path is passed to -a (aka --arg-file).

If you still wanted to use find (for instance to apply more complex file criteria than file name suffix), you'd do:

xargs -r0a <(
  find . -name '*.[hc]pp' -type f -exec grep -lZ FOO {} +
  ) cmd

(here the -- is not necessary as the file paths will start with ./, so not - nor +).

Doing find ... -exec grep -q FOO {} \; -exec cmd {} + works, but it means forking a process and executing grep (which involves loading and linking shared libraries) for each file. That is orders of magnitude more expensive than what grep needs to do here (read a few KiB of text and find FOO within it), so it is best avoided wherever possible if performance or resource usage is a concern.

If you're not on a GNU system², but your find supports -print0 and xargs supports -r and -0 as will be required in the next version of the POSIX standard, you can do the reporting of file paths with perl:

find . -name '*.[ch]pp' -type f -print0 |
  xargs -r0 perl -0lne 'if (/FOO/) {print $ARGV; close ARGV}' |
  xargs -r0 cmd

Though that assumes cmd does not read from its stdin (which here, depending on the xargs implementation will be either opened on /dev/null or the reading end of the pipe perl is writing to, inherited from xargs).


¹ Including ksh (implementations other than those based on pdksh), zsh and bash; change the <(...) to <{...} in rc-like shells, and (...|psub) in fish

² Note however that GNU grep used to be used on BSDs (and still is in some), and some BSDs that have moved away from GNU grep have replicated its API, so you'll find non-GNU implementations of grep that do support its -Z non-standard option; sometimes only the --null long-option equivalent like on OpenBSD (where -Z is for something else), FreeBSD or MacOS.

5

First turn on the globstar option in Bash:

shopt -s globstar

Now you can simply do:

for x in /path/to/dir/**/*.[hc]pp; do
  if grep -q 'FOO' "$x"; then
    ...
  fi
done

The ** pattern matches zero or more path components (restricted to directories if it is followed by /).

Thus the pattern will find /path/to/dir/foo.cpp as well as /path/to/dir/deeply/nested/foo.cpp.
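A small self-contained sketch of the loop above, run in Bash against a scratch directory, can make the matching behaviour concrete. The file names are hypothetical, and nullglob is an addition not mentioned above (it makes a pattern with no matches expand to nothing instead of staying literal):

```shell
shopt -s globstar nullglob

# Scratch tree: one match at the top level, one deeply nested, one non-match.
dir=$(mktemp -d)
mkdir -p "$dir/deeply/nested"
printf 'FOO\n' > "$dir/foo.cpp"
printf 'FOO\n' > "$dir/deeply/nested/foo.cpp"
printf 'bar\n' > "$dir/deeply/other.hpp"

hits=()
for x in "$dir"/**/*.[hc]pp; do
  if grep -q 'FOO' "$x"; then
    # Record the path relative to the scratch directory;
    # this is where cmd "$x" would go.
    hits+=("${x#"$dir"/}")
  fi
done
printf '%s\n' "${hits[@]}"
rm -rf "$dir"
```

Unlike find with -type f, the glob also matches directories or other non-regular files whose names end in .cpp or .hpp, so a [ -f "$x" ] check may be worth adding.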

0
1
find /path/to/dir -name '*.[hc]pp' -exec grep -l 'FOO' {} + \
    | while IFS= read -r f; do cmd "$f"; done
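The loop above reads one file name per line, so a name containing a newline would be split in two. A possible variant, assuming GNU grep (for -Z) and Bash (for read -d '' and process substitution), keeps the list NUL-delimited. The scratch files are hypothetical and echo stands in for cmd; process substitution is used instead of a pipe so that variables set in the loop survive it:

```shell
dir=$(mktemp -d)
printf 'FOO\n' > "$dir/plain.cpp"
printf 'FOO\n' > "$dir/with
newline.hpp"                     # file name containing a newline
printf 'bar\n' > "$dir/miss.cpp"

count=0
# grep -lZ terminates each reported path with NUL; read -d '' splits on NUL.
while IFS= read -r -d '' f; do
  count=$((count + 1))
  echo "would run cmd on: $f"
done < <(find "$dir" -name '*.[hc]pp' -type f -exec grep -lZ 'FOO' {} +)
echo "$count"
rm -rf "$dir"
```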
0

Fun case of chained xargs:

find /path/to/dir -name '*.[hc]pp' -print0 | xargs -0 grep -lZ 'FOO' | xargs -0 ...

Depending on your command you may need to pass -n 1 to the second invocation of xargs to run one command per file, or you may not.

The find command produces a NUL-separated list of matching file names. The first xargs command stitches together the minimum number of grep commands needed to test all of those files, producing a NUL-separated list of the files whose contents match the pattern. The second xargs command runs the command you need on them.
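To illustrate the effect of -n 1, here is a rough sketch assuming GNU find, grep and xargs, with hypothetical scratch files and echo standing in for the real command. Each echo invocation prints one line, so counting lines counts invocations:

```shell
dir=$(mktemp -d)
printf 'FOO\n' > "$dir/one.cpp"
printf 'FOO\n' > "$dir/two.hpp"
printf 'bar\n' > "$dir/three.cpp"

# Without -n 1: matching paths are batched into as few echo invocations as possible.
batched=$(find "$dir" -name '*.[hc]pp' -print0 |
  xargs -0 grep -lZ 'FOO' | xargs -0 echo | wc -l)

# With -n 1: one echo invocation per matching file.
per_file=$(find "$dir" -name '*.[hc]pp' -print0 |
  xargs -0 grep -lZ 'FOO' | xargs -0 -n 1 echo | wc -l)

echo "batched=$batched per_file=$per_file"
rm -rf "$dir"
```

Adding -r to both xargs calls, as in the answers above, avoids running grep or the command at all when the incoming list is empty.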
