I'm getting quite confused by how the Linux find command handles regular expressions.
I'm aware that there is an option regextype, but without that, according to the current man page, it is supposed to use Emacs regular expressions. This page seems to say that character classes are supported ("this is a POSIX feature"), but my experiments seem to show that nothing like [[:ascii:]] or [[:digit:]] or [[:alnum:]] ever works, quite apart from the fact that these are truly archaic ways of handling character classes. Instead you seem to have to use [a-zA-Z] which, apart from anything else, is useless for Unicode characters.
So I turned to regextype: I find that you get a list of possible settings by going find -regextype help. This gives:
find: Unknown regular expression type ‘help’; valid types are ‘findutils-default’, ‘awk’, ‘egrep’, ‘ed’, ‘emacs’, ‘gnu-awk’, ‘grep’, ‘posix-awk’, ‘posix-basic’, ‘posix-egrep’, ‘posix-extended’, ‘posix-minimal-basic’, ‘sed’.
... so I assumed that by including -regextype posix-basic, for example, I'd be able to run something like this:
find . -maxdepth 1 -regextype posix-basic -regex .*\d.*
This produces results, but not the ones I was hoping for: all the files and folders in the current directory with the lower-case letter "d" in their names! I was expecting all names with at least one digit.
I've looked at quite a lot of Linux find regex questions here on Stack Exchange, but I don't think I've seen a single one where "modern" character class handling is demonstrated. Is any of the regextype options able to handle something like this:
find . -maxdepth 1 -regextype ??? -regex '.*\d{3}\s+.*'
where I mean "contains three digits followed by one or more whitespace characters". I.e. something like the regex rules from a normal language like Java, Python, JavaScript, etc.?
later, following comments
Here's an exercise: make a directory and put a few files into it with random names. Then add files with the following names: 'ctb117b', 'ctb117c', 'trt117a'.
I then want to isolate the '117' files. There may be files called 'xxx0009333qqq'. So using a modern regex engine I'd go like this, for example (allowing for the preceding ./):
find . -regex './\w{3}\d{3}.*'
Using these more venerable Linux regex rules, what do I put that works?
find . -regextype posix-basic -regex '.*[[:digit:]]{3}.*'
produces nothing. Nor does '.*[[:digit:]]+.*', for example. If anyone's sufficiently interested, please show me something which works for you (lists the above files).
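\d is a Perl-like expression, also supported by some GNU tools as a short way of writing what would be written as [[:digit:]] in a POSIX expression. Same for \s ([[:space:]], or [[:blank:]] for just space and tab). The {3} modifier is a POSIX extended expression modifier; in a POSIX basic expression it has to be written \{3\}.

If the expression contains *, be sure to quote it.
find . -maxdepth 1 -regextype posix-basic -regex .*\d.*
probably gave you unexpected results because the unquoted .*\d.* was processed by the shell before find ever saw your regex: the shell removes the backslash (leaving .*d.*, which matches any name containing "d"), and it will expand the word as a glob if it happens to match something in the current directory.

[[:digit:]] works with all the posix-* values of -regextype, and also with others (egrep, etc.).

Without -regex, a similar test can be written as a -name pattern:
-name '*[[:digit:]][[:digit:]][[:digit:]][[:blank:]]*'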
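To make the points above concrete, here is a minimal sketch rather than a definitive recipe. It assumes GNU find (for -regextype), mktemp, and a POSIX-style shell such as sh or bash; the scratch directory and the extra file names 'notes' and 'img 042 final' are made up for illustration. It first shows what the shell hands to find when the regex is left unquoted, then translates the question's \w{3}\d{3} pattern into POSIX extended and POSIX basic syntax.

```shell
#!/bin/sh
# Sketch only: assumes GNU find (for -regextype) and mktemp are available.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1
touch ctb117b ctb117c trt117a xxx0009333qqq notes 'img 042 final'

# What find receives when the regex is left unquoted: the shell drops the
# backslash, so \d arrives as a plain d (nothing here matches it as a glob).
set -- .*\d.*
unquoted=$1

# posix-extended: [[:alnum:]_] stands in for \w, [[:digit:]] for \d.
# -regex has to match the whole path, hence the leading \./
ere=$(find . -regextype posix-extended \
        -regex '\./[[:alnum:]_]{3}[[:digit:]]{3}.*' | LC_ALL=C sort)

# posix-basic: same pattern, with the interval braces escaped as \{3\}.
bre=$(find . -regextype posix-basic \
        -regex '\./[[:alnum:]_]\{3\}[[:digit:]]\{3\}.*' | LC_ALL=C sort)

# Three digits followed by one or more whitespace characters.
spaced=$(find . -regextype posix-extended \
        -regex '.*[[:digit:]]{3}[[:space:]]+.*')

printf 'unquoted word: %s\n' "$unquoted"
printf 'extended:\n%s\n' "$ere"
printf 'basic:\n%s\n' "$bre"
printf 'digits then space:\n%s\n' "$spaced"

# Remove the scratch directory again.
cd / && rm -rf "$dir"
```

Both the extended and the basic variant list the three '117' files. They also list xxx0009333qqq, because that name starts with three word characters followed by 000, exactly as the original \w{3}\d{3} pattern would in a Perl-style engine; tightening the pattern (for example [[:alpha:]] instead of [[:alnum:]_]) is left as a design choice.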