This is a pretty basic question, but I can't get my head around it. I'm learning Linux, and one of the questions I've been given is:

*The word "sentimentalment" includes the same three characters (e.g. "ent") repeated three times. The word "blayblapblam" also contains the same three characters repeated three times (e.g. "bla").

How many words can you find which contain any three characters repeated three times, like the examples "sentimentalment" and "blayblapblam", but which also begin with lower case "d". Use /usr/share/dict/words as your list of possible words and grep to find the answer. The "d" is not one of the characters considered when detecting the three-character strings.*

So far, I can return words where the same three letters appear twice:

grep -E '^d(...).*\1' /usr/share/dict/words > output

To me, this reads: match a word beginning with 'd', then any three characters (group 1), then zero or more characters, then group 1 again.

I've tried the following:

grep -E '^d(...).*\1.*\1' /usr/share/dict/words > output

If my understanding is correct (which it obviously isn't), this matches group 1, then zero or more characters, then group 1 again, then zero or more characters, then group 1 a third time.

Can someone point out where I'm going wrong? Any help is appreciated.

  • Just to avoid any confusion, neither "sentimentalment" nor "blayblapblam" are words or even common proper nouns in the English language, which is what I assume /usr/share/dict/words contains. I was unable to find any words that match the pattern described above in my local words file.
    – jw013
    Commented Nov 1, 2012 at 14:36
  • The POSIX standard removed the support of back-references for extended regular expressions, so you can't rely on all grep versions supporting it. To be safe, use basic regular expressions, which even makes the command shorter in this case: grep '^d.*\(...\).*\1.*\1'
    – Philippos
    Commented Nov 3, 2017 at 7:36

1 Answer

It seems you have pinned the three-character group to the position right after the d. You probably need something like this instead, with .* before the group so it can start anywhere after the d:

grep -E '^d.*(...).*\1' /usr/share/dict/words > output

which would turn your three-occurrence search into

grep -E '^d.*(...).*\1.*\1' /usr/share/dict/words > output
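To see what the extra .* changes, here is a small sketch on made-up input. The two sample words are just the question's examples with a "d" prepended, not real dictionary entries, and the sketch assumes a grep that accepts back-references with -E (GNU grep does):

```shell
# Hypothetical sample input: the question's example words with "d" prepended.
words='dblayblapblam
dsentimentalment'

# Original attempt: the group must be the three characters right after "d".
anchored=$(printf '%s\n' "$words" | grep -E '^d(...).*\1.*\1')

# With ".*" before the group, the repeated characters may start anywhere after "d".
anywhere=$(printf '%s\n' "$words" | grep -E '^d.*(...).*\1.*\1')

printf 'anchored:\n%s\n' "$anchored"
printf 'anywhere:\n%s\n' "$anywhere"
```

The first pattern should only pick up the word whose repeated characters sit directly after the d ("bla"), while the second should also catch the one where they start later ("ent").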

For portability, avoid combining extended regular expressions with back-references; it is better to use a basic regular expression:

grep '^d.*\(...\).*\1.*\1' /usr/share/dict/words > output
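Since the exercise asks how many words match, and a real words file may well contain none, it can help to check the pattern against a few made-up lines and count with grep -c. This is only a sketch: the sample words below are hypothetical test input, not entries from /usr/share/dict/words.

```shell
# Hypothetical test words; only the first two start with "d" AND
# contain the same three characters three times.
words='dsentimentalment
dblayblapblam
dentist
sentimentalment
dabcabc'

# Basic regular expression, as in the command above.
pattern='^d.*\(...\).*\1.*\1'

# The matching lines, and the number of matching lines (-c).
matches=$(printf '%s\n' "$words" | grep "$pattern")
count=$(printf '%s\n' "$words" | grep -c "$pattern")

printf '%s\n' "$matches"
echo "count: $count"
```

Against the real list, the same count would come from grep -c '^d.*\(...\).*\1.*\1' /usr/share/dict/words; a result of 0 there just means the dictionary has no such word.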

  • I believe this answer is exactly what you need. Bear in mind that on my system it returns no results from /usr/share/dict/words. It is, nevertheless, correct.
    – terdon
    Commented Nov 1, 2012 at 14:28
  • @terdon I don't have /usr/share/dict/words but tried it on some sample input, and it seems to have behaved as expected.
    – gt6989b
    Commented Nov 1, 2012 at 14:42
  • yes it does, absolutely, that's why I upvoted it :). I am just warning the OP that if there is no output, it does not mean that your solution is wrong.
    – terdon
    Commented Nov 1, 2012 at 15:16
  • @terdon Thank you... wonder why it takes forever to get such one-liners accepted...
    – gt6989b
    Commented Nov 1, 2012 at 15:38
  • @Utku no, grep will print any line as long as part of the line matches the pattern given. You don't need to match the entire line.
    – terdon
    Commented Jun 18, 2016 at 17:10
