What you want is to remove everything except the vowels and then count the characters left on each line. Your attempt failed because grep -o prints each match on a separate line, so it printed each vowel by itself. You were looking for something like this:
$ tr -dc 'AEIOUaeiou\n' < names.txt | awk '$0=$0" "length'
Iae 3
a 1
Ao 2
ae 2
ee 2
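To see the grep -o behaviour in isolation, here is a minimal sketch. The two names are made up for illustration (they are not the contents of your names.txt), and the pattern is only my guess at what your attempt looked like:

```shell
# Hypothetical sample input, not the asker's actual file.
sample=$(printf 'Anna\nBob')

# grep -o emits every match on its own line, so the grouping by name is lost.
matches=$(printf '%s\n' "$sample" | grep -o '[AEIOUaeiou]')
printf '%s\n' "$matches"
# A
# a
# o
```

Once the vowels of Anna and Bob have been split onto separate lines like this, there is no simple way to tell which name each one came from.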
The tr command can translate between sets of characters. With the -d flag it deletes them, and -c makes it take the complement of what you give it. So tr -dc x < file will print out the contents of file after deleting all characters except x. Here, tr -dc 'AEIOUaeiou\n' will delete everything that isn't a vowel or newline:
$ tr -dc 'AEIOUaeiou\n' < names.txt
Iae
a
Ao
ae
ee
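If you want to play with -d and -c without touching your file, a small sketch on made-up input might help. The two words below are arbitrary, and the second command is only there to show why \n belongs in the set:

```shell
# Keep only the letter i and newlines; everything else is deleted.
with_newline=$(printf 'Mississippi\nOhio\n' | tr -dc 'i\n')
printf '%s\n' "$with_newline"
# iiii
# i

# Without \n in the set, the line breaks are deleted too and the lines merge.
without_newline=$(printf 'Mississippi\nOhio\n' | tr -dc 'i')
printf '%s\n' "$without_newline"
# iiiii
```

Keeping the newlines is what lets the next step count characters per line rather than for the whole file.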
So we just need the count, and I used awk for that.
$0 is the $ operator applied to the number 0, which results in the full current input record (records being lines by default). For simplicity, you can think of $0 as a special variable that holds the current line.
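As a quick illustration of $ being an operator rather than part of a variable name, here is a sketch on an invented input line. The variable n is just something I introduced for the example:

```shell
# $0 is the whole record; $n is whichever field the expression n evaluates to.
out=$(printf 'Ada Lovelace\n' | awk '{ n = 2; print "record: " $0; print "field n: " $n }')
printf '%s\n' "$out"
# record: Ada Lovelace
# field n: Lovelace
```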
length is a function that returns the length (in characters for scalars and in number of elements for arrays) of what you give it, and when you give it nothing, it operates on $0. So that gives us the number of characters.
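The following sketch shows that a bare length and length($0) agree, using a few vowel-only lines I made up (including an empty one). It sticks to strings, since length on arrays is not something I would rely on in every awk:

```shell
# Columns: bare length, length($0), and length of a fixed string for comparison.
out=$(printf 'Iae\n\nee\n' | awk '{ print length, length($0), length("abc") }')
printf '%s\n' "$out"
# 3 3 3
# 0 0 3
# 2 2 3
```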
$0=$0" "length then means "add a space and then the result of length to the end of the line". Finally, in awk, the default action when a pattern evaluates to true is to print the current value of $0. Because of the string concatenation in $0=$0" "length, the result of that assignment is a string, and a string is considered true in awk as long as it's non-empty, which will always be the case here, so the resulting $0 will always be printed.
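If the terse form feels too magical, the sketch below compares it with a more explicit spelling that should behave the same way, and with a shorter variant that shows why the concatenation matters. The input is invented and includes an empty line, standing in for a name with no vowels:

```shell
# Sample vowel-only lines, with one empty line in the middle.
input=$(printf 'Iae\na\n\nee')

# Terse form: the assignment is the pattern, and printing is the default action.
concise=$(printf '%s\n' "$input" | awk '$0=$0" "length')

# Explicit form: an action block with no pattern, so it runs for every line.
explicit=$(printf '%s\n' "$input" | awk '{ print $0, length($0) }')

printf '%s\n' "$concise"
# Iae 3
# a 1
#  0
# ee 2

# Without the concatenation the assigned value is a number, and 0 is false,
# so the empty line is silently skipped.
numeric=$(printf '%s\n' "$input" | awk '$0=length')
printf '%s\n' "$numeric"
# 3
# 1
# 2
```

The explicit form is arguably easier to read in a script; the terse one is mostly handy on the command line.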
I am just reformulating doneal24's approach here using a slightly different combination of tools.