The answers using awk fail if the number of matches is too large (which happens to be my situation). For the answer from loki-astari, the following error is reported:

awk -F\" '{print NF-1}' foo.txt
awk: program limit exceeded: maximum number of fields size=32767
    FILENAME="foo.txt" FNR=1 NR=1
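
For context: -F\" makes every double quote a field separator, so a line with N quotes splits into N+1 fields and NF-1 yields the count; the error above is that awk implementation's 32767-field cap being exceeded. A minimal sketch on a small input (assuming any POSIX awk):

printf 'a"b"c"d\n' | awk -F\" '{print NF-1}'   # prints 3: four fields, three separators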

For the answer from enzotib (and the equivalent from manatwork), a segmentation fault occurs:

awk '{ gsub("[^\"]", ""); print length }' foo.txt
Segmentation fault
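
A variant, sketched here but untested at this scale, counts via gsub's return value (the number of substitutions made) instead of measuring the stripped line; it still runs gsub over the entire line, so it may hit the same implementation limits:

awk '{ n += gsub(/"/, "\"") } END { print n }' foo.txt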

The sed solution by maxschlepzig works correctly, but is slow (timings below).

Here are some solutions that have not yet been suggested. First, using grep:

grep -o \" foo.txt | wc -w
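
Since grep -o prints each match on its own line, counting lines works as well, and does not rely on the match being a non-blank "word":

grep -o \" foo.txt | wc -l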

And using perl:

perl -ne '$x+=s/\"//g; END {print "$x\n"}' foo.txt
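
A perl variant, sketched but not timed below, uses tr///'s return value (the number of characters matched) and so counts the quotes without rewriting the line:

perl -ne '$x += tr/"//; END { print "$x\n" }' foo.txt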

Here are some timings for a few of the solutions, ordered slowest to fastest; I limited things to one-liners. 'foo.txt' is a file with a single line containing one long string with 84922 matches.
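
The exact contents of my foo.txt aren't reproduced here, but a hypothetical stand-in with the same shape (a single long line containing 84922 quotes) can be generated like this:

perl -e 'print q{x"} x 84922, "\n"' > foo.txt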

## sed solution by maxschlepzig
$ time sed 's/[^"]//g' foo.txt | awk '{ print length }'
84922
real    0m1.207s
user    0m1.192s
sys     0m0.008s

## using grep
$ time grep -o \" foo.txt | wc -w
84922
real    0m0.109s
user    0m0.100s
sys     0m0.012s

## using perl
$ time perl -ne '$x+=s/\"//g; END {print "$x\n"}' foo.txt
84922
real    0m0.034s
user    0m0.028s
sys     0m0.004s

## the winner: updated tr solution by maxschlepzig
$ time tr -d -c '\"\n' < foo.txt | awk '{ print length }'
84922
real    0m0.016s
user    0m0.012s
sys     0m0.004s
