I wrote this script out of boredom, purely to try out GNU parallel, so I know it's not particularly useful or optimized. It prints all prime numbers up to n:
#!/usr/bin/env bash
isprime () {
    local n=$1
    ((n==1)) && return 1
    for ((i=2;i<n;i++)); do
        if ((n%i==0)); then
            return 1
        fi
    done
    printf '%d\n' "$n"
}
for ((f=1;f<=$1;f++)); do
    isprime "$f"
done
When run with the loop:
$ time ./script.sh 5000 >/dev/null
real 0m28.875s
user 0m38.818s
sys 0m29.628s
I expected that replacing the for loop with GNU parallel would make this run significantly faster, but that has not been my experience. On average it is only about 1 second faster:
#!/usr/bin/env bash
isprime () {
    local n=$1
    ((n==1)) && return 1
    for ((i=2;i<n;i++)); do
        if ((n%i==0)); then
            return 1
        fi
    done
    printf '%d\n' "$n"
}
export -f isprime
seq 1 "$1" | parallel -j 20 -N 1 isprime {}
Run with parallel:
$ time ./script.sh 5000 >/dev/null
real 0m27.655s
user 0m38.145s
sys 0m28.774s
I'm not really interested in optimizing the isprime() function; I am just wondering whether there is something I can do to optimize GNU parallel.
In my testing, seq actually runs faster than for ((i=1...)), so I don't think that has much, if anything, to do with the runtime.
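The seq-versus-loop comparison can be checked in isolation, with no primality test involved. This is only a minimal sketch, assuming bash and a seq binary on PATH; the helper names gen_loop and gen_seq are made up for illustration and are not part of the script above:

```bash
#!/usr/bin/env bash
# Hypothetical helpers: each prints the integers 1..limit, one per line.
gen_loop () {
    local limit=$1 f
    for ((f = 1; f <= limit; f++)); do
        printf '%d\n' "$f"
    done
}

gen_seq () {
    seq 1 "$1"
}

# Time each generator on its own. Output is discarded, so only the cost
# of producing the numbers is measured, not the cost of testing them.
time gen_loop 5000 >/dev/null
time gen_seq 5000 >/dev/null
```

Whichever one wins, both should finish in a small fraction of a second for 5000 numbers, so number generation is unlikely to account for a runtime measured in tens of seconds.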
Interestingly, if I modify the for loop to:
for ((f=1;f<=$1;f++)); do
    isprime "$f" &
done | sort -n
it runs considerably faster:
$ time ./script.sh 5000 >/dev/null
real 0m5.995s
user 0m33.229s
sys 0m6.382s
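Note that the backgrounded loop starts every test at once, so for n=5000 it may have thousands of subshells alive at the same time. For comparison, here is a sketch of the same idea with a cap on concurrency. It assumes bash 4.3 or later for wait -n; the wrapper name primes_upto and its max_jobs parameter are hypothetical, and i is made local here, which the original function does not do:

```bash
#!/usr/bin/env bash
# Same trial-division test as in the question, with i declared local.
isprime () {
    local n=$1 i
    ((n < 2)) && return 1
    for ((i = 2; i < n; i++)); do
        ((n % i == 0)) && return 1
    done
    printf '%d\n' "$n"
}

# Hypothetical wrapper: test 1..limit with at most max_jobs background
# jobs counted as running at any time (default 20).
primes_upto () {
    local limit=$1 max_jobs=${2:-20} f running=0
    for ((f = 1; f <= limit; f++)); do
        isprime "$f" &
        if (( ++running >= max_jobs )); then
            wait -n          # block until one background job exits (bash 4.3+)
            ((running--))
        fi
    done
    wait                     # collect whatever is still running
}

# Background jobs finish in arbitrary order, so sort the output.
primes_upto "${1:-100}" | sort -n
```

Saved as a script, this would be run the same way as the others, for example ./script.sh 5000. It still forks one subshell per number, so it is a way to compare against parallel -j 20 on equal terms rather than a claim about which is faster.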