EDIT: Sorry, this is my first question here.

Here is a minimal working example:

#!/bin/bash
#
count=0

function something() {
    if test -f "$1"; then
        # is a file
        ((count++))
        echo $count $1
    elif test -d "$1"; then
        # is a folder
        find $1 | while read line; do
            if [ $1 != $line ]; then  # first line in find is the folder itself. prevent infinite recursion
                something $line
            fi
        done
    fi
}

while [ $# -gt 0 ]
do
    something "$1"
    shift
done

echo "Processed $count files"

Example output:

$ ./test.sh *
1 0/file1.txt
2 0/file2.txt
1 1/file1.txt
2 1/file2.txt
1 2/file1.txt
2 2/file2.txt
1 3/file1.txt
2 3/file2.txt
1 4/file1.txt
2 4/file2.txt
1 5/file1.txt
2 5/file2.txt
1 6/file1.txt
2 6/file2.txt
1 7/file1.txt
2 7/file2.txt
1 8/file1.txt
2 8/file2.txt
1 9/file1.txt
2 9/file2.txt
1 test.sh
Processed 1 files

As you can see, each time a recursive call returns, count reverts to the value it had in the parent call, which messes up the count.

Using Ubuntu 22.06 on WSL2 VM

  • This is obviously not your real code. Do you use local anywhere? Or declare inside a function? If not, then the variable is global. No, there is no separate global scope. Commented Sep 3, 2024 at 19:42
  • I agree with glenn.  This is confusing.   (1) What does unpack have to do with A and B?   (2) What do the command-line parameters have to do with anything?  Do they affect the control flow?   (3) It’s hard to debug “sometimes”.  Can you clarify?   (4) For a given scenario, is the result consistent?   (5) Can you give a runnable example that demonstrates the problem (perhaps including tests on "$1" that determine whether A calls itself or B)? … (Cont’d) Commented Sep 3, 2024 at 20:10
  • (Cont’d) …  (6) Suggestion: change your echo statements to echo "A: $count" and echo "outside: $count".  Add echo "B: $count", echo "Calling A with arg xxx" and echo "Calling B" in the appropriate places. Commented Sep 3, 2024 at 20:10
  • Sorry. Done. Had to try a few times to reproduce
    – Shlomo V
    Commented Sep 3, 2024 at 20:50
  • Copy/paste your script into shellcheck.net and it'll tell you of some additional issues you need to fix.
    – Ed Morton
    Commented Sep 4, 2024 at 11:04

1 Answer

Do I get it right that you want to count all files in a directory tree, or something like that? I.e. you want the variable count to be global?

Your issue is here:

find $1 | while read line; do

See that pipeline? A pipeline runs its commands in parallel, and that requires distinct OS-level processes, which the shell creates here by forking copies of itself (subshells). Those subshells can't affect the parent process, so any change to the variable is effectively local to the loop, and when the pipeline ends, you are back to the original value.

See Why is my variable local in one 'while read' loop, but not in another seemingly similar loop? for solutions to that.
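
One of the fixes from that linked question is to feed the loop from a process substitution instead of a pipe. The following is only a sketch of how the function from the question might look with that change. Limiting find to direct children with -mindepth 1 -maxdepth 1 (GNU/BSD find options) is an assumption added here so that the recursion does the walking, and the demo tree uses made-up file names.

```shell
#!/bin/bash
# Sketch only: the loop reads from a process substitution instead of a pipe,
# so it runs in the current shell and changes to count are kept.
count=0

something() {
    local entry
    if [ -f "$1" ]; then
        # is a file
        count=$((count + 1))
        echo "$count $1"
    elif [ -d "$1" ]; then
        # is a folder
        # Assumption: list only direct children, so the recursion does the
        # walking and nothing is visited twice.
        while IFS= read -r entry; do
            something "$entry"
        done < <(find "$1" -mindepth 1 -maxdepth 1)
    fi
}

# Demo on a throwaway tree (hypothetical names).
demo=$(mktemp -d)
mkdir -p "$demo/a/b"
touch "$demo/top.txt" "$demo/a/one.txt" "$demo/a/b/two.txt"
something "$demo"
echo "Processed $count files"
rm -rf "$demo"
```

With the three files created in the demo tree, the final line should report 3 files, since no subshell is involved in the loop.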

As an example, a normal recursive function:

% cat local1.sh
#!/bin/bash

count=1
func() {
    (( count++ ))
    printf "$1 %$1s count=$count\n" ""
    if (( $1 < 3 )); then
        func $(( $1 + 1 ))
    fi
    printf "$1 %$1s count=$count\n" ""
}

func 1

Running that prints:

1   count=2
2    count=3
3     count=4
3     count=4
2    count=4
1   count=4

Now, change the inner func line to e.g. echo | func $(( $1 + 1 )) and you'll see the count go back down when the function calls return.
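
For reference, here is local1.sh with only that one line changed. It assumes lastpipe is not enabled, which is the default, so the right-hand side of the pipe runs in a subshell.

```shell
#!/bin/bash
# Same as local1.sh, except that the recursive call sits on the right-hand
# side of a pipeline and therefore runs in a subshell.
count=1
func() {
    (( count++ ))
    printf "$1 %$1s count=$count\n" ""
    if (( $1 < 3 )); then
        # The increments made inside this call are lost when it returns.
        echo | func $(( $1 + 1 ))
    fi
    printf "$1 %$1s count=$count\n" ""
}

func 1
```

The first three lines of output are the same as before, but the returning lines now show count=4, count=3 and count=2: each level only sees its own increment.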


Now, another thing entirely is that calling find on each level of a recursive tree walk is a bit suspicious, since find will itself recursively walk the whole tree. Are you sure you want to add another walk on top of that? It'll lead to directories being processed multiple times.

If we go back to counting files, you could do that with just one find, e.g.:

find "$dir" -type f | wc -l

though that would count filenames with newlines as multiple files. With GNU find, you could use ... -printf x | wc -c to avoid that.
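
A small sketch comparing the two counts, assuming GNU find (for -printf) and using a throwaway directory with made-up file names, one of which contains a newline:

```shell
#!/bin/bash
# Assumes GNU find, since -printf is not a POSIX option.
dir=$(mktemp -d)
touch "$dir/plain.txt" "$dir/with"$'\n'"newline.txt"

# Counting lines: the name with the embedded newline is counted twice.
by_lines=$(find "$dir" -type f | wc -l)

# Printing one character per file and counting characters avoids that.
by_chars=$(find "$dir" -type f -printf x | wc -c)

echo "wc -l: $by_lines"
echo "wc -c: $by_chars"
rm -rf "$dir"
```

For the two files created here, the line count comes out as 3 and the character count as 2.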


Actually, if the goal is to call a function for each (regular) file, it should be enough to drop the manual recursion and let find do the walking:

#!/bin/bash
shopt -s lastpipe

count=0
func() {
    echo "processing '$1'..."
    ((count++))
}
find "$1" -type f | while IFS= read -r file; do
    func "$file"
done
echo "end. count=$count"

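Clear IFS and use read -r as above to keep read from treating whitespace and backslashes specially. shopt -s lastpipe makes Bash run the last command of a pipeline in the current shell (when job control is off, as it is in scripts), so count keeps its value after the loop.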

(And to avoid issues with newlines in filenames, use find ... -print0 | while IFS= read -r -d '' file; do ... in Bash.)
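
Put together, the NUL-delimited variant might look like the sketch below. It assumes Bash run as a script (so that lastpipe takes effect) and a find that supports -print0, and the test directory and file names are invented for the demo.

```shell
#!/bin/bash
# lastpipe: run the last pipeline stage in this shell, so count survives.
shopt -s lastpipe

count=0
dir=$(mktemp -d)
touch "$dir/plain.txt" "$dir/with space.txt" "$dir/with"$'\n'"newline.txt"

# -print0 ends each name with a NUL byte; read -d '' splits on NUL, so
# newlines and spaces inside names are left alone.
find "$dir" -type f -print0 | while IFS= read -r -d '' file; do
    printf 'processing %q\n' "$file"
    count=$((count + 1))
done

echo "end. count=$count"
rm -rf "$dir"
```

With the three files above, the loop runs three times and the last line reports count=3.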

  • @ShlomoV, I was more worried about correctness than efficiency. Anyway, wouldn't it do to just run find once and loop over the output? (see edit)
    – ilkkachu
    Commented Sep 3, 2024 at 21:35
  • Yes, but I need to have an option to iterate over files OR traverse through folders too. But yes, that makes sense. Regarding my question: according to the link you gave me, the answer would be to replace find $1 | while read line; do... done with while read line; do... done < <(find $1)
    – Shlomo V
    Commented Sep 3, 2024 at 21:44
  • @ShlomoV I suggest you post a new question, explaining what you want the script to do and showing what you have so far. If we know the real objective, we could probably help you find a more efficient way to achieve it.
    – terdon
    Commented Sep 3, 2024 at 23:38
  • @EdMorton, doh, silly me, I tried to be simpler there, but of course that has the same issue with the subshell when the loop ends... Actually, I think pipes look better than <(), so perhaps lastpipe would do...
    – ilkkachu
    Commented Sep 4, 2024 at 13:28
  • Another possibility that avoids the pipe: mkfifo /tmp/a_pipe ; find ... > /tmp/a_pipe & while read line; do ... ; done < /tmp/a_pipe ; command rm /tmp/a_pipe : no subshells needed. Commented Sep 4, 2024 at 14:43
