I guess find does do globbing, but the point about shell expansion remains.
icktoofay

That's right; Bash leaves them in place, passing them on to find, where it looks for files of the form *.h and *.cpp. Since there are no such files (otherwise, Bash would have substituted them in) find will not find the files, and so nothing will happen. What if there is example.h and example.cpp?

More suggestions.

I note three potential problems with your approach, all dealing with special characters in filenames, as well as one comment on the example usage.

Markdown

Markdown cares about special characters, too. What if I had a file named a** b **c? Your script would give me a header:

**a** b **c**

a b c

But that's probably not what you intended. You probably want to enclose it in backticks:

**`a** b **c`**

a** b **c

But even that's not enough. What if my file names have backticks, like, say, a` b `c?

**`a` b `c`**

a b c

That's no good! You have to then double-backtick it:

**``a` b `c``**

a` b `c

Essentially, you have to count the maximum number of consecutive backticks and enclose it in that-number-plus-one of backticks.
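The counting rule above can be sketched as a small shell function. This is only an illustration: md_code_span is a hypothetical name, not something from your script, and the space padding for names that start or end with a backtick is an extra assumption on my part about how Markdown parses code spans.

```bash
#!/usr/bin/env bash
# md_code_span (hypothetical helper): wrap a string in a Markdown code span
# whose delimiter is one backtick longer than the longest backtick run inside.
# The body runs in a subshell ( ... ) so its variables stay private.
md_code_span() (
    text=$1
    rest=$1
    run=0
    longest=0
    # Walk the string one character at a time, tracking the longest run.
    while [ -n "$rest" ]; do
        case $rest in
            '`'*)
                run=$((run + 1))
                if [ "$run" -gt "$longest" ]; then longest=$run; fi
                ;;
            *)
                run=0
                ;;
        esac
        rest=${rest#?}
    done
    # Build a delimiter of longest + 1 backticks.
    fence='`'
    i=0
    while [ "$i" -lt "$longest" ]; do
        fence=$fence'`'
        i=$((i + 1))
    done
    # Assumption: pad with spaces when the text itself starts or ends with a
    # backtick, so it cannot merge with the delimiter.
    case $text in
        '`'*|*'`') text=" $text " ;;
    esac
    printf '%s%s%s\n' "$fence" "$text" "$fence"
)

md_code_span 'a** b **c'
md_code_span 'a` b `c'
```

The header line could then be built with something like printf '**%s**\n' "$(md_code_span "$path")".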

Example usage

You suggest

mdproj.sh -name *.h -or -name *.cpp > whatever.md

but that will only work when there is at most one file whose name ends with .h and at most one file whose name ends with .cpp. Consider how it expands when there are no files:

mdproj.sh -name *.h -or -name *.cpp > whatever.md

That's right; Bash leaves them in place, passing them on to find, where it looks for files of the form *.h and *.cpp. Since there are no such files (otherwise, Bash would have substituted them in) find will not find the files, and so nothing will happen. What if there are example.h and example.cpp?

mdproj.sh -name example.h -or -name example.cpp > whatever.md

That works fine. What if there are multiple?

mdproj.sh -name foo.h bar.h -or -name foo.cpp bar.cpp > whatever.md

Uh oh. -name takes only one value, but you're passing it two. My find dies with this error:

find: bar.h: unknown option

I always use -regex for that sort of thing:

mdproj.sh -regex '.*\.h' -or -regex '.*\.cpp' > whatever.md

(If I was being cleverer, I might use -E and merge them into one regular expression.)
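Since the underlying problem is the shell expanding the patterns, quoting them should also work with plain -name. Here is a minimal sketch against find directly, in a throwaway directory; I'm assuming mdproj.sh hands its arguments to find unchanged, and I use -o, the portable spelling of -or.

```bash
#!/usr/bin/env bash
# Build a throwaway directory with several matches per extension.
dir=$(mktemp -d)
touch "$dir/foo.h" "$dir/bar.h" "$dir/foo.cpp" "$dir/bar.cpp" "$dir/notes.txt"

# The quotes stop the shell from expanding the patterns, so find receives
# the literal strings *.h and *.cpp and does the matching itself.
matches=$(cd "$dir" && find . -type f \( -name '*.h' -o -name '*.cpp' \) | sort)
printf '%s\n' "$matches"

rm -r "$dir"
```

Under that assumption, the example usage would become mdproj.sh -name '*.h' -or -name '*.cpp' > whatever.md.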


I note two potential problems with your approach, both dealing with special characters in filenames, as well as one unrelated suggestion.

for with find

Consider a directory structure where some paths have spaces in them, say:

Source
├── Constants.h
└── Main Loop.c

(One could argue the wisdom of using spaces in paths, but your script should be able to deal with them anyway.) What would find . -type f output?

./Source/Constants.h
./Source/Main Loop.c

What would for path in $(find . -type f); do echo $path; done output?

./Source/Constants.h
./Source/Main
Loop.c

That's right; the shell splits the unquoted output of find on whitespace before for ever sees it, breaking your paths. This is a notorious issue. Fortunately, there is a standard solution:

find . -type f -print0 | while IFS= read -r -d '' path; do echo "$path"; done
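To see the NUL-delimited loop handle the directory from the example, here is a small self-contained sketch. It rebuilds the Source tree in a temporary directory and collects the paths into an array; the paths array and the process substitution are my own choices for the demonstration, not part of your script. In Bash, a loop on the right-hand side of a pipe normally runs in a subshell, so variables set inside it are lost afterwards; feeding the loop with < <(...) avoids that.

```bash
#!/usr/bin/env bash
# Recreate the example tree in a throwaway directory.
dir=$(mktemp -d)
mkdir "$dir/Source"
touch "$dir/Source/Constants.h" "$dir/Source/Main Loop.c"

paths=()
# read -d '' reads up to each NUL byte; IFS= and -r keep leading whitespace
# and backslashes intact, so every path arrives exactly as find printed it.
while IFS= read -r -d '' path; do
    paths+=("$path")
done < <(cd "$dir" && find . -type f -print0)

printf '%s\n' "${paths[@]}"
echo "${#paths[@]} paths"

rm -r "$dir"
```

If the loop body only prints, as in your script, the simpler pipe form works just as well.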

echo -e

Look at this line:

echo -e "\n**$path** ($bytes in $lines)\n"

You're substituting path in, and then telling echo to interpret the resulting text. That suggests that if path has an escape sequence in it, echo will interpret it. Indeed, it does:

% path='x\ay'
% touch "$path"
% echo -e "$path"
xy

(The \a was interpreted as "sound the bell.") Instead of using echo -e, consider passing the special characters directly to echo. You can make Bash interpret them before passing them to echo by using $'':

% echo $'\n'"**$path** ($bytes in $lines)"$'\n'
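Another option, which is a different technique from the $'' quoting above, is printf. Escape sequences are interpreted only in the format string, while values substituted through %s are printed literally. In this sketch print_header is a hypothetical name and the byte and line counts are placeholder values.

```bash
#!/usr/bin/env bash
# print_header (hypothetical helper): emit the header line with a blank line
# before and after it. Only the format string is scanned for escapes;
# the %s arguments are copied through untouched.
print_header() {
    printf '\n**%s** (%s in %s)\n\n' "$1" "$2" "$3"
}

# The backslash in the file name survives as-is; no bell is sounded.
print_header 'x\ay' '12 bytes' '3 lines'
```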

Piping

Both wc and sed can open files on their own. You can remove the redirections:

bytes="$(wc -c "$path") bytes"
lines="$(wc -l "$path") lines"
sed "s/^/    /" "$path"
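One thing worth checking before dropping the redirections, assuming the header is meant to show only the counts: when wc is given a file operand, it prints the file name after the count, whereas reading from standard input prints the count alone. A quick sketch of the difference, using a temporary file:

```bash
#!/usr/bin/env bash
# A two-line throwaway file to count.
file=$(mktemp)
printf 'one\ntwo\n' > "$file"

# With a file operand, wc appends the file name to its output.
with_name=$(wc -l "$file")

# Reading from stdin gives the count only; some implementations pad it
# with spaces, so strip whitespace before using it.
count_only=$(wc -l < "$file" | tr -d '[:space:]')

echo "$with_name"
echo "$count_only lines"

rm "$file"
```

If the file name in the output is unwanted, keeping the redirection for wc while letting sed open the file itself seems like the simplest combination.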