3
\$\begingroup\$

I've started learning bash today, but I knew basics of javascript.

It would be very nice if anyone can point out at least a few pieces of advice on the code.

Because there are so many details in bash, I hope you won't be too harsh.

Expected behavior

The code is supposed:

  • to set the value of a variable named 'charge' from the input files
  • to create a few directories and subdirectories based on the input files

Concrete Example

Let's say there is a mol2 file named glucose-minus1.mol2.

  • I expect the code to set up a variable charge with value -1, so charge="-1". So you can see there is a translation from minus1 --> -1 If the filename is 'wrong' the user should be notified "we can't find the charge value in the filename" or something like that.
  • And to create a directory named glucose-minus1 (without the extension) plus the 3 sub-directories "solv" "sin-solv" and "thermo". All this if the directories are do no exist already.
#!/bin/bash
touch "mol-minus1.mol2" "mol-minus2.mol2"
molecules=(*.mol2)
subdirs=("solv" "sin-solv" "thermo")


setCharge () {
        #input molfile
        #if it finds any chargeLabel in molfile, sets the charge value (and we use it later).
        chargeLabels=( "minus3" "minus2" "minus1" "cero" "plus1" "plus2" )
        charges=("-3" "-2" "-1" "0" "+1" "+2")
        echo "${chargeLabels[*]}"
        for index in "${!chargeLabels[@]}"
        do
                if [[ "$1" == *"${chargeLabels[index]}"* ]]
                then
                        charge="${charges[index]}"
                        echo "$charge"
                else
                        echo "Im not setting charges..."
                        continue
                fi
        done
}

createDir(){
        #pass dir as an argument.
        if [ ! -d "$1" ]
        then
                mkdir "$1"
        else
                printf " %s is already there " "$1"
        fi
}
for mol in "${molecules[@]}"
do
        [ -f "$mol" ] || exit 0
        #sets the carge variable to a value included in molfile.
        echo "$mol"
        setCharge "$mol"
        #acreate molecule directory if not there.
        IFS="." read -ra splitted <<< "$mol"
        if [[ ! -d "${splitted[0]}" ]]
        then
                mkdir "${splitted[0]}"
                printf "directory %s has been created. Molecule charge is %s" "${splitted[0]}" "$charge"
        else
                printf "directory %s already exists. Molecule charge is %s" "${splitted[0]}" "$charge"
                continue
        fi

        #create subdirectories if not there.
        cd "${splitted[0]}" || printf "No directory with name %s" "${splitted[0]}" && exit 0
        for dir in "${subdirs[@]}" #create subdirs if not there.
        do
                createDir "$dir"
        done
        cd '../' #single quotes for string literals.
        cp "$mol" "${splitted[0]}/sin-solv/"
        #python -c'import flexo_api.py; flexo_api.flexo("$)'
done

\$\endgroup\$

2 Answers 2

2
\$\begingroup\$

Documentation

Add documentation in the form of comments to the top of the file, similar to the description you have in the body of the question.

# Set the value of a variable named 'charge' from the input files.
# Create a few directories and subdirectories based on the input files.
# [add more details here]

Comments

Many of the comments in the code are meaningful.

Here are some specific comment-related suggestions.

The comment on the following line is redundant with the comment 2 lines above it:

    for dir in "${subdirs[@]}" #create subdirs if not there.

That comment can be removed. Also, the following commented-out code can be removed:

    #python -c'import flexo_api.py; flexo_api.flexo("$)'

There are typos in the following comments ("carge" and "acreate"):

    #sets the carge variable to a value included in molfile.
    #acreate molecule directory if not there.

Layout

In general, the code is laid out well with consistent use of indentation and usage of functions.

However, the 3 lines of code at the top seem out of place. It would be better to move those lines after the function definitions.

Also, there should be a blank line after each function definition.

}
for mol in "${molecules[@]}"
do

Is better as:

}

for mol in "${molecules[@]}"
do

Simpler

When using an if with an else, I find code easier to read and understand when the if condition does not use a negation. For example:

    if [ ! -d "$1" ]

is simpler as:

    if [ -d "$1" ]

This requires the statements to be swapped as well:

    if [ -d "$1" ]
    then
            printf " %s is already there " "$1"
    else
            mkdir "$1"
    fi

In the setCharge function, the continue line is not needed and can be removed.

\$\endgroup\$
1
\$\begingroup\$

This function can fail, but we never check for that:

createDir(){
        #pass dir as an argument.
        if [ ! -d "$1" ]
        then
                mkdir "$1"
        else
                printf " %s is already there " "$1"
        fi
}

mkdir can fail if $1's parent directory doesn't exist or is not writeable, or if $1 already exists (as a regular file or other non-directory type). When we call this function we should do so like

                createDir "$dir" || exit

Alternatively, set -e to have the shell do this for us everywhere.

When we print informational text, we should emit an entire line, including its terminating newline:

            printf ' %s is already there\n' "$1"
            #                           🔺🔺

Check all the printf invocations in this program - this error appears several times. And check that output is going to the right stream; some of these look like error messages and should therefore go to standard error stream (printf >&2).


In setCharge, I think we should be using an associative array rather than a parallel pair of indexed arrays:

    local -A charges=([minus3]=-3 [minus2]=-2 [minus1]=-1 [zero]=0 [plus1]=+1 [plus2]=+2)
    local key
    for key in "${!charges[@]}"
    do
        if [[ $1 == *$key* ]]
        then
            charge=${charges[$key]}
            echo "$charge"
        else
            echo "I'm not setting charges..."
            continue
        fi
    done

That loop is a bit strange - the log message inside the else is misleading and really ought to be printed only after the do loop finds no matches (and the continue has no value, since that's exactly what happens anyway):

    for key in "${!charges[@]}"
    do
        if [[ $1 == *$key* ]]
        then
            charge=${charges[$key]}
            echo "$charge"
            return
        fi
    done
    echo "I'm not setting charges..."

I'm not convinced we should be leaving charge at its previous value when we have no match - shouldn't we be setting it to empty string, or perhaps zero?

Also, communicating via the global variable makes it hard to understand - better to have setCharge write the value to its standard output and have the caller capture that:

    for key in "${!charges[@]}"
    do
        if [[ $1 == *$key* ]]
        then
            echo "${charges[$key]}"
            return
        fi
    done
}
    charge=$(setCharge $i)
    echo "${charge:-"I'm not setting charges..."}"

I'm not convinced this is correct:

    [ -f "$mol" ] || exit 0

Is it really a success when input file is not present? I think we should be exiting with the non-zero status returned by [:

        [ -f "$mol" ] || exit

We need to be really careful with error handling when we use cd - we really don't want to continue in the wrong directory. We've made an attempt with cd || printf >&2 && exit, but that is flawed because printf can fail. Since failing cd emits an error message anyway, cd || exit is adequate without needing a printf.

Even cd .. is risky - it's safer to use a subshell to localise the change in directory:

    (
        cd "${splitted[0]}" || exit
        mkdir -p "${subdirs[@]}" || exit
    )  #   cd '../'
    cp "$mol" "${splitted[0]}/sin-solv/"
\$\endgroup\$

You must log in to answer this question.