I want to keep only those sub-directories that are listed for the corresponding directory in a CSV file. The file structure looks something like this:

100_folder/
├── folder_11
├── folder_25
├── folder_31
└── folder_41
210_folder/
├── folder_13
├── folder_23
├── folder_31
└── folder_42

Information in CSV:

Col6,Col26
100,folder_11
100,folder_13
100,folder_41
210,folder_31
210,folder_42

Based on the information in these columns, I want to remove the sub-directories that are not listed in the CSV file.

Here is how I read the file:

eCollection=( $(cut -d ',' -f6,26 file.csv ) )
echo "${eCollection[@]}"

1 Answer

Of course we can provide a solution for you. But where is the fun in that?

Let me say that your requirements look dangerous to me, as every folder not in the CSV will be deleted (imagine a typo, a wrong file format or line ending, or trailing whitespace).

That said, I will introduce three friends of text file processing in shell scripts:

  • string manipulation (of variables)
  • grep: text search core utility (file content, read-only)
  • find: file search core utility (file names)
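
As a minimal sketch of the first item, the snippet below shows how suffix and prefix removal in POSIX parameter expansion can split a path into the pieces needed for a CSV lookup. The path is a hypothetical example taken from the directory tree in the question, and the variable name key is an assumption introduced only for this illustration.

```shell
#!/bin/sh
# Hypothetical path, used only for illustration.
path='./100_folder/folder_11/'

path=${path%/}        # remove one trailing slash         -> ./100_folder/folder_11
subdir=${path##*/}    # remove everything up to last "/"  -> folder_11
dir=${path%/*}        # remove the last "/" and the rest  -> ./100_folder
dir=${dir##*/}        # keep only the directory name      -> 100_folder
key=${dir%_*}         # remove the last "_" and the rest  -> 100

# Build the line that would be looked up in the CSV index.
printf '%s,%s\n' "$key" "$subdir"
```

A single % or # removes the shortest match, while a doubled %% or ## removes the longest, which is why ##*/ leaves only the last path component.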

Never use code you don't fully understand!

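#!/bin/sh

csvfile='index.txt'
csvseparator=','

# Extract columns 6 and 26 (directory prefix, sub-directory name) into the index file.
cut -d "$csvseparator" -f6,26 file.csv > "$csvfile"

for subdir in ./*/*/
do
    subdir=${subdir%/}      # drop the trailing slash
    dir=${subdir%/*}        # path of the containing directory
    parent=${dir%/*}        # path above that (here: .)
    subdir=${subdir##*/}    # keep only the sub-directory name
    dir=${dir##*/}          # keep only the directory name
    # Exact, literal whole-line match such as "100,folder_11".
    if grep -Fxq "${dir%_*}$csvseparator$subdir" "$csvfile"
    then
        echo "ok: $parent/$dir/$subdir"
    # Report a miss only if the directory prefix appears in the index at all.
    elif grep -wq "^${dir%_*}" "$csvfile"
    then
        echo "no: $parent/$dir/$subdir"
        # Uncomment only after checking the "no:" lines in a dry run.
        # find "$parent/$dir/$subdir" -delete
    fi
done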
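
Because grep -Fx compares whole lines byte for byte, the line-ending and trailing-whitespace risks mentioned above can make every lookup fail. The following is a rough sketch, not part of the original answer, of one way to clean an index file before matching. The sample data and the file names raw.txt and index.txt inside a temporary directory are assumptions made for the demonstration.

```shell
#!/bin/sh
# Work in a throwaway directory so nothing real is touched.
tmp=$(mktemp -d) || exit 1

# Hypothetical index with CRLF line endings and one trailing space.
printf '100,folder_11\r\n100,folder_41 \r\n' > "$tmp/raw.txt"

# Drop carriage returns, then strip trailing whitespace from every line.
tr -d '\r' < "$tmp/raw.txt" | sed 's/[[:space:]]*$//' > "$tmp/index.txt"

# grep -Fx matches whole lines literally, so stray bytes prevent a match.
if grep -Fxq '100,folder_11' "$tmp/raw.txt"; then raw=match; else raw=nomatch; fi
if grep -Fxq '100,folder_11' "$tmp/index.txt"; then clean=match; else clean=nomatch; fi
echo "raw: $raw, clean: $clean"

rm -r "$tmp"
```

On a typical Linux system the raw file is expected not to match, while the cleaned copy does. Running the main loop with the delete line still commented out, and reading the "no:" lines first, remains the safer order of operations.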
  • Substring Removal, grep, find
    – alecxs
    Commented Oct 22, 2021 at 15:14
  • I just saw that your CSV file contains more columns than shown in the question, so I added the line cut -d ',' -f6,26 file.csv > index.txt
    – alecxs
    Commented Oct 22, 2021 at 15:58
  • Thanks for the solution and detailed explanation. It was really helpful. However, I tried uncommenting the delete line, but it's not working. I also printed dir and subdir after parsing, and the directories and sub-directories are printed correctly. I think there may be a problem in the grep step. Sorry, I may be wrong, though.
    – botloggy
    Commented Oct 22, 2021 at 17:03
  • If I copy the data below "Information in CSV" into index.txt, it works for me. With grep's -x flag, index.txt must not contain gaps or trailing whitespace, because every byte is taken literally. Another cause may be wrong line endings (CRLF instead of LF). Just start from scratch and write your own script; you will figure it out.
    – alecxs
    Commented Oct 22, 2021 at 17:38
  • linux.die.net/man/1/grep You can also add quotes to your grep search pattern, although you will probably have to escape them as \"
    – alecxs
    Commented Oct 22, 2021 at 17:52
