2

array[1] is a string pulled from a 30k-line CSV, for example:

samsung black 2014

I need to match each of those strings against one of the values contained in an array (arrayItems).

arrayItems contains 221 values like:

apple
sony
samsung

The actual script:

while IFS=$';' read -r -a array
do
    mapfile -t arrayItems < $itemsFile
    ## now loop through the above array
    for itemToFind in "${arrayItems[@]}"
    do
       itemFound=""
       itemFound="$(echo ${array[1]} | grep -o '^$itemToFind')"
       if [ -n "$itemFound" ] 
       then 
          echo $itemFound 
          # so end to search in case the item is found
          break
       fi
    done
   # here I do something with ${array[2]}, ${array[4]} line by line and so on,
   # so I can't match the whole file $file_in at once but only line by line.
done < $file_in

The problem is that grep doesn't match anything.

However, it works if I hardcode the value of $itemToFind like this:

itemFound="$(echo ${array[1]} | grep -o '^samsung')"

Another question: how can I make this faster, given that $file_in is a 30k-line CSV?

  • If you want better answers, you need to provide a better example. You would also benefit from reading Raymond's smart question essay
    – Thor
    Commented Nov 29, 2018 at 13:27
  • @Thor you're right. next time I will take a little more time and write a smarter question
    – Kintaro
    Commented Nov 29, 2018 at 13:45
  • Can you provide some example lines from the CSV file?
    – lauhub
    Commented Nov 29, 2018 at 15:20

2 Answers

2

You can use grep with its pattern-file option (-f), which reads one pattern per line from a file.

Example:

$ echo -e "apple\nsony\nsamsung" > file_pattern
$ grep -f file_pattern your.csv

EDIT: In response to your new constraints:

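# prefix every item with ^ so that it only matches at the start of the field
sed 's/^/^/' "$itemsFile" > /tmp/pattern_file
while IFS=';' read -r -a array
do
    # -q: report the result through the exit status only; -f: read the patterns from a file
    if printf '%s\n' "${array[1]}" | grep -q -f /tmp/pattern_file; then
        # here I do something with ${array[2]}, ${array[4]} line by line and so on,
        # so I can't match the whole file $file_in at once but only line by line.
        :   # placeholder, because an if body cannot consist of comments only
    fi
done < "$file_in"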
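
If starting one grep process per CSV line is still too slow, a possible alternative is to let bash do the matching itself. The following is only a sketch, not a drop-in replacement: the sample data and the names regex and matches are assumptions made for illustration, it needs bash 4 or later for mapfile, and the items are treated as regular expressions, so an item containing characters such as . or + would need escaping.

```shell
#!/usr/bin/env bash
# Hypothetical sample data standing in for $itemsFile and $file_in.
itemsFile=$(mktemp)
file_in=$(mktemp)
printf '%s\n' apple sony samsung > "$itemsFile"
printf '%s\n' 'ref1;samsung black 2014;x' 'ref2;nokia 3310;y' 'ref3;sony xperia;z' > "$file_in"

# Read the items once, then join them into one anchored alternation:
# ^(apple|sony|samsung)
mapfile -t arrayItems < "$itemsFile"
regex="^($(IFS='|'; printf '%s' "${arrayItems[*]}"))"

matches=()
while IFS=';' read -r -a array
do
    # [[ =~ ]] is a bash builtin, so no external process is started per line.
    if [[ ${array[1]} =~ $regex ]]; then
        # BASH_REMATCH[1] holds the item that matched at the start of the field.
        matches+=("${BASH_REMATCH[1]}")
    fi
done < "$file_in"

printf '%s\n' "${matches[@]}"
rm -f "$itemsFile" "$file_in"
```

With the sample data above this prints samsung and sony, one per line. The rest of the per-line processing can stay inside the same while loop.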
  • I think you're missing the -e option for echo
    – lauhub
    Commented Nov 29, 2018 at 12:22
  • I need to check it line by line. (question code updated)
    – Kintaro
    Commented Nov 29, 2018 at 13:16
  • Yes, this is working very fast! I found it here too. Now the only thing I miss is the ^ in the regex (I edited again, sorry)
    – Kintaro
    Commented Nov 29, 2018 at 13:43
  • If you want to check whether a line starts with a pattern, you need to add ^ at the start of each line of $itemsFile. You can use sed -i 's/^/\^/g' $itemsFile. Be careful: this command modifies your file in place.
    – apapillon
    Commented Nov 29, 2018 at 13:59
  • @Kintaro Why do you need to check it line by line? This is already what grep does.
    – Kusalananda
    Commented Dec 20, 2018 at 16:43
1

There are two errors in your script:

  • grep tries to match the literal string $itemToFind because you put the pattern between single quotes ('), which prevents the shell from expanding the variable. Use double quotes instead.

  • you are using the array from index 1, while help read says that indexing starts at zero.
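
The quoting difference from the first point can be checked in isolation. This is a minimal sketch; the values assigned to itemToFind and line are sample data, and the variable names single and double are only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sample values, chosen only for illustration.
itemToFind="samsung"
line="samsung black 2014"

# Single quotes: the shell passes the text ^$itemToFind to grep unchanged,
# so grep looks for a literal "$itemToFind" at the start of the line.
single=$(printf '%s\n' "$line" | grep -o '^$itemToFind')

# Double quotes: the shell expands the variable first, so grep receives ^samsung.
double=$(printf '%s\n' "$line" | grep -o "^$itemToFind")

printf 'single=[%s] double=[%s]\n' "$single" "$double"
# prints: single=[] double=[samsung]
```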

The corrected script would look like this:

while IFS=$';' read -r -a array
do
    mapfile -t arrayItems < $itemsFile
    ## now loop through the above array
    for itemToFind in "${arrayItems[@]}"
    do
       itemFound=""
       itemFound=$(echo ${array[0]} | grep -o "$itemToFind")
       if [ -n "$itemFound" ] 
       then 
          echo $itemFound 
          # so end to search in case the item is found
          break
       fi
    done
done < $file_in

EDIT:

If you want to make it faster, you can use extended regular expressions:

grep -E 'apple|sony|samsung' $file_in

And if you want to display only brands:

grep -E 'apple|sony|samsung' $file_in | awk '{print $1}'
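
With 221 items, typing the alternation by hand is impractical, so it could be generated from the items file instead. The sketch below assumes one item per line with no regex metacharacters in the items; the sample data and the names pattern and result are illustrative only.

```shell
#!/usr/bin/env bash
# Hypothetical sample data standing in for $itemsFile and $file_in.
itemsFile=$(mktemp)
file_in=$(mktemp)
printf '%s\n' apple sony samsung > "$itemsFile"
printf '%s\n' 'ref1;samsung black 2014' 'ref2;nokia 3310' 'ref3;sony xperia' > "$file_in"

# paste -s joins all lines of the file into one; -d'|' puts a | between them.
pattern=$(paste -s -d'|' "$itemsFile")

# A single grep process scans the whole file with the combined pattern.
result=$(grep -E "$pattern" "$file_in")

printf '%s\n' "$pattern" "$result"
rm -f "$itemsFile" "$file_in"
```

Here pattern becomes apple|sony|samsung, and grep returns the ref1 and ref3 lines.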
  • I use ${array[1]} because ${array[0]} doesn't contain the data I need from the CSV. ${array[0]} contains the first field of the line (which in this case is a reference code); I need the second field (which is the item name). Plus, the outer while loop does other things on every iteration (I'm going to add some code to the question)
    – Kintaro
    Commented Nov 29, 2018 at 12:58
  • I suggest you add the line echo array0=${array[0]} array1=${array[1]} to your loop and check what happens (to me, ${array[0]} is the complete line, as read separates entries at newline characters)
    – lauhub
    Commented Nov 29, 2018 at 13:02
  • $file_in is a CSV with ; as a separator (as you can see, the first while has IFS=$';'). ${array[0]} contains the first value of the line, ${array[1]} the second, and so on. P.S. I just edited the question code.
    – Kintaro
    Commented Nov 29, 2018 at 13:10
  • @Kintaro Did changing the single quotes help?
    – lauhub
    Commented Nov 29, 2018 at 15:21
  • Yes, double quotes helped, but then I switched to the -f option
    – Kintaro
    Commented Nov 30, 2018 at 8:13
