I have the following command:

echo "column1, column2, column3" > test.csv &&
cat data.json | jq -r '. | [.["column1"], .["column2"], .["column3"]] | @csv' >> test.csv

It writes the column headings and then appends the data from data.json.

I would also like to filter the output so that it only pulls rows where column3 contains a given string, for example "abc".

I added |select(.column3| startswith ('ab')) so the full command is:

echo "column1, column2, column3" > test.csv &&
cat data.json | jq -r '. | [.["column1"], .["column2"], .["column3"]] |select(.column3| startswith ('ab')) | @csv' >> test.csv

But I get the following error:

-bash: syntax error near unexpected token `('

My data.json looks like this:

{
      "column1": "hello",
      "column2": "bye",
      "column3": "abc"
}

How do I parse column3? Not sure what I am doing wrong.

  • I added: where and how did you add it? Please post the full command that you have tried. Please post sample data from data.json, enough so others can test it. Commented Aug 8, 2020 at 20:30
  • Thanks for asking, the full command is: echo "column1, column2, column3" > test.csv && cat data.json | jq -r '. | [.["column1"], .["column2"], .["column3"]] |select(.column3| startswith ('abc')) | @csv' >> test.csv Commented Aug 8, 2020 at 20:32
  • @KamilCuk it’s hard to come up with the Json.data example though Commented Aug 8, 2020 at 20:58
  • Please give an indicative example of data.json -- one that is both illustrative and sufficient to replicate the error. Commented Aug 8, 2020 at 20:59
  • @peak I have added an example of the json.data { "column1": "hello", "column2": "bye", "column3": "abc" } Commented Aug 8, 2020 at 21:16

1 Answer

It's easier to filter before converting the object to an array for @csv:

$ (echo "column1,column2,column3";
   jq -r 'select(.column3 | startswith("ab"))
          | [.column1, .column2, .column3]
          | @csv' data.json) > test.csv
$ cat test.csv
column1,column2,column3
"hello","bye","abc"

But if you do want to convert to an array first, you then have to select using the appropriate array index:

jq -r '[.column1, .column2, .column3]
       | select(.[2] | startswith("ab"))
       | @csv' data.json
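
Both versions above write the jq string literal as "ab", with double quotes. jq string literals use JSON-style double quotes, and a single quote cannot appear inside a single-quoted shell word, so ('ab') from the question never reaches jq intact. A minimal sketch of what the shell actually passes along, using printf as a stand-in for jq (the variable names broken and fixed are only for illustration):

```shell
# "Nested" single quotes: the inner quotes end and restart the shell string,
# so they disappear before the program text is handed to the command.
broken=$(printf '%s' 'startswith ('ab')')
echo "$broken"   # startswith (ab)

# Double quotes inside the single-quoted program survive untouched,
# which is the form jq expects for a string literal.
fixed=$(printf '%s' 'startswith("ab")')
echo "$fixed"    # startswith("ab")
```

The typographic closing quote (’) in the pasted command may also play a part in the bash error; either way, keeping the whole jq program in plain single quotes with double-quoted strings inside avoids both problems.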

Note how I enclosed the echo and jq in a set of parentheses so they both run in the same subshell, with a single output redirection outside of it, instead of having to redirect the output of each command separately. This also gets rid of the Useless Use Of Cat; jq takes input filenames as arguments. Even if it didn't, input redirection would be better than cat.
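
The same single-redirection idea also works with a brace group, which runs the commands in the current shell rather than a subshell. A minimal sketch, with echo standing in for the jq call and a temporary file (the name csv is illustrative) in place of test.csv:

```shell
csv=$(mktemp)

# One redirection covers every command inside the braces.
# The second echo is a stand-in for the jq invocation.
{
  echo "column1,column2,column3"
  echo '"hello","bye","abc"'
} > "$csv"

result=$(cat "$csv")
echo "$result"
rm -f "$csv"
```

For a one-off command like this the difference between ( ... ) and { ...; } is negligible; the brace form mainly matters when the grouped commands need to set variables that should remain visible afterwards.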

4 Comments

Thank you for this! However, the values in the actual JSON I am using apparently aren't all strings as I had thought, because it gives me the following error: "startswith() requires string inputs". It's a JSON file with over a million entries, so it's hard to figure out what the values are or why I am getting that error. But your answer is super helpful and thank you again!
ah only a few rows were having that error, so that worked. Thank you!
@artemisia480 'over a million entries'? 'a few rows'? If your real input isn't a file with a single json object, your sample input in the question shouldn't be a single object either.
I agree with you, my real json is very nested and hairy, I was having a hard time trying to replicate it. I truly appreciate your answer and it is working so thank you again.
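
The "startswith() requires string inputs" error mentioned in the comments appears when column3 is null, a number, or some other non-string value in some of the entries. One way to skip those entries instead of failing is to check the type before calling startswith. A minimal sketch, assuming the input is a stream of flat objects (the sample data and the temporary file are made up for illustration, since the real input was not posted):

```shell
# Hypothetical sample input: column3 is not always a string.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
{"column1": "hello", "column2": "bye", "column3": "abc"}
{"column1": "x", "column2": "y", "column3": null}
{"column1": "p", "column2": "q", "column3": 42}
EOF

# Only call startswith on values that really are strings;
# "and" short-circuits, so non-strings are dropped without an error.
out=$(jq -r 'select(.column3 | type == "string" and startswith("ab"))
             | [.column1, .column2, .column3]
             | @csv' "$tmp")
echo "$out"
rm -f "$tmp"
```

With this sample, only the first object passes the filter. Whether silently skipping non-string values is the right behaviour depends on the real data; printing the offending entries instead may be more useful while investigating.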
