I have a data frame where each row contains JSON-formatted input. How can I parse these 3 rows into one table? I tried the following commands, but they produce an error.

a<-data.frame(v1=c('[{"ID":1,"Type":"Honda","Key":{"Service":"Destan","Name":"John"},"Fields":{"Price":23.005,"Cost":10,"DIFF":"13.005"}}]',
                   '[{"ID":2,"Type":"BMW","UpdateType":"Every hour","Key":{"Service":"Destan","Name":"Mark"},"Fields":{"Price":2.005,"Cost":1,"DIFF":"1.005"}}]',
                   '[{"ID":1,"Type":"Honda","Key":{"Service":"Destan","Name":"John"},"Fields":{"Price":13.005,"Cost":4,"DIFF":"9.005"}}]'))
a<-paste0(a$v1, collapse = ",")
l <- fromJSON(paste0(a$v1, collapse = ","))

The error is: Error in a$v1 : $ operator is invalid for atomic vectors

  • Can you please edit your question to show the error message and include which library you are using for the fromJSON function? Thank you.
    – user438383
    Commented Jan 12, 2021 at 13:49

1 Answer

If you're trying to concatenate the rows of JSON into one larger valid JSON string, I think you'll also need to wrap them in surrounding [ and ], like this:

library(jsonlite)  # provides fromJSON
library(dplyr)     # provides %>% and bind_rows
missing_array_ends <- paste0(a$v1, collapse = ',')
json_in_r <- glue::glue("[{missing_array_ends}]") %>% fromJSON()
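
As a side note on the error in the question: the line `a<-paste0(a$v1, collapse = ",")` replaces the data frame `a` with a plain character string, so the next `a$v1` fails. A minimal sketch of the same steps that keeps `a` intact is shown here; it assumes the jsonlite package, uses shortened JSON rows, swaps glue for base `paste0`, and the names `joined` and `wrapped` are illustrative only.

```r
library(jsonlite)

# Shortened stand-in for the question's data (assumption: jsonlite is the parser).
a <- data.frame(v1 = c(
  '[{"ID":1,"Type":"Honda"}]',
  '[{"ID":2,"Type":"BMW"}]'
), stringsAsFactors = FALSE)

# Store the joined string under a new name so `a` stays a data frame
# and `a$v1` keeps working afterwards.
joined <- paste0(a$v1, collapse = ",")

# Wrap the comma-separated arrays in an outer array to get one JSON document.
wrapped <- paste0("[", joined, "]")

parsed <- fromJSON(wrapped)
```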

But I think another complication is the nested structure of the JSON: Key and Fields each have multiple sub-entries. You can flatten (unlist) them if you work on each row with lapply. lapply returns a list of items that you want to be rows in a data frame, and dplyr::bind_rows combines them back into a single data frame:

lapply(a$v1, function(j) unlist(fromJSON(j))) %>%
  bind_rows

creates

# A tibble: 3 x 8
  ID    Type  Key.Service Key.Name Fields.Price Fields.Cost Fields.DIFF
  <chr> <chr> <chr>       <chr>    <chr>        <chr>       <chr>      
1 1     Honda Destan      John     23.005       10          13.005     
2 2     BMW   Destan      Mark     2.005        1           1.005      
3 1     Honda Destan      John     13.005       4           9.005      
# … with 1 more variable: UpdateType <chr>
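
Every column in that output is `<chr>`, because `unlist` coerces all values in a row to character. If numeric columns are wanted, one possible follow-up step is base R's `type.convert`. This is only a sketch, assuming jsonlite and dplyr; the `as.list()` wrapper and the names `flat` and `typed` are additions for illustration, not part of the code above.

```r
library(jsonlite)
library(dplyr)

a <- data.frame(v1 = c(
  '[{"ID":1,"Type":"Honda","Key":{"Service":"Destan","Name":"John"},"Fields":{"Price":23.005,"Cost":10,"DIFF":"13.005"}}]',
  '[{"ID":2,"Type":"BMW","UpdateType":"Every hour","Key":{"Service":"Destan","Name":"Mark"},"Fields":{"Price":2.005,"Cost":1,"DIFF":"1.005"}}]'
), stringsAsFactors = FALSE)

# One named list per row; as.list() turns the unlisted character vector
# into something bind_rows treats as a single row.
flat <- lapply(a$v1, function(j) as.list(unlist(fromJSON(j)))) %>%
  bind_rows()

# Let R guess a type for each column; as.is = TRUE keeps text columns
# as character instead of converting them to factors.
typed <- type.convert(flat, as.is = TRUE)
```
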
  • Thank you, it is working. However, imagine that there are 2 sub-JSONs inside one big JSON, for example if the last row value is '[{"ID":1,"Type":"Honda","Key":{"Service":"Destan","Name":"John"},"Fields":{"Price":13.005,"Cost":4,"DIFF":"9.005"}},{"ID":3,"Type":"Honda","Key":{"Service":"Destan","Name":"John"},"Fields":{"Price":13.005,"Cost":4,"DIFF":"1.005"}}]'. How can the last code with bind_rows work then? If that row is added, it treats the IDs as different columns.
    – Erko Tru
    Commented Jan 12, 2021 at 14:59
  • The data frame doesn't care if you have repeated parts (ID, Type, Service, Name) or even entirely duplicated rows (see duplicated() to check). lapply + bind_rows has given you a "long" format (one identifier in many rows). You can move the data to "wide" using tools like tidyr::spread or tidyr::pivot_wider. It can get tricky or confusing and might make a good question on its own.
    – Will
    Commented Jan 12, 2021 at 18:33
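
For the case raised in the comments, where one row holds several JSON objects, `unlist` on that row produces suffixed names such as ID1 and ID2, which is why they end up as separate columns. One possible alternative, sketched here under the assumption that jsonlite and dplyr are in use, is to let `fromJSON(..., flatten = TRUE)` return a data frame with one row per object and then bind those data frames. The variable name `rows` is illustrative.

```r
library(jsonlite)
library(dplyr)

a <- data.frame(v1 = c(
  '[{"ID":2,"Type":"BMW","UpdateType":"Every hour","Key":{"Service":"Destan","Name":"Mark"},"Fields":{"Price":2.005,"Cost":1,"DIFF":"1.005"}}]',
  '[{"ID":1,"Type":"Honda","Key":{"Service":"Destan","Name":"John"},"Fields":{"Price":13.005,"Cost":4,"DIFF":"9.005"}},{"ID":3,"Type":"Honda","Key":{"Service":"Destan","Name":"John"},"Fields":{"Price":13.005,"Cost":4,"DIFF":"1.005"}}]'
), stringsAsFactors = FALSE)

# flatten = TRUE expands the nested Key and Fields objects into columns
# named Key.Service, Fields.Price, and so on; each JSON object in the
# array becomes its own row.
rows <- lapply(a$v1, function(j) fromJSON(j, flatten = TRUE)) %>%
  bind_rows()
```

With this approach the column types come from the JSON itself, so the values are not all coerced to character.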
