How to parse JSON-formatted rows in a data frame in R?

Question

I have a data frame where each row contains JSON-formatted input. How can I parse these 3 rows into one table? I tried the following commands, but they produce an error.

a<-data.frame(v1=c('[{"ID":1,"Type":"Honda","Key":{"Service":"Destan","Name":"John"},"Fields":{"Price":23.005,"Cost":10,"DIFF":"13.005"}}]',
                   '[{"ID":2,"Type":"BMW","UpdateType":"Every hour","Key":{"Service":"Destan","Name":"Mark"},"Fields":{"Price":2.005,"Cost":1,"DIFF":"1.005"}}]',
                   '[{"ID":1,"Type":"Honda","Key":{"Service":"Destan","Name":"John"},"Fields":{"Price":13.005,"Cost":4,"DIFF":"9.005"}}]'))
a<-paste0(a$v1, collapse = ",")
l <- fromJSON(paste0(a$v1, collapse = ","))

The error is: Error in a$v1 : $ operator is invalid for atomic vectors

Can you please edit your question to show the error message and include which library you are using for the fromJSON function? Thank you. — user438383, Commented Jan 12, 2021 at 13:49

Will · Accepted Answer · 2021-01-12 14:30:24Z

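The error itself arises because the line a <- paste0(a$v1, collapse = ",") overwrites the data frame a with a character string, so a$v1 no longer exists on the next line. If you're trying to concatenate the rows of JSON into one larger valid JSON string, I think you'll also need surrounding [ and ]. Working from the original data frame a, like: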

library(jsonlite)  # provides fromJSON
library(dplyr)     # provides %>% and bind_rows
missing_array_ends <- paste0(a$v1, collapse=',')
json_in_r <- glue::glue("[{missing_array_ends}]") %>% fromJSON

But I think another complication is the nested structure of the JSON: Key and Fields each have multiple sub-entries. You can flatten them with unlist if you work on each row with lapply. lapply returns a list of items that you want to be rows in a data frame, and dplyr::bind_rows combines them back into a single data frame:

# Parse each row, flatten the nested result to a named vector, then stack the rows
lapply(a$v1, function(j) unlist(fromJSON(j))) %>%
  bind_rows

creates

# A tibble: 3 x 8
  ID    Type  Key.Service Key.Name Fields.Price Fields.Cost Fields.DIFF
  <chr> <chr> <chr>       <chr>    <chr>        <chr>       <chr>      
1 1     Honda Destan      John     23.005       10          13.005     
2 2     BMW   Destan      Mark     2.005        1           1.005      
3 1     Honda Destan      John     13.005       4           9.005      
# … with 1 more variable: UpdateType <chr>
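
Every column in the output above is <chr> because unlist coerces each row to a single character vector. If numeric columns are needed, one option is base R's type.convert, sketched below under some assumptions: the data is the same as in the question, as.list() is added so each row is passed to bind_rows as a named list, and the names chr_result and typed are only illustrative. Note that Fields.DIFF, which is a string in the JSON, would also become numeric this way.

```r
library(jsonlite)
library(dplyr)

# Same three rows as in the question
a <- data.frame(v1 = c(
  '[{"ID":1,"Type":"Honda","Key":{"Service":"Destan","Name":"John"},"Fields":{"Price":23.005,"Cost":10,"DIFF":"13.005"}}]',
  '[{"ID":2,"Type":"BMW","UpdateType":"Every hour","Key":{"Service":"Destan","Name":"Mark"},"Fields":{"Price":2.005,"Cost":1,"DIFF":"1.005"}}]',
  '[{"ID":1,"Type":"Honda","Key":{"Service":"Destan","Name":"John"},"Fields":{"Price":13.005,"Cost":4,"DIFF":"9.005"}}]'
), stringsAsFactors = FALSE)

# Flatten each parsed row to a named character vector, then to a named list
rows <- lapply(a$v1, function(j) as.list(unlist(fromJSON(j))))
chr_result <- bind_rows(rows)  # all columns are character at this point

# Let R guess a type for each column; as.is = TRUE keeps text as character
typed <- type.convert(chr_result, as.is = TRUE)
str(typed)
```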

edited Jan 12, 2021 at 14:30

answered Jan 12, 2021 at 14:22


Thank you, it is working. However, imagine that one JSON string contains two objects (for example, the last row's value is '[{"ID":1,"Type":"Honda","Key":{"Service":"Destan","Name":"John"},"Fields":{"Price":13.005,"Cost":4,"DIFF":"9.005"}},{"ID":3,"Type":"Honda","Key":{"Service":"Destan","Name":"John"},"Fields":{"Price":13.005,"Cost":4,"DIFF":"1.005"}}]'). How can the last code with bind_rows handle that? When such a row is added, the two IDs end up in different columns.
– Erko Tru
Commented Jan 12, 2021 at 14:59
The data frame doesn't care if you have repeated parts (ID, Type, Service, Name) or even entirely duplicated rows (see duplicated() to check). lapply + bind_rows has given you a "long" format (one identifier in many rows). You can move the data to "wide" using tools like tidyr::spread or tidyr::pivot_wider. It can get tricky and confusing, and might make a good question on its own.
– Will
Commented Jan 12, 2021 at 18:33
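
For rows whose JSON array holds more than one object, one possible approach (a sketch, not part of the original answer) is to skip unlist and let jsonlite's fromJSON flatten the nested objects itself via flatten = TRUE. Each object in an array then becomes its own row before bind_rows stacks everything. The data below reuses the first two rows from the question and takes the two-object string from the comment as the last row; the names parsed and result are only illustrative.

```r
library(jsonlite)
library(dplyr)

# The last string holds two JSON objects in one array
a <- data.frame(v1 = c(
  '[{"ID":1,"Type":"Honda","Key":{"Service":"Destan","Name":"John"},"Fields":{"Price":23.005,"Cost":10,"DIFF":"13.005"}}]',
  '[{"ID":2,"Type":"BMW","UpdateType":"Every hour","Key":{"Service":"Destan","Name":"Mark"},"Fields":{"Price":2.005,"Cost":1,"DIFF":"1.005"}}]',
  '[{"ID":1,"Type":"Honda","Key":{"Service":"Destan","Name":"John"},"Fields":{"Price":13.005,"Cost":4,"DIFF":"9.005"}},{"ID":3,"Type":"Honda","Key":{"Service":"Destan","Name":"John"},"Fields":{"Price":13.005,"Cost":4,"DIFF":"1.005"}}]'
), stringsAsFactors = FALSE)

# flatten = TRUE expands the nested Key and Fields objects into columns such as
# Key.Service and Fields.Price; each object in an array becomes one row
parsed <- lapply(a$v1, function(j) fromJSON(j, flatten = TRUE))

# Stack the per-string data frames; columns missing from a row are filled with NA
result <- bind_rows(parsed)
print(result)
```

With this input, result should have four rows: one each from the first two strings and two from the last. Because unlist is not used, the columns keep the types that fromJSON inferred instead of all becoming character.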
