There are loads of answers on this topic, but for the life of me I cannot a solution to my issue.
Say I have a JSON like
json_2_explode = [{'scalar': '43',
'units': 'm',
'parameter': [{'no_1': '45',
'no_2': '1038',
'no_3': '356'}],
'name': 'Foo'},
{'scalar': '54.1',
'units': 's',
'parameter': [{'no_1': '78',
'no_2': '103',
'no_3': '356'}],
'name': 'Yoo'},
{'scalar': '1123.1',
'units': 'Hz',
'parameter': [{'no_1': '21',
'no_2': '43',
'no_3': '3577'}],
'name': 'Baz'}]
documenting some readings for attributes Foo
, Yoo
and Baz
. For each I detail a number, that is, the value itself, some parameters, and the name.
Say this JSON is a column in a dataframe,
df = pd.DataFrame(data = {'col1': [11, 9, 23, 1],
'col2': [7, 3, 1, 12],
'col_json' : [json_2_explode,
json_2_explode,
json_2_explode,
json_2_explode]}, index=[0, 1, 2, 3])
col1 col2 col_json
0 11 7 [{'scalar': '43', 'units': 'MPa', 'parameter':...
1 9 3 [{'scalar': '43', 'units': 'MPa', 'parameter':...
2 23 1 [{'scalar': '43', 'units': 'MPa', 'parameter':...
3 1 12 [{'scalar': '43', 'units': 'MPa', 'parameter':...
The issue I have is that if I try
df = pd.json_normalize(df['col_json'].explode())
I get
scalar units parameter name
0 43 m [{'no_1': '45', 'no_2': '1038', 'no_3': '356'}] Foo
1 54.1 s [{'no_1': '78', 'no_2': '103', 'no_3': '356'}] Yoo
2 1123.1 Hz [{'no_1': '21', 'no_2': '43', 'no_3': '3577'}] Baz
3 43 m [{'no_1': '45', 'no_2': '1038', 'no_3': '356'}] Foo
4 54.1 s [{'no_1': '78', 'no_2': '103', 'no_3': '356'}] Yoo
5 1123.1 Hz [{'no_1': '21', 'no_2': '43', 'no_3': '3577'}] Baz
6 43 m [{'no_1': '45', 'no_2': '1038', 'no_3': '356'}] Foo
7 54.1 s [{'no_1': '78', 'no_2': '103', 'no_3': '356'}] Yoo
8 1123.1 Hz [{'no_1': '21', 'no_2': '43', 'no_3': '3577'}] Baz
9 43 m [{'no_1': '45', 'no_2': '1038', 'no_3': '356'}] Foo
10 54.1 s [{'no_1': '78', 'no_2': '103', 'no_3': '356'}] Yoo
11 1123.1 Hz [{'no_1': '21', 'no_2': '43', 'no_3': '3577'}] Baz
So it is exploding each JSON in 3 rows (admitteldy each JSON does contain 3 sub-dicts, so to say).
I actually would like Foo
, Yoo
and Baz
to be documented in the same row, adding columns.
Is there maybe solution not involving manually manipulating rows/piercing it as desired? I would love to see one of your fancy one-liners, thanks so much for your help