2

I am looking for a function that will take a nested dictionary, and produce the combinations / product of the values.

My query is similar to the problem specified here, but I can't seem to adapt the answers their to fit my needs: Cartesian product of nested dictionaries of lists

I wish to have an input like this:

d = {
  "country": [1, 2],
  "health_state": [
    {"healthy": [1]},
    {"breast cancer": {"stage": [1, 2]}}
  ]
}

Produce an output as follows:

[
{{"country":1},{"health state":{"healthy":1}}},
{{"country":2},{"health state":{"healthy":1}}},
{{"country":1},{"health state":{"breast cancer":{"stage":1}}}},
{{"country":1},{"health state":{"breast cancer":{"stage":2}}}},
{{"country":2},{"health state":{"breast cancer":{"stage":1}}}},
{{"country":2},{"health state":{"breast cancer":{"stage":2}}}}
]

In this example, the output is a list of 'states' that a person can occupy

  • any two elements of a list (input) should not be in the same element of the returned list (output), e.g. someone cannot be in country 1 and country 2 simultaneously
  • all keys in a dictionary (input) should be returned in the same element of the list (output), e.g. someone is in country 1 and also in a health_state. If that health state is 'breast cancer' they are also in stage 1 or stage 2

I can envision a solution that requires lots of for loops, checking whether elements are dictionaries, lists or neither, but this seems inefficient, especially for deeply nested dictionaries. I suspect there is a more elegant solution using itertools.product and recursion perhaps?

1 Answer 1

3

You can use recursion with itertools.product:

import itertools as it
d = {'country': [1, 2], 'health_state': [{'healthy': [1]}, {'breast cancer': {'stage': [1, 2]}}]}
def c_prod(d):
  if isinstance(d, list):
     for i in d:
        yield from ([i] if not isinstance(i, (dict, list)) else c_prod(i))
  else:
     for i in it.product(*map(c_prod, d.values())):
        yield dict(zip(d.keys(), i))

print(list(c_prod(d)))

Output:

[{'country': 1, 'health_state': {'healthy': 1}}, 
 {'country': 1, 'health_state': {'breast cancer': {'stage': 1}}}, 
 {'country': 1, 'health_state': {'breast cancer': {'stage': 2}}}, 
 {'country': 2, 'health_state': {'healthy': 1}}, 
 {'country': 2, 'health_state': {'breast cancer': {'stage': 1}}}, 
 {'country': 2, 'health_state': {'breast cancer': {'stage': 2}}}]

The output from the code above produces a list of dictionaries, but your desired output mirrors a list of list of dictionaries (list[list[dict]]), thus, a final transformation can be made:

r = [[{j:k} for j, k in i.items()] for i in c_prod(d)]

Output:

[[{'country': 1}, {'health_state': {'healthy': 1}}], [{'country': 1}, {'health_state': {'breast cancer': {'stage': 1}}}], [{'country': 1}, {'health_state': {'breast cancer': {'stage': 2}}}], [{'country': 2}, {'health_state': {'healthy': 1}}], [{'country': 2}, {'health_state': {'breast cancer': {'stage': 1}}}], [{'country': 2}, {'health_state': {'breast cancer': {'stage': 2}}}]]
Sign up to request clarification or add additional context in comments.

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.