How to get the Cartesian product of multiple lists

Question

How can I get the Cartesian product (every possible combination of values) from a group of lists?

For example, given

somelists = [
   [1, 2, 3],
   ['a', 'b'],
   [4, 5]
]

How do I get this?

[(1, 'a', 4), (1, 'a', 5), (1, 'b', 4), (1, 'b', 5), (2, 'a', 4), (2, 'a', 5), ...]

One common application for this technique is to avoid deeply nested loops. See Avoiding nested for loops for a more specific duplicate. Similarly, this technique might be used to "explode" a dictionary with list values; see Combine Python Dictionary Permutations into List of Dictionaries.
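
As a rough sketch of the "explode a dictionary" use mentioned above: the `explode` helper and the `options` dict are hypothetical names made up for illustration, not something from the linked question.

```python
import itertools

def explode(options):
    """Yield one dict per combination of the list values in `options`."""
    keys = list(options)
    # One positional argument per key, each holding that key's candidate values.
    for values in itertools.product(*(options[k] for k in keys)):
        yield dict(zip(keys, values))

# Hypothetical input, chosen only for illustration.
options = {"size": ["S", "M"], "color": ["red", "blue"]}
combos = list(explode(options))
print(combos)
```

Each resulting dict picks exactly one value per key, in the same order a nested loop over the keys would visit them.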

If you want a Cartesian product of the same list with itself multiple times, itertools.product can handle that elegantly. See Operation on every pair of element in a list or How can I get "permutations with repetitions" from a list (Cartesian product of a list with itself)?.
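
A small sketch of the self-product case, using the `repeat` keyword of itertools.product (the `digits` list is just an illustrative input):

```python
import itertools

digits = [0, 1]

# repeat=3 behaves like passing the same list three times.
with_repeat = list(itertools.product(digits, repeat=3))
spelled_out = list(itertools.product(digits, digits, digits))

print(with_repeat[:3])  # [(0, 0, 0), (0, 0, 1), (0, 1, 0)]
```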

Many people who already know about itertools.product struggle with the fact that it expects separate arguments for each input sequence, rather than e.g. a list of lists. The accepted answer shows how to handle this with *. However, the use of * here to unpack arguments is fundamentally no different from any other time it's used in a function call. Please see Expanding tuples into arguments for this topic (and use that instead to close duplicate questions, as appropriate).

Be aware that 'every possible combination' is not quite the same as 'Cartesian product', since in Cartesian products, duplicates are allowed.
– Kenan Banks, Commented Feb 10, 2009 at 20:08
There should be no duplicates in a Cartesian product, unless the input lists contain duplicates themselves. If you want no duplicates in the Cartesian product, apply set(inputlist) to each of your input lists, not to the result.
– CamilB, Commented Aug 24, 2017 at 8:39
Mathematically, a Cartesian product is a set, so a Cartesian product does not contain duplicates. On the other hand, itertools.product will have duplicates in the output if the inputs have duplicates. So itertools.product is not strictly speaking the Cartesian product, unless you wrap the inputs in set, as mentioned by @CamilB.
– Cameron Bieganek, Commented Dec 8, 2020 at 22:56
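
To make the point in these comments concrete, here is a minimal sketch. The `inputs` list is made up for illustration, and `dict.fromkeys` is swapped in for `set` only because it keeps the first-seen order of the elements, which `set` does not guarantee.

```python
import itertools

inputs = [[1, 1, 2], ['a', 'a']]

# Duplicate input elements produce duplicate output tuples.
raw = list(itertools.product(*inputs))

# Deduplicate each INPUT first (not the result), preserving order.
unique_inputs = [list(dict.fromkeys(lst)) for lst in inputs]
deduped = list(itertools.product(*unique_inputs))

print(len(raw), len(deduped))
```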

TylerH · Accepted Answer · 2022-11-23 19:13:16Z

673

Use itertools.product, which has been available since Python 2.6.

import itertools

somelists = [
   [1, 2, 3],
   ['a', 'b'],
   [4, 5]
]
for element in itertools.product(*somelists):
    print(element)

This is the same as:

for element in itertools.product([1, 2, 3], ['a', 'b'], [4, 5]):
    print(element)

edited Nov 23, 2022 at 19:13

TylerH

answered Feb 10, 2009 at 19:58

Kenan Banks

8 Comments

brian buck Over a year ago

Just wanted to add the '*' character is required if you use the variable somelists as provided by the OP.

jfs Over a year ago

@jaska: product() generates nitems_in_a_list ** nlists elements in the result (reduce(mul, map(len, somelists))). There is no reason to believe that yielding a single element is not O(nlists) (amortized) i.e., the time complexity is the same as for simple nested for-loops e.g., for the input in the question: nlists=3, total number of elements in the result: 3*2*2, and each element has nlists items (3 in this case).
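
The size claim in this comment can be checked with a short sketch (`math.prod` assumes Python 3.8 or newer; the variable names are only illustrative):

```python
import itertools
import math

somelists = [[1, 2, 3], ['a', 'b'], [4, 5]]

# Number of result tuples = product of the input lengths.
expected_count = math.prod(len(lst) for lst in somelists)
results = list(itertools.product(*somelists))

# Every result tuple has one item per input list.
tuple_lengths = {len(t) for t in results}
print(expected_count, len(results), tuple_lengths)
```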

Vineet Kumar Doshi Over a year ago

What is the use of * before somelists? What does it do?

Moberg Over a year ago

@VineetKumarDoshi: Here it is used to unpack a list into multiple arguments to the function call. Read more here: stackoverflow.com/questions/36901/…
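
A quick sketch of what changes with and without the asterisk, using the list from the question:

```python
import itertools

somelists = [[1, 2, 3], ['a', 'b'], [4, 5]]

# Without *: product receives ONE iterable (the outer list), so each
# result is a 1-tuple wrapping one of the inner lists.
without_star = list(itertools.product(somelists))

# With *: each inner list is passed as its own argument.
with_star = list(itertools.product(*somelists))

print(without_star)
print(with_star[:2])
```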

normanius Over a year ago

Just a detail, but note that itertools.product() can also handle generators, and not just list-like objects.
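
A minimal sketch of the generator case (the `squares` generator is a made-up example). One caveat: itertools.product consumes each input completely before it yields anything, so this only works for finite generators.

```python
import itertools

def squares(n):
    """Yield 0, 1, 4, ... for the first n integers."""
    for i in range(n):
        yield i * i

# A generator and a string both work as inputs.
pairs = list(itertools.product(squares(3), "xy"))
print(pairs)
```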

Jason Baker · Accepted Answer · 2009-02-10 19:58:31Z

127

>>> import itertools
>>> for i in itertools.product([1,2,3],['a','b'],[4,5]):
...     print(i)
...
(1, 'a', 4)
(1, 'a', 5)
(1, 'b', 4)
(1, 'b', 5)
(2, 'a', 4)
(2, 'a', 5)
(2, 'b', 4)
(2, 'b', 5)
(3, 'a', 4)
(3, 'a', 5)
(3, 'b', 4)
(3, 'b', 5)
>>>

answered Feb 10, 2009 at 19:58

Jason Baker

jfs · Accepted Answer · 2009-02-10 21:30:13Z

For Python 2.5 and older:

>>> [(a, b, c) for a in [1,2,3] for b in ['a','b'] for c in [4,5]]
[(1, 'a', 4), (1, 'a', 5), (1, 'b', 4), (1, 'b', 5), (2, 'a', 4), 
 (2, 'a', 5), (2, 'b', 4), (2, 'b', 5), (3, 'a', 4), (3, 'a', 5), 
 (3, 'b', 4), (3, 'b', 5)]

Here's a recursive version of product() (just an illustration):

def product(*args):
    if not args:
        return iter(((),)) # yield tuple()
    return (items + (item,) 
            for items in product(*args[:-1]) for item in args[-1])

Example:

>>> list(product([1,2,3], ['a','b'], [4,5])) 
[(1, 'a', 4), (1, 'a', 5), (1, 'b', 4), (1, 'b', 5), (2, 'a', 4), 
 (2, 'a', 5), (2, 'b', 4), (2, 'b', 5), (3, 'a', 4), (3, 'a', 5), 
 (3, 'b', 4), (3, 'b', 5)]
>>> list(product([1,2,3]))
[(1,), (2,), (3,)]
>>> list(product([]))
[]
>>> list(product())
[()]

The recursive version doesn't work if some of args are iterators.
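
The reason is that each input is iterated once per partial result, and an iterator is exhausted after the first pass, so the function quietly yields fewer tuples. One possible workaround, sketched here with the hypothetical names `product_materialized` and `_product_pools`, is to copy every input into a tuple up front:

```python
def product_materialized(*args):
    # Materialize each input once so it can be iterated repeatedly.
    pools = [tuple(arg) for arg in args]
    return _product_pools(pools)

def _product_pools(pools):
    if not pools:
        return iter(((),))
    return (items + (item,)
            for items in _product_pools(pools[:-1])
            for item in pools[-1])

from_iterators = list(product_materialized(iter([1, 2]), iter("ab")))
print(from_iterators)
```

The trade-off is that all inputs are held in memory at once, which is also what itertools.product does.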

Peter Mortensen · Accepted Answer · 2023-03-19 15:33:30Z

43

I would use list comprehension:

somelists = [
   [1, 2, 3],
   ['a', 'b'],
   [4, 5]
]

cart_prod = [(a,b,c) for a in somelists[0] for b in somelists[1] for c in somelists[2]]

edited Mar 19, 2023 at 15:33

Peter Mortensen

answered Nov 21, 2016 at 8:42

user1035648

3 Comments

Bằng Rikimaru Over a year ago

@llekn because the code seems to be fixed to the number of lists

Lucas Schwartz Over a year ago

@Bằng Rikimaru How is the list comprehension fixed? lst = [i for i in itertools.product(*somelists)]

Lucas Lima Over a year ago

@LucasSchwartz this answer doesn't use itertools, it uses chained list comprehension loops. Your solution is another answer, basically.

Peter Mortensen · Accepted Answer · 2023-03-19 15:28:15Z

36

With itertools.product:

import itertools
result = list(itertools.product(*somelists))

edited Mar 19, 2023 at 15:28

Peter Mortensen

answered Feb 10, 2009 at 20:01

SilentGhost

3 Comments

Vineet Kumar Doshi Over a year ago

What is the use of * before somelists?

hhh Over a year ago

@VineetKumarDoshi "product(somelists)" is a Cartesian product between the sublists, in the sense that Python first gets "[1, 2, 3]" as an element and then gets the next element after the following comma and line break, so the first product term is ([1, 2, 3],), similarly for the second ([4, 5],), and so on: "[([1, 2, 3],), ([4, 5],), ([6, 7],)]". If you want a Cartesian product between the elements inside the tuples, you need to tell Python about the tuple structure with an asterisk. For a dictionary, you use **. More here.

Solomon Ucko Over a year ago

@VineetKumarDoshi See stackoverflow.com/questions/36901/….

Anurag Uniyal · Accepted Answer · 2013-06-14 06:08:27Z

19

Here is a recursive generator, which doesn't store any temporary lists

def product(ar_list):
    if not ar_list:
        yield ()
    else:
        for a in ar_list[0]:
            for prod in product(ar_list[1:]):
                yield (a,)+prod

print(list(product([[1,2],[3,4],[5,6]])))

Output:

[(1, 3, 5), (1, 3, 6), (1, 4, 5), (1, 4, 6), (2, 3, 5), (2, 3, 6), (2, 4, 5), (2, 4, 6)]

answered Jun 14, 2013 at 6:08

Anurag Uniyal

5 Comments

Quentin Pradet Over a year ago

They're stored in the stack, though.

Anurag Uniyal Over a year ago

@QuentinPradet do you mean a generator like def f(): while True: yield 1 will keep on increasing its stack size as we go through it?

Anurag Uniyal Over a year ago

@QuentinPradet yeah, but even in this case only the stack needed for max depth, not the whole list, so in this case stack of 3

Quentin Pradet Over a year ago

It's true, sorry. A benchmark could be interesting. :)

njzk2 Over a year ago

we have yield from, now, which makes this simpler

Peter Mortensen · Accepted Answer · 2023-03-19 15:30:11Z

12

In Python 2.6 and above, you can use itertools.product. In older versions of Python you can use the following roughly equivalent code from the documentation (see the documentation for the differences), at least as a starting point:

def product(*args, **kwds):
    # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
    # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
    # list(...) keeps this working on Python 3, where map returns an iterator
    pools = list(map(tuple, args)) * kwds.get('repeat', 1)
    result = [[]]
    for pool in pools:
        result = [x+[y] for x in result for y in pool]
    for prod in result:
        yield tuple(prod)

The result of both is an iterator, so if you really need a list for further processing, use list(result).

edited Mar 19, 2023 at 15:30

Peter Mortensen

answered Feb 10, 2009 at 20:02

user3850

3 Comments

Kenan Banks Over a year ago

Per the documentation, the actual itertools.product implementation does NOT build intermediate results, which could be expensive. Using this technique could get out of hand quite quickly for moderately sized lists.
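
A small sketch of why that matters: with itertools.product you can take just the first few results of a huge product without ever building the rest (the sizes here are arbitrary).

```python
import itertools

big = range(1000)

# 1000**3 is one billion tuples in total, but only five are ever produced.
first_five = list(itertools.islice(itertools.product(big, repeat=3), 5))
print(first_five)
```

The inputs themselves are still copied into memory, but the result tuples are generated one at a time.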

user3850 Over a year ago

i can only point the OP to the documentation, not read it for him.

Kenan Banks Over a year ago

The code from the documentation is meant to demonstrate what the product function does, not as a workaround for earlier versions of Python.

Matthew Beckman · Accepted Answer · 2017-02-21 06:40:59Z

10

Although there are many answers already, I would like to share some of my thoughts:

Iterative approach

def cartesian_iterative(pools):
  result = [[]]
  for pool in pools:
    result = [x+[y] for x in result for y in pool]
  return result

Recursive Approach

def cartesian_recursive(pools):
  # Requires at least two pools, like the original version.
  pools = list(pools)  # work on a copy so the caller's list is not modified
  pools[0] = product(pools[0], pools[1])
  del pools[1]
  if len(pools) > 1:
    return cartesian_recursive(pools)
  return pools[0]  # the combined list itself, matching cartesian_iterative
def product(x, y):
  return [xx + [yy] if isinstance(xx, list) else [xx] + [yy] for xx in x for yy in y]

Lambda Approach

from functools import reduce  # reduce is not a builtin in Python 3

def cartesian_reduct(pools):
  return reduce(lambda x,y: product(x,y) , pools)

edited Feb 21, 2017 at 6:40

Matthew Beckman

answered Feb 21, 2017 at 4:03

weiyixie

7 Comments

Sachin S Over a year ago

In "Iterative Approach", why is result declared as result = [[]]? I know that it is a list of lists, but in general, even when we declare a list of lists, we use [] and not [[]].

Johnny Boy Over a year ago

I'm a bit of a newbie in terms of Pythonic solutions. Would you or some passerby please write the list comprehension in the "iterative approach" in separate loops?

Daniel Over a year ago

@SachinS you use an inner list inside the outer list because you iterate over the outer list (for x in result), and the inner list means the outer list isn't empty. If it were empty, no iteration would occur since there would be no x in 'result'. And then you add items to that list. The example is pretty much taken from the official documentation, but I daresay it's more implicit than explicit. If you were to refactor it into code based only on loops and cut out the comprehensions, as Johnny Boy is saying, then it would take quite a lot more code.
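
To illustrate both points, here is a sketch of the iterative approach written with plain loops. The function name `cartesian_loops` and its `seed` parameter are hypothetical, added only so the two starting values can be compared side by side.

```python
def cartesian_loops(pools, seed):
    """Loop-only version of the iterative approach; `seed` is the start value."""
    result = seed
    for pool in pools:
        extended = []
        for x in result:       # runs zero times if result is empty
            for y in pool:
                extended.append(x + [y])
        result = extended
    return result

pools = [[1, 2], ['a', 'b']]
with_seed = cartesian_loops(pools, [[]])   # one empty combination to extend
without_seed = cartesian_loops(pools, [])  # nothing to extend
print(with_seed)
print(without_seed)
```

Starting from [[]] gives one empty partial combination to build on; starting from [] leaves nothing to extend, so the result stays empty.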

blkpingu Over a year ago

what is pools? Is it a list of the lists I want the product of?

CyTex Over a year ago

Can someone please help to explain this line: return [xx + [yy] if isinstance(xx, list) else [xx] + [yy] for xx in x for yy in y]
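
One way to read that line is to unroll it into explicit loops. This is only a sketch; `product_expanded` is a hypothetical name for the unrolled equivalent of the answer's product(x, y).

```python
def product_expanded(x, y):
    result = []
    for xx in x:
        for yy in y:
            if isinstance(xx, list):
                # xx is already a partial combination: extend a copy of it.
                result.append(xx + [yy])
            else:
                # xx is a bare element: start a new two-item combination.
                result.append([xx] + [yy])
    return result

step1 = product_expanded([1, 2], ['a', 'b'])
step2 = product_expanded(step1, [4])
print(step1)
print(step2)
```

Note the isinstance check means this approach gets confused if the original input elements are themselves lists.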

Jai · Accepted Answer · 2024-02-20 17:32:10Z

Recursive Approach:

def rec_cart(start, array, partial, results):
  if len(partial) == len(array):
    results.append(partial)
    return 

  for element in array[start]:
    rec_cart(start+1, array, partial+[element], results)

rec_res = []
some_lists = [[1, 2, 3], ['a', 'b'], [4, 5]]  
rec_cart(0, some_lists, [], rec_res)
print(rec_res)

Iterative Approach:

def itr_cart(array):
  results = [[]]
  for i in range(len(array)):
    temp_res = []
    for res in results:
      for element in array[i]:
        temp_res.append(res+[element])
    results = temp_res
  
  return results

some_lists = [[1, 2, 3], ['a', 'b'], [4, 5]]  
itr_res = itr_cart(some_lists)
print(itr_res)

user3759769 · Accepted Answer · 2018-01-11 22:56:55Z

A minor modification to the above recursive generator solution in variadic flavor:

def product_args(*args):
    if args:
        for a in args[0]:
            for prod in product_args(*args[1:]) if args[1:] else ((),):
                yield (a,) + prod

And of course a wrapper which makes it work exactly the same as that solution:

def product2(ar_list):
    """
    >>> list(product(()))
    [()]
    >>> list(product2(()))
    []
    """
    return product_args(*ar_list)

with one trade-off: it checks if recursion should break upon each outer loop, and one gain: no yield upon an empty call, e.g. product(()), which I suppose would be semantically more correct (see the doctest).

Regarding list comprehension: the mathematical definition applies to an arbitrary number of arguments, while list comprehension could only deal with a known number of them.

Jack Taylor · Accepted Answer · 2022-11-29 01:39:00Z

In 99% of cases you should use itertools.product. It is written in efficient C code, so it is probably going to be better than any custom implementation.

In the 1% of cases that you need a Python-only algorithm (for example, if you need to modify it somehow), you can use the code below.

def product(*args, repeat=1):
    """Find the Cartesian product of the arguments.

    The interface is identical to itertools.product.
    """
    # Initialize data structures and handle bad input
    if len(args) == 0:
        yield () # Match behavior of itertools.product
        return
    gears = [tuple(arg) for arg in args] * repeat
    for gear in gears:
        if len(gear) == 0:
            return
    tooth_numbers = [0] * len(gears)
    result = [gear[0] for gear in gears]

    # Rotate through all gears
    last_gear_number = len(gears) - 1
    finished = False
    while not finished:
        yield tuple(result)

        # Get next result
        gear_number = last_gear_number
        while gear_number >= 0:
            gear = gears[gear_number]
            tooth_number = tooth_numbers[gear_number] + 1
            if tooth_number < len(gear):
                # No gear change is necessary, so exit the loop
                result[gear_number] = gear[tooth_number]
                tooth_numbers[gear_number] = tooth_number
                break
            result[gear_number] = gear[0]
            tooth_numbers[gear_number] = 0
            gear_number -= 1
        else:
            # We changed all the gears, so we are back at the beginning
            finished = True

The interface is the same as for itertools.product. For example:

>>> list(product((1, 2), "ab"))
[(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b')]

This algorithm has the following advantages over other Python-only solutions on this page:

It does not build up intermediate results in memory, keeping the memory footprint small.
It uses iteration instead of recursion, meaning you will not get "maximum recursion depth exceeded" errors.
It can accept any number of input iterables, making it more flexible than using nested for loops.

This code is based on the itertools.product algorithm from PyPy, which is released under the MIT licence.

I also like this one because it is the only one so far that can be easily modified to stream the answers without materializing iterators passed in. So if you are doing the product of a bunch of really big (or infinite, or expensive) iterators and you might stop before the end, you only have to materialize as much as you need. I added one approach in a post below

Peter Mortensen · Accepted Answer · 2023-03-19 15:43:19Z

3

You can use itertools.product in the standard library to get the Cartesian product. Other cool, related utilities in itertools include permutations, combinations, and combinations_with_replacement. Here is a link to a Python CodePen for the snippet below:

from itertools import product

somelists = [
   [1, 2, 3],
   ['a', 'b'],
   [4, 5]
]

result = list(product(*somelists))
print(result)
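
For comparison, a brief sketch of the other itertools utilities mentioned above (the `items` list is just an illustrative input):

```python
from itertools import permutations, combinations, combinations_with_replacement

items = ['a', 'b', 'c']

perms = list(permutations(items, 2))                      # order matters, no reuse
combs = list(combinations(items, 2))                      # order ignored, no reuse
combs_wr = list(combinations_with_replacement(items, 2))  # order ignored, reuse allowed

print(perms)
print(combs)
print(combs_wr)
```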

edited Mar 19, 2023 at 15:43

Peter Mortensen

answered Dec 12, 2020 at 21:42

chriskoch

Peter Mortensen · Accepted Answer · 2023-03-19 15:31:47Z

2

Just to add a bit to what has already been said: if you use SymPy, you can use symbols rather than strings which makes them mathematically useful.

import itertools
import sympy

x, y = sympy.symbols('x y')

somelist = [[x,y], [1,2,3], [4,5]]
somelist2 = [[1,2], [1,2,3], [4,5]]

for element in itertools.product(*somelist):
  print(element)

About SymPy.

edited Mar 19, 2023 at 15:31

Peter Mortensen

answered Oct 29, 2015 at 22:28

Tyler Heers

Eric Zinda · Accepted Answer · 2023-04-18 17:21:46Z

I like Jack Taylor's implementation above because it is the only one so far that can be easily modified to stream the answers without materializing iterators passed in. So, if you are doing the product of a bunch of really big (or infinite, or expensive) iterators and you might stop before the end, you only have to materialize as much as you need.

Here's one approach to modifying it for that:

def product_stream(*args, repeat=1):
    """Find the Cartesian product of the arguments.

    The interface is identical to itertools.product.
    """
    def index_from_stream(array_stream, index):
        try:
            while index >= len(array_stream[0]):
                next_element = next(array_stream[1])
                array_stream[0].append(next_element)

            return True, array_stream[0][index]

        except StopIteration:
            return False, None

    # Initialize data structures and handle bad input
    if len(args) == 0:
        # Match behavior of itertools.product
        yield ()
        return

    # iter() lets plain lists and tuples be passed in, not only iterators
    gears = [([], iter(arg)) for arg in args] * repeat
    for gear in gears:
        if not index_from_stream(gear, 0)[0]:
            return

    tooth_numbers = [0] * len(gears)
    result = [index_from_stream(gear, 0)[1] for gear in gears]

    # Rotate through all gears
    last_gear_number = len(gears) - 1
    finished = False
    while not finished:
        yield tuple(result)

        # Get next result
        gear_number = last_gear_number
        while gear_number >= 0:
            gear = gears[gear_number]
            tooth_number = tooth_numbers[gear_number] + 1
            has_tooth, gear_tooth_value = index_from_stream(gear, tooth_number)
            if has_tooth:
                # No gear change is necessary, so exit the loop
                result[gear_number] = gear_tooth_value
                tooth_numbers[gear_number] = tooth_number
                break

            _, result[gear_number] = index_from_stream(gear, 0)
            tooth_numbers[gear_number] = 0
            gear_number -= 1

        else:
            # We changed all the gears, so we are back at the beginning
            finished = True

Lucas Schwartz · Accepted Answer · 2021-12-12 01:36:54Z

1

List comprehension is simple and clean:

import itertools

somelists = [
   [1, 2, 3],
   ['a', 'b'],
   [4, 5]
]
lst = [i for i in itertools.product(*somelists)]

answered Dec 12, 2021 at 1:36

Lucas Schwartz

Sergio Polimante · Accepted Answer · 2022-05-06 14:15:56Z

1

This can be done as

[(x, y) for x in range(10) for y in range(10)]

another variable? No problem:

[(x, y, z) for x in range(10) for y in range(10) for z in range(10)]

edited May 6, 2022 at 14:15

answered Sep 16, 2021 at 23:13

Sergio Polimante

Peter Mortensen · Accepted Answer · 2023-03-19 15:49:29Z

1

If you want to reimplement it yourself, you can try with recursion. Something as simple as:

def product(cats, prefix = ()):
  if not cats:
    yield prefix
  else:
    head, *tail = cats
    for cat in head:
      yield from product(tail, prefix + (cat,))

is a working start.

The recursion depth is how many lists of categories you have.

edited Mar 19, 2023 at 15:49

Peter Mortensen

answered Mar 8, 2023 at 12:49

njzk2

Richard Samuelson · Accepted Answer · 2020-01-04 10:18:24Z

0

I believe this works:

def cartesian_product(L):  
   if L:
       return {(a,) + b for a in L[0] 
                        for b in cartesian_product(L[1:])}
   else:
       return {()}

edited Jan 4, 2020 at 10:18

answered Jan 4, 2020 at 7:26

Richard Samuelson

Peter Mortensen · Accepted Answer · 2023-03-19 15:47:12Z

The following code is a 95% copy from Using NumPy to build an array of all combinations of two arrays; all credits go there! This is said to be much faster since it is only in NumPy.

import numpy as np

def cartesian(arrays, dtype=None, out=None):
    arrays = [np.asarray(x) for x in arrays]
    if dtype is None:
        dtype = arrays[0].dtype
    n = np.prod([x.size for x in arrays])
    if out is None:
        out = np.zeros([n, len(arrays)], dtype=dtype)

    m = int(n / arrays[0].size)
    out[:,0] = np.repeat(arrays[0], m)
    if arrays[1:]:
        cartesian(arrays[1:], out=out[0:m, 1:])
        for j in range(1, arrays[0].size):
            out[j*m:(j+1)*m, 1:] = out[0:m, 1:]
    return out

You need to define the dtype as a parameter if you do not want to take the dtype from the first entry for all entries. Take dtype = 'object' if you have letters and numbers as items. Test:

somelists = [
   [1, 2, 3],
   ['a', 'b'],
   [4, 5]
]

[tuple(x) for x in cartesian(somelists, 'object')]

Out:

[(1, 'a', 4),
 (1, 'a', 5),
 (1, 'b', 4),
 (1, 'b', 5),
 (2, 'a', 4),
 (2, 'a', 5),
 (2, 'b', 4),
 (2, 'b', 5),
 (3, 'a', 4),
 (3, 'a', 5),
 (3, 'b', 4),
 (3, 'b', 5)]
