1

I have a long input list, and I want to generate combinations of 3 elements. But each combination's elements should be drawn from a length-5 sublist of the original list, rather than drawing elements from the whole list.

Equivalently, no two elements of a combination can be spaced more than 4 positions apart in the original list.

I wrote this code:

import itertools

input = [
    'a','b','c','d','e','f','g','h','i','j'
]
l=0
space = ' '
count = 3
run = 5

def combine_list(input_list, count, run):
    list_out = []
    j=0
    k = (len(input)-run)+1
    while j < k:
        comb_set = input_list[j:run+j:]
        comb = itertools.combinations(comb_set,count)
        for i in comb:
            out = space.join(list(i))
            list_out += [out]
        j += 1
    return list_out

x = combine_list(input,count, run)

print(x, len(x))

with this output:

['a b c', 'a b d', 'a b e', 'a c d', 'a c e', 'a d e', 'b c d', 'b c e', 'b d e', 'c d e', 'b c d', 'b c e', 'b c f', 'b d e', 'b d f', 'b e f', 'c d e', 'c d f', 'c e f', 'd e f', 'c d e', 'c d f', 'c d g', 'c e f', 'c e g', 'c f g', 'd e f', 'd e g', 'd f g', 'e f g', 'd e f', 'd e g', 'd e h', 'd f g', 'd f h', 'd g h', 'e f g', 'e f h', 'e g h', 'f g h', 'e f g', 'e f h', 'e f i', 'e g h', 'e g i', 'e h i', 'f g h', 'f g i', 'f h i', 'g h i', 'f g h', 'f g i', 'f g j', 'f h i', 'f h j', 'f i j', 'g h i', 'g h j', 'g i j', 'h i j'] 60

But this produces duplicates. I can sort to remove them, but I would rather not make them in the first place.

1
  • In addition to what is explained in the answers, I feel I should point out that input is a standard Python function, so you should avoid using that as a variable, in case you might want to use input() somewhere else in cour code. Commented Apr 17, 2025 at 3:51

2 Answers 2

2

Rather than generating all length-3 combinations of length-5 sublists of the list, go over each element of the list and generate all valid length-3 combinations that start with a particular element.

To do that for an element, take all length-2 combinations of the next 4 elements, and stick the current element in front of the combination.

When you near the end of the input, you'll reach a point where there aren't 4 next elements. Just generate all length-3 combinations of what's left at that point.

import collections
import itertools

def restricted_combinations(elements, window_length, combination_length):
    elements = iter(elements)
    pool = collections.deque(itertools.islice(elements, 0, window_length-1))

    for next_elem in elements:
        pool.append(next_elem)
        start = pool.popleft()
        for tail in itertools.combinations(pool, combination_length - 1):
            yield (start, *tail)

    yield from itertools.combinations(pool, combination_length)
Sign up to request clarification or add additional context in comments.

3 Comments

Yes, that is the way. "Fix" the first element, and consider any 2 other elements not further than 5-1 elements away, to avoid duplicates and produce all allowed subsequences. You can even do that quite easily "by hand", without itertools.
Since you support any iterables, not just lists, maybe rename sublist_length to window_length.
@Robert: Ah, that's a good name.
2

Like user2357112's, but fixing the end element is simpler (Attempt This Online!, with testing):

import collections
import itertools

def restricted_combinations(elements, window_length, combination_length):
    pool = collections.deque(maxlen=window_length-1)
    for elem in elements:
        for head in itertools.combinations(pool, combination_length - 1):
            yield (*head, elem)
        pool.append(elem)

Or since the input is a list, which we can slice (fixing the start instead):

import itertools

def restricted_combinations(elements, window_length, combination_length):
    for i in range(len(elements)):
        start, *pool = elements[i : i+window_length]
        for tail in itertools.combinations(pool, combination_length - 1):
            yield (start, *tail)

Attempt This Online!

3 Comments

That indeed seems simpler. My code produces output in lexicographic order, while this approach doesn't, but most use cases won't care about that. (I wasn't deliberately going for that when I wrote my code.)
@user2357112 Yeah I had thought about which would be preferable, but didn't come to a conclusion. Anyway, added another solution with start fixed instead.
Thanks for the help. Yes order is important for later steps. only really for tracing back over method.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.