Great first question!
I've been taking a crack at reducing the runtime for this, with some success so far: I'm averaging a ~21.5% performance improvement.
Here is what I suggest so far:
1.) Pre-sorting the expansions
By pre-sorting the expansions using a heap, we make the subsequent calls to canonical, and its use of sorted, more efficient. This was the most noticeable performance increase. I chose a heap because it's a good fit when you build a collection by appending to an initially empty list, which is exactly what possible_expansions does with expansions: each heappush costs only O(log n), so building the whole heap stays at O(n log n).
Your code:
# Inside of `possible_expansions`
for p in positions:
    if p not in piece_set:
        expansions.append(piece + [p])

# Inside of `expand_pieces`
for piece in pieces:
    for e in possible_expansions(piece):
        exp = canonical(e)
My change:
# At the top of the file
from heapq import heappush, heappop

# Inside of `possible_expansions`
for p in positions:
    if p not in piece_set:
        heappush(expansions, piece + [p])

# Inside of `expand_pieces`
exps = possible_expansions(piece)
for _ in range(len(exps)):  # Use _ to show that we aren't using the loop var
    e = heappop(exps)
    exp = canonical(e)
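As a quick illustration of what the heap buys us (a standalone toy, not the original code): lists of (x, y) tuples compare lexicographically, so heappop always surfaces the smallest remaining expansion, and expand_pieces consumes them in ascending order:

from heapq import heappush, heappop

expansions = []
for piece in [[(1, 0), (2, 0)], [(0, 0), (1, 0)], [(0, 1), (1, 1)]]:
    heappush(expansions, piece)  # O(log n) per push

while expansions:
    print(heappop(expansions))
# [(0, 0), (1, 0)]
# [(0, 1), (1, 1)]
# [(1, 0), (2, 0)]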
2.) Reducing loop copies
Originally, canonical makes two near-identical passes over piece, one generator expression each for min_x and min_y:
def canonical(piece):
    min_x = min(x for (x, y) in piece)
    min_y = min(y for (x, y) in piece)
    res = sorted((x - min_x, y - min_y) for (x, y) in piece)
    return res
By replacing the two generator expressions with an explicit for loop, we can fold both minimum searches into a single, still readable pass, halving the iterations spent finding the minimums:
def canonical(piece):
    min_x = float('inf')
    min_y = min_x
    for (x, y) in piece:
        if x < min_x:
            min_x = x
        if y < min_y:
            min_y = y
    res = sorted((x - min_x, y - min_y) for (x, y) in piece)
    return res
Since we got rid of the calls to min, we have to track the running minimums for x and y ourselves. Initializing min_x and min_y to float('inf') guarantees that the very first cell replaces them, even if the coordinates eventually grow extremely large.
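If pieces are guaranteed to be non-empty, you could instead seed the minimums from the first cell and keep everything in integers; a small sketch of that alternative (it assumes piece is a non-empty list of (x, y) tuples):

def canonical(piece):
    # Seed from the first cell so the minimums stay ints
    # (assumes piece is non-empty)
    min_x, min_y = piece[0]
    for (x, y) in piece[1:]:
        if x < min_x:
            min_x = x
        if y < min_y:
            min_y = y
    return sorted((x - min_x, y - min_y) for (x, y) in piece)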
2.5) One more loop removal
On line 52 inside of expand_pieces, you check whether the current piece, exp, contains an x or y greater than or equal to 8:
if (max(x for (x, y) in exp) >= 8) or (max(y for (x, y) in exp) >= 8):
We can drop one of the two passes entirely and replace max with any in the other, further reducing total iterations:
if any([(x > 7 or y > 7) for (x, y) in exp]):
I made another simplification by changing >= 8 to > 7, but this is more of a preference than anything.
Because the list comprehension inside any builds the entire list before any ever looks at it, it can't stop early. We can replace it with an explicit for loop and break out as soon as we find a cell that is out of bounds:
too_big = False
for (x, y) in exp:
    # After `canonical`, the minimums are shifted to 0, so the
    # negative checks are purely defensive here
    if (x > 7) or (y > 7) or (x < -7) or (y < -7):
        too_big = True
        break
if too_big:
    continue
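Alternatively, dropping the square brackets gives any a generator expression, which short-circuits just like the break above. Here is that version wrapped in a helper; fits_board is a hypothetical name of mine, not something from the original code:

def fits_board(piece, limit=7):
    # all() stops at the first out-of-range cell, like the loop above
    return all(-limit <= x <= limit and -limit <= y <= limit
               for (x, y) in piece)

# Inside of `expand_pieces`
if not fits_board(exp):
    continue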
2.75) Don't add bad expansions
In the first for loop of possible_expansions, we can reject positions that fall outside the board before they are ever added:
for (x, y) in piece:
    for dx, dy in neighbor_offsets:
        if abs(x + dx) > 7 or abs(y + dy) > 7:
            continue
        positions.add((x + dx, y + dy))
3.) Preventing re-initialization of constant variables
neighbor_offsets never changes but is initialized every time possible_expansions is called. You can move the declaration and initialization of neighbor_offsets to the global scope, just above the definition of possible_expansions.
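A sketch of that move (the four orthogonal offsets here are my assumption, since the original list isn't shown; keep whatever neighbor_offsets actually contains):

# Global scope: built once at import time instead of on every call.
# A tuple also protects against accidental mutation.
neighbor_offsets = ((1, 0), (-1, 0), (0, 1), (0, -1))

def possible_expansions(piece):
    ...  # body unchanged; it now reads the module-level constant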
Output sample comparison:
Number of Polyomino pieces up to size 63

| Blocks | Pieces | Output as-is (s) | Output after optimizations (s) |
| ---: | ---: | ---: | ---: |
| 10 | 4645 | 0.36945559999730904 | 0.28241100000741426 |
| 11 | 16929 | 1.572818599990569 | 1.208543800006737 |
| 12 | 62149 | 6.77434189998894 | 5.203064699991955 |
| 13 | 226701 | 28.675560999996378 | 22.347502399992663 |
| 14 | 819508 | 116.19935209999676 | 91.290519400005 |
I will edit this if I find anything else!