I got nerd-sniped in Discord with this one. No working code yet, too distracted (maybe later, or someone else will need to translate it to Python), just some rough sketches below; the text is more suitable to a random post than a formal review. At a glance (and wholly ignoring the idea of saving anything to disk):

  • At the surface level, there is a lot of unnecessary data generation. possible_expansions() generates four tuples to check for every pixel present in the piece. Those get collapsed by a set, but the set check would be unnecessary with a bit of extra checking up front.
  • Data structures are regenerated whenever they are needed instead of being cached. Memory usage could balloon quickly if a full set were stored with each piece, though, so something more compact (see below) would probably have to be used.
  • Python's in on a set is a hash lookup (amortized O(1)), not a linear scan — a list would give you the linear behavior. Even so, hashing tuples of coordinates has real per-lookup overhead, so for this problem a custom representation would be faster, especially if you stopped using tuples altogether.
  • Keeping track of the bounds of pieces, and maybe using a separate lookup table per bounding-box size, could reduce the number of lookups and normalizations needed to check for existing pieces. Knowing bounds could also cut the search space roughly in half: only square-or-wide (or, equivalently, square-or-tall) bounding boxes need to be kept, since any piece in one set has a transposed equivalent in the other. Normalizing bounds would only be needed when expanding left or up, and expansions could be constrained so the set of w×h pieces only produces (w+1)×h or w×(h+1) pieces. (Bounds are cheap to compute on a bit set; see the last sketch after this list.)
  • The specific problem is to generate all polyominoes that fit within an 8x8 area. That means pieces of up to n=64 cells, and generating pieces that large in arbitrary, unbounded coordinate space is massively inefficient, as the question describes. An 8x8 area has exactly 64 possible pixel positions, all of them fixed, so a piece should be a bit set (one bit per cell) instead of a set of tuples — see the first sketch after this list.
  • Once you're in the land of bit sets, hash or tree lookups for pieces that already exist get cheap: the difference check becomes a single 64-bit integer comparison instead of an O(n^2) loop through all positions in two sets ("does any piece equal this integer" as opposed to "does any piece have all positions matching all of the new positions").
  • Bit sets also make transformations cheap. A vertical flip is just reversing the byte order of the 64-bit value (one byte-swap instruction, or a handful of shifts and masks), and a horizontal flip reverses the bits within each row byte, which is easy with a 256-entry lookup table; see the flip sketch after this list.
  • Similarly, the positions to expand into (or the functions used to expand) can be put in per-direction lookup tables, indexing by masking out row bytes for the horizontals and a few or/shift steps for the verticals — or computed outright with four whole-board shifts, which also removes the per-pixel tuple generation from the first point; see the expansion sketch below. You could maybe even do some clever hackery by including two rows in the mask/shift and only doing vertical expansions where a horizontal one would not have happened.
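
To make the bit-set idea concrete, a minimal sketch — in Rust rather than Python, since the trick is easier to show with a fixed-width integer, and with Piece and cell as made-up names for illustration. A piece is a u64 with bit row * 8 + col set for each occupied cell, and "have we seen this piece?" becomes one hash lookup on that integer:

    use std::collections::HashSet;

    // A polyomino on the 8x8 board as a 64-bit mask:
    // bit (row * 8 + col) is set when that cell is occupied.
    type Piece = u64;

    // Mask for a single cell; row and col are in 0..8.
    fn cell(row: u32, col: u32) -> Piece {
        1u64 << (row * 8 + col)
    }

    fn main() {
        // The L-tromino occupying (0,0), (1,0), (1,1).
        let tromino: Piece = cell(0, 0) | cell(1, 0) | cell(1, 1);

        // "Does any piece match this?" is a single integer hash lookup,
        // not a position-by-position comparison of two tuple sets.
        let mut seen: HashSet<Piece> = HashSet::new();
        assert!(seen.insert(tromino));  // newly discovered
        assert!(!seen.insert(tromino)); // exact duplicate, rejected cheaply
    }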
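
The flips, sketched under the same representation. Reversing the byte order of the u64 reverses the rows (the vertical flip), and reversing the bits within each row byte is the horizontal flip, done here with the 256-entry table mentioned above:

    // Vertical flip: reversing byte order reverses the rows.
    fn flip_vertical(piece: u64) -> u64 {
        piece.swap_bytes()
    }

    // 256-entry table mapping a byte to its bit-reversed value,
    // built once at compile time.
    const REVERSE: [u8; 256] = {
        let mut table = [0u8; 256];
        let mut i = 0;
        while i < 256 {
            table[i] = (i as u8).reverse_bits();
            i += 1;
        }
        table
    };

    // Horizontal flip: reverse the bits of each row byte via the table.
    fn flip_horizontal(piece: u64) -> u64 {
        let mut out = 0u64;
        for row in 0..8 {
            let byte = ((piece >> (row * 8)) & 0xFF) as usize;
            out |= (REVERSE[byte] as u64) << (row * 8);
        }
        out
    }

    fn main() {
        let tromino: u64 = 0b11 | (0b01 << 8); // cells (0,0), (0,1), (1,0)
        // Both flips are involutions: applying one twice is a no-op.
        assert_eq!(flip_vertical(flip_vertical(tromino)), tromino);
        assert_eq!(flip_horizontal(flip_horizontal(tromino)), tromino);
    }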
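
The shift-based expansion, again as a sketch (expansion_cells is a made-up name; the column masks keep horizontal shifts from wrapping across row boundaries). Four whole-board shifts produce every empty cell adjacent to the piece, already deduplicated, with no per-pixel tuples:

    // Masks that clear column 0 / column 7 (bit layout: row * 8 + col),
    // so horizontal shifts can't wrap into the neighboring row.
    const NOT_COL_0: u64 = 0xFEFE_FEFE_FEFE_FEFE;
    const NOT_COL_7: u64 = 0x7F7F_7F7F_7F7F_7F7F;

    // Every empty cell orthogonally adjacent to the piece.
    fn expansion_cells(piece: u64) -> u64 {
        let up = piece >> 8;
        let down = piece << 8;
        let left = (piece >> 1) & NOT_COL_7;
        let right = (piece << 1) & NOT_COL_0;
        (up | down | left | right) & !piece
    }

    fn main() {
        let domino: u64 = 0b11; // cells (0,0) and (0,1)
        let mut frontier = expansion_cells(domino);
        while frontier != 0 {
            let bit = frontier & frontier.wrapping_neg(); // lowest candidate
            println!("candidate piece: {:#018x}", domino | bit);
            frontier &= frontier - 1; // clear it and move on
        }
    }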
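
And the bounds, which fall out of the bit layout almost for free for a normalized piece (one touching row 0 and column 0). This is a sketch, with the caveat that actually transposing tall pieces into wide ones needs a short bit-matrix transpose routine not shown here:

    // Width: OR every row into one byte, then locate the highest column.
    fn width(piece: u64) -> u32 {
        let mut cols = piece;
        cols |= cols >> 32;
        cols |= cols >> 16;
        cols |= cols >> 8;
        8 - (cols as u8).leading_zeros()
    }

    // Height: index of the highest non-empty row byte, plus one.
    fn height(piece: u64) -> u32 {
        debug_assert!(piece != 0);
        8 - piece.leading_zeros() / 8
    }

    fn main() {
        let vertical_tromino: u64 = 1 | (1 << 8) | (1 << 16); // 1 wide, 3 tall
        assert_eq!(width(vertical_tromino), 1);
        assert_eq!(height(vertical_tromino), 3);
    }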