Find the shape of the n-dim array that would be formed from a nested list of lists with variable lengths, if we were to pad the lists to the same length at each nest level. E.g.
ls = [[1],
      [2, 3],
      [4, 5, 6]]
# (3, 3) because
ls_padded = [[1, 0, 0],
             [2, 3, 0],
             [4, 5, 6]]
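For reference, the padding illustrated above can be sketched for the 2D case as follows. This is only an illustration of the intended result, not a fast solution; the function name `pad_2d` and the `fill` and `side` parameters are hypothetical names chosen for this example.

```python
def pad_2d(ls, fill=0, side="right"):
    """Pad each inner list of `ls` to the length of the longest inner list.

    `side` (hypothetical parameter) selects whether the fill values go
    to the right or to the left of the existing values.
    """
    if side not in ("left", "right"):
        raise ValueError("side must be 'left' or 'right'")
    # Target width is the longest inner list (0 if `ls` is empty).
    width = max((len(row) for row in ls), default=0)
    padded = []
    for row in ls:
        filler = [fill] * (width - len(row))
        if side == "right":
            padded.append(list(row) + filler)
        else:
            padded.append(filler + list(row))
    return padded


ls = [[1],
      [2, 3],
      [4, 5, 6]]
ls_padded = pad_2d(ls)                     # pad from the right
ls_padded_left = pad_2d(ls, side="left")   # pad from the left
```

The shape of the padded result is then `(len(ls_padded), len(ls_padded[0]))`, which is what the question wants to obtain without actually building the padded list.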
Attempt
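import itertools

def find_shape(seq):
    try:
        len_ = len(seq)
    except TypeError:
        # not a sequence: contributes no further dims
        return ()
    shapes = [find_shape(subseq) for subseq in seq]
    # per-dim max over all sub-shapes; shorter shapes are filled with 1
    return (len_,) + tuple(max(sizes) for sizes in
                           itertools.zip_longest(*shapes, fillvalue=1))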
Problem
Too slow for large arrays. Can it be done faster? Test & bench code.
The solution shouldn't require the values at the final nest depth, only basic attributes (e.g. len), as that depth in the application contains 1D arrays on GPU (and accessing values moves them back to CPU). It must work on n-dim arrays (dim at least up to 5), with potentially variable lengths at each nest level.
The exact goal is to attain such padding, but with a choice of padding from the left or from the right. The data structure is a list of lists that's to form a 5D array, and only the final nest level contains non-lists, which are 1D arrays. The first two nest levels have fixed list lengths (can directly form an array), and the 1D arrays are all of the same length, so the only uncertainty is on the 3rd and 4th dims.
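One possible direction, given the constraints above, is a minimal sketch that assumes the number of list levels is known in advance (4 for the 5D case described). It walks the structure one level at a time and only ever calls `len` on the leaves, so the 1D arrays are never iterated or indexed. The name `find_shape_known_depth`, the parameter `n_list_levels`, and the `Leaf` stand-in class are all hypothetical; no speed claim is made, since this has not been benchmarked against the recursive attempt.

```python
def find_shape_known_depth(seq, n_list_levels):
    """Padded shape of `seq`, using only `len` on the leaves.

    `n_list_levels` (hypothetical parameter) is how many nested list
    levels there are; everything below them is treated as a leaf that
    supports `len` and is never iterated.
    """
    shape = []
    level = [seq]  # all containers at the current nest level
    for _ in range(n_list_levels):
        # Padded size of this dim is the longest container at this level.
        shape.append(max((len(s) for s in level), default=0))
        # Descend: gather every child of every container (lists only).
        level = [sub for s in level for sub in s]
    # `level` now holds the leaves; query their length, never their values.
    shape.append(max((len(leaf) for leaf in level), default=0))
    return tuple(shape)


class Leaf:
    """Stand-in for a 1D GPU array: has a length, forbids value access."""
    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __iter__(self):
        raise RuntimeError("leaf values must not be accessed")

    def __getitem__(self, idx):
        raise RuntimeError("leaf values must not be accessed")


# Three list levels with variable lengths, leaves of length 4.
seq = [[[Leaf(4)], [Leaf(4), Leaf(4)]],
       [[Leaf(4), Leaf(4), Leaf(4)]]]
shape = find_shape_known_depth(seq, n_list_levels=3)
```

For the example from the top of the question, the inner lists can play the role of leaves: `find_shape_known_depth([[1], [2, 3], [4, 5, 6]], n_list_levels=1)` yields a two-dim shape from the outer length and the longest inner length. If the first two nest levels are known to be fixed, their sizes could be read directly with `len` instead of taking a maximum, but the sketch does not exploit that.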