Find the shape of the n-dim array that would be formed from a nested list of lists with variable lengths, if the lists were padded to the same length at each nest level. E.g.

ls = [[1],
      [2, 3],
      [4, 5, 6]]
# (3, 3) because
ls_padded = [[1, 0, 0],
             [2, 3, 0],
             [4, 5, 6]]

Attempt

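import itertools

def find_shape(seq):
    # Anything without len() (e.g. a scalar) is a leaf and adds no dims.
    try:
        len_ = len(seq)
    except TypeError:
        return ()
    # Shape of every element, then the per-dim max across all elements.
    shapes = [find_shape(subseq) for subseq in seq]
    return (len_,) + tuple(max(sizes) for sizes in
                           itertools.zip_longest(*shapes, fillvalue=1))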

Problem

Too slow for large arrays. Can it be done faster? Test & bench code.

The solution shouldn't require reading values at the final nest depth, only basic attributes (e.g. len), since in the application that depth contains 1D arrays on the GPU (and accessing their values moves them back to the CPU). It must work on n-dim arrays.
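To make the "only len, never values" constraint concrete, here is a minimal sketch of a level-by-level shape finder. It is not benchmarked and may or may not be faster than the attempt above. The names find_shape_by_len and OpaqueLeaf are hypothetical: OpaqueLeaf only stands in for a 1D GPU array and raises if its values are read, and the sketch assumes non-list leaves all sit at the same nest depth.

```python
class OpaqueLeaf:
    """Hypothetical stand-in for a 1D GPU array: len() works, values do not."""

    def __init__(self, n):
        self._n = n

    def __len__(self):
        return self._n

    def __iter__(self):
        raise RuntimeError("leaf values were accessed")

    def __getitem__(self, index):
        raise RuntimeError("leaf values were accessed")


def find_shape_by_len(seq):
    """Padded shape of a nested list, using only len() on every node."""
    shape = []
    level = [seq]  # all nodes at the current nest depth
    while level:
        containers = [x for x in level if isinstance(x, (list, tuple))]
        if not containers:
            # Leaves reached: add one trailing dim if they expose len().
            leaf_lens = [len(x) for x in level if hasattr(x, "__len__")]
            if leaf_lens:
                shape.append(max(leaf_lens))
            break
        # The longest list at this depth sets the padded size of this dim.
        shape.append(max(len(c) for c in containers))
        # Descend one level by concatenating the children of every list.
        level = [child for c in containers for child in c]
    return tuple(shape)


ls = [[1], [2, 3], [4, 5, 6]]
print(find_shape_by_len(ls))  # (3, 3)

nested = [[[OpaqueLeaf(4)], [OpaqueLeaf(4), OpaqueLeaf(4)]]]
print(find_shape_by_len(nested))  # (1, 2, 2, 4)
```

Only list and tuple nodes are iterated, so the leaves are touched through len() alone; scalars have no len() and contribute no trailing dim.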

The exact goal is to perform such padding, with a choice of padding from the left or from the right. The data structure is a list of lists that is to form a 5D array, and only the final nest level contains non-lists, which are 1D arrays. The first two nest levels have fixed list lengths (they can directly form an array), and the 1D arrays all have the same length, so the only uncertainty is in the 3rd and 4th dims.
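For illustration, left versus right padding on the 2D example from the top of the question could look like the sketch below. pad_rows and its side and fill parameters are hypothetical names, and this pure-Python version works on plain lists of values only; it does not address the 5D GPU case.

```python
def pad_rows(rows, side="right", fill=0):
    """Pad every row to the length of the longest row, on the chosen side."""
    if side not in ("left", "right"):
        raise ValueError("side must be 'left' or 'right'")
    width = max((len(row) for row in rows), default=0)
    padded = []
    for row in rows:
        extra = [fill] * (width - len(row))
        padded.append(list(row) + extra if side == "right" else extra + list(row))
    return padded


ls = [[1], [2, 3], [4, 5, 6]]
print(pad_rows(ls, side="right"))  # [[1, 0, 0], [2, 3, 0], [4, 5, 6]]
print(pad_rows(ls, side="left"))   # [[0, 0, 1], [0, 2, 3], [4, 5, 6]]
```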
