
What is the most performant way to convert something like this

problem = [ [np.array([1,2,3]), np.array([4,5])],
            [np.array([6,7,8]), np.array([9,10])]]

into

desired = np.array([[1,2,3,4,5], 
                   [6,7,8,9,10]])

Unfortunately, the final numbers of rows and columns (and the lengths of the subarrays) are not known in advance, because the subarrays are read from a binary file, record by record.

5
  • So the number of elements in each element of the list would be the same, like the 5 here? Commented Nov 23, 2016 at 9:58
  • Does bmat work? Commented Nov 23, 2016 at 11:06
  • The number is the same for each row, so no padding or anything similar is required. Commented Nov 23, 2016 at 13:11
  • Then I guess the fastest approach would be a traditional loop: initialize the output array and use np.concatenate iteratively to assign each row, as listed in @Carles Mitjans's solution. Commented Nov 23, 2016 at 15:32
  • A similar question with answer by Warren: stackoverflow.com/questions/39128514/… Commented Nov 23, 2016 at 17:54
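
The preallocation idea from the comments could look roughly like the sketch below. It assumes the full list is already in memory and that every row has the same total length (as stated above). The `out=` argument of `np.concatenate` needs NumPy 1.14 or later, and the names `n_cols` and `out` are only illustrative.

```python
import numpy as np

problem = [[np.array([1, 2, 3]), np.array([4, 5])],
           [np.array([6, 7, 8]), np.array([9, 10])]]

# Assumption: every row has the same total length, so the first row
# is enough to determine the number of columns and the dtype.
n_cols = sum(len(part) for part in problem[0])
dtype = np.result_type(*problem[0])
out = np.empty((len(problem), n_cols), dtype=dtype)

for i, row in enumerate(problem):
    # Write the joined pieces directly into row i of the preallocated
    # array, avoiding a temporary per-row array.
    np.concatenate(row, out=out[i])

print(out)
```

Whether this is actually faster than the list-comprehension answers depends on the data sizes, so it is worth timing on real input.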

2 Answers 2

5

How about this:

import numpy as np

problem = [[np.array([1,2,3]), np.array([4,5])],
           [np.array([6,7,8]), np.array([9,10])]]

# Join the pieces of each row, then stack the rows into one 2-D array.
print(np.array([np.concatenate(x) for x in problem]))

2

I think this:

print(np.array([np.hstack(i) for i in problem]))

Using your example, this runs in 0.00022s, whereas concatenate takes 0.00038s.

You can also use apply_along_axis, although this runs in 0.00024s:

print(np.apply_along_axis(np.hstack, 1, problem))
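
Timings on a two-row example are mostly call overhead, so a comparison on larger input is more telling. Below is a minimal sketch of such a measurement with `timeit`. The data sizes and the helper names `with_concatenate` and `with_hstack` are made up for illustration, and the results will vary by machine and NumPy version.

```python
import timeit
import numpy as np

# Hypothetical test data: 1000 rows, each built from a 300-element
# and a 200-element piece.
rng = np.random.default_rng(0)
problem = [[rng.random(300), rng.random(200)] for _ in range(1000)]

def with_concatenate():
    return np.array([np.concatenate(row) for row in problem])

def with_hstack():
    return np.array([np.hstack(row) for row in problem])

# Both variants must produce the same array before timing means anything.
assert np.array_equal(with_concatenate(), with_hstack())

for fn in (with_concatenate, with_hstack):
    # Take the best of 5 repeats, each averaging over 10 calls.
    best = min(timeit.repeat(fn, number=10, repeat=5)) / 10
    print(f"{fn.__name__}: {best:.6f} s per call")
```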

6 Comments

Looks good, but doesn't that take two allocation steps? First one for each row, and then every row gets copied into the large array.
Look at bmat code - an hstack for each row, and vstack to join the rows. If the list was flattened, you could use one concatenate and then reshape. I don't think the time differences are significant.
np.concatenate seems to be faster than stacking or bmat: problem = [[np.array([.1]*5000)] * 5] * 10000; solution = np.array([np.concatenate(x) for x in problem])
Hstack uses concatenate.
Maybe, but there's a clear difference for me (Python 3.5, NumPy 1.11).
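
The "flatten, concatenate once, then reshape" idea mentioned in these comments could be sketched as follows. It assumes all rows have the same total length, otherwise the reshape fails. Using `itertools.chain` for the flattening is just one possible choice.

```python
import itertools
import numpy as np

problem = [[np.array([1, 2, 3]), np.array([4, 5])],
           [np.array([6, 7, 8]), np.array([9, 10])]]

# Flatten the nested list so a single concatenate call sees every piece.
flat = np.concatenate(list(itertools.chain.from_iterable(problem)))

# Assumption: all rows have equal total length, so the reshape is valid.
result = flat.reshape(len(problem), -1)
print(result)
```

This performs one allocation for the joined data, and the reshape of the contiguous result returns a view rather than a copy.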