Convert numpy array of arrays to 2d array

Question

I have a pandas Series, features, that has the following values (features.values):

array([array([0, 0, 0, ..., 0, 0, 0]), array([0, 0, 0, ..., 0, 0, 0]),
       array([0, 0, 0, ..., 0, 0, 0]), ...,
       array([0, 0, 0, ..., 0, 0, 0]), array([0, 0, 0, ..., 0, 0, 0]),
       array([0, 0, 0, ..., 0, 0, 0])], dtype=object)

I want this to be recognized as a matrix, but when I check the shape I get

>>> features.values.shape
(10000,)

rather than (10000, 3000), which is what I would expect.

How can I get this to be recognized as a 2d array rather than a 1d array with arrays as values? Also, why is it not automatically detected as a 2d array?

possible duplicate: stackoverflow.com/questions/42920363/… — anishtain4, Commented Jun 21, 2018 at 15:21
Try np.stack(features). It treats the array as a list of arrays, and concatenates them on a new axis. np.vstack(features) would also work in this case. That's assuming that all internal arrays have the same shape. — hpaulj, Commented Jun 21, 2018 at 16:18
@anishtain4, your link is for a pandas dataframe, not a numpy array. — hpaulj, Commented Jun 21, 2018 at 16:19
@hpaulj np.stack worked great. I just really don't understand why features.values doesn't return it as such, or why numpy doesn't recognize it as a 2d array. Thank you! — Nate Stemen, Commented Jun 21, 2018 at 17:32

hpaulj · Accepted Answer · 2018-06-21 19:10:09Z

In response to your comment question, let's compare two ways of creating an array.

First make an array from a list of arrays (all same length):

In [302]: arr = np.array([np.arange(3), np.arange(1,4), np.arange(10,13)])
In [303]: arr
Out[303]: 
array([[ 0,  1,  2],
       [ 1,  2,  3],
       [10, 11, 12]])

The result is a 2d array of numbers.

If instead we make an object dtype array, and fill it with arrays:

In [304]: arr = np.empty(3,object)
In [305]: arr[:] = [np.arange(3), np.arange(1,4), np.arange(10,13)]
In [306]: arr
Out[306]: 
array([array([0, 1, 2]), array([1, 2, 3]), array([10, 11, 12])],
      dtype=object)

Notice that this display is like yours. This is, by design, a 1d array. Like a list, it contains pointers to arrays elsewhere in memory. Notice that it requires an extra construction step. The default behavior of np.array is to create a multidimensional array where it can.

It takes extra effort to get around that. Likewise it takes some extra effort to undo that - to create the 2d numeric array.
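The difference between the two constructions can be checked directly. This is a minimal sketch, not part of the original session; the names rows, numeric and boxed are made up for illustration.

```python
import numpy as np

rows = [np.arange(3), np.arange(1, 4), np.arange(10, 13)]

# Default behavior: np.array builds a 2d numeric array when the rows line up.
numeric = np.array(rows)

# Object array: allocate first, then fill, so each slot holds one inner array.
boxed = np.empty(len(rows), dtype=object)
boxed[:] = rows

print(numeric.shape)             # (3, 3)
print(boxed.shape, boxed.dtype)  # (3,) object
print(type(boxed[0]))            # each element is itself an ndarray
```

Because boxed has dtype object, numpy only sees three opaque Python objects, which is why its shape is (3,) even though every element has length 3.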

Simply calling np.array on it does not change the structure.

In [307]: np.array(arr)
Out[307]: 
array([array([0, 1, 2]), array([1, 2, 3]), array([10, 11, 12])],
      dtype=object)

np.stack does change it to 2d. It treats the object array as a list of arrays, which it joins on a new axis.

In [308]: np.stack(arr)
Out[308]: 
array([[ 0,  1,  2],
       [ 1,  2,  3],
       [10, 11, 12]])
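As noted in the comments, this assumes all inner arrays have the same shape. The sketch below wraps np.stack in a hypothetical helper, to_2d (not a numpy function), that checks the lengths first and reports a clearer error for ragged input; np.stack itself would also raise a ValueError in that case.

```python
import numpy as np

def to_2d(obj_arr):
    """Stack a 1d object array of equal-length 1d arrays into a 2d array.

    Hypothetical helper for illustration; it only adds a length check
    in front of np.stack.
    """
    lengths = {len(a) for a in obj_arr}
    if len(lengths) != 1:
        raise ValueError(f"inner arrays differ in length: {sorted(lengths)}")
    return np.stack(obj_arr)

# Same layout as features.values in the question, on a small scale.
features = np.empty(3, dtype=object)
features[:] = [np.arange(3), np.arange(1, 4), np.arange(10, 13)]
matrix = to_2d(features)
print(matrix.shape)  # (3, 3)

# Ragged input cannot become a rectangular 2d array.
ragged = np.empty(2, dtype=object)
ragged[0] = np.arange(3)
ragged[1] = np.arange(5)
try:
    to_2d(ragged)
except ValueError as exc:
    print("cannot stack:", exc)
```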

Shaida Muhammad · Accepted Answer · 2021-11-15 17:41:20Z

Shortening @hpaulj's answer:

your_2d_array = np.stack(arr_of_arr_object)