In response to your comment question, let's compare two ways of creating an array.
First, make an array from a list of arrays (all the same length):
In [302]: arr = np.array([np.arange(3), np.arange(1,4), np.arange(10,13)])
In [303]: arr
Out[303]:
array([[ 0,  1,  2],
       [ 1,  2,  3],
       [10, 11, 12]])
The result is a 2d array of numbers.
If instead we make an object dtype array, and fill it with arrays:
In [304]: arr = np.empty(3,object)
In [305]: arr[:] = [np.arange(3), np.arange(1,4), np.arange(10,13)]
In [306]: arr
Out[306]:
array([array([0, 1, 2]), array([1, 2, 3]), array([10, 11, 12])],
      dtype=object)
Notice that this display is like yours. This is, by design, a 1d array. Like a list, it contains pointers to arrays elsewhere in memory. Notice that it requires an extra construction step. The default behavior of np.array is to create a multidimensional array where it can.
It takes extra effort to get around that. Likewise, it takes some extra effort to undo that and create the 2d numeric array.
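To make the difference concrete, here is a minimal sketch that builds both arrays and compares their shape and dtype. The names rows, arr2d and arr_obj are mine, chosen for illustration; they do not come from the question.

```python
import numpy as np

rows = [np.arange(3), np.arange(1, 4), np.arange(10, 13)]

# Default behavior: np.array builds a 2d numeric array when the rows match.
arr2d = np.array(rows)

# Object dtype: allocate first, then fill, so each slot holds a separate array.
arr_obj = np.empty(3, dtype=object)
arr_obj[:] = rows

print(arr2d.shape, arr2d.dtype)      # (3, 3) and an integer dtype
print(arr_obj.shape, arr_obj.dtype)  # (3,) object

# Each element of the object array is itself a small 1d ndarray.
print(type(arr_obj[0]), arr_obj[0].shape)
```

Checking shape and dtype like this is usually the quickest way to tell which of the two structures you actually have.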
Simply calling np.array on it does not change the structure:
In [307]: np.array(arr)
Out[307]:
array([array([0, 1, 2]), array([1, 2, 3]), array([10, 11, 12])],
      dtype=object)
stack does change it to 2d. stack treats it as a list of arrays, which it joins on a new axis.
In [308]: np.stack(arr)
Out[308]:
array([[ 0,  1,  2],
       [ 1,  2,  3],
       [10, 11, 12]])
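As a rough sketch of where this conversion works and where it does not: stack can only build the 2d array when every inner array has the same shape. The ragged array below is a made-up example, not data from the question.

```python
import numpy as np

arr = np.empty(3, dtype=object)
arr[:] = [np.arange(3), np.arange(1, 4), np.arange(10, 13)]

# stack iterates over arr as if it were a list of arrays.
stacked = np.stack(arr)
print(stacked.shape, stacked.dtype)

# vstack gives the same result here, since each element is 1d.
same = np.array_equal(stacked, np.vstack(arr))
print(same)

# Hypothetical ragged case: inner arrays of different lengths.
ragged = np.empty(2, dtype=object)
ragged[0] = np.arange(3)
ragged[1] = np.arange(4)

failed = False
try:
    np.stack(ragged)
except ValueError as err:
    # stack refuses because the inputs cannot form a rectangular array.
    failed = True
    print("stack failed:", err)
```

If the inner arrays really do differ in length, there is no 2d numeric array to recover, and the object array (or a list) is the appropriate container.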
np.stack(features). It treats the array as a list of arrays, and concatenates them on a new axis. np.vstack(features) would also work in this case. That's assuming that all internal arrays have the same shape.
pandas dataframe, not a numpy array.
np.stack worked great. Just really don't understand why features.values doesn't return it as such, or why numpy doesn't recognize it as a 2d array. Thank you!