Pythonic/Numpy way of converting a 1D array into 2D vector array of indexed values

Question

This is what I currently have:

import numpy as np

data = [0.2, 0.6, 0.3, 0.5]
vecs = np.reshape([np.arange(len(data)),data], (2, -1)).transpose()

vecs
array([[ 0.  ,  0.2],
       [ 1.  ,  0.6],
       [ 2.  ,  0.3],
       [ 3.  ,  0.5]])

This gives me the data I want, but it seems overly complex. Am I missing a trick?

You could use np.stack instead of reshaping and transposing? — Arvin Kushwaha, Commented Jul 30, 2020 at 9:45

yatu · Accepted Answer · 2020-07-30 09:58:52Z

2

You can simplify with np.stack, passing axis=1 so that no transpose is needed:

data = np.array([0.2, 0.6, 0.3, 0.5])

np.stack([np.arange(len(data)), data], axis=1)
array([[0. , 0.2],
       [1. , 0.6],
       [2. , 0.3],
       [3. , 0.5]])
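
As a quick sanity check, the sketch below compares this with the reshape-and-transpose version from the question. The names original and stacked are only illustrative, not part of either post.

```python
import numpy as np

data = np.array([0.2, 0.6, 0.3, 0.5])

# Approach from the question: build a (2, n) array, then transpose to (n, 2)
original = np.reshape([np.arange(len(data)), data], (2, -1)).transpose()

# np.stack with axis=1 pairs each index with its value directly
stacked = np.stack([np.arange(len(data)), data], axis=1)

print(stacked.shape)                      # (4, 2)
print(np.array_equal(original, stacked))  # True
print(stacked.dtype)                      # float64
```

The integer indices from np.arange are promoted to float because they share one array with the float data.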

Timings -

a = np.random.random(10000)
%timeit np.stack([np.arange(len(a)), a], axis=1) 
# 26.3 µs ± 1.54 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit np.array([*enumerate(a)])
# 4.51 ms ± 156 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
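
%timeit is an IPython magic, so the lines above will not run in a plain Python script. A rough equivalent using the standard timeit module might look like the following; the helper names with_stack and with_enumerate and the run count are assumptions made for this sketch, and the absolute numbers will differ by machine and NumPy version.

```python
import timeit

import numpy as np

a = np.random.random(10000)


def with_stack():
    # Vectorized: index column and data column joined along axis 1
    return np.stack([np.arange(len(a)), a], axis=1)


def with_enumerate():
    # Builds a Python list of (index, value) tuples first, then converts it
    return np.array([*enumerate(a)])


# Both approaches should build the same (10000, 2) array
assert np.array_equal(with_stack(), with_enumerate())

# Average seconds per call over a small number of runs
n_runs = 20
t_stack = timeit.timeit(with_stack, number=n_runs) / n_runs
t_enum = timeit.timeit(with_enumerate, number=n_runs) / n_runs
print(f"np.stack:  {t_stack * 1e6:.1f} µs per call")
print(f"enumerate: {t_enum * 1e6:.1f} µs per call")
```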

edited Jul 30, 2020 at 9:58

answered Jul 30, 2020 at 9:45


I think you have one extra dimension.
– Arvin Kushwaha
Commented Jul 30, 2020 at 9:47
Yup, just realized :) @arvin
– yatu
Commented Jul 30, 2020 at 9:47
Might I suggest the use of the axis keyword? I think it'll save you from having to do a transpose while also looking neater. Also, you can directly use range instead of np.arange, though I suppose it's a bit "hacky".
– Arvin Kushwaha
Commented Jul 30, 2020 at 9:53
Thanks for that, @arvin. I forgot it had that arg. It's not hacky, but why not just use np.arange? NumPy has to cast to float anyway, so we're saving it some work there.
– yatu
Commented Jul 30, 2020 at 9:56
Timings show quite a difference, @arvin. And as I said, range creates a range of integers, which would then have to be cast to float.
– yatu
Commented Jul 30, 2020 at 9:59


Sayandip Dutta · Accepted Answer · 2020-07-30 09:47:11Z

2

You can try enumerate:

>>> np.array([*enumerate(data)])
array([[0. , 0.2],
       [1. , 0.6],
       [2. , 0.3],
       [3. , 0.5]])
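
To see why this works, here is a small sketch of the intermediate steps; the name pairs is only illustrative. enumerate yields (index, value) tuples, and np.array promotes the mixed int/float pairs to a single float dtype.

```python
import numpy as np

data = [0.2, 0.6, 0.3, 0.5]

pairs = [*enumerate(data)]   # unpack the enumerate iterator into a list
print(pairs)                 # [(0, 0.2), (1, 0.6), (2, 0.3), (3, 0.5)]

vecs = np.array(pairs)       # mixed int/float input is promoted to float
print(vecs.dtype)            # float64
print(vecs.shape)            # (4, 2)
```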

answered Jul 30, 2020 at 9:47


Oh doh. That's very nice. But is it costly?
– Konchog
Commented Jul 30, 2020 at 9:54
Given a choice between arange(len(data)) and enumerate I think enumerate would be much faster.
– Sayandip Dutta
Commented Jul 30, 2020 at 9:57
It turns out that it's actually quite a bit slower to utilize enumerate.
– Arvin Kushwaha
Commented Jul 30, 2020 at 9:59
Yep, saw that. It seems like a scaling issue for large arrays.
– Sayandip Dutta
Commented Jul 30, 2020 at 10:00
Right, I was basing my argument on range(len(data)), which returns a lazy range object, rather than arange(len(data)).
– Sayandip Dutta
Commented Jul 30, 2020 at 10:03

