Taking an average of an array according to another array of indices

Question

Say I have an array that looks like this:

a = np.array([0, 20, 40, 30, 60, 35, 15, 18, 2])

and I have an array of indices that I want to average between:

averaging_indices = np.array([2, 4, 7, 8])

What I want to do is average the elements of array a according to the averaging_indices array. To make that clear, I want to take the averages:

np.mean(a[0:2]), np.mean(a[2:4]), np.mean(a[4:7]), np.mean(a[7:8]), np.mean(a[8:])

and I want to return an array that then has the correct dimensions, in this case

result = [10, 35, 36.66, 18, 2]

Can anyone think of a neat way to do this? The only way I can imagine is by looping, which is very anti-numpy.
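
For reference, the loop-based baseline mentioned above could look roughly like the sketch below. It assumes the indices are sorted and uses np.split to cut the array; the names chunks and loop_result are only illustrative and not part of the original post.

```python
import numpy as np

a = np.array([0, 20, 40, 30, 60, 35, 15, 18, 2])
averaging_indices = np.array([2, 4, 7, 8])

# np.split cuts `a` at each index, giving
# a[0:2], a[2:4], a[4:7], a[7:8], a[8:]
chunks = np.split(a, averaging_indices)

# Python-level loop over the chunks (the part a vectorized version avoids)
loop_result = np.array([chunk.mean() for chunk in chunks])
print(loop_result)
```

This is handy as a correctness check for any vectorized version, even if it is not the fastest option for many segments.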

I think instead of @Divakar's np.mean(a[8:-1]) you should use np.mean(a[8:]), or do you want to exclude the last index? And do you just want to avoid a loop, or is speed a concern here?
– MSeifert, Commented Feb 15, 2016 at 14:34
Sorry, I meant: shouldn't that be ... np.mean(a[7:8]), np.mean(a[8:]) instead, as also mentioned by @MSeifert?
– Divakar, Commented Feb 15, 2016 at 14:35

Divakar · Accepted Answer · 2016-02-15 15:04:13Z

Here's a vectorized approach with np.bincount -

# Create "shifts array" and then IDs array for use with np.bincount later on
shifts_array = np.zeros(a.size, dtype=int)
shifts_array[averaging_indices] = 1
IDs = shifts_array.cumsum()

# Use np.bincount to get the summations for each tag and also tag counts.
# Thus, get tagged averages as final output.
out = np.bincount(IDs, weights=a) / np.bincount(IDs)
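
To see why this works, it may help to look at the intermediate arrays for the sample data from the question. The sketch below is a self-contained rerun of the same steps; the names group_sums and group_counts are introduced here only for illustration.

```python
import numpy as np

a = np.array([0, 20, 40, 30, 60, 35, 15, 18, 2])
averaging_indices = np.array([2, 4, 7, 8])

# Put a 1 at every position where a new group starts
shifts_array = np.zeros(a.size, dtype=int)
shifts_array[averaging_indices] = 1   # [0 0 1 0 1 0 0 1 1]

# The running sum turns those markers into one group ID per element
IDs = shifts_array.cumsum()           # [0 0 1 1 2 2 2 3 4]

# Weighted bincount sums the values of `a` per ID; plain bincount counts them
group_sums = np.bincount(IDs, weights=a)   # [ 20.  70. 110.  18.   2.]
group_counts = np.bincount(IDs)            # [2 2 3 1 1]

out = group_sums / group_counts
print(out)
```

Each element gets the ID of the segment it falls in, so dividing the per-ID sums by the per-ID counts gives the segment means.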

Sample input, output -

In [60]: a
Out[60]: array([ 0, 20, 40, 30, 60, 35, 15, 18,  2])

In [61]: averaging_indices
Out[61]: array([2, 4, 7, 8])

In [62]: out
Out[62]: array([ 10.        ,  35.        ,  36.66666667,  18.        ,   2.        ])
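
As a possible alternative (not part of the approach above), the same segment means can probably be obtained with np.add.reduceat. This is a minimal sketch that assumes Python 3 division and that averaging_indices is strictly increasing with every index in 1 .. a.size - 1; the names starts, segment_sums and segment_lengths are illustrative.

```python
import numpy as np

a = np.array([0, 20, 40, 30, 60, 35, 15, 18, 2])
averaging_indices = np.array([2, 4, 7, 8])

# Segment start positions: 0 plus every split index -> [0 2 4 7 8]
starts = np.r_[0, averaging_indices]

# Sum each segment a[starts[i]:starts[i+1]]; the last one runs to the end
segment_sums = np.add.reduceat(a, starts)

# Segment lengths from consecutive boundaries -> [2 2 3 1 1]
segment_lengths = np.diff(np.r_[starts, a.size])

reduceat_result = segment_sums / segment_lengths
print(reduceat_result)
```

Note that reduceat behaves differently when indices repeat or are not increasing, so the input assumptions above matter here.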
