Say I have an array that looks like this:

a = np.array([0, 20, 40, 30, 60, 35, 15, 18, 2])

and I have an array of indices that I want to average between:

averaging_indices = np.array([2, 4, 7, 8])

What I want to do is average the elements of a over the segments defined by averaging_indices. To be clear, I want to take these averages:

np.mean(a[0:2]), np.mean(a[2:4]), np.mean(a[4:7]), np.mean(a[7:8]), np.mean(a[8:])

and I want to return an array that then has the correct dimensions, in this case

result = [10, 35, 36.66, 18, 2]

Can anyone think of a neat way to do this? The only way I can imagine is by looping, which is very anti-numpy.
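
For reference, the looping version can be sketched as below. It is only a baseline for checking a vectorized result, and it assumes averaging_indices is sorted, within bounds, and does not contain 0. The names segments and looped are illustrative and not part of the original question.

```python
import numpy as np

a = np.array([0, 20, 40, 30, 60, 35, 15, 18, 2])
averaging_indices = np.array([2, 4, 7, 8])

# np.split cuts `a` at each index: a[0:2], a[2:4], a[4:7], a[7:8], a[8:]
segments = np.split(a, averaging_indices)

# Python-level loop over the segments (the part a vectorized answer avoids)
looped = np.array([segment.mean() for segment in segments])
print(looped)
```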

  • I think instead of @Divakar's np.mean(a[8:-1]) you should use np.mean(a[8:]), or do you want to exclude the last element? And do you just want to avoid a loop, or is speed a concern here? Commented Feb 15, 2016 at 14:34
  • Sorry, I meant: shouldn't that be ... np.mean(a[7:8]), np.mean(a[8:]) instead, as also mentioned by @MSeifert? Commented Feb 15, 2016 at 14:35
  • Also, np.mean(a[4:7]) comes out to be 36.66. Commented Feb 15, 2016 at 15:05

1 Answer

Here's a vectorized approach with np.bincount -

import numpy as np

# Mark the start of every new segment with a 1, then take the cumulative sum
# so that each element of `a` gets the ID of the segment it belongs to.
shifts_array = np.zeros(a.size, dtype=int)
shifts_array[averaging_indices] = 1
IDs = shifts_array.cumsum()

# np.bincount(IDs, a) sums `a` per ID and np.bincount(IDs) counts the elements
# per ID, so their ratio is the per-segment average.
out = np.bincount(IDs, a) / np.bincount(IDs)
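
To make the intermediate steps easier to follow, here is a self-contained sketch of the same approach on the sample data, with each intermediate array shown in a comment. The names sums and counts are introduced only for illustration, and the sketch assumes averaging_indices is sorted and does not contain 0.

```python
import numpy as np

a = np.array([0, 20, 40, 30, 60, 35, 15, 18, 2])
averaging_indices = np.array([2, 4, 7, 8])

shifts_array = np.zeros(a.size, dtype=int)
shifts_array[averaging_indices] = 1   # [0 0 1 0 1 0 0 1 1]
IDs = shifts_array.cumsum()           # [0 0 1 1 2 2 2 3 4]

sums = np.bincount(IDs, weights=a)    # per-segment sums:   [20. 70. 110. 18. 2.]
counts = np.bincount(IDs)             # per-segment counts: [2 2 3 1 1]
out = sums / counts
print(out)
```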

Sample input, output -

In [60]: a
Out[60]: array([ 0, 20, 40, 30, 60, 35, 15, 18,  2])

In [61]: averaging_indices
Out[61]: array([2, 4, 7, 8])

In [62]: out
Out[62]: array([ 10.        ,  35.        ,  36.66666667,  18.        ,   2.        ])
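
As a possible alternative (not the np.bincount method above), the same averages can probably also be obtained with np.add.reduceat. This is a sketch that assumes averaging_indices is strictly increasing, within bounds, and does not contain 0. The names starts, sums, counts and out_reduceat are illustrative.

```python
import numpy as np

a = np.array([0, 20, 40, 30, 60, 35, 15, 18, 2])
averaging_indices = np.array([2, 4, 7, 8])

# Start index of every segment, including the implicit first one at 0
starts = np.concatenate(([0], averaging_indices))

# Sum of each segment a[starts[i]:starts[i+1]]; the last one runs to the end
sums = np.add.reduceat(a, starts)

# Segment lengths, using a.size as the end of the final segment
counts = np.diff(np.concatenate((starts, [a.size])))

out_reduceat = sums / counts
print(out_reduceat)
```

This skips building the per-element IDs array, at the cost of being less forgiving about repeated or unsorted indices.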