Find index of a row in numpy array

Question

Given an m x n NumPy array

X = np.array([
  [1, 2],
  [10, 20],
  [100, 200]
])

how do I find the index of a row, i.e. [10, 20] -> 1?

n could be any value (2, 3, ...), so I can also have m x 3 arrays

Y = np.array([
  [1, 2, 3],
  [10, 20, 30],
  [100, 200, 300]
])

so I need to pass a vector of size n (here n = 3), i.e. [10, 20, 30], and get its row index, 1. Again, n could be any value, like 100 or 1000.

NumPy arrays could be big, so I don't want to convert them to lists to use .index().

In your example you have [10, 20], then you refer to a vector [10, 20, 30]: is the query tensor you are using to find the index of variable size?
– Ivan, Commented Aug 15, 2021 at 21:50
My thought is that np.where((X[:,0] == 10) & (X[:,1] == 20)) does the job, but I don't know how to build the condition for an arbitrary number n of elements in the vector, i.e. how, given a vector [10, 20, 30], to automatically get a condition like (Y[:,0] == 10) & (Y[:,1] == 20) & (Y[:,2] == 30)
– Sengiley, Commented Aug 15, 2021 at 22:00
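
The hand-written condition in the comment above can be generalized to any number of columns. The following is only a sketch of one way to do it, building one column test per target element and combining them with np.logical_and.reduce; the names target, per_column and mask are illustrative and not from the thread.

```python
import numpy as np

Y = np.array([
    [1, 2, 3],
    [10, 20, 30],
    [100, 200, 300],
])
target = [10, 20, 30]

# One boolean test per column: Y[:, 0] == 10, Y[:, 1] == 20, Y[:, 2] == 30.
per_column = [Y[:, j] == target[j] for j in range(Y.shape[1])]

# AND the column tests together; equivalent to chaining them with `&`.
mask = np.logical_and.reduce(per_column)

row_indices = np.flatnonzero(mask)
print(row_indices)  # [1]
```

The answers below reach the same mask without the Python-level loop by comparing the whole array at once and reducing with .all(axis=1).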

ddejohn · Accepted Answer · 2021-08-16 07:24:15Z

4

In case the array contains duplicates of the row you are looking for, the function below returns the indices of all matching rows.

import numpy as np

def find_rows(source, target):
    # Compare target against every row, keep rows where all columns match.
    return np.where((source == target).all(axis=1))[0]

looking = [10, 20, 30]

Y = np.array([[1, 2, 3],
              [10, 20, 30],
              [100, 200, 300],
              [10, 20, 30]])

print(find_rows(source=Y, target=looking)) # [1 3]
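
The question asks for a single index ([10, 20] -> 1), while find_rows returns an array that may be empty or hold several indices. As a minimal sketch, a wrapper could pick the first match; first_row_index and its -1 "not found" convention are assumptions made for illustration, not part of the answer.

```python
import numpy as np

def find_rows(source, target):
    return np.where((source == target).all(axis=1))[0]

def first_row_index(source, target):
    """Hypothetical helper: first matching row index, or -1 if absent."""
    matches = find_rows(source, target)
    return int(matches[0]) if matches.size else -1

Y = np.array([[1, 2, 3],
              [10, 20, 30],
              [100, 200, 300],
              [10, 20, 30]])

print(first_row_index(Y, [10, 20, 30]))  # 1
print(first_row_index(Y, [7, 8, 9]))     # -1
```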

edited Aug 16, 2021 at 7:24

ddejohn

answered Aug 15, 2021 at 22:42

NMZ

akuiper · Accepted Answer · 2021-08-15 22:39:19Z

2

You can use numpy.equal, which broadcasts the target row vector against each row of the original array. A row is identical to the target when all of its elements compare equal:

import numpy as np
np.flatnonzero(np.equal(X, [10, 20]).all(1))
# [1]

np.flatnonzero(np.equal(Y, [10, 20, 30]).all(1))
# [1]
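
To make the broadcasting explicit, here is the same one-liner split into its intermediate steps. This is just an illustrative breakdown; the variable names elementwise and row_matches are not from the answer.

```python
import numpy as np

X = np.array([[1, 2],
              [10, 20],
              [100, 200]])

# Step 1: the length-2 target is broadcast against the (3, 2) array,
# giving one boolean per element.
elementwise = np.equal(X, [10, 20])
print(elementwise.shape)  # (3, 2)

# Step 2: collapse each row to one boolean, True only if every column matched.
row_matches = elementwise.all(axis=1)
print(row_matches)  # [False  True False]

# Step 3: indices of the True entries.
print(np.flatnonzero(row_matches))  # [1]
```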

edited Aug 15, 2021 at 22:39

answered Aug 15, 2021 at 22:34

akuiper

2 Comments

ddejohn Over a year ago

@Sengiley you should look into boolean masking as an alternate solution as it is a very powerful technique.

akuiper Over a year ago

@Sengiley You're welcome. Glad it works for you :)

zoldxk · Accepted Answer · 2021-08-15 23:35:22Z

0

You can write a function as follows:

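def get_index(seq, *arrays):
    for array in arrays:
        try:
            # Compare whole rows, not individual elements.
            return np.where((array == seq).all(axis=1))[0][0]
        except IndexError:
            # No matching row in this array; try the next one.
            pass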

then:

>>> get_index([10, 20, 30], Y)
1

Or with just indexing:

>>> np.where((Y == [10, 20, 30]).all(axis=1))[0]
array([1])

edited Aug 15, 2021 at 23:35

answered Aug 15, 2021 at 21:59

zoldxk

2 Comments

Sengiley Over a year ago

Yes, but that is iteration, which is not vectorized and thus slow. I want to make use of NumPy indexing.

Sengiley Over a year ago

Unfortunately, your indexing example is incorrect, as it finds the indices of individual vector elements. With Y = np.array([[10, 2, 3], [10, 20, 30], [100, 20, 30]]), np.where(Y[:,:] == [10, 20, 30]) gives (array([0, 1, 1, 1, 2, 2]), array([0, 0, 1, 2, 1, 2])).
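
The difference described in this comment can be reproduced side by side. The sketch below reuses the array from the comment and contrasts the element-wise lookup with the row-wise one; the names target, rows, cols and full_matches are illustrative.

```python
import numpy as np

Y = np.array([[10, 2, 3],
              [10, 20, 30],
              [100, 20, 30]])
target = [10, 20, 30]

# Element-wise: returns (row, column) coordinates of every matching element,
# including the partial matches in rows 0 and 2.
rows, cols = np.where(Y == target)
print(rows)  # [0 1 1 1 2 2]
print(cols)  # [0 0 1 2 1 2]

# Row-wise: require all columns to match before asking for indices.
full_matches = np.where((Y == target).all(axis=1))[0]
print(full_matches)  # [1]
```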
