2

I am trying to use numpy.where to find the indices I want. Here's the code:

import numpy as np
a = np.array([20,58,32,0,107,57]).reshape(2,3)
item_index = np.where((a == 58) | (a == 107) | (a == 20))
print(item_index)

I get item_index as below:

(array([0, 0, 1]), array([0, 1, 1]))

However, in reality, a has dimensions 20000 x 7 and there are several hundred conditions instead of just three. Is there a way to use numpy.where with that many conditions? I found topics here, here and here useful, but I couldn't find the answer to my question.

2
  • I'd say the problem you're having isn't connected to where. The problem you're having is efficiently compressing several hundred equality conditions into one short, efficient condition. Commented Jul 30, 2014 at 3:23
  • @user2357112 I agree. I will likely edit the title. In the solutions provided, the other users did not use where at all and mostly used np.in1d.
    – ahoosh
    Commented Jul 30, 2014 at 3:46

3 Answers

3

Given (per your example):

>>> a
array([[ 20,  58,  32],
       [  0, 107,  57]])

with the query, 'is an array element of a in a list of values', just use numpy.in1d:

>>> np.in1d(a, [58, 107, 20])
array([ True,  True, False, False,  True, False], dtype=bool)

If you want the indexes to be the same as the underlying array, just reshape to the shape of a:

>>> np.in1d(a, [58, 107, 20]).reshape(a.shape)
array([[ True,  True, False],
       [False,  True, False]], dtype=bool)

Then test against that:

>>> tests=np.in1d(a, [58, 107, 20]).reshape(a.shape)
>>> tests[1,1]                 # is the element of 'a' in the list [58, 107, 20]?
True

In one line (obvious, but I do not know whether it is efficient for one-off queries):

>>> np.in1d(a, [58, 107, 20]).reshape(a.shape)[1,1]
True
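
As a side note, on NumPy 1.13 and later the same membership test can be written with np.isin, which keeps the shape of a so no reshape is needed (np.in1d is deprecated in recent releases). Passing the mask to np.where then gives the index tuple the question asked for. This is a minimal sketch, not a benchmarked solution; the name values is a placeholder for the several hundred target values:

```python
import numpy as np

a = np.array([20, 58, 32, 0, 107, 57]).reshape(2, 3)
values = [58, 107, 20]  # placeholder for the real list of target values

# np.isin returns a boolean mask with the same shape as `a`
mask = np.isin(a, values)

# np.where on a boolean mask returns one index array per dimension
item_index = np.where(mask)
print(item_index)  # rows [0, 0, 1], columns [0, 1, 1]
```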

2

Someone better at numpy may have a better solution - but if you have pandas installed you could do something like this.

import pandas as pd
df = pd.DataFrame(a) # Create a pandas dataframe from array

conditions = [58, 107, 20]
item_index = df.isin(conditions).values.nonzero()

isin builds a boolean array that is True wherever the value is in the conditions list. The call to .values extracts the underlying numpy array from the pandas DataFrame. The call to nonzero() returns the indices of the True entries as a tuple of arrays, one per dimension, which is the same format np.where gives.
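
To make the three steps easier to follow, here is the same pipeline split into intermediate results, as a rough sketch on the question's small array. The names mask_df, mask, rows and cols are only for illustration:

```python
import numpy as np
import pandas as pd

a = np.array([20, 58, 32, 0, 107, 57]).reshape(2, 3)
conditions = [58, 107, 20]

mask_df = pd.DataFrame(a).isin(conditions)  # DataFrame of booleans
mask = mask_df.values                       # plain boolean ndarray, shape (2, 3)
rows, cols = mask.nonzero()                 # indices of the True entries

print(rows.tolist(), cols.tolist())  # [0, 0, 1] [0, 1, 1]
```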

3
  • 1
    The same can be achieved in numpy alone using np.in1d and some magic with the indexing: np.unravel_index(np.in1d(a, [58, 107, 20]).nonzero()[0], a.shape)
    – Jaime
    Commented Jul 30, 2014 at 1:36
  • @chrisb I'm thinking to use pandas at some point and your solution would totally work using pandas.
    – ahoosh
    Commented Jul 30, 2014 at 3:43
  • @Jaime Solutions similar to yours were provided by two other members. I tested it and it totally works.
    – ahoosh
    Commented Jul 30, 2014 at 3:44
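
The unravel_index idea from the first comment above can be spelled out roughly as follows. This sketch swaps np.in1d for np.isin on the flattened array (an assumption that NumPy 1.13 or later is available) and uses np.flatnonzero in place of .nonzero()[0]; the variable names are illustrative:

```python
import numpy as np

a = np.array([20, 58, 32, 0, 107, 57]).reshape(2, 3)

# Membership test on the flattened array: one boolean per element
flat_mask = np.isin(a.ravel(), [58, 107, 20])

# Positions of the matches in the flattened array
flat_idx = np.flatnonzero(flat_mask)

# Convert flat positions back to (row, column) indices for a's shape
rows, cols = np.unravel_index(flat_idx, a.shape)
print(flat_idx.tolist(), rows.tolist(), cols.tolist())
# [0, 1, 4] [0, 0, 1] [0, 1, 1]
```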
2

Add another dimension to each so they can be broadcast against each other:

>>> a = np.array([20,58,32,0,107,57]).reshape(2,3)
>>> b = np.array([58, 107, 20])
>>> np.any(a[...,np.newaxis] == b[np.newaxis, ...], axis = 2)
array([[ True,  True, False],
       [False,  True, False]], dtype=bool)
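
One thing worth checking before using broadcasting at the question's scale is the size of the intermediate comparison array, which holds one boolean per (row, column, candidate value). The sketch below shows its shape on the small example and a back-of-the-envelope estimate for a 20000 x 7 array, assuming 300 target values (the question only says "several hundred"):

```python
import numpy as np

a = np.array([20, 58, 32, 0, 107, 57]).reshape(2, 3)
b = np.array([58, 107, 20])

# Intermediate comparison: shape (2, 3, 1) broadcast against (3,)
pairwise = a[..., np.newaxis] == b
print(pairwise.shape)  # (2, 3, 3)

# Collapse the last axis: True if the element equals any value in b
mask = pairwise.any(axis=2)

# Estimated intermediate size at the question's scale,
# with a hypothetical 300 target values and one byte per boolean
n_rows, n_cols, n_values = 20000, 7, 300
intermediate_bytes = n_rows * n_cols * n_values
print(intermediate_bytes / 1e6, "MB")  # 42.0 MB
```

That is manageable here, but the intermediate grows linearly with the number of target values, whereas the membership-test approaches return a mask of a's shape directly.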
