Question

np.where on a numpy MxN matrix but return M rows with indices where condition exists

I am trying to use np.where on an MxN numpy matrix. I want the result to keep the same M rows, with each row holding the column indices where the condition is true. Is this possible? For example:

a = np.array([[1, 2, 2],
              [2, 3, 5]])

np.where(a == 2)

I would like this to return:

[[1, 2],
 [0]]
1 Jan 1970

Solution


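One option is to call np.where on the whole array, then split the column indices wherever the row index changes:

import numpy as np

a = np.array([[1, 2, 2],
              [2, 3, 5]])

# i holds the row index and j the column index of every match
i, j = np.where(a == 2)

# cut j at each position where the row index i changes
out = np.split(j, np.diff(i).nonzero()[0]+1)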

Alternatively, use a list comprehension that runs np.where on each row:

out = [np.where(x == 2)[0] for x in a]

Output:

[array([1, 2]), array([0])]
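
One caveat with the np.split variant: it only starts a new chunk when the row index changes, so a row without any match is dropped from the result instead of showing up as an empty array (the list comprehension does not have this problem). A minimal sketch of one way around that, splitting on per-row match counts instead; the three-row input is a made-up example, not taken from the question:

```python
import numpy as np

# Hypothetical input: the middle row contains no 2 at all.
a = np.array([[1, 2, 2],
              [7, 8, 9],
              [2, 3, 5]])

m = a == 2
# Column indices of all matches, in row-major order.
j = np.nonzero(m)[1]
# Number of matches per row; zero for rows without a match.
counts = m.sum(axis=1)
# Split at the cumulative counts so every row gets its own
# (possibly empty) slice of j.
out = np.split(j, np.cumsum(counts)[:-1])
# out holds three arrays: [1, 2], [] and [0]
```

This keeps len(out) equal to the number of rows in a, which is what the question asks for.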

Using the same mask to average the matching values of another array, row by row:

a = np.array([[1, 2, 2], [2, 3, 5]])
b = np.array([[10, 20, 30], [40, 50, 60]])

m = a == 2
i, j = np.where(m)
# (array([0, 0, 1]), array([1, 2, 0]))

idx = np.r_[0, np.diff(i).nonzero()[0]+1]
# array([0, 2])

out = np.add.reduceat(b[m], idx)/np.add.reduceat(m[m], idx)
# array([50, 40])/array([2, 1])

Output:

array([25., 40.])

Handling NaNs:

a = np.array([[1, 2, 2], [2, 3, 5]])
b = np.array([[10, 20, np.nan], [40, 50, 60]])

m = a == 2
i, j = np.where(m)
# (array([0, 0, 1]), array([1, 2, 0]))

idx = np.r_[0, np.diff(i).nonzero()[0]+1]
# array([0, 2])

b_m = b[m]
# array([20., nan, 40.])
nans = np.isnan(b_m)
# array([False,  True, False])

out = np.add.reduceat(np.where(nans, 0, b_m), idx)/np.add.reduceat(~nans, idx)
# array([20., 40.])/array([1, 1])

Output:

array([20., 40.])
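
If the intermediate indices are not needed, the same per-row average can probably be written more compactly. This is a sketch of an alternative, not the approach used above: it blanks out the non-matching cells with NaN and lets np.nanmean ignore them along with any NaNs already present in b.

```python
import numpy as np

a = np.array([[1, 2, 2], [2, 3, 5]])
b = np.array([[10, 20, np.nan], [40, 50, 60]])

m = a == 2
# Keep b where the condition holds and replace everything else with NaN.
masked = np.where(m, b, np.nan)
# nanmean skips both the blanked cells and the NaNs already in b.
out = np.nanmean(masked, axis=1)
# array([20., 40.])
```

A row with no usable value (no match, or only NaN matches) comes out as nan and triggers a RuntimeWarning, so it stays visible in the result rather than shifting the other rows.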
2024-07-15
mozway