Question
Numpy: apply mask to values, then take mean, but in parallel
I have a 1-D NumPy array of values:
import numpy as np
v = np.array([0, 1, 4, 0, 5])
I also have a 2-D NumPy array of boolean masks, one mask per row (in production, there are millions of masks):
m = np.array([
    [True, True, False, False, False],
    [True, False, True, False, True],
    [True, True, True, True, True],
])
I want to apply each row of m as a mask to the array v, and then compute the mean of the values that each mask selects.
Expected behavior:
results = []
for mask in m:
    results.append(np.mean(v[mask]))
print(results) # [0.5, 3.0, 2.0]
This is easy to do sequentially, but I am sure there is a beautiful vectorized version that handles all masks at once. One solution that I've found:
mask = np.ones(m.shape)
mask[~m] = np.nan
np.nanmean(v * mask, axis=1) # [0.5, 3.0, 2.0]
Is there another solution, perhaps using the np.ma module? I am looking for something faster than the two solutions above.
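For reference, here is a rough sketch of two candidates I would expect to work, using the same v and m as above. The names masked, ma_means and dot_means are only illustrative. The first uses np.ma (which masks entries where the flag is True, hence the ~m); the second is not np.ma at all but a plain matrix product that divides the per-row sum by the per-row count. I have not timed either against the versions above, so whether they are faster is an open question.

```python
import numpy as np

v = np.array([0, 1, 4, 0, 5])
m = np.array([
    [True, True, False, False, False],
    [True, False, True, False, True],
    [True, True, True, True, True],
])

# np.ma variant: view v as one row per mask (broadcast_to returns a
# read-only view rather than a copy) and hide entries where m is False.
masked = np.ma.masked_array(np.broadcast_to(v, m.shape), mask=~m)
ma_means = masked.mean(axis=1)

# Matrix-product variant: (sum of selected values) / (number selected).
dot_means = (m @ v) / m.sum(axis=1)

print(ma_means)
print(dot_means)
```

Both variants assume every mask selects at least one value. A row that is entirely False has no well-defined mean, so it would need separate handling (the matrix-product variant divides 0 by 0 for such a row).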