Question
Polars - Filter DataFrame using another DataFrame's rows
I have two DataFrames, graph and search, with equivalent schemas (two Int64 columns each; only the column names differ).
Schema for graph:
SCHEMA = {
    "START_RANGE": pl.Int64,
    "END_RANGE": pl.Int64,
}
Schema for search:
SCHEMA = {
    "START": pl.Int64,
    "END": pl.Int64,
}
I want to search the graph DataFrame.
For every row in the search DataFrame, I want to find all the rows in graph whose [START_RANGE, END_RANGE] interval contains that search row's [START, END] interval, bounds inclusive (that is, START >= START_RANGE and END <= END_RANGE).
How can I achieve this using only Polars?
Example:
import polars as pl

graph = pl.DataFrame(
    {
        "START_RANGE": [1, 10, 20, 30],
        "END_RANGE": [5, 15, 25, 35],
    },
)
search = pl.DataFrame(
    {
        "START": [2, 11, 7],
        "END": [5, 14, 12],
    },
)
# Expected output:
# [2, 5] is in range [1, 5]      -> 2 >= 1 and 5 <= 5
# [11, 14] is in range [10, 15]  -> 11 >= 10 and 14 <= 15
# [7, 12] is not in any range
output = pl.DataFrame(
    {
        "START_RANGE": [1, 10],
        "END_RANGE": [5, 15],
    },
)