Question
How do I identify consecutive/contiguous dates in polars
I'm trying to isolate runs of dates using polars. For this, I've been playing with .rolling
, .rle
, and .rle_id
but can't seem to fit them together to make it work.
Given this:
df = pl.DataFrame({
"date": ["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-05", "2024-01-06", "2024-01-07"]
})
df = df.with_columns(pl.col("date").str.to_date())
I would like to have a 2nd column with the id of the run if they're contiguous:
[1, 1, 1, 2, 2, 2]
4 67
4