Question
Create date ranges from sample rate and number of samples using Polars
I have a time-series dataframe in Polars.
import polars as pl
from datetime import datetime

df = pl.DataFrame(
    {
        "sample_started_at": [
            datetime(2022, 1, 1, hour=1, minute=1, second=1),
            datetime(2022, 1, 1, hour=2, minute=1, second=1),
            datetime(2022, 1, 1, hour=3, minute=1, second=1),
        ],
        "sample_rate": [25600, 25600, 51200],
        "sample_size": [100, 200, 100],
    }
)
With columns:
sample_started_at: when the sample started.
sample_rate: how many samples were taken per second.
sample_size: number of samples in the measurement.
I want to add a column holding, for each row, an array of the timestamps at which each sample was taken.
The only way I was able to do it is with pl.datetime_ranges and hard-coded SAMPLE_SIZE and SAMPLE_RATE values.
import polars as pl
from datetime import datetime, timedelta

# Hard-coded: only correct for rows with these exact values.
SAMPLE_SIZE = 100
SAMPLE_RATE = 25600

df.with_columns(
    ranges=pl.datetime_ranges(
        start=pl.col("sample_started_at"),
        # Last sample is (SAMPLE_SIZE - 1) intervals after the start.
        end=pl.col("sample_started_at") + timedelta(seconds=1 / SAMPLE_RATE * (SAMPLE_SIZE - 1)),
        # Note: timedelta rounds to whole microseconds (1/25600 s becomes 39 µs).
        interval=timedelta(seconds=1 / SAMPLE_RATE),
    ),
).select(
    pl.col("sample_started_at"),
    pl.col("ranges"),
    ranges_len=pl.col("ranges").list.len(),
)
But since the sample rate and sample size can differ from row to row, I need to take those values from the sample_rate and sample_size columns instead of constants.
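To pin down the expected per-row result independently of Polars, here is a minimal plain-Python sketch. It assumes sample i is taken i / sample_rate seconds after sample_started_at; the helper name sample_times is only illustrative and not part of any library.

```python
from datetime import datetime, timedelta


def sample_times(started_at, sample_rate, sample_size):
    """Return one timestamp per sample for a single row.

    Assumption: sample i is taken i / sample_rate seconds after started_at.
    """
    # timedelta has microsecond resolution, so each offset is rounded
    # to the nearest microsecond.
    return [
        started_at + timedelta(seconds=i / sample_rate)
        for i in range(sample_size)
    ]


# Same rows as the DataFrame above: (sample_started_at, sample_rate, sample_size).
rows = [
    (datetime(2022, 1, 1, 1, 1, 1), 25600, 100),
    (datetime(2022, 1, 1, 2, 1, 1), 25600, 200),
    (datetime(2022, 1, 1, 3, 1, 1), 51200, 100),
]

expected = [sample_times(*row) for row in rows]
print([len(r) for r in expected])  # [100, 200, 100]
```

This is only a reference for the values I expect in the new column; it loops in Python, so it is not what I would want to run on a large DataFrame.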
Is there another way to do this?
Thanks