It depends on the exact operation you need to perform.
The example you've given can be expressed in terms of .cum_sum()
with additional arithmetic:
import polars as pl

def plus_prev_times_2(col):
    # Weights 2^(n-2), ..., 2^0 for the rows after the first
    x = 2 ** pl.int_range(pl.len() - 1).reverse()
    # Powers 2^1, ..., 2^(n-1) that scale the first value
    y = 2 ** pl.int_range(1, pl.len())
    cs = (x * col.slice(1)).cum_sum()
    return cs / x + col.first() * y

df = pl.DataFrame({"some_col": [7, 3, 9, 2]})

df.with_columns(
    pl.col.some_col.first()
    .append(pl.col.some_col.pipe(plus_prev_times_2))
    .alias("plus_prev_times_2")
)
shape: (4, 2)
┌──────────┬───────────────────┐
│ some_col ┆ plus_prev_times_2 │
│ ---      ┆ ---               │
│ i64      ┆ f64               │
╞══════════╪═══════════════════╡
│ 7        ┆ 7.0               │
│ 3        ┆ 17.0              │
│ 9        ┆ 43.0              │
│ 2        ┆ 88.0              │
└──────────┴───────────────────┘
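To see why the arithmetic works, here is a minimal plain-Python sketch (standard library only, not Polars code) that compares the recurrence `acc = acc * 2 + x` against the same closed form. The names `expected` and `closed` are only for illustration.

```python
from itertools import accumulate

values = [7, 3, 9, 2]

# Reference: the recurrence acc = acc * 2 + x, seeded with the first value.
expected = list(accumulate(values, lambda acc, x: acc * 2 + x))

# Closed form mirroring the Polars expression, for the rows after the first.
n = len(values)
x = [2 ** k for k in reversed(range(n - 1))]  # weights [4, 2, 1]
y = [2 ** k for k in range(1, n)]             # powers  [2, 4, 8]
# Cumulative sum of the weighted values, like (x * col.slice(1)).cum_sum()
cs = list(accumulate(w * v for w, v in zip(x, values[1:])))
closed = [values[0]] + [c / w + values[0] * p for c, w, p in zip(cs, x, y)]

print(expected)  # [7, 17, 43, 88]
print(closed)    # [7, 17.0, 43.0, 88.0]
```

Note that the weights grow as powers of two with the number of rows, so this trick is presumably only practical for fairly short columns before integer overflow or floating-point precision becomes a concern.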
Vertical fold/scan
In general, I believe what you're asking for is called a "vertical fold/scan".
Polars only offers a horizontal version, pl.cum_fold:
df = pl.DataFrame(dict(a=[7], b=[3], c=[9], d=[2]))

df.with_columns(
    pl.cum_fold(acc=0, function=lambda acc, x: acc * 2 + x, exprs=pl.all())
)
shape: (1, 5)
┌─────┬─────┬─────┬─────┬──────────────┐
│ a   ┆ b   ┆ c   ┆ d   ┆ cum_fold     │
│ --- ┆ --- ┆ --- ┆ --- ┆ ---          │
│ i64 ┆ i64 ┆ i64 ┆ i64 ┆ struct[4]    │
╞═════╪═════╪═════╪═════╪══════════════╡
│ 7   ┆ 3   ┆ 9   ┆ 2   ┆ {7,17,43,88} │
└─────┴─────┴─────┴─────┴──────────────┘
As discussed in the issue, a vertical equivalent would be hugely inefficient.
For an efficient approach, you can write a plugin in Rust,
but using something like numba is probably easier to implement.
There are several existing numba answers to similar questions.
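For reference, a rough sketch of what a numba-based vertical scan could look like for this particular recurrence. This is not taken from any of the existing answers: the function name `scan_times_2` is made up for illustration, and the `try`/`except` fallback is only there so the snippet still runs (slowly) when numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption: fall back to plain Python if numba is unavailable.
    def njit(func):
        return func

@njit
def scan_times_2(values):
    # Sequential scan: acc = acc * 2 + x, keeping every intermediate result.
    out = np.empty(len(values), dtype=np.float64)
    acc = 0.0
    for i in range(len(values)):
        acc = acc * 2 + values[i]
        out[i] = acc
    return out

result = scan_times_2(np.array([7, 3, 9, 2], dtype=np.int64))
print(result.tolist())  # [7.0, 17.0, 43.0, 88.0]
```

To use it on a Polars column, something along the lines of `pl.col("some_col").map_batches(lambda s: pl.Series(scan_times_2(s.to_numpy())))` should work, although the exact arguments (e.g. a return dtype) may depend on your Polars version.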