Question

Sort within groups after a group_by in Polars

import polars as pl

# Sample data
data = {
    'Group': ['A', 'A', 'B', 'B', 'C', 'C'],
    'Value': [10, 20, 15, 25, 5, 30],
    'OtherColumn': [100, 200, 150, 250, 50, 300]
}

# Create DataFrame
df = pl.DataFrame(data)

# Group by 'Group' and sort within each group by 'Value'
sorted_df = df.group_by('Group').map_groups(lambda group_df: group_df.sort('Value'))

# Display the sorted DataFrame
print(sorted_df)

Is there a native Polars way to sort within groups after a group_by without using map_groups? The alternative I know of is to sort on multiple columns, but I would like to group_by first and then sort.


Solution

  • sort_by('Value') sorts every selected column by 'Value'.
  • over('Group') applies that sort within each group.
df.select(pl.all().sort_by('Value').over('Group'))

┌───────┬───────┬─────────────┐
│ Group ┆ Value ┆ OtherColumn │
│ ---   ┆ ---   ┆ ---         │
│ str   ┆ i64   ┆ i64         │
╞═══════╪═══════╪═════════════╡
│ A     ┆ 10    ┆ 100         │
│ A     ┆ 20    ┆ 200         │
│ B     ┆ 15    ┆ 150         │
│ B     ┆ 25    ┆ 250         │
│ C     ┆ 5     ┆ 50          │
│ C     ┆ 30    ┆ 300         │
└───────┴───────┴─────────────┘

Or, if you are OK with the groups themselves possibly being reordered:

df.sort('Group', 'Value')

┌───────┬───────┬─────────────┐
│ Group ┆ Value ┆ OtherColumn │
│ ---   ┆ ---   ┆ ---         │
│ str   ┆ i64   ┆ i64         │
╞═══════╪═══════╪═════════════╡
│ A     ┆ 10    ┆ 100         │
│ A     ┆ 20    ┆ 200         │
│ B     ┆ 15    ┆ 150         │
│ B     ┆ 25    ┆ 250         │
│ C     ┆ 5     ┆ 50          │
│ C     ┆ 30    ┆ 300         │
└───────┴───────┴─────────────┘
2024-07-05
Roman Pekar