Question

Is it undefined behavior to directly remove elements from the underlying range of filter_view?

For the following code:

void foo(std::set<int>& s) {
    for (int val : s | std::views::filter([](int i) { return i > 0; })) {
        s.erase(val - 3);
    }
}

This code is not intended to perform any meaningful task; it's only for illustrative purposes. Here, I use the range adaptor std::views::filter to filter out all positive numbers from s and remove the corresponding val - 3 elements.

This code involves erasing elements from the container while iterating over it. For std::set, erasing an element does not invalidate iterators other than those pointing to the erased element. Therefore, if I convert the above code to an equivalent iterator form, it should not include undefined behavior:

void bar(std::set<int>& s) {
    for (int val : s) {
        if (val > 0) {
            s.erase(val - 3);
        }
    }
}

However, I am not sure if foo contains undefined behavior because the standard imposes additional semantic constraints on some concepts in the ranges library, such as requiring constrained expressions to be equality-preserving. I'm not sure if foo violates these semantic requirements.

Additionally, range.filter.iterator explicitly allows modifying the underlying elements through iterators from filter_view as long as the modified elements still satisfy the predicate. But the standard does not specify what happens when directly modifying the underlying range as in foo.

Is foo undefined behavior due to these modifications, or is it safe under the current C++ standard?

 5  117  5
1 Jan 1970

Solution

 3

One of the inherent problems with lazy filtering is the unexpected behavior you can get in the presence of mutation coupled with multi-pass algorithms - because mutation with filter, specifically, changes whether an element is in the range to begin with! So the 2nd pass of an algorithm can get different amounts of elements.

An example I wrote about previously was:

vector<int> v = {1, 2, 3, 4, 5, 6};
auto evens = v | views::filter(is_even);
pairwise(evens.begin(), evens.end(), [](int& i, int& j){
    fmt::println("({}, {})", i, j);
    ++j;
});

which prints (2, 4) and then... (6, 6). Because the trailing iterator skips an element (because it no longer satisfies the predicate) and catches up to the leading iterator.

But the primary issue with filter is specifically about changing an element from satisfying the predicate to not satisfying the predicate. Actually removing an element from underneath the range (as you are doing) isn't really specific to views::filter -- it's not really any different to removing an element out from under the range in views::transform. It's just that we want to say this particular form of mutation with respect to views::filter can lead to unexpected behavior.

In general this kind of mutating-the-range-out-from-under-a-view behavior is kind of sketchy and could lead to the usual kinds of lifetime issues. So just... be careful with it.

2024-07-01
Barry

Solution

 2

A few things work in your favour for this not to be undefined behaviour.

  • You are using std::set, which only invalidates pointers, references and iterators to the removed elements.
  • You are removing elements strictly preceding the current element.
  • std::ranges::filter_view​::​iterator is defined in terms of an (exposition only) data member iterator_t<V> current_.

Taking all of those into account, the standard does define the behaviour of your filter_view example.

Aside: it is not required to have precisely the same behaviour as the raw loop version, because filter_view​::​iterator::operator++ is defined in terms of ranges::find_if, which isn't required to check in order when given forward iterators, but afaik all major implementations will do a linear search.

2024-07-01
Caleth