Question

Why is `i = ++i + 1` unspecified behavior?

Consider the following C++ Standard ISO/IEC 14882:2003(E) citation (section 5, paragraph 4):

Except where noted, the order of evaluation of operands of individual operators and subexpressions of individual expressions, and the order in which side effects take place, is unspecified. 53) Between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be accessed only to determine the value to be stored. The requirements of this paragraph shall be met for each allowable ordering of the subexpressions of a full expression; otherwise the behavior is undefined. [Example:
i = v[i++];  // the behavior is unspecified 
i = 7, i++, i++;  //  i becomes 9 

i = ++i + 1;  // the behavior is unspecified 
i = i + 1;  // the value of i is incremented 
—end example]

I was surprised that i = ++i + 1 gives an undefined value of i. Does anybody know of a compiler implementation which does not give 2 for the following case?

int i = 0;
i = ++i + 1;
std::cout << i << std::endl;

The thing is that operator= has two args. First one is always i reference. The order of evaluation does not matter in this case. I do not see any problem except C++ Standard taboo.

Please, do not consider such cases where the order of arguments is important to evaluation. For example, ++i + i is obviously undefined. Please, consider only my case i = ++i + 1.

Why does the C++ Standard prohibit such expressions?

45 4181 45

1 Jan 1970

Solution

You make the mistake of thinking of operator= as a two-argument function, where the side effects of the arguments must be completely evaluated before the function begins. If that were the case, then the expression i = ++i + 1 would have multiple sequence points, and ++i would be fully evaluated before the assignment began. That's not the case, though. What's being evaluated in the intrinsic assignment operator, not a user-defined operator. There's only one sequence point in that expression.

The result of ++i is evaluated before the assignment (and before the addition operator), but the side effect is not necessarily applied right away. The result of ++i + 1 is always the same as i + 2, so that's the value that gets assigned to i as part of the assignment operator. The result of ++i is always i + 1, so that's what gets assigned to i as part of the increment operator. There is no sequence point to control which value should get assigned first.

Since the code is violating the rule that "between the previous and next sequence point a scalar object shall have its stored value modified at most once by the evaluation of an expression," the behavior is undefined. Practically, though, it's likely that either i + 1 or i + 2 will be assigned first, then the other value will be assigned, and finally the program will continue running as usual — no nasal demons or exploding toilets, and no i + 3, either.

2009-12-07

Solution

It's undefined behaviour, not (just) unspecified behaviour because there are two writes to i without an intervening sequence point. It is this way by definition as far as the standard specifies.

The standard allows compilers to generate code that delays writes back to storage - or from another view point, to resequence the instructions implementing side effects - any way it chooses so long as it complies with the requirements of sequence points.

The issue with this statement expression is that it implies two writes to i without an intervening sequence point:

i = i++ + 1;

One write is for the value of the original value of i "plus one" and the other is for that value "plus one" again. These writes could happen in any order or blow up completely as far as the standard allows. Theoretically this even gives implementations the freedom to perform writebacks in parallel without bothering to check for simultaneous access errors.

2009-12-07