Value categories in C++11

One of the most important additions to C++ in the C++11 standard was the introduction of movable types. This feature has consequences for many common programming tasks such as assigning variables and passing arguments to or returning objects from a function. Move semantics are a bit subtle, and when reading documentation, it helps to understand some vocabulary: value categories.

Before C++11, any expression in C++ was categorized as either an lvalue or an rvalue, depending on whether it had identity. Basically, an expression has identity if it has a name and thus outlives an expression that uses it. Pre-C++11, an lvalue was something with identity, and an rvalue was anything else, e.g. any temporary value that doesn’t outlive its particular expression. So if we have a line like

MyClass foo = bar();

then foo is an lvalue because it has a name and life beyond this particular line, whereas the temporary object returned by bar() does not and is thus an rvalue. The terms come from the fact that lvalues usually show up on the _l_eft side of an assignment while rvalues are found on the right. However, that doesn’t always hold – a const object can’t be assigned to, but it still has a name and a lifetime beyond a particular expression, so it’s still an lvalue.

Some more examples:

int f();
int& g();

int bar;
bar = 4;   // good
4 = bar;   // bad -- can't assign to the rvalue 4
bar = f(); // good
f() = bar; // bad -- can't assign to the temporary int returned by f()
g() = bar; // good -- the reference returned by g() is an lvalue

So anyway, the point here is that prior to C++11, expressions were categorized based on identity. Lvalues have identity, so they can be assigned to, it makes sense to take their address, etc., while we can’t assign to or ask for the address of an rvalue.

In C++11, things changed a bit, thanks to the need to talk about the new move semantics. It turns out to be helpful to categorize expressions not just based on identity, but also based on movability. Identity means the same thing as before, but we should clarify movability. When we say an object “can be moved”, we don’t mean that it’s possible to copy all its bits to a different spot in memory and thus have moved it. Essentially, an expression is movable if we can use it as input to a move constructor or move assignment operator. These move methods were introduced in C++11 and are allowed to basically cannibalize one object in order to cheaply construct a new (or moved) object.

An easy example here is std::vector. A vector object is basically composed of a bit of accounting information along with a pointer to a potentially-large, dynamically-allocated array. It can be expensive to copy vectors around (say, when returning from a function) because the potentially-large array has to be copied. So instead, it’s sometimes cheaper for a new vector to be constructed from an old one not by copying the old vector’s array but rather just by stealing it. This is often implemented with a swap operation – the new object starts as an empty vector and then swaps its pointer with the old object’s. Anyhow, this is destructive to the original object. After this movement, the old (cannibalized) object is left in a “valid but unspecified” state. Basically we shouldn’t do anything with it as-is other than destroy it or assign it a new value.
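To make the stealing idea concrete, here is a minimal sketch of a vector-like class. The names Buffer and demo_move_constructor are made up for illustration, and this is not how std::vector is actually implemented; it only shows the "start empty, then swap" approach described above. Note that this particular class happens to leave the moved-from object empty, which is one valid choice among many.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical vector-like class: a size plus a pointer to a heap array.
class Buffer {
public:
    explicit Buffer(std::size_t n) : size_(n), data_(new int[n]()) {}

    // Copy constructor: allocates a fresh array and copies every element.
    Buffer(const Buffer& other)
        : size_(other.size_), data_(new int[other.size_]) {
        for (std::size_t i = 0; i < size_; ++i) data_[i] = other.data_[i];
    }

    // Move constructor: start out empty, then swap with the source.
    // The new object ends up owning the source's array; no elements are copied.
    Buffer(Buffer&& other) noexcept : size_(0), data_(nullptr) {
        swap(other);
    }

    // Assignment takes its argument by value, so the argument is built with
    // either the copy or the move constructor, and then we swap with it.
    Buffer& operator=(Buffer other) noexcept {
        swap(other);
        return *this;
    }

    ~Buffer() { delete[] data_; }

    void swap(Buffer& other) noexcept {
        std::swap(size_, other.size_);
        std::swap(data_, other.data_);
    }

    std::size_t size() const { return size_; }
    const int* data() const { return data_; }

private:
    std::size_t size_;
    int* data_;
};

// Hypothetical check that moving steals the array instead of copying it.
void demo_move_constructor() {
    Buffer a(1000);
    const int* original_array = a.data();

    Buffer b(std::move(a));               // move: b takes over a's array
    assert(b.data() == original_array);   // same array, nothing was copied
    assert(b.size() == 1000);
    assert(a.size() == 0);                // this Buffer leaves its source empty

    Buffer c(b);                          // copy: c gets its own array
    assert(c.data() != b.data());
    assert(c.size() == b.size());
}
```

Calling demo_move_constructor() from main exercises both paths: the move is a couple of pointer swaps regardless of size, while the copy touches all 1000 elements.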

So, there are some safety concerns here. The language can’t go around using the move functionality willy-nilly under the covers, because the programmer might end up trying to use an object after it’s been cannibalized. There are two spots where it’s safe to use move operations:

when the programmer explicitly says so
for temporary objects – if we don’t have a name for something, we can’t shoot ourselves in the foot by accessing it later

Great, so we say something is movable if it’s safe to cannibalize it, and that only happens in the two cases above.
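The two safe cases can be observed through overload resolution. The sketch below uses a made-up pair of overloads called which and a made-up helper make_name; neither function actually moves anything, they only report which overload the compiler picked for each kind of argument.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical overload pair that reports which one was selected.
// The const& version accepts anything but must not cannibalize its argument.
std::string which(const std::string&) { return "copy"; }
// The && version is only chosen for expressions that are safe to move from.
std::string which(std::string&&) { return "move"; }

// Hypothetical helper that returns a temporary by value.
std::string make_name() { return "temporary"; }

void demo_overloads() {
    std::string named = "hello";

    // A plain named object: we might use it later, so it is not movable.
    assert(which(named) == "copy");

    // Case 1: the programmer explicitly says moving is OK.
    assert(which(std::move(named)) == "move");

    // Case 2: a temporary with no name that nobody can access later.
    assert(which(make_name()) == "move");
}
```

Since which never modifies its argument, named is still intact after the std::move call here; std::move by itself only grants permission to move.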

Now we have two properties of expressions: identity and movability. Any given expression either has identity or does not and is movable or is not. There are 4 possible combinations:

lvalues have identity and are not movable
xvalues (I’ve heard eXtraordinary, eXpert, eXpiring) have identity and are movable
prvalues (pure rvalues) do not have identity but are movable
The last combination, no identity and not movable, is not useful and thus not used.

Any expression in C++11 belongs to exactly one of these three value categories.

So let’s see – lvalues are pretty much the same as before C++11. An lvalue is anything that’s not safe to move – it’s something that has a name and a life. We can take its address and, unless it’s const, assign to it. We can’t move (i.e. cannibalize) it because we might access it later and find it in an unspecified state. The other two value categories are movable and correspond to the two different ways it can be safe to move something.

First, the compiler can move something if the programmer explicitly says it’s OK (using, perhaps, std::move()). In this case, we get something that has a name but is also movable – an xvalue. It is up to the programmer to avoid doing anything dumb with an xvalue’s name. Second, the compiler can move a temporary object with no identity, a.k.a. a prvalue.

That’s pretty much all there is to it – expressions in C++11 are partitioned into lvalues, xvalues, and prvalues based on whether they have identity and whether they’re movable. There are things that can’t be moved at all (lvalues), things that can be moved because the programmer says so (xvalues), and things that can be moved because they don’t have a name (prvalues).
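One way to check which of the three categories an expression falls into is to ask the compiler. With a doubled pair of parentheses, decltype((expr)) gives T& for an lvalue, T&& for an xvalue, and plain T for a prvalue. The sketch below builds on that rule; the names category, CATEGORY_OF, make, and demo_categories are invented for this example.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Map the type produced by decltype((expr)) to a category name.
// Plain T means prvalue, T& means lvalue, T&& means xvalue.
template <typename T> struct category {
    static std::string name() { return "prvalue"; }
};
template <typename T> struct category<T&> {
    static std::string name() { return "lvalue"; }
};
template <typename T> struct category<T&&> {
    static std::string name() { return "xvalue"; }
};

// The extra parentheses matter: decltype(x) would give x's declared type,
// while decltype((x)) reflects the value category of the expression x.
#define CATEGORY_OF(expr) (category<decltype((expr))>::name())

// Hypothetical helper that returns a temporary by value.
std::string make() { return "tmp"; }

void demo_categories() {
    std::string s = "named";

    assert(CATEGORY_OF(s) == "lvalue");             // has a name, not movable
    assert(CATEGORY_OF(std::move(s)) == "xvalue");  // has a name, movable by request
    assert(CATEGORY_OF(make()) == "prvalue");       // no name, movable
    assert(CATEGORY_OF(42) == "prvalue");           // literals are temporaries too
}
```

The operand of decltype is never evaluated, so s is not actually moved from by these checks.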

It turns out there are a couple more terms that group those 3 fundamental categories in different ways. Any expression with identity (regardless of movability) is called a glvalue (generalized lvalue). In other words, an expression is a glvalue if it’s either an lvalue or an xvalue. Anything that’s movable (regardless of identity) is called an rvalue. That is, something’s an rvalue if it’s either a prvalue or an xvalue. Here’s a picture that might help:

[Figure: Value categories]

Quiz:

What do you call a glvalue that’s movable?
What do you call a C++11 rvalue that doesn’t have identity?
What value category does a call to std::move belong to?