Question

Assign base object without changing inherited ones?

How valid is it to assign a new object to a base one, without changing the inherited ones?

The example below works as expected, but is it by chance?

This example is simple, but is there any situation where something can break, such as when there are virtual functions, pointers, and other more complex C++ stuff?

#include <iostream>

struct Base
{
    Base(int x) : base_data(x){}
    int base_data;
};

struct Concrete : public Base
{
    Concrete(int x, int y) : Base(x), concrete_data(y){}
    int concrete_data;
};

int main()
{
    Concrete child(1, 2);

    std::cout << child.base_data << std::endl;     // 1
    std::cout << child.concrete_data << std::endl; // 2

    Base* parent = &child;
    *parent = Base(3); // change base only

    std::cout << child.base_data << std::endl;     // 3 (changed)
    std::cout << child.concrete_data << std::endl; // 2 (remains the same)

    return 0;
}



Solution

As mentioned in the comments, this line:

*parent = Base(3); // change base only

is 100% valid and well defined.

The result is that the members of the base class, Base, are updated in the *parent object (even though it is actually a Concrete object, i.e. an instance of a derived class).

The vtable holding the pointers to the virtual methods (which BTW is an implementation detail and not specified as such in the standard) is not overwritten, because assignment never changes the dynamic type of an object.
Therefore the virtual methods of Concrete will still be called (because this was and still is the type of the actual object).

Therefore the behavior you observe is the expected one.

Complete example including a virtual method:

#include <iostream>

struct Base {
    Base(int x) : base_data(x) {}
    int base_data;
    virtual void m() { std::cout << "Base::m()\n"; }
};

struct Concrete : public Base {
    Concrete(int x, int y) : Base(x), concrete_data(y) {}
    int concrete_data;
    void m() override { std::cout << "Concrete::m()\n"; }
};

int main() {
    Concrete child(1, 2);

    std::cout << child.base_data << std::endl;     // 1
    std::cout << child.concrete_data << std::endl; // 2
    child.m();

    Base* parent = &child;
    *parent = Base(3); // change base only

    std::cout << child.base_data << std::endl;     // 3 (changed)
    std::cout << child.concrete_data << std::endl; // 2 (remains the same)
    child.m();
}

Output:

1
2
Concrete::m()
3
2
Concrete::m()


A side note:
Although well defined, such code is somewhat confusing (which I guess is the reason for your question).
I recommend doing it only if you have a good reason, which you have not explained.

2024-07-18

wohlstad

Solution

TL;DR: Yes, this is well defined in what it does, BUT what it does may not be exactly what you want in all cases. You need to design your classes carefully to ensure it doesn't cause problems.

The code

Base* parent = &child;
*parent = Base(3); // change base only

is well-defined in what it does: it calls the Base (move) assignment operator, whatever that may be. For simple objects like the ones you've defined, this is just fine. The default assignment operator generated for these classes will assign (just) the base fields. The derived field (and the actual type) of the child object will not be affected in any way.

However, the fact that this can be called this way has implications for the design of any class that can be inherited from, or any class that inherits from it.

The Base assignment operator needs to be aware of the fact that it might be called on a derived class instance, or with an argument that is a reference to a derived class instance (or both! and they might even be different derived classes!).
The derived class needs to be aware of the fact that base class assignment (and methods) may be called, which may be relevant if it is trying to maintain some invariants between base class fields and derived class fields.

For example, consider an attempt to inherit from a pimpl class:

#include <utility>  // for std::swap

class Base {
 protected:
    class Impl {
        // implementation details
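     public:
        virtual ~Impl() = default;  // Impl is deleted through a Base::Impl*, so the destructor must be virtual
        virtual Impl *clone() const { return new Impl(*this); }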
    } *impl;
    Base(Impl *i) : impl(i) {}
 public:
    Base() : impl(new Impl) {}
    Base(const Base &a) : impl(a.impl->clone()) {}
    Base(Base &&a) : impl(a.impl) { a.impl = nullptr; }
    Base &operator=(const Base &a) {
        if (this != &a) {
            delete impl;
            impl = a.impl->clone(); }
        return *this; }
    Base &operator=(Base &&a) {
        std::swap(impl, a.impl);
        return *this; }
    ~Base() { delete impl; }
    // other public methods, generally call impl->whatever
};

class Derived : public Base {
 protected:
    class Impl : public Base::Impl {
        // implementation details
     public:
        Impl *clone() const override { return new Impl(*this); }
        void some_derived_impl_method() {}  // placeholder for a Derived-only operation
    };
 public:
    Derived() : Base(new Impl) {}
    void foo() {
        /* we "know" this will always be a Derived */
        static_cast<Impl *>(impl)->some_derived_impl_method();
    }
};

At first glance, this may seem ok, but it turns out the comment in Derived::foo is wrong. If someone goes and does:

Derived child;
Base *parent = &child;
*parent = Base();
child.foo();

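bad things will happen: the move assignment swaps the impl pointers, so child.impl now points to a plain Base::Impl, and the static_cast in Derived::foo has undefined behavior. You can fix this by changing the assignment operators to carefully check the type of *impl and of the object being assigned, but that can get very complex fast when there are multiple derived classes that might exist.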

2024-07-22

Chris Dodd