Reference Counted Class Objects

Version 1
Created 2015-02-23
Status Draft
Last modified --
Author Walter Bright and Andrei Alexandrescu

Abstract

This DIP proposes @safe reference counted class objects (including exceptions) and interfaces for D.

Description

DIP25 allows defining struct types that own data and expose references to it, @safely, whilst controlling lifetime of that data. This proposal allows defining class objects that are safe yet use deterministic destruction for themselves and resources they own.

The compiler detects automatically and treats specially all classes and interfaces that define the following two methods:

class Widget {
    T1 opAddRef();
    T2 opRelease();
    ...
}

T1 and T2 may be any types (usually void or an integral type). The methods may or may not be final, virtual, or inherited from a supertype. Any attributes are allowed on these methods. (If practical, nothrow and final are suggested for performance.) They must be public. UFCS-expanded calls are not acceptable. If these two methods exist, the compiler categorizes this class or interface type as a reference counted object (RCO).
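The detection rule can be approximated with a compile-time check. This is only a sketch: isRCO is a hypothetical name that belongs neither to this proposal nor to any library, and __traits(hasMember) does not verify that the methods are public, as the rule requires.

```d
// Hypothetical trait approximating the detection rule described above.
// Assumption: testing for the member names is close enough for illustration;
// protection level and callability are not checked.
enum bool isRCO(T) =
    (is(T == class) || is(T == interface))
    && __traits(hasMember, T, "opAddRef")
    && __traits(hasMember, T, "opRelease");

class Plain { int x; }

class Counted {
    void opAddRef() {}
    void opRelease() {}
}

class Derived : Counted {}   // inherits both methods from Counted

interface ICounted {
    void opAddRef();
    void opRelease();
}

struct NotAClass {           // structs are outside the scope of the rule
    void opAddRef() {}
    void opRelease() {}
}

static assert(!isRCO!Plain);
static assert( isRCO!Counted);
static assert( isRCO!Derived);
static assert( isRCO!ICounted);
static assert(!isRCO!NotAClass);
```

Free functions callable through UFCS are not members, so they do not satisfy this check, which matches the rule that UFCS-expanded calls are not acceptable.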

Rules

General

@safe class Widget1 {
    private int data;
    ref int getData() { return data; } // fine
    ...
}

@safe class Widget2 {
    private int data;
    ref int getData1() { return data; } // ERROR
    ref int getData2() return { return data; } // fine
    ulong opAddRef();
    ulong opRelease();
    ...
}

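Widget1 may return a reference to its internal state without any annotation because it is safe for a garbage collected object to escape such references. The same is not allowed for reference counted objects such as Widget2, whose methods must be annotated with return when they return references to internal state, because these objects are expected to be deallocated in a deterministic manner (same as e.g. struct objects on the stack).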

Creating references

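A new reference created from an lvalue, as in auto a = lvalExpr;, is lowered to the following pseudocode, which calls opAddRef only if the reference is not null:

auto a = function(x) { if (x) x.opAddRef(); return x; }(lvalExpr);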
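The pseudocode above can be written as an ordinary function for experimentation. This is a minimal sketch, assuming a compiler that does not perform the proposed lowering, so the call is made by hand; Tracked and acquire are hypothetical names.

```d
// Hypothetical class that only counts; a real RCO would deallocate at zero.
class Tracked {
    uint refs = 1;                 // the reference held at creation
    void opAddRef()  { ++refs; }
    void opRelease() { --refs; }
}

// Hypothetical helper standing in for the lowering of `auto a = lvalExpr;`.
T acquire(T)(T x) {
    if (x) x.opAddRef();           // null references are left alone
    return x;
}
```

Calling acquire(w) on a freshly created Tracked object leaves its count at 2, and acquire(null) performs no call at all.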

Assignment to existing references

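Assigning one lvalue to another, as in lvalExprA = lvalExprB;, is lowered to the following pseudocode:

function(ref x, y) {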
    if (y) y.opAddRef();
    scope(failure) if (y) y.opRelease();
    if (x) x.opRelease();
    x = y;
}(lvalExprA, lvalExprB);

The complexity of this code underscores the importance of making opAddRef and especially opRelease nothrow. In that case the scope(failure) statement may be elided.

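When the right-hand side is an rvalue, no opAddRef call is needed, and lvalExpr = rvalExpr; is lowered to:

function(ref x, y) {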
    if (x) x.opRelease();
    x = y;
}(lvalExpr, rvalExpr);
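Both assignment lowerings can be tried out as plain functions. The sketch below assumes nothrow counting methods (so scope(failure) is omitted) and a compiler that does not insert the calls itself; Tracked, assignLvalue, and assignRvalue are hypothetical names.

```d
// Hypothetical class that only counts; a real RCO would deallocate at zero.
class Tracked {
    uint refs = 1;
    void opAddRef()  nothrow { ++refs; }
    void opRelease() nothrow { --refs; }
}

// Stand-in for `lvalExprA = lvalExprB;`.
void assignLvalue(T)(ref T x, T y) {
    if (y) y.opAddRef();   // acquire first, so self-assignment never reaches zero
    if (x) x.opRelease();
    x = y;
}

// Stand-in for `lvalExpr = rvalExpr;`: the rvalue already carries its own reference.
void assignRvalue(T)(ref T x, T y) {
    if (x) x.opRelease();
    x = y;
}
```

Acquiring the new value before releasing the old one is what makes a = a harmless: the count rises to 2 and drops back to 1 instead of touching zero.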

Scope and Destructors

Passing references by value into functions

Consider a function that receives two RCO references by value:

void fun(Widget x, Widget y, bool c) {
    if (c) x = null;
    y.someMethod();
}
...
auto w = new Widget;
fun(w, w, true);

In this case, fun borrows the same RCO twice, while it still has only one recorded reference (the one at birth). Therefore, unwittingly assigning to x (and inserting the appropriate x.opRelease) will result in the reference count going to zero (and the object getting potentially deallocated). Following that, the use of y will be incorrect.

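The conservatively generated (correct) code therefore acquires each by-value RCO parameter on entry to the function and releases it on exit:

void fun(Widget x, Widget y, bool c) {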
    // BEGIN INSERTED CODE
    if (x) x.opAddRef();
    scope(exit) if (x) x.opRelease();
    if (y) y.opAddRef();
    scope(exit) if (y) y.opRelease();
    // END INSERTED CODE
    if (c) x = null;
    y.someMethod();
}
...
auto w = new Widget;
fun(w, w, true);
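The difference between the two versions of fun can be observed by recording the count at the moment someMethod runs. This is a sketch with the calls written by hand, assuming no compiler support for the lowering; refsSeenByMethod, assignNull, funUnprotected, and funProtected are hypothetical names, and the class only counts instead of deallocating.

```d
class Widget {
    uint refs = 1;
    uint refsSeenByMethod;              // instrumentation for this example only
    void opAddRef()  { ++refs; }
    void opRelease() { --refs; }        // a real RCO would deallocate at zero
    void someMethod() { refsSeenByMethod = refs; }
}

// Stand-in for the lowering of `x = null;`.
void assignNull(ref Widget x) {
    if (x) x.opRelease();
    x = null;
}

// Without the inserted code: the count reaches zero before y is used.
void funUnprotected(Widget x, Widget y, bool c) {
    if (c) assignNull(x);
    y.someMethod();
}

// With the inserted code: y keeps the object alive for the whole call.
void funProtected(Widget x, Widget y, bool c) {
    if (x) x.opAddRef();
    scope(exit) if (x) x.opRelease();
    if (y) y.opAddRef();
    scope(exit) if (y) y.opRelease();
    if (c) assignNull(x);
    y.someMethod();
}
```

With the same object passed for both parameters, funUnprotected calls someMethod on an object whose count is already zero, while funProtected calls it at count 2 and leaves the count at 1 on return.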

The two references don’t have to be aliased for problematic cases to occur. A more subtle example involves borrowing two RCOs, one being a member of the other:

class Gadget {
    Gadget next;
    ...
    // RCO primitives
    void opAddRef();
    void opRelease();
}
void fun(Gadget x, Gadget y, bool c) {
    if (c) x.next = null;
    y.someMethod();
}
...
auto m = new Gadget;
m.next = new Gadget;
fun(m, m.next, true);

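In the example above, the two Gadget objects created have reference count 1 upon entering fun. The conservatively generated (correct) code first raises both reference counts to 2, so the object passed as y remains alive across the assignment x.next = null, and the opRelease calls inserted at the exit of fun balance the opAddRef calls made on entry. A wrong code generation approach might free the object referenced by m.next while y still refers to it, invalidating the subsequent call y.someMethod().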
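The Gadget scenario can be traced the same way. The sketch below writes the calls by hand under one reading of the rules above (the field assignment releases the old value, and by-value parameters are acquired on entry and released on exit); refs, refsSeenByMethod, and clearNext are hypothetical, and the class only counts instead of deallocating.

```d
class Gadget {
    Gadget next;
    uint refs = 1;
    uint refsSeenByMethod;              // instrumentation for this example only
    void opAddRef()  { ++refs; }
    void opRelease() { --refs; }        // a real RCO would deallocate at zero
    void someMethod() { refsSeenByMethod = refs; }
}

// Stand-in for the lowering of `x.next = null;`.
void clearNext(Gadget x) {
    if (x.next) x.next.opRelease();
    x.next = null;
}

void fun(Gadget x, Gadget y, bool c) {
    if (x) x.opAddRef();
    scope(exit) if (x) x.opRelease();
    if (y) y.opAddRef();
    scope(exit) if (y) y.opRelease();
    if (c) clearNext(x);
    y.someMethod();                     // y still holds a counted reference here
}
```

Under these assumptions, the object passed as y has count 1 while someMethod runs, and its count drops to zero only when fun exits, after the last use.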

Functions returning references by value

Widget fun() {
    auto a = new Widget;
    return a; // no calls inserted
}

Note: this is not an optimization. The compiler does not have the discretion to insert additional opAddRef/opRelease calls.

Widget fun() {
    return new Widget; // no calls inserted
}

Note: as in the previous case, this is not an optimization. The compiler does not have the discretion to insert additional opAddRef/opRelease calls.

Widget fun(ref Widget a, Widget b, int c) {
    if (c == 0)
    {
        static Widget w;
        if (!w) w = new Widget;
        return w; // opAddRef inserted
    }
    if (c == 1) return a; // opAddRef inserted
    return b; // opAddRef inserted
}
Widget identity(Widget x) {
    return x;
}
...
auto a = new Widget; // reference count is 1
a = a; // fine, call opAddRef then opRelease per assignment lowering
a = identity(a); // fine, identity calls opAddRef and assignment calls opRelease
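The counts in the last example can be checked by writing out the calls by hand. This is a sketch under one reading of the rules (by-value parameters are acquired on entry and released on exit, and returning a parameter inserts an extra opAddRef), assuming no compiler support; make and the refs field are hypothetical.

```d
class Widget {
    uint refs = 1;
    void opAddRef()  { ++refs; }
    void opRelease() { --refs; }        // a real RCO would deallocate at zero
}

// Returning a fresh object: no calls inserted.
Widget make() {
    return new Widget;
}

Widget identity(Widget x) {
    if (x) x.opAddRef();                // by-value parameter acquired on entry
    scope(exit) if (x) x.opRelease();   // ... and released on exit
    if (x) x.opAddRef();                // returning a parameter inserts opAddRef
    return x;
}
```

After auto r = identity(a) the count is 2, because the returned value is a new owning reference. Completing a = identity(a) then releases the old value of a, which brings the count back to 1.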

Optimizations

Widget fun() {
    auto a = new Widget;
    auto b = a;
    return b;
}

Applying the rules defined above would have fun’s lowering insert one call to opAddRef (for creating b) and one call to opRelease (when a goes out of scope). However, these calls may be elided.
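The elision is safe because the two calls cancel out. The following sketch spells out both variants by hand, assuming no compiler support for the lowering; funNaive, funElided, and the refs field are hypothetical names.

```d
class Widget {
    uint refs = 1;
    void opAddRef()  { ++refs; }
    void opRelease() { --refs; }        // a real RCO would deallocate at zero
}

// Rules applied literally: one opAddRef for b, one opRelease for a.
Widget funNaive() {
    auto a = new Widget;                // count 1
    scope(exit) a.opRelease();          // a goes out of scope
    auto b = a;
    b.opAddRef();                       // creating b
    return b;                           // returning a local: no call inserted
}

// Both calls elided.
Widget funElided() {
    auto a = new Widget;
    auto b = a;
    return b;
}
```

Both functions hand the caller an object with count 1; the naive version merely passes through 2 on the way.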

Idioms and How-Tos

Defining a non-copyable reference type

Using @disable this(this); is a known idiom for creating struct objects that can be created and moved but not copied. The same is achievable with RCOs by means of @disable opAddRef(); (the declaration must still be present in order for the type to qualify as RCO, and implemented if not final).
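A minimal sketch of the idiom follows. Unique is a hypothetical name, and both methods are made final so that the disabled declaration needs no body; whether a compiler implementing this proposal would accept exactly this form is an assumption.

```d
class Unique {
    // Declared so the type still qualifies as an RCO, but never callable.
    @disable final void opAddRef();

    final void opRelease() {
        // release owned resources here
    }
}

// The member exists, so a name-based RCO check still finds it ...
static assert(__traits(hasMember, Unique, "opAddRef"));
// ... but any attempt to call it is rejected at compile time.
static assert(!__traits(compiles, (Unique u) { u.opAddRef(); }));
static assert( __traits(compiles, (Unique u) { u.opRelease(); }));
```

Since every copy of a reference would require a call to opAddRef, disabling the method turns each such copy into a compile-time error.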

Defining a reference counted object with deallocation

Classic reference counting techniques can be used with opAddRef and opRelease.

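import core.memory : GC;

class Widget {
    private uint _refs = 1; // the reference held by the creator
    void opAddRef() {
        ++_refs;
    }
    void opRelease() {
        if (_refs > 1) {
            --_refs;
        } else {
            // Last reference: run the destructor, then free the memory.
            this.destroy();
            GC.free(cast(void*) this);
        }
    }
    ...
}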

Usually such approaches also use private constructors and object factories to ensure the same allocation method is used during creation and destruction of the object.

If the object only needs to free this (and no other owned resources), the typechecking ensured by the compiler is enough to verify safety (however, @trusted needs to be applied to the call that frees this).
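The factory and @trusted advice can be combined as sketched below. The names create, dispose, and live are hypothetical; live is instrumentation added only so that the destruction can be observed, and the calls to opAddRef and opRelease would be made by hand on a compiler without the proposed lowering.

```d
import core.memory : GC;

class Widget {
    private static int live;            // number of live instances (illustration only)
    private uint _refs = 1;

    private this() { ++live; }          // private: creation goes through create()
    ~this() { --live; }

    // Factory: guarantees GC allocation, matching GC.free in dispose().
    static Widget create() { return new Widget; }

    void opAddRef() @safe { ++_refs; }

    void opRelease() @safe {
        if (_refs > 1) {
            --_refs;
        } else {
            dispose();
        }
    }

    // The only unchecked step is isolated in one small @trusted function.
    private void dispose() @trusted {
        destroy(this);
        GC.free(cast(void*) this);
    }
}
```

Keeping the free in a single private @trusted method narrows what has to be reviewed by hand: everything else in the class is checked by the compiler.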

Defining a type that owns resources

RCOs that own references are defined similarly to structs that own references. Attention must be paid to annotate all functions returning references to owned data with return.

import core.memory : GC;

class Widget {
    private uint _refs = 1;
    private int[] _payload; // owned

    ref int opIndex(size_t n) return { // mark this as a non-escape reference
        return _payload[n];
    }

    void opAddRef() {
        ++_refs;
    }
    void opRelease() {
        if (_refs > 1) {
            --_refs;
        } else {
            // Last reference: free the owned payload first, then the object.
            GC.free(_payload.ptr);
            _payload = null;
            this.destroy();
            GC.free(cast(void*) this);
        }
    }
    ...
}

Relinquishing an owned resource

Consider that Widget in the example above wants to give away its _payload to user code. It can do so with a method that effects a destructive read:

class Widget {
    ...
    int[] releasePayload() {
        auto result = _payload;
        _payload = null;
        return result;
    }
}

The method is correctly not annotated with return because the slice it returns is not scoped by this. Note that if the implementer of Widget forgets the assignment _payload = null, user code may end up with a dangling reference.
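The destructive read can be exercised in isolation. The sketch below omits the reference counting methods for brevity; the constructor and payloadLength are hypothetical additions used only to make the effect visible.

```d
class Widget {
    private int[] _payload;             // owned until released

    this(size_t n) { _payload = new int[n]; }

    // Hypothetical accessor, present only so the example can inspect the state.
    size_t payloadLength() const { return _payload.length; }

    int[] releasePayload() {
        auto result = _payload;
        _payload = null;                // the widget no longer owns the data
        return result;
    }
}
```

After the first call the caller owns the slice and the widget holds nothing, so a second call returns null rather than a second reference to the same data.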

Defining a type that can be used both with RC and GC

The simplest way to define a type that works with both RC and GC (subject to e.g. a configuration option) is to always define opAddRef and opRelease and rig them to be no-ops in the GC case. There are instances in which this approach is not desirable.

Another possibility is to make RC vs. GC a policy choice instructing the class being defined:

enum MMPolicy { GC, RC }

class Widget(MMPolicy pol) {
    static if (pol == MMPolicy.RC) {
        void opAddRef() { ... }
        void opRelease() { ... }
    }
    ...
}

Such a class may enjoy the full benefits of each policy, selectable by appropriate use of static if.

Unittests should make sure that the class works as expected with both approaches.
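A filled-in version of the policy class, with a unittest covering both instantiations, might look as follows. The method bodies, the refs accessor, and the value field are hypothetical, and the counting is deliberately simplistic (no deallocation).

```d
enum MMPolicy { GC, RC }

class Widget(MMPolicy pol) {
    static if (pol == MMPolicy.RC) {
        private uint _refs = 1;
        void opAddRef()  { ++_refs; }
        void opRelease() { if (_refs > 0) --_refs; }  // a real RCO would free at zero
        uint refs() const { return _refs; }
    }
    int value;                          // state shared by both policies
}

unittest {
    // The RC instantiation exposes the RCO primitives ...
    static assert( __traits(hasMember, Widget!(MMPolicy.RC), "opAddRef"));
    // ... and the GC instantiation does not.
    static assert(!__traits(hasMember, Widget!(MMPolicy.GC), "opAddRef"));

    auto rc = new Widget!(MMPolicy.RC);
    rc.opAddRef();
    assert(rc.refs == 2);
    rc.opRelease();
    assert(rc.refs == 1);

    auto gc = new Widget!(MMPolicy.GC);
    gc.value = 42;
    assert(gc.value == 42);
}
```

The unittest block runs only when the module is compiled with unit tests enabled, so it adds no cost to regular builds.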

Qualified Types

TODO

Aftermath

This DIP allows defining reference counted class objects that are usable in @safe code. However, it does not enforce safety.

Explicitly freeing memory associated with an object remains the responsibility of the user. If the user decides to annotate calls such as free or GC.free etc. as @trusted, it is the user’s responsibility to make sure the class was designed to insert return annotations for all references to owned objects made accessible by the class.

In short, this DIP makes it possible to write @safe objects with deterministic memory deallocation, but does not enforce it.

This document has been placed in the Public Domain.