Reference Counted Class Objects
Version | 1 |
Created | 2015-02-23 |
Status | Draft |
Last modified | -- |
Author | Walter Bright and Andrei Alexandrescu |
Abstract
This DIP proposes @safe reference counted class objects (including exceptions) and interfaces for D.
Description
DIP25 allows defining struct types that own data and expose references to it, @safely, whilst controlling lifetime of that data. This proposal allows defining class objects that are safe yet use deterministic destruction for themselves and resources they own.
The compiler detects automatically and treats specially all classes and interfaces that define the following two methods:
class Widget {
    T1 opAddRef();
    T2 opRelease();
    ...
}
T1 and T2 may be any types (usually void or an integral type). The methods may or may not be final, virtual, or inherited from a supertype. Any attributes are allowed on these methods. (If practical, nothrow and final are suggested for performance.) They must be public. UFCS-expanded calls are not acceptable. If these two methods exist, the compiler categorizes this class or interface type as a reference counted object (RCO).
Rules
General
- @safe code may not issue explicit calls to opAddRef/opRelease.
- Implicit conversion to supertypes (class or interface) is allowed ONLY if the supertype is also a reference counted type. It follows that reference counted types cannot be converted to Object (unless Object itself defines the two methods).
- Method calls to supertypes are only allowed if the supertype that defines the method is also reference counted.
- Explicit casting to or from void* does not entail a call to opAddRef.
- Typechecking methods of reference counted types is done the same as for structs. This is important because it limits what reference counted types can do. Consider:
@safe class Widget1 {
    private int data;
    ref int getData() { return data; } // fine
    ...
}
@safe class Widget2 {
    private int data;
    ref int getData1() { return data; } // ERROR
    ref int getData2() return { return data; } // fine
    ulong opAddRef();
    ulong opRelease();
    ...
}
This is because it is safe for a garbage collected object to escape references to its internal state. The same is not allowed for reference counted objects because they are expected to be deallocated in a deterministic manner (same as e.g. struct objects on the stack).
Creating references
- Whenever a new reference to an object is created (e.g. auto a = b;), the compiler inserts a call to opAddRef in the generated code. The call is evaluated only if the reference is not null. The lowering of auto a = lvalExpr; to pre-DIP74 code is conceptually as follows:
auto a = function(x) { if (x) x.opAddRef(); return x; }(lvalExpr);
- If a new reference is created from an rvalue (including a call to new or the result of a function), no call to opAddRef is inserted. As a consequence, there is no call inserted for the first reference created via a constructor (i.e. it is assumed the constructor already puts the object in the appropriate state). For example, the lowering of auto a = new Widget; does not insert a call to opAddRef.
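A minimal sketch of the two creation rules, with the compiler-inserted call written out by hand (this DIP is a draft, so nothing here is inserted automatically). The refs field, the copyRef helper, and the omission of deallocation are assumptions made purely for illustration.

```d
// Hypothetical RCO; the DIP does not prescribe how the count is stored.
class Widget
{
    uint refs = 1;                  // the constructor leaves the count at 1
    void opAddRef() { ++refs; }
    void opRelease() { --refs; }    // deallocation omitted in this sketch
}

// Hand-written stand-in for the lowering of `auto a = lvalExpr;`.
Widget copyRef(Widget x)
{
    if (x) x.opAddRef();            // skipped for null references
    return x;
}

unittest
{
    auto w = new Widget;            // created from an rvalue: no opAddRef
    assert(w.refs == 1);

    auto a = copyRef(w);            // created from an lvalue: one opAddRef
    assert(a is w && w.refs == 2);

    Widget none = null;
    assert(copyRef(none) is null);  // null passes through untouched
}
```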
Assignment to existing references
- Whenever a reference to an object is assigned (e.g. a = b), first b.opAddRef() is called and then a.opRelease() is called, followed by the reference assignment itself. Calls are only made if the respective objects are not null. So the lowering of e.g. lvalExprA = lvalExprB; to pre-DIP74 code is:
function(ref x, y) {
    if (y) y.opAddRef();
    scope(failure) if (y) y.opRelease();
    if (x) x.opRelease();
    x = y;
}(lvalExprA, lvalExprB);
The complexity of this code underscores the importance of making opAddRef and especially opRelease nothrow. In that case the scope(failure) statement may be elided.
- Assigning an lvalue from an rvalue does not insert a call to opAddRef. It does insert a call to opRelease against the previous value of the reference. So the lowering of e.g. lvalExpr = rvalExpr; to pre-DIP74 code is:
function(ref x, y) {
    if (x) x.opRelease();
    x = y;
}(lvalExpr, rvalExpr);
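The two assignment lowerings can be exercised by hand as ordinary functions. This is a sketch under assumptions: the Widget layout, the helper names assignLvalue and assignRvalue, and the use of nothrow primitives (which lets the scope(failure) step be dropped) are illustrative choices, not part of the proposal.

```d
// Hypothetical RCO with nothrow primitives; deallocation is omitted.
class Widget
{
    uint refs = 1;
    void opAddRef() nothrow { ++refs; }
    void opRelease() nothrow { --refs; }
}

// Stand-in for `lvalExprA = lvalExprB;`.
void assignLvalue(ref Widget x, Widget y) nothrow
{
    if (y) y.opAddRef();     // add first, so self-assignment never reaches zero
    if (x) x.opRelease();
    x = y;
}

// Stand-in for `lvalExpr = rvalExpr;`.
void assignRvalue(ref Widget x, Widget y) nothrow
{
    if (x) x.opRelease();    // only the previous value is released
    x = y;
}

unittest
{
    auto first = new Widget;
    auto second = new Widget;
    Widget observer = first;         // uncounted alias, only to inspect the count later

    assignLvalue(first, second);
    assert(first is second && second.refs == 2);
    assert(observer.refs == 0);      // the point where a real RCO would deallocate

    assignLvalue(first, first);      // self-assignment: +1 then -1
    assert(first.refs == 2);

    assignRvalue(first, new Widget); // old value released, new one not incremented
    assert(first.refs == 1 && second.refs == 1);
}
```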
Scope and Destructors
- Whenever a reference to an object goes out of scope, the compiler inserts an implicit call to opRelease. The call is evaluated only if the reference is not null.
- struct, class, and closure types that have RCO members accommodate calls to opRelease during their destruction.
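As a rough illustration of the second rule, the sketch below writes by hand the release that an aggregate with an RCO member would perform on destruction. The Holder struct, the refs field, and the explicit opAddRef before storing the reference are assumptions for the example; under the proposal the compiler would generate the equivalent calls.

```d
// Hypothetical RCO; deallocation is omitted.
class Widget
{
    uint refs = 1;
    void opAddRef() { ++refs; }
    void opRelease() { --refs; }
}

// Hypothetical aggregate that owns one counted reference.
struct Holder
{
    Widget w;
    ~this() { if (w) w.opRelease(); }   // release the member on destruction
}

unittest
{
    auto w = new Widget;                // count is 1
    {
        w.opAddRef();                   // the reference about to be stored in h
        auto h = Holder(w);
        assert(w.refs == 2);
    }                                   // h goes out of scope: member released
    assert(w.refs == 1);
}
```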
Passing references by value into functions
- The pass-by-value protocol for RCOs is as follows: the caller does NOT insert opAddRef for RCOs passed by value. (As one consequence, no opAddRef or opRelease calls are issued for the implicit this parameter.) This means the callee must assume it is working on references borrowed from the caller.
- If the callee never assigns to an RCO parameter (i.e. it never inserts a call to opRelease), then there is no extra code generated related to parameter passing.
- If the callee potentially assigns to an RCO parameter, it may need to insert additional calls to opAddRef/opRelease because it may borrow the same object through several parameters. Consider:
void fun(Widget x, Widget y, bool c) {
    if (c) x = null;
    y.someMethod();
}
...
auto w = new Widget;
fun(w, w, true);
In this case, fun borrows the same RCO twice, while it still has only one recorded reference (the one at birth). Therefore, unwittingly assigning to x (and inserting the appropriate x.opRelease) will result in the reference count going to zero (and the object getting potentially deallocated). Following that, the use of y will be incorrect.
- Therefore, a function is allowed to conservatively insert a pair of opAddRef/opRelease calls for each RCO parameter. The lowering of fun to pre-DIP74 code might be:
void fun(Widget x, Widget y, bool c) {
    // BEGIN INSERTED CODE
    if (x) x.opAddRef();
    scope(exit) if (x) x.opRelease();
    if (y) y.opAddRef();
    scope(exit) if (y) y.opRelease();
    // END INSERTED CODE
    if (c) x = null;
    y.someMethod();
}
...
auto w = new Widget;
fun(w, w, true);
The two references don’t have to be aliased for problematic cases to occur. A more subtle example involves borrowing two RCOs, one being a member of the other:
class Gadget {
    Gadget next;
    ...
    // RCO primitives
    void opAddRef();
    void opRelease();
}
void fun(Gadget x, Gadget y, bool c) {
    if (c) x.next = null;
    y.someMethod();
}
...
auto m = new Gadget;
m.next = new Gadget;
fun(m, m.next, true);
In the example above, the two Gadget objects created have reference count 1 upon entering fun. The conservatively generated (correct) code first raises both reference counts to 2. Upon exiting fun, the inserted opRelease calls undo those increments, so neither object is released while fun is still using it. A wrong code generation approach might free the object referenced by m.next at the assignment x.next = null, leaving y dangling.
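The Gadget scenario can be simulated with the inserted calls written by hand. Everything beyond the names Gadget, next, opAddRef, and opRelease is an assumption for illustration: the refs and freed fields, the clearNext helper standing in for the lowered x.next = null, and the two variants of fun, which return whether y was already dead at the point where y.someMethod() would run. Note that with c true, nothing refers to the inner Gadget after the call, so it is released once the function returns.

```d
// Hypothetical RCO; `freed` stands in for actual deallocation.
class Gadget
{
    Gadget next;
    uint refs = 1;
    bool freed;
    void opAddRef() { ++refs; }
    void opRelease() { if (--refs == 0) freed = true; }
}

// Hand-lowered `x.next = null;` (assignment from an rvalue).
void clearNext(Gadget x)
{
    if (x.next) x.next.opRelease();
    x.next = null;
}

// No protective calls: y may be dead by the time it is used.
bool naiveFun(Gadget x, Gadget y, bool c)
{
    if (c) clearNext(x);
    return y.freed;      // true means y.someMethod() would touch a dead object
}

// Conservative pair of calls inserted for each RCO parameter.
bool conservativeFun(Gadget x, Gadget y, bool c)
{
    if (x) x.opAddRef();
    scope(exit) if (x) x.opRelease();
    if (y) y.opAddRef();
    scope(exit) if (y) y.opRelease();

    if (c) clearNext(x);
    return y.freed;
}

unittest
{
    auto m = new Gadget;
    m.next = new Gadget;
    assert(naiveFun(m, m.next, true));           // y reached zero while still in use

    auto n = new Gadget;
    n.next = new Gadget;
    auto inner = n.next;                         // uncounted alias, for inspection only
    assert(!conservativeFun(n, inner, true));    // y stayed alive inside the call
    assert(inner.freed && n.refs == 1);          // released only after the call ended
}
```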
Functions returning references by value
- A function that returns a local RCO calls neither opAddRef nor opRelease against that value. Example:
Widget fun() {
    auto a = new Widget;
    return a; // no calls inserted
}
Note: this is not an optimization. The compiler does not have the discretion to insert additional opAddRef/opRelease calls.
- A function that returns an RCO rvalue calls neither opAddRef nor opRelease against that value. Example:
Widget fun() {
    return new Widget; // no calls inserted
}
Note: this is not an optimization. The compiler does not have the discretion to insert additional opAddRef/opRelease calls.
- Functions that return an RCO (other than the two cases above) call opAddRef against the returned reference. This includes globals, statics, and RCO parameters received either by value or by reference. Example:
Widget fun(ref Widget a, Widget b, int c) {
    if (c == 0)
    {
        static Widget w;
        if (!w) w = new Widget;
        return w; // opAddRef inserted
    }
    if (c == 1) return a; // opAddRef inserted
    return b; // opAddRef inserted
}
- As a litmus test, consider:
Widget identity(Widget x) {
    return x;
}
....
auto a = new Widget; // reference count is 1
a = a; // fine, call opAddRef then opRelease per assignment lowering
a = identity(a); // fine, identity calls opAddRef and assignment calls opRelease
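The litmus test can be checked by writing the return and assignment lowerings by hand. The refs field and the make helper are assumptions for illustration, and deallocation is left out so the counts can be inspected.

```d
// Hypothetical RCO; deallocation is omitted.
class Widget
{
    uint refs = 1;
    void opAddRef() { ++refs; }
    void opRelease() { --refs; }
}

// Lowered `Widget identity(Widget x) { return x; }`:
// returning a parameter adds a reference.
Widget identity(Widget x)
{
    if (x) x.opAddRef();
    return x;
}

// Returning a local: neither call is inserted.
Widget make()
{
    auto a = new Widget;
    return a;
}

unittest
{
    auto a = make();
    assert(a.refs == 1);

    // Lowered `a = identity(a);`: the call result is an rvalue,
    // so the assignment only releases the previous value of `a`.
    auto result = identity(a);      // count is 2
    if (a) a.opRelease();           // count is back to 1
    a = result;
    assert(a.refs == 1);
}
```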
Optimizations
- The compiler considers that opRelease is the inverse of opAddRef, and therefore is at liberty to elide pairs of calls to opAddRef/opRelease. Example:
Widget fun() {
    auto a = new Widget;
    auto b = a;
    return b;
}
Applying the rules defined above would have fun’s lowering insert one call to opAddRef (for creating b) and one call to opRelease (when a goes out of scope). However, these calls may be elided.
Idioms and How-Tos
Defining a non-copyable reference type
Using @disable this(this); is a known idiom for creating struct objects that can be created and moved but not copied. The same is achievable with RCOs by means of @disable opAddRef(); (the declaration must still be present in order for the type to qualify as RCO, and implemented if not final).
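A small sketch of the idiom, assuming a hypothetical UniqueWidget class. It only demonstrates that a disabled opAddRef remains a declared member yet cannot be called; the effect on copying depends on the lowering this DIP proposes.

```d
// Hypothetical non-copyable RCO.
class UniqueWidget
{
    // Still declared, so the type has both primitives, but any call is rejected.
    @disable final void opAddRef() {}
    final void opRelease() {}
}

unittest
{
    auto u = new UniqueWidget;
    static assert(__traits(hasMember, UniqueWidget, "opAddRef"));
    static assert(!__traits(compiles, u.opAddRef()));   // a copy would need this call
    static assert(__traits(compiles, u.opRelease()));
}
```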
Defining a reference counted object with deallocation
Classic reference counting techniques can be used with opAddRef and opRelease.
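import core.memory : GC;

class Widget {
    private uint _refs = 1; // the reference created by the constructor
    void opAddRef() {
        ++_refs;
    }
    void opRelease() {
        if (_refs > 1) {
            --_refs;
        } else {
            // Last reference: run the destructor, then return the memory.
            this.destroy();
            GC.free(cast(void*) this);
        }
    }
    ...
}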
Usually such approaches also use private constructors and object factories to ensure the same allocation method is used during creation and destruction of the object.
If the object only needs to free this (and no other owned resources), the typechecking ensured by the compiler is enough to verify safety (however, @trusted needs to be applied to the call that frees this).
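A sketch of the private-constructor-plus-factory arrangement, assuming GC allocation as in the example above. The create factory and the live counter are hypothetical names added only so the example can observe when the destructor runs; the explicit opAddRef/opRelease calls stand in for what the compiler would insert.

```d
import core.memory : GC;

class Widget
{
    static size_t live;             // hypothetical bookkeeping for this example
    private uint _refs = 1;

    private this() { ++live; }      // construction only through create()
    ~this() { --live; }

    // Factory: the single place that decides how instances are allocated.
    static Widget create() { return new Widget; }

    void opAddRef() { ++_refs; }

    void opRelease() @trusted       // frees `this`, hence @trusted
    {
        if (_refs > 1) { --_refs; return; }
        destroy(this);              // run the destructor
        GC.free(cast(void*) this);  // matches the allocation done in create()
    }
}

unittest
{
    auto w = Widget.create();
    assert(Widget.live == 1);
    w.opAddRef();                   // second reference
    w.opRelease();                  // back to one reference, still alive
    assert(Widget.live == 1);
    w.opRelease();                  // last reference: destroyed and freed
    assert(Widget.live == 0);
}
```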
Defining a type that owns resources
RCOs that own references are defined similarly to structs that own references. Attention must be paid to annotate all functions returning references to owned data with return.
class Widget {
    private uint _refs = 1;
    private int[] _payload; // owned
    ref int opIndex(size_t n) return { // mark this as a non-escape reference
        return _payload[n];
    }
    void opAddRef() {
        ++_refs;
    }
    void opRelease() {
        if (_refs > 1) {
            --_refs;
        } else {
            GC.free(_payload.ptr);
            _payload = null;
            this.destroy();
            GC.free(cast(void*) this);
        }
    }
    ...
}
Relinquishing an owned resource
Consider that Widget in the example above wants to give away its _payload to user code. It can do so with a method that effects a destructive read:
class Widget {
    ...
    int[] releasePayload() {
        auto result = _payload;
        _payload = null;
        return result;
    }
}
The method is correctly not annotated with return because the slice it returns is not scoped by this. Note that if the implementer of Widget forgets the assignment _payload = null, user code may end up with a dangling reference.
Defining a type that can be used both with RC and GC
The simplest way to define a type that works with both RC and GC (subject to e.g. a configuration option) is to always define opAddRef and opRelease and rig them to be no-ops in the GC case.
There are instances in which this approach is not desirable:
- RCOs are subject to additional limitations compared to their GC counterparts:
  - No conversion to Object or interfaces that are not reference counted
  - Cannot escape pointers and references to direct members in @safe code
- If the stubbed opAddRef and opRelease are not final, efficiency may be a concern: the compiler may be unable to detect that the functions do nothing and still insert virtual calls to them.
Another possibility is to make RC vs. GC a policy choice instructing the class being defined:
enum MMPolicy { GC, RC }
class Widget(MMPolicy pol) {
    static if (pol == MMPolicy.RC) {
        void opAddRef() { ... }
        void opRelease() { ... }
    }
    ...
}
Such a class may enjoy the full benefits of each policy, selectable by appropriate use of static if.
Unittests should make sure that the class works as expected with both approaches.
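One possible shape for such a test, as a sketch: the value field, the refs accessor, and the isRCO helper are hypothetical additions, and the RC primitives only count (no deallocation) so the example stays small.

```d
enum MMPolicy { GC, RC }

class Widget(MMPolicy pol)
{
    int value;

    static if (pol == MMPolicy.RC)
    {
        private uint _refs = 1;
        void opAddRef() { ++_refs; }
        void opRelease() { --_refs; }        // deallocation omitted in this sketch
        uint refs() const { return _refs; }  // hypothetical accessor for the test
    }
}

// Hypothetical helper: true when T declares both RCO primitives.
enum isRCO(T) = __traits(hasMember, T, "opAddRef")
             && __traits(hasMember, T, "opRelease");

unittest
{
    static assert( isRCO!(Widget!(MMPolicy.RC)));
    static assert(!isRCO!(Widget!(MMPolicy.GC)));

    // Shared behavior is identical under both policies.
    auto rc = new Widget!(MMPolicy.RC);
    auto gc = new Widget!(MMPolicy.GC);
    rc.value = gc.value = 42;
    assert(rc.value == gc.value);

    // Policy-specific behavior exists only for RC.
    rc.opAddRef();
    assert(rc.refs == 2);
}
```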
Qualified Types
TODO
Aftermath
This DIP allows defining reference counted class objects that are usable in @safe code. However, it does not enforce safety. Explicitly freeing memory associated with an object remains the responsibility of the user. If the user decides to annotate calls such as free or GC.free etc. as @trusted, it is the user’s responsibility to make sure the class was designed to insert return annotations for all references to owned objects made accessible by the class.
In short, this DIP makes it possible to write @safe objects with deterministic memory deallocation, but does not enforce it.
Copyright
This document has been placed in the Public Domain.