Report a bug
If you spot a problem with this page, click here to create a Bugzilla issue.
Improve this page
Quickly fork, edit online, and submit a pull request for this page. Requires a signed-in GitHub account. This works well for small changes. If you'd like to make larger changes you may want to consider using a local clone.

Safe references without runtime checks

Version 1
Created 2013-05-06
StatusDraft
Last modified 2013-05-10
Author Timothee Cour

Abstract

Dconf13 introduced safe references enabled by a runtime check (see email thread from Walter: (‘Rvalue references - The resolution’). We propose a formulation that is safe (guaranteed safety at compile time), efficient (doesn’t require any runtime bounds checks) and simple.

We introduce 2 types of references for ref input arguments of ref return functions: inref and outref (the exact keywords can be discussed later) to distinguish whether a given input argument can be escaped or not (possibly via field accesses): ref fun(int inref a, int outref b, int outref c, int d); indicates that b and c can be escaped by ref return (there could indeed be multiple return statements), but not a.

We argue that these annotations (inref and outref) are sufficient for guaranteeing ref safety, simply by typechecking a program under a set of allowable conversions.

We propose two schemes:

If the function is a method or internal function, the function itself is marked as inref or outref as reference to the implicit ‘this’ parameter.

The annotations are part of the type system and written in the automatically generated di interface files.

Examples


struct U{T x;}
ref T foo(ref T a, ref T b, ref U c, int d){
  static T e;
  if(...) return a;
  else if(...) return c.x;
  else return e;
}

shall have the new signature:

ref T foo(outref T a, inref T b, outref U c, int d);

indicating that it may return by ref a and c only (dependency on c is via field access).

Second example: when the function is a member (say of a struct), the ‘this’ parameter is implicit, and the same rules apply:

struct S { T t; ref T fooc(ref T a) { if(...) return t; else return a;} }

shall have the new signature:

struct S { T t; ref T fooc(outref T a) outref; }

indicating that it may return by ref a and the hidden ‘this’ parameter. The annotation for ‘this’ is at the method level, same as where const would be.

The di file will also have those inref/outref annotations.

Safe ref validation at compile time

Given those inref/outref annotations, it is easy to validate/invalidate ref safety; we simply check whether the program typechecks under the following conversion rules:

Allowed type conversions:

Examples taken from Walter’s above mentioned email. Each one yields an error, and an explanation is given.


//Case A:
    ref T fooa(ref T t) { return t; }
    //=> ref T fooa(outref T t);
    ref T bar() { T t; return fooa(t); } // T t: local; fooa(t): local because ref fooa takes t as outref and t is a local expression. return fooa(t) therefore returns a local, which is an error.

//Case B:
    ref T foob(ref U u) { return u.t; } 
//=>ref T foob(outref U u) { return u.t; } 
    ref U bar() { T t; return foob(t); } // same as above, using rule 'outref 'dot' field => outref'.

//Case C:
    struct S { T t; ref T fooc() { return t; } }
//=>struct S { T t; ref T fooc() outref; } //outref refers to hidden this parameter
    ref T bar() { S s; return s.fooc(); } // same as above

//Case D:
    ref T food() {
        T t;
        ref T bar() { return t; }
//=>ref T bar() outref; //outref refers to hidden this parameter, this could be rewritten as: ref T bar(outref void*this) ; 
        return bar(); // same error as above (since 'this' refers to local stack)
    }

//case E:
    Transitively calling other functions:
    ref T fooe(T t) { return fooa(t); } //same error because t is a local.

scheme A: the user annotates the ref return functions on his own

Just a choice between inref or outref is needed for each ref input arg.

scheme B: the compiler takes care of the annotations via a proposed procedural analysis

We sketch an algorithm to infer inref/outref attributes. Let’s take the following example for illustration:

    ref T foo1(ref T a, T b, ref T c) { if(...) return foo2(a); else return foo2(c); }
    ref T foo2(ref T a) { return a; }

The propagation algorithm goes as follows

  * nodes are ref-return functions    * edges are ref-return dependencies (one edge per return statement in a ref return function): with example above, there is a graph with 2 nodes (foo1 and foo2) and a single edge (foo1 -> foo2).

  * for each node with a 'ref' annotation      * recompute annotations and remove uncertainty if all all outgoing edges have no uncertainty (ie ref instead of inref or outref)

For the above example we have: iteration 0(initialization): foo1(ref T a, T b, ref T c); foo2(ref T a);

iteration 1: foo1(ref T a, T b, ref T c); foo2(outref T a);

iteration 2: foo1(outref T a, T b, outref T c); foo2(outref T a);

Loops in the graph (case A2) correspond to the case of mutually recursive ref return functions. For example:

    ref T foo1(ref T a, T b, ref T c) { if(...) return foo2(a,0,c); else return a; }
    ref T foo2(ref T a, T b, ref T c) { if(...) return foo1(a,1,c); else return c; }

Rvalue references

See DIP39 for how to handle those safely in conjunction with this DIP38.

This document has been placed in the Public Domain.