Smart Pointers

Author: Christoph Karl Walter Grein
18th May 2016

Access types in Ada have been designed in a way to prevent the occurrence of dangling references, i.e. they can never designate objects that have gone out of scope. There remains however one problem: When Unchecked_Deallocation is used in order to reclaim storage, we might access already freed storage or, even worse, storage occupied by new objects of different types, unless utmost care is taken (this is the very reason why the generic is called "unchecked" deallocation).

Also see Safe Pointers, an alternative implementation.

The Basic Package

In the following, a package providing a reference-counted access type Smart_Pointer will be presented that avoids this problem. References to an allocated data item are counted, and the allocation is automatically reclaimed when no more references exist, i.e. a user of this package need not care about storage deallocation, and indeed, the interface disables direct deallocation. (But beware of isolated cyclic pointer islands: items that reference each other in a cycle keep their counts above zero and are never reclaimed.)

The design follows closely the AdaCore Gem #97 and improves the implementation in a way that is the theme of AdaCore Gem #107.

Our smart pointers

  type Smart_Pointer is private;

store any data derived from

  type Client_Data is abstract tagged private;

We allocate a new data item with Set:

  procedure Set (Self: in out Smart_Pointer; Data: in Client_Data'Class);

Smart_Pointer is a controlled type which references, besides the client data, a counter holding the number of all pointers that currently reference the same data. Thus a call of Set will first clean up its reference: If there is a data item (after declaration, a smart pointer points to nothing, i.e. it is null), it decreases the counter by one, since it will no longer grant access to the data. If this was the only pointer accessing the data, the counter will now be zero and the data can safely be deallocated. Then a new item will be allocated holding the data and a counter with value 1. (See the AdaCore Gem #97 for the details.)
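The bookkeeping described above can be sketched as a small stand-alone program. This is only an illustration under stated assumptions, not the code of the package itself: the unit name Ref_Count_Sketch, the nested package Pointers and the helper Use_Count are invented for the demonstration, and the private declarations mirror the ones given later in this text.

```ada
with Ada.Finalization;
with Ada.Text_IO;
with Ada.Unchecked_Deallocation;

procedure Ref_Count_Sketch is

   package Pointers is
      type Client_Data is abstract tagged private;
      type Smart_Pointer is private;

      procedure Set (Self : in out Smart_Pointer; Data : in Client_Data'Class);

      --  Hypothetical helper, only here to make the counter observable.
      function Use_Count (Self : Smart_Pointer) return Natural;
   private
      type Client_Data is abstract tagged record
         Count : Natural := 1;
      end record;

      type Client_Data_Ptr is access Client_Data'Class;

      type Smart_Pointer is new Ada.Finalization.Controlled with record
         Pointer : Client_Data_Ptr;
      end record;

      overriding procedure Adjust   (Self : in out Smart_Pointer);
      overriding procedure Finalize (Self : in out Smart_Pointer);
   end Pointers;

   package body Pointers is

      procedure Free is
        new Ada.Unchecked_Deallocation (Client_Data'Class, Client_Data_Ptr);

      --  Called after a copy: one more pointer shares the data.
      procedure Adjust (Self : in out Smart_Pointer) is
      begin
         if Self.Pointer /= null then
            Self.Pointer.Count := Self.Pointer.Count + 1;
         end if;
      end Adjust;

      --  Gives up this pointer's reference; the last one frees the data.
      procedure Finalize (Self : in out Smart_Pointer) is
         Data : Client_Data_Ptr := Self.Pointer;
      begin
         Self.Pointer := null;  --  Finalize may be called more than once
         if Data /= null then
            Data.Count := Data.Count - 1;
            if Data.Count = 0 then
               Free (Data);
            end if;
         end if;
      end Finalize;

      procedure Set (Self : in out Smart_Pointer; Data : in Client_Data'Class) is
      begin
         Finalize (Self);                                --  drop the old reference
         Self.Pointer       := new Client_Data'Class'(Data);
         Self.Pointer.Count := 1;                        --  fresh item, one owner
      end Set;

      function Use_Count (Self : Smart_Pointer) return Natural is
      begin
         if Self.Pointer = null then
            return 0;
         end if;
         return Self.Pointer.Count;
      end Use_Count;

   end Pointers;

   use Pointers;

   type My_Data is new Client_Data with record
      I : Integer;
   end record;

   P, Q : Smart_Pointer;

begin
   Set (P, My_Data'(Client_Data with I => -10));
   Q := P;                                        --  Adjust: two references
   Ada.Text_IO.Put_Line (Natural'Image (Use_Count (P)));
   Set (P, My_Data'(Client_Data with I => 83));   --  Q keeps the old item alive
   Ada.Text_IO.Put_Line (Natural'Image (Use_Count (Q)));
end Ref_Count_Sketch;
```

With this sketch the first line printed should show a count of 2 (P and Q share the item) and the second a count of 1 (only Q is left on the old item). Both pointers are finalized when the procedure is left, which frees the remaining items.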

We could read and write the data with a getter:

  function Get (Self: Smart_Pointer) return access Client_Data'Class;

Let's give an example:

  type My_Data is new Client_Data with record
    I: Integer;
  end record;

  Set (P, My_Data'(Client_Data with I => -10));

  My_Data (Get (P).all).I := 42;

Get (P) returns an access to Client_Data'Class; thus, in order to access the actual data, here the integer component I, the access value has to be dereferenced and a view conversion to My_Data applied. The conversion incurs a tag check, which succeeds in this example. In general, you have to know the type with which to view-convert in order to access the relevant components.

This is all as in the Gem #97.

However, this naive implementation of Get is very dangerous, since the caller might hold onto the returned pointer forever, long after the package has freed the data. More generally, the package can't know how long the caller will keep using the returned pointer, so it can't know whether it is safe to free the associated data. Continuing the example above:


  declare
    Obj: access Client_Data'Class := Get (P);
  begin
    My_Data (Obj.all).I := 2012;                  -- No problem.
    Set (P, My_Data'(Client_Data with I => 83));  -- Frees the original object,
                                                  -- because the reference count
                                                  -- is zero after this operation.
    My_Data (Obj.all).I := 95;                    -- Oops - writing freed memory!
  end;

(A minor danger is that the caller might be tempted to free the data via the pointer, which is none of his business and should completely remain under control of the package, and thus create dangling pointers. But people aren't likely to free stuff they didn't create.)

We thus direly need a safe access, which we will construct now. (See the AdaCore Gem #107.)

So instead of returning a direct access to the data, we define an accessor, a limited type with such an access as a discriminant, and let Get return such an object:

  type Accessor (Data: access Client_Data'Class) is limited private;
  function  Get (Self: Smart_Pointer) return Accessor;

Making the type limited prevents copying, and access discriminants are unchangeable. The discriminant also cannot be copied to a variable of a named access type. The result is that the discriminant can be used only for reading and writing the object (and also not for deallocation). Thus we have achieved our goal of making accesses safe.

To show how to implement Get, we give the private definition of our smart pointers:

  type Client_Data is abstract tagged record
    Count: Natural := 1;  -- the reference count
  end record;

  type Accessor (Data: access Client_Data'Class) is limited null record;

  type Client_Data_Ptr is access Client_Data'Class;

  type Smart_Pointer is new Ada.Finalization.Controlled with record
    Pointer: Client_Data_Ptr;
  end record;

The implementation of the function Get is quite straightforward:

  function Get (Self: Smart_Pointer) return Accessor is
  begin
    return Accessor'(Data => Self.Pointer);
  end Get;

Alas, we are not yet completely safe. To see this, we have to consider in detail the lifetime of the Accessor objects. Let's return to the example above, now with the accessor:

  My_Data (Get (P).Data.all).I := 42;

Here, the lifetime of Get (P) ends with the statement and the accessor is finalized, i.e. it ceases to exist (in Ada vernacular, the master of the object is the statement). So tasking issues aside, nothing can happen to the accessed object (the integer in our example) while the accessor exists.

Now consider a variant of the above. Imagine we have a pointer P whose reference count is 1, and let's extend the accessor's lifetime, like we did for the pointer before:


  declare
    A: Accessor renames Get (P);
  begin
    Set (P, ...);  -- allocate a new object
    My_Data (A.Data.all).I := 42;  -- ?
  end;  -- A's lifetime ends here

In this example, the master of the accessor is the block (and there are other ways to make the lifetime as long as one wishes). Now in the block, the pointer P is given a new object to access. Since we said that P was the only pointer to the old object, the old object is deallocated just as in the pointer example before, again with the same disastrous effect: A.Data is now a dangling pointer granting access to a nonexistent object until the end of the declare block.

To cure the situation, we have to prevent the deallocation. That suggests increasing the reference count with the construction of an accessor and decreasing the count when the accessor is finalized again. The easiest way to accomplish this is to piggyback upon the properties of the smart pointer type:

  type Accessor (Data: access Client_Data'Class) is limited record
    Hold: Smart_Pointer;
  end record;

  function Get (Self: Smart_Pointer) return Accessor is
  begin
    return Accessor'(Data => Self.Pointer, Hold => Self);
  end Get;
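To see the counted accessor at work, the declarations above can be assembled into one program that replays the problematic block. This is a sketch under assumptions: the unit name Accessor_Sketch and the nested package Pointers are invented here, and the bodies of Adjust, Finalize and Set are one plausible way to do the counting, not necessarily the way the downloadable package does it.

```ada
with Ada.Finalization;
with Ada.Text_IO;
with Ada.Unchecked_Deallocation;

procedure Accessor_Sketch is

   package Pointers is
      type Client_Data is abstract tagged private;
      type Accessor (Data : access Client_Data'Class) is limited private;
      type Smart_Pointer is private;

      procedure Set (Self : in out Smart_Pointer; Data : in Client_Data'Class);
      function  Get (Self : Smart_Pointer) return Accessor;
   private
      type Client_Data is abstract tagged record
         Count : Natural := 1;
      end record;

      type Client_Data_Ptr is access Client_Data'Class;

      type Smart_Pointer is new Ada.Finalization.Controlled with record
         Pointer : Client_Data_Ptr;
      end record;

      overriding procedure Adjust   (Self : in out Smart_Pointer);
      overriding procedure Finalize (Self : in out Smart_Pointer);

      type Accessor (Data : access Client_Data'Class) is limited record
         Hold : Smart_Pointer;  --  keeps the count above zero
      end record;
   end Pointers;

   package body Pointers is

      procedure Free is
        new Ada.Unchecked_Deallocation (Client_Data'Class, Client_Data_Ptr);

      procedure Adjust (Self : in out Smart_Pointer) is
      begin
         if Self.Pointer /= null then
            Self.Pointer.Count := Self.Pointer.Count + 1;
         end if;
      end Adjust;

      procedure Finalize (Self : in out Smart_Pointer) is
         Data : Client_Data_Ptr := Self.Pointer;
      begin
         Self.Pointer := null;  --  Finalize may be called more than once
         if Data /= null then
            Data.Count := Data.Count - 1;
            if Data.Count = 0 then
               Free (Data);
            end if;
         end if;
      end Finalize;

      procedure Set (Self : in out Smart_Pointer; Data : in Client_Data'Class) is
      begin
         Finalize (Self);
         Self.Pointer       := new Client_Data'Class'(Data);
         Self.Pointer.Count := 1;
      end Set;

      function Get (Self : Smart_Pointer) return Accessor is
      begin
         --  Copying Self into Hold calls Adjust and so increments the count.
         return Accessor'(Data => Self.Pointer, Hold => Self);
      end Get;

   end Pointers;

   use Pointers;

   type My_Data is new Client_Data with record
      I : Integer;
   end record;

   P : Smart_Pointer;

begin
   Set (P, My_Data'(Client_Data with I => -10));
   declare
      A : Accessor renames Get (P);                 --  count is now 2
   begin
      Set (P, My_Data'(Client_Data with I => 83));  --  old item survives
      My_Data (A.Data.all).I := 42;                 --  safe: A still holds it
      Ada.Text_IO.Put_Line (Integer'Image (My_Data (A.Data.all).I));
   end;  --  A is finalized here; only now is the old item freed
   Ada.Text_IO.Put_Line (Integer'Image (My_Data (Get (P).Data.all).I));
end Accessor_Sketch;
```

The write through A.Data now goes to the old item, which stays allocated until the block is left, while P already designates the new item holding 83.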

Note by the way that when the discriminant is declared access constant, the accessor object can be used only for reading the data.
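The effect of an access constant discriminant can be tried in isolation. The type Reader below is a hypothetical stand-in for such a read-only accessor; it designates a plain Integer instead of client data to keep the sketch short.

```ada
with Ada.Text_IO;

procedure Read_Only_Sketch is

   --  Hypothetical stand-in for an accessor with a constant view.
   type Reader (Data : access constant Integer) is limited null record;

   Value : aliased Integer := 42;
   R     : Reader (Value'Access);

begin
   Ada.Text_IO.Put_Line (Integer'Image (R.Data.all));  --  reading is fine
   --  R.Data.all := 0;  --  rejected at compile time: the view is constant
end Read_Only_Sketch;
```

Removing the comment marks from the assignment should make the compiler reject the program, since R.Data.all is a constant view of Value.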

Dispatching

Above, we saw that a type conversion is necessary for accessing the data. This is impractical since you have to know which type the smart pointer references, and this information might not be available (e.g. when retrieving pointers from a polymorphic list). Hence a better way is access via dispatching.

Here is the visible package specification for ease of discussion:

package Smart_Pointers is

  type Client_Data is abstract tagged private;

  type Accessor (Data: access Client_Data'Class) is limited private;

  type Smart_Pointer is private;

  procedure Set (Self: in out Smart_Pointer; Data: in Client_Data'Class);
  function  Get (Self: Smart_Pointer) return Accessor;

private

  ...  -- not shown

end Smart_Pointers;

A user's package might look like so:

with Smart_Pointers;
use  Smart_Pointers;

package My_Pointers is

  type Int_Data is new Client_Data with record
    I: Integer;
  end record;

  not overriding procedure Work (X: in out Int_Data);

  type Flt_Data is new Client_Data with record
    F: Float;
  end record;

  not overriding procedure Work (X: in out Flt_Data);

end My_Pointers;

The idea is to use a dispatching call to Work instead of a type conversion:

  Work (Int_Data (Get (P).Data.all));  -- will raise Constraint_Error if Tag_Check fails
  Work (Get (P).Data.all);             -- meant to dispatch

The first call is statically bound with the type conversion, just as in the original example. However, the second call is illegal because of a type mismatch: Get (P).Data.all is of type Client_Data'Class, and Client_Data has no primitive operation Work. For dispatching to work properly, we have to derive our data types from a common ancestor which has a primitive operation Work:

  type My_Root is abstract new Client_Data with null record;
  not overriding procedure Work (X: in out My_Root) is abstract;

  type Int_Data is new My_Root with record ...;  -- and similarly for Flt_Data
  overriding procedure Work (X: in out Int_Data);

Now the dispatching call looks like this:

  Work (My_Root'Class (Get (P).Data.all));

and, depending on the tag, the correct version of procedure Work will be called. (Note that we still have to do a type conversion, but in this case, it is a view conversion to the parent's class My_Root'Class, which will always pass as long as all further types are derived from My_Root.)
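The dispatching scheme can be exercised without the smart pointer machinery. In the following sketch, Client_Data is a local stand-in for the type of the same name in Smart_Pointers, and the access type Any_Data with its array Items is an assumption that merely plays the role of Get (P).Data for a polymorphic collection.

```ada
with Ada.Text_IO;

procedure Dispatch_Sketch is

   package Data is
      --  Stand-in for Smart_Pointers.Client_Data (assumption).
      type Client_Data is abstract tagged null record;

      type My_Root is abstract new Client_Data with null record;
      procedure Work (X : in out My_Root) is abstract;

      type Int_Data is new My_Root with record
         I : Integer;
      end record;
      overriding procedure Work (X : in out Int_Data);

      type Flt_Data is new My_Root with record
         F : Float;
      end record;
      overriding procedure Work (X : in out Flt_Data);
   end Data;

   package body Data is
      procedure Work (X : in out Int_Data) is
      begin
         X.I := X.I + 1;
         Ada.Text_IO.Put_Line ("Int_Data:" & Integer'Image (X.I));
      end Work;

      procedure Work (X : in out Flt_Data) is
      begin
         X.F := X.F * 2.0;
         Ada.Text_IO.Put_Line ("Flt_Data:" & Float'Image (X.F));
      end Work;
   end Data;

   use Data;

   --  Plain access values stand in for Get (P).Data (assumption).
   --  The allocated items are never freed; this sketch ignores that.
   type Any_Data is access all Client_Data'Class;

   Items : constant array (1 .. 2) of Any_Data :=
     (new Int_Data'(I => 1), new Flt_Data'(F => 1.5));

begin
   for K in Items'Range loop
      --  The view conversion succeeds for every descendant of My_Root;
      --  the tag of the object then selects the body of Work.
      Work (My_Root'Class (Items (K).all));
   end loop;
end Dispatch_Sketch;
```

The loop never names Int_Data or Flt_Data, yet each item is handled by the version of Work belonging to its own type.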

As a concluding question: Why don't we equip Client_Data with an abstract procedure Work? Well, who knows how the Smart_Pointers package will be used - and such a procedure will have to be overridden for all derived types, which will be irritating if there is no use for it. Of course, we could make it a null procedure, but the irritating fact remains that it might not be useful for the application at hand.

Genericity

For simplicity of discussion, the package Smart_Pointers has been presented as a library package. This has the effect that derivations from Client_Data also have to be made at library level (as shown in the examples above).

Since this is impractical, the package is in reality generic: Generic_Smart_Pointers. It may be instantiated at any level; any type derivations have to be made at the same level. See the test programs for examples.

package Smart_Pointers is new Generic_Smart_Pointers;

Child Packages

As you have seen, you have to derive from Client_Data, and to get at the data again you either have to know its specific type for a view conversion or use dispatching. Since this is awkward, two child packages with identical interfaces have been constructed on top of the above. They hide all these details by being generic with respect to the data type they can access, one for definite and one for indefinite types:


generic

  type T (<>) is private;

package Generic_Smart_Pointers.Generic_InDefinite_Pointers is

  type Accessor (Data: access T) is limited private;

  type Smart_Pointer is private;

  procedure Set (Self: in out Smart_Pointer; Data: in T);
  function  Get (Self: Smart_Pointer) return Accessor;

  ...  -- Rest not shown

end Generic_Smart_Pointers.Generic_InDefinite_Pointers;

Ada 2012

Ada 2012 simplifies dereferencing the Accessor with the new aspect Implicit_Dereference:


  type Accessor (Data: access T) is limited private with
    Implicit_Dereference => Data;

Accessor is called a reference type (see RM 2012 4.1.5; see also AdaCore Gem #123). This aspect allows you to write:


  declare
    A: Accessor := Get (P);  -- a reference object
    X: T        := ...;
  begin
    A           := X;        -- Assignment through a reference is equivalent to
    A.Data.all  := X;
    X           := Get (P);  -- Implicit dereference is equivalent to
    X           := Get (P).Data.all;
  end;

Note that now the call of Get (P) is overloaded: it can denote the reference object or the data it references, and the compiler will select the correct interpretation depending on context. Unfortunately this does not work for the basic package because of the needed type conversion in the dereference, which does not provide enough context for overload resolution.

Concurrency

The code presented so far works fine in sequential environments or as long as each data item is handled by only one task. In order to make it task-safe, the manipulation of the reference counter has to be protected. This has not been done!
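One conceivable way to protect the counter is a protected object. The following sketch is not part of the package and only shows the idea in isolation: the protected type Ref_Counter, the task type Worker and all other names are hypothetical.

```ada
with Ada.Text_IO;

procedure Protected_Count_Sketch is

   --  Hypothetical task-safe replacement for the plain Natural counter.
   protected type Ref_Counter is
      procedure Increment;
      procedure Decrement (Is_Zero : out Boolean);
      function Value return Natural;
   private
      Count : Natural := 1;
   end Ref_Counter;

   protected body Ref_Counter is
      procedure Increment is
      begin
         Count := Count + 1;
      end Increment;

      --  Decrement and test in one protected action, so that exactly
      --  one caller can ever be told that the count has reached zero.
      procedure Decrement (Is_Zero : out Boolean) is
      begin
         Count   := Count - 1;
         Is_Zero := Count = 0;
      end Decrement;

      function Value return Natural is
      begin
         return Count;
      end Value;
   end Ref_Counter;

   Counter : Ref_Counter;

   task type Worker;

   task body Worker is
      Last_One : Boolean;
   begin
      for K in 1 .. 1_000 loop
         Counter.Increment;             --  like copying a pointer
         Counter.Decrement (Last_One);  --  like finalizing the copy
      end loop;
   end Worker;

begin
   declare
      Workers : array (1 .. 4) of Worker;
   begin
      null;  --  the block is left only after all workers have terminated
   end;
   Ada.Text_IO.Put_Line (Natural'Image (Counter.Value));
end Protected_Count_Sketch;
```

Since every increment is paired with a decrement, the counter should be back at its initial value of 1 once all workers are done. Returning the zero test from Decrement matters in a real implementation, because reading the count in a separate call would reopen the race between two tasks releasing the last references.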


Download together with test programs in zip format.

You will get the Ada 2012 version. To compile it with an Ada 2005 compiler, just remove the Implicit_Dereference aspect shown above.


History
18.05.2016 Made Smart_Pointers generic: Generic_Smart_Pointers. Type derivation from Client_Data no longer needs to be at library level. See test programs.
10.10.2012 Improved text. Fixed link to Gem 123
30.06.2012 Added Deep_Copy to My_Smart_Indefinite_String_Pointers (was forgotten in previous release)
29.06.2012 Bug fix in Generic_Indefinite_Pointers
28.06.2012 GNAT GPL 2012 is out with full implementation of Ada 2012
16.09.2011 Example for dispatching added
09.07.2011 Free for client type added; preview to Ada 2012
31.05.2011 First release
