Add Finalization

Christoph Karl Walter Grein

Never solve puzzles that open the gates of Hell...

With its tagged types and appropriate dynamic binding facilities, Ada95 has become a mighty language. Virtually ad libidum, you can add new components to given types of a class. This seems to imply some kind of arbitrariness in the choice of the starting point of a type hierarchy (the root of a class). However, this arbitrariness is only a seeming one and you rather have to exercise utmost care in choosing the root lest you might run into troubles at the end. This fact is high-lighted by the following problem about controlled types I found some time ago in the Ada internet forum Chat@Gnat.com, which I thought might be of common interest.

Concisely:

1. Introduction

The problem was initially posted by Heath White (the link <white@titian.jeol.com> is meanwhile broken) who said: "One of the main headaches this causes occurs when you already have a tagged hierarchy of types. You want to derive off from it, but you want the new type to have controlled behavior. There is no good way to do this."

Robert I. Eachus objected: "Of course there is, but in part it is non-intuitive ... If you are adding a component that needs to be controlled, the controlledness is a property of the component, and the language supports that correctly. It has to, since you might not be aware that the component you are adding to a non-controlled class is controlled.

The more 'troubling' case is when the addition of a component creates a need to finalize the containing structure. Access discriminants allow the component to see the containing structure.

I keep meaning to create an abstraction which deals with the most common sort of such extensions separate from other 'building block' tagged components. It is less than twenty lines long but choosing some of those lines correctly takes thought. Hmmm ...

This version has the side effect of making the extended type limited. The body for a non-limited version is much more complex because you have to use a record with an access value as a non-discriminant, then track assignment and copying using Initialize and Adjust. (Left as an exercise for the reader.)

There is however one 'can't be avoided' problem. The order in which the finalization of two extensions will be called is determined in part by the subclassing order (see RM 7.6.1(9)) and in part by whether or not there is an access discriminant. If you have too many components trying to finalize things, the ordering can get to be a problem."

2. Solutions

2.1 Limited types

You will find Eachus's solution for the class of limited types here. The only change I made was to make it a child of a root package

so that other siblings can then deal with different type classes.

The code looks quite innocent and you would not suspect that the solution for the unlimited case is that "hellish" as the problem's originator Heath White called it. Let me add that a rule in RM 3.9.1(3) enforces that the actual type for the formal limited Uncontrolled be limited too [AARM 3.9.1(3.a) explains why], so you would probably like to have an unlimited solution.

The astute reader will have noticed that something is missing in Eachus's package: the operation Initialize. So let us add the generic parameter

Before going on, the reader should first try the implementation - it is not so trivial.1

2.2 Unlimited types, sequential environment

My attempts to find a solution led to a lengthy discussion in the net. If you think about it, you'll soon find the first of the two big problems: How to get the access value of the object just finalised before into Adjust? See the remarks in the AARM 7.6 (17a..e): Adjust just knows nothing about the target. Also the Ada Rationale (4.6.3) points out the problem by stating that the very reason for access discriminants being provided for limited types only is avoiding copying problems with self-referring objects.

The only solution seems to be a global store where Finalize intermediately saves the access value so that afterwards Adjust can retrieve it. This is somewhat ugly because it defeats the idea to keep everything hidden inside the object - and what is more: In a multi-tasking environment, this cannot safely work!

We might therefore try something like the code shown here. Apart from the ugly store2, the code is quite similar to the previous one and straight-forward. But there is a snag, and that is the second big problem: What kind of access attribute can we take?

The accessibility rules forbid that an object has a deeper accessibility level than the [anonymous] access type denoted by the simple 'Access attribute. This prevents so-called dangling pointers, i.e. it prevents a pointer from outliving the object it references.

'Unchecked_Access circumvents this accessibility check. Hidden within the object, this is absolutely safe because the pointer can never outlive the object it points to. An early Ada95 compiler promptly let me fall into this trap by compiling the package and executing the corresponding test program. A later version of another compiler only rejected the package quite rightly by pointing out that the prefix of 'Unchecked_Access must be an aliased view (RM 13.10 (3), definition in 3.10 (9)).

Now we are back at the beginning. I was unable to find a solution within the Ada95 standard. Perhaps a reader can find one - improvements are welcome.

As a last resort, we might use compiler dependent features (if any). Gnat provides the attribute 'Unrestricted_Access, which worked fine in my case. Eachus expressed his hopes by stating: "If we can come up with an implementation independent package spec, then we can expect the various vendors to publish and maintain the package bodies. Yes, we might have to write the original versions, but the vendors are usually willing to maintain (i.e. not break) working code from real customers."

2.3 Unlimited types, tasking environment

As already stated above, this package can only be used if the type hierarchy constructed from it is constrained to one task. In a multi-tasking environment, we would have to make Finalize and Adjust an unseparable pair of twins to prevent another task from switching in and overwriting the contents of the global store. But just as Adjust is unaware of the target of an assignment, Finalize is unaware of whether there is such a call of Adjust following or not (i.e. it does not know whether finalisation is required within an assignment operation for the object on the left hand side or whether an object is leaving its scope). So enclosing the global store by a protected object would not help. It looks like we need a separate store for each task in the partition. Each task would then use its own store in a purely sequential order, which is safe again.

Now we have reduced the problem to finding a way to identify the tasks and using their identity as a key to the store. The problem is however aggravated by the fact that, in the general case, we have tasks that dynamically enter and leave the partition. In my feeling, this turns out to be a rather unwieldy problem, for which I have not found a solution - except if you are so lucky as to have a compiler complying with RM annex C (Gnat complies to all annexes). [Note that this annex does not belong to the core language.]

Annex C supplies us just with what we want: The library package Ada.Task_Identification [RM C.7.1] lets us obtain the identity of any task. We could construct a list of task IDs, where each task upon birth would have to register and upon death to unregister. Finalize would then have to search the list for the ID of the current task and store the access value, whereupon Adjust would again have to search the list to retrieve this value. This is getting much more complicated than we anticipated. However this proceeding can hardly be looked at as a solution of the problem because it affords an intrusion into the existing program's tasking system that is at least as big as reconstructing the type hierarchy with controlledness right from the beginning - and it is our very aim to avoid any such intrusion.

Fortunately all this turns out unnecessary. Again RM C.7.2 provides just what we are looking for: Package Task_Attributes enables us to attach to each task in the partition an attribute holding the access value to the object just finalised from where Adjust can get it. [With the implementation permission of RM C.7.2 (28), tasks that do not use the type hierarchy need not even store the attribute.]

See here the final code.

3. Derivation

Let us check the algorithm with a simply type hierarchy. When objects of type Controlled leave their scope, the operation Finalize is called:

So far, everything works as expected.

Alas, we have to realize that the code does not work for objects of type Final. For them when getting finalised upon leaving scope, rather than the expected new Initialize (Final'(F)), the old Initialize (Derived (F)) is called. (The astute reader will however have noticed that this very statement is already true for objects of type Controlled.)

The reason for this is the darned but unavoidable type conversion in the body of the function Add_Finalization.To_Limited_Uncontrolled.Finalize.

The solution for this last problem is relatively simple, if you realize that type conversions for tagged types are only view conversions, the tags themselves never change. So if we replace the generic parameter procedure

with the following:

we can redispatch the call in the body of the actual procedure:

Finally, we have reached our goal:

Note that Finalize is called for neither P (of course not) nor D. (Objects of the abstract type Root cannot be declared.)

4. Conclusion

The problem discussed here arises of course only for very big programs, where it is not so easy to redefine a type (program parts developed by different teams in different countries might depend on them) - and also a recompilation of several hundred thousand lines of code is not so easily feasible.

As we have however seen, a general solution is only possible for limited types; for unlimited types, we had to fall back on a compiler-dependend attribute (Gnat), and even with it, the solution is utmost tricky and not without problems.

But let us listen to Tucker Taft, the father of Ada95, who states: "We made no attempt to solve this problem in Ada 95, so it would not surprise me if it were impossible. My contention was that each component should, in so far as possible, clean up after 'itself.' Being able to 'add' controlled-ness later in a hierarchy did not seem to be that important."

Hence there is only one practicable way out of the dilemma (and that is also the one chosen by White): "In our code, we have just structured it so this problem doesn't arise - we've just made entire tagged hierarchies controlled, and that's that."

And even in the course of the development of very big programs, there are phases where restructuring is more easily feasible. In the words of Tucker Taft: "That doesn't seem like an unreasonable solution. In general, there are times when you do have to reorganize things to deal with certain kinds of enhancements. This may be one of them. ... In any case, I don't really recommend these tricks. I would suggest you just go back and make the root type controlled if you don't find using controlled components sufficiently flexible."


As a Post Scriptum, I would like to add that there are more hidden problems. Perhaps you have felt the need for a constructor, which is missing in the code above. You could try to add it to the visible part of package Add_Finalization.To_Uncontrolled (as a primitive operation or not - but note that primitive functions get abstract upon derivation) with the following body:

but this attempt is doomed to fail when used in a declaration as an initial value because of the distinction in the RM between assignment statements and assignment operations. In the code fragment

the initialisation of C is done by an assignment operation, which has no preceding finalization, so C cannot be self-referencing correctly, i.e. instead of C.Controller.Parent pointing to C, it will go astray.

There is yet another surprise that I experienced. I defined the hierarchy of types as shown above only to find an object of type Controlled finalized directly upon declaration:

destructing any default initial values we might have defined for the components of record Derived. This has to do with AI95-147 dealing with allowed optimizations when aggregates of controlled types are used as initial values. Now I'm not enough of an Ada language lawyer to decide whether Gnat 3.10p1 is in error when finalizing an object directly upon declaration but I was able to track the reason for this unexpected finalization down to the declaration

where we have an aggregate which might be assigned into a (transient) anomymous object which has to be finalized thereafter. Woe - Finalize for an object of type Component calls Finalize for the object referenced by the access value, which is my object X!

I do not really see a direct connection to the AI cited above because the RM already allows to optimize away the anonymous object for the aggregate. So if Gnat did optimize it away (together with the corresponding Adjust/Finalize pair), also the unexpected finalization would (or should - if my analysis is correct) disappear.


Corrections

1 This is an error in the original publication. For limited types it is of course trivial. It is difficult for unlimited types only.

2 Robert A Duff on 13 June 2002 pointed out that there's a problem, even in the sequential case. Suppose you have records X and Y containing two of these things. Then if you say:

    X := Y;
it will finalize both components of X, then adjust both components, so one component will end up pointing to the other.


The full code together with some test programs and also this paper (as a MS Word document) can be downloaded in zip format, so that anyone wanting to experiment themselves need not hack it in.
The file Test_Add_Finalization_To_Limited_Uncontrolled.Result.txt for the usable variant has been updated with results for recent compilers on 20 September 2004. If your favourite compiler does not produce this result, it is severely broken with respect to Finalization. (However I did not bother of re-running the unlimited variant since it does not and cannot work anyway.)


Ada 2005 RM 3.9.1(3/2): Contrary to Ada 95, type extension can now also be done in inner scopes. An example for this has been added.

The keyword overriding has been added to the code where appropriate. If you don't have an Ada 2005 compiler, just remove it.


Published in Ada Letters, Volume XIX, Number 4, December 1999.

Text in German.


Contents
Inhaltsverzeichnis