SIMPLE COMPONENTS
version 2.1
by Dmitry A. Kazakov
(mailbox@dmitry-kazakov.de)
This library is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this library; if not, write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
As a special exception, if other files instantiate generics from this unit, or you link this unit with other files to produce an executable, this unit does not by itself cause the resulting executable to be covered by the GNU General Public License. This exception does not however invalidate any other reasons why the executable file might be covered by the GNU Public License.
Download components_2_1.tgz (tar + gzip, Windows users may use WinZip)The current version provides implementations of smart pointers, sets, maps, stacks, tables, string editing, unbounded arrays and expression analyzers. It grew out of needs and does not pretend to be universal. The software was tested with GNAT 3.15p. See also changes log. Tables management and strings editing are described in separate documents see Tables and Strings edit.
Quick reference
Objects and handles to
Parsers
Persistent objects and handles to
Persistent storage and handles to
Pools
Sets and maps
Stacks
Strings editing
Tables (containers of strings)
Unbounded arrays
The objects and handles are designed to provide automatic garbage collection. The objects are created explicitly, but never explicitly destroyed. An application program usually should not directly access objects, using object handles (smart pointers) instead. As long as at least one handle to an object exists the object will not be destroyed. When the last handle disappears the object is automatically destroyed. The presented implementation is oriented on large and rather complex objects. Usually it has little sense to have pointers to small objects, having no identity. For such objects by-value semantics is often safer, easier to understand and more efficient. For this reason an object-oriented approach was chosen. The object type is considered a descendant of a limited controlled type which can be extended as necessary. Same handle type can be used for the whole class of descendant types. The proxy operations can be defined on handles which implementations may dispatch according to the actual type of the pointed object.
A specialization of objects is provided to support object's persistence. Such objects can be stored in an external persistent storage and then restored from there. The persistent storage interface itself is an object. This allows implementation of object serving as proxies of external objects permanently resident in an external storage.
The package Object provides the base type Entity for all objects:
type Entity is new
Ada.Finalization.Limited_Controlled with
record
Use_Count : Natural := 0;
end record;
type Entity_Ptr is access all Entity'Class;
It is a limited controlled type. The following operations are defined on it:
function Equal
( Left : Entity;
Right : Entity'Class;
Flag : Boolean := False
) return Boolean;
function Less
( Left : Entity;
Right : Entity'Class;
Flag : Boolean := False
) return Boolean;
These functions are used to compare objects. The meaning of comparison is usually defined by the nature of the objects. However the main reason why comparison is defined, is to support ordered sets of objects, so any order is suitable. Thus the implementations of Equal and Less use storage addresses to get Entity objects ordered. They should be overridden if a more meaningful order of objects exists. Note that Ada does not fully support multiple dispatch. Therefore the operations are declared asymmetric. The second parameter is class-wide. If the operation is overridden, an implementation should dispatch on the second parameter to emulate true multiple dispatch. The parameter Flag indicates whether the function is called recursively. The following code fragment illustrates how to do it:
function Less ( Left : A_New_Object_Type; Right : Object.Entity'Class; Flag : Boolean := False ) return Boolean is begin if ( Flag or else Right not in A_New_Object_Type'Class or else Right in A_New_Object_Type ) then -- Implement it here ... else -- Dispatch on the second parameter return not ( Less (Right, Left, True) or else Equal (Right, Left, True) ); end if; end Less; |
The idea is that a given overriding is responsible for implementation of Less if and only if Left :> Right, i.e. when Left is in the class of Right. The dispatching mechanism warranties that Left is in the type class, so if Right is of the same type or else does not belong to the type class, then Left :> Right. Otherwise, Right is used to re-dispatch and Flag is set to indicate that no more dispatch may happen. Observe, that if Left and Right are siblings and therefore neither of Left :> Right and Left <: Right is true, then Flag will stop the recursion. The if the implementation cast Right down to a known type, as it usually would do in other cases, then, in the case of siblings, that would cause propagation of Constraint_Error out of Less or Equal. If this behavior is undesirable, then another way to deal with comparison of siblings is to find the most specific common ancestor of both. In that case the code of Less might look as follows:
function Less ( Left : A_New_Object_Type; Right : Object.Entity'Class; Flag : Boolean := False ) return Boolean is begin if ( Right not in A_New_Object_Type'Class or else Right in A_New_Object_Type ) then -- Implement it here ... elsif Flag then -- Using Less of the most specific common ancestor, -- for example, the predefined Less: return Object.Less (Object.Entity (Left), Right, True); else -- Dispatch on the second parameter return not ( Less (Right, Left, True) or else Equal (Right, Left, True) ); end if; end Less; |
procedure Finalize (This : in out Entity);
The destructor raises Program_Error if the destroyed object is still in use. Note that a derived type shall call the destructor of the base if it overrides Finalize.
procedure Release (Ptr : in out Entity_Ptr);
The object pointed by Ptr is deleted if its use count in 1. Otherwise the use count is decremented. Ptr becomes null if the object it points to is deleted. The procedure does nothing if Ptr is already null. It can be used for implementation of the smart pointers to Entity and its descendants.
The generic child package Object.Handle defines the type Handle used to access objects of a given type:
generic
type Object_Type (<>) is abstract new Entity with private;
type Object_Type_Ptr is access Object_Type'Class;
package Object.Handle is
type Handle is new Ada.Finalization.Controlled with private;
The package has two generic parameters:
Handles can be assigned to copy a reference to the object. If a handle object is not initialized it is invalid. An invalid handle cannot be used to access objects, but it can be used in some comparisons, it can be copied and assigned. The constant Null_Handle defined in the package is a predefined invalid handle. The following operations are defined on a Handle:
procedure Finalize (Reference : in out Handle);
The destructor destroys the referenced object (if any) in case when the handle was the last one pointing the object.
procedure Invalidate (Reference : in out Handle);
This procedure detaches handle from the object (if any) it points to. The result handle cannot be used to access any object. The referenced object is destroyed if it was the last handle.
This function checks whether a handle points to an object.function Is_Valid (Reference : Handle) return Boolean;
This function is used to get a pointer to the object the handle points to. The pointer of to the object shall be used no longer the handle it was get from exists. A safe way to do it is to avoid declarations of any variables of the type Object_Type_Ptr.
This function is used to get a handle from a pointer to an object.
function "<" (Left, Right : Handle) return Boolean;
function "<="(Left, Right : Handle) return Boolean;
function ">="(Left, Right : Handle) return Boolean;
function ">" (Left, Right : Handle) return Boolean;
function "=" (Left, Right : Handle) return Boolean;
function "="
( Left : Handle;
Right : access Object_Type'Class
) return Boolean;
function "="
( Left : access Object_Type'Class;
Right : Handle
) return Boolean;
Valid handles are comparable. The result of comparison is one of the objects they point to. Implementations of the comparisons use Less and Equal defined on Object_Type. If one of arguments is invalid Contraint_Error is propagated for all functions except "=". For equality (and thus inequality) it is legal to compare with an invalid handle. The result of such comparison is true if and only if both handles are invalid. One of parameters in equality is allowed to be a pointer to an object.
The usage of objects and handles is illustrated by the following simplified example of an implementation of dynamic strings:
File test_my_string.ads:with Object; package Test_My_String is type My_String (Length : Natural) is new Object.Entity with record Value : String (1..Length); end record; type My_String_Ptr is access My_String'Class; end Test_My_String; |
An instance of My_String keeps the string body. But a user should rather use handles to My_String, provided by the child package:
File test_my_string-handle.ads:with Object.Handle; package Test_My_String.Handle is -- -- Though an instantiation of Object.Handle provides handles to -- My_String, we would like to have some additional operations on -- handles. -- package My_String_Handle is new Object.Handle (My_String, My_String_Ptr); -- -- So we immediately derive from the obtained type. Note that no -- additional components needed (with null record). -- type My_Safe_String is new My_String_Handle.Handle with null record; -- -- Now define useful operations on string handles: -- function Create (Value : String) return My_Safe_String; function Value (Reference : My_Safe_String) return String; -- -- Note that Copy takes handle as an inout-parameter. It does not touch -- the old object it just creates a new one and sets handle to point to -- it. The old object is automatically destroyed if no more referenced. -- procedure Copy ( Reference : in out My_Safe_String; New_Value : String ); procedure Copy ( Reference : in out My_Safe_String; New_Value : My_Safe_String ); private -- -- Note that Ref shall be overridden. This is a language requirement, -- which ensures that the results are covariant. We make it private -- because there is no need for a user to access it. -- function Ref (Pointer : My_String_Ptr) return My_Safe_String; end Test_My_String.Handle; |
This package defines the type My_Safe_String which can be used with less care about memory allocation and deallocation. A handle can be copied using the standard assignment. A new string object can be created from a string. The value it points to can be accessed using the function Value, etc. It is a good practice to provide Create returning a handle instead of a direct use of Ref on an existing object, because it prevents referring stack-allocated objects which could get out of scope before handles to them. Object.Finalize would notice that and raise Program_Error. An implementation of My_Safe_String might look like follows.
File test_my_string-handle.adb:package body
Test_My_String.Handle is function Create (Value : String) return My_Safe_String is Ptr : My_String_Ptr := new My_String (Value'Length); begin Ptr.Value := Value; return Ref (Ptr); end Create; function Value (Reference : My_Safe_String) return String is begin return Ptr (Reference).Value; end Value; procedure Copy ( Reference : in out My_Safe_String; New_Value : String ) is begin Reference := Create (New_Value); end Copy; procedure Copy ( Reference : in out My_Safe_String; New_Value : My_Safe_String ) is begin Reference := Create (Value (New_Value)); end Copy; function Ref (Pointer : My_String_Ptr) return My_Safe_String is begin return (My_String_Handle.Ref (Pointer) with null record); end Ref; end Test_My_String.Handle; |
The package Object.Handle.Generic_Bounded_Array defines the type Bounded_Array. An instance of Bounded_Array is a fixed size array of references to objects. It is same as an array of handles to objects but more efficient.
generic
type Index_Type is (<>);
type Handle_Type is new Handle with private;
package Object.Handle.Generic_Bounded_Array is ...
Here Index_Type is the type used to index the array elements. Handle_Type is any descendant of Handle including itself. The type Bounded_Array is defined in the package as:
type Bounded_Array (First, Last : Index_Type) is
new Ada.Finalization.Controlled with private;
The discriminants First and Last define the index range. The following operations are defined on Bounded_Array:
procedure Adjust (Container : in out Bounded_Array);
The assignment makes a copy of the array.
procedure Finalize (Container : in out Bounded_Array);
The destructor may delete some objects referenced by the array.
procedure Fill
( Container : in out Bounded_Array;
From : Index_Type;
To : Index_Type;
Element : Object_Type_Ptr
);
procedure Fill
( Container : in out Bounded_Array;
From : Index_Type;
To : Index_Type;
Element : Handle_Type
);
These procedures are used to put in / replace a range of array elements. The range From..To is filled with Element. Nothing happens if From > To. Otherwise Constraint_Error is propagated when From..To is not in Container.First..Constainer.Last.
function Get
( Container : Bounded_Array;
Index : Index_Type
) return Object_Type_Ptr;
This function returns either a pointer to an object or null.
procedure Put
( Container : in out Bounded_Array;
Index : Index_Type;
Element : Object_Type_Ptr
);
procedure Put
( Container : in out Bounded_Array;
Index : Index_Type;
Element : Handle_Type
);
These procedures are used to put in / replace an array element using its index. Constraint_Error is propagated when Index is illegal.
function RefThis function returns a valid handle to an object. Otherwise Constraint_Error is propagated.
( Container : Bounded_Array;
Index : Index_Type
) return Handle_Type;
The package Object.Handle.Generic_Unbounded_Array defines the type Unbounded_Array. An instance of Unbounded_Array is an unbounded array of references to objects. The package has same functionality as an instance of Generic_Unbounded_Array with Handle as Object_Type, but it is more efficient.
generic
type Index_Type is (<>);
type Handle_Type is new Handle with private;
Minimal_Size : Positive := 64;
Increment : Natural := 50;
package Object.Handle.Generic_Unbounded_Array is ...
Here:
The type is declared as:
type Unbounded_Array is new Ada.Finalization.Controlled with private;
The following operations are defined on Unbounded_Array:
procedure Adjust (Container : in out Unbounded_Array);The assignment does not make a copy of the array. It just increments an internal use count. The array will be copied only when a destructive operation is applied.
procedure Erase (Container : in out Unbounded_Array);
This procedure removes all elements from Container making it empty. The objects referenced only by Container will be deleted.
procedure Finalize (Container : in out Unbounded_Array);
The destructor may delete some objects referenced by the array.
function First
( Container : Unbounded_Array;
) return Index_Type;
This function returns the current lower bound of the array. Constraint_Error is propagated when the array is empty.
function Get
( Container : Unbounded_Array;
Index : Index_Type
) return Object_Type_Ptr;
This function returns either a pointer to an object or null.
function Last
( Container : Unbounded_Array;
) return Index_Type;
This function returns the current upper bound of the array. Constraint_Error is propagated when the array is empty.
procedure Put
( Container : in out Unbounded_Array;
Index : Index_Type;
Element : Object_Type_Ptr
);
procedure Put
( Container : in out Unbounded_Array;
Index : Index_Type;
Element : Handle_Type
);
These procedures are used to put in / replace an array element using its index. The array is automatically expanded as necessary. It never happens if Element is null or an invalid handle.
function RefThis function returns a valid handle to an object. Otherwise Constraint_Error is propagated.
( Container : Unbounded_Array;
Index : Index_Type
) return Handle_Type;
The package Object.Handle.Generic_Set defines the type Set. An instance of Generic_Set is a set of references to objects. The package has same functionality as an instance of Generic_Set with Handle as Object_Type, but it is more efficient.
generic
Minimal_Size : Positive := 64;
Increment : Natural := 50;
package Object.Handle.Generic_Set is ...
Here:
The type is declared as:
type Set is new Ada.Finalization.Controlled with private;
The following operations are defined on Set:
procedure Add (Container : in out Set; Item : Handle);
procedure Add (Container : in out Set; Item : Object_Type_Ptr);
procedure Add (Container : in out Set; Items : Set);
These procedures are used to add an object to a set or all items of one set to another. The parameter Item can be either a handle or a pointer to the object. Nothing happens if an item is already in the set or pointer is an invalid handle or null.
procedure Adjust (Container : in out Set);
The assignment does not make a copy of the Container. It just increments an internal use count of the set body. A set will be physicaly copied only when a destructive operation is applied to it.
function Create return Set;
This function returns an empty set.
procedure Erase (Container : in out Set);
This procedure removes all objects from the set. The objects referenced only by Container will be deleted.
procedure Finalize (Container : in out Set);
The destructor may delete some objects referenced by Container.
function Find (Container : Set; Item : Handle)
return Integer;
function Find (Container : Set; Item : Object_Type_Ptr)
return Integer;
This function is used to Item in the set Container. The result is either a positive index of the found item or a negated index of the place where the item should be if it were in the set.
function Get (Container : Set; Index : Positive)
return Object_Type_Ptr;
This function is used to get an item of the set Container using a positive index. The result is a pointer to the object. It is valid as long as the object is in the set. See also Ref which represents a safer way of accessing the set items. Constraint_Error is propagated if Index is wrong.
function Get_Size (Container : Set) return Natural;
This function returns the number of items in the set.
function Is_Empty (Container : Set) return Boolean;
True is returned if Container is empty.
function Is_In (Container : Set; Item : Handle)
return Boolean;
function Is_In (Container : Set; Item : Object_Type_Ptr)
return Boolean;
True is returned if Item is in Container. Item can be either a pointer to an object or a handle to it. The result is always false when Item is invalid or null.
function Ref (Container : Set; Index : Positive) return Handle;
This function is used to get an item of the set Container using a positive index. The result is a handle to the object. Constraint_Error is propagated if Index is wrong.
procedure Remove (Container : in out Set; Index : Positive);
procedure Remove (Container : in out Set; Item : Handle);
procedure Remove (Container : in out Set; Item : Object_Type_Ptr);
procedure Remove (Container : in out Set; Items : Set);
These procedures are used to remove items from the set Container. An item can be removed either by its index, or explicitly by a pointer or handle to it, or else by specifying a set of items to be removed. If a particular item is not in the set, then nothing happens. Also nothing happens if a handle is illegal or pointer is null. Constraint_Error is propagated when item index is wrong.
function "and" (Left, Right : Set) return Set;
function "or" (Left, Right : Set) return Set;
function "xor" (Left, Right : Set) return Set;
These functions are conventional set operations - intersection, union, difference. Difference is defined as a set which items are only in one of the sets Left and Right.
function "=" (Left, Right : Set) return Boolean;
True is returned if both sets contain same items.
A persistent object is one stored in an external storage, independent on the application, that originally created it. A persistent object can be restored from the external storage in a fully functional state in the same or other application. The provided implementation of persistent objects was designed with the following goals in mind:
Like other objects, persistent ones are normally accessed through handles.
The package Object.Archived defines the type Deposit serving as the abstract base type for all persistent objects:
A type derived from Deposit should:type Deposit is abstract new Entity with private;
type Deposit_Ptr is access Deposit'Class;
Objects may depend on other objects, but these dependencies may not be circular. Store and Restore provide forth and back string conversions. String was chosen instead of Stream_Element_Array to make it portable across different systems.
Storing an object:
Restoring an object:
The type Backward_Link is used when it is necessary to monitor deletion of an object.
type Backward_Link is abstract new Entity with private;
type Backward_Link_Ptr is access Backward_Link'Class;
Reference counting is used to prevent deletion of a Deposit object, when it is in use. Such objects are referenced through handles. These are direct links to the object. But sometimes it is necessary to break the dependency of one object from another to delete the latter. For this the former object may become a notice about a desire to delete a referent. Upon this notification it can invalidates the handle to the referent and so allow the collector to delete it. A notification object is derived from Backward_Link, which represent a backward link from a referred object to a dependent one. Each Deposit object maintains a list of its backward links. Typically an external storage connection object tracks all persistent objects which are in the memory at the moment. Usually it has an index of such memory resident objects. A record of this index has a handle to a specialized descendant of Backward_Link. So when an object is no more in use and so the last handle to it disappears, the object is automatically destroyed. In the course of this operation the storage connection object becomes a notification via call to Destroyed. At this point the object being destroyed can be stored and then removed from the external storage index of memory resident objects.
type Deposit_Container is abstract
new Ada.Finalization.Controlled with private;
The type Deposit_Container is an abstract specialized container for Deposit objects. The container operates as a container of handles. That is when an object is put into it, then the object will not be deleted until it is in. Physically a reference to the object is placed into the container. Deposit_Container objects are used upon object storing and restoring to keep the list of things the object depends on. Deposit_Container is not limited so it can be copied when necessary. The child packages Object.Archived.Sets and Object.Archived.Lists provide unordered (set) and ordered (list) implementations of Deposit_Container.
This class-wide procedure is called before finalization of a persistent object. It cleans the list of backward links. So it plays the role of a class-wide destructor. Finalize should always call it. For example, if Derived is a descendant of Deposit overriding Finalize, then the implementation should look like:
procedure Finalize (Object : in out Derived) is
begin
Close (Object);
... -- finalization of Derived
Finalize (Deposit (Object));
end Finalize;
It is safe to call it multiple times, though it is essential to call it before any vital object data get finalized. So Finalization of a type derived from Derived may call Close as well. Note that in Ada Finalize is called prior finalization of any object's components. So it is safe to use them. However, keep in mind that task components (if any) are though not yet finalized, but completed before Finalize, thus neither Store nor Get_Referents may communicate with task components of the object.
procedure Create
( Source : String;
Pointer : in out Integer;
Class : String;
List : Deposit_Container'Class;
Object : out Deposit_Ptr
);
This procedure calls Restore for Class simulating a dispatching call. Name_Error is propagated if Class is not a registered object class. The string Source contains object description to be restored starting from the character Source (Pointer). Pointer is advanced to the first object following from the used ones. The parameter Object accepts a pointer to the newly created object.
Data_Error | Syntax error |
End_Error | Nothing matched |
Layout_Error | The value of Pointer is not in the range Source'First..Source'Last+1 |
Name_Error | Class is not a registered class |
Use_Error | Insufficient dependencies list |
This procedure is used when Object is being deleted. On each item in the Object's obituary notices delivery list, Delete is called. This has the effect that some references to Object may disappear and so the object will be collected. Note that a call to Delete does not guaranty Object's deletion, because some references to it, may still be present. It is safe to add new backward links to the Object's notification list from Delete, because the items are appended at the end of the delivery list. This also means that they will receive a Deleted callback in the course of the same notification. Though Object's deletion is not guaranteed it might happen. So to prevent undefined behavior a caller should hold a handle to Object when it calls to Delete.
Upon finalization backward links list is cleaned. All interested parties receive a notification via call to Destroyed. A derived type implementation have to call Finalize as well as Close.
procedure Free (Object : in out Deposit_Ptr);
This procedure is used to delete manually created objects. It is never called for existing objects, only for improperly constructed ones from an implementation of Restore.
function Get_Class (Object : Deposit) return String is abstract;
This function returns the class of Object. The class is a string uniquely describing the object's type. It is analogous to external type tag representation. Though, different types of objects may share same class if necessary.
procedure Get_Referents
( Object : Deposit;
Container : in out Deposit_Container'Class
);
This procedure adds objects referenced from Object to Container objects. Only immediately viewed objects are stored there. No deep search has to be made to detect all objects. Objects shall not depend recursively. The default implementation does nothing, which behavior corresponds to an independent object. An implementation may raise Use_Error on a wrong object. See also notes about Close.
function Is_Modified (Object : Deposit)
return Boolean is abstract;
This function is used to check if Object's state was changed. Persistent objects serving as proxies to a persistent storage will require synchronization if this function returns true. An implementation of a mutable object would normally have a Boolean flag to be set by any destructive operation or new object creation.
procedure Reset_Modified (Object : in out Deposit) is abstract;
This procedure is used to reset Object's state modification flag. It is called immediately after synchronization the object with the persistent storage.
type Restore is access procedure
( Source : String;
Pointer : in out Integer;
Class : String;
List : Deposit_Container'Class;
Object : out Deposit_Ptr
);
This procedure creates a new object from its string representation. It parses Source starting from Source (Pointer). Pointer is then advanced to the first character following the object's description in the string. The procedure has to be dispatching depending on the object's class, which is impossible in Ada. For this reason it is defined as an access to procedure type. Each object class has to define such a function and register it (see Register_Class). The parameter Class contains the actual object class according to which dispatch to an implementation of Restore was made. The parameter List contains the references to the objects the restored object depends on. The order of the objects in the list is same as one returned in Get_Referents. The result is a newly allocated object pointed by the Object parameter. An implementation may raise the following exceptions to indicate errors:
Data_Error | Syntax error |
End_Error | Nothing matched |
Layout_Error | The value of Pointer is not in the range Source'First..Source'Last+1 |
Use_Error | Insufficient dependencies list |
procedure Store
( Destination : in out String;
Pointer : in out Integer;
Object : Deposit
) is abstract;
An implementation places string describing Object is into Destination starting from the position specified by Pointer. Pointer is then advanced to the next position following the output. Layout_Error is propagated when Pointer not in Source'First..Source'Last + 1 or there is no room for output. Use_Error can be raised when Object is wrong. See also notes about Close.
procedure Attach
( Link : Backward_Link_Ptr;
Object : Deposit_Ptr
);
This procedure places Link at the end of Object's delivery list. If it is already in another list then it is removed from there first. Nothing happens if Object is null.
procedure Deleted
( Link : in out Backward_Link;
Temps : in out Deposit_Container'Class
) is abstract;
This procedure is used when an object is requested to be deleted. Normally Deleted is called as a result of object deletion request via call to Delete. The parameter Temps is the list of temporal objects the implementation might create. For example, some objects might be created to be notified within the course of the operation performed by the caller. Note that the caller should hold a handle to Link, to allow the callee to undertake actions which would otherwise lead to Link deletion. Note also that object's finalization does not cause a call to Delete it calls Destroyed instead.
procedure Destroyed (Link : in out Backward_Link) is abstract;
This procedure is used when an object is finally destroyed, but is still fully operable. Thus an implementation of Destroyed may safely access the object referred by Link. It may for example synchronize the object with the external storage or remove the object from the index cache etc. The caller should hold a handle to Link.
procedure Detach (Link : in out Backward_Link);
This procedure removes Link from object's delivery list, if any.
procedure Finalize (Link : in out Backward_Link);
This procedure should be called by a derived type if overridden. Link is removed for object's delivery list if any.
function Self (Link : Backward_Link) return Backward_Link_Ptr;
This function returns a pointer to the link object (to Link itself). Constraint_Error is propagated when Link is not bound to any object.
This function returns a pointer to the target of Link. Constraint_Error is propagated when Link is not bound to any object.
procedure Add
( Container : in out Deposit_Container;
Object : Deposit_Ptr;
Backward : Boolean := False
) is abstract;
This procedure puts a reference to Object into Container. The implementation should ensure that Object will not be destroyed until it is in. The parameter Backward, when true indicates a backward link. Backward links are used when the dependent object associated with the container can survive deletion of Object. It is an optional parameter which may be ignored by some implementations. When it is supported, then marking an Object as a backward link should override the effect of any placing the same object as a direct link (with Backward = false). Nothing happens if Object is null.
procedure Erase (Container : in out Deposit_Container) is abstract;
This procedure removes all objects from Container.
function Get
( Container : Deposit_Container;
Index : Positive
) return Deposit_Ptr is abstract;
This function is used to enumerate the objects in a container Objects indices start with 1. Contraint_Error is propagated when Index is wrong.
function Get_Size (Container : Deposit_Container)
return Natural is abstract;
This function returns the number of objects in Container, i.e. the largest possible index allowed in Get. 0 is returned when the container is empty. Note that the objects in a container need not to be all different. This depends on the container implementation.
function Is_Backward
( Container : Deposit_Container;
Object : Deposit_Ptr
) return Boolean is abstract;
The result of this function is true if a backward link is used for Object in Container. Constraint_Error is propagated when Object is not in Container. Use_Error is propagated when the container implementation does not distinguish direct and backward links.
function Is_Empty (Container : Deposit_Container'Class)
return Boolean;
This function returns true if Container is empty. It is class-wide.
function Is_In
( Container : Deposit_Container;
Object : Deposit_Ptr
) return Boolean is abstract;
This function returns true if Object is in Container. Note that null cannot be in any container.
function Is_Registered (Class : String) return Boolean;
This function returns true if there is a class of objects registered under the name Class.
procedure Register_Class
( Class : String;
Constructor : Restore
);
This procedure is used to register each new class of objects. It is analogous to creating a dispatching table. It is necessary to register a class to make Restore functions working. Nothing happens if the class is already registered and has same constructor. Name_Error is propagated when class is registered with a different constructor.
The package Object.Archived.Sets provides an implementation of Deposit_Container. The type Deposit_Set is derived there:
type Deposit_Set is new Deposit_Container with private;
Sets do not distinguish multiple insertion of an object. they also ignore the Backward parameter of Add. So Is_Backward will raise Use_Error. Additionally to the predefined operations, Deposit_Set provides standard set-operations:
procedure Remove
( Container : in out Deposit_Set;
Object : Deposit_Ptr
);
This procedure removes Object from Container. Nothing happens if it is null or not in.
function "and" (Left, Right : Deposit_Set) return Deposit_Set;
function "or" (Left, Right : Deposit_Set) return Deposit_Set;
function "xor" (Left, Right : Deposit_Set) return Deposit_Set;
These functions are conventional set operations - intersection, union, difference. Difference is defined as a set which items are only in one of the sets Left and Right.
function "=" (Left, Right : Deposit_Set) return Boolean;
true is returned if both sets contain same items.
The package Object.Archived.Lists provides an implementation of Deposit_Container. The type Deposit_List is derived there as:
type Deposit_List is new Deposit_Container with private;
All objects in the list are enumerated from 1. The same object can occupy several places in the list. In the external storage Deposit_List can be stored as a set of objects, where objects do not repeat, followed by a list of values identifying the objects in the set. Additionally to the predefined operations, Deposit_List provides:
function Get_Total (Container : Deposit_List) return Natural;
This function returns the number of distinct objects in Container. This value is less or equal to the one returned by Get_Size.
function Is_First
( Container : Deposit_List;
Index : Positive
) return Boolean;
This function returns true if Index is the least index of the object it specifies. I.e. the least index of the object returned by Get (Container, Index). Constraint_Error is propagated if Index is wrong.
The package Object.Archived.Iterators provides an abstract iterator of references:
type References_Iterator
( Referents : access Deposit_Container'Class
) is new Ada.Finalization.Limited_Controlled with private;
The type References_Iterator can be used directly or be extended. It provides the following operations:
procedure Enumerate
( Iterator : in out References_Iterator'Class;
Object : Deposit'Class
);
This class-wide procedure is called to enumerate references of Object. This same procedure is used for both starting the process and continuing it for each found reference. Enumerate calls Get_Referents for Object and places all found objects which Object depends on into Iterator.Referents.all. A found object is placed only once which is detected by looking into Iterator.Referents.all. The object itself is not put there. After completion the caller may inspect Iterator.Referents.all for any found objects.
procedure On_Each
( Iterator : in out References_Iterator;
Referent : Deposit_Ptr
);
This procedure can be overridden. It is called by Enumerate each time a new object is found. It may raise an exception to stop the iteration process. This exception will then propagate out of Enumerate.
Persistent objects are subject of garbage collection. The recommended way to access them is through handles, which prevents premature destruction of objects in use. Handles can be aggregated into other objects to express object dependencies. Note that circular dependencies shall be avoided. The best way to do it is to design object in a way that would exclude any possibility of circular dependencies. If that is not possible, then Is_Dependent should be used to check dependencies at run time. The generic package Object.Archived.Handle defines the type Handle used to reference persistent object. It is derived from Handle obtained by an instantiation of Object.Handle:
generic
type Object_Type is abstract new Deposit with private;
type Object_Ptr_Type is access Object_Type'Class;
package Handles is new Object.Handle (Deposit, Deposit_Ptr);
type Handle is new Handles.Handle with null record;
The formal parameters of the package are:
There is a ready-to use instantiation of Object.Archived.Handle with Deposit and Deposit_Ptr as the actual parameters: Deposit_Handles.
The package Object.Archived.Handle defines the following operations on Handle:
procedure Add
( Container : in out Deposit_Container;
Object : Handle;
Backward : Boolean := False
) is abstract;
This procedure puts Object into Container. The parameter Backward, when true indicates a backward link. Backward links are used when the dependent object associated with the container can survive deletion of Object. Constraint_Error is propagated when Object is an invalid handle.
This procedure requests deletion of the object pointed by the handle Object. As the result of the operation Object becomes an invalid handle. The object itself is deleted if possible. Nothing happens if Object is not a valid handle.
function Get_Class (Object : Handle) return String;
This function returns the class of Object. The class is a string uniquely describing the object's type. It is analogous to an external type tag representation. Though, different types of objects may have same class if necessary.
procedure Get_References
( Object : Handle;
Container : in out Deposit_Container'Class
);
This procedure adds to Container references to all objects the object specified by the handle Object depends on. No objects added if Object is an invalid handle.
procedure Invalidate (Object : in out Handle);
This procedure detaches handle from the object (if any) it points to. The result handle cannot be used to access any object. The referenced object is destroyed if it was the last handle.
function Is_Backward
( Container : Deposit_Container'Class;
Object : Handle
) return Boolean;
This function returns true if a backward link used for Object in Container. Contstraint_Error is propagated when Object is not in Container or invalid handle. Use_Error does when Container does not distinguish direct and backward links.
function Is_Dependent
( Dependant : Handle;
Referent : Handle
) return Boolean;
function Is_Dependent
( Dependant : Handle;
Referents : Deposit_Container'Class
) return Boolean;
These functions check whether Dependant refers to Referent or, when the second parameter is a container, then whether Dependant refers to any of the objects from that container. The result is false if Dependant, Referent is invalid or Referent is empty.
function Is_In
( Container : Deposit_Container'Class;
Object : Handle
) return Boolean;
This function returns true if Object is in Container. When Object is an invalid handle, the result false.
function Is_Valid (Object : Handle) return Boolean;
This function checks whether a handle points to any object, i.e. is valid.
function Ptr (Object : Handle) return Deposit_Ptr;
This function is used to get a pointer to the object the handle Object points to. The pointer of to the object shall be used no longer the handle it was get from exists. A safe way to do it is to avoid declarations of any variables of the type Deposit_Ptr.
function Ref (Thing : Object_Type_Ptr) return Handle;
This function is used to get a handle from a pointer to an persistent object.
function Ref
( Container : Deposit_Container'Class;
Index : Positive
) return Handle;
This function can be used to enumerate the objects in a container. Objects are enumerated from 1. The result is a valid handle to an object in Container. Contraint_Error is propagated when Index is wrong. Note that objects may repeat in containers of some types.
function References (Object : Handle) return Deposit_Set;
This function is used to query all objects its argument depends on. The result is a set of objects. It is empty if Object is an invalid handle.
The package Deposit_Handles provides an instantiation of Object.Archived.Handle:
package Deposit_Handles is
new Object.Archived.Handle (Deposit, Deposit_Ptr);
This paragraph describes a simplified example of persistent storage. It provides an implementation of a persistent storage based on direct access file. As an example of persistent objects serve nodes of binary trees.
The implementation uses a direct access file to store objects. Each object is stored in one file record. The record number serves as the object key. Observe that the implementation is independent from any implementation of concrete persistent object types (derived from Deposit). This example serves illustrative purpose. For abstract persistent storage interface see Persistent, Persistent.Handle. For persistent storage implementations see Persistent.Handle.Factory.
File test_persistent_file_storage.ads:with Ada.Direct_IO; with Ada.Finalization; with Generic_Map; with Object.Handle; with Object.Archived; use Object.Archived; with Deposit_Handles; use Deposit_Handles; package Test_Persistent_File_Storage is -- -- File_Storage -- Direct I/O based storage for persistent objects -- type File_Storage is new Ada.Finalization.Limited_Controlled with private; -- -- Key -- To reference stored objects = record number 1.. -- type Key is new Integer; subtype Deposit_Handle is Deposit_Handles.Handle; procedure Initialize (Storage : in out File_Storage); procedure Finalize (Storage : in out File_Storage); procedure Clean_Up; function Store ( Storage : access File_Storage; Object : Deposit_Handle ) return Key; function Restore ( Storage : access File_Storage; ID : Key ) return Deposit_Handle; |
Here we declare the type File_Storage as a limited controlled type. The procedures Initialize / Finalize are overridden to provide construction / destruction. Upon construction the file is opened. Upon destruction it is closed. The procedure Clean_Up is provided to delete the file. The function Store will be used to store an object. It returns the object key, which identifies the object there. The key has the type Key also declared in this package. It is the number of the record reserved for the object in the file. When the object is already persistent in the file, its key is returned, so it is safe to call Store multiple times. The function Restore is the operation opposite to Store. It takes the object key and returns a handle to the object. Restore is also safe to call multiple times. So when the object referenced by a key, is already memory resident, a handle to it is returned instead of creating a new memory resident copy. The type Handle from the package Deposit_Handles is used to reference persistent objects. Deposit_Handles.Handle is "renamed" to Deposit_Handle for convenience. The objects themselves are never referenced directly but through handles only.
File test_persistent_file_storage.ads (continued, the private part):private -- -- Index_Record -- One per bound object -- type Index_Record (Storage : access File_Storage) is new Backward_Link with record ID : Key; -- Object identifier end record; type Index_Record_Ptr is access all Index_Record'Class; -- -- Implementation of Backward_Link's operation -- procedure Deleted ( Link : in out Index_Record; Temps : in out Deposit_Container'Class ); procedure Destroyed (Link : in out Index_Record); |
A File_Storage object encapsulates the file and an index of all memory resident objects from that file. The index consists of Index_Records. One record is allocated per memory resident object. Index_Record is derived from Backward_Link to monitor what happens with the object. It also contains the object's key in the file. Two operations of Backward_Link need to be implemented: Deleted and Destroyed. The implementation of Deleted is called upon a request of object deletion. It does nothing in our case. Destroyed is called when the object is about to be finalized. In our case we store that object into the file. A more advanced implementation would check if the object was modified. It could also check if the object was requested for deletion and is no more referenced from other objects, in which case it can be removed from the persistent storage as well. But that would be too complex for a small illustrative example.
File test_persistent_file_storage.ads (continued, the private part):
-- -- Record_Handles -- Handles to index records -- package Record_Handles is new Object.Handle (Index_Record, Index_Record_Ptr); use Record_Handles; subtype Record_Handle is Record_Handles.Handle; -- -- Map : object pointer -> record handle -- function "<" (Left, Right : Deposit_Ptr) return Boolean; package Object_Maps is new Generic_Map ( Key_Type => Deposit_Ptr, Object_Type => Record_Handle ); use Object_Maps; subtype Object_Map is Object_Maps.Map; -- -- Map : object key -> record handle -- package Key_Maps is new Generic_Map ( Key_Type => Key, Object_Type => Record_Handle ); use Key_Maps; subtype Key_Map is Key_Maps.Map; |
To reference Index_Record we will use handles provided by Record_Handles, an instantiation of Object.Handle. A handle to Index_Record is "renamed" to Record_Handle. Then we declare two maps: one to map objects to index records, another to map keys to the records. For this the package Generic_Map is instantiated once as Object_Maps and once as Key_Maps. Both use Record_Handle to reference Index_Record. So when the index record is deleted it is enough to remove it from the both maps and the object Index_Record will be automatically collected. Note also that Object_Map uses Deposit_Ptr, a pointer to the persistent object rather than a handle to it. It is important to allow object deletion. Otherwise an object would be never deleted as long as Index_Record referring it exists, i.e. up to File_Storage finalization. It would a thinkable, but too crude implementation. Generic_Map requires map keys be comparable, so the implementation declares "<" on Deposit_Ptr.
File test_persistent_file_storage.ads (continued, the private part): -- -- File record -- type Reference_List is array (Integer range 1..256) of Key; type File_Record is record Length : Natural := 0; Count : Natural := 0; References : Reference_List; Descriptor : String (1..1024); end record; package Record_Files is new Ada.Direct_IO (File_Record); use Record_Files; -- -- File_Storage -- Implementation -- type File_Storage is new Ada.Finalization.Limited_Controlled with record File : File_Type; Object_To_Record : Object_Map; Key_To_Record : Key_Map; Last_ID : Key := 0; -- Last used object key end record; end Test_Persistent_File_Storage; |
The type File_Record describes one record in the file. The field References is the list of the keys of all the objects referred by the object. Count is the length of the list. The field Descriptor is a string describing the object. The length of the string is the field Length.
File test_persistent_file_storage.adb:with
Object.Archived.Lists; use
Object.Archived.Lists; with Strings_Edit; use Strings_Edit; package body Test_Persistent_File_Storage is function "<" (Left, Right : Deposit_Ptr) return Boolean is begin if Right = null then return False; elsif Left = null then return True; else return Less (Left.all, Right.all); end if; end "<"; procedure Clean_Up is File : File_Type; begin Create (File, Out_File, "test.dat"); Close (File); end Clean_Up; |
The implementation of "<" uses Less defined on objects to order them. Clean_Up opens the file in Out_File mode and immediately closes it. This erases the file.
File test_persistent_file_storage.adb (continued): procedure Write ( Storage : in out File_Storage; Object : Deposit'Class; ID : Key ) is References : Deposit_List; Data_Record : File_Record; Pointer : Integer := Data_Record.Descriptor'First; begin Get_Referents (Object, References); Data_Record.Count := Get_Size (References); for Item in 1..Data_Record.Count loop Data_Record.References (Item) := Store (Storage'Access, Ref (References, Item)); end loop; Put (Data_Record.Descriptor, Pointer, Get_Class (Object)); Put (Data_Record.Descriptor, Pointer, ":"); Store (Data_Record.Descriptor, Pointer, Object); Data_Record.Length := Pointer; Write (Storage.File, Data_Record, Count (ID)); end Write; |
The procedure Write is defined to store an object under the specified key. It calls to Get_Referents to obtain the list of the objects the stored object needs. Then for each such object it calls Store to ensure the object persistency in the file. The keys returned by Store are placed into the References array. After that Write starts to form the field Description. It places the object class there (Get_Class) followed by a colon. Then object's Store is called to query the object description and to add it to Description. The completed object record is then written into the file.
File test_persistent_file_storage.adb (continued): procedure Initialize (Storage :
in out File_Storage) is begin Open (Storage.File, Inout_File, "test.dat"); Storage.Last_ID := Key (Size (Storage.File)); end Initialize; procedure Finalize (Storage : in out File_Storage) is begin while not Is_Empty (Storage.Key_To_Record) loop declare Index_Item : Index_Record renames Ptr (Get (Storage.Key_To_Record, Integer'(1))).all; begin Write (Storage, This (Index_Item).all, Index_Item.ID); end; Remove (Storage.Key_To_Record, Integer'(1)); Remove (Storage.Object_To_Record, 1); end loop; Close (Storage.File); end Finalize; procedure Bind ( Storage : access File_Storage; Object : Deposit_Handle; ID : Key ) is Link_Ptr : Backward_Link_Ptr := new Index_Record (Storage); Index_Item : Index_Record renames Index_Record (Link_Ptr.all); begin Index_Item.ID := ID; Attach (Link_Ptr, Ptr (Object)); Add ( Storage.Object_To_Record, Ptr (Object), Ref (Index_Item'Unchecked_Access) ); Add ( Storage.Key_To_Record, ID, Ref (Index_Item'Unchecked_Access) ); end Bind; |
The implementation of Initialize just opens the file for input / output and initializes the field Last_ID. Finalize goes through the index of memory resident objects (the key to object map). For each record of the index it calls Write to store the corresponding object and then removes the references to the index record from both maps. This in turn deletes the record itself. Note how This is used to get the object. The procedure Bind is defined to create an index record. It calls to Attach to bind Index_Record with the object and places handles to Index_Record in each of the maps. Ref is used to obtain them
File test_persistent_file_storage.adb (continued):
function Store ( Storage : access File_Storage; Object : Deposit_Handle ) return Key is This : Deposit_Ptr := Ptr (Object); begin if This = null or else not Is_In (Storage.Object_To_Record, This) then Storage.Last_ID := Storage.Last_ID + 1; Bind (Storage, Object, Storage.Last_ID); return Storage.Last_ID; else return Ptr (Get (Storage.Object_To_Record, This)).ID; end if; end Store; |
The implementation of Store first looks into the index to check if it is already there. If yes it returns the key of the object. Otherwise it generates a new key by incrementing the field Last_ID and calls Bind to create a new index record.
File test_persistent_file_storage.adb (continued): function Restore (Storage :
access File_Storage; ID : Key) return Deposit_Handle is begin if Is_In (Storage.Key_To_Record, ID) then return Ref (This (Ptr (Get (Storage.Key_To_Record, ID)).all)); else -- -- Read the object from the file -- declare Data : File_Record; List : Deposit_List; Object : Deposit_Ptr; Result : Deposit_Handle; Pointer : Positive; begin Read (Storage.File, Data, Count (ID)); for No in 1..Data.Count loop Add (List, Restore (Storage, Data.References (No))); end loop; Pointer := Data.Descriptor'First; while Data.Descriptor (Pointer) /= ':' loop Pointer := Pointer + 1; end loop; Pointer := Pointer + 1; Create ( Data.Descriptor, Pointer, Data.Descriptor (Data.Descriptor'First..Pointer - 2), List, Object ); Result := Ref (Object); Bind (Storage, Result, ID); return Result; end; end if; end Restore; |
The procedure Restore checks the index if an object with the specified key was already created. If yes it returns a handle to the object. This is used to get an object pointer from Index_Record. When the key identifies an unknown object, Restore reads its record from the file. The key is the record number. Restore goes through the array References and for each key calls itself to ensure this object to be restored too. The returned handle to that object is placed in a Deposit_List container. The container together with Descriptor's prefix (up to the first colon) as object's class name and the rest of it as the object's description, are passed to Create. That creates the object. A handle to it is then returned after Bind is called to place the object into the storage index.
File test_persistent_file_storage.adb (continued): procedure Deleted ( Link : in out Index_Record; Temps : in out Deposit_Container'Class ) is begin null; end Deleted; procedure Destroyed (Link : in out Index_Record) is begin Write (Link.Storage.all, This (Link).all, Link.ID); Remove (Link.Storage.Object_To_Record, This (Link)); Remove (Link.Storage.Key_To_Record, Link.ID); end Destroyed; end Test_Persistent_File_Storage; |
The implementation of Deleted does nothing. Destroyed writes the object into the file and then removes it from the index.
Let's take binary tree node as an example of persistent object. A node may have up to two successors or none. Predecessor - successor relation is naturally mapped to dependant - referent.
File test_persistent_tree.ads:with Object.Archived;
use
Object.Archived; with Deposit_Handles; use Deposit_Handles; package Test_Persistent_Tree is -- -- Nothing -- No node handle -- function Nothing return Handle; -- -- Create_Node -- This function creates a new node -- -- Field - Identifies the node -- Left - Successor on the left (a handle to) -- Right - Successor on the right (a handle to) -- function Create_Node ( Field : Integer; Left : Handle := Nothing; Right : Handle := Nothing ) return Handle; -- -- Print -- Prints the tree rooted in a node -- -- Root - The root node (a handle to) -- procedure Print (Root : Handle; Indentation : String := ""); private -- -- Node -- Binary tree node type -- type Node is new Deposit with record Field : Integer; -- Node identifier Left : Handle; -- Left successor, a handle to Right : Handle; -- Right successor, a handle to end record; -- -- Implementation of Deposit's operations -- function Get_Class (Object : Node) return String; procedure Get_Referents ( Object : Node; Container : in out Deposit_Container'Class ); function Is_Modified (Object : Node) return Boolean; procedure Reset_Modified (Object : in out Node); procedure Restore ( Source : String; Pointer : in out Integer; Class : String; List : Deposit_Container'Class; Object : out Deposit_Ptr ); procedure Store ( Destination : in out String; Pointer : in out Integer; Object : Node ); end Test_Persistent_Tree; |
The public part of the package declares the function Create_Node and the procedure Print. Create_Node creates a new node and returns a handle to it. All nodes are referenced using Handle of Deposit_Handles. Each node is identified by an integer number. The next two parameters of Create_Node are the handles to the left and right successors. They are defaulted to an invalid handle for which the function Nothing is also declared. It plays role of a constant invalid handle. The procedure Print is used for control. It prints the tree rooted in the node specified by the parameter Root.
The private part is straightforward. It declares the type Node as a descendant of Deposit. The operations Get_Class, Get_Referents, Is_Modified, Reset_Modified, Restore and Store are overridden to provide implementations.
File test_persistent_tree.adb:with Ada.Text_IO;
use Ada.Text_IO; with Strings_Edit; use Strings_Edit; with Strings_Edit.Integers; use Strings_Edit.Integers; package body Test_Persistent_Tree is Class : constant String := "Node"; -- The class of function Nothing return Handle is None : Handle; begin return None; end Nothing; function Create_Node ( Field : Integer; Left : Handle := Nothing; Right : Handle := Nothing ) return Handle is Node_Ptr : Deposit_Ptr := new Node; Object : Node renames Node (Node_Ptr.all); begin Object.Field := Field; Object.Left := Left; Object.Right := Right; return Ref (Node_Ptr); end Create_Node; function Get_Class (Object : Node) return String is begin return Class; end Get_Class; procedure Get_Referents ( Object : Node; Container : in out Deposit_Container'Class ) is begin if Is_Valid (Object.Left) then Add (Container, Object.Left); end if; if Is_Valid (Object.Right) then Add (Container, Object.Right); end if; end Get_Referents; function Is_Modified (Object : Node) return Boolean is begin return True; -- Save it always, do not care about performance end Is_Modified; procedure Reset_Modified (Object : in out Node) is begin null; end Reset_Modified; |
The implementation of Get_Referents places handles to the node successors into a Deposit_Container. The left successor is placed first. Is_Modified and Reset_Modified are void for sake of simplicity. So a node is always written into the persistent storage even if it is not changed.
File test_persistent_tree.adb (continued): procedure Restore ( Source : String; Pointer : in out Integer; Class : String; List : Deposit_Container'Class; Object : out Deposit_Ptr ) is Field : Integer; Left : Handle; Right : Handle; begin if Source (Pointer) = '<' then Left := Ref (List, 1); if Source (Pointer + 1) = '>' then Right := Ref (List, 2); end if; elsif Source (Pointer + 1) = '>' then Right := Ref (List, 1); end if; Pointer := Pointer + 2; Get (Source, Pointer, Field); Object := new Node; declare Item : Node renames Node (Object.all); begin Item.Field := Field; Item.Left := Left; Item.Right := Right; end; exception when others => raise Data_Error; end Restore; |
The implementation of Restore first gets description of node dependencies from the source string. It is two characters. The first one is either '<' if there is a left successor or '-' otherwise. The second is '>' if there is a right successor or else '-'. After that it gets the node identifier (plain integer number). Then a new node object is allocated. Note that the target access type should be Deposit_Ptr to ensure right storage pool selection.
File test_persistent_tree.adb (continued): procedure Store ( Destination : in out String; Pointer : in out Integer; Object : Node ) is begin if Is_Valid (Object.Left) then Put (Destination, Pointer, "<"); else Put (Destination, Pointer, "-"); end if; if Is_Valid (Object.Right) then Put (Destination, Pointer, ">"); else Put (Destination, Pointer, "-"); end if; Put (Destination, Pointer, Object.Field); end Store; procedure Print (Root : Handle; Indentation : String := "") is begin if Is_Valid (Root) then declare The_Node : Node renames Node (Ptr (Root).all); begin Put_Line (Indentation & "\_" & Image (The_Node.Field)); Print (The_Node.Left, Indentation & " |"); Print (The_Node.Right, Indentation & " "); end; else Put_Line (Indentation & "\_*"); end if; end Print; begin Register_Class (Class, Restore'Access); end Test_Persistent_Tree; |
The procedure Store is reverse to Restore. Also the package defines a new class of persistent objects named Node. For this it calls Register_Class once upon elaboration with the class name and a pointer to Restore as parameters.
The test program is shown below. It consists of two sessions. In the first session an object is stored. In the second one it is restored.
File test_persistent_storage.adb:with Ada.Text_IO;
use Ada.Text_IO; with Test_Persistent_File_Storage; use Test_Persistent_File_Storage; with Test_Persistent_Tree; use Test_Persistent_Tree; with Deposit_Handles; use Deposit_Handles; procedure Test_Persistent_Storage is Root_Key : Key; begin Clean_Up; Put_Line ("Session 1"); declare DB : aliased File_Storage; Root : Handle; begin Root := Create_Node ( 1, Create_Node (2), Create_Node ( 3, Create_Node ( 4, Create_Node (5) ), Create_Node (6) ) ); Print (Root); Root_Key := Store (DB'Access, Root); end; Put_Line ("Session 2"); declare DB : aliased File_Storage; Root : Handle; begin Root := Restore (DB'Access, Root_Key); Print (Root); end; end Test_Persistent_Storage; |
The test program first calls Clean_Up to delete any existing storage file. Then it declares DB, a File_Storage object. After that a tree is created and Root becomes a handle to the tree root node. The tree is printed and then its root node is stored into DB. There result of the operation is the external key of the root node. This key can be used to restore the object. Note that the whole tree is stored because the any node depends on its child nodes. What Store does depends on the implementation. In our case physical file writing happens either upon finalization of the storage object (DB) or upon finalization of the persistent object (Root). Both objects are go out of scope at end closing the first session. The second session uses Restore and the external key to bring the root node back from the storage. Again, all the objects it depends on are restored as well. Finally, the restored tree is printed.
The test program that uses an ODBC data base as a persistent storage is shown below:
File test_ODBC_persistence.adb:with Ada.Text_IO;
use Ada.Text_IO; with Deposit_Handles; use Deposit_Handles; with Persistent.Handle; use Persistent.Handle; with Test_Persistent_Tree; use Test_Persistent_Tree; with Test_ODBC_Session; use Test_ODBC_Session; procedure Test_APQ_Persistence is Name : constant Wide_String := "The tree"; begin Put_Line ("Session 1"); declare DB : Storage_Handle := Open; Root : Handle; begin Root := Create_Node ( 1, Create_Node (2), Create_Node ( 3, Create_Node ( 4, Create_Node (5) ), Create_Node (6) ) ); Print (Root); Put (DB, Root, Name); end; Put_Line ("Session 2"); declare DB : Storage_Handle := Open; Root : Handle; begin Root := Get (DB, Name); Print (Root); end; end Test_APQ_Persistence; |
Then it declares DB, a Storage_Handle. The handle is initialized using the function Open defined in Test_ODBC_Session.adb. It prompts for connection parameters and then calls Persistent.ODBC.Create. After that a tree is created and Root becomes a handle to the tree root node. The tree is printed and then its root node is stored into DB as "The three". For this it calls Put. Note that the whole tree is stored because the any node depends on its child nodes. The second session uses Get and the name "The three" to bring the root node back from the storage. Again, all the objects it depends on are restored as well. Finally, the restored tree is printed. Carefully observe that the package Test_Persistent_Tree needed no modifications to be able to work with a different type of storage.
A the test program for APQ data base can be found in the file test_APQ_persistence.adb.
The package Persistent provides an abstract persistent storage communication object. The corresponding persistent storage can be implemented on the basis of a plain file, data base etc. Objects in the storage are identified by their names. Additionally anonymous objects can be created and deleted as required by the named ones. If an object depends on some other objects, then when stored into the storage, the referred objects are stored as well. If they do not already persist there, these objects will be anonymous. Anonymous persistent objects are subject of garbage collection. The way of collection is determined by the implementation. Named objects are deleted only on request. Delete can be applied to the object's handle to request object's deletion. If the object cannot be deleted immediately it becomes anonymous for later collection. Persistent storage interfaces are itself objects and are a subject of garbage collection as well. The object names are in Unicode. The implementation should provide encoding when the persistent storage does not natively support Unicode.
The package defines the abstract type Storage_Object which describes the interface of a persistent storage communication object. It is derived from Entity, so persistent storage interface objects are subject of garbage collection:
type Storage_Object is abstract new Object.Entity with private;
type Storage_Object_Ptr is access Storage_Object'Class;
for Storage_Object_Ptr'Storage_Pool
use Object.Entity_Ptr'Storage_Pool;
It is strongly recommended not to directly use derivatives of Storage_Object. For this purpose serve handles to the objects.
The subtype Deposit_Handle is provided for convenience in referring persistent objects. It "renames" the handle type of the package Deposit_Handles:
subtype Deposit_Handle is Deposit_Handles.Handle;
The package instantiates Generic_Set to obtain sets of object names.
package Catalogue is
new Generic_Set
( Object_Type => Unbounded_Wide_String,
Null_Element => Null_Unbounded_Wide_String
);
The following operations are defined on Storage_Object:
function Get
( Storage : access Storage_Object;
Name : Wide_String
) return Deposit_Handle is abstract;
This function returns a handle to a persistent object by its name. An implementation should first check if the the persistent object already has a memory-resident counterpart. Otherwise it should create one from the persistent storage.
Data_Error | Inconsistent Storage |
End_Error | No such object |
function Get_List
( Storage : access Storage_Object;
Prefix : Wide_String := "";
Suffix : Wide_String := "";
Equivalence : Wide_Character_Mapping := Identity
) return Catalogue.Set is abstract;
This function returns a complete list of all named objects persistent in Storage. The list does not include anonymous persistent objects. Only names starting with Prefix and ending with Suffix are returned. When names are compared two characters are considered same if their corresponding values in the mapping Equivalence are same. Prefix and Suffix may not overlap when matched. The list is a set of object names. Data_Error is propagated on any storage errors.
function Get_Name
( Storage : access Storage_Object;
Object : Deposit_Handle
) return Wide_String is abstract;
This function returns the object's name in Storage. The object is specified by its handle.
Constraint_Error | Invalid handle or Object does not persists in Storage |
Data_Error | Inconsistent Storage |
Name_Error | Object is anonymous |
function Is_In
( Storage : access Storage_Object;
Name : Wide_String
) return Boolean is abstract;
function Is_In
( Storage : access Storage_Object;
Object : Deposit_Handle
) return Boolean is abstract;
These functions check whether an object persists in Storage. The object can be identified either by its name or by a handle to it. When Object is not a valid handle the result is false. Data_Error is propagated on errors in Storage.
procedure Put
( Storage : in out Storage_Object;
Object : in out Deposit_Handle;
Name : Wide_String
) is abstract;
procedure Put
( Storage : in out Storage_Object;
Object : in out Deposit_Handle
) is abstract;
These procedures are used to store Object in Storage. The parameter Name specifies the object's name there. When omitted the object is stored as anonymous. Anonymous persistent objects are collected when not used, but not before its memory-resident counterpart vanishes. When Object already persists in Storage and Name is specified, then it is checked to be same. If this check fails, Name is empty or illegal, or conflicts with the name of another object Name_Error is propagated. When name is not specified, no check is made.
Constraint_Error | Invalid handle |
Data_Error | Inconsistent Storage |
Name_Error | Illegal name (such as empty) or name conflict |
procedure Rename
( Storage : in out Storage_Object;
Old_Name : Wide_String;
New_Name : Wide_String
) is abstract;
procedure Rename
( Storage : in out Storage_Object;
Object : in out Deposit_Handle;
New_Name : Wide_String
) is abstract;
These procedures change the name of the object specified by either its old name (the parameter Old_Name) or by a handle to it (the parameter Object). When renamed object was anonymous before renaming it becomes a named one. When Object is an invalid handle or does not refer a persistent object then Constraint_Error is propagated. End_Error is propagated when Old_Name does not refer any persistent object.
Constraint_Error | Object is invalid handle or does not refer to any object in Storage |
Data_Error | Inconsistent Storage |
End_Error | Old_Name indicates no object |
Name_Error | Illegal name (such as empty) or name conflict |
procedure Unname
( Storage : in out Storage_Object;
Object : Wide_String
) is abstract;
procedure Unname
( Storage : in out Storage_Object;
Object : in out Deposit_Handle
) is abstract;
These procedures make object anonymous. The object can be specified either by its name or by a handle to it. Unnamed objects are automatically deleted when no more in use. Nothing happens if the object is already unnamed. Nothing also happens if Object is an invalid handle, not a handle to a persistent object or does not exist. Note that anonymous objects are not deleted as long as they have memory-resident counterparts. Observe a difference between Unname and Delete (Object.Archived.Delete) called on an object handle. Delete requests object deletion from both memory and persistent storage. Unname does it for persistent storage only. Both may have no immediate effect if the object is still in use. Data_Error is propagated on errors in Storage.
A persistent storage interface is itself an object, which can be referenced by another object. Usually it is an persistent object which memory-resident counterpart is a proxy to the data in the storage. For example for a large data structure it might be very inefficient to load it into the memory. In this case in the memory one would create a small proxy object, which will requests persistent storage for parts of the object's data as necessary. Such proxy object will require a reference to its persistent storage. It also should prevent the persistent storage interface object from premature destruction. This is why it is strongly recommended to use handles to persistent storage interface objects.
The package Persistent.Handle provides the type Storage_Handle, which serves as a handle to an abstract persistent storage interface object. It is guarantied that a persistent storage interface object will not be destroyed as long at least one handle refer to it.
type Storage_Handle is private;
The following operations are defined on Storage_Handle:
function Get
( Storage : Storage_Handle;
Name : Wide_String
) return Deposit_Handle;
This function searches for the specified object by its name. If the object is already available a handle to it is returned. Otherwise it first is restored from the persistent storage.
Constraint_Error | Invalid handle |
Data_Error | Inconsistent Storage |
End_Error | No such object |
function Get_List
( Storage : Storage_Handle;
Prefix : Wide_String := "";
Suffix : Wide_String := "";
Equivalence : Wide_Character_Mapping := Identity
) return Catalogue.Set;
This function returns a complete list of all named objects persistent in Storage. The list does not include anonymous persistent objects. Only names starting with Prefix and ending with Suffix are eligible. When names are compared two characters are considered same if their corresponding values in the mapping Equivalence are same. Observe that Prefix may not overlap Suffix when matched. So if Prefix="AB" and Suffix="BC", then "ABC" does not fit, but "ABBC" does. The result of the function is a set of object names.
Constraint_Error | Invalid handle |
Data_Error | Inconsistent Storage |
function Get_Name
( Storage : Storage_Handle;
Object : Deposit_Handle
) return Wide_String;
This function returns the object's name in Storage. The object is specified by its handle.
Constraint_Error | Invalid handle or Object does not persists in Storage |
Data_Error | Inconsistent Storage |
Name_Error | Object is anonymous |
procedure Invalidate (Storage : in out Storage_ Handle);
This procedure makes handle pointing to nothing. If it was the last reference to the persistent storage interface object, the latter is destroyed.
function Is_In
( Storage : Storage_Handle;
Name : Wide_String
) return Boolean;
function Is_In
( Storage : Storage_Handle;
Object : Deposit_Handle
) return Boolean;
These functions check whether an object persists in Storage. The object can be identified either by its name or by a handle to it. When Object is not a valid handle the result is false.
Constraint_Error | Invalid handle (Storage) |
Data_Error | Inconsistent Storage |
This function checks whether a handle points to a persistent storage interface object.function Is_Valid (Storage : Storage_ Handle) return Boolean;
This function is used to get a pointer to the object the handle Storage points to. The pointer of to the object shall be used no longer the handle it was get from exists.function Ptr (Storage : Storage_ Handle) return Storage_Object_Ptr;
procedure Put
( Storage : in out Storage_Handle;
Object : in out Deposit_Handle;
Name : Wide_String
);
procedure Put
( Storage : Storage_Handle;
Object : in out Deposit_Handle
);
These procedure are used to store Object in Storage. The parameter Name specifies the object name there. When omitted the object is stored anonymous. Anonymous persistent objects are collected when no more used. So before either the object pointed by the handle Object or the persistent storage interface object gets destroyed, an unreferenced anonymous object will persist in the storage. So it is safe to put an anonymous object into Storage and then reference it in another persistent object. When Object already persists in Storage and Name is specified, then it is checked that it is same. If this check fails, Name is empty, illegal, or conflicts with the name of another object Name_Error is propagated.
Constraint_Error | Invalid handle |
Data_Error | Inconsistent Storage |
Name_Error | Illegal name (such as empty) or name conflict |
function Ref (Storage : Storage_Object_Ptr) return Storage_Handle;
This function is used to obtain a handle to a persistent storage interface object. Having a handle to the object prevents object's premature destruction.
procedure Rename
( Storage : in out Storage_Handle;
Old_Name : Wide_String;
New_Name : Wide_String
);
procedure Rename
( Storage : in out Storage_Handle;
Object : in out Deposit_Handle;
New_Name : Wide_String
);
These procedures change the name of the object specified by either its old name (the parameter Old_Name) or by a handle to it. When renamed object was anonymous before renaming it becomes a named one.
Constraint_Error | Invalid handle or Object is not persistent in Storage |
Data_Error | Inconsistent Storage |
End_Error | Old_Name indicates no object |
Name_Error | Illegal name (such as empty) or name conflict |
procedure Unname
( Storage : in out Storage_Handle;
Object : Wide_String
);
procedure Unname
( Storage : in out Storage_Handle;
Object : in out Deposit_Handle
);
These procedures make Object anonymous. The object can be specified either by its name or by a handle to it. Unnamed objects are automatically deleted when no more in use. Nothing happens if the object is already unnamed. Nothing also happens if Object is an invalid handle, not a handle to a persistent object or does not exist. Note that anonymous objects are not deleted before objects pointed by either Object or Storage destroyed. There is a difference between Unname and Delete called on an object handle. Delete requests object deletion from both memory and persistent storage. Unname does it for persistent storage only. Both may have no immediate effect if the object is still in use.
Constraint_Error | Storage is not a valid handle |
Data_Error | Inconsistent Storage |
The package Persistent.Handle.Factory provides a factory of persistent storage objects.
function Create_APQ
( Server_Type : Database_Type;
Data_Base_Name : String;
User_Name : String;
Password : String;
Host_Name : String := "localhost";
Port_Number : Natural := 0;
Erase : Boolean := False
) return Storage_Handle;
This function creates an APQ persistent storage interface object and returns a handle to it. The object is responsible for interacting with a data base via APQ bindings. A connection is established to the server specified by the parameter Host_Name. The parameter Server_Type identifies the data base engine. It can be Engine_PostgreSQL, Engine_MySQL etc, one of the supported by APQ engines. The enumeration type Database_Type is defined in the package APQ delivered with the APQ distribution. User_Name and Password identify the data base user. Data_Base_Name is the name of a data base managed by the server. Port_Number specifies the TCP/IP port listened by the server. When specified as 0, a reasonable default is used. The parameter Erase when true erases the data base contents by dropping all the tables used for storing persistent objects. If the data base contains any additional tables, they remain untouched.
Data_Error | Data base error |
Use_Error | Connection problem. Either of the parameters identifying server, data base or user might be wrong |
function Create_ODBC
( Server_Name : Wide_ String;
User_Name : Wide_ String;
Password : Wide_ String;
Erase : Boolean := False
) return Storage_Handle;
This function creates a connection to an ODBC data base, i.e. any data base that has an ODBC driver and returns a valid handle to the persistent storage interface object to communicate the data base. The parameter Server_Name specifies the data server name. It denotes both the data base and the server (driver). The parameters User_Name and Password specify the user and the password to access the data base. When the parameter Erase is set to true, all used tables are erased upon establishing the connection. One can use it if there is a possibility that the data base contains some corrupted or undesired data. So the data base would initially contain no persistent objects.
Data_Error | Data base error |
Use_Error | Connection problem. Either of the parameters Server_Name, User_Name, Password might be wrong |
Simple components provide ready-to-use persistent storage implementations. The package Persistent.Handle.Factory supports run-time selection of the most suitable implementation. That might be undesirable, because the implementations rely on third party products such as GNADE and APQ. So Persistent.Handle.Factory needs all of them installed. Alternatively, if it is known that only a particular implementation will be actually used, one can do it directly without the factory. This will remove any dependency on other implementations. This section describes presently available implementations.
The package Persistent.ODBC provides an implementation of abstract persistent storage based on Open Database Connectivity (ODBC) interface to data bases. ODBC is provided for a great variety of platforms and data bases. The package declares the following subroutines:
function Create
( Server_Name : Wide_ String;
User_Name : Wide_ String;
Password : Wide_ String;
Erase : Boolean := False
) return Storage_Handle;
This function creates a connection to an ODBC data base, i.e. any data base that has an ODBC driver and returns a valid handle to the persistent storage interface object to communicate the data base. The parameter Server_Name specifies the data server name. It denotes both the data base and the server (driver). The parameters User_Name and Password specify the user and the password to access the data base. When the parameter Erase is set to true, all used tables are erased upon establishing the connection. One can use it if there is a possibility that the data base contains some corrupted or undesired data. So the data base would initially contain no persistent objects.
Data_Error | Data base error |
Use_Error | Connection problem. Either of the parameters Server_Name, User_Name, Password might be wrong |
procedure Disable_Tracing
( Storage : in out Storage_Handle
);
This procedure disables tracing of SQL requests. Constraint_Error is propagated when Storage is not a handle to ODBC persistent storage.
procedure Enable_Tracing
( Storage : in out Storage_Handle;
Name : String
);
This procedure starts tracing SQL requests using trace file Name. Constraint_Error is propagated when Storage is not a handle to ODBC persistent storage. Data_Error is propagated on any other error.
function Is_ODBC (Storage : Storage_Handle) return Boolean;
This function returns true if Storage is a valid handle to an ODBC persistent storage interface object.
function Serializable (Storage : Storage_Handle) return Boolean;
This function returns true if the ODBC driver communicated through Storage supports serializable transactions. Constraint_Error is propagated when Storage is not a valid handle to an ODBC persistent storage interface object.
function Unicode (Storage : Storage_Handle) return Boolean;
This function returns true if the ODBC driver communicated through Storage natively supports Unicode. Constraint_Error is propagated when Storage is not a valid handle to an ODBC persistent storage interface object.
Implementation notes. The implementation uses a minimal set of SQL features to support a greater number of data bases. Therefore almost everything, from generating unique keys to ON DELETE CASCADE is implemented without the data base engine. The most suitable types are selected according to the results of SQLGetTypeInfo. As the result the performance might be not optimal.
The minimal requirements for an ODBC driver:
32-bit integers | SQL_INTEGER. When 64-bit integers (SQL_BIGINT) are supported, they are used for object unique keys. Otherwise, it is (signed) 32-bit ones |
Time stamping | SQL_TIMESTAMP. |
Variable character strings | SQL_LONGVARCHAR. Also when Unicode is supported (SQL_WLONGVARCHAR) it is used to keep object names. When not supported, object names are stored in plain strings. |
PRIMARY KEY | Object primary keys are integers. |
MAX() | In SELECT |
DISTINCT | In SELECT |
NOW() | In INSERT as a value for SQL_TIMESTAMP |
NULL | In INSERT as a value for string |
The implementation tries to serialize data base transactions if the ODBC driver support it. In any case the manual-commit mode is used to provide atomic data base changes. The data base structure consists of three tables:
Table objects:
Column | Type | Description |
object_id | 64- or 32-bit integer, primary key, unique | Object key |
catalogue_name | Unicode string | Object name. Unset if object is anonymous |
class_name | Plain string | Object's class |
object_data | Plain string | Object data |
parameters_list | Plain string | The dependency list |
created_at | Time stamp | Object creation time |
Tables backward_links and direct_links:
Column | Type | Description |
dependant | 64- or 32-bit integer | Object key |
referent | 64- or 32-bit integer | Object key, the object |
The software was tested with:
Some words of warning:
Installation notes. The implementation is based on GNADE 1.5.3a (GNat Ada Database Environment). The GNADE project is distributed under modified GNU Public License. To compile the package Persistent.Handle.Factory you will need a GNADE distribution, at least its part related to ODBC bindings. If you do not use Persistent.Handle.Factory, you need not compile it. For ODBC driver installation refer your data base documentation.
The package Persistent.APQ provides an implementation of abstract persistent storage based on Ada95 Database Binding to PostgreSQL/MySQL by Warren W. Gay VE3WWG (APQ). APQ supports a number of data bases accessed via a unified interface. The package Persistent.APQ provides the following subroutines:
function Create
( Server_Type : Database_Type;
Data_Base_Name : String;
User_Name : String;
Password : String;
Host_Name : String := "localhost";
Port_Number : Natural := 0;
Erase : Boolean := False
) return Storage_Handle;
This function creates an APQ persistent storage interface object and returns a handle to it. The object is responsible for interacting with a data base via APQ bindings. A connection is established to the server specified by the parameter Host_Name. The parameter Server_Type identifies the data base engine. It can be Engine_PostgreSQL, Engine_MySQL etc, one of the supported by APQ engines. The enumeration type Database_Type is defined in the package APQ delivered with the APQ distribution. User_Name and Password identify the data base user. Data_Base_Name is the name of a data base managed by the server. Port_Number specifies the TCP/IP port listened by the server. When specified as 0, a reasonable default is used. The parameter Erase when true erases the data base contents by dropping all the tables used for storing persistent objects. If the data base contains any additional tables, they remain untouched.
Data_Error | Data base error |
Use_Error | Connection problem. Either of the parameters identifying server, data base or user might be wrong |
procedure Disable_Tracing
( Storage : in out Storage_Handle
);
This procedure disables tracing of SQL requests. Constraint_Error is propagated when Storage is not a handle to APQ persistent storage.
procedure Enable_Tracing
( Storage : in out Storage_Handle;
Name : String
);
This procedure starts tracing SQL requests using trace file Name. Constraint_Error is propagated when Storage is not a handle to APQ persistent storage. Data_Error is propagated on any other error.
function Is_APQ (Storage : Storage_Handle) return Boolean;
This function returns true if Storage is a valid handle to an APQ persistent storage interface object.
Implementation notes. The data base structure consists of three tables:
Table objects:
Column | Type | Description |
object_id | 64- or 32-bit integer, primary key, unique, auto-incremented | Object key |
catalogue_name | Plain string. Unicode characters are encoded using XML &#<number>; escape sequences | Object name. Unset if object is anonymous |
class_name | Plain string | Object's class |
object_data | Plain string | Object data |
parameters_list | Plain string | The dependency list |
created_at | Time stamp | Object creation time |
Tables backward_links and direct_links:
Column | Type | Description |
dependant | 64- or 32-bit integer | Object key |
referent | 64- or 32-bit integer | Object key, the object |
The implementation tries to serialize data base transactions if the ODBC driver support it. In any case the manual-commit mode is used to provide atomic data base changes. The software was tested with:
It was not tested under Linux because APQ 2.1 was targeted to 3.x versions of MySQL.
Installation notes. The implementation is based on APQ 2.1. It can be found here, though hosting place might change soon. The APQ project is distributed under modified GNU Public License 2 and Ada Community Licenses. To compile the package Persistent.Handle.Factory you will need an APQ distribution. If you do not use Persistent.Handle.Factory, you need not compile it.
This chapter describes the internal packages used to ease implementation of a persistent storage backed by a data base.
Usually persistent storage is implemented on the basis of an external data base engine. In that data base persistent objects are represented by records or other data structures identified by keys. The packages Persistent.Data_Bank, Persistent.Data_Bank.Index and Persistent.Data_Bank.Indexed are provided for interfacing such data bases. The package Persistent.Data_Bank is the parent package providing basic types. The package Persistent.Data_Bank.Index defines a storage index object to be used at run-time by the storage object. The package Persistent.Data_Bank.Indexed provides a specialized abstract storage which implements the abstract storage operation used the interface defined in Persistent.Data_Bank. That is used to derive a concrete implementation of persistent storage object, that will override the remaining abstract operations.
The package Persistent.Data_Bank defines abstract interface of the storage objects which identify stored objects using a key. It derives the abstract base type Data_Bank_Object from Storage_Object:
type Data_Bank_Object is abstract
new Storage_Object with private;
The keys are provided by implementations by deriving from the abstract base type Persistent_Key:
type Persistent_Key is abstract
new Ada.Finalization.Controlled with null record;
type Persistent_Key_Ptr is access Persistent_Key'Class;
In order to support data base transactions a special access policy is imposed on Storage_Objects. The type Access_Mutex is used to represent transactions:
type Access_Mutex (Storage : access Data_Bank_Object'Class) is
abstract new Ada.Finalization.Limited_Controlled with private;
This type is used as the base for storage specific objects that represent atomic actions on storage, such as data base transactions.
Two concrete types are derived from it. Read_Mutex is used for
viewing storage content without modification. Write_Mutex is used for full access.
type Read_Mutex is new Access_Mutex with private;
type Write_Mutex is new Access_Mutex with private;
An operation that requires access to Storage_Object that might require data base communication should do it as follows:
declare
Transaction : Write_Mutex (DB'Access);
begin
-- Do something with DB
Commit (Transaction);
end;
When Commit is not called on Transaction, because of exception propagation for instance, then Roll_Back will be in the course of Transaction finalization.
Operations defined on mutexes:
This procedure is basically one call:
Commit (Mutex.Storage.all);
Commit can be called only once. Multiple commits cause Use_Error propagation. Any other exception indicates a data base error.
procedure Finalize (Mutex : in out Access_Mutex);
The destructor calls Roll_Back if no Commit was called before. This ensures data base consistency upon transaction errors.
Operations defined on keys. Normally an implementation of a persistent storage would provide a derived key type. That should override the following abstract operations:
function Image
( Storage : Data_Bank_Object'Class;
Key : Persistent_Key
) return String is abstract;
This function returns a string unambiguously identifying Key in Storage. Constraint_Error is propagated when Key cannot be used for Storage.
function Null_Key return Persistent_Key is abstract;
This function returns a value that serves as an illegal key which can never indicate an object.
function Value
( Storage : Data_Bank_Object;
Key : String
) return Persistent_Key'Class is abstract;
This function converts string to a key. Data_Error is propagated when Key does not identify a valid key for Storage. The implementation should not check for any objects existing under the key.
function "<" (Left, Right : Persistent_Key)
return Boolean is abstract;
function "=" (Left, Right : Persistent_Key)
return Boolean is abstract;
Persistent keys are comparable to provide ordered containers.
Operations defined to handle transactions. Data_Bank_Object declares abstract operations on persistent storage supporting transaction framework:
procedure Commit (Storage : in out Data_Bank_Object) is abstract;
This abstract procedure is called at the end of each transaction: an atomic modification of the persistent storage. There should be no difference between Roll_Back and Commit if the transaction was initiated by Seize_Read. Normally, Commit is never called directly, but only through Commit of a mutex object.
Data_Error | Data base error |
Use_Error | No transaction active |
procedure Roll_Back (Storage : in out Data_Bank_Object) is abstract;
This procedure is called when a transaction fails, due to an exception. It is always called from an exception handler which re-raises the exception. For this reason it is not recommended to raise any exceptions in Roll_Back. There is no difference between Roll_Back and Commit if the transaction was initiated by Seize_Read. For a transaction initiated by Seize_Write an implementation should discard any changes made.
procedure Seize_Read
( Storage : in out Data_Bank_Object
) is abstract;
This procedure is called to initiate a read-only transaction with Storage. Only one transaction can be active at a time. The transaction is finished by either a call to Commit or to Roll_Back. For a read-only transaction there should be no sufficient difference between Commit and Roll_Back.
Data_Error | Data base error |
Use_Error | A transaction is already active (optional) |
procedure Seize_Write
( Storage : in out Data_Bank_Object
) is abstract;
This procedure is called to initiate a read/write transaction with Storage. Only one transaction can be active at a time. The transaction is finished by either a call to Commit or to Roll_Back.
Data_Error | Data base error |
Use_Error | A transaction is already active (optional) |
Operations defined in terms of keys. It is recommended to check active transaction in implementations of abstract operations and to raise Use_Error. Though it is not mandatory. Carefully observe that object key is a class-wide parameter. An implementation would usually check if the key's specific type is one supported by the data base. If it is not then End_Error should be used to indicate an absent object, except when otherwise is explicitly specified.
The following operations should be implemented:
procedure Delete
( Storage : in out Data_Bank_Object;
Key : Persistent_Key'Class
) is abstract;
This procedure deletes an object by its key. An implementation may proceed from the assumption that all dependent objects are already deleted and no object refers to the deleted one. It can be called only within a transaction following a call Seize_Write.
Data_Error | Data base error |
Use_Error | No write transaction active (optional) |
function Find
( Storage : access Data_Bank_Object;
Name : Wide_String
) return Persistent_Key'Class is abstract;
This procedure is used to determine the object's key by object's name. It is allowed only within a transaction initiated either by Seize_Read or Seize_Write.
Data_Error | Data base error |
End_Error | No such object |
Use_Error | No transaction active (optional) |
function Get
( Storage : access Data_Bank_Object;
Key : Persistent_Key'Class
) return Deposit_Handle is abstract;
This procedure restores a persistent object by its key. An implementation shall check if the object for the specified key is already memory-resident. It is allowed to call only within a transaction initiated either by Seize_Read or Seize_Write.
Data_Error | Data base error |
End_Error | No such object |
Use_Error | No transaction active (optional) |
procedure Get_Data
( Storage : in out Data_Bank_Object;
Key : Persistent_Key'Class;
Class : out Unbounded_String;
Data : out Unbounded_String;
Parameters : out Unbounded_String
) is abstract;
Implementation returns the object's description stored in Storage under Key. The description is used to restore the object. The output parameters are the object's class and data as they were generated by Object.Archived.Store and internally used Parameters, which describe the dependency list of the object being restored. The procedure is allowed only within a transaction initiated either by Seize_Read or Seize_Write.
Data_Error | Data base error |
End_Error | No such object |
Use_Error | No transaction active (optional) |
function Get_Dependant
( Storage : access Data_Bank_Object;
Key : Persistent_Key'Class;
No : Positive
) return Persistent_Key'Class is abstract;
This function is used to enumerate objects having backward links to the object specified by Key. That are ones which have specified the object in the list of backward links (the parameter Backward_Links of Store and Update). All dependants are enumerated starting from 1. The parameter No specifies the number of a dependant to get. An implementation is allowed to use a cache, so the caller should not undertake any actions which may lead to updating the dependency list of the object. The function is allowed only within a transaction initiated either by Seize_Read or Seize_Write.
Data_Error | Data base error |
End_Error | No dependant found, end of list, no such object |
Use_Error | No transaction active (optional) |
function Get_Name
( Storage : access Data_Bank_Object;
Key : Persistent_Key'Class
) return Wide_String is abstract;
Implementation returns the object's name stored in Storage under Key. It is allowed only within a transaction initiated either by either by Seize_Read or Seize_Write.
Data_Error | Data base error |
End_Error | No such object |
Name_Error | Anonymous object |
Use_Error | No transaction active (optional) |
function Has_Dependants
( Storage : access Data_Bank_Object;
Key : Persistent_Key'Class
) return Boolean is abstract;
An anonymous object that has no dependants can be deleted, but not before its memory-resident counterpart disappears. When key does not specify any object, the result is false. This function is allowed only within a transaction initiated either by either by Seize_Read or Seize_Write.
Data_Error | Data base error |
Use_Error | No transaction active (optional) |
function Is_In
( Storage : access Data_Bank_Object;
Key : Persistent_Key'Class
) return Boolean is abstract;
This function checks whether Key specify an object persistent in Storage. It is allowed only within a transaction initiated either by either by Seize_Read or Seize_Write.
Data_Error | Data base error |
Use_Error | No transaction active (optional) |
procedure Put
( Storage : in out Data_Bank_Object;
Key : Persistent_Key'Class;
Object : Deposit'Class
) is abstract;
This procedure updates a persistent object by its key. Usually it calls Update for this purpose. It is within a transaction initiated by Seize_Write.
Data_Error | Data base error |
End_Error | Key does not identify an object |
Use_Error | No write transaction active (optional) |
procedure Rename
( Storage : in out Data_Bank_Object;
Key : Persistent_Key'Class;
Name : Wide_String
) is abstract;
This procedure changes the name of the object specified by Key to Name. When renamed object was anonymous before renaming it becomes a named one. I.e. it will not deleted when no more in use. An implementation can proceed from the assumption that the caller has already checked for illegal and conflicting names. This procedure is allowed only within a transaction initiated by Seize_Write.
Data_Error | Data base error |
End_Error | Key does not identify an object (optional) |
Name_Error | Name conflict, there is another object named so (optional) |
Use_Error | No write transaction active (optional) |
function Store
( Storage : access Data_Bank_Object;
Name : Wide_String;
Class : String;
Data : String;
Parameters : String;
Direct_Links : Deposit_Set;
Backward_Links : Deposit_Set
) return Persistent_Key'Class is abstract;
function Store
( Storage : access Data_Bank_Object;
Class : String;
Data : String;
Parameters : String;
Direct_Links : Deposit_Set;
Backward_Links : Deposit_Set
) return Persistent_Key'Class is abstract;
These functions are used to write a persistent object. They are called internally. The parameter Name specifies the name of the object in the storage. This should be an unique name. When not specified, the object is anonymous. The parameter Data contains a string unambiguously describing the object of the class specified by the parameter Class. It is normally obtained using Object.Archived.Store. Parameters is used to store additional information about links. The parameters Direct_Links and Backward_Links define the set of objects in Storage the object depends on. Objects mentioned in the set Backward_Links are those which can be deleted without deletion of the object itself. The union of Direct_Links and Backward_Links specifies only directly visible dependencies, it is not a closure. An implementation usually stores Class and Data under the name Name and corrects a persistent dependency table according to Direct_Links and Backward_Links. Note that initially written object is not referenced. Store is allowed only within a transaction by Seize_Write.
Data_Error | Data base error |
End_Error | No such object |
Use_Error | No write transaction active (optional) |
procedure Unname
( Storage : in out Data_Bank_Object;
Key : Persistent_Key'Class
) is abstract;
This procedure makes the object specified by Key anonymous. The object object should be automatically deleted when no more in use, but not before it has a memory-resident counterpart. An implementation can proceed from the assumption that the caller already checked for object existence. The procedure is allowed only within a transaction by Seize_Write.
Data_Error | Data base error |
End_Error | No such object (optional) |
Use_Error | No write transaction active (optional) |
procedure Update
( Storage : in out Data_Bank_Object;
Key : Persistent_Key'Class;
Class : String;
Data : String;
Parameters : String;
Direct_Links : Deposit_Set;
Backward_Links : Deposit_Set
) is abstract;
This procedure is used to update a modified persistent object. It is called internally. The parameter Data contains a string unambiguously describing the object of the class specified by the parameter Class. Parameters is used internally to store additional information about links. It is normally obtained using Object.Archived.Store. The parameters Direct_Links and Backward_Links are same as in Store. An implementation would normally update Class and Data in the object's record and correct persistent dependency table. Update is allowed only within a transaction by Seize_Write.
Data_Error | Data base error |
End_Error | No such object (optional) |
Use_Error | No write transaction active (optional) |
The child generic package Persistent.Data_Bank.Index implements an index of persistent objects:
generic
type Data_Bank is abstract new Data_Bank_Object with private;
type Key is new Persistent_Key with private;
package Persistent.Data_Bank.Index is
type Catalogue (Storage : access Data_Bank'Class) is
new Ada.Finalization.Limited_Controlled with private;
type Catalogue_Ptr is access all Catalogue;
...
Persistent objects having memory-resident counterparts are said to be bound. For each bound object the storage index of the Catalogue type contains a record. When an object is requested from the persistent storage it is first searched in the index. The index also contains information about object's keys and names. Additionally the index registers a notification object to catch bound objects destruction. Upon destruction of a bound object index checks if the object was anonymous and no more referenced in the persistent storage, if so the object is deleted from the storage. If the memory-resident object was modified it is synchronized with the storage.
The package has two generic parameters:
The child generic package Persistent.Data_Bank.Indexed implements persistent storage interface using the operations defined in Persistent.Data_Bank:
generic
type Data_Bank is abstract new Data_Bank_Object with private;
type Key is new Persistent_Key with private;
package Persistent.Data_Bank.Indexed is ...
The generic parameters:
The package provides the abstract type Indexed_Storage_Object which can be used as the base type for a concrete implementation of a data base interface:
type Indexed_Storage_Object is abstract new Data_Bank with private;
The derived type shall implement the following remaining operations:
Persistent objects that are not fully represented by their memory-resident counterparts require a reference to the storage they persist in. This is necessary at least to prevent persistent storage interface object from being prematurely destroyed. Further the operation Object.Archived.Restore does not contain a parameter referencing the storage. Special objects of the type Self_Reference defined in the child package Persistent.Data_Bank.Reference serve this purpose. An object may put a Self_Reference in its dependency list (see Object.Archived.Get_Referents) by calling the procedure Add from this package. If it does so then in its Restore it will find a Self_Reference again. The fields of that object denote the persistent storage and the object's key there.
type Self_Reference is new Deposit with record
Storage : Storage_Handle;
Key : Persistent_Key_Ptr;
end record;
The following operations are defined on Self_Reference:
procedure Add
( List : in out Deposit_Container'Class;
Storage : Storage_Handle;
Key : Persistent_Key'Class
);
This procedure adds to List a reference to Storage. Key is the Storage key of the object that requires a reference to Storage. The reference is placed at the list end. Other operations are implementations of the interface defined in Object.Archived.
The following sample code illustrates using Self_Reference objects. A user-defined persistent object is Proxy_Object. It contains a handle to the storage where it persists and implements some of its operations through communication to the storage. For example, it can be a large array of data stored there. When a piece of data is requested Proxy_Object routes the request to the storage and returns the result. Additionally Proxy_Object contains its key in the storage. Proxy_Object should call Add from its Get_Referents to add reference to the storage in its dependency list. Then upon restore it will find a Self_Reference object in the list passed to its Restore:
type Proxy_Object is new Deposit with record
Storage : Storage_Handle; -- The storage used
Key : Storage_Key; -- The storage key of the object
...
end record;
procedure Get_Referents
( Object : Proxy_Object;
List : in out Deposit_Container'Class
) is
begin
Add
( List,
Object.Storage,
Object.Key
);
... -- adding other dependencies if any
end Get_Referents;
procedure Restore
( Source : String;
Pointer : in out Integer;
Class : String;
List : Deposit_Container'Class;
Object : out Deposit_Ptr
) is
Object : Deposit_Ptr := new Proxy_Object;
Proxy : Proxy_Object renames Proxy_Object (Result.all);
Reference : Self_Reference'Class renames
Self_Reference'Class (Get (List, 1).all);
begin
Proxy.Storage := Reference.Storage;
Proxy.Key := Storage_Key (Reference.Key.all);
... -- restoring the rest of Proxy_Object as necessary
end Restore;
Note that only objects of non-limited type can be used in sets and maps. To have sets of limited objects use pointers or handles as elements. For sets of handles also see Object.Handle.Generic_Set. For maps of strings see Tables.
The package Generic_Set defines the type Set. An instance of the type is a set of items. One can add to and remove from items of the set. Items of the set can be accessed using the positive index. They are ordered, so the set implementation may use binary search. There is a null item, which is never included into the set and is used to mark free memory slots. The package is generic and has the following generic parameters:
generic
type Object_Type is private;
Null_Element : Object_Type;
Minimal_Size : Positive := 64;
Increment : Natural := 50;
with function "<" (Left, Right : Object_Type) return Boolean is <>;
with function "=" (Left, Right : Object_Type) return Boolean is <>;
package Generic_Set is
type Set is new Ada.Finalization.Controlled with private;
...
Here:
Sets can be assigned. Assignment makes no deep copy, which is delayed until the time moment when the original and a copy become different. Items in the set are ordered so that lesser items have lesser indices, when indexed. The first item in the set has index 1. The following operations are defined on Set:
procedure Add (Container : in out Set; Item : Object_Type);
procedure Add (Container : in out Set; Items : Set);
These procedures are used to add an item to a set or all items of one set to another. Nothing happens if the item is already in the set or is a Null_Element. Note that items are compared using the provided operations "<" and "=". It is possible that these operations treat different items as same. Only one item from such equivalence class may be in a set. To control which one will be inserted use Insert and Replace.
function Create return Set;
This function returns an empty set.
procedure Erase (Container : in out Set);
This procedure removes all items from the set.
function Find (Container : Set; Item : Object_Type)
return Integer;
This function is used to find an item in the set Container. The result is either a positive index of the found item or a negated index of the place where the item should be if it were in the set.
function Get (Container : Set; Index : Positive)This function is used to get an item of the set Container using a positive index. Constraint_Error is propagated if Index is wrong.
return Object_Type;
function Get_Size (Container : Set) return Natural;
This function returns the number of items in the set.
procedure Insert (Container : in out Set; Item : in out Object_Type);
This procedure inserts an item into a set. Nothing happens if the item is Null_Element. When Container already has an element equivalent to Item, then Item will not replace it. Instead of that the element from Container will be returned through Item. So upon completion Item always has the value of the element in Container.
function Is_Empty (Container : Set) return Boolean;
True is returned if Container is empty.
function Is_In (Container : Set; Item : Object_Type)
return Boolean;
True is returned if Item is in Container.
procedure Remove (Container : in out Set; Index : Positive);
procedure Remove (Container : in out Set; Item : Object_Type);
procedure Remove (Container : in out Set; Items : Set);
These procedures are used to remove items from the set Container. An item can be removed either by its index, or explicitly, or by specifying a set of items to be removed. If a particular item is not in the set, then nothing happens. Constraint_Error is propagated when item index is wrong.
procedure Replace (Container : in out Set; Item : Object_Type);These procedures are used to add to / replace in an item or all items of a set. Nothing an is a Null_Element. Any duplicated items are replaced by new ones. This operation has sense only if the equality operation defined on Object_Type does not distinguish some objects.
procedure Replace (Container : in out Set; Items : Set);
function "and" (Left, Right : Set) return Set;
function "or" (Left, Right : Set) return Set;
function "xor" (Left, Right : Set) return Set;
These functions are conventional set operations - intersection, union, difference. Difference is defined as a set which items are only in one of the sets Left and Right.
function "=" (Left, Right : Set) return Boolean;
True is returned if both sets contain same items.
The package Generic_Map defines the type Map which represents an associative array. One can add to and remove from items of the map. Each item has an unique key associated with it. In other word a map is a function which for a given key yields an item. Items of the map can be also accessed using the positive index. Items in the map are ordered according to their keys, so the map implementation may use binary search. Reference counting is used for the objects of the type Map, which means that assigning Map objects is relatively cheap. The package is generic and has the following generic parameters:
generic
type Key_Type is private;
type Object_Type is private;
Minimal_Size : Positive := 64;
Increment : Natural := 50;
with function "<" (Left, Right : Key_Type) return Boolean is <>;
with function "=" (Left, Right : Key_Type) return Boolean is <>;
package Generic_Map is
type Map is new Ada.Finalization.Controlled with private;
...
Here:
Both Key_Type and Object_Type can be controlled. The implementation warranties that when an item or key is no more used in the map it is erased by assigning it a value created by the default constructor (if any). This behavior ensures that items and keys removed from the map will be always finalized upon the operation. For example, when Object_Type is controlled, then Finalize will be called upon an item of Object_Type even if the item is not replaced but removed from a map. This happens through assigning some other object of Object_Type to the removed item. So when item is a Handle, then the reference count of an object it refers will be decreased as expected. On assignment no deep copy of a map is made. Deep copy is postponed till the time moment when the original and a copy become different. Items in the map are ordered according to their keys, so that items with lesser keys have lesser indices, when indexed. The first item in the map has index 1. The following operations are defined on the type Map:
procedure Add
( Container : in out Map;
Key : Key_Type;
Item : Object_Type
);
This procedure adds a new item (Item) to the map Container. Constraint_Error propagates if Container already contains an item with the key equal to Key.
procedure Add (Container : in out Map; Items : Map);
This procedure adds all items of Items to Container. If Container already has an item with the key equal to an item from Items, then that item from Items is ignored.
function Create return Map;
This function returns an empty map.
procedure Erase (Container : in out Map);
This procedure removes all items from Container.
function Find (Container : Map; Key : Key_Type) return Integer;
This function is used to find an item in the map Container. The result is either a positive index of the found item or a negated index of the place where the item should be if it were in the map.
function Get (Container : Map; Key : Key_Type) return Object_Type;
function Get (Container : Map; Index : Positive) return Object_Type;
These functions are used to get an item of the map Container using either a key or a positive index. Constraint_Error is propagated if Index is wrong, or there is no item with the key equal to Key. Note that item index may change when items are added or removed.
function Get_Key (Container : Map; Index : Positive) return Key_Type;This functions returns the key of an item in Container. Constraint_Error is propagated if Index is wrong.
function Get_Size (Container : Map) return Natural;
This function returns the number of items in the map.
function Is_Empty (Container : Map) return Boolean;
True is returned if Container is empty.
function Is_In (Container : Map; Key : Key_Type)
return Boolean;
True is returned if Container has an item for Key.
procedure Remove (Container : in out Map; Index : Positive);
procedure Remove (Container : in out Map; Item : Key_Type);
procedure Remove (Container : in out Map; Items : Set);
These procedures are used to remove items from the map Container. An item can be removed either by its index, or by its key, or by specifying a map of items to be removed. If a particular item is not in the map, then nothing happens. Constraint_Error is propagated when item index is wrong.
procedure ReplaceThese procedures are used to add or replace items. An item can be replaced by its index. It can be either added or replaced by its key. That is when Container does contain an item with the key equal to Key, then it is replaced by Item, otherwise Item is added under Key. The third variant adds or replaces all items from the map Items. Contraint_Error is propagated when Index is wrong.
( Container : in out Map;
Index : Positive;
Item : Object_Type
);
procedure Replace
( Container : in out Map;
Key : Key_Type;
Item : Object_Type
);
procedure Replace
( Container : in out Map;
Items : Map
);
The package Generic_Unbounded_Array defines the type Unbounded_Array. An instance of the type is a dynamically expanded vector of elements. The implementation keeps vector contiguous, so it might be very inefficient to put complex data structures into the array. In many cases it is better to put pointers to elements there. See also the package Generic_Unbounded_Ptr_Array which instantiates Generic_Unbounded_Array for this purpose. The type wraps the component Vector which is a pointer to an array of elements. One can use Vector to access array elements and query its present bounds, which are rather arbitrary. The unused elements of the array vector are padded using a distinguished null-element value The package Generic_Unbounded_Array is generic and has the following generic parameters:
generic
type Index_Type is (<>);
type Object_Type is private;
type Object_Array_Type is
array (Index_Type range <>) of Object_Type;
Null_Element : Object_Type;
Minimal_Size : Positive := 64;
Increment : Natural := 50;
package Generic_Unbounded_Array is ...
Here:
The type Unbounded_Array is declared as follows:
type Object_Array_Ptr is access Object_Array_Type;
type Unbounded_Array is
new Ada.Finalization.Limited_Controlled with
record
Vector : Object_Array_Ptr := null;
end record;
Array elements can be accessed through indexing the component Vector. Note that single what can be said about the length of the vector is that it is big enough to keep all elements put into the array. The unused elements in the vector are padded using the value Null_Element. The implementation is very straightforward. It does not implement any optimization of assignments, like the implementation of Generic_Set does. This choice was intentionally made to mimic arrays as close as possible. If reference counting is needed a wrapper type could be built around Unbounded_Array. The following operations are defined on Unbounded_Array:
procedure Erase (Container : in out Unbounded_Array);
This procedure removes all elements from Container making it empty.
procedure Finalize (Container : in out Unbounded_Array);
The destructor frees the memory allocated for the array vector.
function Get
( Container : Unbounded_Array;
Index : Index_Type
) return Object_Type;
This function is an equivalent to Container.Vector (Index). Constraint_Error is propagated if Index is out of vector range.
procedure PutThis procedure is used to put / replace an element in array using its index. The array vector is expanded as necessary. Unused elements are padded with Null_Element.
( Container : in out Unbounded_Array;
Index : Index_Type;
Element : Object_Type
);
The package Generic_Unbounded_Ptr_Array defines the type Unbounded_Ptr_Array. An instance of Unbounded_Ptr_Array is a dynamically expanded vector of pointers to elements. Upon destruction objects pointed by array elements are destroyed. Same happens when an element is replaced. The package has the following generic parameters:
generic
type Index_Type is (<>);
type Object_Type (<>) is limited private;
type Object_Ptr_Type is access Object_Type;
type Object_Ptr_Array_Type is
array (Index_Type range <>) of Object_Ptr_Type;
Minimal_Size : Positive := 64;
Increment : Natural := 50;
package Generic_Unbounded_Ptr_Array is ...
Here:
The type Unbounded_Ptr_Array is declared through an instantiation of the package Generic_Unbounded_Array. Array elements can be accessed through indexing the component Vector which are pointers to the elements. Note that single what can be said about the length of the vector is that it is big enough to keep all elements put into the array. The unused elements in the vector are padded using null. The following operations are defined on Unbounded_Ptr_Array:
procedure Erase (Container : in out Unbounded_Array);
This procedure removes all elements from Container making it empty.
procedure Finalize (Container : in out Unbounded_Ptr_Array);
The destructor frees the memory allocated for the array vector and all elements it refers to.
function Get
( Container : Unbounded_Ptr_Array;
Index : Index_Type
) return Object_Ptr_Type;
This function is an equivalent to Container.Vector (Index). Constraint_Error is propagated if Index is out of vector range.
procedure Put
( Container : in out Unbounded_Ptr_Array;
Index : Index_Type;
Element : Object_Ptr_Type
);
This procedure is used to put in / replace an array element using its index. If the replaced array element is not null then the object it points to is destroyed. Note that the object pointed by Element is not copied. Thus it is not a responsibility of the caller to destroy the object. It will be automatically destroyed upon array destruction or replacing the element in the array. The array vector is expanded as necessary. Unused elements are padded with null.
The implementation of Generic_Segmented_Stack provides an illustration of use Generic_Unbounded_Ptr_Array. A segmented stack consists of segments of same size. The list of segments is viewed as an abstract array used to instantiate Generic_Stack. The array index is split into the high-order index indicating a segment and the low-order one specifying the element in the segment. The list of segments is implemented as an Unbounded_Ptr_Array indexed by the high-order index. Observe that once allocated a segment gets referenced in Unbounded_Ptr_Array, so there is no need to explicitly deallocate segments, Unbounded_Ptr_Array willl do it. So the implementation of Generic_Segmented_Stack can be as straightforward as:
Stack, also LIFO Stack (Last in First Out), is a container in which the only accessible element is the last one.
The package Generic_Stack defines the type Stack which provides a generic stack. The stack is built upon an array type which might be a Unbounded_Array, Unbounded_Ptr_Array, array of handles or some other type (like Unbounded_String). The package has the following generic parameters:
generic
type Index_Type is (<>);
type Object_Type (<>) is limited private;
type Array_Type is limited private;
Null_Element : Object_Type;
with function Get
( Container : Array_Type;
Index : Index_Type
) return Object_Type is <>;
with procedure Put
( Container : in out Array_Type;
Index : Index_Type;
Element : Object_Type
) is <>;
package Generic_Stack is
type Stack is new Ada.Finalization.Limited_Controlled with private;
Here the formal parameters are:
The following operations are defined on Stack:
procedure Erase (Container : in out Stack);
This procedure pops all items from the stack Container.
function Get (Container : Stack; Index : Index_Type)
return Object_Type;
This function returns the stack item with the index specified by the parameter Index. The item item on the stack top has the index returned by Mark, so that
Top (Container) = Get (Container, Mark (Container))
Constraint_Error is propagated if Index points out of stack.
function Is_Empty (Container : Stack) return Boolean;
This function returns true if Container is empty.
function Mark (Container : Stack) return Index_Type;
The value returned by this function can be used in the procedure Release to pop all the items pushed in between. When the type Index_Type is an integer type, then the difference between two values returned by Mark is the number of stack items.
procedure Pop (Container : in out Stack; Count : Natural := 1);
This procedure pops Count items from the top of Container. If the stack does not contain enough items, it is emptied.
procedure Push (Container : in out Stack; Item : Object_Type);
This procedure pushes Item onto Container.
procedure Put
( Container : in out Stack;
Index : Index_Type;
Element : Object_Type
);
This procedure replaces the stack item specified by the parameter Index with Element. The index is same as described in Get. Constraint_Error is propagated if Index points out of stack.
procedure Release (Container : in out Stack; Mark : Index_Type);
This procedure is used to pop all items pushed since a call to the function Mark which result was the value of the parameter Mark. Nothing happens if the stack was already popped below Mark.
function Top (Container : Stack) return Object_Type;
This function returns the topmost stack item. Constraint_Error is propagated if Container is empty.
The package Generic_Segmented_Stack instantiates Generic_Stack using a list of segments of same size. The number of segments is unlimited. New segments are allocated as necessary. The package is generic:
generic
type Index_Type is (<>);
type Object_Type is private;
Null_Element : Object_Type;
Segment_Size : Positive := 128;
Minimal_Size : Positive := 64;
Increment : Natural := 50;
package Generic_Segmented_Stack is
...
package Segmented_Stack is new Generic_Stack ...
Here the formal parameters are:
The package can be used as follows:
package Float_Stack is
new Generic_Segmented_Stack (Integer, Float, 0.0);
use Float_Stack.Segmented_Stack;
...
LIFO : Stack;
User-defined storage pools can be used for objects which creation / destruction policy allows a more efficient memory management strategy than the standard heap offers, but yet not enough strict to allocate them on the system stack.
The package Stack_Storage provides an implementation of a user-defined pool organized as a stack. The package the type Pool derived form System.Storage_Pools.Root_Storage_Pool:
type Pool
( Initial_Size : Storage_Count;
Items_Number : Positive
) is new Root_Storage_Pool with private;
A stack pool consists of contiguous segments allocated dynamically as necessary. The discriminants control the stack segments allocation policy. Initial_Size determines the initial default size of a newly allocated segment. If this size is less than the size of the object being allocated the default size is set to the object size multiplied to Items_Number. This value will then used as the default size for all further segments. The segments allocated earlier having lesser size will be freed when possible. Otherwise, they remain allocated until pool destruction. Upon stack pool destruction, all the stack segments are deallocated. No checks made whether some objects remain allocated on the stack. Note also that no checks made whether objects allocation / deallocation order is indeed LIFO (last in, first out). The stack pool is not task-safe. If that is required it has to be protected from a concurrent access.
The generic child package Stack_Storage.Mark_And_Release provides an implementation of a mark and release pool for limited controlled objects:
generic
Stack : in out Pool'Class;
package Stack_Storage.Mark_And_Release is ...
The generic parameter Stack is a descendant of Pool, a stack pool. The package defines:
type Pool_Object is
new Ada.Finalization.Limited_Controlled with private;
This is the base type for all objects to be allocated on Stack. The pool objects should be allocated only in the pool (using an allocator new). If they are destroyed explicitly using Unchecked_Deallocation, then it should happen LIFO and never under the last pool mark. The type Pool_Object has the following operations:
procedure Finalize (Object : in out Pool_Object);
The destructor has to be called by all descendants of Pool_Object. Storage_Error is propagated if Object is not the last allocated object in the pool.
procedure Initialize (Object : in out Pool_Object);The constructor has to be called by all descendants.
type Pool_Object_Ptr is access Pool_Object'Class;
for Pool_Object_Ptr'Storage_Pool use Stack;
This is the access type, which can be used as the target for an allocator of a Pool_Object descendant. If other access type used as the target, then it has to be specific to the pool Stack.
type Pool_Mark is
new Ada.Finalization.Limited_Controlled with private;
Objects of Pool_Mark are used as snap-shots of the pool state. When such a pool mark object is created it remembers the pool state. Upon its destruction it finalizes all the objects allocated in the pool since its construction and reclaims the storage occupied by the objects. If some pool objects have to be destroyed explicitly, then that shall be ones created after the last pool mark creation only. The following operations are defined on Pool_Mark:
procedure Finalize (Snap : in out Pool_Mark);
The destructor removes all objects remaining in the pool since construction of Snap. Storage_Error is propagated on object finalization errors.
procedure Initialize (Snap : in out Pool_Mark);
The constructor remembers the pool state.
The following short code sample illustrates use of mark and release pool:
declare
Snap : Pool_Mark; -- Mark the pool state
Ptr : Pool_Object_Ptr;
begin
...
Ptr := new Derived_Pool_Object; -- Allocate
...
Ptr := new Another_Derived_Pool_Object; -- Allocate
...
end; -- Release all allocated objects
Parsers can be used for syntax analysis of infix expressions, i.e. ones containing infix (dyadic), prefix and postfix operators, brackets, function calls, array indices etc. The approach presented does not require any grammar put down to generate scanner and analyzer. Nor any code generation steps are required. An object-oriented approach is used instead. The lexical procedures are dispatching, so that implementations may be provided through overriding them. Parsers can be used both for immediate one-pass code interpretation and for parsing tree building. Parser automatically detects the expression end allowing its easy integration. Operator precedence is expressed in a native way by setting priorities controlling association with the operands. Associations with the left and right side operands are controlled independently. Commutative operators and their inverses can be optimized when necessary. Especial attention is paid to error handling allowing generating very precise error messages and source code references. Samples from a small console calculator to a complete parsing tree generator for Ada 95 expressions illustrate examples of use.
The parsing method used is an extended variant of infix to postfix notation conversion algorithm. I do not know who was its author. Already in 1975 T. Pratt in Programming Languages, design and implementation mentioned it as well known. The algorithm makes possible parsing and interpreting infix expressions in one pass without returns. The following figure drafts out the idea and its implementation.
Quick reference:
In this paragraph I present an implementation of small primitive floating point calculator. The calculator supports operations +, -, *, /, ** brackets () and unary operators +, -, abs.
File calculator.ads:with
Parsers.String_Source; use
Parsers.String_Source; with Parsers.Generic_Lexer.Blanks; with Parsers.Generic_Token.Segmented_Lexer; with Tables.Names; package Calculator is -- -- Calculate -- A primitive floating-point calculator -- -- Formula - To be evaluated -- -- Returns : -- -- The result of Formula -- -- Exceptions : -- -- Syntax_Error - Any syntax error -- Numeric_Error - Any numeric error -- function Calculate (Formula : String) return Float; private -- -- Operations -- All the operations supported -- type Operations is ( Add, Sub, Mul, Div, Pow, -- Infix operators Abs_Value, Plus, Minus, -- Prefix operators Left_Bracket, Right_Bracket -- Brackets ); -- -- "and" -- Checks operation associations, always True (Ok) -- function "and" (Left, Right : Operations) return Boolean; -- -- Is_Commutative -- No commutative operations, always False -- function Is_Commutative (Left, Right : Operations) return Boolean; -- -- Is_Inverse -- No commutative operations, always False -- function Is_Inverse (Operation : Operations) return Boolean; -- -- Group_Inverse -- No commutative operations, never called -- function Group_Inverse (Operation : Operations) return Operations; -- -- Priorities -- The levels of association -- type Priorities is mod 10; -- -- Tokens -- The lexical tokens -- package Tokens is new Parsers.Generic_Token ( Operation_Type => Operations, Argument_Type => Float, Priority_Type => Priorities, Code => Code ); use Tokens; -- -- Check_Spelling -- Of a name, no checks -- procedure Check_Spelling (Name : String); -- -- Check_Matched -- Check if no broken keyword matched -- function Check_Matched (Source : String; Pointer : Integer) return Boolean; -- -- Token_Tables -- Case-insensitive tables of tokens -- package Token_Tables is new Tokens.Vocabulary.Names; -- -- The tables of prefix, infix and postfix operations -- Prefixes : aliased Token_Tables.Dictionary; Infixes : aliased Token_Tables.Dictionary; Postfixes : aliased Token_Tables.Dictionary; -- -- Lexers -- Table driven lexers -- package Lexers is new Tokens.Segmented_Lexer; -- -- Blank_Skipping_Lexers -- Ones that skip blanks -- package Blank_Skipping_Lexers is new Lexers.Token_Lexer.Implementation.Blanks (Lexers.Lexer); -- -- Expression -- The lexer using our tables -- type Expression is new Blank_Skipping_Lexers.Lexer ( Prefixes => Prefixes'Access, Infixes => Infixes'Access, Postfixes => Postfixes'Access ) with null record; -- -- Call -- Evaluates an operator -- function Call ( Context : access Expression; Operation : Tokens.Operation_Token; List : Tokens.Arguments.Frame ) return Tokens.Argument_Token; -- -- Enclose -- Evaluates an expression in brackets -- function Enclose ( Context : access Expression; Left : Tokens.Operation_Token; Right : Tokens.Operation_Token; List : Tokens.Arguments.Frame ) return Tokens.Argument_Token; -- -- Get_Operand -- Recognizes an operand (float number) -- procedure Get_Operand ( Context : in out Expression; Code : in out Source; Argument : out Tokens.Argument_Token; Got_It : out Boolean ); end Calculator; |
The package Calculator defines the function Calculate that takes a string argument and returns the result of the expression in the string. In the private part of the package, first the set of supported operations is defined, that is the enumeration type Operations. Then the following functions are defined on Operations to be used in generic instantiations:
Next the package defines the type Priority used for operation association levels. That can be any type with "<" order defined. The types Operations and Priority are used to instantiate the package Parsers.Generic_Token. The instance Tokens provides base types describing expression lexical tokens. That is the table type used to keep the legal names of the operations defined by the type Operations. The tables obtained are case sensitive. It is not exactly what is needed, because the expression should be case-insensitive. For this reason, the child table package Tables.Names is instantiated. To do this first, there should be defined two additional subroutines:
Tables.Names is instantiated as:
package Token_Tables is new Tokens.Vocabulary.Names;
Next three tables from obtained Token_Tables are declared. They are:
The final step is to create table-driven lexers using the tables we have. For this the package Parsers.Generic_Token.Segmented_Lexer is instantiated under the name Lexers. The instance Lexers has the type Lexer which can be used to parse expressions. This type is abstract because it has some abstract operations to implement. The first operation is Get_Blank used to skip blanks in the expression. The package Parsers.Generic_Lexer.Blanks provides an implementation that skips spaces, tabs etc. To use it Parsers.Generic_Lexer.Blanks is instantiated as Blank_Skipping_Lexers. The obtained type Lexer is then extended to set the type discriminants to the corresponding tables. The resulting type Expression is still abstract, but has only three things to define:
The implementation of the package is straightforward:
File calculator.adb:with
Ada.Characters.Handling; use
Ada.Characters.Handling; with Ada.Exceptions; use Ada.Exceptions; with Ada.IO_Exceptions; use Ada.IO_Exceptions; with Strings_Edit; use Strings_Edit; with Strings_Edit.Floats; use Strings_Edit.Floats; with Ada.Numerics.Elementary_Functions; use Ada.Numerics.Elementary_Functions; package body Calculator is function "and" (Left, Right : Operations) return Boolean is begin return True; end "and"; function Is_Commutative (Left, Right : Operations) return Boolean is begin return False; end Is_Commutative; function Is_Inverse (Operation : Operations) return Boolean is begin return False; end Is_Inverse; function Group_Inverse (Operation : Operations) return Operations is begin return Minus; end Group_Inverse; procedure Check_Spelling (Name : String) is begin null; end Check_Spelling; function Check_Matched (Source : String; Pointer : Integer) return Boolean is begin return ( not Is_Alphanumeric (Source (Pointer)) or else not Is_Alphanumeric (Source (Pointer - 1)) ); end Check_Matched; |
The function Check_Matched receives the string being parsed and the index of the first character following the matched name (lexeme). It checks that no broken names be matched.
File calculator.adb (continuation): function
Call ( Context : access Expression; Operation : Tokens.Operation_Token; List : Tokens.Arguments.Frame ) return Tokens.Argument_Token is Result : Float; begin case Operation.Operation is when Abs_Value => Result := abs List (List'First).Value; when Add => Result := List (List'First).Value + List (List'Last).Value; when Sub => Result := List (List'First).Value - List (List'Last).Value; when Mul => Result := List (List'First).Value * List (List'Last).Value; when Div => Result := List (List'First).Value / List (List'Last).Value; when Pow => Result := exp (log (List (List'First).Value) * List (List'Last).Value); when Plus => Result := List (List'First).Value; when Minus => Result := -List (List'First).Value; when others => raise Program_Error; end case; if Result'Valid then return (Result, Operation.Location & Link (List)); else Raise_Exception ( Numeric_Error'Identity, ( "Numeric error in " & Operations'Image (Operation.Operation) & " at " & Image (Operation.Location) ) ); end if; exception when Program_Error => raise; when others => Raise_Exception ( Constraint_Error'Identity, ( "Numeric error in " & Operations'Image (Operation.Operation) & " at " & Image (Operation.Location) ) ); end Call; |
Call implements operators. The parameter Operation identifies what for operator. The parameter List contains the operands. Each operand has value (the Value field) and location in the source (the Location field). When evaluated operator also returns a value and location. The new location is evaluated from the locations of the operands (function Link) and the operator.
File calculator.adb (continuation): function
Enclose ( Context : access Expression; Left : Tokens.Operation_Token; Right : Tokens.Operation_Token; List : Tokens.Arguments.Frame ) return Tokens.Argument_Token is begin return ( List (List'First).Value, Left.Location & Right.Location ); end Enclose; |
Enclose implements brackets. Order brackets () just return the operand.
File calculator.adb (continuation): procedure
Get_Operand ( Context : in out Expression; Code : in out Source; Argument : out Tokens.Argument_Token; Got_It : out Boolean ) is Line : String renames Get_Line (Code); Pointer : Integer := Get_Pointer (Code); Value : Float; begin Get (Line, Pointer, Value); Set_Pointer (Code, Pointer); Argument := (Value, Link (Code)); Got_It := True; exception when End_Error => Got_It := False; when Constraint_Error => Set_Pointer (Code, Pointer); Raise_Exception ( Numeric_Error'Identity, "Too large number at " & Image (Link (Code)) ); when Data_Error => Set_Pointer (Code, Pointer); Raise_Exception ( Parsers.Syntax_Error'Identity, "Wrong number at " & Image (Link (Code)) ); end Get_Operand; |
The procedure Get_Operand gets the operand in the source. It uses Get_Line to access the current source line. Get_Pointer returns where it should start. The procedure Get from Strings_Edit.Floats is used to get a floating-point number. The Set_Pointer advances the source cursor to the position next to the number. The function Link is used to get the location of the number matched.
File calculator.adb (continuation): Reckoner : Expression; function Calculate (Formula : String) return Float is Copy : aliased String := Formula; Code : Source (Copy'Access); Result : Tokens.Argument_Token; begin Lexers.Parse (Reckoner, Code, Result); if Get_Pointer (Code) <= Copy'Last then Raise_Exception ( Parsers.Syntax_Error'Identity, ( "Unrecognized '" & Copy (Get_Pointer (Code)..Copy'Last) & "'" ) ); end if; return Result.Value; end Calculate; |
The procedure Calculate implements the calculator. It just calls Parse to interpret Formula and the checks that the whole string was matched.
File calculator.adb (continuation):begin Add_Operator (Prefixes, "abs", Abs_Value, 8, 7); Add_Operator (Prefixes, "+", Plus, 8, 7); Add_Operator (Prefixes, "-", Minus, 8, 7); Add_Bracket (Prefixes, "(", Left_Bracket); Add_Operator (Infixes, "+", Add, 1, 2); Add_Operator (Infixes, "-", Sub, 1, 3); Add_Operator (Infixes, "*", Mul, 3, 4); Add_Operator (Infixes, "/", Div, 3, 4); Add_Operator (Infixes, "**", Pow, 9, 5); Add_Bracket (Postfixes, ")", Right_Bracket); end Calculator; |
Finally upon package elaboration the tables have to be filled in. Add_Operator is used to add an operator. The operator priorities are chosen to satisfy usual association rules. Add_Bracket is used to add brackets.
A program using the calculator may look as follows:
File console_calculator.adb:with Ada.Exceptions;
use Ada.Exceptions; with Ada.Text_IO; use Ada.Text_IO; with Strings_Edit.Floats; use Strings_Edit.Floats; with Calculator; use Calculator; with Parsers; procedure Console_Calculator is Text : String (1..120); Last : Integer; begin Put_Line ("Enter an expression to calculate and hit <enter>"); Put_Line ("The operations supported are +, -, /, *, **, abs, ()"); Put_Line (" (to exit enter an empty string)"); loop Put (">"); Get_Line (Text, Last); exit when Last < Text'First; begin Put_Line ("=" & Image (Calculate (Text (1..Last)))); exception when Error : Numeric_Error | Parsers.Syntax_Error => Put_Line (Exception_Message (Error)); end; end loop; exception when Error : others => Put ("Error :"); Put_Line (Exception_Information (Error)); end Console_Calculator; |
An expression is a sequence of symbols involving operations applied to operands. In programming languages expression is a formula used to compute a value. In most general way any expression has the following syntax:
<expression> ::= <prefix> <operand> <postfix> [ <infix> <expression> ]
Here <prefix> denotes any list of prefix operations, <postfix> does any list of postfix operations, <infix> is always exactly one infix operation.
For example in the following C++ expression:
! f ( 3 + x )
Operands here are f, 3 and x. Prefix operation is ! (logical not). Postfix operations are ++ (post-increment) and right bracket ). Infix operations are the operator + and the left bracket (.
The above syntax defines three kinds of operations depending of the context they may appear. We will use different colors to highlight the operation context.
The operations are subdivided into operators, delimiters like brackets, commas and ligatures and reserved keywords used as modifiers.
Foo (Left=>X, Right=>Y);
Argument sublists are quite common in mathematical notation. For example a hypergeometric function:
F (x1, x2, x3; y1, y2, y3 | z)
Here ; and | are semicolons separating sublists of the argument list. If ; has higher priority than | the above means:
() (| |) (; ;| F x1 x2 x3 y1 y2 y3 z
Argument lists and sublists are always bound by two operations, the left and the right one. There are three kinds of semicolons:
(A, B with C, D with E)
() with) with) A B C D E
The modifiers can be used to stop expression parsing at reserved keywords. Thus in Ada the same then when does not follow and, manifests the expression end in an if-statement.
Association of the operators with the operands is usually controlled by the precedence level (the operator priority) and special rules for the case when the priorities are same. Here I propose a simpler and more general model. All operator have two priorities to control association with the operands on either side. So the left priority controls left-side association. Both unary and binary operators have the priorities. Binary infix operators normally have left and right priorities near to each other. To have left to right operand association the left priority should be slightly lower than the right one. The following example illustrates the process of operand association for A+B*C+D+E:
A + B * C + D + E = + A 1+2 B 3*4 C 1+2 D 1+2 E = A 1+2 B→*4 C 1+2 D 1+2 E = + A 1+2 B→*←C 1+2 D 1+2 E = A→+←(B*C) 1+2 D 1+2 E = + (A+(B*C))→+←D 1+2 E = ((A+(B*C))+D)→+←E = * ((A+(B*C))+D)+E A B C D E
Normally, the left priority of a prefix operator is higher than the right one and both are higher than the priorities of the infix operators so:
A - ++ -- B + C = + A 1-2 10++9 10--9 B 1+2 C = A 1-2→++9→--←B 1+2 C = - A 1-2→++←(--B) 1+2 C = A→-←(++(--B)) 1+2 C = ++ (A-(++(--B)))→+←C = -- (A-(++(--B)))+C = A B C
There might be exceptions from this rule as in the case of the exponentiation operator, where it is useful to have the left priority of ** higher than the right priority of the unary minus and the right priority of ** lower than left priority of the unary minus so, that -A**-B become:
- A ** - B = - 8-7 A 9**5 8-7 B = ** 8-7 A→**5→-→B = 8-7 A→**←(-B)= - -←(A**(-B))= -(A**(-B)) A B
The priorities of the postfix operators should be selected so that the left priority be very high, but lower than the right priorities of the prefix operators. The right priority should be slightly lower than the left one, but higher than the right priorities of the infix operators. Under these conditions:
A - ++ B ++ -- - C = - A 1-2 10++9 B 7++8 7--8 1-2 C = A 1-2→++←B 7++8 7--8 1-2 C = - A 1-2 (++B)→++←7--8 1-2 C = A 1-2 ((++B)++)→--←1-2 C = -- A→-((++B)++)--) 1-2 C = ++ (A-((++B)++)--))→-←C = ++ (A-((++B)++)--))-C A B C
The order of evaluation of the unary operations can be changed by setting appropriate priorities. In extreme cases it could involve infix operators as well:
A * ? B @ * C = @ A 3*4 1?2 B 2@1 3*4 C = * A 3*4←1?2 B 2@1 3*4 C = ?2 A→*←B 2@1 3*4 C = ? ?←(A*B) 2@1 3*4 C = * (?(A*B)) 2@1→3*4 C = (?(A*B))→*←C 2@ = A B C ((?(A*B))*C)→@ = ((?(A*B))*C)@
Association of a left index or function call bracket with the operand on the left is controlled by the left priority of the bracket. This priority is usually high because otherwise:
A ** B (C, D + E) = () A 9**5 B 4( C, D 1+2 E) = A→**←B 4( C, D 1+2 E) = ** + (A**B)→( C, D→+←E) = ()(A**B, C, D+E) A B C D E
Here "()" denotes indexing or function call. Normally most of infix operators have lower priorities, with exception of component extraction which usually has a higher priority. Left index brackets have no right priority.
The aggregate, order left brackets, commas and ligatures have no association priorities. The following table summarizes the rules of choosing the operation priorities:
Operation | Left | Right | Comment | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Unary prefix operators | High | High, but slightly lower than the left one | Right to left evaluation order. Normally unary operators
have higher priorities than binary operators. However, in Ada we find that:
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Unary postfix operators | High | Higher than the left one | Usually both priorities are lower than ones of the prefix operators, so that prefix ones would be evaluated first, and the postfix ones next and left to right | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Normal infix operators | Moderate | Slightly lower than the left one | Left to right evaluation order. Operators like component extraction A.B should have both priorities very high. | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Assignment operator | High | Low | This ensures that
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Left index brackets | Very high | Array indices and function usually have higher priority
than most of infix operations, but not all of them::
|
Another case of association is represented by sublist semicolons. The semicolons also have an association priority with the arguments in the list. Although semicolon association priorities do not interact with the operators' ones:
(A + B : C, D | E : F, G | H)
() (| || |) (: :| |: :| + A B C D E F G H
In this example the colon separator has higher priority than one of bar separator.
Sometimes operations cannot be arbitrarily associated with each other. There are three cases where operation compatibility can be checked:
A and B or C
is illegal. Here the infix operators "and" and "or" are incompatible. Order brackets should be used to make it legal. For instance:
A and (B or C)
Unary operators also can be checked. In Ada both
+ - A and A**+2
are illegal (see Ada Reference Manual 4.4). When association checks are used for unary operators it is important to define association incompatibility relation transitive. I.e. if an operator x cannot be associated with operator y, but can be with z, then y cannot be associated with z. Otherwise some association error may remain undetected.
( A + B ]
The right square bracket is incompatible with the left round bracket.
A commutative operation is one which result does not depend on the argument order. For example, numeric addition is commutative because a+b = b+a. When the result does not depend on the operands order, an expression can be optimized by choosing a preferable order among many possible. The preferable order, could be one evaluating the constants and invariants first. For example: 1+a+2 = (1+2)+a = 3+a. Optimization may also take advantage of an inverse operation of a commutative group: 1+a-4 = 1+a+(-4) = (1+(-4))+a = -3+a. Here addition is the operation of a commutative group. Subtraction is the inverse operation. Unary minus is defined as 0-x, where 0 is the group's zero element. To support optimizations of this kind the commutative operations and their inverses can be parsed so that multiple appearances of binary operators will be replaced by an equivalent multiple operand operation. For example:
A + B - C + D + E
can result in:
+ - A B C D E
Similarly:
A - B - C + D + E
can be parsed as:
- - - A B C D E
Note that the original order can be always restored when the inverse unary operation is prevented from being specified explicitly. For this one can have two different unary minus operations one for explicit use and another used implicitly as the group inverse. Therefore if semantic analysis of the operands involved shows that they are not commutative, then optimizations could be omitted.
The following table summarizes the most frequently used commutative groups:
Operators | Group's operation | Its inverse | Inverse unary operation |
+, - | addition | subtraction | Unary minus: 0-x |
*, / | multiplication | division | 1/x |
The operations that are commutative, but usually have no inverse: logical and bitwise and, or, xor; numerical min, max. Alternatives separator | in Ada is also commutative.
Commutative optimization can be also useful for non-commutative operations. Often it makes sense to make the component selector . commutative to simplify parsing tree.
The package Parsers is the parent package of all others. It defines:
type Token_Class is
( Operator,
Bracket,
Comma,
Ligature,
Index,
Sublist_Close,
Sublist_Separator,
Sublist_Open,
Postmodifier,
Premodifier
);
This type enumerates the classes of operations. It also defines the subtypes of the sublist separators (semicolons) and modifiers:
subtype Semicolon_Class is Token_Class
range Sublist_Close..Sublist_Open;
subtype Modifier_Class is Token_Class
range Postmodifier..Premodifier;
Further the package defines the exceptions:
Syntax_Error : exception;
The exception Syntax_Error is used by lexers. Usually it has information attached containing the error description and location.
Association_Error : exception;
Missing_Right_Bracket : exception;
Unexpected_Comma : exception;
Unexpected_Right_Bracket : exception;
Wrong_Comma_Type : exception;
Wrong_Right_Bracket_Type : exception;
These exceptions are used when dealing with operation stacks. The are low-level ones that normally never not propagate out of a lexer.
The parser can scan different kinds of sources from plain strings to text files. The generic package Parsers.Generic_Source specifies the abstract interface of a source:
generic
type Source_Type (<>) is limited private;
type Location_Type is private;
with function Get_Line (Code : Source_Type)
return String is <>;
with function Get_Pointer (Code : Source_Type)
return Integer is <>;
with function Get_Backup_Pointer (Code : Source_Type)
return Integer is <>;
with function Image (Link : Location_Type)
return String is <>;
with function Link (Code : Source_Type)
return Location_Type is <>;
with procedure Next_Line (Code : in out Source_Type) is <>;
with procedure Reset_Pointer (Code : in out Source_Type) is <>;
with procedure Set_Pointer
( Code : in out Source_Type;
Pointer : Integer
) is <>;
with function "&" (Left, Right : Location_Type)
return Location_Type is <>;
package Parsers.Generic_Source is
end Parsers.Generic_Source;
The package does not provide any operations of its own. It only defines the interface of a source. Here
type Source_Type (<>) is limited private;
is the type of a source. An implementation should maintain two source cursors (pointers). As the parser consumes the source text it advances the cursors. The source slice between two cursors specifies the last token recognized by the parser. It may return back to the token beginning. However if the source contains several lines or records, then cursors always point to same line, so a return may never require the previous line:
type Location_Type is private;
Objects of this type are used to identify a contiguous slice of the source. This can be any part of the source, if multiple lines are supported, then Location_Type should allow to specify several source lines.
function End_Of (Code : Source_Type) return Boolean;
This function returns true at the source end.
This function gets the current source code line. It remains valid until the first call to Next_Line. End_Error is propagated when end source was reached either because the source is empty or because of a call to Next_Line before.
This function gets the current cursor. The result is an index in the current line which would be returned by Get_Line. It is in the range Line'First..Line'Last+1 provided that Line is the value returned by Get_Line. The character pointed by Get_Pointer is the first one to parse. The characters before are the recognized ones. At the source end, when Get_Line would raise End_Error, 1 is the result.
function Get_Backup_Pointer (Code : Source_Type)
return Integer;
This function returns the saved cursor. It is one to which Restore_Pointer would return. At the source end, when Get_Line would raise End_Error, 1 is the result. The slice of the current line starting from the result of Get_Backup_Pointer and ending in the character before one pointed by Get_Pointer is usually the last recognized token.
This procedure advances to the next source line. After a successful completion Get_Line can be used to access the newly read source line. Both cursors are set to Get_Line'First. So when the line is not empty Get_Pointer will return the index of the first character in the new source line. Data_Error is propagated on I/O errors. End_Error is propagated when the source end is reached.
This procedure moves the second cursor back to the first cursor, so that Get_Pointer would return the value of Get_Backup_Pointer. The depth of the "unget" need not to be deeper than 1. Consequent calls to Reset_Pointer may have no effect. It is also not required to implement return to the previous line.
procedure Set_Pointer
( Code : in out Source_Type;
Pointer : Integer
);
This procedure is used to move the cursors forward. The parameter Pointer is the new position of the second cursor, it should be in the range between the position returned by Get_Pointer and the position following the last character of the current line, i.e. Get_Line (Code)'Last + 1. At the source end when Get_Line would raise End_Error, the only valid value to set is 1. Otherwise Layout_Error is propagated. The first cursor is moved to the old position of the second one. The following small example illustrates an implementation of a routine to skip spaces in the source line:
procedure Skip (Code :
in out Source_Type) is Line : String renames Get_Line (Code); Pointer : Integer := Get_Pointer (Code); begin while Pointer <= Line'Last and then Line (Pointer) = ' ' loop Pointer := Pointer + 1; end loop; Set_Pointer (Code, Pointer); end Skip; |
This function gets the source code location between two cursors. The second cursor is one returned by Get_Pointer. The first cursor is the previous value of the second one returned by Get_Backup_Pointer. The slice in between is usually the last recognized lexical token. It includes the character pointed by the first cursor, and does not one pointed by the second one. Empty slices are allowed, so Link should never fail even at the end of a source. Should Link (Code) called immediately after a call to Skip above, it would return a location identifying the blank slice matched by Skip in the source code line.
function Image (Link : Location_Type) return String;
This function returns a text description of a location. The result is a string;
function "&" (Left, Right : Location_Type)
return Location_Type;
This function is used to combine two, usually adjacent a source code locations. The result is a consecutive code fragment containing positions from both Left and Right locations. For example if Left and Right are locations of "(" and ")" then the result is everything in the brackets including the brackets.
The child package Parsers.Generic_Source.Text_IO can be used for debugging. It provides:
procedure Put_Line (File : File_Type; Code : Source_Type);
procedure Put_Line (Code : Source_Type);
These procedures output the current source code line following current source cursors. The output might look like:
123.0 + ( Value - 1)
^^^^^|
The parameter File is the text file to write. It is the standard output if missing. Code is the source code.
There are three child procedures of Parsers.Generic_Source which can be used to skip the following blanks and comments:
generic
procedure Parsers.Generic_Source.Get_Blank
( Code : in out Source_Type;
Got_It : out Boolean
);
This procedure skips spaces, tabulations (HT), line feeds (LF), carriage returns (CR), vertical tabulations (VT), form feeds (FF) and requests new source lines when necessary. Upon completion Got_It is set to true if the source end is not yet reached. Otherwise it is false.
generic
procedure Parsers.Generic_Source.Get_Ada_Blank
( Code : in out Source_Type;
Got_It : out Boolean
);
This procedure skips Ada comments, spaces, HT, LF, CR, VT, FF and requests new source lines when necessary. Upon completion Got_It is set to true if the source end is not yet reached. Otherwise it is false. Beware that according to ARM 2.2(2) Ada comment ends either at the physical line end or in either of format effectors: LF, CR, VT, FF. This may lead to surprises when format effectors appear in the middle of what the operating system counts for a single line.
generic
procedure Parsers.Generic_Source.Get_Cpp_Blank
( Code : in out Source_Type;
Got_It : out Boolean;
Error : out Boolean;
Error_At : out Location_Type
);
This procedure skips C++ comments, spaces, HT, LF, CR, VT, FF and requests new source lines when necessary. A C++ comment either starts with // (double forward slash) and continues to the end of the current line or it does with /* (forward slash, asterisk) and continues to the first appearance of closing */. In the latter case nested /*..*/ comments are not recognized. Upon completion Got_It is set to true if the source end is not yet reached. Otherwise it is false. Error is set to true when no closing */ is found before the source end. In this case Error_At contains the location of /* in the source. Otherwise, Error is false and Error_At is not defined.
There is a child procedure of Parsers.Generic_Source which can be used to skip a text in the source:
generic
procedure Parsers.Generic_Source.Get_Text
( Code : in out Source_Type;
Text : String;
Got_It : out Boolean;
Map : Character_Mapping := Identity
);
Upon completion Got_It is set to true if Text was recognized and skipped. Otherwise it is false. It can be used for creating simple recursively descending parsers. The parameter Map specifies the character equivalence. A character in the source and in Text are equivalent when they are equivalent in Map. The default value considers all characters distinct. To have case-insensitive match one can use Ada.Strings.Maps.Constants.Lower_Case_Map.
When writing recursive descent parsers it is common to match the source against a list of keywords. The child procedure Get_Token of Parsers.Generic_Source can be used for this purpose. It has a generic formal parameter Tokens which is an instance of the package Tables:
generic
with package Tokens is new Tables (<>);
procedure Parsers.Generic_Source.Get_Token
( Code : in out Source_Type;
Folder : Tokens.Table'Class;
Token : out Tokens.Tag;
Got_It : out Boolean
);
The procedure matches the source Code against the table Folder. If a token from Folder is matched, then it is skipped in Code, the value associated with it is stored in Token and Got_It is set to true. Otherwise Got_It is set to false, and Token is not changed. Note that Folder can be a descendant of the table type defined in Tables. This means that one can also use case-insensitive tables from the package Tables.Names.
When keywords to match are plain case-insensitive words, the generic child package Keywords can be used instead of Get_Token. An enumeration type is the generic parameter of the package. The literals of the type are the keywords to match:
generic
type Keyword is (<>);
package Parsers.Generic_Source.Keywords is
...
The package provides the procedure:
procedure Get
( Code : in out Source_Type;
Token : out Keyword;
Got_It : out Boolean
);
This procedure matches a keyword in Code. Matching is case-insensitive. When matched the keyword value is set into Token and Got_It is set to true. The source cursor is then advanced behind the text matched. The longest possible token is always matched. When no token matches the source Got_It is set to false. The following code sample illustrates usage of the package:
with Parsers.Multiline_Source; --
Muiltiline sources |
This package has a limited use, because many words are reserved in Ada, and thus cannot be enumeration literals.
The package Parsers.String_Source provides an implementation of code source based on standard strings. The package is non-generic. It instantiates Parsers.Generic_Source under the name Code. So the package Parsers.String_Source.Code can be used everywhere an instance of Parsers.Generic_Source is required. Additionally the package defines:
type Location is record
From : Integer;
Length : Natural;
end record;
This is the type used for string source locations.
type Source (Text : access String) is limited record
Pointer : Integer := Text'First;
Last : Integer := Text'First;
end record;
This is the type of a string source. The discriminant Text points to the string being parsed.
The package Parsers.Multiline_Source provides an implementation of code sources consisting of several lines. The package is non-generic. It instantiates Parsers.Generic_Source under the name Code. So the package Parsers.Multiline_Source.Code can be used everywhere an instance of Parsers.Generic_Source is required. The package defines an abstract base type Source which should be concretized by overriding its abstract operations. The package defines:
type Line_Number is new Natural;
The source line numbers.
type Position is record
Line : Line_Number;
Column : Integer;
end record;
The source position.
function "<" (Left, Right : Position) return Boolean;
The source positions are comparable using both "=" and "<" .
type Location is record
First : Position;
Next : Position;
end record;
The source locations are specified by the first character position and the position of the first character next to location.
type Source is
abstract new Ada.Finalization.Limited_Controlled with
record
Buffer : String_Ptr;
Line : Line_Number := 0;
Length : Natural;
Pointer : Integer;
Last : Integer;
end record;
Here the fields are:
The field Buffer points to a string, which is used to keep the current source line. The constructor allocates the buffer of some reasonable size. When a new line is requested the buffer can be replaced by a larger one if necessary.
procedure Finalize (Code : in out Source);
The destructor deallocates the buffer.
procedure Initialize (Code : in out Source);
The constructor creates the buffer.
procedure Get_Line (Code : in out Source) is abstract;
This is an abstract procedure to be overridden. An implementation should read a complete next line into Code.Buffer.all. It may reallocate the buffer if necessary. After a successful completion Code.Buffer should point to a buffer containing the line and Code.Length should be the line length. The rest of the buffer is ignored. End_Error is propagated if no more lines available. Other exceptions can be used on I/O error.
The package Parsers.Multiline_Source.Text_IO provides sources based on text files. It declares the type Source:
type Source (File : access File_Type) is
new Multiline_Source.Source with private;
The discriminant File points to the file to read. The type File_Type is defined in Ada.Text_IO.
The expression tokens are operations and operands. The generic package Parsers.Generic_Token defines the token types:
generic
type Argument_Type is private;
type Operation_Type is private;
type Priority_Type is private;
with package Code is new Generic_Source (<>);
with function "and" (Left, Right : Operation_Type)
return Boolean is <>;
with function Is_Commutative (Left, Right : Operation_Type)
return Boolean is <>;
with function Is_Inverse (Binary_Operator : Operation_Type)
return Boolean is <>;
with function Group_Inverse (Binary_Operator : Operation_Type)
return Operation_Type is <>;
with function "<" (Left, Right : Priority_Type)
return Boolean is <>;
package Parsers.Generic_Token is ...
The parameters of the package defines the operations and the operands to use:
The package provides implementation of the following data structures used during expression parsing:
package Descriptors is
new Generic_Operation (Operation_Token, Priority_Type);
Token tables are used by parser to recognize operation symbols in the source. In accordance with three expression contexts there are at least tree token tables: the table of prefixes, the table of infixes and the table of postfixes. Each table contains the tokens which may appear in the corresonding context.
To make these instantiations the package defines the following types:
type Argument_Token is record
Value : Argument_Type;
Location : Location_Type;
end record;
This type describes an appearance of an argument (operand) in the source. The field Value identifies the argument. The field Location is the argument location.
type Operation_Token is record
Operation : Operation_Type;
Location : Location_Type;
end record;
This type describes an appearance of an operation in the source. Operation identifies the operation, Location is the operation location.
type Table_Token (Class : Token_Class := Operator) is record
Operation : Operation_Type;
case Class is
when Operator =>
Left : Priority_Type;
Right : Priority_Type;
when Index | Semicolon_Class'Range =>
Priority : Priority_Type;
when Bracket | Comma | Ligature | Modifier_Class'Range =>
null;
end case;
end record;
This type describes the tokens associated with expression operations to recognize in the source. The discriminant Class specifies the class of the operation. The field Operation identifies the operation. Additional fields determine the operation priorities, if applied. The following operations on tables of Table_Table token can be used for filling the tables:
procedure Add_Operator
( Table : in out Vocabulary.Table'Class;
Name : String;
Operator : Operation_Type;
Left : Priority_Type;
Right : Priority_Type
);
This procedure is used to add an operator to a token table. It can be either a prefix unary operator for a table of prefixes, or a binary infix operator for an table of infixes or a postfix unary operator for a table of postfixes. The parameter Table is the corresponding token table. Name is the operator name. Note that the same operator can be specified under different names. Operator is the operation associated with the name. Left and Right are the operator priorities. Constraint_Error is propagated on an incorrectly spelled name, if spelling is checked (see Table). Name_Error is propagated if the name is already in the table.
procedure Add_Bracket
( Table : in out Vocabulary.Table'Class;
Name : String;
Bracket : Operation_Type
);
This procedure is used to add a left order bracket or one of an aggregate to a table of prefixes. It can also be used to add a right bracket of any kind to a table of postfixes. The parameter Table is the corresponding token table. Name is the bracket name. Note that the same bracket can be specified under different names. Bracket is the operation associated with the name. Constraint_Error is propagated on an incorrectly spelled name. Name_Error is propagated if the name is already in the table.
procedure Add_Comma
( Table : in out Vocabulary.Table'Class;
Name : String;
Comma : Operation_Type
);
This procedure is used to add a comma to a table of infixes. The parameter Table is the table. Name is the comma name. The same comma can be specified under different names. Comma is the operation associated with the name. Constraint_Error is propagated on an incorrectly spelled name. Name_Error is propagated if the name is already in the table.
procedure Add_Index
( Table : in out Vocabulary.Table'Class;
Name : String;
Index : Operation_Type;
Left : Priority_Type
);
This procedure is used to add a left index bracket to a table of infixes. The parameter Table is the table. Name is the name of the left array index or function call bracket. Note that the same bracket can be specified under different names. Index is the operation associated with the name. Left is the left priority of the bracket. Constraint_Error is propagated on an incorrectly spelled name. Name_Error is propagated if the name is already in the table.
procedure Add_Ligature
( Table : in out Vocabulary.Table'Class;
Name : String;
Ligature : Operation_Type
);
This procedure is used to add a ligature to a table of infixes. The parameter Table is the table. Name is the name of the ligature. Note that the same ligature can be specified under different names. Ligature is the operation associated with the name. Constraint_Error is propagated on an incorrectly spelled name. Name_Error is propagated if the name is already in the table.
procedure Add_Postmodifier
( Table : in out Vocabulary.Table'Class;
Name : String;
Modifier : Operation_Type
);
This procedure is used to add a postmodifier to a table. The parameter Table is the table. Name is the name of the modifier. Modifier is the operation associated with the name. Constraint_Error is propagated on an incorrectly spelled name. Name_Error is propagated if the name is already in the table.
Modifiers cause call to On_Postmodifier handler when recognized.
A postmodifier added to the table of infixes is always discarded, because an infix operation is mandatory to appear after an operand. This has the effect of stopping parsing at the modifier, which can be utilized when there are reserved keywords used to bound expressions.
A postmodifier added to a prefix or postfix table may appear several times. This behavior may require special handling. For example, let "in" be a postmodifier for "is". Then "x is in y" and "x is in in y" will all be legal. To prevent this, one should change the operation to an equivalent one upon a call to On_Postmodifier so that by the next call to it one could detect that "in" was already applied and discard the second "in".
procedure Add_Premodifier
( Table : in out Vocabulary.Table'Class;
Name : String;
Modifier : Operation_Type
);
This procedure is used to add a premodifier either to a table of prefixes or to a table of postfixes. The premodifier when recognized causes a call to On_Premodifier when the operation following it appears. If the latter does not, On_Missing_Operation is called. Note that a premodifier cannot be returned back, thus a dangling premodifier is usually a severe error if it cannot be ignored. So it is preferable to use postmodifiers whenever possible. Constraint_Error is propagated on an incorrectly spelled name. Name_Error is propagated if the name is already in the table.
procedure Add_Semicolon
( Table : in out Vocabulary.Table'Class;
Name : String;
Semicolon : Operation_Type;
Class : Semicolon_Class;
Priority : Priority_Type
);
This procedure is used to add a semicolon to a table of infixes. The parameter Table is the table. Name is the semicolon name. The same semicolon can be specified under different names. Semicolon is the operation associated with the name, argument sublists separated by the semicolon will be merged by this operation. The parameter Class specifies the semicolon type. Constraint_Error is propagated on an incorrectly spelled name. Name_Error is propagated if the name is already in the table. When a sublist obtains all it is elements Enclose is called with the parameters indicating left index bracket, bracket or semicolon enclosing the sublist. Among the operations of Enclose at least one is a semicolon. The parameter Priority control the association priority of the separator among other separators. It is the left association priority when Class is Sublist_Close or Sublist_Separator. It is the right priority if Class is Sublist_Open or Sublist_Separator.
Additionally the package defines:
function Link (List : Arguments.Frame) return Location_Type;
This function merges the locations of all arguments in List. The result is a minimal contiguous location containing locations of all arguments from List.
The child generic package Parsers.Generic_Token.Generic_Token_Lexer an abstract type for table driven infix expression lexers:
generic
type Argument_Stack is new Arguments.Stack with private;
with package Operations is new Descriptors.Generic_Stack (<>);
type Operation_Stack is abstract new Operations.Stack with private;
package Parsers.Generic_Token.Generic_Token_Lexer is ...
The package has the following formal parameters:
The package instantiates Parsers.Generic_Lexer to provide lexers operating on the Argument_Stack and Operation_Stack:
package Implementation is
new Parsers.Generic_Lexer
( Arguments => Arguments,
Descriptors => Descriptors,
Operations => Operations,
Argument_Stack => Argument_Stack,
Operation_Stack => Operation_Stack,
Code => Code
);
The obtained lexer type is then extended:
type Lexer
( Prefixes : access Vocabulary.Table'Class;
Infixes : access Vocabulary.Table'Class;
Postfixes : access Vocabulary.Table'Class
) is abstract new Implementation.Lexer with private;
The type Lexer has the following discriminants:
A type derived from Lexer has to implement the following abstract subroutines defined for the base type in Parsers.Generic_Lexer:
The following error handlers can be overridden if other behavior required:
procedure Parse
( Context : in out Implementation.Lexer'Class;
Code : in out Source_Type;
Result : out Argument_Token
) renames Implementation.Parse;
This class-wide procedure renames Parsers.Generic_Lexer.Parse. Upon successful completion Result is one of the expression. Note that Result is of Argument_Token type. So it contains both the expression result and its location, which is usually the expression location. The state of Code indicates how far the expression parsing advanced even in case of an exception. Parse is recursive-call safe as long as implementations of the abstract operations do not change Context and Code in an inappropriate way.
The child generic package Parsers.Generic_Token.Segmented_Lexer provides table driven infix expression lexers based on segmented stack implementations:
generic
Argument_Frame_Segment_Size : Positive := 128;
Argument_Frame_Minimal_Size : Positive := 64;
Argument_Frame_Increment : Natural := 50;
Argument_Stub_Minimal_Size : Positive := 64;
Argument_Stub_Increment : Natural := 50;
Operation_Segment_Size : Positive := 128;
Operation_Minimal_Size : Positive := 64;
Operation_Increment : Natural := 50;
package Parsers.Generic_Token.Segmented_Lexer is ...
The formal parameters Argument_* control argument stack allocation policy, see Parsers.Generic_Argument.Segmented_Stack. The formal parameters Operation_* control operation stack allocation policy, see Parsers.Generic_Operation.Segmented_Stack.
The package instantiates Parsers.Generic_Token.Generic_Lexer under the name Token_Lexer. The type Lexer defined there is used as an abstract base:
subtype Lexer is Token_Lexer.Lexer;
A type derived from Lexer has to implement the following abstract subroutines defined for the base type in Parsers.Generic_Lexer:
procedure Parse
( Context : in out Token_Lexer.Implementation.Lexer'Class;
Code : in out Source_Type;
Result : out Argument_Token
) renames Token_Lexer.Implementation.Parse;
This class-wide procedure renames Parsers.Generic_Lexer.Parse. Upon successful completion Result is one of the expression. Note that Result is of Argument_Token type. So it contains both the expression result and its location, which is usually the expression location. The state of Code indicates how far the expression parsing advanced even in case of an exception. Parse is recursive-call safe as long as implementations of the abstract operations do not change Context and Code in an inappropriate way.
The package Parsers.Generic_Lexer provides abstract infix expression lexer. A lexer scan source for an expression. It stops scanning at the expression end. As it scans the source it uses two stacks to arrange the operands and operations it recognizes. Operands are stored on the argument stack, operations are pushed onto the operation stack. To recognize the expression tokens abstract subroutines are used to be implemented by concrete derived types.
generic
with package Arguments is new Generic_Argument (<>);
type Argument_Stack is new Arguments.Stack with private;
with package Descriptors is new Generic_Operation (<>);
with package Operations is new Descriptors.Generic_Stack (<>);
type Operation_Stack is abstract new Operations.Stack with private;
with package Code is new Generic_Source (<>);
package Parsers.Generic_Lexer is ...
The package generic parameters:
The package defines the abstract type Lexer:
type Lexer is abstract
new Ada.Finalization.Limited_Controlled with private;
and the type of lexical tokens used to communicate with lexical callbacks:
type Lexical_Token (Class : Token_Class := Operator) is record
Operation : Operation_Type;
case Class is
when Operator =>
Left : Priority_Type;
Right : Priority_Type;
when Index | Semicolon_Class'Range =>
Priority : Priority_Type;
when Bracket | Comma | Ligature | Modifier_Class'Range =>
null;
end case;
end record;
The class-wide operation
procedure Parse
( Context : in out Lexer'Class;
Code : in out Source_Type;
Result : out Argument_Type
);
is used to scan the source Code. Upon successful completion Result is one of the expression. In any outcome the source cursor indicates how far the expression parsing has managed to proceed. Parse is recursive-call safe as long as implementations of the abstract operations do not change Context and Code in an inappropriate way. It means that an implementation of an operation may in turn call Parse to get a subexpression from source if that necessary.
A type derived from Lexer has to implement the following abstract subroutines:
function Call
( Context : access Lexer;
Operation : Operation_Type;
List : Frame
) return Argument_Type is abstract;
This procedure is called to execute an operator when all its arguments become known. Ligatures and semicolons are also executed by making a call to this procedure. The parameter Operation identifies the operator being called. List is the list of the arguments. The first element of List is the first argument for Operation. The result is one of the operation. Any exception raised in Call will abort parsing and propagate out of Parse. Note that binary commutative operations for which Is_Commutative returns true are optimized, may have List larger than of just two arguments. For example: for a commutative "+" the expression A+B+C will result in one call "+"(A,B,C) instead of two: "+"("+"(A,B),C).
function Enclose
( Context : access Lexer;
Left : Operation_Type;
Right : Operation_Type;
List : Frame
) return Argument_Type is abstract;
This procedure is called to execute brackets. Brackets could be order, aggregate, array index or function calls. In the latter two cases the first argument in List is the array to be indexed or the function to be called. The parameter Left identifies the left bracket. The parameter Right does the right one. The result is one of the bracket operation. Any exception raised in Enclose will abort parsing and propagate out of Parse.
procedure Get_Blank
( Context : in out Lexer;
Code : in out Source_Type;
Got_It : out Boolean
) is abstract;
An implementation should skip everything till the next valid lexeme in the source Code. It starts from the current source position (see Get_Line and Get_Pointer) and advances it to the first non-blank character (see Set_Pointer). The parameter Got_It is set to false when the end of expression reached. This could be the end of file or a reserved keyword. Usually Get_Blank skips spaces, tabs, linefeeds and comments. Get_Blank may raise an exception to be propagated out of Parse on an error, like unclosed comment etc. There are child packages implementing different variants of Get_Blank: see Parsers.Generic_Lexer.Blanks / Ada_Blanks / Cpp_Blanks.
procedure Get_Infix
( Context : in out Lexer;
Code : in out Source_Type;
Token : out Lexical_Token;
Got_It : out Boolean
) is abstract;
An implementation should recognize a valid infix token and skip it in the source Code by advancing the source cursor (see Set_Pointer). Got_It indicates success. When Got_It is set to true, Token contains a valid infix token. That is either of:
When Got_It is false, it means that no infix operation was detected. Get_Infix may raise an exception to be propagated out of Parse on an unrecoverable error. Lexer automatically processes assumed infix operations. When it is necessary to parse expressions like 5a+1, then the multiplication is an implied operator. Lexer recognizes this case when the source pointers are equal. Implied operators are discarded at the expression end without call to On_Missing_Operand.
procedure Get_Operand
( Context : in out Lexer;
Code : in out Source_Type;
Argument : out Argument_Type;
Got_It : out Boolean
) is abstract;
An implementation should recognize a valid operand token and skip it in the source Code. Got_It indicates success. If Got_It is true then Argument contains or references to the operand such as literal, name etc. Otherwise, it means that no operand was recognized in the source. Get_Operand may raise an exception to be propagated out of Parse on an unrecoverable error. Note that syntax errors in operands should not necessarily be fatal. Argument can be set into a special value indicating a syntactically wrong operand, which would allow to continue parsing.
procedure Get_Postfix
( Context : in out Lexer;
Code : in out Source_Type;
Token : out Lexical_Token;
Got_It : out Boolean
) is abstract;
An implementation should recognize a valid postfix token and skip it in the source Code. Got_It indicates success. When Got_It is true then Token contains a valid postfix token. That is either of:
When Got_It is false, it means that no postfix operation was detected. Get_Postfix may raise an exception to be propagated out of Parse on an unrecoverable error.
procedure Get_Prefix
( Context : in out Lexer;
Code : in out Source_Type;
Token : out Lexical_Token;
Got_It : out Boolean
) is abstract;
An implementation should recognize a valid prefix token and skip it in the source Code. Got_It indicates success. When Got_It is true then Token contains a valid prefix token. That is either of:
When Got_It is false, it means that no postfix operation was detected. Get_Prefix may raise an exception to be propagated out of Parse on an unrecoverable error.
The following procedures are used for error handling. They are called from Parse to handle an exceptional state detected during expression parsing.
procedure On_Association_Error
( Context : in out Lexer;
Code : in out Source_Type;
Left : in out Operation_Type;
Right : in out Operation_Type
) is abstract;
This procedure is called when two operators sharing or associated in an argument are incompatible. The handler may modify any of the parameters Left and Right to make them compatible. In this case upon return the operation Right will be pushed onto the operation stack again with all association checks suppressed. Alternatively it may raise an exception which would then abort parsing and propagate out of Parse.
procedure On_Missing_Operand
( Context : in out Lexer;
Code : in out Source_Type;
Argument : out Argument_Type
) is abstract;
This procedure is called when an operand is expected. That is either when no expression was recognized at all or when an infix operation or comma is not followed by an operand. The handler can return the default operand into the parameter Argument. This could be a special kind of Argument_Type reserved for such cases. The expression parsing will be then continued. In some languages, like C++ it is legal to omit operands in some cases. For example by calling parameterless functions. Alternatively On_Missing_Operand may raise an exception which would then abort parsing and propagate out of Parse.
procedure On_Missing_Operation
( Context : in out Lexer;
Code : in out Source_Type;
Modifier : Operation_Type;
Token : out Lexical_Token;
Got_It : out Boolean
) is abstract;
This procedure is called when an operation expected after a premodifier was not found there. The parameter Modifier identifies it. The handler may ignore the modifier and continue parsing the expression as if there where no modifier by setting Got_It to false. It may simulate an operation by setting it to true and placing the operation token into Token. Alternatively it may raise an exception which would then abort parsing propagating out of Parse.
procedure On_Missing_Right_Bracket
( Context : in out Lexer;
Code : in out Source_Type;
Left : in out Operation_Type;
Right : out Operation_Type
) is abstract;
This procedure is called when the lexer detects an unclosed left bracket by finishing expression parsing. The handler can modify the left bracket specified by the parameter Left and should specify a suggested right one in the parameter Right. Alternatively it may raise an exception which would then abort parsing and propagate out of Parse.
procedure On_Postmodifier
( Context : in out Lexer;
Code : in out Source_Type;
Operation : in out Operation_Type;
Modifier : Operation_Type;
Got_It : out Boolean
);
procedure On_Postmodifier
( Context : in out Lexer;
Code : in out Source_Type;
Argument : in out Argument_Type;
Modifier : Operation_Type;
Got_It : out Boolean
);
These procedures are called to process postmodifiers. The parameter Modifier is the operation associated with the modifier. A postmodifier can be applied to either an operation or an argument it follows. The parameter Operation / Argument refers to the thing to modify. The procedure may observe it and change it. The parameter Got_It is set to true to indicate that the modifier was successfully processed. It is set to false to finish parsing in which case it recommended to call Reset_Pointer (Code) to bring source back to the position before the modifier. For an operation modifier this will cause a call to the On_Missing_Operand handler if that is not a postfix operation. On_Postmodifier may raise an exception which would then abort parsing. The default implementation returns the modifier back and sets Got_It to false.
procedure On_Premodifier
( Context : in out Lexer;
Code : in out Source_Type;
Token : in out Lexical_Token;
Modifier : Operation_Type;
Got_It : out Boolean
);
This procedure is called to process a premodifier. The parameter Modifier is the operation associated with the modifier. A premodifier is applied to the operation it precedes. The operation is specified by the parameter Token. The handler sets Got_It to true to indicate that Token was modified as necessary. When Got_It is set to false then the operation is discarded and parsing proceeds either to the next context allowing the modifier to be applied to an operation of another class or by discarding the modifier. That means:
On_Premodifier may raise an exception which would then abort parsing. The default implementation returns the modifier back and sets Got_It to false.
procedure On_Unexpected
( Context : in out Lexer;
Code : in out Source_Type;
Right : Operation_Type
);
This procedure is called when the lexer meets an unexpected comma, right bracket, ligature or semicolon specified by the parameter Right. The default implementation returns the unexpected delimiter back and then tries to complete the expression evaluation, so that the lexer will stop at the delimiter and return the expression result. This behavior is useful when expressions bounded by commas or brackets are parsed in case of nested expressions or descending code parsers. An override may raise an exception which would then abort parsing and propagate out of Parse.
procedure On_Wrong_Comma
( Context : in out Lexer;
Code : in out Source_Type;
Left : in out Operation_Type;
Comma : in out Operation_Type
) is abstract;
This procedure is called when the lexer finds incompatible comma, ligature or semicolon. It can modify the left bracket (the parameter Left), the delimiter (the parameter Comma) or both to make them compatible. Upon return Comma will be pushed onto the operation stack again with all checks suppressed. Alternatively it may raise an exception which would then abort and propagate out of Parse.
procedure On_Wrong_Right_Bracket
( Context : in out Lexer;
Code : in out Source_Type;
Left : in out Operation_Type;
Right : in out Operation_Type
) is abstract;
This procedure is called when the lexer finds incompatible brackets. It can modify the left bracket (the parameter Left), the right one (the parameter Right) or both to make them compatible. Upon return Right will be pushed onto the operation stack again with all checks suppressed. Alternatively it may raise an exception which would then abort and propagate out of Parse.
There are three child packages of Parsers.Generic_Lexer providing lexers with Get_Blank defined to skip the following blanks and comments:
These packages are generic:
generic
type Lexer_Type (<>) is
abstract new Parsers.Generic_Lexer.Lexer with private;
package Parsers.Generic_Lexer... is ...
The generic parameter Lexer_Type is a descendant of the type Lexer defined in the parent package (an instance of Parsers.Generic_Lexer). They derive from this type a new type and override Get_Blank as described above:
type Lexer is abstract new Lexer_Type with ...
Operations are operators, brackets, commas and ligatures. Each operation denotes some action to be executed. The parser uses several data structures based operations. These are defined using the generic package Parsers.Generic_Operation as the base package:
generic
type Operation_Type is private;
type Priority_Type is private;
with function "and" (Left, Right : Operation_Type)
return Boolean is <>;
with function Is_Commutative (Left, Right : Operation_Type)
return Boolean is <>;
with function Is_Inverse (Operation : Operation_Type)
return Boolean is <>;
with function Group_Inverse (Operation : Operation_Type)
return Operation_Type is <>;
with function "<" (Left, Right : Priority_Type)
return Boolean is <>;
package Parsers.Generic_Operation is ...
The formal generic parameters define the set of operations and their priorities:
type Operation_Type is private;
Objects of Operation_Type identify an appearance of an operation in the source. Usually it is the operation identifier with the source location attached to it. If no error diagnostic required Operation_Type can directly represent operations. The following operations should be defined on Operation_Type:
function "and" (Left, Right : Operation_Type) return Boolean;
This function is used to check operation compatibility. The parameters is one in which the corresponding operations appear in the source. The result is true if Left and be associated with Right. It is safe to return true if all operators are compatible and there is only one pair of brackets. See association checks for further information.
function Is_Commutative (Left, Right : Operation_Type)
return Boolean;
This function returns true if Left and Right are either the same commutative operation or either the group or inverse operation of the same group. See commutative operators for further information. It is safe to define this function as false for any pair of operations if no commutative operator optimization required.
function Is_Inverse (Binary_Operator : Operation_Type)
return Boolean;
For binary operators on which Is_Commutative is true, this function returns true if Binary_Operator is an inverse operation of the corresponding commutative group. For example, for addition it should be false, for subtraction it should be true.
function Group_Inverse (Binary_Operator : Operation_Type)
return Operation_Type;
For the binary operators on which Is_Commutative is true this function returns the unary inverse operation of the corresponding commutative group. For example for either addition or subtraction it should return unary minus.
type Priority_Type is private;
This is the operation priority type. Higher priority operations have higher association with the operands. Priorities are ordered using "<":
function "<" (Left, Right : Priority_Type) return Boolean;
The package Parsers.Generic_Operation also defines the type Descriptor used for operation stack items:
type Descriptor_Class is (Stub, Operator, Default, Sublist, Tuple, Ligature);
type Descriptor (Class : Descriptor_Class := Stub) is record
case Class is
when Operator..Ligature =>
Operation : Operation_Type;
case Class is
when Operator..Tuple =>
Count : Natural;
case Class is
when Operator..Sublist =>
Right : Priority_Type;
when others =>
null;
end case;
when others =>
null;
end case;
when others =>
null;
end case;
end record;
Operation stack is one of the basic data structures used while parsing. It is used to store operations as they are recognized and rearrange them according to the precedence rules. When an operator is recognized in the source, it is pushed onto the stack. Before that all operators with right priorities higher or equal to the left priority of the new operator are popped. They get their arguments from another stack and push the result there. Left brackets are treated as stack stubs. They are popped when a right bracket appears. This simple algorithm allows to parse infix expressions without resorting to grammars.
The operation stack is provided by the generic child package Parsers.Generic_Operation.Generic_Stack:
generic
type Descriptor_Stack is limited private;
type Index_Type is (<>);
with function Get
( Container : Descriptor_Stack;
Index : Index_Type
) return Descriptor is <>;
with function Is_Empty (Container : Descriptor_Stack)
return Boolean is <>;
with function Mark (Container : Descriptor_Stack)
return Index_Type is <>;
with procedure Pop
( Container : in out Descriptor_Stack;
Count : Natural := 1
) is <>;
with procedure Push
( Container : in out Descriptor_Stack;
Item : Descriptor
) is <>;
with procedure Put
( Container : in out Descriptor_Stack;
Index : Index_Type;
Element : Descriptor
) is <>;
with function Top (Container : Descriptor_Stack)
return Descriptor is <>;package Parsers.Generic_Operation.Generic_Stack is ...
The generic parameters of the package define interface of a raw stack of Descriptor items. The type Descriptor is defined in the parent package. The stack interface is same as one of Generic_Stack. In the code snippet above it is highlighted. To instantiate of Parsers.Generic_Operation.Generic_Stack one could first instantiate Generic_Stack (or else Generic_Segmented_Stack) using Descriptor for the stack item (Object_Type), then use the instance package, and finally instantiate Parsers.Generic_Operation.Generic_Stack in this context with Descriptor_Stack parameter set to the raw stack type. The result package provides higher level operation stack interface:
type Stack is abstract
new Ada.Finalization.Limited_Controlled with private;
The operation stack type is an abstract controlled type. The following primitive operations has to be implemented by derived types:
procedure Call
( Container : in out Stack;
Operation : Operation_Type;
Count : Natural
) is abstract;
This procedure is called to execute an operator when all its arguments become known. Ligatures are also executed by making a call to this procedure. The parameter Operation identifies the operator being called. Count is the number of the arguments. It is not specified where arguments are located. However, it is assumed that they are accessed in LIFO order. Binary commutative operations for which Is_Commutative returns true are optimized, so that one Call is used instead of a sequence of calls in cases like A+B+C. Which will result in "+"(A,B,C) instead of "+"("+"(A,B),C).
procedure Enclose
( Container : in out Stack;
Left : Operation_Type;
Right : Operation_Type;
Count : Natural
) is abstract;
This procedure is called to execute brackets and sublists. Brackets could be order, aggregate, array index or function calls. In the latter two cases the first argument is the array to be indexed or the function to be called. The parameter Left identifies the left bracket or sublist separator. The parameter Right does the right one. Count is the number of the arguments. Note one extra argument for array indices and function calls.
The following operations are defined on operation stacks:
function Is_Empty (Container : Stack'Class) return Boolean;
This function returns true if the raw stack is empty.
This procedure cleans the stack to remove the side-effects of a call to Push_Start. It is used upon an unrecoverable expression evaluation errors.
This procedure is called when the right margin of the expression reached. This can be a source end or a reserved keyword. It also can be an extra delimiter (see Unexpected_Comma, Unexpected_Right_Bracket exceptions). After successful completion the stack is returned to its state before the call to Push_Start. Missing_Right_Bracket is propagated when some left brackets of the expression remain open. A handler should either close them using Push_Right_Bracket and then try Push_End again or call Push_Abort.
This procedure pushes a stub onto the stack. A stub is removed by either a successful call to Push_End or by a call to Push_Abort. Parsing an expression starts with a call to Push_Start and ends by either Push_End (normal completion) or Push_Abort (abnormal completion). The operation stack is safe for recursive calls, so the same stack can be used for parsing nested expressions.
procedure Push_Binary
( Container : in out Stack'Class;
Operation : Operation_Type;
Left : Priority_Type;
Right : Priority_Type;
Unchecked : Boolean := False;
Explicit : Boolean := True
);
This procedure is called when a binary infix operator is recognized in the source. The parameter Operation identifies the operator. Left and Right are the operator's priorities. Association_Error is propagated on an incompatible operator on the left. See association checks for further information. No checks made if the parameter Unchecked is set to true. The parameter Explicit should be false if the operator was assumed in place of a missing one.
procedure Push_Comma
( Container : in out Stack'Class;
Operation : Operation_Type;
Comma : Boolean;
Unchecked : Boolean := False
);
This procedure is called when either a comma or a ligature is recognized in the source. When a plain comma matches the left bracket, it increases the number of arguments the list in the brackets has. A ligature does not increase the number of arguments, but binds two arguments it separates. Ligatures can be viewed as binary non-commutative operations which may appear only within brackets and have no priority. Call is applied to to execute a ligature. The parameter Comma is true when Operation identifies a comma, and false if it does a ligature. Unexpected_Comma is propagated if there is no any left bracket to match. Wrong_Comma_Type is propagated when the left bracket does not match. See association checks for further information. No checks made if the parameter Unchecked is set to true.
procedure Push_Left_Bracket
( Container : in out Stack'Class;
Operation : Operation_Type
);
This procedure is called when a left order bracket or a left bracket of an aggregate is detected. Operation identifies the bracket. Do not confuse them with the left brackets of array indices and function calls.
procedure Push_Left_Bracket
( Container : in out Stack'Class;
Operation : Operation_Type;
Left : Priority_Type;
Unchecked : Boolean := False
);
This procedure is called when a left bracket of an array index or a function call is recognized in the source. Operation identifies the bracket. Left is the left priority of the bracket. Association_Error is propagated upon an incompatible operator on the left. See association checks for further information. No checks made if the parameter Unchecked is set to true.
procedure Push_Right_Bracket
( Container : in out Stack'Class;
Operation : Operation_Type;
Unchecked : Boolean := False
);
This procedure is called to process a right bracket of any kind. The parameter Operation specifies the bracket. Unexpected_Right_Bracket is propagated when there is no any left bracket to match. Wrong_Right_Bracket_Type is propagated when the left bracket does not match.
procedure Push_Postfix
( Container : in out Stack'Class;
Operation : Operation_Type;
Left : Priority_Type;
Right : Priority_Type;
Unchecked : Boolean := False
);
This procedure is called when a postfix unary operator is detected. The parameter Operation identifies the operator. Left and Right are the operator's priorities. Association_Error is propagated on an incompatible operator on the left. No checks made if the parameter Unchecked is set to true.
procedure Push_Prefix
( Container : in out Stack'Class;
Operation : Operation_Type;
Left : Priority_Type;
Right : Priority_Type;
Unchecked : Boolean := False
);
This procedure is called when a prefix unary operator is detected. The parameter Operation identifies the operator. Left and Right are the operator's priorities. Association_Error is propagated on an incompatible operator on the left. No checks made if the parameter Unchecked is set to true.
procedure Push_Semicolon
( Container : in out Stack'Class;
Operation : Operation_Type;
Class : Semicolon_Class;
Priority : Priority_Type;
Unchecked : Boolean := False
);
This procedure is used to process a sublist separator (semicolon). The paramter Operation identifies the semicolon and is used in Call when the operands: items of a sublist become all known. The parameter Class specifies the semicolon type. Priority is the association priority. Unexpected_Comma is propagated if there is no any left bracket to match. Association_Error is propagated on an operation association error, Wrong_Comma_Type does on bracket error. See association checks for further information. No checks made if the parameter Unchecked is set to true.
procedure Replace
( Container : in out Stack'Class;
Replacement : Descriptor
);
This procedure can be used to replace an item (operation descriptor) on the stack top. Immediately after a call to Push_Start, the stack is semantically empty and contains a stub, which should never be replaced by any descriptor of other type. Constraint_Error is propagated when Container is physically empty.
function Top (Container : Stack'Class) return Descriptor;
This function returns the stack top. Constraint_Error is propagated when the raw stack is empty.
The child generic package Parsers.Generic_Operation.Segmented_Stack instantiates Parsers.Generic_Operation.Generic_Stack using the segmented stacks from the package Generic_Segmented_Stack. The package has the following generic parameters:
generic
Segment_Size : Positive := 128;
Minimal_Size : Positive := 64;
Increment : Natural := 50;
package Parsers.Generic_Operation.Segmented_Stack is ...
These parameters controls stack allocation (see). The package instantiates Parsers.Generic_Operation.Generic_Stack under the name Operation:
package Operation is new Generic_Stack (...);
So the stack type can be denoted as instance-name.Operation.Stack, where instance-name is the name of under which Parsers.Generic_Operation.Segmented_Stack is instantiated.
The following example illustrates a direct use of the operation stack without source parsing. That is when an external lexer is used for lexical analysis. In this case the operation stack can be used as part of syntax analysis dealing with operation association. The example starts from defining the operations, their priorities and finally instantiates Parsers.Generic_Operation.Segmented_Stack providing operation stacks.
File operation_stack_expressions.ads:with
Generic_Segmented_Stack; with Parsers.Generic_Operation.Segmented_Stack; package Operation_Stack_Expressions is -- -- Integer_Stack -- Stacks of integers to keep arguments -- package Integer_Stack is new Generic_Segmented_Stack ( Index_Type => Integer, Object_Type => Integer, Null_Element => 0 ); -- -- Operations -- The set of operations -- type Operations is (Add, Mul, Inc, Left_Bracket, Right_Bracket); function "and" (Left, Right : Operations) return Boolean; function Is_Commutative (Left, Right : Operations) return Boolean; function Is_Inverse (Operation : Operations) return Boolean; function Group_Inverse (Operation : Operations) return Operations; -- -- Priorities -- The operation priorities -- type Priorities is range 1..10; -- -- Raw_Descriptors -- The raw operation stack descriptors -- package Raw_Descriptors is new Parsers.Generic_Operation (Operations, Priorities); -- -- Descriptor_Stacks -- Operation stack based on raw descriptors -- package Descriptor_Stacks is new Raw_Descriptors.Segmented_Stack; -- -- Use the package of operation stacks deployed there -- use Descriptor_Stacks.Operation; -- -- Expression_Stack -- Derived from abstract operation stack to -- provide implementation of operation calls. -- type Expression_Stack is new Stack with record Data : Integer_Stack.Segmented_Stack.Stack; end record; -- -- Call -- Overrides to implement operators -- procedure Call ( Stack : in out Expression_Stack; Operation : Operations; Count : Natural ); -- -- Enclose -- Overrides to implement brackets -- procedure Enclose ( Stack : in out Expression_Stack; Left : Operations; Right : Operations; Count : Natural ); end Operation_Stack_Expressions; |
Here Integer_Stack is an instance of Generic_Segmented_Stack to keep arguments (integers). The type Operations is the set of defined operations: addition, multiplication, post-increment, left and right order brackets. The function "and" is provided for association checks. Is_Commutative and Is_Inverse always return false. Group_Inverse may return anything, it will never be called. The type Priorities defines the operation priorities. Raw_Descriptors is an instance of Parsers.Generic_Operation based on Operation and Priority. At this point Parsers.Generic_Operation.Segmented_Stack can be instantiated. Desciptor_Stacks is the instance which provides the abstract operation stack. Expression_Stack is derived from it. It has one additional data member of Integer_Stack type, that will keep the arguments of the operations. The abstract procedures Call and Enclose are overridden to implement the operations semantic.
File operation_stack_expressions.adb:with Ada.Text_IO;
use Ada.Text_IO; package body Operation_Stack_Expressions is use Integer_Stack.Segmented_Stack; function "and" (Left, Right : Operations) return Boolean is begin return True; end "and"; function Is_Commutative (Left, Right : Operations) return Boolean is begin return False; end Is_Commutative; function Is_Inverse (Operation : Operations) return Boolean is begin return False; end Is_Inverse; function Group_Inverse (Operation : Operations) return Operations is begin raise Program_Error; return Inc; end Group_Inverse; procedure Call ( Stack : in out Expression_Stack; Operation : Operations; Count : Natural ) is L, R : Integer; begin if Count > 0 then R := Top (Stack.Data); Pop (Stack.Data); end if; if Count > 1 then L := Top (Stack.Data); Pop (Stack.Data); end if; case Operation is when Add => Push (Stack.Data, L + R); when Mul => Push (Stack.Data, L * R); when Inc => Push (Stack.Data, R + 1); when others => raise Constraint_Error; end case; end Call; procedure Enclose ( Stack : in out Expression_Stack; Left : Operations; Right : Operations; Count : Natural ) is begin null; end Enclose; end Operation_Stack_Expressions; |
Implementation of Call is straightforward it gets arguments from the stack evaluates the operation and pushes the result back. Enclose need nothing to do because order brackets is do not change the argument. Now the operation stack is ready to use:
File test_operation_expressions.adb:with
Operation_Stack_Expressions; use Operation_Stack_Expressions; procedure Test_Operation_Stack is use Operation_Stack_Expressions.Raw_Descriptors; use Operation_Stack_Expressions.Descriptor_Stacks.Operation; use Operation_Stack_Expressions.Integer_Stack.Segmented_Stack; Expression : Expression_Stack; begin -- 1 + (2 + 3 + 4 * 5)++ + 6 * 7 + 8++ Push_Start (Expression); Push (Expression.Data, 1); -- 1 Push_Binary (Expression, Add, 5, 6); -- + Push_Left_Bracket (Expression, Left_Bracket); -- ( Push (Expression.Data, 2); -- 2 Push_Binary (Expression, Add, 5, 6); -- + Push (Expression.Data, 3); -- 3 Push_Binary (Expression, Add, 5, 6); -- + Push (Expression.Data, 4); -- 4 Push_Binary (Expression, Mul, 7, 8); -- * Push (Expression.Data, 5); -- 5 Push_Right_Bracket (Expression, Right_Bracket); -- ) Push_Postfix (Expression, Inc, 9, 10); -- ++ Push_Binary (Expression, Add, 5, 6); -- + Push (Expression.Data, 6); -- 6 Push_Binary (Expression, Mul, 7, 8); -- * Push (Expression.Data, 7); -- 7 Push_Binary (Expression, Add, 5, 6); -- + Push (Expression.Data, 8); -- 8 Push_Postfix (Expression, Inc, 9, 10); -- ++ Push_End (Expression); if Top (Expression.Data) /= 78 then raise Constraint_Error; end if; end Test_Operation_Stack; |
This procedure evaluates 1 + (2 + 3 + 4 * 5)++ + 6 * 7 + 8++ just by pushing arguments and the operations onto the corresponding stacks. Push_End finishes the expression evaluation and the stack of arguments contains the only one item, the expression result.
The argument stack is the basic data structure used for parsing. It contains the expression arguments, that is the operands of operations and their results. When expression is interpreted to immediately obtain its result, arguments are usually just values. When expression is compiled into some intermediate representation, arguments are leaves and nodes of the parsing tree.
The generic package Parsers.Generic_Argument defines the base abstract type for argument stacks:
generic
type Argument_Type is private;
package Parsers.Generic_Argument is ...
The generic parameter Argument_Type identifies appearance of an argument in the source. Usually it is the argument and a source location link. The package defines:
type Argument_No is new Positive;
type Frame is array (Argument_No range <>) of Argument_Type;
The type Frame is used to pass argument lists to the expression operations.
type Stack is abstract
new Ada.Finalization.Limited_Controlled with private;
The abstract base type of argument stacks. An implementation should provide the following abstract subprograms:
function Is_Empty (Container : Stack) return Boolean;
This function returns true if the current stack fragment is empty.
This procedure creates a new stack fragment. A stack fragment represents an independent argument stack. No arguments below mark can be accessed in any way until Release is called.
procedure Pop
( Container : in out Stack;
List : in out Frame
) is abstract;
This procedure pops an argument frame from the stack. The arguments fill the list provided by the parameter List. The number of arguments is defined by List'Length. Constraint_Error is propagated when Container does not contain enough arguments in the current fragment (above the last mark).
procedure Push
( Container : in out Stack;
Argument : Argument_Type
) is abstract;
This procedure pushes one argument onto the stack.
This procedure should be called for each call to Mark to remove the stack fragment created by the mark. If there are any arguments on the stack pushed after the mark, they are removed. Constraint_Error is propagated when the stack does not contain any mark.
The generic child package Parsers.Generic_Argument.Segmented_Stack provides an implementation of argument stack using segmented stacks from the package Generic_Segmented_Stack. The package has the following generic parameters:
generic
Frame_Segment_Size : Positive := 128;
Frame_Minimal_Size : Positive := 64;
Frame_Increment : Natural := 50;
Stub_Minimal_Size : Positive := 64;
Stub_Increment : Natural := 50;
package Parsers.Generic_Argument.Segmented_Stack is ...
The parameters Frame_Segment_Size, Frame_Minimal_Size and Frame_Increment controls allocation of stack segments. The parameters Stub_Minimal_Size and Stub_Increment controls a Generic_Unbounded_Array used to keep stack stubs (fragments bounds). The type Stack defined in the package:
type Stack is new Parsers.Generic_Argument.Stack with private;
The package Parsers.Ada provides a full Ada 95 expression analyzer. The analyzer has the type Ada_Expression defined in the package. The analyzer recognizes an Ada expression in the source and stops at its end. The type Operations defines the Ada operations:
Name (Operation) |
Ada 95 |
Comment. References to the corresponding sections of Ada Reference Manual are given in round brackets where appropriate |
Logical_And | and | Logical operators and short-circuit
control
forms (4.5.1). Logical and/or are implemented as
premodifiers turning into infix operators when no short-circuit suffix
defined as a postmodifier follows. As the operators
all of them are declared
commutative in the sense that the adjacent operators of same type will be
merged into one as in the case ofA and then B and then CParsed to And_Then (A, B, C). All these operators have the priority level 2 (both the left and the right ones). |
Logical_Or | or | |
Logical_Xor | xor | |
And_Then | and then | |
Or_Else | or else | |
EQ | = | Relational operators and membership tests (4.5.2). The operator not in is implemented as a premodifier not applied to the following operator in. Both membership tests are defined as an infix operators. All these operators have the priority level 3. |
NE | /= | |
LT | < | |
LE | <= | |
GE | > | |
GT | >= | |
Member | in | |
Not_Member | not in | |
Add | + | Binary adding operators (4.5.3). The operators + and - are declared commutative with the group inverse Add_Inv. The infix adding operators have the priority level 4. |
Sub | - | |
Concatenate | & | |
Plus | + | Unary adding operators (4.5.4). Association checks prevent multiple unary operations association with themselves, adding, multiplying and highest precedence operators. The priority level is 5. |
Minus | - | |
Mul | * | Multiplying operators (4.5.5). The operators * and / are declared commutative with the group inverse Mul_Inv. The priority level is 6. |
Div | / | |
Modulus | mod | |
Remainder | rem | |
Pow | ** | Highest precedence operators (4.5.6). These cannot be associated with themselves. They have the priority 7. |
Abs_Value | abs | |
Logical_Not | not | |
Allocator | new | Allocator (4.8). An allocator is treated as a prefix operator with priority 8. |
Attribute | ' | Attribute is treated as an infix operator with the priority 9. |
Alternative | | | Separates choices in aggregates, treated as an infix operator. The operator is defined commutative to keep lists of alternatives whole. The operator has the priority 0. |
Ellipsis | .. | Used in slices (4.1.2), treated as an infix operator with the priority 1. |
Component | . | Component selector (4.1.3), treated as an infix operator. The operator is defined commutative to merge nested component selectors in one list. |
Left_Bracket | ( | Order and aggregate brackets (4.3) |
Left_Index | ( | Indexed components (4.1.1) and function calls (6.4). It has the left priority 9, lower than one of component selector and same as one of attributes. |
Right_Bracket | ) | Right bracket |
Comma | , | In brackets of all sorts |
Associate | => | Named association, treated as a ligature |
Extend | with | Extension aggregate separator, treated as a semicolon introducing a sublist |
The identifiers are not checked against the reserved keywords. That can be done on later stages when necessary. Similarly the attributes names can be any expressions. The association checks can be relaxed by overriding corresponding error handlers. One might also wish to override the handlers to provide a more advanced error messaging mechanism than exception information.
The package defines the type Node a tagged abstract base type of all parsing tree nodes:
type Node is abstract tagged limited null record;
function Image (Item : Node) return String is abstract;
type Node_Ptr is access Node'Class;
for Node_Ptr'Storage_Pool use Tree_Pool;
The nodes of the tree are allocated on a stack. The stack is provided by a stack pool. This allows to remove the whole tree by deallocating its first allocated node or any other pool object allocated before it. Tree_Pool is the stack storage pool used for this. Nodes have the primitive operation Image used for dumping a parsing tree. The following concrete types are derived from Node:
Parsing tree nodes | Comment |
Character_Literal | Character expression term |
Expression | An non-terminal node. The dicriminant Count identifies the number of successors. The field Operation is the operation associated with the node. The field Operands is the list of successors. |
Identifier | Identifier expression term |
Integer_Literal | Universal_Integer expression term |
Mark | Used as a stub for the stack pool to mark its state for future stack release |
Missing_Operand | An expression term used where no operand was found |
Real_Literal | Universal_Real expression term |
String_Literal | String expression term |
All expression nodes have the field Location specifying its source location.
The root directory contains only the packages required for software use. Tests and examples are located in the subdirectory test.
The following table describes the packages provided by the software.
Package | Provides | |||
APQ | Implementation packages used for Persistent.APQ | |||
Common | Thicker bindings to the APQ abstracting away data base specific details | |||
Keys | Persistent objects identification in APQ | |||
Sets | Sets of persistent objects keys | |||
Links | Management of persistent objects dependencies in APQ | |||
Deposit_Handles | An instantiation of Object.Archived.Handle | |||
Generic_Unbounded_Array | The type Unbounded_Array and operations on it | |||
Generic_Unbounded_Ptr_Array | The type Unbounded_Ptr_Array | |||
Generic_Map | The type Map | |||
Generic_Set | The type Set | |||
Generic_Stack | The type Stack, a generic stack | |||
Generic_Segmented_Stack | Stacks built of segments of same size (generic) | |||
GNU.DB.CLI.API | Thick bindings to GNADE ODBC | |||
Keys | Persistent objects identification in ODBC | |||
Edit | String conversions for objects keys | |||
Sets | Sets of persistent objects keys for ODBC | |||
Links | Management of persistent objects dependencies in ODBC | |||
Object | The type Entity | |||
Archived | The types Deposit, Backward_Link and Deposit_Container for handling persistency | |||
Handle | The type Handle (to persistent objects) and operations on it. The package is generic | |||
Iterators | The type References_Iterator, an iterator of object's references | |||
Lists | The type Deposit_List, a list of persistent objects | |||
Sets | The type Deposit_Set, a set of persistent objects | |||
Handle | The type Handle and operations on it | |||
Generic_Bounded_Array | The type Bounded_Array, a generic bounded array of objects | |||
Generic_Unbounded_Array | The type Unbounded_Array, a generic unbounded array of objects | |||
Generic_Set | The type Set, a generic set of objects | |||
Parsers | The base package of syntax analyzers | |||
Ada | The type Ada_Expression, an implementation of Ada 95 expression parser | |||
Generic_Argument | The type Stack, the abstract base for argument stacks (generic) | |||
Segmented_Stack | An implementation of argument stacks based on segmented stacks | |||
Generic_Lexer | The type Lexer, the abstract base for expression analyzers (generic) | |||
Ada_Blanks | An analyzer supporting Ada 95 comments and blanks | |||
Blanks | An analyzer supporting blanks of spaces and formatting characters HT, LF, CR, VT, FF | |||
Cpp_Blanks | An analyzer supporting C++ comments and blanks | |||
Generic_Operation | The expression operations and their descriptors (generic) | |||
Generic_Stack | The type Stack, the abstract base of operation stacks | |||
Segmented_Stack | An implementation of operation stacks based on segmented stacks | |||
Generic_Source | An abstract interface of code source (generic) | |||
Get_Text | Matching a text in the code source (generic procedure) | |||
Get_Token | Matching a table against the code source (generic procedure) | |||
Keywords | Keyword-matching generated from an enumeration type (generic) | |||
Text_IO | Debugging output for source code cursors | |||
Generic_Token | The table tokens for table-driven analyzers (generic) | |||
Generic_Token_Lexer | The type Lexer, the abstract base for table-driven analyzers | |||
Segmented_Lexer | An implementation of table-driven analyzers based on segmented argument and operation stacks | |||
Multiline_Source | The type Source, the abstract base for sources having multiple lines of code | |||
Text_IO | The type Source, a code source based on Ada.Text_IO | |||
String_Source | The type Source, a single string code source | |||
Persistent | Abstract persistent storage interface | |||
APQ | APQ implementation of persistent storage | |||
Data_Bank | Abstract persistent storage with objects identified by keys | |||
Index | Index for an abstract persistent storage | |||
Indexed | Abstract indexed persistent storage | |||
Reference | Persistent storage references for proxy objects | |||
Handle | Handles to persistent storage objects | |||
Factory | Persistent storage factory | |||
ODBC | ODBC implementation of persistent storage | |||
Stack_Storage | The stack pools implemented by the type Pool | |||
Mark_And_Release | Mark and release storage pools of limited controlled objects |
The packages related to tables management are described in Tables.
For the packages dealing with strings editing see Strings edit.
The subdirectory test contains various tests and examples:
Compilation unit | Executable | Provides | Requires |
test_APQ_persistence | yes | APQ persistent storage test | APQ |
test_APQ_session | no | A package used to open an APQ session. Queries for connection parameters using text I/O | APQ |
test_association | yes | Test for infix operation associations. Uses the file test_association.txt as the source to parse. | |
test_association_expression | no | Used by test_association | |
test_handles | yes | Test for handles to objects | |
test_my_string | no | An implementation of varying strings using objects, illustrative example. Used in test_handles. | |
test_my_string.handle | no | An implementation of handles to string objects, example. Used in test_handles. | |
test_object | no | A test object used in test_handles. | |
test_object.handle | no | Handles to test objects. Used in test_handles. | |
test_object.handle_array | no | Arrays of test objects. Used in test_handles. | |
test_ODBC_persistence | yes | ODBC persistent storage test, example. | GNADE |
test_ODBC_session | no | A package used to open ODBC session. Queries for connection parameters using text I/O | GNADE |
test_persistent_file_storage | no | An implementation of persistent storage using direct I/O, example | |
test_persistent_storage | yes | Persistent storage test, example. | |
test_persistent_tree | no | An implementation of tree nodes as persistent objects, example | |
test_set | no | A set of string objects declared in test_my_string.handle. Used in test_handles. | |
test_stack | yes | Test for mark and release stacks | |
test_stack_item | no | Used in test_stack |
Tests that do not require nether GNADE not APQ can be build using the following command:
gnatmake -I.. <file-name>
Note that <file-name> should refer an *.adb file. For example:
gnatmake -I.. test_handles.adb
The tests depending on GNADE will require an installation of GNADE. On a Linux box they could be built like:
gnatmake -I.. -I/usr/local/gnade/include test_odbc_persistence.adb
Under Windows it might be (one command line):
gnatmake -I.. -Ic:/gnade/win32-include test_odbc_persistence.adb -largs -Lc:/gnade/win32-lib -lgnadeodbc -LC:/gnade/lib/win32 -lodbc32
Refer to GNADE documentation for more information. GNADE is under development, so paths might change.
The tests that use APQ should require no additional parameters. At least under Windows APQ is fully integrated with GNAT compiler. If you use a source distribution of APQ, it would probably be required to specify additional directory paths.
The subdirectory parser-examples contains examples of using parsers. It has the following subdirectories:
The following command in the corresponding subdirectory can be used to build an parser example:
gnatmake -I../.. <file-name>
Changes to the version 2.0:
Changes to the version 1.10:
Changes to the version 1.9:
Changes to the version 1.8:
Changes to the version 1.7:
Changes to the version 1.6:
Changes to the version 1.5:
Changes to the version 1.4:
Changes to the version 1.3:
Changes to the version 1.2:
Changes to the version 1.1:
Changes to the version 1.0:
1. Objects and handles (smart pointers)
1.1. Objects
1.2. Handles to objects
1.3. An example of use
1.4. Bounded arrays of objects
1.5. Unbounded arrays of objects
1.6. Sets of objects
1.7. Persistent objects
1.8. Handles to persistent objects
1.9. Persistent storage implementation example
1.10. Abstract persistent storage
1.11. Handles to persistent storage
1.12. Persistent storage factory
1.13. Persistent storage implementations
1.14. Implementation of a new persistent storage
2. Sets and maps
2.1. Sets
2.2. Maps
3. Unbounded arrays
4. Unbounded arrays of pointers
5. Stacks
5.1. Stacks based on abstract arrays
5.2. Segmented stacks
6. Pools
6.1. Stack pool
6.2. Mark and release pool for controlled objects
7. Parsers
7.1. Example first, a small calculator
7.2. Basic considerations
7.3. The base package
7.4. Sources
7.5. Tokens
7.6. Lexers
7.7. Operations
7.8. Arguments
7.9. Parsing tree example. Ada 95 expression parser
8. Packages
8.1. Source packages
8.2. Tests and examples
Tables (a separate document)
Strings edit (a separate document)
9. Changes log
10. Table of contents