!standard A.18.18(1) 10-01-15 AI05-0136-1/06 !class Amendment 09-02-04 !status ARG Approved 9-0-3 09-11-07 !status work item 09-02-04 !status received 09-01-27 !priority Medium !difficulty Medium !subject Multiway tree container !summary Add a multiway tree container. !problem One of the most important advantages of using predefined containers over hand-building data structures is that the memory management is handled by the container. This is especially an advantage when the elements are indefinite types such as class-wide types. However, Ada 2005 does not provide any way to create a tree structure out of existing containers without requiring explicit memory management. Consider representing a multiway tree using only the existing containers. Clearly, the list of child nodes of an element can easily be represented as a list container. That list would have to be part of the element as inserted in the tree - which of course is also the type of the elements in the list. Thus, the container has to appear in its own elements; implementing that necesarily includes an incomplete type and an access type (which could be anonymous). (If Ada did not require that, then the implementation of the containers would be significantly constrained; in particular, bounded forms would be incompatible with direct inclusion in the elements). The use of an access type here necessarily requires the use of allocators and thus hand-written memory management. Indeed, it probably harder to use one of the existing container in this way than it is to write the operations directly. This situation cries out for a solution, and the easiest one is to provide one or more appropriate containers. !proposal Add a multiway tree container. A multiway tree allows multiple child elements for each parent element. (Such a tree is sometimes known as parent-child-sibling tree.) A multiway tree is the most general tree; many problems map directly into it (see examples for a number of such cases). !wording Add the multiway tree to the list of containers defined in A.18(4.j). A.18.xx The Package Containers.Multiway_Trees [Editor's Note: What should the actual clause numbers be for these containers? A natural insertion would require renumbering all of the indefinite containers.] The language-defined generic package Containers.Multiway_Trees provides private types List and Cursor, and a set of operations for each type. A multiway tree container is well-suited to represent nested structures. A multiway tree container object manages a tree of internal *nodes*, each of which contains an element and pointers to the parent, first child, last child, next (successor) sibling, and previous (predecessor) sibling internal nodes. A cursor designates a particular node within a tree (and by extension the element contained in that node, if any). A cursor keeps designating the same node (and element) as long as the node is part of the container, even if the node is moved in the container. A *subtree* is a particular node (which *roots the subtree*) and all of its child nodes (including all of the children of the child nodes, recursively). There is a special node, the *root*, which is always present and has no associated element value. The root node provides a place to add nodes to an otherwise empty tree and represents the bottom of the tree. A node that has no children is called a *leaf node*. The *ancestors* of a node are the parent node, the parent of the parent node, and so on until a node with no parent is reached. Similarly, the *descendants* of a node are the child nodes, the children of each child node, and so on. The nodes of a subtree could be visited in several different orders. For a *depth-first order*, the last step of visiting a node is to visit the nodes of its child list in order, recursively. AARM Ramification: For the depth-first order, when each child node is visited, the child list of the child node is visited before the next sibling of the child node is visited. [Editor's note: I didn't define a breadth-first order. Such an order requires two trips through the child list (one to visit the child nodes, and once to to visit the children of those nodes. It also doesn't make sense for debugging/output applications, so it wasn't clear that it was very useful. Thus I didn't use it in any of the defined operations.] Static Semantics The generic library package Containers.Multiway_Trees has the following declaration: generic type Element_Type is private; with function "=" (Left, Right : Element_Type) return Boolean is <>; package Ada.Containers.Multiway_Trees is pragma Preelaborate(Multiway_Trees); pragma Remote_Types(Multiway_Trees); [Editor's note: Remote_Types to be consistent with AI05-0084-1.] type Tree is tagged private; pragma Preelaborable_Initialization(Tree); type Cursor is private; pragma Preelaborable_Initialization(Cursor); Empty_Tree : constant Tree; No_Element : constant Cursor; function Equal_Subtree (Left_Position : Cursor; Right_Position: Cursor) return Boolean; function "=" (Left, Right : Tree) return Boolean; function Is_Empty (Container : Tree) return Boolean; function Node_Count (Container : Tree) return Count_Type; function Subtree_Node_Count (Position : Cursor) return Count_Type; function Depth (Position : Cursor) return Count_Type; function Is_Root (Position : Cursor) return Boolean; function Is_Leaf (Position : Cursor) return Boolean; function Root (Container : Tree) return Cursor; procedure Clear (Container : in out Tree); function Element (Position : Cursor) return Element_Type; procedure Replace_Element (Container : in out Tree; Position : in Cursor; New_Item : in Element_Type); procedure Query_Element (Position : in Cursor; Process : not null access procedure (Element : in Element_Type)); procedure Update_Element (Container : in out Tree; Position : in Cursor; Process : not null access procedure (Element : in out Element_Type)); procedure Assign (Target : in out Tree; Source : Tree); function Copy (Source : Tree) return Tree; procedure Move (Target : in out Tree; Source : in out Tree); procedure Delete_Leaf (Container : in out Tree; Position : in out Cursor); procedure Delete_Subtree (Container : in out Tree; Position : in out Cursor); procedure Swap (Container : in out Tree; I, J : in Cursor); function Find (Container : Tree; Item : Element_Type) return Cursor; function Find_In_Subtree (Container : Tree; Item : Element_Type; Position : Cursor) return Cursor; function Ancestor_Find (Container : Tree; Item : Element_Type; Position : Cursor) return Cursor; function Contains (Container : Tree; Item : Element_Type) return Boolean; function Has_Element (Position : Cursor) return Boolean; procedure Iterate (Container : in Tree; Process : not null access procedure (Position : in Cursor)); function Child_Count (Parent : Cursor) return Count_Type; function Child_Depth (Parent, Child : Cursor) return Count_Type; procedure Insert_Child (Container : in out Tree; Parent : in Cursor; Before : in Cursor; New_Item : in Element_Type; Count : in Count_Type := 1); procedure Insert_Child (Container : in out Tree; Parent : in Cursor; Before : in Cursor; New_Item : in Element_Type; Position : out Cursor; Count : in Count_Type := 1); procedure Insert_Child (Container : in out Tree; Parent : in Cursor; Before : in Cursor; Position : out Cursor; Count : in Count_Type := 1); procedure Prepend_Child (Container : in out Tree; Parent : in Cursor; New_Item : in Element_Type; Count : in Count_Type := 1); procedure Append_Child (Container : in out Tree; Parent : in Cursor; New_Item : in Element_Type; Count : in Count_Type := 1); procedure Delete_Children (Container : in out Tree; Parent : in Cursor); procedure Copy_Subtree (Target : in out Tree; Parent : in Cursor; Before : in Cursor; Source : in Cursor); procedure Splice_Subtree (Target : in out Tree; Parent : in Cursor; Before : in Cursor; Source : in out Tree; Position : in out Cursor); procedure Splice_Subtree (Container: in out Tree; Parent : in Cursor; Before : in Cursor; Position : in Cursor); procedure Splice_Children (Target : in out Tree; Target_Parent : in Cursor; Before : in Cursor; Source : in out Tree; Source_Parent : in Cursor); procedure Splice_Children (Container : in out Tree; Target_Parent : in Cursor; Before : in Cursor; Source_Parent : in Cursor); function Parent (Position : Cursor) return Cursor; function First_Child (Parent : Cursor) return Cursor; function First_Child_Element (Parent : Cursor) return Element_Type; function Last_Child (Parent : Cursor) return Cursor; function Last_Child_Element (Parent : Cursor) return Element_Type; function Next_Sibling (Position : Cursor) return Cursor; function Previous_Sibling (Position : Cursor) return Cursor; procedure Next_Sibling (Position : in out Cursor); procedure Previous_Sibling (Position : in out Cursor); procedure Iterate_Children (Container : in Tree; Parent : in Cursor; Process : not null access procedure (Position : in Cursor)); procedure Reverse_Iterate_Children (Container : in Tree; Parent : in Cursor; Process : not null access procedure (Position : in Cursor)); private ... -- not specified by the language end Ada.Containers.Multiway_Trees; The actual function for the generic formal function "=" on Element_Type values is expected to define a reflexive and symmetric relationship and return the same result value each time it is called with a particular pair of values. If it behaves in some other manner, the functions Find, Reverse_Find, Equal_Subtrees, and "=" on tree values return an unspecified value. The exact arguments and number of calls of this generic formal function by the functions Find, Reverse_Find, Equal_Subtrees, and "=" on tree values are unspecified. The type Tree is used to represent trees. The type List needs finalization (see 7.6). Empty_Tree represents the empty Tree object. It contains only the root node (Count (Empty_Tree) returns 1). If an object of type Tree is not otherwise initialized, it is initialized to the same value as Empty_Tree. No_Element represents a cursor that designates no element. If an object of type Cursor is not otherwise initialized, it is initialized to the same value as No_Element. The predefined "=" operator for type Cursor returns True if both cursors are No_Element, or designate the same element in the same container. Execution of the default implementation of the Input, Output, Read, or Write attribute of type Cursor raises Program_Error. Tree'Write writes Tree.Count-1 elements to the stream. Tree'Read reads Tree.Count-1 elements from the stream. Some operations of this generic package have access-to-subprogram parameters. To ensure such operations are well-defined, they guard against certain actions by the designated subprogram. In particular, some operations check for "tampering with cursors" of a container because they depend on the set of elements of the container remaining constant, and others check for "tampering with elements" of a container because they depend on elements of the container not being replaced. A subprogram is said to tamper with cursors of a tree object T if: * it inserts or deletes elements of T, that is, it calls the Clear, Delete_Leaf, Insert_Child, or Delete_Children procedures with T as a parameter; or AARM To be honest: Operations which are defined to be equivalent to a call on one of these operations also are included. Similarly, operations which call one of these as part of their definition are included. * it reorders the elements of T, that is, it calls the Splice or Splice_Children procedures; or * it finalizes T; or * it calls Assign with T as the Target parameter; or * it calls the Move procedure with T as a parameter. AARM Reason: Swap copies elements rather than reordering them, so it doesn't tamper with cursors. A subprogram is said to tamper with elements of a tree object T if: * it tampers with cursors of T; or * it replaces one or more elements of T, that is, it calls the Replace_Element or Swap procedures with T as a parameter. AARM Reason: Complete replacement of an element can cause its memory to be deallocated while another operation is holding onto a reference to it. That can't be allowed. However, a simple modification of (part of) an element is not a problem, so Update_Element does not cause a problem. function Equal_Subtree (Left_Position : Cursor; Right_Position: Cursor) return Boolean; If Left_Position or Right_Position equals No_Element, propagates Constraint_Error. If the number of child nodes of the element designated by Left_Position is different than the number of child nodes of the element designated by Right_Position, the function returns False. If Left_Position designates a root node and Right_Position does not, the function returns False. If Right_Position designates a root node and Left_Position does not, the function returns False. Unless both cursors designate a root node, the elements are compared using the generic formal equality operator. If the result is False, the function returns False. Otherwise, it calls Equal_Subtree on a cursor designating each child element of the element designated by Left_Position and a cursor designated the corresponding child element of the element deisgnated by Right_Position. If any such call returns False, the function returns False; otherwise it returns True. Any exception raised during the evaluation of element equality is propagated. [Editor's note: I needed this to define "=" on Trees, as a recursive definition is the only one that makes sense.] AARM Ramification: Left_Position and Right_Position do not need to be from the same tree. AARM Implementation Note: This wording describes the canonical semantics. However, the order and number of calls on the formal equality function is unspecified for all of the operations that use it in this package, so an implementation can call it as many or as few times as it needs to get the correct answer. Similarly, a global rule (see the introduction of Annex A) says that language-defined routines are not affected by overriding of other language-defined routines. This means that no reasonable program can tell how many times Equal_Subtrees is called, and thus an implementation can call it as many or as few times as it needs to get the correct answer. Specifically, there is no requirement to call the formal equality or Equal_Subtrees additional times once the answer has been determined. function "=" (Left, Right : Tree) return Boolean; If Left and Right denote the same tree object, then the function returns True. Otherwise, it calls Equal_Subtree with a cursor designating the root node of Left and Right; the result is returned. Any exception raised during the evaluation of Equal_Subtree is propagated. AARM Implementation Note: Similar considerations apply here as apply to Equal_Subtrees. The actual number of calls performed is unspecified. function Node_Count (Container : Tree) return Count_Type; Node_Count returns the number of nodes in Container. AARM Ramification: Since all tree objects have a root node, this can never return a value of 0. Node_Count (Some_Tree) should always equal Subtree_Node_Count (Root (Some_Tree)). function Subtree_Node_Count (Position : Cursor) return Count_Type; If Position is No_Element, Subtree_Count returns 0; otherwise, Subtree_Node_Count returns the number of nodes in the subtree that is rooted by Position. function Is_Empty (Container : Tree) return Boolean; Equivalent to Count (Container) = 1. AARM Ramification: An empty tree contains just the root node. function Depth (Position : Cursor) return Count_Type; If Position equals No_Element, Depth returns 0; otherwise Depth returns the number of ancestor nodes of the element at Position (including the element itself). AARM Ramification: Depth (Root (Some_Tree)) = 1. [Editor's note: Matt suggests that we have this function as well as Child_Depth because this is the most common case. We can't default the Parent parameter in Child_Depth since the correct value depends on the container. Should we ignore the root node here and return a Depth of 0 in the previous case?] function Is_Root (Position : Cursor) return Boolean; Is_Root returns True if the Position designates the root node of the tree; and returns False otherwise. function Is_Leaf (Position : Cursor) return Boolean; Is_Leaf returns True if Position designates an element that does not have any child nodes; and returns False otherwise. AARM Ramification: Is_Leaf returns False if passed No_Element, as No_Element does not designate an element. Is_Leaf can be passed a cursor that designates the root node; Is_Leaf will return True if passed the root node of an empty tree. function Root (Container : Tree) return Cursor; Root returns a cursor that designates the root node of Container. AARM Ramification: There is always a root node, even in an empty container, so this function never returns No_Element. procedure Clear (Container : in out Tree); Removes all the elements from Container. AARM Ramification: The root node is not removed; all trees have a root node. function Element (Position : Cursor) return Element_Type; If Position equals No_Element, then Constraint_Error is propagated; if Position designates the root node of a tree, then Program_Error is propagated. Otherwise, Element returns the element designated by Position. AARM Ramification: The root node does not contain an element, so that value cannot be read or written. procedure Replace_Element (Container : in out Tree; Position : in Cursor; New_Item : in Element_Type); If Position equals No_Element, then Constraint_Error is propagated; if Position does not designate an element in Container (including if it designates the root node), then Program_Error is propagated. Otherwise Replace_Element assigns the value New_Item to the element designated by Position. procedure Query_Element (Position : in Cursor; Process : not null access procedure (Element : in Element_Type)); If Position equals No_Element, then Constraint_Error is propagated; if Position designates the root node of a tree, then Program_Error is propagated. Otherwise, Query_Element calls Process.all with the element designated by Position as the argument. Program_Error is propagated if Process.all tampers with the elements of Container. Any exception raised by Process.all is propagated. procedure Update_Element (Container : in out Tree; Position : in Cursor; Process : not null access procedure (Element : in out Element_Type)); If Position equals No_Element, then Constraint_Error is propagated; if Position does not designate an element in Container (including if it designates the root node), then Program_Error is propagated. Otherwise Update_Element calls Process.all with the element designated by Position as the argument. Program_Error is propagated if Process.all tampers with the elements of Container. Any exception raised by Process.all is propagated. If Element_Type is unconstrained and definite, then the actual Element parameter of Process.all shall be unconstrained. procedure Assign (Target : in out Tree; Source : Tree); If Target denotes the same object as Source, the operation has no effect. Otherwise, it calls Clear (Target), and then each element of Source is assigned to a corresponding element in Target. AARM To Be Honest: The "corresponding element in Target" has a parent element that corresponds to the parent element of the Source element, and has child elements that correspond to the child elements of the Source element. function Copy (Source : Tree) return Tree; Returns a tree with the same structure as Source and whose elements are initialized from the corresponding elements of Source. procedure Move (Target : in out Tree; Source : in out Tree); If Target denotes the same object as Source, then Move has no effect. Otherwise, Move first calls Clear (Target). Then, the nodes other than the root node in Source are moved to Target (in the same positions). After Move completes, Count(Target) is the number of nodes originally in Source, and Count(Source) is 1. procedure Delete_Leaf (Container : in out Tree; Position : in out Cursor); If Position equals No_Element, then Constraint_Error is propagated; if Position does not designate an element in Container (including if it designates the root node), then Program_Error is propagated. If the element designated by position has any child elements, then Constraint_Error is propagated. Otherwise Delete_Leaf removes (from Container) the element designated by Position. Finally, Position is set to No_Element. AARM Ramification: The check on Position checks that the cursor does not belong to some other Container. This check implies that a reference to the container is included in the cursor value. This wording is not meant to require detection of dangling cursors; such cursors are defined to be invalid, which means that execution is erroneous, and any result is allowed (including not raising an exception). The root node cannot be deleted. End AARM Ramification. procedure Delete_Subtree (Container : in out Tree; Position : in out Cursor); If Position equals No_Element, then Constraint_Error is propagated. If Position does not designate an element in Container (including if it designates the root node), then Program_Error is propagated. Otherwise Delete removes (from Container) the subtree designated by Position (that is, the node designated by Position and all of the descendant nodes of that node), and Position is set to No_Element. AARM Ramification: The root node cannot be deleted. To delete the entire contents of the tree, call Clear(Container). procedure Swap (Container : in out Tree; I, J : in Cursor); If either I or J equals No_Element, then Constraint_Error is propagated. If either I or J do not designate an element in Container (including if either designates the root node), then Program_Error is propagated. Otherwise, Swap exchanges the values of the elements designated by I and J. AARM Ramification: After a call to Swap, I designates the element value previously designated by J, and J designates the element value previously designated by I. The elements position do not change; for instance, the parent node and the first child node of I are unchanged by the operation. The root nodes do not contain element values, so they cannot be swapped End AARM Ramification. AARM To be honest: The implementation is not required to actually copy the elements if it can do the swap some other way. But it is allowed to copy the elements if needed. function Find (Container : Tree; Item : Element_Type) return Cursor; Find searches the elements of Container for an element equal to Item (using the generic formal equality operator). The search starts at the root node. The The search checks the tree in a depth-first order. If no equal element is found, then Find returns No_Element. Otherwise, it returns a cursor designating the first equal element encountered. function Find_In_Subtree (Container : Tree; Item : Element_Type; Position : Cursor) return Cursor; If Position equals No_Element, then Constraint_Error is propagated; if Position does not designate a node in Container, then Program_Error is propagated. Find_In_Subtree searches a subtree of the elements of Container for an element equal to Item (using the generic formal equality operator). The search starts at the element designated by Position. The search checks the subtree rooted by Position in a depth-first order. If no equal element is found, then Find returns No_Element. Otherwise, it returns a cursor designating the first equal element encountered. AARM Ramification: Find_In_Subtree does not check any siblings of the element designated by Position. The root node does not contain an element, and therefore it can never be returned, but it can be explicitly passed to Position. [Editor's note: The subtle difference in wording for belonging to the correct container is needed as a cursor designating the root node is allowed here, and it does not "designate an element" (anywhere), but it does "designate a node" in Container.] function Ancestor_Find (Container : Tree; Item : Element_Type; Position : Cursor) return Cursor; If Position equals No_Element, then Constraint_Error is propagated; if Position does not designate an element in Container (including if it designates the root node), then Program_Error is propagated. Otherwise, Ancestor_Find searches for an element equal to Item (using the generic formal equality operator). The search starts at the element designated by Position, and checks each ancestor proceeding toward the root of the subtree. If no equal element is found, then Ancestor_Find returns No_Element. Otherwise, it returns a cursor designating the first equal element encountered. function Contains (Container : Tree; Item : Element_Type) return Boolean; Equivalent to Find (Container, Item) /= No_Element. function Has_Element (Position : Cursor) return Boolean; Returns True if Position designates an element, and returns False otherwise. Redundant[This returns False if the cursor designates the root node or equals No_Element.] AARM To Be Honest: This function might not detect cursors that designate deleted elements; such cursors are invalid (see below) and the result of Has_Element for an invalid cursor is unspecified (but not erroneous). procedure Iterate (Container : in Tree; Process : not null access procedure (Position : in Cursor)); Iterate calls Process.all with a cursor that designates each element in Container, starting with the root node and proceeding in a depth-first order. Program_Error is propagated if Process.all tampers with the cursors of Container. Any exception raised by Process.all is propagated. AARM Ramification: Process is not called with the root node, which does not have an associated element. [Editor's note: It is a bit weird to skip the root node, but it seems better that every cursor has an associated element. The alternative that the root node is returned would require most Process implementations to test Is_Root explicitly before doing any element operations.] AARM Implementation Note: The purpose of the tamper with cursors check is to prevent erroneous execution from the Position parameter of Process.all becoming invalid. This check takes place when the operations that tamper with the cursors of the container are called. The check cannot be made later (say in the body of Iterate), because that could cause the Position cursor to be invalid and potentially cause execution to become erroneous -- defeating the purpose of the check. See Iterate for vectors (A.18.2) for a suggested implementation of the check. End AARM Implementation Note. function Child_Count (Parent : Cursor) return Count_Type; Child_Count returns the number of child nodes of the node designated by Parent. function Child_Depth (Parent, Child : Cursor) return Count_Type; If Child or Parent is equal to No_Element, then Constraint_Error is propagated. Otherwise, Child_Depth returns the number of ancestor nodes of Child (including Child itself), up to but not including Parent; in this case Program_Error is propagated if Parent is not an ancestor of Child. AARM Ramification: Program_Error is propagated if Parent and Child are nodes in different containers. Child_Depth (Root (Some_Tree), Child) + 1 = Depth (Child) as the root is not counted. End AARM Ramification. procedure Insert_Child (Container : in out Tree; Parent : in Cursor; Before : in Cursor; New_Item : in Element_Type; Count : in Count_Type := 1); If Parent equals No_Element, the Constraint_Error is propagated. If Parent does not designate a node in Container, then Program_Error is propagated. If Before is not equal to No_Element, and does not designate a node in Container, then Program_Error is propagated. If Before is not equal to No_Element, and Container.Parent (Before) is not equal to Parent, then Constraint_Error is propagated. Otherwise, Insert_Child allocates Count nodes containing copies of New_Item and inserts them as children of Parent prior to the element designated by Before. If Before equals No_Element, the new nodes are inserted after the last child node of Parent (if any). Any exception raised during allocation of internal storage is propagated, and Container is not modified. procedure Insert_Child (Container : in out Tree; Parent : in Cursor; Before : in Cursor; New_Item : in Element_Type; Position : out Cursor; Count : in Count_Type := 1); If Parent equals No_Element, the Constraint_Error is propagated. If Parent does not designate a node in Container, then Program_Error is propagated. If Before is not equal to No_Element, and does not designate a node in Container, then Program_Error is propagated. If Before is not equal to No_Element, and Container.Parent (Before) is not equal to Parent, then Constraint_Error is propagated. Otherwise, Insert_Child allocates Count nodes containing copies of New_Item and inserts them as children of Parent prior to the element designated by Before. If Before equals No_Element, the new nodes are inserted after the last child node of Parent (if any). Position designates the first newly-inserted node. Any exception raised during allocation of internal storage is propagated, and Container is not modified. procedure Insert_Child (Container : in out Tree; Parent : in Cursor; Before : in Cursor; Position : out Cursor; Count : in Count_Type := 1); If Parent equals No_Element, the Constraint_Error is propagated. If Parent does not designate a node in Container, then Program_Error is propagated. If Before is not equal to No_Element, and does not designate a node in Container, then Program_Error is propagated. If Before is not equal to No_Element, and Container.Parent (Before) is not equal to Parent, then Constraint_Error is propagated. Otherwise, Insert_Child allocates Count nodes and inserts them as children of Parent prior to the element designated by Before. If Before equals No_Element, the new elements are inserted after the last child node of Parent (if any). The elements contained in the new nodes are initialized by default (see 3.3.1). Position designates the first newly-inserted node. Any exception raised during allocation of internal storage is propagated, and Container is not modified. procedure Prepend_Child (Container : in out Tree; Parent : in Cursor; New_Item : in Element_Type; Count : in Count_Type := 1); Equivalent to Insert_Child (Container, Parent, First_Child (Container, Parent), New_Item, Count). procedure Append_Child (Container : in out Tree; Parent : in Cursor; New_Item : in Element_Type; Count : in Count_Type := 1); Equivalent to Insert_Child (Container, Parent, Last_Child (Container, Parent), New_Item, Count). procedure Delete_Children (Container : in out Tree; Parent : in Cursor); If Parent equals No_Element, then Constraint_Error is propagated. If Parent does not designate a node in Container, Program_Error is propagated. Otherwise, Delete_Children removes (from Container) all of the child elements of Parent along with their dependent elements. AARM Discussion: This routine deletes all of the child subtrees of Parent at once. Use Delete_Subtree to delete an individual subtree. [Editor's note: We don't provide a Delete_Leaf_Children (which would only delete leaf nodes), even though we do provide such a routine for the entire tree, as it would require pretesting the nodes to verify that they are leaves before doing any deletions, eliminating any performance advantage.] procedure Copy_Subtree (Target : in out Tree; Parent : in Cursor; Before : in Cursor; Source : in Cursor); If Parent equals No_Element, the Constraint_Error is propagated. If Parent does not designate a node in Target, then Program_Error is propagated. If Before is not equal to No_Element, and does not designate a node in Target, then Program_Error is propagated. If Before is not equal to No_Element, and Target.Parent (Before) is not equal to Parent, then Constraint_Error is propagated. If Source designates a root node, Constraint_Error is propagated. If Source is equal to No_Element, then the operation has no effect. Otherwise, the subtree rooted by Source (which can be from any tree; it does not have to be a subtree of Target) is copied (new nodes are allocated to create a new subtree with the same structure as the Source subtree, with each element initialized from the corresponding element of the Source subtree) and and inserted into Target immediately prior to Before, or, if Before equals No_Element, after the last child node of Parent (if any). The parent of the newly created subtree is set to Parent, and the overall count of Target is incremented by Subtree_Count (Source). AARM Discussion: We only need one routine here, as the source object is not modified, so we can use the same routine for both copying within and between containers. AARM Ramification: We do not allow copying a subtree that includes a root node, as that would requiring inserting a node with no value in the middle of the target tree. To copy an entire tree to another tree object, use Copy. [Editor's note: Normally, we would call the Source parameter "Position"; and Target would be "Container" but that seems to imply that the same container is required for the Source parameter (which it is not).] procedure Splice_Subtree (Target : in out Tree; Parent : in Cursor; Before : in Cursor; Source : in out Tree; Position : in out Cursor); If Parent equals No_Element, the Constraint_Error is propagated. If Parent does not designate a node in Target, then Program_Error is propagated. If Before is not equal to No_Element, and does not designate a node in Target, then Program_Error is propagated. If Before is not equal to No_Element, and Target.Parent (Before) is not equal to Parent, then Constraint_Error is propagated. If Position equals No_Element, Constraint_Error is propagated. If Position does not designate a node in Source or designates a root node, then Program_Error is propagated. If Source denotes the same object as Target, then there is no effect; if Position designates an ancestor of Parent or is equal to Parent, Constraint_Error is propagated; else the subtree rooted by the element designated by Position is moved immediately prior to Before, or, if Before equals No_Element, after the last child node of Parent (if any). In each of these cases, Position and the count of Target are unchanged, and the parent of the element designated by Position is set to Parent. AARM Reason: We can't allow moving the subtree of Position to a descendant node of the subtree, as the descendant node will be part of the subtree being moved. The result would be a circularly linked tree, or one with inaccessible nodes. Thus we have to check Position against Parent, even though such a check is O(Depth(Source)). End AARM Reason. Otherwise, the subtree designated by Position is removed from Source and moved to Target. The subtree is inserted as a child of Parent prior to the element designated by Before. If Before equals No_Element, the subtree is inserted after the last child node of Parent (if any). In each of these cases, the count of Target is incremented by Subtree_Count (Position), and the count of Source is decremented by Subtree_Count (Position), Position is updated to represent an element in Target. AARM Ramification: If Source is the same as Target, and Position = Before, or Next_Sibling(Position) = Before, Splice has no effect, as the subtree does not have to move to meet the postcondition. We do not allow splicing a subtree that includes a root node, as that would requiring inserting a node with no value in the middle of the target tree. Splice the children of the root node instead. For this reason there is no operation to splice an entire tree, as that would necessarily involve splicing a root node. End AARM Ramification. procedure Splice_Subtree (Container: in out Tree; Parent : in Cursor; Before : in Cursor; Position : in Cursor); If Parent equals No_Element, the Constraint_Error is propagated. If Parent does not designate a node in Container, then Program_Error is propagated. If Before is not equal to No_Element, and does not designate a node in Container, then Program_Error is propagated. If Before is not equal to No_Element, and Target.Parent (Before) is not equal to Parent, then Constraint_Error is propagated. If Position equals No_Element or designates a root node, Constraint_Error is propagated. If Position does not designate a node in Container, then Program_Error is propagated. If Position equals Before, there is no effect. If Position designates an ancestor of Parent or is equal to Parent, Constraint_Error is propagated. Otherwise, the subtree rooted by the element designated by Position is moved immediately prior to Before, or, if Before equals No_Element, after the last child node of Parent (if any). The parent of the element designated by Position is set to Parent. AARM Reason: We can't allow moving the subtree of Position to a descendant node of the subtree, as the descendant node will be part of the subtree being moved. procedure Splice_Children (Target : in out Tree; Target_Parent : in Cursor; Before : in Cursor; Source : in out Tree; Source_Parent : in Cursor); If Target_Parent equals No_Element, then Constraint_Error is propagated. If Target_Parent does not designate a node in Target, then Program_Error is propagated. If Before is not equal to No_Element, and does not designate an element in Target, then Program_Error is propagated. If Source_Parent equals No_Element, Constraint_Error is propagated. If Source_Parent does not designate a node in Source, then Program_Error is propagated. If Before is not equal to No_Element, and Target.Parent (Before) is not equal to Target_Parent, then Constraint_Error is propagated. If Source denotes the same object as Target, then: * if Target_Parent equals Source_Parent there is no effect; else * if Source_Parent is an ancestor of Target_Parent, then Constraint_Error is propagated; else * the child elements (and their descendants) of Source_Parent are moved to be child elements of Target_Parent immediately prior to Before, or, if Before equals No_Element, after the last child node of Target_Parent (if any). The parent of each moved child element is set to Target_Parent. AARM Reason: We can't allow moving the children of Source_Parent to a descendant node, as the descendant node will be part of one of the subtrees being moved. Otherwise (if Source does not denote the same object as Target), the child elements (and their descendants) of Source_Parent are removed from Source and moved to Target. The child elements are inserted as children of Target_Parent prior to the element designated by Before. If Before equals No_Element, the child elements are inserted after the last child node of Target_Parent (if any). In each of these cases, the overall count of Target is incremented by Subtree_Count (Source_Parent)-1, and overall count of Source is decremented by Subtree_Count (Source_Parent)-1. AARM Ramification: The node designated by Source_Parent is not moved, thus we never need to update Source_Parent. Move (Target, Source) could be written Splice_Children (Target, Target.Root, No_Element, Source, Source.Root); End AARM Ramification. procedure Splice_Children (Container : in out Tree; Target_Parent : in Cursor; Before : in Cursor; Source_Parent : in Cursor); If Target_Parent equals No_Element, then Constraint_Error is propagated. If Target_Parent does not designate a node in Container, then Program_Error is propagated. If Before is not equal to No_Element, and does not designate an element in Container, then Program_Error is propagated. If Source_Parent equals No_Element, Constraint_Error is propagated. If Source_Parent does not designate a node in Container, then Program_Error is propagated. If Before is not equal to No_Element, and Container.Parent (Before) is not equal to Target_Parent, then Constraint_Error is propagated. If Target_Parent equals Source_Parent there is no effect. If Source_Parent is an ancestor of Target_Parent, then Constraint_Error is propagated. Otherwise the child elements (and their descendants) of Source_Parent are moved to be child elements of Target_Parent immediately prior to Before, or, if Before equals No_Element, after the last child node of Target_Parent (if any). The parent of each moved child element is set to Target_Parent. function Parent (Position : Cursor) return Cursor; If Position is equal to No_Element or designates a root node, No_Element is returned. Otherwise, a cursor designating the parent node of the node designated by Position is returned. function First_Child (Parent : Cursor) return Cursor; If Parent is equal to No_Element, then Constraint_Error is propagated. Otherwise, First_Child returns a cursor designating the first child node of the node designated by Parent; if there is no such node, No_Element is returned. function First_Child_Element (Parent : Cursor) return Element_Type; Equivalent to Element (First_Child (Parent)). function Last_Child (Parent : Cursor) return Cursor; If Parent is equal to No_Element, then Constraint_Error is propagated. Otherwise, Last_Child returns a cursor designating the last child node of the node designated by Parent; if there is no such node, No_Element is returned. function Last_Child_Element (Parent : Cursor) return Element_Type; Equivalent to Element (Last_Child (Parent)). function Next_Sibling (Position : Cursor) return Cursor; If Position equals No_Element or designates the last child node of its parent, then Next_Sibling returns the value No_Element. Otherwise, it returns a cursor that designates the successor (with the same parent) of the node designated by Position. function Previous_Sibling (Position : Cursor) return Cursor; If Position equals No_Element or designates the first child node of its parent, then Previous_Sibling returns the value No_Element. Otherwise, it returns a cursor that designates the predecessor (with the same parent) of the node designated by Position. procedure Next_Sibling (Position : in out Cursor); Equivalent to Position := Next_Sibling (Position); procedure Previous_Sibling (Position : in out Cursor); Equivalent to Position := Previous_Sibling (Position); procedure Iterate_Children (Container : in Tree; Parent : in Cursor; Process : not null access procedure (Position : in Cursor)); If Parent equals No_Element, then Constraint_Error is propagated. If Parent does not designate a node in Container, then Program_Error is propagated. Iterate_Children calls Process.all with a cursor that designates each child node of Parent, starting with the first child node and moving the cursor as per the Next_Sibling function. Program_Error is propagated if Process.all tampers with the cursors of Container. Any exception raised by Process.all is propagated. procedure Reverse_Iterate_Children (Container : in Tree; Parent : in Cursor; Process : not null access procedure (Position : in Cursor)); If Parent equals No_Element, then Constraint_Error is propagated. If Parent does not designate a node in Container, then Program_Error is propagated. Iterate_Children calls Process.all with a cursor that designates each child node of Parent, starting with the last child node and moving the cursor as per the Previous_Sibling function. Program_Error is propagated if Process.all tampers with the cursors of Container. Any exception raised by Process.all is propagated. Bounded (Run-Time) Errors It is a bounded error for the actual function associated with a generic formal subprogram, when called as part of an operation of this package, to tamper with elements of any Tree parameter to the operation. Either Program_Error is raised, or the operation works as defined on the value of the Tree either prior to, or subsequent to, some or all of the modifications to the Tree. It is a bounded error to call any subprogram declared in the visible part of Containers.Multiway_Trees when the associated container has been finalized. If the operation takes Container as an in out parameter, then it raises Constraint_Error or Program_Error. Otherwise, the operation either proceeds as it would for an empty container, or it raises Constraint_Error or Program_Error. Erroneous Execution A Cursor value is invalid if any of the following have occurred since it was created: * The tree that contains the element it designates has been finalized; * The tree that contains the element it designates has been used as the Source or Target of a call to Move; * The tree that contains the element it designates has been used as the Target of a call to Assign or the target of an assignment_statement; * The element it designates has been removed from the tree that previously contained the element. AARM Discussion: We talk about which tree the element was removed from in order to handle splicing nodes from one tree to another. The node still exists, but any cursors that designate it in the original tree are now invalid. This bullet covers removals caused by calls to Clear, Delete_Leaf, Delete_Subtree, Delete_Children, Splice_Children, and Splice_Subtree. The result of "=" or Has_Element is unspecified if it is called with an invalid cursor parameter. Execution is erroneous if any other subprogram declared in Containers.Multiway_Trees is called with an invalid cursor parameter. AARM Discussion: The list above is intended to be exhaustive. In other cases, a cursor value continues to designate its original element (or the root node). For instance, cursor values survive the insertion and deletion of other nodes. While it is possible to check for these cases, in many cases the overhead necessary to make the check is substantial in time or space. Implementations are encouraged to check for as many of these cases as possible and raise Program_Error if detected. Implementation Requirements No storage associated with a multiway tree object shall be lost upon assignment or scope exit. The execution of an assignment_statement for a tree shall have the effect of copying the elements from the source tree object to the target tree object. AARM Implementation Note: An assignment of a Tree is a "deep" copy; that is the elements are copied as well as the data structures. We say "effect of" in order to allow the implementation to avoid copying elements immediately if it wishes. For instance, an implementation that avoided copying until one of the containers is modified would be allowed. Implementation Advice Containers.Multiway_Trees should be implemented similarly to a multiway tree. In particular, if N is the overall number of nodes for a particular tree, then the worst-case time complexity of Element, Parent, First_Child, Last_Child, Insert_Child with Count=1, and Delete should be O(log N). AARM Reason: We do not mean to overly constrain implementation strategies here. However, it is important for portability that the performance of large containers has roughly the same factors on different implementations. If a program is moved to an implementation that takes O(N) time to access elements, that program could be unusable when the trees are large. We allow O(log N) access because the proportionality constant and caching effects are likely to be larger than the log factor, and we don't want to discourage innovative implementations. Move should not copy elements, and should minimize copying of internal data structures. AARM Implementation Note: Usually that can be accomplished simply by moving the pointer(s) to the internal data structures from the Source container to the Target container. If an exception is propagated from a tree operation, no storage should be lost, nor any elements removed from a tree unless specified by the operation. AARM Reason: This is important so that programs can recover from errors. But we don't want to require heroic efforts, so we just require documentation of cases where this can't be accomplished. A.18.YY The Package Containers.Indefinite_Multiway_Trees The language-defined generic package Containers.Indefinite_Multiway_Trees provides private types Tree and Cursor, and a set of operations for each type. It provides the same operations and semantics as the package Containers.Multiway_Trees (see A.18.xx), with the difference that the generic formal Element_Type is indefinite. Static Semantics The declaration of the generic library package Containers.Indefinite_Multiway_Trees has the same contents and semantics as Containers.Multiway_Trees except: * The generic formal Element_Type is indefinite. * The procedure with the profile: procedure Insert_Child (Container : in out Tree; Parent : in Cursor; Before : in Cursor; Position : out Cursor; Count : in Count_Type := 1); is omitted. AARM Discussion: This procedure is omitted because there is no way to create a default-initialized object of an indefinite type. We considered having this routine insert an empty element similar to the empty elements of a vector, but rejected this possibility because the semantics are fairly complex and very different from the existing case. That would make it more error-prone to convert a container from a definite type to an indefinite type; by omitting the routine completely, any problems will be diagnosed by the compiler. The actual Element parameter of access subprogram Process of Update_Element may be constrained even if Element_Type is unconstrained. A.18.ZZ The Package Containers.Bounded_Multiway_Trees The language-defined generic package Containers.Bounded_Multiway_Trees provides a private type Tree and a set of operations. It provides the same operations as the package Containers.Multiway_Trees (see A.18.XX), with the difference that the maximum storage is bounded. Static Semantics The declaration of the generic library package Containers.Bounded_Multiway_Trees has the same contents and semantics as Containers.Multiway_Trees except: * pragma Preelaborate is replaced with pragma Pure. AARM implementation note: Package Containers.Bounded_Multiway_Trees cannot depend on package Ada.Finalization (because that package has Preelaborate categorization). * The type Tree is declared with a discriminant that specifies the capacity (maximum number of elements) as follows: type Tree (Capacity : Count_Type) is tagged private; * The type Tree needs finalization if and only if type Element_Type needs finalization. * The allocation of internal storage includes a check that the capacity is not exceeded, and Capacity_Error is raised if this check fails. * In procedure Assign, if Source length is greater than Target capacity, then Capacity_Error is propagated. * Function Copy is declared as follows: function Copy (Source : Tree; Capacity : Count_Type := 0) return List; If Capacity is 0, then the tree capacity is the count of Source; if Capacity is equal to or greater than Source.Count, the tree capacity equals the value of the Capacity parameter; otherwise, the operation propagates Capacity_Error. * In the five-parameter procedure Splice_Subtree, if Source is not the same object as Target, if the sum of Target.Count and Subtree_Count (Position) is greater than Target.Capacity, then Splice_Subtree propagates Capacity_Error. * In the five-parameter procedure Splice_Children, if Source is not the same object as Target, if the sum of if Target.Count and Subtree_Count (Source_Parent)-1 is greater than Target.Capacity, then Splice_Children propagates Capacity_Error. Bounded Errors It is a bounded error to use a tree if it was the target of an assignment_statement whose source was in the middle of an operation that disallows tampering with cursors [Redundant: or elements]. Either Program_Error is raised, or the operation proceeds as defined. Implementation Advice Bounded tree objects should be implemented without implicit pointers or dynamic allocation. The implementation advice for procedure Move to minimize copying does not apply. !discussion The author has found that almost all of his large projects contained a least one multiway tree, as it is an ideal structure for representing nesting. Flatter structures such as the existing containers do not handle nesting well. The name "multiway tree" seems to be the most common one for this data structure in the literature. (It sometimes is written "multi-way tree".) We do not provide an indexing method in this tree; this is purely a structuring data structure. If indexing is needed, a set or map container (already provided by the Ada Containers library) is probably more appropriate. Model: The basic model is that every element in the tree has an associated doubly-linked list of children. Thus we provide almost all of the operations of the list container, each with an added Parent parameter. We did not change the basic names of the operations for familiarity. We added "_Child" to each name in order to describe clearly what is being accessed. We considered leaving the names the same (without the "_Child" suffix), but that seemed confusing, as it would be easy to believe that First always represents the first node of the root of the tree. Having "_Child" as part of the names also opens up the possibility of operations not having the "_Child" suffix. These would naturally operate on the entire tree. We're taken advantage of this to define operations like Find and Iterate over the entire tree. We've added the suffix "_Subtree" to a few operations to make it clear that they operate on entire subtrees, as opposed to a single element. While this isn't strictly necessary, it should enhance readability of code. Elements are stored in internal nodes (as in the linked list container). There is a built-in root node in each container. The root node does not contain an element value (in order to avoid complications with uninitialized values in the indefinite container forms), and it cannot be deleted. A function Root is provided to retrieve a cursor to the root node. The provision of the root node eliminates many special cases, as it is always the case that the parent cursor of an element containing node is a real cursor containing an (implicit) reference to the container. However, the root node cannot participate in any operations that require element values, including copying or splicing of subtrees. The effect is that the root node contains a list of child subtrees which are really the interesting trees. Thus the children of the root node can be thought of as a forest of trees. This organization allows bottom-up tree creation where subtrees are stashed as separate subtrees of the root until their parent is created (at which time Slice_Children or Slice_Subtree would be used to make it a subtree of the appropriate parent). Without this capability, bottom up tree creation would require the creation of many container objects (each with overhead) and potentially would require copying of the subtrees and elements from one container to another with each step (this is certainly the case for a bounded tree container). We considered using No_Element to represent the root. This however required passing the container to a number of operations like First_Child, and also caused additional wording complications. Note that we added the Assign and Copy routines here as proposed in AI05-0001-1. If we decide not to do that for the other containers, they should be removed here as well. Similarly, we define a bounded form here; in the unlikely event that we decide not to adopt AI05-0001-1, it should be removed here. Operations that were dropped from the list container are: * Delete_First and Delete_Last: These seem unlikely to occur in the context of a tree (as opposed to a sequence like a list), and they're very easy to write yourself if necessary. [Editor's Note: Dropping these are all Matt's idea. :-) Since the names aren't the same anyway (they'd be Delete_First_Child and Delete_Last_Child) the consistency argument doesn't seem to hold much water.] * Swap_Links: Swapping within a single sibling list doesn't seem very useful. Swapping such that the parents get changed on the nodes can be a nightmare to describe and implement correctly: consider swapping two nodes where one is the parent of the other, and additional children are also involved. Exactly which elements end up where?? We do provide Swap, which just swaps the elements and leaves the nodes alone; that doesn't have definitional issues. * Reverse_Find: This is a strange operation, which would be hard to define sensibly. Rather, we provide Ancestor_Find, which has a clear definition. * Reverse_Elements: This seems like a strange operation operating on a single list of children. It makes no sense on the entire container. So we don't provide it. * Sorting operations: It seems weird to sort just a single list of child elements. And it also doesn't make any sense on the entire container. More on the value-less root model: Since the top-level subtrees can be thought of as a forest of trees, it would be easy to build an implementation of the XML DOM directly using one of these tree containers. The XML DOM includes the capability of inserting elements without including them in the main tree. Such "orphan" elements could be modeled as children of the root node, with the first child representing the main tree. We hope that the valueless root model provides the right balance between the orphan model used by the XML DOM and a "classic" single-rooted tree. It appears to keep special cases to a minimum, while not straying too far from the typical terminology of trees. A pure single-roots model seems cleaner on first reflection, but in practice it does not simplify either the wording nor the use of the container. In addition, it can make some operations much more complex. We want the tree container to allow operations as naturally as possible, extra complications are more likely to cause users to abandon the container. If we defined the tree using traditional node insertion, then we would have to have rules for an empty tree and in addition we would need rules to prevent inserting a second root (this would be necessary for at least the various Inserts, Splices, Copy_Subtree). But this somewhat silly, given that the first operation on every tree object would have to be to insert the one-and-only root. The alternative of having a root node that is always present, but has an optional value. In this case, inserting any node at the root is always illegal (and can easily be prevented by simply not allowing the Parent parameter to ever be No_Element). But then, we would have to define the idea of an uninitialized element for just this one node, and we would probably want a way to "unset" the value of this node. Another alternative is to have the root node always present, and have it be default initialized. This would not have as much wording complication, but it would have some annoying effects: * A empty tree would contain one valid element (which could be read); * The value of this extra node would often need to be ignored (if a forest of subtrees is being used) or replaced - this would add many additional opportunities for errors; * The indefinite container form could not use this solution, as there is no default initial value for an indefinite object. The only solution for the last problem would be to revert to the previous alternative for the indefinite forms, making the various forms annoyingly different. In addition, single rooted trees with values have a serious usage problem. That occurs because replacing the root is impossible. The effect is that a tree transformation that needs to change the root is complicated, and a general purpose transformation usually will have to have a special case just to handle the root. One obvious example is the bottom-up tree construction noted earlier. In order to do that in a single-rooted tree, we would have to either use many tree objects (each with associated overhead, and in the case of bounded trees, each Splice_Subtree would have to copy the nodes and elements as well), or we would have to use dummy nodes to construct onto which have to be ignored in later operations. (They couldn't be deleted, because the root node of a single-rooted tree cannot be deleted. You might be able to swap them with the real root node [if that is allowed], but deleting the dummy after that would require relinking all of the child nodes of the real root before the dummy could be deleted, a complex operation.) One example problem is that the dummy nodes would get in the way if/when trees are combined. For instance, consider the expression trees in an Ada compiler. The default expressions of a procedure are likely to be stored in the symbol table; if constructed bottom-up (as they would be using a parser generator like YACC) they would have to have a dummy node at the top if the tree is single rooted. When the default expression is later spliced into a call tree, the dummy node would end up in the middle of the call tree. Care would have to be taken with every operation to avoid this happening. When the root node does not have a value, it cannot be spliced or copied into any tree, so the problem with excess dummy nodes do not occur. Even more annoying would be the complications for doing tree transformations. Consider again the expression trees in an Ada compiler. One common transformation is to change a call to "/=" to not "=". (This transformation has to wait until the compiler knows that the "/=" has a Boolean type, as otherwise it could just be an ordinary function call.) This means that a "not" node has to be inserted before the "=" node. This is easily done with the current package: procedure Change_Not_Equals (Expr : in out Tree; Position : in Cursor) is My_Not : Cursor; begin Insert_Child (Expr, Parent => Parent (Position), Before => No_Element, New_Item => (Kind => A_Not), Position => My_Not); Splice_Subtree (Expr, Parent => My_Not, Before => No_Element, Position => Position); Replace_Element (Position, New_Item => (Kind => Equals)); end Change_Not_Equals; Since the current package does not allow the root node to have a value, Position cannot designate the root node (it cannot designate a call element, obviously). Thus Insert_Child will add a node somewhere in the tree, but it won't try to add a second root. More generally, tree transformations only involve nodes with values, so they will not (directly) involve the root node. However, this does not work for a single-rooted package allowing values if the Position happens to be at the root (as it would be for the condition of an if statement that consists of a single "/="). That's because inserting a second root is never allowed (else the tree would not be single rooted!). Thus the initial Insert_Child would fail. Instead, the code would have to resort to using unnatural operations like Swap and would have to move all of the children of the root node manually: Insert_Child (Expr, Parent => Position, Before => No_Element, New_Item => (Kind => A_Not), Position => My_Not); Swap (Position, My_Not); Splice_Subtree (Expr, Parent => My_Not, Before => No_Element, Position => First_Child (Position)); Splice_Subtree (Expr, Parent => My_Not, Before => No_Element, Position => First_Child (Position)); Replace_Element (My_Not, New_Item => (Kind => Equals)); This is more complex and much less natural than the initial code. (It would be much more complex if the number of children were not fixed as in this example - imagine a similar transformation on any function call.) It also ends up with the original cursor pointing to a different node than the original code (which might be significant). Further ideas: * More iterators could be useful. One could imagine wanting to visit the child nodes before their parents, as well as a straight breadth-first iteration. One possible way to (re)structure the overall iterator is to have two process routines, one that is called when a node is first visited, and the second after the children are visited. (That would allow a more direct mapping of the HTML example below.) This would look something like: procedure Iterate (Container : in Tree; Process_Before_Children : access procedure (Position : in Cursor); Process_After_Children : access procedure (Position : in Cursor)); (Note that these subprograms are allowed to be null, so that if one or the other isn't needed, it can be skipped.) The biggest objection to such an iterator is that it can't be mapped into for-loop-like syntax (one hopes that some future version of Ada will allow that). Advantages of the multiway tree container over hand-created trees: (1) Complete memory management, including for indefinite elements. (2) Avoidance of linking mistakes during operations like insertions and deletions. (3) More safety than hand-built structures -- tampering checks and possible dangling pointer (cursor) detection. (4) Completely linked (doubly-linked sibling lists; complete parent and child pointers), so walking in any direction in the tree is always O(1). (5) Built-in iterators. (6) Container users get streaming for free; streaming of a tree is fairly hard to do correctly. Disadvantages of the multiway tree container: (1) Not as familiar as some other containers; may not be used as much for that reason. (Authors can help aleviate this situation somewhat.) (2) Not as general as a general graph structure. But a general graph structure is much harder to describe, it is harder to manage memory for, and is not as commonly needed as a tree. (3) More memory management options are available for hand-created code. But that also provides a lot more opportunities for bugs. !examples (1) Representing HTML: (XML is similar) type Root_HTML_Element is tagged private; function Init_Element return Root_HTML_Element; procedure Output_Prefix (Element : Root_Element_Type); procedure Output_Suffix (Element : Root_Element_Type); type Text is Root_HTML_Element with record The_Text : Unbounded_String; end record; procedure Output_Prefix (Element : Bold_Element_Type); -- Writes The_Text procedure Output_Suffix (Element : Bold_Element_Type); -- Writes nothing. type Bold_HTML_Element is new Root_HTML_Element with null record; procedure Output_Prefix (Element : Bold_Element_Type); -- Writes procedure Output_Suffix (Element : Bold_Element_Type); -- Writes -- Similar overridings of Output_Prefix and Output_Suffix for each type; -- we won't show them here. type Italic_HTML_Element is new Root_HTML_Element with null record; type Div_HTML_Element is new Root_HTML_Element with record Class : Class_Name; Style : Style_String; -- other attributes. end record; type Span_HTML_Element is new Root_HTML_Element with record Class : Class_Name; Style : Style_String; -- other attributes. end record; type Header_HTML_Element is new Root_HTML_Element with null record; type Body_HTML_Element is new Root_HTML_Element with null record; -- and so on. package HTML_Tree is new Ada.Containers.Multiway_Trees (Root_HTML_Element'Class); -- The following HTML code:
Sample HTML
This is normal text; this is bold text; and this is italic text
-- would be represented as the following operations: Sample : HTML_Tree.Tree; Header, HTML_Body, Paragraph, Working: HTML_Tree.Cursor; HTML_Tree.Insert_Child (Sample, Parent => Sample.Root, Position => Header, Before => HTML_Tree.No_Element, New_Element => Header_HTML_Element'(Init_Element with null record)); HTML_Tree.Insert_Child (Sample, Parent => Header, Position => HTML_Body, Before => HTML_Tree.No_Element, New_Element => Body_HTML_Element'(Init_Element with null record)); HTML_Tree.Insert_Child (Sample, Parent => HTML_Body, Position => Paragraph, Before => HTML_Tree.No_Element, New_Element => Div_HTML_Element'(Init_Element with Class => "Text-Header", Style => "")); HTML_Tree.Append_Child (Sample, Parent => Paragraph, New_Element => Text'(Init_Element with The_Text => "Sample HTML")); HTML_Tree.Insert_Child (Sample, Parent => HTML_Body, Position => Paragraph, Before => HTML_Tree.No_Element, New_Element => Div_HTML_Element'(Init_Element with Class => "Text-Body", Style => "")); HTML_Tree.Append_Child (Sample, Parent => Paragraph, New_Element => Text'(Init_Element with The_Text => "This is normal text; ")); HTML_Tree.Insert_Child (Sample, Parent => Paragraph, Position => Working, Before => HTML_Tree.No_Element, New_Element => Bold_HTML_Element'(Init_Element with null record)); HTML_Tree.Append_Child (Sample, Parent => Working, New_Element => Text'(Init_Element with The_Text => "this is bold text")); HTML_Tree.Insert_Child (Sample, Parent => Paragraph, New_Element => Text'(Init_Element with The_Text => "; and ")); HTML_Tree.Insert_Child (Sample, Parent => Paragraph, Position => Working, Before => HTML_Tree.No_Element, New_Element => Italic_HTML_Element'(Init_Element with null record)); HTML_Tree.Append_Child (Sample, Parent => Working, New_Element => Text'(Init_Element with The_Text => "this is italic text")); Outputing the original HTML could be accomplished as follows: procedure Output_Document (Document : in HTML_Tree.Tree) is procedure Output_Node (Node : HTML_Tree.Cursor) is Node_Element : Root_HTML_Element'Class := HTML_Tree.Element (Node); begin Output_Prefix (Node_Element); -- Dispatches. Iterate_Children (Document, Parent => Node, Process => Output_Node'Access); Output_Suffix (Node_Element); -- Dispatches. end Output_Node; begin Output_Node (HTML_Tree.First (Parent => Sample.Root)); end Output_Document; (2) Windows in a GUI. Windows in a GUI typically are structured as a multiway tree: each window can have a (theroetically) unlimited number of child windows. For instance, in Microsoft Windows, controls like buttons, check boxes, and edit controls are child windows. A program that manipulates a GUI in some way could represent the set of windows in an application as a multiway tree container, putting the burden of memory management on the container. [Author's note: sure would have liked that for the Claw GUI builder...] (3) E-mail messages. The MIME RFCs that define how attachments are structured in an e-mail message define that a message is divided into a series of sections, each with their own headers. One possible content of an e-mail message section is an attached e-mail message -- which has its own set of headers and sections. Thus, an e-mail parser needs to be able to structure the message so that it can look into and process these nested messages. (That's especially important for anti-spam and anti-malware processing -- it would not do to allow a bad message through simply because of nesting.) An e-mail processor could be structured to represent each section as one node in a multiway tree. The top-level sections would be root nodes, but any nested messages would be child nodes of their parent. Here again, we would put the memory management burden on the container, rather than the programmer of the processor. (4) Compiler symboltable. The symboltable in a compiler is also a multiway tree. Entities declared in a single declarative part are a list (order of elements is significant). Of course, many kinds of entities have their own declarations (in Ada, records have components and discriminants; subprograms have parameters; blocks have declarative parts; etc.) This maps naturally into a multiway tree, where the entities that belong to a particular entity are represented as a child list of the tree. If a compiler needed to save a symboltable for some reason, it could easily be done using the streaming operations of the container. [Author's note: sure would have saved a lot of work for Janus/Ada...but perhaps wouldn't be fast enough.] A debugging symboltable dump could be done as a simple iterator: procedure Dump_Symboltable (Table : Symbol.Tree) is procedure Process (Node : Symbol_Cursor) is begin Put ((1 .. Child_Depth(Node)*2) => ' '); -- Indent based on the depth. Dump_Symbol (Element (Node)); end Process; begin Iterate (Tree, Process'Access); end Process; !ACATS test !appendix From: Randy Brukardt Date: Tuesday, January 27, 2009 12:39 AM Back in November, Tucker and I had the following exchange: Tucker: Tree structures are painful to work with without access types. Tagged types, access types, and heterogeneous trees are natural bedfellows. In this era of XML everything, heterogeneous trees are not just for compiler writers anymore. Randy: I surely agree that tagged types and heterogeneous trees are natural bedfellows. I'm not so certain about the access types, though. I thought that the Set containers were supposed to cover a significant portion of the need for trees. If they don't, then I think we ought to discuss what containers will. (Let's work on tomorrow's ways to program, not yesterday's.) Tucker: You can use a tree to implement a set, but you can't really use a set to implement a tree. You could implement a tree using a map, but that's pretty painful. If I want to represent something that is tree like (e.g. an abstract syntax tree, an XML file, a CAD drawing, etc.) I really need pointers of some sort. Randy (today): Having thought about this a lot more, I tend to agree that you can't use any of the existing containers to create a tree structure. But that means giving up the benefits of containers (storage management is done for you; correctness - links get made correctly; streaming is provided) in favor of doing things the same old way (with the same old set of bugs). In the past, Tucker has contended that it would be harder to use such a container than to write it yourself. I originally bought that argument, but I'm now more dubious, since memory management (especially of indefinite types) is hard to get right. So I decided to work out a definition of a multiway tree container. Having done that, and written some example code for an HTML processor, I've pretty decided that Tucker's position here is a load of MBE(*). But decide for yourself - I've attached trial ballon #4. (*) MBE => Male Bovine Excrement. :-) [Attached was version /01 of this AI.] **************************************************************** From: Randy Brukardt Date: Thursday, January 29, 2009 9:10 PM A couple of additions to my proposal for a Multiway_Tree container: ---- Add after the definition of No_Element: Root : Cursor renames No_Element; This will make adding to the root more readable (it doesn't change any semantics). For instance, in the example: HTML_Tree.Insert_Child (Sample, Parent => HTML_Tree.Root, Position => Header, New_Element => Header_HTML_Element'(Init_Element with null record)); ---- The other two forms of Slice also should have Splice_Children variants: procedure Splice_Children (Target : in out Tree; Parent : in Cursor; Before : in Cursor; Source : in out Tree); procedure Splice_Children (Target : in out Tree; Parent : in Cursor; Before : in Cursor; Source : in out Tree; Position : in out Cursor); Obviously, these operations should be consistent: it would be odd to find some possibilities missing for no obvious reason. These could come up quite a bit in practice. For instance, imagine an HTML editor based on the example code in the proposal. If you were to remove bold facing (for example), you would have to move the children of the bold node to up to the position of the bold node, then delete the bold node. That would use the original Splice_Children routine. If you have multiple HTML documents to operate on, it would make sense to move parts from one to another (cut-and-paste, for instance), and that could need moving the children. **************************************************************** From: Niklas Holsti Date: Friday, January 30, 2009 2:30 AM > Root : Cursor renames No_Element; > > This will make adding to the root more readable (it doesn't change any > semantics). I find the name "Root" here quite confusing. More below. > For instance, in the example: > > HTML_Tree.Insert_Child ( > Sample, > Parent => HTML_Tree.Root, > Position => Header, > New_Element => Header_HTML_Element'( > Init_Element with null record)); If I read the above, without remembering the suggested definition of "Root" as "No_Element" and the special meaning of Parent => No_Element, it clearly seems to mean that the new child is inserted below the root, not as the (or a) new root. It is quite likely that application code that builds a (single-rooted) tree will have a variable called Root, and the nearly identical call > HTML_Tree.Insert_Child ( > Sample, > Parent => Root, -- CHANGED > Position => Header, > New_Element => Header_HTML_Element'( > Init_Element with null record)); then has a very different effect. I think that defining Multiway_Tree.Root in this proposed way invites mistakes, and I find the original form "Parent => No_Element" clearer. Another comment: In the call to Insert_Child in the above example, and in fact in all Insert_Child calls in the HTML example in the proposal, there is no actual for the "Before" parameter, which is required in all three forms of Insert_Child as defined in the proposal. I assume that "Before => No_Element" is intended. I do welcome this Multiway_Tree suggestion. I think it fills a gap and will be useful. **************************************************************** From: Randy Brukardt Date: Friday, January 30, 2009 9:34 PM ... > > For instance, in the example: > > > > HTML_Tree.Insert_Child ( > > Sample, > > Parent => HTML_Tree.Root, > > Position => Header, > > New_Element => Header_HTML_Element'( > > Init_Element with null record)); > > If I read the above, without remembering the suggested definition of > "Root" as "No_Element" and the special meaning of Parent => > No_Element, it clearly seems to mean that the new child is inserted > below the root, not as the (or a) new root. Humm, I see your point. Perhaps it would be better named "As_Root" or something like that? I don't think that the use of Parent => No_Element is particularly understandable. Originally, I had planned a second set of functions for accessing the root: HTML_Tree.Insert_Root ( Sample, Position => Header, New_Element => Header_HTML_Element'( Init_Element with null record)); But a lot of the time, you would end up making the code conditional on whether you're inserting at the root or in a child, which doesn't seem helpful. So I was looking for a way to make it clearer that you are inserting the root of the tree. ... > Another comment: In the call to Insert_Child in the above example, and > in fact in all Insert_Child calls in the HTML example in the proposal, > there is no actual for the "Before" > parameter, which is required in all three forms of Insert_Child as > defined in the proposal. I assume that "Before => No_Element" is > intended. Yes, it is. I had really intended to use Append_Child, but since that doesn't return the position of the inserted element, it can't be used when it is intended to insert things on children. > I do welcome this Multiway_Tree suggestion. I think it fills a gap and > will be useful. Thanks, good to know I'm not alone on this one. **************************************************************** From: Niklas Holsti Date: Saturday, January 31, 2009 3:17 AM > Humm, I see your point. Perhaps it would be better named "As_Root" or > something like that? I also thought about "As_Root", and that would be OK in a positional call: Insert_Child (Sample, As_Root, ...). The named form, "Parent => As_Root", is better than "Parent => Root", but I'm still not happy with it. > I don't think that the use of Parent => No_Element is particularly > understandable. I do. If you insert a node in a tree and specify that it shall have no parent node, obviously (to me) the node becomes a root. I would not object to defining "As_Root : Cursor renames No_Element", but I don't think it is very beneficial. Logically one would then expect also definitions like As_Last : Cursor renames No_Element; for use as the Before parameter in Insert operations, but again I prefer "Before => No_Element" to "Before => As_Last". It is again obvious (to me) that inserting an element in a list with the specification that it shall be before no other element means that the element is appended as the new last element in the list. > Originally, I had planned a second set of functions for accessing the root: > > HTML_Tree.Insert_Root ( > Sample, > Position => Header, > New_Element => Header_HTML_Element'( > Init_Element with null record)); > > But a lot of the time, you would end up making the code conditional on > whether you're inserting at the root or in a child, which doesn't seem > helpful. I agree. I think the "Parent => No_Element" mechanism is a good solution. Perhaps it could be augmented by defining some root-specific operations, such as Insert_Root, analogously with the operations Prepend and Append and defined as equivalent to Insert_Child (.. Parent => No_Element ...). These could be used to increase clarity when it is statically known that the operation is on a root. To avoid cluttering the main package with these root-specific operations, why not define a child package, say Multiway_Trees.Roots, with operations such as Multiway_Trees.Roots.Insert and so on. **************************************************************** From: Jean-Pierre Rosen Date: Saturday, January 31, 2009 7:38 PM > Humm, I see your point. Perhaps it would be better named "As_Root" or > something like that? I don't think that the use of Parent => > No_Element is particularly understandable. Well, anybody using a tree should know that the only node without a parent is the root, so I doesn't seem illogical to say that you want to insert at the root by saying that you want to insert at the place that has no parent. Not a big deal anyway. **************************************************************** From: Randy Brukardt Date: Thursday, May 28, 2009 10:45 PM [Replying to Matt's comments of April 6th, all gven below.] > I reviewed the AI for the multiway tree container. My comments are > bracketed with "[MJH]" and "[END MJH]" pairs, and immediately follow > the normative text to which they refer. Note that I marked my comments on your comments with [RLB] and [End RLB] because I couldn't figure out a sane way to "quote" the entire thing. And please, whgen replying just quote the parts you are replying to and delete the rest, else no one will be able to figure out what is going on. Better yet, wait for my next version before commenting (it'll be done tomorrow, I think). - Randy. --- !wording A multiway tree container object manages a tree of internal nodes, each of which contains an element and pointers to the parent, first child, last child, next (successor) sibling, and previous (predecessor) sibling internal nodes. A cursor designates a particular node within a tree (and by extension the element contained in that node). A cursor keeps designating the same node (and element) as long as the node is part of the container, even if the node is moved in the container. [MJH] If Swap and Splice are included, then that last sentence could say: "...even if the node is moved in the container, or moved to another container." [END MJH] [RLB] That isn't true for the existing containers, so I don't see why it would be here. For instance, in Lists, you get a *new* cursor if you splice nodes from one container into a different one. That makes sense, since the Cursor includes an indication of (usually an access to) the actual container. Besides, this wording was borrowed directly from the List container (A.18.3(2/2)). So I think this wording is correct. [END RLB] Static Semantics The generic library package Containers.Multiway_Trees has the following declaration: generic type Element_Type is private; with function "=" (Left, Right : Element_Type) return Boolean is <>; package Ada.Containers.Multiway_Trees is pragma Preelaborate(Multiway_Trees); pragma Remote_Types(Multiway_Trees); type Tree is tagged private; pragma Preelaborable_Initialization(Tree); type Cursor is private; pragma Preelaborable_Initialization(Cursor); Empty_Tree : constant Tree; No_Element : constant Cursor; function "=" (Left, Right : Tree) return Boolean; function Is_Empty (Container : Tree) return Boolean; [MJH] I don't see a length function here. I would do this: function Length (Container : Tree; Position : Cursor := No_Element) return Count_Type; With the understanding that Length has O(n) time complexity, but that Is_Empty guarantees O(1) time complexity. Position = No_Element means count from the root of the tree; otherwise count from the subtree whose root is Position. You could always break these out as separate functions: function Length (Container : Tree) return Count_Type; function Length (Position : Cursor) return Count_Type; [END MJH] [RLB] I didn't put one in because it isn't clear what "Length" you are talking about. Whenever that happened, I added a prefix to make it clear (which is why there is a Child_Length). Indeed, the function(s) you describe here is not "Length" at all; they're more like "Count". It seems useful, so I added this pair of functions. Note that we really need three functions: one to count all of the elements in the container, one to count all of the elements from the root (not including unattached subtrees), and one for other subtrees. We can't use your second function above for counting from the root, because No_Element doesn't contain a container indication. We could follow the model of Child_Length for that purpose, however. So I ended up with function Count (Container : Tree) return Count_Type; function Subtree_Count (Container : Tree; Position : Cursor) return Count_Type; You're welcome to complain in Brest. ;-) [END RLB] procedure Clear (Container : in out Tree); function Element (Position : Cursor) return Element_Type; [MJH] Here and elsewhere we should include this function: generic type Item_Type (<>) is limited private; with function Get_Item (Element : Element_Type) return Item_Type; function Generic_Item (Position : Cursor) return Item_Type; Without this function, it's awkward to return just a component of an element, especially if the component has an indefinite subtype. For example: subtype Name_Length is Natural range 0 .. 80; type Employee_Type (Length : Name_Length := 0) is record Name : String (1 .. Length); ... end record; It would be nice to be able to say: Name : constant String := Get_Name (Employee_Cursor); To implement Get_Name using function Element is awkward: function Get_Name (C : Cursor) return String is Result : String (1 .. Name_Length'Last); Length : Name_Length; procedure Process (E : Employee_Type) is begin Length := E.Length; Result (1 .. Length) := E.Name; end Process; begin Query_Element (C, Process'Access); return Result (1 .. Length); end Get_Name; Now consider the alternative: function Get_Name (E : Employee_Type) return String is begin return E.Name; end; function Get_Name is new Generic_Item (String, Get_Name); [END MJH] [RLB] I realize that you haven't been following the accessibility subgroup's work, but we've worked out the semantics of a "safe" in-place accessor for the containers. Presuming that exists, it is unlikely that anyone will be using much more complex generics. You could implement function Get_Name as (you would need to pass the container here, as the access is modifiable): function Get_Name (L : in out List; C : Cursor) return String is begin return L.Accessor(C).Element.Name; end Get_Name; (Yes, we're planning to allow "in out" parameters on functions, with some limitations on calls.) But in actual fact you wouldn't bother with a function at all - and most likely you'd rename the result of the accessor and use it for a while. So I'm completely ignoring this idea. [END RLB] procedure Replace_Element (Container : in out Tree; Position : in Cursor; New_Item : in Element_Type); procedure Query_Element (Position : in Cursor; Process : not null access procedure (Element : in Element_Type)); procedure Update_Element (Container : in out Tree; Position : in Cursor; Process : not null access procedure (Element : in out Element_Type)); procedure Move (Target : in out Tree; Source : in out Tree); procedure Delete (Container : in out Tree; Position : in out Cursor); function Find (Container : Tree; Item : Element_Type; Position : Cursor := No_Element) return Cursor; function Reverse_Find (Container : Tree; Item : Element_Type; Position : Cursor := No_Element) return Cursor; function Contains (Container : Tree; Item : Element_Type) return Boolean; function Has_Element (Position : Cursor) return Boolean; procedure Iterate (Container : in Tree; Process : not null access procedure (Position : in Cursor)); function Child_Length (Container : Tree; Parent : Cursor) return Count_Type; [MJH] I've been calling this Child_Count. Calling it Child_Length seems like we're advertising too many impl. details, e.g. ptrs to child nodes are implemented as a list. But I dunno; maybe calling it length is more consistent with how we count things in containers. [END MJH] [RLB] Well, the number of nodes in the entire subtree seems more like "Count" to me. And, as you say, it is more consistent. I didn't make much effort to make the names make more sense. [END RLB] function Child_Depth (Parent, Child : Cursor) return Count_Type; [MJH] I've been calling this just Depth, like this: function Depth (Position : Cursor) return Count_Type; But I see you're passing in two params. OK, so you want to be able to calculate the depth of a node from any subtree in the tree. Yes, that's more flexible, but you're not designing around the common case (which is depth away from the root of the tree). [RLB] There was supposed to be a function Depth that works from the root. It appears I left it out by accident. Perhaps I was thinking that you don't need it, because you can always pass in No_Parent for Parent. But since we can't default that usefully (it's not the last parameter), I agree that it is clunky this way. So I added Depth. Note that Depth (Position => No_Element) ought to be defined to give you the depth of the entire tree (the depth of the deepest element). [END RLB] In the odd event someone wanted to compute depth for any arbitrary subtree, you could always splice the subtree into its own tree. But that won't work exactly, because the tree identity has changed (because the cursor is still bound to original tree). [RLB] And it would make copies of the elements if the tree is in a bounded container. The performance would be dismal. [END RLB] Here and elsewhere (it would be useful for all of the containers) it might make sense to have a Rebind operation, to assign a cursor to a different tree. Something like: procedure Rebind (Source : Tree; -- not strictly necessary Position : Cursor; Target : Tree); Position would be bound to Target, and away from Source. [RLB] This is a horrible idea: it wouldn't work for any container implementation not using discrete nodes, like the canonical implementation of Vector and all of the bounded forms. Please forget you ever had this idea. ;-) [END RLB] Alternatively, you could pass the parent node as a default param: function Depth (Position : Cursor; Parent : Cursor := No_Element) return Count_Type; With the interpretation that when Parent = No_Element, this would mean depth from root of tree. [RLB] That would work, but it would put the parameters in the "wrong" order. Since they have the same type, that would be really confusing in a positional call. I think two separate functions are a better idea (but of course we'll discuss it). [END RLB] [END MJH] procedure Insert_Child (Container : in out Tree; Parent : in Cursor; Before : in Cursor; New_Item : in Element_Type; Count : in Count_Type := 1); procedure Insert_Child (Container : in out Tree; Parent : in Cursor; Before : in Cursor; New_Item : in Element_Type; Position : out Cursor; Count : in Count_Type := 1); procedure Insert_Child (Container : in out Tree; Parent : in Cursor; Before : in Cursor; Position : out Cursor; Count : in Count_Type := 1); procedure Prepend_Child (Container : in out Tree; Parent : in Cursor; New_Item : in Element_Type; Count : in Count_Type := 1); procedure Append_Child (Container : in out Tree; Parent : in Cursor; New_Item : in Element_Type; Count : in Count_Type := 1); procedure Delete_Child (Container : in out Tree; Parent : in Cursor; Position : in out Cursor; Count : in Count_Type := 1); [MJH] How is this different from Delete? [END MJH] [RLB] It supports a count, allowing deletion of part or all of a child list. It didn't make sense to include a count on the plain Delete, because what precisely is being counted is completely unclear (if there are both child and sibling nodes, which ones get deleted first? Doesn't pay to define that). Note that I explained that in the !discussion section; it might have helped to read that before commenting here. (You probably should have read that first, then came back to the spec. Too late now.) [END RLB] procedure Delete_First_Child (Container : in out Tree; Parent : in Cursor; Count : in Count_Type := 1); [MJH] All of these seem like overkill. It makes sense to treat manipulation of the front or back element specially, for a sequence abstraction as a list or vector, because this models stacks and queues. However, a tree is different. It is recursive: you have trees, and subtrees. I can see iterating over the immediate children in a (sub) tree, but I see less usefulness for wanting to manipulate the first or last child specifically. [END MJH] [RLB] I wanted to stay as close to the existing List definition as possible. I think those procedures are overkill in the lists, for all of the reasons you give here. (They're easy to write yourself, and container deletion is rare to begin with.) But never mind that, I'll get rid of them. But I'm going to say that's because Matt wanted them gone. :-) [END RLB] procedure Delete_Last_Child (Container : in out Tree; Parent : in Cursor; Count : in Count_Type := 1); [MJH] See above. Both of these can be effected easily enough: procedure Op (T : in out Tree) is C : Cursor := ...; C2 : Cursor; begin ... -- need to delete last subtree, so do this: C2 := Last_Child (C); T.Delete (C2); ... end Op; [END MJH] [MJH] I don't see these listed: procedure Swap -- swap element value (Container : in out Tree; I, J : Cursor); procedure Swap_Links (Container : in out Tree; I, J : Cursor); I think you need these. For example, in the canonical implementation of a red-black tree, during rebalancing an element on a leaf node is swapped with an element value high up near the root. I admit that reasoning about behavior when one node is the immediate parent of another is a bit of a brain-teaser, though. I spent most of Sat afternoon implementing Swap and Swap_Links but still didn't finish it. [END MJH] [RLB] As I noted in the discussion, I left these out on purpose. I was thinking of these only in terms of a single sibling list, and in that case they don't seem very useful. In hindsight, I think I screwed up leaving Swap out; since it moves Elements (not nodes), it doesn't need to be limited to a single child list. Your problems figuring out what it means to swap the parents if you do a general Swap_Links convinces me that we don't want to bother. I'd hate to have to figure out what it is supposed to do in some of those cases! And I really don't relish wording it. [END RLB] procedure Splice (Target : in out Tree; Parent : in Cursor; Before : in Cursor; Source : in out Tree); procedure Splice (Target : in out Tree; Parent : in Cursor; Before : in Cursor; Source : in out Tree; Position : in out Cursor); procedure Splice (Container: in out Tree; Parent : in Cursor; Before : in Cursor; Position : in Cursor); [MJH] I've been calling these Splice_Child, but Splice probably works just as well. [END MJH] [RLB] That might be more consistent, I guess. [END RLB] procedure Splice_Children (Target : in out Tree; Parent : in Cursor; Before : in Cursor; Source : in out Tree); procedure Splice_Children (Target : in out Tree; Parent : in Cursor; Before : in Cursor; Source : in out Tree; Position : in out Cursor); procedure Splice_Children (Container: in out Tree; Parent : in Cursor; Before : in Cursor; Position : in Cursor); [MJH] As with Delete_First, etc, you can do this by iterating over the children of Position and splicing them. However, that turns the cost of the operation from O(1) to O(n). I don't whether this is really a useful operation, but maybe you could make the argument for its inclusion based on efficiency grounds. It is a bit confusing though about what is being spliced. In my implementation I had named as Splice_Child the things you're calling Splice. One idea is to rename this operation, say, to Reparent_Children. Another idea is to rename from Splice to Splice_Subtree. [END MJH] function Parent (Position : Cursor) return Cursor; [MJH] There are some implementation tricks you can play here. Instead of Parent (Root) returning No_Element, it could return a special value, so that First_Child (Parent (C)) always works, even if C happens to designate the root. See the implementation for how to implement this. [END MJH] [RLB] I got the feeling from the Ada-Comment discussion that people really don't want to mess with special values. The container parameter provides the needed information, anyway. Note that First_Child is a dubious operation on the root (do you get Root or the first orphan node?) Might be worth discussion once the orphan/Root model is explained properly. [END RLB] function First_Child (Container : Tree; Parent : Cursor) return Cursor; [MJH] This is a bit confusing: is it first child of root node of tree, or leftmost-node? [RLB] It's the first child of the parent. If the parent is No_Element, then it is ???. (I had intended it to be the first root, but since people don't like the multirooted model, that has to be discarded. Making it the first orphan is the natural change, but I'm not sure that is what is wanted. My gut feeling is that this ought to raise Constraint_Error if this is the root, but that's annoying in practice.) [END RLB] Why does First_Child accept a container parameter? My implementation is declared this way: function First_Child (Position : Cursor) return Cursor; [END MJH] [RLB] The above doesn't work for the root, because No_Element does not contain an indication of the container. So how do you figure out what container you are talking about?? There are a number of functions where I passed in the container for this reason. As noted above, one option is to have a separate function to get the first orphan node ("First_Orphan"?) (as there is for the root), but that seems like it would get clunky. function First_Child_Element (Container : Tree; Parent : Cursor) return Element_Type; [MJH] As with Delete_First_Child, etc, I don't know how useful this is. This is about trees and subtrees. The emphasis should be on the value of the element at the root of some subtree. I can always say: E := Element (First_Child (C)); [END MJH] [RLB] Again, I'm just following the lists. I don't think there is any reason for this function to exist in any of the other containers either. (I.e., I don't see the difference, and in this case there is no performance advantage either, for any container.) But here I'm leaving it, because there doesn't seem to be any good reason to *not* have it and remain consistent. [END RLB] function Last_Child (Container : Tree; Parent : Cursor) return Cursor; [MJH] As with First_Child, I don't understand why this accepts two paramters. My prototype has this: function Last_Child (Position : Cursor) return Cursor; [END MJH] [RLB] How do you deal with No_Element?? It's clear that no one wants a magic value that we can't use in default parameters (which is what happens if you try to include the container in it). [END RLB] function Last_Child_Element (Container : Tree; Parent : Cursor) return Element_Type; [MJH] Same comment as for First_Child_Element. [END MJH] function Next_Sibling (Position : Cursor) return Cursor; function Previous_Sibling (Position : Cursor) return Cursor; procedure Next_Sibling (Position : in out Cursor); procedure Previous_Sibling (Position : in out Cursor); [MJH] I don't see these operations: function Sibling_Count (Position : Cursor) return Count_Type; function First_Sibling (Position : Cursor) return Cursor; function Last_Sibling (Position : Cursor) return Cursor; You might think you could implement these by saying: FS : Cursor := First_Child (Parent (C)); but this won't work for the root -- unless you play tricks in the implementation of Parent (see the prototype for one way how to do this). [RLB] Right, but your implementation of First_Child cannot work for the root either: First_Orphan := First_Child (No_Element); because you don't know what container to extract the root from! Once you add the container parameter to First_Child, this problem goes away. I don't see how your First_Child can work directly on the root, unless you use magic values (previously rejected). OIC, you are going to tell us that we *have* to have magic values, at every meeting from now until ISO publishes the Standard. :-) PLEASE learn to take no for an answer (not that you have to take it from me individually, but surely from the group). Anyway, I don't think there is any value to "sibling" operations other than the simple navigation ones. As you say, this is not a sequence per-se; usually you only care about the relationships of the nodes. And it has to be the case that going to the parent and then to the First_Child always works to get all of your siblings. [END RLB] Actually, that's not quite right -- you could play tricks with the value of cursors that designate the root. Just as the cursor that designates the parent can have a special value, a cursor that designates the root itself can have a special value too. See the implementation of function Root in the prototype. [END MJH] [MJH] procedure Iterate_Child (Container : in Tree; Parent : in Cursor; Process : not null access procedure (Position : in Cursor)); procedure Reverse_Iterate_Child (Container : in Tree; Parent : in Cursor; Process : not null access procedure (Position : in Cursor)); Do we really need Tree here? I don't see why. [RLB] Same as always, if Parent = No_Element, you don't know the container to iterate over. [END RLB] Also, should this be Iterate_Children? [RLB] Yes, that does sound a bit better. [END RLB] It might also be useful to have passive iterators to iterate over siblings. procedure Iterate_Siblings (Container : in Tree; Parent : in Cursor; Process : not null access procedure (Position : in Cursor)); procedure Reverse_Iterate_Siblings (Container : in Tree; Parent : in Cursor; Process : not null access procedure (Position : in Cursor)); [END MJH] [RLB] I don't see the point, just get the parent and use Iterate_Children. That always has to work. Note that Iterate_Children will iterate the entire list of orphan nodes, not just the root when given No_Element. Iterate will iterate just from the Root. [END RLB] [MJH] The following operation is missing: function Root (Container : Tree) return Cursor; This is the most important function in the whole API! See the functions Root and Parent in the prototype for some implementation ideas. [END MJH] [RLB] Not in the model I gave. In the model I had, there are many roots. That's still true, except that now they are called orphans (nodes with no parent). It is possible to designate one of them as the Root with Set_Root. This model is necessary so that it is possible to build subtrees bottom up in a bounded container without copying elements over and over. (Remember, a bounded container is an array.) [END RLB] [MJH] These are also useful: function Is_Root (Position : Cursor) return Boolean; function Is_Leaf (Position : Cursor) return Boolean; function To_Tree (New_Item : Element_Type) return Tree; -- or possibly New_Tree (NI : ET) return T; I see that you have a version of Insert that omits the element parameter, so you can create new node(s). We would have to decide whether a constructor-style function as To_Tree should be overloaded with a parameter-less version, to create a root that has its element initialized to the default value for the type. [END MJH] [RLB] I suppose we need Is_Root and Is_Orphan (but I don't see any point in Is_Leaf = (not Is_Orphan)). There is no reason for To_Tree, because a tree with one element in it is silly; you're going to have to insert other elements anyway. Why have a special routine for the first one?? Besides, we don't have a To_List or To_Map, so I don't see why this one should be different. (Much later): I realize that by "Is_Leaf" you mean "does not have any children". I'm not sure I like this name, but the alternative of Has_No_Children will not cause a lot of excitement. [END RLB] private ... -- not specified by the language end Ada.Containers.Multiway_Trees; The model is that every element in the tree has an associated doubly-linked list of children. Thus we provide almost all of the operations of the list container, each with an added Parent parameter. We did not change the basic names of the operations for familiarity. We added "_Child" to each name in order to describe clearly what is being accessed. We considered leaving the names the same (without the "_Child" suffix), but that seemed confusing, as it would be easy to believe that First always represents the first node of the root of the tree. [MJH] I've been referring to operations as XXX_Siblings and XXX_Children. The siblings versions aren't strictly necessary, if you refer call the XXX_Child operation on the Parent of the node of interest. (But I have to think about it -- you might have to play some tricks using distinguished cursor values, so that that always works, even when the node of iterest is the root. What we definitely do not want is for the user to have to special-case the root -- but we don't want require him to pass the tree object as an extra parameter either. All the information we need is carried in the cursor itself. The only potential problem with distinguished cursor values is whether an equality test with No_Element is allowed to return False, if Has_Element also returns False. Or perhaps you guys have solved the "does equality compose" problem.) [END MJH] [RLB] We're trying to fix the equality composition problem, but it hasn't been looking that good. Not sure what is going to happen there. But in any case there is a whole bunch of orphan nodes (one of which is designated the root). So anything based on "THE ROOT" is not going to work, because you have to be able to work on unattached subtrees as well. I do agree that First_Child, etc. have to work on orphans as well. That's why they have a container parameter! But I will try to think this some more, in case there is a problem. [END RLB] Having "_Child" as part of the names also opens up the possibility of operations not having the "_Child" suffix. These would naturally operate on the entire tree. We're taken advantage of this to define operations like Find and Iterate over the entire tree. [MJH] As above, it might be useful to distinguish between operations that apply to children, vs. operations that apply to siblings. [END MJH] [RLB] I don't think there is any major value to sibling operations beyond the basic navigation ones. Unless we just want to bulk up the Standard. :-) [END RLB] Note that the root of the tree is also a list of elements. (Thus the tree is multirooted as well as multiway.) We considered allowing only a single root node, but it seemed inconsistent for no important reason, and moreover, allowing additional roots allows bottom-up tree creation where subtrees are stashed in the root until their parent is created (at which time Slice_Child would be used to make it a subtree of the appropriate parent). [MJH] I don't agree with anything in the above paragraph. [END MJH] [RLB] The ARG changed this slightly, but we did agree that a list of unattached subtrees (I'm calling them orphans) is needed. So most of this is still true, but the terminology has changed. [END RLB] For all of the subprograms that take a Parent parameter, if the parameter is No_Element, the node is inserted at the root of the tree. This is consistent with the meaning of the Before parameter in the Insert routines in the existing list containers. [MJH] Well, I think that a tree should only have a single root (where do the multiple children of the parent of the root live???). But note that: C := Parent (T.Root); might return some distinguished value that is different from No_Element. (Assuming that behavior is allowed by this API.) See the functions Parent and Root in the prototype for some ideas. [END MJH] [RLB] Well, you'll just have to read the next version of the AI, because I am not going to waste my time writing it here, too. You seem to have missed the entire point of the discussion that Tucker and I had at the meeting. We must have multiple unattached subtrees, else the bounded tree will be too slow to use (you'd have to copy everything at every insertion of a parent). They obviously live in the top-level list. What's inconsistent is for every level to be a list, except the top one! That's what gets us into trouble. [END RLB] Notes on specific subprograms: Operations that were dropped are: Swap and Swap_List; Reverse_Elements; and the sorting operations. These could have been defined, but they seem strange operating on a single list of children; they seem more like operations for an entire container (where they make no sense). [MJH] Disagree. In the canonical implementation of a red-black tree, an element is swapped from a leaf node with an element high up near the root. We have to get all the corner cases right (e.g. one node is the immediate parent of another), but these operations should probably be included. [END MJH] Delete is provided to delete an arbitrary node from the tree. We don't provide a Count parameter on that routine, because it is not obvious what is being counted (sibling nodes only? child nodes only? both sibling and child nodes?) [MJH] Right: it means "delete this subtree". [END MJH] [RLB] Actually, I meant it specifically to delete a single node without having to figure out the parent node and such cruft. I didn't do a very good job of explaining how that would work if the node is a parent (forgot to think about it)! Deleting a subtree is also a good idea. [END RLB] Iterate walks the entire tree starting with the root list of nodes. Each node in every child list is visited in order; after each node is visited, the child list of that of that node are visited. This is done recursively. [MJH] I'm not clear whether you're describing a breadth-first traversal or a depth-first traversal. Do we need two forms of passive iterator? [END MJH] [RLB] Work it out *very carefully*. It is depth-first (I think, I confuse the terms sometimes). I couldn't think of any use for breadth-first traversal, so I didn't provide it. (Among other things, you have to iterate the list twice, which is a silly thing to do.) But I'm willing to be convinced (but think of an actual use, not just that you could do it the other way). In any case, you can write it yourself using the child iterators. [END RLB] Child_Length returns the number of Children nodes of Parent. Child_Depth returns the number of ancestor nodes of Child, up to (but not including Parent). If Parent is No_Element, returns the number of ancestor nodes of Child. Raises Program_Error if Parent is not No_Element, and is not an ancestor of Child. [MJH] See my comments above about Child_Depth. [END MJH] For Insert_Child, if the Before parameter is not a child of Parent, Program_Error is propagated. Delete_Child is similar to Delete, but (a) the node to delete must be a child of the Parent; (b) a Count is provided to delete several children. [MJH] There's no real benefit to this operation. It must be done iteratively (that is, O(n)) no matter what. [END MJH] [RLB] Are you sure? If so, it would again apply in the same way to the similar List operation. In any case, consistency suggests having it. [END RLB] We didn't add "_Child" to Splice, as it usually will be splicing entire subtrees. As with Insert_Child, if Before is not a child of Parent, Program_Error is propagated. Splice moves the node(s) and all child nodes of the node to the new position in the (new) tree. Splice_Children moves only the child nodes of the position to the new position. [MJH] Another way to say "moves the node(s) and all child nodes of the node" is "the subtree whose root is node". [RLB] We didn't define the term "subtree". We could, I suppose, it surely would make things more readable. [END RLB] As I state above, I mixed feelings about calling it "Splice_Children". (Not only that: the only real reason for Splice_Children to allow the splice to be O(1) instead of O(n). Is it really worth it? I dunno.) [END MJH] Selector function Parent returns the parent of the node represented by the given cursor. [MJH] As I allude to above (and in the prototype), it's not necessarily that simple -- it all depends on what is permissible per this API. The cursor values T.Root and Parent (T.Root) could have distinguished values, that are useful treating the root as just another node, but these values don't necessarily compare equal to the distinguished cursor value No_Element. [END MJH] [RLB] I don't see the point; and if you do that, you can never default a parameter to No_Parent (equiv. No_Element). That seems bad at best. [END RLB] Next and Previous were renamed Next_Sibling and Previous_Sibling to make the meaning clearer (they work just like the list versions). [MJH] Right, and you could make that case for a few other operations as well. Even with dedicated XXX_Sibling(s) operations, it would be nice to have a guarantee that calls such as XXX_Children (Parent (T.Root)) just work -- even if the tree is empty. For example, I would expect that: First_Child (Parent (T.Root)) = T.Root Last_Child (Parent (T.Root) = T.Root First_Sibling (T.Root) = T.Root Last_Sibling (T.Root) = T.Root Assuming function Root (T : Tree) return Cursor; is defined. [END MJH] [RLB] None of this is necessarily true, because Root is just one orphan node. We could define the first orphan node to be the Root, so First_Child (Parent (T.Root)) = T.Root but surely Last_Child could be anything in that case. If we left the orphan nodes out of First_Child/Last_Child completely (and provided First_Orphan/Last_Orphan/etc to deal with them), then if you have a non-root orphan node, you would never find yourself on the sibling list of the top-level. That would make things very irregular. So I'm proposing that Root is only special for a few operations (mostly iterators). At least until I think about this more. [END RLB] Note that we left the Container parameter in routines like First_Child, because the Parent parameter can be No_Element to represent the root, and we need to know where to look for that root. An alternative design would be to have a separate set of routines for accessing the root, but that would clutter the package and complicate use (a lot of insertion code would have to explicitly check to see if it is inserting at the root: a scourge of hand-written code that we don't want to duplicate in the containers!). [MJH] Neither of these things is true. First_Child and friends should only have a single cursor parameter. [END MJH] [RLB] But how do you find the root then? There surely can be more than one Tree in existence in a program! Looking at your implementation, you have First_Child (No_Element) = No_Element. But that is wrong, it has to return the first orphan node (in your implementation, it ought to return the Root node itself). First_Child (First_Child (No_Element)) needs to get the first grandchild element (first second-level element). [END RLB] Iterate_Child and Iterate_Reverse_Child just iterate over the child nodes of Parent (and no others). [MJH] See my comment above. I don't think you need a container parameter. A cursor always points to the root of some subtree, so I don't see why you also need to pass the tree container. The only reason an operation needs to specify a container parameter is to preserve constant/variable view semantics. (I'm not sure whether the formal cursor parameter should be named "Parent" either -- but I have to think about it.) [END MJH] Further ideas: * More iterators could be useful. One could imagine wanting to visit the child nodes before their parents, as well as a straight breadth-first iteration. [MJH] Right, and this is especially true for searching. [END MJH] **************************************************************** From: Randy Brukardt Date: Monday, June 8, 2009 1:07 PM [Answering some private mail from Matt:] > Well, I confess to not knowing what "bottom-up tree construction is" > (unless we're talking about a B-tree). I also assumed that Splice and > Move would be appropriate mechanisms for moving trees and subtrees > across tree containers. It's the way trees are constructed when using a bottom-up parser - first you create the leaves, then the operation connecting the leaves, and so on. I have difficulty even imagining another way to create a tree when parsing; too many years using LALR(1) parsers I guess. (I realize that grasping how bottom-up parsers work is really hard unless you've been using them in practice for a while -- I think that is why our compiler-construction course required using them.) ... > > (It seems silly to have to create a special object solely for the > > purpose of making a copy of some nodes, especially bad in a bounded > > container type, where it would cause an extra set of copying.) > > Well, that's where we disagree -- creating a separate (tree) object to > hold (sub)trees you intend to splice onto another tree is *exactly* > what you want. It's way too expensive, especially for bounded forms. And it's not how you think of subtrees as you are creating them if done without using a container, why should they be different here? We're probably going to have to agree to disagree on this one. > Concern about efficiency of a bounded form is also misplaced. > It's better to optimize the design around the common case, which is > an unbounded form. We already accept that Move, Splice, etc is going > to require a copy for other bounded forms, so I see no reason to treat > a tree container any differently. For most (paying) Ada customers, the bounded form is the only one they'll be able to use. (Why do you think that we have so much pressure to define them??) For lists, the use of Move and Splice are going to be very rare. I can't recall ever having tried to insert something into the middle of a normal list (priority queues are not a list!). Indeed, I think Splice is one of the operations that will commonly be ignored as people will consider it "exotic" and forget about it in favor of more familiar operations. (At least half of the operations in Ada.Strings.Unbounded qualify like that; I don't even try to use them because the effort to figure out whether one applies and how to call it if it does.) For a tree, however, pretty much the only operation that you are going to use (other than Insert) is a Splice_Subtree, since it is the only way to do any sort of tree construction or transformation. It is simply unacceptable to make tree construction impossibly expensive by forcing copying everything at every step. ****************************************************************