Copyright (c) 1994 W. Brainerd, C. Goldberg, and J. Adams.
All rights reserved.
This file may not be copied without permission of the authors.

Programmer's Guide to Fortran 90, 3rd Edition

Chapter 8 Pointer Variables

8.3 Trees

One of the big disadvantages of using a linked list to sort numbers is that the resulting program has poor expected running time. In fact, for the program list_sort, the expected running time is proportional to n2, where n is the number of numbers to be sorted. A much more efficient sorting program can be constructed if a slightly more complicated data structure, the binary tree, is used. The resulting program, tree_sort, has an expected running time proportional to n log2 n instead of n2.

It is quite difficult to write nonrecursive programs to process trees, so we will think of trees as recursive structures right from the start. Using this approach, a binary tree of integers is either empty or is an integer, followed by two binary trees of integers, called the left subtree and right subtree.

8.3.1 Sorting with Trees

To sort numbers with a tree, we will construct a special kind of ordered binary tree with the property that the number at the ``top'' or ``root'' node of the tree is greater than all the numbers in its left subtree and less than or equal to all the numbers in its right subtree. This partitioning of the tree into a left subtree containing smaller numbers and a right subtree containing larger numbers is exactly analogous to the partitioning of a list into smaller and larger sublists that makes quicksort (4.3.1) an efficient algorithm. This property will hold not only for the most accessible node at the ``top'' of the tree (paradoxically called the ``root'' of the tree), but for all nodes of the tree. To illustrate this kind of tree, suppose a file of integers contains the numbers 265, 113, 467, 264, 907, and 265 in the order given. To build an ordered binary tree containing these numbers, first start with an empty tree. Then read in the first number, 265, and place it in a node at the root of the tree, as shown in Figure 8-9.

A blank box in the lower part of a node is understood to represent a null pointer.

When the next number is read, it is compared with the first. If it is less than the first number, it is placed as a node in the left subtree; if it is greater than or equal to the first number, it is placed in the right subtree. In our example, 113 < 265, so a node containing 113 is created and the left subtree pointer of the node containing 265 is set to point to it, as shown in Figure 8-10.

The next number is 467, and it is placed in the right subtree of 265 because it is larger than 265. The result is shown in Figure 8-11.

The next number is 264, so it is placed in the left subtree of 265. To place it properly within the left subtree, it is compared with 113, the occupant of the top of the left subtree. Since 264 >= 113, it is placed in the right subtree of the one with 113 at the top to obtain the tree shown in Figure 8-12.

The next number 907 is larger than 265, so it is compared with 467 and put in the right subtree of the node containing 467, as shown in Figure 8-13.

The final number 265 is equal to the number in the root node. An insertion position is therefore sought in the right subtree of the root. Since 265 < 467, it is put to the left of 467, as shown in Figure 8-14. Notice that the two nodes containing the number 265 are not even adjacent, nor is the node containing the number 264 adjacent to either node with key 265. This doesn't matter. When the tree is printed, they will come out in the right order.

8.3.2 Type Declarations for Trees

The declaration for the node of a tree is similar to the declaration for the node of a linked list, except that the node must contain two pointers, one to the left subtree and one to the right subtree. As with lists, we could have tree be a derived data type, which implies it must be a structure with one component, a pointer to the node of a tree. But just to be different, let's not put the tree operations in a module. Having made this decision, the extra syntax needed to select the pointer component using the % symbol clutters up the program enough that we will simply declare things that would be trees to be pointers to a node, the root node of the tree. Thus, the declarations needed are

type node
   integer :: value
   type (node), pointer :: left, right
end type node

type (node), pointer :: t

8.3.3 The insert Subroutine

The subroutine that inserts a new number into the tree is a straightforward implementation of the following informal recipe: if the tree is empty, make the new entry the only node of the tree; if the tree is not empty and the number to be inserted is less than the number at the root, insert the number in the left subtree; otherwise, insert the number in the right subtree.

recursive subroutine insert (t, number)

   type (node), pointer :: t  ! Really a tree.
   integer, intent (in) :: number

   ! If (sub)tree is empty, put number at root
   if (.not. associated (t)) then
      allocate (t)
      t % value = number
      nullify (t % left)
      nullify (t % right)

   ! otherwise, insert into correct subtree
   else if (number < t % value) then
      call insert (t % left, number)
   else
      call insert (t % right, number)
   end if

end subroutine insert

8.3.4 Printing the Tree in Order

The recipe for printing the nodes of the tree follows from the way the tree has been built. It is simply to print in order the values in the left subtree of the root, print the value at the root node, then print in order the values in the right subtree. This subroutine is shown in the following complete program that sorts a file of integers by reading them all in, constructing an ordered binary tree, and then printing out the values in the tree in order.

program tree_sort
! Sorts a file of integers by building a
! tree, sorted in infix order.
! This sort has expected behavior n log n,
! but worst case (input is sorted) n ** 2.

   implicit none
   type node
      integer :: value
      type (node), pointer :: left, right
   end type node

   type (node), pointer :: t  ! A tree
   integer :: number, ios

   nullify (t)  ! Start with empty tree
   do
      read (*, *, iostat = ios) number
      if (ios < 0) exit
      call insert (t, number) ! Put next number in tree
   end do
   ! Print nodes of tree in infix order
   call print_tree (t)

contains

   recursive subroutine insert (t, number)

      type (node), pointer :: t  ! A tree
      integer, intent (in) :: number

      ! If (sub)tree is empty, put number at root
      if (.not. associated (t)) then
         allocate (t)
         t % value = number
         nullify (t % left)
         nullify (t % right)
      ! Otherwise, insert into correct subtree
      else if (number < t % value) then
         call insert (t % left, number)
      else
         call insert (t % right, number)
      end if

   end subroutine insert

   recursive subroutine print_tree (t)
   ! Print tree in infix order

      type (node), pointer :: t  ! A tree

      if (associated (t)) then
         call print_tree (t % left)
         print *, t % value
         call print_tree (t % right)
      end if

   end subroutine print_tree

end program tree_sort

8.3.5 Exercise

  1. Experiment with the program tree_sort, sorting different quantities of randomly generated integers to determine an approximate formula for the complexity of the program. It should be proportional to n log2 n.