13 Binary Search Trees

Multiset ADT

Supposedly the goal of binary search trees will help implement the Multiset ADT 7 Abstract Data Types

Which supports:

Searching a tree like a normal list is O(n) however, we can add structure (augmentation) to make it more efficient.

In the case of Python lists, if we assume that the list is sorted then the binary search algorithm can greatly improve efficiency.

Binary Search Tree Properties:

  1. Every item has at most two subtrees
  2. The item in the tree is greater than or equal to all items in its left subtree or less than or equal to all items in the right subtree

This makes BSTs extremely efficient in doing operations like searching for an item; but unlike sorted Python lists, they can be much more efficient at insertion and deletion while maintaining the sortedness of the data.

from typing import Any, Optional

class BinarySearchTree:
    """Binary Search Tree class.

    This class represents a binary search tree satisfying the 
    Binary Search Tree property:
    For every node, its value is >= all items stored in its left subtree,
    and < all items stored in its right subtree.

    === Private Attributes ===
    _root: Optional[Any]
        The item stored at the root of the tree, or None if the tree is empty
    _left: Optional[BinarySearchTree]
        The left subtree, or None if the tree is empty
    _right: Optional[BinarySearchTree]
        The right subtree, or None if the tree is empty
    
	=== Representation Invariants ===
	- If _root is None, then so are _left and _right.
	This represents an empty BST.
	- If _root is not None, then _left and _right are BinarySearchTrees.
	- (BST Property) All items in _left are <= _root,
	and all items in _right are >= _root.
    """

    _root: Optional[Any]
    _left: Optional["BinarySearchTree"]
    _right: Optional["BinarySearchTree"]

    def __init__(self, root: Optional[Any]) -> None:
        """Initialize this Binary Search Tree.
        
        If <root> is None, this tree is empty.
        Otherwise, the tree contains one node with item <root>.
        """
        self._root = root
        self._left = None
        self._right = None

Empty Binary Search Tree VS Tree: Both will have None roots but the Tree will have [] as subtrees however the BinarySearchTree will have None for each of its subtrees. This is the only time Binary search trees will have a None attribute otherwise the subtrees are empty BinarySearchTree objects.

def __contains__(self, item: Any) -> bool:
	""" Return whether <item> is in this BST
	"""
	if self.is_empty():
		return False
	elif item == self._root: 
		return True 
	elif item < self._root: 
		return item in self._right # or, slef._left.__contains(item)
	else: 
		return item in self._right # or, self._right__contains__(item)

Mutating Binary Search Trees

Deleting

def delete(self, item: Any) -> None:
	""" Remove *one* occurence of <item> from this BST
	Do nothing if <item> is not in the BST
	"""
	if self.is_empty():
		pass
	elif self._root == item: 
		self.delete_root()
	elif self._root > item: 
		self._left.delete(item)
	else: 
		self._right.delete(item)
def delete_root(self) -> None:
	""" Remove the root of this tree. 
	Precondition: this tree is *non-empty*
	"""
	if self._left.is_empty() and self._right.is_empty():
		self._root = None
		self._left = None
		self._right = None
		
	elif self._left.is_empty():
		self._root, self._left, self._right = \
		self._right._root, self._roght._left, self._right._right
		
	elif self._right.is_empty():
		self._root, self._left, self._right = \ 
		self._left._root, self._left._left, self._left._right

	else:
		self._root = self._left.extract_max()
def extract_max(self) -> object:
	""" Remove and return the maximum item stored in this tree. 
	
	Precondition: this tree is *non-empty*
	"""

	if self._right.is_empty():
		max_item = self._root 
		self._root, self._right, self._left = \ 
		self._left._root, self._left._right, self._left._left
		return max_item
	else: 
		return self._right.extract_max()

Runtime

Contains: the worst case is proportional to the height of the tree, O(h)
We call this WC(n) which maps the input size n to its worst-case scenario. A tree of height n can have at most 2h1 items.

n+1=2hh=ln(n+1)ln2

So contains is O(ln(n))