The Baron wrote:
But Alien was "I'm going to piss myself" fear!
We can't have people wetting their pants. That would be uncivilised, and *pokes self with large stick* inappropriate for the Debate Club forum. So, I will metaphorically spoil the movie (Ripley gets away! With the cat!) and look through the above code in some detail.
The most important function above is, of course,
insert. Its type annotation is
Ord a => a -> RBTree -> RBTree a, which means, "this is a function on any type that accepts a total ordering, such that given an element of that type and a red-black tree over that type, a new red-black tree will be returned."
The whole definition of
insert is
insert x = blkRoot . ins_x. That is a
curried definition: though insert notionally takes two parameters, what it really does is take an element and return a function from trees to trees that depends upon that element.
The period in the expression
blkRoot . ins_x is the operation of composition, usually written as a small circle in mathematical texts, but ASCII doesn't have little circles. Thus, that expression is equivalent to
\ t -> blkRoot(ins_x(t)). Hint: the backslash character in Haskell represents "lambda".
The functions
ins_x and
blkRoot are entirely contained as sub-functions of
insert, after the "where" keyword. Haskell is a fully block-structured language, in that variables in outer functions are available to inner functions; this is not the case, for example, in C++. (Yes, it's possible to define inner functions in C++, but it's somewhat tricky, and you don't get access to the outer functions' data, so there's not much point in doing it. Exercise for the reader.)
ins_x inserts the value of the outer function's parameter
x into a red-black tree. It follows the ordinary binary-search-tree algorithm for insertion on the way down the tree, but then fixes up the tree afterward for violations of the red-black principles.
Most imperative implementations do the fixing on the way down as well, and require far more code to do it; naturally, it's slightly faster to do it that way, but there are a few drawbacks: top-down colour adjustments are almost necessarily destructive, so the tree is thereby changed in place and is thus not persistent across updates, and they are
much more complicated.
In the code above, the balancing is done by the
rbal and
lbal functions (for red-black violations on the right and left respectively) but that is something of an optimisation. Let us consider an alternate function:
Code:
bal :: RBTree a -> RBTree a
bal (Br (Br (Br lll lly Red llr) ly Red lr) y Blk r)
= (Br (Br lll lly Blk llr) ly Red (Br lr y Blk r))
bal (Br (Br ll ly Red (Br lrl lry Red lrr)) y Blk r)
= (Br (Br ll ly Blk lrl) lry Red (Br lrr y Blk r))
bal (Br l y Blk (Br rl ry Red (Br rrl rry Red rrr)))
= (Br (Br l y Blk rl) ry Red (Br rrl rry Blk rrr))
bal (Br l y Blk (Br (Br rll rly Red rlr) ry Red rr))
= (Br (Br l y Blk rll) rly Red (Br rlr ry Blk rr))
bal (Br (Br ll ly Red lr) y Blk (Br rl ry Red rr))
= (Br (Br ll ly Blk lr) y Red (Br rl ry Blk rr))
bal t = t
One can replace all references to
lbal and
rbal in the original program with references to this
bal and it will still work. (Slightly, very slightly more slowly.) However, looking at this
bal shows all the distinct rotations required by the red-black tree structure together, and hopefully understandably. In order, then, the clauses mean:
- When at the root of a tree with a red left subtree that has a red left subtree, do a double right rotation.
- When at the root of a tree with a red left subtree that has a red right subtree, do a right-left rotation.
- When at the root of a tree with a red right subtree that has a red right subtree, do a double left rotation.
- When at the root of a tree with a red right subtree that has a red left subtree, do a left-right rotation.
- When at the root of a tree with two red subtrees, redden the root and blacken the subtrees.
- Otherwise, don't do anything.
The only other really funny thing I did in the above program was in the
elems function, which performs an infix traversal of an RBTree and returns all the elements of the tree in an ordinary list. I gave a hint in the original definition of that function of what kind of trick I was pulling; I'll let you figure out what it was.