Folding trees and efficient color coding
Folds are convenient and efficient abstractions of recursive patterns for processing data structures. In 1984 John Hughes wrote “Why functional programming matters”, in which he presents very good arguments for thinking about programming in terms like folds. In fact, many popular programming languages, even though they are not functional, provide some forms of folding, typically by functions called map
and reduce
that work on lists. Any recursive data structure (and a list is such a thing) can be “folded” over. OCaml code for folding over trees can be written as follows:
(* a tree is a pair consisting of a label and a list of subtrees *)
type tree = Node of (int * tree list)
(* tree fold *)
let rec treefold f g z tree =
let rec treefolds f g z subtrees = match subtrees with
| [] -> zin
| (x::xs) -> g (treefold f g z x) (treefolds f g z xs) match tree with Node(label, subtrees) -> f label (treefolds f g z subtrees)
The attraction of folds is that they make it easy to implement different transformations of data structures. For example a function that returns the vertices of a tree in pre-order can be written using treefold
as:
let preorder = treefold List.cons (@) [];;
I wrote above code as a part of a function that compiles trees into abstract machine instructions that given streams of labels compute a function over an implicitly reconstructed tree labeled by the labels taken from each stream. The motivation for this was a functional implementation of Color Coding. A description of the algorithm and its implementation can be found in the short write-up Quickly pulling a tree out of a stream.