Catamorphisms aka folds explained

This is the fourth of a series of articles that illustrates functional programming (FP) concepts. As you go through these articles with code examples in Haskell (a very popular FP language), you gain the grounding for picking up any FP language quickly. Why should you care about FP? See this post.

In the last post, we learned about the foldable typeclass, which provides the foldr method. In this post, we'll learn what this method is with an example.

As mentioned in earlier posts, we can treat recursions as separate higher order functions known as recursion schemes. One type of recursion scheme is catamorphisms (or fold in Richard Bird and Philip Wadler’s Introduction to Functional Programming).

Why Should I Care?

Because knowing how to use recursion schemes improves your coding ability! Consider this exercise: finding the minimum element of a list. In Haskell, without using fold, we would compare the elements of the list a one by one, keeping the minimum of the element, until we reach the end of the list. For example (interested readers can also see the OCaml version):

findMinInner :: Ord a => [a] -> a -> a
findMinInner [] curMin = curMin 
findMinInner (hd : tl) curMin = 
  if hd < curMin 
    then findMinInner tl hd
    else findMinInner tl curMin

findMin :: Ord a => [a] -> Maybe a
findMin [] = Nothing
findMin (hd:tl) = Just (findMinInner tl hd)

Using catamorphisms, we can find the minimum for an Int list in two lines:

findMinFold :: [Int] -> Int
findMinFold l = foldr (\x y -> if x < y then x else y) maxBound l

Or, using the function min in the prelude to replace the anonymous function and eta-reducing l:

findMinFold :: [Int] -> Int
findMinFold = foldr min maxBound

Catamorphisms dramatically reduce the amount of code because foldr does exactly what we want it to do for us: recurse down each element of the list.

Aside: The Maybe type

findMin returns a Maybe a. The Maybe a type is used when a type can be Nothing. It either returns Nothing or the value. In this case, when the input list is empty, the minimum element is nothing. Running the code in GHCi gives:

*Main> findMin [9,2,3,1]
Just 1
*Main> findMin []

Using Maybe is not strictly necessary, Haskell's prelude (standard library) includes the minimum function which throws an exception when an empty list is the input:

*Main> minimum []
*** Exception: Prelude.minimum: empty list
*Main> minimum [4,5,6,7,8]

findMinFold, on the other hand, returns the base case when the input list is empty.

How Does It Work?

Going back to the above example, what’s going on in that one line?

First, let us look at findMinFold's type from GHCi:

Prelude> :t findMinFold 
findMinFold :: (Foldable t, Ord b, Bounded b) => t b -> b

findMinFold takes a foldable type b and returns a result of type b. More specifically, it takes a [Int] and returns an Int.

We invoke foldr in findMinFold, which has the following type signature:

*Main> :t foldr
foldr :: Foldable t => (a -> b -> b) -> b -> t a -> b

That is, foldr takes 3 inputs:

  • A function that takes two arguments, one of type a and one of type b, and returns a result of type b.
  • An input of type b.
  • An input of foldable type a.

In findMinFold, we've given foldr 2 inputs. The first is the function min, of type:

Prelude> :t min
min :: Ord a => a -> a -> a

min happens to have the second input type the same as the first, but bear in mind that it doesn't have to be as an input function to foldr!

The second input is maxBound, of type:

Prelude> :t maxBound
maxBound :: Bounded a => a

Int is an instance of the Bounded typeclass. In findMinFold, maxBound is thus of type Int. Note that Integer and Float are not bounded!

foldr recurses the function with the earlier result as an input. In this example, foldr does the following:

  • When the list is empty, it returns the maxBound:

fold_left min max_int [] => maxBound

  • When the list has one element, it returns the min of maxBound and the element:

foldr min maxBound [e1] => min e1 maxBound

  • When the list has two elements, min takes the result of (min e2 maxBound) as an input, and compare it with e1:

foldr min [e1,e2] maxBound => min e1 (min e2 maxBound)

  • When the array has three elements, min takes the result of (min e3 maxBound) as an input to compare with e2,
    the result of which is then compared to e1:

foldr min maxBound [e1,e2,e3] => min e1 (min e2 (min e3 maxBound))

and so on...

We can generalize it to any function f, with z as the starting case and x as the list input, the result is as in the prelude with f as an infix operator:

foldr f z [x1, x2, ..., xn] == x1 f (x2 f ... (xn f z)...)

foldl works similarly, except that the list elements are the first input to the function, and the brackets are to the left:

foldl f z [x1, x2, ..., xn] == (...((z f x1) f x2) f...) f xn

The Base Case

You can see from the above example that to find the minimum element of a list, we can use foldr or foldl, naturally with the min function as the function input. But how do we choose the base case input?

The base case is what fold returns on an empty list input. When the input to findMinFold is an empty list, the base case is returned. As you can see from GHCi, the maximum Int on my system is 9223372036854775807:

*Main> findMinFold []

Other examples are:

-- adding 0 and x returns x
sum :: (Foldable t, Num a) => t a -> a
sum = foldr (+) 0

-- multiplying 1 and x returns x
product :: (Foldable t, Num a) => t a -> a
product = foldr (*) 1

Because the base case to foldr is only there to safe guard an empty input, when you can be sure that the input is not empty, the base case would not be necessary! Haskell provides such option. We’ll see this in the next post, along with more examples of folds!

Leave a Reply

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.