Folding Lists

Intro

Folds are catamorphisms. “Cata-” means “down” or or “against”, as in catacomb. “Morph” means “form”, like we already know from “monomorphism” or “polymorphism”.

Cathamorphism is a means to deconstruct data. A list has a spine defining the list’s structure, and a fold is the operation that can reduce (transform) that structure. A fold (catamorphism) can break down that structure, but evaluation can potentially reconstruct the list, which is why folds can operate on lists and still return lists as a result.

Fold Right

foldr is short for fold right.

Although not all programming languages implement the exact same concept of a reduce function, we can more or less think of foldr as reduce.

Node.js REPL
$ node --interactive
> [1, 2, 3].reduce((acc, n) => acc + n, 0);
6
IRB REPL
$ irb --simple-prompt
>> [1, 2, 3].reduce(0) { |acc, n| n + acc }
=> 6

There are more idiomatic ways of doing it in Ruby, but we kept it like this to make the similarities easier to behold.

Python REPL
$ python
from functools import reduce
>>> reduce(lambda acc, n : acc + n, [1, 2, 3], 0)
6

And finally foldr in Haskell:

GHCi REPL
$ ghci
λ> foldr (+) 0 [1, 2, 3]
6

GHCi 9.2.8 (2023) gives us this type for foldr:

type signature of foldr
λ> :type foldr
foldr :: Foldable t => (a -> b -> b) -> b -> t a -> b

Because foldr works on any foldable type (not just lists), it says “Foldable t”. For now, we can replace (mentally or on our simpler code examples) Foldable t with [a] (list of \$a\$). That is possible because we can make generic types more concrete (but not the other way around).

Type-checks perfectly fine:

myFoldR :: (a -> b -> b) -> b -> [a] -> b
myFoldR = foldr

A fold replaces each cons constructor with the function being applied and reduces the list.

This:

foldr (+) 0 [1, 2, 3]

Is the same as:

foldr (+) 0 1 : 2 : 3 : []

Which is more or less like this:

1 + (2 + (3 + ))

We could say foldr is similar to map. While map applies a function to each element of a list and returns a list, foldr maps a function of each element of the list and also reduces the list.

foldr is right associative.

fldr :: (a -> b -> b) -> b -> [a] -> b
fldr _ z []       = z
fldr f z (x : xs) = f x (fldr f z xs)

And fldr (+) 0 [1, 2, 3] evaluates like this:

fldr (+) 0 [1, 2, 3]
(+) 1 (fldr (+) 0 [2, 3])
(+) 1 ((+) 2 (fldr (+) 0 [3]))
(+) 1 ((+) 2 ((+) 3 (fldr (+) 0 [])))
(+) 1 ((+) 2 ((+) 3 0))
(+) 1 ((+) 2 3)
(+) 1 5
6

Which can be also thought like this:

1 + (2 + (3 + 0))
1 + (2 + 3)
1 + 5
6

Visualizing foldr with a helper

--
-- A “print fold right” utility to display the folding right
-- of simple lists of numbers using the `+` operator.
--
pfr :: Foldable t => t [Char] -> [Char]
pfr = foldr (\x y -> concat ["(", x, " + ", y, ")"]) "0"
--
-- λ> pfr $ map show [1 .. 5]
-- "(1 + (2 + (3 + (4 + (5 + 0)))))"
--

Folding Stages

Folding happens in two stages.

  • Traversal (recursing over the spine).

  • Evaluation (folding of the values by applying the function).

In the first stage, the spine is recured computing expressions to be evaluated at the second stage. In the second stage, those expressions are evaluated. The function is finally really applied to the arguments and the list is reduced.

Both left and right folds recur the spine from left to right, but then the folding proper can be left-associative (for left folds) or right-associative for right folds).

fldr (+) 0 [1, 2, 3]
(+) 1 (fldr (+) 0 [2, 3])                Traverse (recurse) from left
(+) 1 ((+) 2 (fldr (+) 0 [3]))           to right building up exprs.
(+) 1 ((+) 2 ((+) 3 (fldr (+) 0 [])))

(+) 1 ((+) 2 ((+) 3 0))                  Evaluates the expressions
(+) 1 ((+) 2 3)                          by applying the function (+)
(+) 1 5                                  thus reducing the list.
6

The evaluation of the expressions is the actual folding.

A fold right uses “the rest of the fold” as one of its arguments.

fldr f z (x : xs) = f x (fldr f z xs)
--                      ^-----------^
--                            /
--                           /
--                    rest of the fold

Force or not to force

Because of this two-stage process and non-strict (lazy) evaluation, if the function does not evaluate (force) its second argument (the rest of the fold), the rest of the spine is not forced, which is what allows Haskell to work with infinite streams of data.

+, for instance, is strict on both arguments. It will force the all the spine and all the values, which is why we cannot do something like this (note the infinite list):

Runs forever. Don’t do this at home.
foldr (+) 0 [1..]

Remember you can hit Ctrl+c to cancel long-running processes.