Using Unfolds to Zip and More in OCaml

This post picks up from where we left off here, where I explained anamorphisms.  In the last post we looked at iterating with anamorphisms.  Here we look at another example of applying unfolds: zip.

Zip

Zip may not be the most common example of anamorphisms but I want to cover it because it’s included in the paper I’m going through, and it’s cool! Zip is the function that turns a pair of lists to a list of pairs.  E.g., zipping the lists

Zip is the function that turns a pair of lists to a list of pairs.  E.g., zipping the lists

  • [1,2,3,4,5]
  • [6,7,8,9,10]

will give [(1,6),(2,7),(3,8),(4,9),(5,10)]. 

To write the zip function, we take the head of each of the input lists, and pair them and put them into the output list.  We repeat the process to the rest of the input lists until any of the input lists is empty.

To write this in terms of unfold, we have to determine the predicate (p) and the function (g) unfold requires.  We also have to specify the input to h (b).  Recall the definition of unfold:

Given

(1) a predicate p which returns a bool. I.e., p b = true or p b = false.

(2) a function g which returns a pair (a, b’).  I.e., g b = (a, b’).

A list-anamorphism h is

When p b = true, h b = Nil,

otherwise, h b = Cons (a, h b’).

For zip, let

  • b = ( Cons (a, as), Cons (c, cs) ),
    the input is two lists, with the heads and the tails specified.
  • p = (as = []) or (cs = []),
    the predicate to stop the zipping is when any of the remaining input lists are empty.
  • g b = ( (a, c), (as, cs) ).

Putting these in the definition of unfold:

When (as = []) or (cs = []), h b = Nil,

otherwise h b  = Cons ((a, c), h (as, cs)).

There! We have the zip function that pairs up two lists into one until any of the input lists is empty!

More than Zip

What if we want zip to stop under a different condition?  We can give h a different predicate.  E.g., when p = (as = []) or (cs = []) or (a = 0) or (c = 0), then the zipping stops when any of the list is empty, or when any of the heads of the input lists is zero.

What if we want to turn more than two lists into one?  We can given h different b, p and g.  E.g., to combine three lists:

  • b = ( Cons (a, as), Cons (c, cs), Cons (d, ds) ),
  • p = (as = []) or (cs = []) or (ds = []), and
  • g b = ( (a, c, d), (as, cs, ds) )

What if we want the output list to have different configuration? E.g., when g b = ( (a, c, c), (as, cs) ), we get the elements of the second input list two times in each tuple of the output list.

Or when g b = ( (a, (a + c) ), (as, cs) ), we get the sum of the two elements in each second item of the tuple. Let’s see these in OCaml!

In OCaml

To write zip with unfolds, we can use the unfold function we wrote last time but modify it for two inputs, and specify the b, p, g as above:

You can see that zip works as intended when you run the above code in utop:

utop # #use "zip.ml";;
val unfold :
  ('a -> 'b -> bool) -> ('a -> 'b -> 'c * ('a * 'b)) -> 'a -> 'b -> 'c list =
  <fun>
val zip : '_a list -> '_b list -> ('_a * '_b) list = <fun>
utop # zip [] [];;
- : ('_a * '_b) list = []
utop # zip [1;2;3] [4;5;6];;
- : (int * int) list = [(1, 4); (2, 5); (3, 6)]
utop # zip [1;2;3] [1];;
- : (int * int) list = [(1, 1)]

If we want zip to stop under different conditions, we can specify it in the predicate. For the example of stopping the zipping when any of the input lists is empty, or when any of the heads of the input lists is zero:

You can see that zip_no_zero works as intended when you run it in utop:

utop # zip_no_zero [1;2;3;4] [1;2;0;4];;
- : (int * int) list = [(1, 1); (2, 2)]
utop # zip_no_zero [1;2;3;4] [1;2;3;4];;
- : (int * int) list = [(1, 1); (2, 2); (3, 3); (4, 4)]

If we want to zip more than two lists at once, we can modify unfold to take more inputs:

utop # zip3 [] [] [];;
- : ('_a * '_b * '_c) list = []
utop # zip3 [1] [2] [3;4];;
- : (int * int * int) list = [(1, 2, 3)]
utop # zip3 [1;5] [2;6] [3;4];;
- : (int * int * int) list = [(1, 2, 3); (5, 6, 4)]

If we want the output list to have different configuration, we simply modify g. E.g., to get the elements of the second input list two times in each tuple of the output list:

utop # zip_repeat_input2 [1;2;3;4] [5;6;7;8;9];;
- : (int * int * int) list = [(1, 5, 5); (2, 6, 6); (3, 7, 7); (4, 8, 8)]

To get the sum of the two elements in each second item of the tuple:

utop # zip_sum [1;2;3;4] [5;6;7;8];;
- : (int * int) list = [(1, 6); (2, 8); (3, 10); (4, 12)]

In the next post, we will look at the equivalent code in Haskell, and compare the two languages.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.