Data Structures in OCaml

In this post I continue going through Chapter 1 of Real World OCaml.  I learnt that:

  • Errors may be caught at compile time or at run time.  Compile-time errors are preferred to runtime errors, because it's better to catch errors as early as possible in the development process.  An e.g. of a compile-time error is type errorExceptions like division by zero is a runtime error.
  • Concatenation operators are provided as part of the Pervasives module, which is automatically opened in every OCaml program.
  • We used the ^ operator for concatenating strings.
  • One can define an anonymous function using the fun keyword.  Anonymous functions don't need to be explicitly named.
  • Another syntax for pattern matching in OCaml is the let pattern = expression construct.  You can extract values from a data structure.  E.g., you can extract the components of a tuple:
    let a_tuple = (3,"three");;
    
    let (x,y) = a_tuple;;
    

    OCaml put the value 3 in the variable x, and "three" in the variable y.

More on Lists

Base comes with a List module that has a rich collection of functions for working with lists. We can access values from within a module by using dot notation.  E.g.

  • List.length compute the length of a list.
  • List.map takes two arguments: a list and a function for transforming the elements of that list. It returns a new list with the transformed elements and does not modify the original list.  The arguments of List.map do not have to be ordered because the function is passed on with a labeled argument ~f.  That is, the following are equivalent:
    List.map ~f:FunctionName Listname
    List.map ListName ~f:FunctionName
  • List.exists takes a list and a function as arguments, and checks if there are any elements of the list for which the provided function evaluates to true.

The common OCaml idiom uses hd to refer to the head of the list and tl to refer to the tail.

Options

An option is a data structure that is used to express that a value might or might not be present.  It can be thought of as a list that have either zero or one element.  Option is the standard way in OCaml to encode a value that might not be there.  Some and None are operators (constructors) for options.  E.g.

let divide x y =
  if y = 0 then None else Some (x / y) ;;

The function divide takes 2 arguments of type int, and returns type int option.  The type signature is:

val : divide int -> int -> int -> option = <fun>

The function divide either returns None if the divisor is zero, or Some of the result of the division otherwise.

Making Your Own Types: Records

You can define your own data type in OCaml using the following construct:

type NewTypeName = { data structure with type information } ;;

E.g. to define the type point2d with the two components of type float:

type point2d = { x : float; y : float };;

point2d is of type record.  You can think of record as a tuple where the individual fields are named, rather than being defined positionally.  In this case, the first field is named x, and the second is named y.  You can construct a variable of type record with let, as usual:

let p = { x = 3.; y = -4. };;

The type-signature of p is

val p : point2d = {x = 3.; y = -4.}

You can access the content of a record type in many ways:

  • Using pattern matching:
    let magnitude { x = x_pos; y = y_pos } =
      Float.sqrt (x_pos **. 2. +. y_pos **. 2.);;
    

    The pattern match here binds the variable x_pos to the value contained in the x field, and the variable y_pos to the value in the y field.

  • Using field pruning:
    let magnitude { x; y } = Float.sqrt (x **. 2. +. y **. 2.);;
    

    Because the name of the fields (x, y in the point2d type) and the name of the variable it is bound to (the argument input to magnitude) coincide, we don't have to write them down.

  • Using the not notation:
    let distance v1 v2 =
      magnitude { x = v1.x -. v2.x; y = v1.y -. v2.y };;
    

    Here, v1 and v2 are of type point2d.  We access the x field of v1 with the v1.x notation, and the y field of v1 with the v1.y notation.  Similarly for v2.

We can include our newly defined types as components in larger types.  E.g.:

type circle_desc  = { center: point2d; radius: float };;

type rect_desc    = { lower_left: point2d; width: float; height: float };;

type segment_desc = { endpoint1: point2d; endpoint2: point2d } ;;

Variant

The type variant combine multiple objects of different types together as one type.  E.g.:

type scene_element =
  | Circle  of circle_desc
  | Rect    of rect_desc
  | Segment of segment_desc
;;

The | character separates the different cases of the variant (the first | is optional), and each case has a capitalized tag, like Circle, to distinguish that case from the others.

I think I'll stop here before I go into the next section: imperative programming.

 

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.