Strings and Tries; Haskell Versus OCaml

While doing my OCaml MOOC exercise I came across a trie implementation for looking up words.  I find that the way OCaml treats strings is very different from the way Haskell does it.  It leads to different ways to implement the trie in the two languages.

Strings and Tries in  Haskell

Haskell is a pure functional language and strings are just a list of chars!  You don’t even need the string library, which is for other string like types in Haskell.  Just treat it as a string of chars:

Prelude> s = "string"
Prelude> head s
Prelude> last s
Prelude> tail s

I implement the following trie:

--Define the Trie data type.
data Trie = Trie (Maybe Int) [(Char, Trie)]
--An empty trie.
empty = Trie Nothing []
--A trie example. See below.
example = Trie Nothing [('i', Trie (Just 11)
                          [('n', Trie (Just 5) [('n', Trie (Just 9) [])])]),
                        ('t', Trie Nothing [('e', Trie Nothing [
                                              ('n', Trie (Just 12) []),
                                              ('d', Trie (Just 4) []),
                                              ('a', Trie (Just 3) [])]),
                                            ('o', Trie (Just 7) [])]),
                        ('A', Trie (Just 15) [])]                     

The example represents the following trie (from wikipedia):

A natural way to implement a look-up function for a trie in Haskell is to recurse over the string.  E.g., in a .hs file:

find's type signature: it takes a (Char, Trie) list and a Char, 
and returns the corresponding trie, if any. If none, returns Nothing. 
find :: [(Char,Trie)] -> Char -> Maybe Trie 
--"\" is lambda or like "fun" in OCaml. 
find l char = foldr (\(c,t) r -> if c == char then Just t else r) Nothing l 

lookuptrie's type signature: it takes a Trie and a String, 
and returns the corresponding key of the string if in the trie. 
If not, returns Nothing.
lookuptrie :: Trie -> String -> Maybe Int 
lookuptrie (Trie v _) [] = v
--maybe is Maybe's fold. 
lookuptrie (Trie v next) (c:cs) = maybe Nothing (\t -> lookuptrie t cs) (find next c)

The functional style strings also allow us to use folds.  This look-up function is only 3 lines!

Strings and Tries in OCaml

In OCaml, strings are imperative by default.  You access each char of a string by index, similar to accessing elements of an array.  Like arrays, indexes of strings starts from 0.  E.g.:

utop # let s = "string";; 
val s : string = "string"
utop # s.[2];;
- : char = 'r'

To recurse over the string like in Haskell, one may turn the string to a list of chars in OCaml, then recurse over it.  You can use the the Core library’s String module:

(* Load the Core library in utop. *)
utop # #require "core";;
utop # open Core;;
(* Invoke the to_list function of the String module and turn s into a list of chars. *)
utop # let s_list = String.to_list s;;
val s_list : char list = ['s'; 't'; 'r'; 'i'; 'n'; 'g']
(* Invoke the Core library's List module. *)
utop # List.hd s_list;;
- : char option = Some 's'

utop # s_list;;
- : char list option = Some ['t'; 'r'; 'i'; 'n'; 'g']

The Core library’s List module returns char option and char list option on List.hd and!  The standard library returns char and char list on List.hd and

(* I haven't loaded the Core library in this utop instance.*)
utop # let s = "string";;
val s : string = "string"
(* The standard library's String module does not have the to_list function.*)
utop # let s_list = String.to_list s;;
Error: Unbound value String.to_list

utop # let s_list = ['s'; 't'; 'r'; 'i'; 'n'; 'g'];;
val s_list : char list = ['s'; 't'; 'r'; 'i'; 'n'; 'g']
(* The standard library's List.hd returns type char on a char list.*)
utop # List.hd s_list;;
- : char = 's'
(* The standard library's returns type char list on a char list.*)
utop # s_list;;
- : char list = ['t'; 'r'; 'i'; 'n'; 'g']

For some, loading the Core library may be too heavy to bother.  You could of course write your own to_list function.  But I believe in not fighting with the language I’m using!  I implement my look-up function without turning the string to a list.

Below is a trie implementation in OCaml from OCaml MOOC, with the same example as above:

type trie = Trie of int option * char_to_children
and char_to_children = (char * trie) list

let empty =
  Trie (None, [])

let example =
  Trie (None,
	[('i', Trie (Some 11,
                     [('n', Trie (Some 5, [('n', Trie (Some 9, []))]))]));
	  Trie (None,
		  Trie (None,
			[('n', Trie (Some 12, [])); ('d', Trie (Some 4, []));
			 ('a', Trie (Some 3, []))]));
		 ('o', Trie (Some 7, []))]));
	 ('A', Trie (Some 15, []))])

I use a counter for the string index and recurse over it:

(* children_from_char takes a char_to_children and a char, 
and find the corresponding trie, if any. If none, returns None. *)
let rec children_from_char m c = 
  match m with 
    |(x, t) :: tl -> 
      if x = c then Some t else children_from_char tl c 
    |_ -> None;; 

(* look_up_with_index takes a trie, a string, and the string char's index, 
and returns the key of the string in the trie, if any. If none, returns None. *) 
let rec lookup_with_index trie w index = 
  match trie with 
    | Trie (key, ctc) -> 
      if (index + 1 <= String.length w) then 
        match (children_from_char ctc w.[index]) with 
          | Some trie -> lookup_with_index trie w (index + 1) 
          | None -> None else key;; 

(* Initialized lookup with string index starting from 0. *) 
let rec lookup trie w = 
  lookup_with_index trie w 0;;

Alternatively, one may implement it imperatively and use a for-loop in OCaml.  You can see that OCaml is more versatile, because it can do both functional and imperative styles.  However, its functional style is more lengthy in this case.  Haskell can implement this more simply but it lacks the imperative option.

Please feel free to enlighten me with your way of implementing this!



, ,




Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.