Functional Typeclasses 1: Overview and `Semigroup`

This is the first of a series of articles that deep dives a few very powerful type classes that every functional programmer should know. For example,Monoid, Functor, Applicative, etc. These are indeed so functional that they are prevalent in Haskell production code. We explain them with examples in Haskell. This series assumes the audience is familiar with some functional programming concepts we went through in the last series including higher order functions and types and typeclasses.

I'll first give an overview of the class of typeclasses that will be covered in this series. Then, we will go through a very universal typeclass: Semigroup.

Typeclasses: recap

Recall that typeclasses are categories of types, and so the instances of a class are types (not values)! A typeclass definition consists of:

  • operations (aka functions or methods) that apply to the typeclass instances.
  • laws that these operations need to satisfy.

As I go through more typeclasses I first introduce their methods and laws and what these empower us to do.

Let's look at the Num typeclass in Haskell as an example:

Methods for Num

These operations are defined for the Num typeclass:

  • (+) addition:
    • (+) :: a -> a -> a infixl 6
  • (-) subtraction:
    • (-) :: a -> a -> a infixl 6
  • (*) multiplication:
    • (*) :: a -> a -> a infixl 7
  • abs get the absolute value:
    • abs :: a -> a
  • signum get the sign of a number:
    • signum :: a -> a
    • For real numbers, the signum is either -1 (negative), 0 (zero), or 1 (positive).
  • fromInteger getting an a from an Integer:
    • fromInteger :: Integer -> a
  • negate negation:
    • negate :: a -> a

That is, for any type that belongs to the Num class, we are able to perform addition, subtraction, etc on them. For example, since Int is an instance of Num, we know we can perform these operations on any Int.

Laws the methods should satisfy

For all typeclasses, the defined methods must satisfy certain laws. Associativity and Commutativity are common laws for typeclasses.

For example, for the Num typeclass (the Haskell Report defines no laws for Num. However, (+) and (*) are customarily expected to have the following properties):

Associativity of (+)
    (x + y) + z = x + (y + z)
Commutativity of (+)
    x + y = y + x

The class of typeclasses we'll focus on

They have instances that are type families

Typeclasses like Num has simple types like Integer or Natural (natural numbers) in it but the typeclasses we'll focus on in this series specify more generic methods. Therefore, these typeclasses have instances of mostly type families in them. That is, the instances are type constructors. They take at least one type argument to form a type. For example, some of the instances of the Semigroup typeclass include:

Semigroup [a]
Semigroup (Either a b)

That is, [Int], [Bool] (from the List type family) and Either Int Bool, Either Char Int (from the Either type family) etc are all Semigroup instances!

Their relationships

The typeclasses we'll cover are related, as depicted from the below graph from Typeclassopedia:

  • Solid arrows point from the general to the specific; that is, if there is an arrow from Foo to Bar it means that every Bar is a Foo.
  • Dotted lines indicate some other sort of relationship.

We'll start from the middle: Semigroup. We'll then look at Monoid. As you can see from the graph, all Monoid instances are Semigroup instances. Next, we'll look at Foldable and Traversable. Finally, Functor, which will build up to Applicative. Then Alternative and Monad. All these are very common typeclasses.

Semigroup

Looking at the reference for Semigroup, you will see that there's only one method, <>, sometimes called append:

(<>) :: a -> a -> a

In a nutshell, if a type is an instance of the Semigroup we know that we can apply the <> operator to two values.

The function (<>) has to satisfy associativity. I.e., the following has to hold for <>:

 x <> (y <> z) = (x <> y) <> z

And that's it! For example, List is an instance of Semigroup because the append function ((++)) is associative:

(++) :: [a] -> [a] -> [a]

Because we know that:

[1,2] ++ ([3,4] ++ [5,6]) 
= [1,2,3,4,5,6]
= ([1,2] ++ [3,4]) ++ [5,6]

Because list's (++) satisfies the requirement for Semigroup, it is an instance of Semigroup.

This also means that you can use <> on lists to append lists. We can see this in GHCi:

Prelude> [1,2] <> ([3,4] <> [5,6])
[1,2,3,4,5,6]

In production code, it is not uncommon to preferentially use the more general <> to the datatype specific operators. It's good to get used to seeing them!

Notable Semigroups

In this section we study some notable Semigroups and how the operator <> behaves with GHCi.

Either

As from hackage: the Either type represents values with two possibilities: a value of type Either a b is either Left a or Right b.

data Either a b

The Either type is sometimes used to represent a value which is either correct or an error; by convention, the Left constructor is used to hold an error value and the Right constructor is used to hold a correct value (mnemonic: "right" also means "correct").

Since Either is an instance of Semigroup, we know we can use the <> operator on them. Trying it in GHCi:

Prelude> import Data.Either
Prelude Data.Either> Right 1 <> Right 2 <> Left ()
Right 1
Prelude Data.Either> Left () <> Right 1 <> Right 2
Right 1
Prelude Data.Either> Left 1 <> Left 2 <> Right 3
Right 3
Prelude Data.Either> Left 1 <> Left 2 <> Left 3 <> Left 4
Left 4

Notice that <> in Either returns the first Right and if there is no Right, returns the last Left. The name "append" doesn't really work well for the <> operator here!

Maybe

Let's look at the Maybe semigroup:

data Maybe a 
    = Nothing
    | Just a

The Maybe type encapsulates an optional value. A value of type Maybe a either contains a value of type a (represented as Just a), or it is empty (represented as Nothing).

Maybe is listed as an instance of Semigroup like this in hackage (in the "instances" section of either Maybe or Semigroup):

Semigroup a => Semigroup (Maybe a)

That is, we need the type variable a to be a Semigroup instance for a Maybe a to be a Semigroup instance!

Let's see what <> does with Maybe String:

Prelude> import Data.Maybe
Prelude Data.Maybe> Just " is awesome" <> Nothing
Just " is awesome"
Prelude Data.Maybe> Just "Haskell" <> Just " is awesome" <> Nothing
Just "Haskell is awesome"
Prelude Data.Maybe> Nothing <> Just "Haskell" <> Just " is awesome"
Just "Haskell is awesome"
Prelude Data.Maybe> Just "Haskell" <> Nothing <> Just " is awesome"
Just "Haskell is awesome"

<> appends the strings, ignoring any Nothing. This makes sense because type String = [Char] and we know that [a] is an instance of Semigroup.

() is also an instance of Semigroup, so it just works in GHCi:

Prelude Data.Maybe> Just () <> Just ()
Just ()
Prelude Data.Maybe> Just () <> Just () <> Nothing
Just ()

What about Maybe Integer? We get some errors in GHCi:

Prelude Data.Maybe> (Just 1::Maybe Integer) <> Just 2

<interactive>:3:1: error:
    • No instance for (Semigroup Integer) arising from a use of ‘<>’
    • In the expression: (Just 1 :: Maybe Integer) <> Just 2
      In an equation for ‘it’: it = (Just 1 :: Maybe Integer) <> Just 2

This is because Integer is not an instance of Semigroup! That is, the compiler doesn't know how to apply the <> operator on Integer!

Sum or product

To fix that, we can tell the compiler how to apply the <> operator on Integer or any Num type. As listed on Semigroup's hackage, for the Num class, its Semigroup instances are via either Sum or Product from Data.Monoid (in the "Num wrappers" section):

Num a => Semigroup (Product a)
Num a => Semigroup (Sum a)

If we specify the type to be a Product a, then the GHCi should know to <> them by multiplication:

Prelude Data.Maybe> import Data.Monoid
Prelude Data.Maybe Data.Monoid> (Just 1::Maybe (Product Integer)) <> Just 2
Just (Product {getProduct = 2})
Prelude Data.Maybe Data.Monoid> (Just 1::Maybe (Product Integer)) <> Just 2 <> Just 3 <> Nothing
Just (Product {getProduct = 6})

Try it out with Sum!

Non-empty list

The NonEmpty data family is a list that has at least one element in it.

data NonEmpty a

We construct it by using the :| constructor, and specifying the first element before the :| constructor - this ensures that it won't be an empty list. <> works the same on NonEmpty as on List:

Prelude> import Data.List.NonEmpty 
Prelude Data.List.NonEmpty> (1 :| [2,3]) <> (4 :| [])
1 :| [2,3,4]

Note that the type argument need not be a simple type for any of these! For example, we can have a NonEmpty (Either [Int] Bool) and <> will still work, as long as the typeclass requirements are fulfilled:

Prelude> import Data.Either
Prelude Data.Either> import Data.List.NonEmpty 
Prelude Data.Either Data.List.NonEmpty> (Right [1,2] :| [Left False, Right [3]]) <> (Left True :| [Right []]) 
Right [1,2] :| [Left False,Right [3],Left True,Right []]

There are many other Semigroup instances that are not mentioned here and I encourage you to try them out in GHCi!

This concludes this post. You now know what the <> operator is, and what it does in some common Semigroups. In the next post, we'll learn about Monoid. Stay tuned!

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.