2012-07-19

Theorems for Free: The Monad Edition

This is for the record, since the derivations took me a while and I'd rather not lose them.

A functor is the signature:

module type FUNCTOR = sig
  type 'a t
  val fmap : ('a -> 'b) -> ('a t -> 'b t)
end

satisfying the following laws:

Identity:    fmap id      ≡ id
Composition: fmap (f ∘ g) ≡ fmap f ∘ fmap g
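
To fix ideas, here is a minimal instance sketch of mine, the list functor; both laws are classic properties of List.map:

module ListF : FUNCTOR with type 'a t = 'a list = struct
  type 'a t = 'a list
  let fmap = List.map
end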

An applicative structure or idiom is the signature:

module type APPLICATIVE = sig
  type 'a t
  val pure : 'a -> 'a t
  val ap : ('a -> 'b) t -> ('a t -> 'b t)
end

satisfying the following laws:

Identity:     ap (pure id)                  ≡ id
Composition:  ap (ap (ap (pure (∘)) u) v) w ≡ ap u (ap v w)
Homomorphism: ap (pure f) ∘ pure            ≡ pure ∘ f
Interchange:  ap u (pure x)                 ≡ ap (pure (λf.f x)) u
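
Again to fix ideas, a minimal instance sketch of mine, the option idiom; ap applies a present function to a present argument and is None otherwise:

module OptionAp : APPLICATIVE with type 'a t = 'a option = struct
  type 'a t = 'a option
  let pure x = Some x
  let ap u v = match u, v with
  | Some f, Some x -> Some (f x)
  | _              -> None
end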

An applicative functor is the structure:

module type APPLICATIVE_FUNCTOR = sig
  type 'a t
  include FUNCTOR     with type 'a t := 'a t
  include APPLICATIVE with type 'a t := 'a t
end

that is simultaneously a functor and an applicative structure, satisfying the additional law:

Fmap: fmap ≡ ap ∘ pure

A monad is the structure:

module type MONAD = sig
  type 'a t
  val return : 'a -> 'a t
  val bind : ('a -> 'b t) -> ('a t -> 'b t)
end

satisfying the following laws:

Right unit:    bind return     ≡ id
Left unit:     bind f ∘ return ≡ f
Associativity: bind f ∘ bind g ≡ bind (bind f ∘ g)
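
A minimal instance sketch of mine, the list monad; note that bind takes the function first, matching the composition-friendly order of the laws above:

module ListM : MONAD with type 'a t = 'a list = struct
  type 'a t = 'a list
  let return x = [x]
  let bind f m = List.concat (List.map f m)
end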

Every monad is an applicative functor:

module ApplicativeFunctor (M : MONAD)
: APPLICATIVE_FUNCTOR with type 'a t = 'a M.t
= struct
  type 'a t = 'a M.t
  open M
  let fmap f = bind (fun x -> return (f x))
  let pure   = return
  let ap u v = bind (fun f -> fmap f v) u
end

This can be proved by easy (but tedious) equational reasoning, in the style of Wadler's celebrated "Theorems for Free!".
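
Before the proofs, a quick instantiation sketch (the module names are mine) to show the functor delivering what it promises:

module OptionM : MONAD with type 'a t = 'a option = struct
  type 'a t = 'a option
  let return x = Some x
  let bind f = function Some x -> f x | None -> None
end

module OptionAF = ApplicativeFunctor (OptionM)

let () =
  assert (OptionAF.fmap succ (Some 1) = Some 2);
  assert (OptionAF.ap (OptionAF.pure succ) (Some 2) = Some 3)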

Proof of the Functor Identity:

  fmap id
≡ { definition }
  bind (return ∘ id)
≡ { composition }
  bind return
≡ { Monad Right unit }
  id

Proof of the Functor Composition:

  fmap f ∘ fmap g
≡ { definition }
  bind (return ∘ f) ∘ bind (return ∘ g)
≡ { Monad Associativity }
  bind (bind (return ∘ f) ∘ return ∘ g)
≡ { Monad Left unit }
  bind (return ∘ f ∘ g)
≡ { definition }
  fmap (f ∘ g)

A number of naturality conditions are simple equations between λ-terms. I'll need these later:

Lemma 1 (Yoneda):

  fmap f ∘ (λh. fmap h x)
≡ { defn. ∘, β-reduction }
  λg. fmap f (fmap g x)
≡ { defn. ∘ }
  λg. (fmap f ∘ fmap g) x
≡ { Functor Composition }
  λg. fmap (f ∘ g) x
≡ { abstract }
  λg. (λh. fmap h x) (f ∘ g)
≡ { defn. ∘, η-contraction }
  (λh. fmap h x) ∘ (f ∘)

Lemma 2:

  fmap f ∘ return
≡ { definition }
  bind (return ∘ f) ∘ return
≡ { Monad Left unit }
  return ∘ f

Lemma 3:

  bind f ∘ fmap g
≡ { definition fmap }
  bind f ∘ bind (return ∘ g)
≡ { Monad Associativity }
  bind (bind f ∘ return ∘ g)
≡ { Monad Left unit }
  bind (f ∘ g)

Lemma 4:

  bind (fmap f ∘ g)
≡ { definition fmap }
  bind (bind (return ∘ f) ∘ g)
≡ { Monad Associativity }
  bind (return ∘ f) ∘ bind g
≡ { definition fmap }
  fmap f ∘ bind g

The Applicative Functor condition is easy to prove and comes in as a handy lemma:

  (ap ∘ pure) f
≡ { defn. ∘, definition ap }
  λv. bind (λg. fmap g v) (return f)
≡ { Monad Left unit, β-reduction }
  λv. fmap f v
≡ { η-contraction }
  fmap f

for all f; that is, ap ∘ pure ≡ fmap.

Proof of the Applicative Identity:

  ap (pure id)
≡ { Applicative Functor }
  fmap id
≡ { Functor Identity }
  id

Proof of the Applicative Homomorphism:

  ap (pure f) ∘ pure
≡ { Applicative Functor }
  fmap f ∘ pure
≡ { Lemma 2, defn. pure }
  pure ∘ f

Proof of the Applicative Interchange:

  ap (pure (λf.f x)) u
≡ { Applicative Functor }
  fmap (λf.f x) u
≡ { definition }
  bind (return ∘ (λf.f x)) u
≡ { defn. ∘, β-reduction }
  bind (λf. return (f x)) u
≡ { Lemma 2 }
  bind (λf. fmap f (return x)) u
≡ { definition }
  ap u (pure x)

The proof of the Applicative Composition condition is the least straightforward of the lot, as it requires ingenuity to choose the reduction to apply at each step. I started with a long, tedious derivation that required both forward and backward reasoning; at the end I refactored it into byte-sized lemmas in order to simplify it as much as I could. As a heuristic, I always try to start from the most complicated expression, to avoid having to guess where and what to abstract (that is, applying elimination rules requires only neatness, while applying introduction rules requires backtracking):

  ap (ap (ap (pure (∘)) u) v) w
≡ { Applicative Functor }
  ap (ap (fmap (∘) u) v) w
≡ { definition }
  bind (λf. fmap f w) (bind (λf. fmap f v) (fmap (∘) u))
≡ { Lemma 3 with f := λf. fmap f v, g := (∘) }
  bind (λf. fmap f w) (bind ((λf. fmap f v) ∘ (∘)) u)
≡ { Monad Associativity with f := λf. fmap f w, g := (λf. fmap f v) ∘ (∘) }
  bind (bind (λf. fmap f w) ∘ (λf. fmap f v) ∘ (∘)) u
≡ { defn. ∘ (at right) }
  bind (λf. (bind (λf. fmap f w) ∘ (λf. fmap f v)) (f ∘)) u
≡ { defn. ∘, β-reduction }
  bind (λf. bind (λf. fmap f w) (fmap (f ∘) v)) u
≡ { Lemma 3 with f := λf. fmap f w, g := (f ∘) }
  bind (λf. bind ((λf. fmap f w) ∘ (f ∘)) v) u
≡ { Lemma 1 }
  bind (λf. bind (fmap f ∘ (λf. fmap f w)) v) u
≡ { Lemma 4 with g := λf. fmap f w }
  bind (λf. fmap f (bind (λf. fmap f w) v)) u
≡ { definition }
  ap u (ap v w)
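
As a concrete sanity check of the whole chain, here it is instantiated with the option idiom sketched earlier (compose is my helper, since 3.12's Pervasives has no composition operator); both sides evaluate to Some 7:

let compose f g x = f (g x)

let () =
  let u = Some succ and v = Some (( * ) 2) and w = Some 3 in
  let open OptionAp in
  assert (ap (ap (ap (pure compose) u) v) w = ap u (ap v w))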

What is this good for? I don't really know; Haskellers can't seem to live without these combinators, while I can't seem to justify their application. I suspect laziness has a lot to do with it; in any case, the machinery is more palatable with the appropriate combinators:

module Functor (F : FUNCTOR) = struct
  include F
  let ( <$> ) = fmap
end

module Applicative (A : APPLICATIVE) = struct
  include A
  let ( <*> ) = ap
end

module Monad (M : MONAD) = struct
  include M
  include (ApplicativeFunctor (M)
         : APPLICATIVE_FUNCTOR with type 'a t := 'a t)
  let ( <$> )     = fmap
  let ( <*> )     = ap
  let ( >>= ) m f = bind f m
end
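
A usage sketch, instantiating the combinators with the list monad sketched earlier; idiomatic application then reads almost like plain application:

module LM = Monad (ListM)

let sums  = LM.(( + ) <$> [1; 2] <*> [10; 20])    (* [11; 21; 12; 22] *)
let signs = LM.([1; 2; 3] >>= fun x -> [x; -x])   (* [1; -1; 2; -2; 3; -3] *)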

2012-07-17

An Odd Lemma

While proving that every monad is an applicative functor, I extracted the following derivation as a lemma:

  fmap f ∘ (λh. fmap h x)
≡ { defn. ∘, β-reduction }
  λg. fmap f (fmap g x)
≡ { defn. ∘ }
  λg. (fmap f ∘ fmap g) x
≡ { Functor }
  λg. fmap (f ∘ g) x
≡ { abstract f ∘ g }
  λg. (λh. fmap h x) (f ∘ g)
≡ { defn. ∘, η-contraction }
  (λh. fmap h x) ∘ (f ∘)

for all f, x. This is the Yoneda Lemma in a special form. The term λh. fmap h x is the natural transformation from the functor Hom(X, —) to an arbitrary functor F, where X is fixed by the type of x. Dan Piponi calls it the hardest trivial thing in mathematics. I didn't recognize it immediately (my category-fu is nonexistent), but the striking symmetry made me curious and I started investigating.
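
The special form is easy to check concretely, with the list functor standing in for F (the composition operator ( % ) and the sample values are mine):

let ( % ) f g x = f (g x)            (* composition *)
let nt h = List.map h [1; 2; 3]      (* λh. fmap h x, with x = [1; 2; 3] *)

let () =
  let f = succ and g = ( * ) 2 in
  assert (List.map f (nt g) = nt (f % g))   (* both sides are [3; 5; 7] *)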

2012-07-12

A minor branch off Braun Trees

Revisiting the post on Braun Trees, I noticed that, while pedagogical, the implementation of the root replacement operation rep can be streamlined a bit. By inlining the mutually recursive siftdown, specializing the matches and adding guards, the result is as follows:

let rec rep compare e = function
| N (_, (N (el, _, _) as l), E)              (* only a left child to compare *)
  when compare el e  < 0 ->
  N (el, rep compare e l, E)
| N (_, (N (el, _, _) as l), (N (er, _, _) as r))
  when compare el e  < 0                     (* at least one child must rise *)
    || compare er e  < 0 ->
    if compare el er < 0
      then N (el, rep compare e l, r)        (* sift e down the smaller side *)
      else N (er, l, rep compare e r)
| N (_, l, r) -> N (e, l, r)                 (* e already beats both children *)
| E           -> failwith "empty heap"

The guards are the strongest possible in order to minimize the need to descend into the children. The conditional under the second case could in turn be lifted into a guard, since (el < e ∨ er < e) ∧ el < er ⇒ el < e ∧ el < er, and similarly for the else case, but in my opinion the loss of simplicity would be quite substantial.
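
For completeness, a usage sketch; I assume the tree type from the Braun Trees post, declared before rep as type 'a t = E | N of 'a * 'a t * 'a t, with the element at the node followed by the left and right subtrees:

let () =
  let h = N (1, N (2, N (4, E, E), E), N (3, E, E)) in  (* a min-heap *)
  match rep compare 5 h with
  | N (e, _, _) -> assert (e = 2)   (* the smaller child, 2, moves up *)
  | E           -> assert false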

2012-07-09

Existential Crisis

In a long discussion at LtU, user Jules Jacobs advances a use-case that for him would have been difficult to solve in a statically-typed language. I focus on his first two points:

  1. A data structure that is a container of T plus a function on T's. I would have needed existential types to hide the T, which I don't get in many languages.
  2. A couple of functions that take heterogeneous list of items that the function will convert to the interface it needs the elements to be with an extensible conversion function. […]

It's true that not many languages have existential types, but OCaml happens to, or at least it allows existentials to be encoded. I'll take a big hint from MLton and try to solve both problems (I won't pretend to solve his actual problems, just my interpretation of the problems he posed). Another reason I write this post is that I always forget what the essence of existential types is. Pierce writes:

Similarly, there are two different ways of looking at an existential type, written {∃X,T}. The logical intuition is that an element of {∃X,T} is a value of type [X↦S]T, for some type S. The operational intuition, on the other hand, is that an element of {∃X,T} is a pair, written {*S,t}, of a type S and a term t of type [X↦S]T.

[…] To understand existential types, we need to know two things: how to build (or introduce, in the jargon of §9.4) elements that inhabit them, and how to use (or eliminate) these values in computations.

An existentially typed value is introduced by pairing a type with a term, written {*S,t}. A useful concrete intuition is to think of a value {*S,t} of type {∃X,T} as a simple form of package or module with one (hidden) type component and one term component. The type S is often called the hidden representation type, or sometimes (to emphasize a connection with logic, cf. §9.4) the witness type of the package. For example, the package p = {*Nat, {a=5, f=λx:Nat. succ(x)}} has the existential type {∃X, {a:X, f:X→X}}. The type component of p is Nat, and the value component is a record containing a field a of type X and a field f of type X→X, for some X (namely Nat).

(Types and Programming Languages, p. 363–364) I quote at some length because Pierce's presentation is so condensed that for me it really bears repeating; also, because every time I want a refresher I can come to this post instead of cracking open the physical book. The gist of it is pithily and more memorably stated in Mitchell and Plotkin's slogan, abstract types have existential type.

The complication is that in ML types are not values, so in order to pack an existential we must reify the type component as a term, for instance as a pair of functions, an injection and a projection:

module type UNIV = sig
  type t
  val embed : unit -> ('a -> t) * (t -> 'a)
end

Why UNIV when we're talking about existential types? Go ask Oleg. Actually, there is a type isomorphism analogous to the logical equivalence between (∃x. P x) ⇒ Q and ∀x. (P x ⇒ Q); as Jeremy Gibbons writes, …the justification being that a datatype declaration such as type e = ∃t. E of t foo introduces a constructor E : (∃t. t foo) → e, and this type is isomorphic to ∀t. (t foo → e) because e is independent of t (Haskellisms paraphrased).
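
The isomorphism can be seen directly with the GADT syntax of OCaml 4 (released right around the time of writing); the type 'a foo here is a hypothetical stand-in. Declaring the constructor at the universally quantified type is precisely what packs the existential:

type 'a foo = Foo of 'a              (* hypothetical payload *)
type e = E : 'a foo -> e             (* E : ∀t. (t foo → e) packs ∃t. t foo *)

let pack (x : int foo) : e = E x     (* introduction: the witness is int *)
let use (p : e) = match p with
| E (Foo _) -> "got one"             (* elimination: the witness stays abstract *)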

In any case, here the existential type is t, and embed produces an (inj, prj) pair that can be applied to values of some type α. Only the prj of the pair can recover the injected value; the use of any other prj will fail. There are at least two possible implementations, one using references and another using exceptions (which are values of a single open, extensible, generative type). The latter is very clever:

module Univ : UNIV = struct
  type t = exn
  let embed (type u) () =
    let module E = struct
      exception E of u
      let inj x = E x
      let prj = function E x -> x | _ -> raise Not_found
    end in E.(inj, prj)
end

The named type u and the local module are OCaml 3.12 features that together allow declaring a polymorphic exception (compare the SML solution on MLton's website). Since the exception constructor E is different for every invocation of embed (this is the "generative" bit referred to above), only the prj of the pair can recover the value:

let () =
  let inj_int  , prj_int   = Univ.embed ()
  and inj_float, prj_float = Univ.embed () in
  let r = ref (inj_int 13) in
  let s1 = try string_of_int   (prj_int   !r) with Not_found -> "None" in
  r := inj_float 13.0;
  let s2 = try string_of_int   (prj_int   !r) with Not_found -> "None" in
  let s3 = try string_of_float (prj_float !r) with Not_found -> "None" in
  Printf.printf "%s %s %s\n" s1 s2 s3

Note that the reference r holds values "of different types" via the corresponding inj. This code typechecks and when run outputs:

13 None 13.
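
For the record, here is my transliteration of the reference-based implementation mentioned above (after the SML version on MLton's site). Each embed allocates a fresh cell; inj delays the store into that cell, and prj clears the cell, runs the thunk and checks whether anything arrived:

module UnivRef : UNIV = struct
  type t = unit -> unit
  let embed (type u) () =
    let r : u option ref = ref None in
    let inj x = (fun () -> r := Some x) in
    let prj f =
      r := None; f ();
      match !r with Some x -> x | None -> raise Not_found
    in (inj, prj)
end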

On top of this "universal" existential type we can build heterogeneous property lists à la Lisp (see again MLton's site):

module PList : sig
  type t
  val make     : unit -> t
  val property : unit -> (t -> 'a -> unit) * (t -> 'a)
end = struct
  type t = Univ.t list ref

  let make () = ref []

  let property (type u) () =
    let inj, prj = Univ.embed () in
    let put r v = r := inj v :: !r
    and get r   =
      let rec go = function
      | []      -> raise Not_found
      | x :: xs -> try prj x with Not_found -> go xs
      in go !r
    in (put, get)
end

Each property must be created explicitly, but independently of any list using it. A property encapsulates an existentially-typed value; lookup proceeds simply by attempting to project the corresponding value out of each element in turn. These lists really are magical:

let () =
  let put_name  , get_name   = PList.property ()
  and put_age   , get_age    = PList.property ()
  and put_weight, get_weight = PList.property () in
  let show p = Printf.printf "%s: %d years, %.1f kg\n"
    (get_name   p)
    (get_age    p)
    (get_weight p)
  in
  let boy, girl = PList.make (), PList.make () in
  put_name   boy  "Tim";
  put_age    boy  13;
  put_weight boy  44.0;
  put_name   girl "Una";
  put_age    girl 12;
  put_weight girl 39.0;
  List.iter show [boy; girl]

Here boy and girl are two different, independent property lists that act as extensible records with labels get_name, get_age and get_weight. The list iteration prints the properties uniformly via show, without having to cast or do any strange contortions (at least not any outside the definitions). The output is:

Tim: 13 years, 44.0 kg
Una: 12 years, 39.0 kg

Of course nothing says that the property lists must be homogeneous in the properties they contain; looking for a nonexistent property will simply fail. On the other hand, probably the preferred way to handle extensible records in OCaml is via objects, using structural subtyping in the same way dynamically-typed languages would use duck typing. This would make solving the original problem a little more familiar to Python or Ruby programmers; but then, recovering the original objects from the lists would be impossible without downcasts.
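
For contrast, a small sketch of the object alternative (the example is mine): immediate objects act as extensible records, and show accepts any object with at least these three methods, which is the structural analogue of duck typing:

let show p =
  Printf.printf "%s: %d years, %.1f kg\n" p#name p#age p#weight

let () =
  let boy = object
    method name   = "Tim"
    method age    = 13
    method weight = 44.0
  end in
  show boy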

2012-07-02

Assessing Abstractions

Back to OCaml! Catching up with StackOverflow's OCaml questions, I found an interesting one about private type abbreviations in module signatures. One thing in that conversation that struck me as odd was the assertion that the compiler optimizes single-constructor variants, thereby subsuming the semantics of all three of Haskell's declarators, data, type and newtype, into one. Definitive proof would require diving into the compiler's source; I decided instead to read the output of ocamlopt to try to catch a glimpse of this, even knowing it doesn't constitute definitive proof. What I found is both good and bad, or as we say here, una de cal y una de arena (one of lime and one of sand, that is, a mixed bag).

Consider the following module definitions:

module S = struct
  type t = P of int * int
  let make x y = P (x, y)
end

module T = struct
  type t = int * int
  let make x y = x, y
end

Module S (for "single sum type") declares a single-variant type, while T (for "tuple type") declares an abbreviation. Putting them in a file and compiling it with ocamlopt -S -c generates the following listing (prelude and other specificities omitted):

S.make:
_camlAbbrev__make_1033:
        subq    $8, %rsp
        movq    %rax, %rdi
.L101:  subq    $24, %r15
        movq    _caml_young_limit@GOTPCREL(%rip), %rax
        cmpq    (%rax), %r15
        jb      .L102
        leaq    8(%r15), %rax
        movq    $2048, -8(%rax)
        movq    %rdi, (%rax)
        movq    %rbx, 8(%rax)
        addq    $8, %rsp
        ret
.L102:  call    _caml_call_gc
        jmp     .L101

T.make:
_camlAbbrev__make_1038:
        subq    $8, %rsp
        movq    %rax, %rdi
.L105:  subq    $24, %r15
        movq    _caml_young_limit@GOTPCREL(%rip), %rax
        cmpq    (%rax), %r15
        jb      .L106
        leaq    8(%r15), %rax
        movq    $2048, -8(%rax)
        movq    %rdi, (%rax)
        movq    %rbx, 8(%rax)
        addq    $8, %rsp
        ret
.L106:  call    _caml_call_gc
        jmp     .L105

(ocamlopt -version is 3.12.1). The compiler generates the exact same code in both cases. The three consecutive movq stores build the heap-allocated 3-word pair as tag ($2048), first component (%rdi) and second component (%rbx). In order to test modular abstraction, I coerce these implementations into the following signatures:

module A : sig
  type t = int * int
  val make : int -> int -> t
end = T

module B : sig
  type t = private int * int
  val make : int -> int -> t
end = T

module X : sig
  type t = P of int * int
  val make : int -> int -> t
end = S

module Y : sig
  type t = private P of int * int
  val make : int -> int -> t
end = S

Modules A and B coerce module T with a public and a private type definition, respectively. Modules X and Y do the same with module S. The following code shows how to use each variant of type t in a destructuring pattern match:

let a () = A.(let   (x, y) =  make 2 3               in x + y)
let b () = B.(let   (x, y) = (make 2 3 :> int * int) in x + y)
let x () = X.(let P (x, y) =  make 2 3               in x + y)
let y () = Y.(let P (x, y) =  make 2 3               in x + y)

As explained in the manual, a private type abbreviation requires an explicit coercion for the types to be compatible. The use of a single-variant type not only makes for safer coding (the intent behind Haskell's newtype), it also allows for lighter syntax. I was surprised by the assembly code generated by ocamlopt (somewhat simplified):

a:
_camlAbbrev__a_1060:
        subq    $8, %rsp
.L109:  subq    $24, %r15
        movq    _caml_young_limit@GOTPCREL(%rip), %rax
        cmpq    (%rax), %r15
        jb      .L110
        leaq    8(%r15), %rax
        movq    $2048, -8(%rax)
        movq    $5, (%rax)
        movq    $7, 8(%rax)
        movq    8(%rax), %rbx
        movq    (%rax), %rax
        leaq    -1(%rax, %rbx), %rax
        addq    $8, %rsp
        ret
.L110:  call    _caml_call_gc
        jmp     .L109

b:
_camlAbbrev__b_1063:
        subq    $8, %rsp
.L113:  subq    $24, %r15
        movq    _caml_young_limit@GOTPCREL(%rip), %rax
        cmpq    (%rax), %r15
        jb      .L114
        leaq    8(%r15), %rax
        movq    $2048, -8(%rax)
        movq    $5, (%rax)
        movq    $7, 8(%rax)
        movq    8(%rax), %rbx
        movq    (%rax), %rax
        leaq    -1(%rax, %rbx), %rax
        addq    $8, %rsp
        ret
.L114:  call    _caml_call_gc
        jmp     .L113

x:
_camlAbbrev__x_1066:
        subq    $8, %rsp
.L117:  subq    $24, %r15
        movq    _caml_young_limit@GOTPCREL(%rip), %rax
        cmpq    (%rax), %r15
        jb      .L118
        leaq    8(%r15), %rax
        movq    $2048, -8(%rax)
        movq    $5, (%rax)
        movq    $7, 8(%rax)
        movq    8(%rax), %rbx
        movq    (%rax), %rax
        leaq    -1(%rax, %rbx), %rax
        addq    $8, %rsp
        ret
.L118:  call    _caml_call_gc
        jmp     .L117

y:
_camlAbbrev__y_1070:
        subq    $8, %rsp
.L121:  subq    $24, %r15
        movq    _caml_young_limit@GOTPCREL(%rip), %rax
        cmpq    (%rax), %r15
        jb      .L122
        leaq    8(%r15), %rax
        movq    $2048, -8(%rax)
        movq    $5, (%rax)
        movq    $7, 8(%rax)
        movq    8(%rax), %rbx
        movq    (%rax), %rax
        leaq    -1(%rax, %rbx), %rax
        addq    $8, %rsp
        ret
.L122:  call    _caml_call_gc
        jmp     .L121

Identical machine code is generated in all four cases, which I find very encouraging. Here the compiler inlines the tupling constructor make (the movq stores of $2048, $5 and $7; note that an integer n is tagged as 2·n + 1, so that 2 is represented by 5 and 3 by 7). What I find a bummer is that the tuple is immediately destructured (the movq loads into %rbx and %rax) and operated on by components (the final leaq; note in passing how addition of tagged integers is done with leaq, since 2·(m + n) + 1 = (2·m + 1) + (2·n + 1) − 1), without any kind of CSE performed. This is perfectly summarized by Pascal Cuoq in a comment to another StackOverflow question:

[…] Last time I checked it didn't even optimize the allocation of a, b in match a, b with x, y -> .... If you wish to check by yourself, I found that reading the x86 assembly generated by ocamlopt -S was convenient for me because I didn't have to learn a new representation.

This, I suspect, is an artifact of inlining in the presence of code emission that must be correct in the face of separate compilation. Disturbingly enough, ocamlopt seems to inline even across module boundaries. Separating the test into an abbrev.ml implementation:

module S = struct
  type t = P of int * int
  let make x y = P (x, y)
end

module T = struct
  type t = int * int
  let make x y = x, y
end

module A = T
module B = T
module X = S
module Y = S

with interface abbrev.mli:

module A : sig type t =              int * int val make : int -> int -> t end
module B : sig type t = private      int * int val make : int -> int -> t end
module X : sig type t =         P of int * int val make : int -> int -> t end
module Y : sig type t = private P of int * int val make : int -> int -> t end

the following code in test.ml:

open Abbrev

let a () = A.(let   (x, y) =  make 2 3               in x + y)
let b () = B.(let   (x, y) = (make 2 3 :> int * int) in x + y)
let x () = X.(let P (x, y) =  make 2 3               in x + y)
let y () = Y.(let P (x, y) =  make 2 3               in x + y)

compiles to the same assembly as above! Xavier Leroy spells out the conditions for ocamlopt to do cross-module inlining; I think it obvious that these tests fulfill them. An interesting tidbit in that message is the command-line argument -dcmm, which dumps an RTL-like representation of the generated code:

(function camlTest__a_1038 (param/1056: addr)
 (let match/1057 (alloc 2048 5 7)
   (+ (+ (load match/1057) (load (+a match/1057 8))) -1)))

(function camlTest__b_1041 (param/1058: addr)
 (let match/1059 (alloc 2048 5 7)
   (+ (+ (load match/1059) (load (+a match/1059 8))) -1)))

(function camlTest__x_1044 (param/1060: addr)
 (let match/1061 (alloc 2048 5 7)
   (+ (+ (load match/1061) (load (+a match/1061 8))) -1)))

(function camlTest__y_1048 (param/1062: addr)
 (let match/1063 (alloc 2048 5 7)
   (+ (+ (load match/1063) (load (+a match/1063 8))) -1)))

I think it is not very productive to worry too much about the cost of abstractions in OCaml, as it probably is offset by the naïveté of the code generator.