2007-11-12

Tallying Grass

Or rather, how to become a vegetarian by discounting beings with a stomach.

In the last post, I showed how to count a kind of flat critters living on the integer-tiled plane (the two-dimensional integer lattice) called ominos. The ominos are sets of cells with integer coordinates that are connected horizontally or vertically; in other words, they are formed by square tiles that share an edge. This is the so-called F pentomino:

For every integer n ≥ 1, I showed how to generate (and count) all the n-celled ominos by extending the (n-1)-ominos, one cell at a time. Some of these ominos have holes, or stomachs in them; that is, they completely surround a region of the plane such that it remains disconnected from the outside region, or exterior. For instance, this 9-omino (or enneomino) has a stomach:

This isn't the simplest animal with a cavity. The obvious mutilation that results in a square ring isn't either, surprisingly. The simplest non-simply-connected omino (that's the medical, I mean technical term) is the following heptomino:

I want to wean my program off meat; that is, I want it to count only simple ominos. These I could call trees, but trees are a special kind of branching ominos that lack "blocks"; I can then call them vegetables, or veggies for short, as trees are a special case of vegetables. The problem is how to determine the realm an omino belongs to; in other words, I need to come up with a predicate that distinguishes the ominos that are simply-connected from those that aren't.

Inspecting cells one by one might seem like a plausible way to do that; however, simplicity is a global property of the omino, and no matter how clever the cell inspection is, it can be fooled by some configuration. The simplicity condition is geometrical in nature, and the test must somehow take this into account. Enter the Jordan Curve Theorem: a simply-connected region has a boundary that neatly divides the plane in just two regions, namely the interior and the exterior. By taking into account the simple geometry of the integer lattice and the connectivity properties of an omino's cells, a consequence of the theorem is that an omino is simple if and only if the exterior is comprised of one connected region.

I don't need to take into account the entire integer lattice; it is enough to consider a representative sample of the exterior cells. These are the cells that completely surround the omino; since our representation includes the omino's extents, including a further border of one cell is sufficient.

This rectangular region consists of all cells whose coordinates are, each independently, in range, so I first need to generate the list of all integers comprised by said range:

let rec range i j =
  if i > j then [] else
  i :: range (succ i) j

Then, I need to combine two ranges, each for the independent x- and y-coordinates into pairs, giving the list of cells. This is the familiar Cartesian product:

let cart l m =
  List.fold_right (fun i ->
    List.fold_right (fun j r -> (i, j) :: r) m) l []

Aside: This is perhaps more recognizable when expressed as a list comprehension:

cart l m = r
 where r = [(i, j) | i <- l, j <- m]

With that, a region is simply the set of comprised cells between the given coordinates:

let region (im, jm) (ix, jx) : C.t =
  C.of_list (cart (range im ix) (range jm jx))

Given an omino, its exterior is the set of cells in the including region not belonging to it:

let exterior o : C.t =
  C.diff (region (-1, -1) (o.wid, o.hgt)) o.cel

(Note how the exterior completely surrounds the omino by extending its bounds.) Now the hard part: determining if the resulting set contains one connected component, or more than one. If the former, the exterior is connected; if the latter, some cells are separated from the exterior by the omino's cells and it is non-simple. A graph traversal that "marks" or remembers visited cells is what I need:

let find_component s =
  let neighbors = C.inter s % flower in
  let rec visit visited pending =
    if C.is_empty pending then visited else
    let c = C.choose pending in
    let pending = C.remove c pending in
    let unvisited = C.diff (neighbors c) visited in
    visit (C.add c visited) (C.union unvisited pending)
  in
  visit (C.empty) (C.singleton (C.choose s))

The algorithm keeps a set of visited cells in parallell with a set of cells pending of visit. If there are no cells of the latter kind, we are done; if not, we choose arbitrarily one, discount it, find its yet-unvisited neighbors in the region and recur, accounting for the cell as visited and remembering its neighbors for further consideration. We jumpstart the process by beginning with an arbitrary cell in the set, and without cells yet visited. Since it starts with an arbitrary cell, the returned connected component is arbitrary. This is the standard algorithm, except that the choice of cells, instead of following a first-in, first-out discipline (breadth-first), or a last-in, first-out discipline (depth-first) is completely non-deterministic.

Now, our predicate is, quite simply:

let is_connected o =
  let t = exterior o in
  C.equal t (find_component t)

That is, if any of the connected components of the exterior comprise the entire exterior, there are no holes and the omino is simple. I'll make a modification to the extensions predicate to filter candidates by an arbitrary predicate:

let extensions p o =
  let candidates c : C.t = C.diff (flower c) o.cel
  and extend c : omino = canon (normalize (C.add c o.cel)) in
  let extend_at c : O.t -> O.t = C.fold (fun c s ->
    let o = extend c in
    if p o then O.add o s else s) (candidates c) in
  C.fold extend_at o.cel O.empty

(For efficiency, the predicate is fused with the fold.) Now I must modify extend_set accordingly:

let extend_set p s = O.fold (O.union % extensions p) s O.empty

To test everything so far, I'll check that, given the true predicate (that is, that which is true everywhere) I get the same results as the last time. For that, I will use the constant function which returns the same value for all its arguments:

let const k _ = k

Then, exactly as before:

# List.map O.cardinal (nest 10 (extend_set (const true)) o1) ;;
- : int list = [1; 1; 2; 5; 12; 35; 108; 369; 1285; 4655]

That is, A000105. But now, by using is_connected:

# List.map O.cardinal (nest 10 (extend_set is_connected) o1) ;;
- : int list = [1; 1; 2; 5; 12; 35; 107; 363; 1248; 4460]

Which agrees with A000104, the number of n-ominos without holes. This takes a while; an obvious optimization would be to use sets with O(1) membership test; for instance, hash sets.

Thanks to Alejandro and Boris for the insightful discussion that led me to this solution.

No comments: