Tuesday, April 25, 2006

Automatic Structures: Part 2 -- Buchi-Bruyer Theorem

We now concentrate on the proof of Buchi-Bruyer Theorem, probably the most important theorem concerning automatic structures, which says that the set of regular relations coincides with the set of relations definable in Muniv (defined below). By the way, this theorem is folklore in the sense that its proof is unpublished, but well-known to everybody in the community.

We first need some definitions. Suppose that s := (s1, ...,sn) ∈ (Σ*)n. Then, define a string [s] over the alphabet (Σ∪ {#})n whose length is max{s1,...,sn}, and whose ith symbol is (a1,...,an) where aj is the ith symbol of sj, if i ≤ |sj|, and aj is #, otherwise. One might visualize [s] as the string obtained by placing s1,...,sn in a left-aligned column and pad each string si with # so that each of the resulting rows is of equal length. After that, we consider this matrix as a string [s] whose jth position is the jth column of the string. A subset S of (Σ*)n is said to be regular if the set { [s] : s ∈ S } is regular.

Fix an alphabet of size at least two. Consider the infinite structure Muniv := (Σ*, ≤, (La)a ∈ Σ, el) where

1. the universe is the set of all Σ-strings,
2. x ≤ y iff x is a prefix of y,
3. La(x) is true iff the rightmost symbol of x is a, and
4. el(x,y) is true iff |x| = |y| (|x| denotes the length of x).

What properties are (first-order) definable in Muniv? Here are some simple ones:

1. |x| ≤ |y| (i.e. the string x is no longer than the string y),
2. |x| = |y| + k for some fixed constant k,
3. im-pref(x,y) (i.e. x = y.a for some letter a ∈ Σ), and
4. the kth symbol of x is a (where k is fixed and a ∈ Σ).

Apologize for the overloading of the symbol '≤' because of the lack of HTML symbols. For example, the first property above can be expressed as

∃ s( s ≤ y ∧ el(x,s) ).

The second property is also easily expressible, but you might need more quantifiers and make use of the relation im-pref. The third property can be expressed by saying that |x| ≤ |y| and there is no z with x < z < y (here '<' is the irreflexive version of the prefix relation ≤).

Now, a subset S of (Σ*)n is said to be definable in Muniv if there exists a first-order formula φ(x1,...,xn) in the vocabulary of Muniv such that

S = { s : Muniv |= φ(s) }.

Theorem (Buchi-Bruyer): A subset S of (Σ*)n is definable in Muniv iff S is regular.

Proof sketch:
(<=) Suppose that S is recognized by the automaton A = (Q,q0,F,δ: Q x Σ -> Q), where Q = {q0,...,ql} is a finite set of states, q0 ∈ Q is the initial state, F ⊆ Q is the set of final states, and δ the transition function. So, for all s ∈ (Σ*)n, s ∈ S iff the string s1...sk = [s] is accepted by A,i.e., there exists a run p0...pk such that p0 = q0, pk ∈ F, and δ(pi,si+1) = pi+1. The defining formula for S is

φ(x1,...,xn) = ∃v0,...,vl( ψlen ∧ ψchar ∧ ψstart ∧ ψend ∧ ψtrans )

where:

1. ψlen asserts that |vi| = max{|xj| : 1 ≤ j ≤ n} + 1.
2. ψchar asserts that [(v1,...,vl)] = w0...wk is a characteristic sequence, i.e., each vi is a string of 0s and 1s, and that, for each position h, exactly one of the strings vis have value 1. [Intuitively, we want each wj to represent the state pj in our accepting run.]
3. ψstart asserts that the first position of v0 has value 1.
4. ψend asserts that the last position of some vj, where qj ∈ F, has value 1.
5. ψtrans asserts that the transition from wj to wj+1 respects δ. [This will be a huge table, quite tedious to write down.]

The reader should convince herself that all of the above sentences are definable in Muniv.

(=>) The proof is by induction on the formula φ(x1,...,xn) defining S. The base case (i.e. atomic formulas) is very easy (left to the reader). Further, the case where φ is of the form ψ1 ∨ ψ2 or ¬ ψ follows immediately from the properties that regular languages are closed under union and complementation. What remains is to prove this for the case where φ is of the form ∃xn+1 ψ(x1,...,xn+1). Suppose that the n+1-ary regular relation R defined by ψ is recognized by the automaton A = (Q,q0,F,δ). To construct an automaton A' for S, one first applies the pumping lemma to show that: there exists a number K such that if (s1,...,sn+1) ∈ R, where |sn+1| > |sj| for 1 ≤ j ≤ n, then there exists another string s'n+1 such that (s1,...,s'n+1) ∈ R and |s'n+1| ≤ max{|sj| : 1 ≤ j ≤ n} + K. We construct A' as follows. First, make K+1 isomorphic copies of A (with different labels), where the ith copy is denoted by Ai = (Qi = {q0i,...,qli},q0i,Fii). The states of A' are the union of the Qis. The start states consist of all the q0is. The final states are the union of the Fis. The transition function δ' works as follows: whenever δi(qji,(c1,...,cn+1)) = qhi, where cm ∈ (Σ &cup {#}), we put δ'(qji,(c1,...,cn) = qhi. Note that A' is non-deterministic. It is easy to check that A' recognizes S. (QED)