Tuesday, April 25, 2006

Automatic Structures: Part 2 -- Buchi-Bruyer Theorem

We now concentrate on the proof of Buchi-Bruyer Theorem, probably the most important theorem concerning automatic structures, which says that the set of regular relations coincides with the set of relations definable in Muniv (defined below). By the way, this theorem is folklore in the sense that its proof is unpublished, but well-known to everybody in the community.

We first need some definitions. Suppose that s := (s1, ...,sn) ∈ (Σ*)n. Then, define a string [s] over the alphabet (Σ∪ {#})n whose length is max{s1,...,sn}, and whose ith symbol is (a1,...,an) where aj is the ith symbol of sj, if i ≤ |sj|, and aj is #, otherwise. One might visualize [s] as the string obtained by placing s1,...,sn in a left-aligned column and pad each string si with # so that each of the resulting rows is of equal length. After that, we consider this matrix as a string [s] whose jth position is the jth column of the string. A subset S of (Σ*)n is said to be regular if the set { [s] : s ∈ S } is regular.

Fix an alphabet of size at least two. Consider the infinite structure Muniv := (Σ*, ≤, (La)a ∈ Σ, el) where

  1. the universe is the set of all Σ-strings,
  2. x ≤ y iff x is a prefix of y,
  3. La(x) is true iff the rightmost symbol of x is a, and
  4. el(x,y) is true iff |x| = |y| (|x| denotes the length of x).

What properties are (first-order) definable in Muniv? Here are some simple ones:

  1. |x| ≤ |y| (i.e. the string x is no longer than the string y),
  2. |x| = |y| + k for some fixed constant k,
  3. im-pref(x,y) (i.e. x = y.a for some letter a ∈ Σ), and
  4. the kth symbol of x is a (where k is fixed and a ∈ Σ).

Apologize for the overloading of the symbol '≤' because of the lack of HTML symbols. For example, the first property above can be expressed as

∃ s( s ≤ y ∧ el(x,s) ).

The second property is also easily expressible, but you might need more quantifiers and make use of the relation im-pref. The third property can be expressed by saying that |x| ≤ |y| and there is no z with x < z < y (here '<' is the irreflexive version of the prefix relation ≤).

Now, a subset S of (Σ*)n is said to be definable in Muniv if there exists a first-order formula φ(x1,...,xn) in the vocabulary of Muniv such that

S = { s : Muniv |= φ(s) }.


Theorem (Buchi-Bruyer): A subset S of (Σ*)n is definable in Muniv iff S is regular.

Proof sketch:
(<=) Suppose that S is recognized by the automaton A = (Q,q0,F,δ: Q x Σ -> Q), where Q = {q0,...,ql} is a finite set of states, q0 ∈ Q is the initial state, F ⊆ Q is the set of final states, and δ the transition function. So, for all s ∈ (Σ*)n, s ∈ S iff the string s1...sk = [s] is accepted by A,i.e., there exists a run p0...pk such that p0 = q0, pk ∈ F, and δ(pi,si+1) = pi+1. The defining formula for S is

φ(x1,...,xn) = ∃v0,...,vl( ψlen ∧ ψchar ∧ ψstart ∧ ψend ∧ ψtrans )

where:

  1. ψlen asserts that |vi| = max{|xj| : 1 ≤ j ≤ n} + 1.
  2. ψchar asserts that [(v1,...,vl)] = w0...wk is a characteristic sequence, i.e., each vi is a string of 0s and 1s, and that, for each position h, exactly one of the strings vis have value 1. [Intuitively, we want each wj to represent the state pj in our accepting run.]
  3. ψstart asserts that the first position of v0 has value 1.
  4. ψend asserts that the last position of some vj, where qj ∈ F, has value 1.
  5. ψtrans asserts that the transition from wj to wj+1 respects δ. [This will be a huge table, quite tedious to write down.]

The reader should convince herself that all of the above sentences are definable in Muniv.

(=>) The proof is by induction on the formula φ(x1,...,xn) defining S. The base case (i.e. atomic formulas) is very easy (left to the reader). Further, the case where φ is of the form ψ1 ∨ ψ2 or ¬ ψ follows immediately from the properties that regular languages are closed under union and complementation. What remains is to prove this for the case where φ is of the form ∃xn+1 ψ(x1,...,xn+1). Suppose that the n+1-ary regular relation R defined by ψ is recognized by the automaton A = (Q,q0,F,δ). To construct an automaton A' for S, one first applies the pumping lemma to show that: there exists a number K such that if (s1,...,sn+1) ∈ R, where |sn+1| > |sj| for 1 ≤ j ≤ n, then there exists another string s'n+1 such that (s1,...,s'n+1) ∈ R and |s'n+1| ≤ max{|sj| : 1 ≤ j ≤ n} + K. We construct A' as follows. First, make K+1 isomorphic copies of A (with different labels), where the ith copy is denoted by Ai = (Qi = {q0i,...,qli},q0i,Fii). The states of A' are the union of the Qis. The start states consist of all the q0is. The final states are the union of the Fis. The transition function δ' works as follows: whenever δi(qji,(c1,...,cn+1)) = qhi, where cm ∈ (Σ &cup {#}), we put δ'(qji,(c1,...,cn) = qhi. Note that A' is non-deterministic. It is easy to check that A' recognizes S. (QED)

Thursday, April 20, 2006

Automatic Structures: Part 1

Several weeks ago, my supervisor Leonid Libkin gave a presentation about automatic structures in our reading group meeting. I would like to talk about these nice animals in some details. This is the first part in the series of posts giving a flavor of automatic structures and sketching some important proofs. If there are interests, I will also talk about how one might apply automatic structures to program verification (this line of research is still under intense development).

Finite model theory primarily concerns finite structures, and has been successfully applied to database theory, and logical approaches to verification (i.e. model checking). On the other hand, finite structures are often too restrictive. For example, when modeling C programs, the use of infinite structures is often inevitable. For this reason, a lot of effort has been put into extending the framework of finite model theory to infinite structures. Several such approaches include metafinite model theory, embedded finite model theory, and automatic structures. In the sequel, we are primarily interested in automatic structures.

Roughly speaking, automatic structures are structures whose universe, and relations can each be represented by a finite automaton. A simple example of automatic structures is Presburger arithmetic (N,+), where N is the set of natural numbers. Here is how one might represent (N,+) using finite automata. Represent the universe N in binary in reverse order (e.g. 4 = 001, 2 = 01); call such a representation bin(N). So, bin(N) is a language over Σ = {0,1}. It is simple to devise a finite automaton that recognizes precisely bin(N). Now, how do we represent the 3-ary relation

+ = { (bin(i),bin(j),bin(k)) : i + j = k, i,j,k ∈ N }

with an automaton? First, it is possible to represent the relation + as a language L over the alphabet {0,1,#}3 defined in the following way. For i,j,k ∈ bin(N), concatenate i,j,k with the padding symbol # so that the resulting strings i',j',k' are of the same length. For example, if i = 001, j = 01, and k = 00001, then i' = 001##, j' = 01###, and k' = 00001. Now, we may treat the tuple (i',j',k') as a string over the alphabet {0,1,#}3, e.g., for i',j',k' in the above example, the resulting string is (0,0,0)(0,1,0)(1,#,0)(#,#,0)(#,#,1). Then, if i + j = k, put the string (i',j',k') in L. Having defined L, it is not hard to exhibit an automaton that recognizes precisely L. [This automaton resembles the commonplace algorithm for addition.]

What is so cool about automatic structures? First, it is somewhat immediate that automatic structures have decidable FO theories. Second, there exists a universal automatic structure Muniv, which is a structure in which all other automatic structures can be interpreted with FO translation (or reduction). Furthermore, the set of all relations definable in Muniv captures precisely all regular relations, which gives an easy way to prove properties about regular languages. By the way, automatic structures also give a nice way of proving that model checking some types of infinite transition systems be decidable.

Anyway, this post was merely intended to whet the reader's appetite. I hope this was enticing enough. In the next post, I will give a precise definition of automatic structures and prove the Buchi-Bruyer theorem that the set of relations definable in Muniv coincides with all regular relations.