_{univ}(defined below). By the way, this theorem is folklore in the sense that its proof is unpublished, but well-known to everybody in the community.

We first need some definitions. Suppose that

**s**:= (s

_{1}, ...,s

_{n}) ∈ (Σ*)

^{n}. Then, define a string [

**s**] over the alphabet (Σ∪ {#})

^{n}whose length is max{s

_{1},...,s

_{n}}, and whose ith symbol is (a

_{1},...,a

_{n}) where a

_{j}is the ith symbol of s

_{j}, if i ≤ |s

_{j}|, and a

_{j}is #, otherwise. One might visualize [

**s**] as the string obtained by placing s

_{1},...,s

_{n}in a left-aligned column and pad each string s

_{i}with # so that each of the resulting rows is of equal length. After that, we consider this matrix as a string [

**s**] whose jth position is the jth column of the string. A subset S of (Σ*)

^{n}is said to be

*regular*if the set { [

**s**] :

**s**∈ S } is regular.

Fix an alphabet of size at least two. Consider the infinite structure M

_{univ}:= (Σ*, ≤, (L

_{a})

_{a ∈ Σ}, el) where

- the universe is the set of all Σ-strings,

- x ≤ y iff x is a prefix of y,

- L
_{a}(x) is true iff the rightmost symbol of x is a, and

- el(x,y) is true iff |x| = |y| (|x| denotes the length of x).

What properties are (first-order) definable in M

_{univ}? Here are some simple ones:

- |x| ≤ |y| (i.e. the string x is no longer than the string y),

- |x| = |y| + k for some fixed constant k,

- im-pref(x,y) (i.e. x = y.a for some letter a ∈ Σ), and

- the kth symbol of x is a (where k is fixed and a ∈ Σ).

Apologize for the overloading of the symbol '≤' because of the lack of HTML symbols. For example, the first property above can be expressed as

∃ s( s ≤ y ∧ el(x,s) ).

The second property is also easily expressible, but you might need more quantifiers and make use of the relation im-pref. The third property can be expressed by saying that |x| ≤ |y| and there is no z with x < z < y (here '<' is the irreflexive version of the prefix relation ≤).

Now, a subset S of (Σ*)

^{n}is said to be definable in M

_{univ}if there exists a first-order formula φ(x

_{1},...,x

_{n}) in the vocabulary of M

_{univ}such that

S = {

**s**: M

_{univ}|= φ(

**s**) }.

**Theorem**(Buchi-Bruyer): A subset S of (Σ*)

^{n}is definable in M

_{univ}iff S is regular.

*Proof sketch*:

(<=) Suppose that S is recognized by the automaton A = (Q,q

_{0},F,δ: Q x Σ -> Q), where Q = {q

_{0},...,q

_{l}} is a finite set of states, q

_{0}∈ Q is the initial state, F ⊆ Q is the set of final states, and δ the transition function. So, for all

**s**∈ (Σ*)

^{n},

**s**∈ S iff the string s

_{1}...s

_{k}= [

**s**] is accepted by A,i.e., there exists a run p

_{0}...p

_{k}such that p

_{0}= q

_{0}, p

_{k}∈ F, and δ(p

_{i},s

_{i+1}) = p

_{i+1}. The defining formula for S is

φ(x

_{1},...,x

_{n}) = ∃v

_{0},...,v

_{l}( ψ

_{len}∧ ψ

_{char}∧ ψ

_{start}∧ ψ

_{end}∧ ψ

_{trans})

where:

- ψ
_{len}asserts that |v_{i}| = max{|x_{j}| : 1 ≤ j ≤ n} + 1.

- ψ
_{char}asserts that [(v_{1},...,v_{l})] = w_{0}...w_{k}is a*characteristic sequence*, i.e., each v_{i}is a string of 0s and 1s, and that, for each position h, exactly one of the strings v_{i}s have value 1. [Intuitively, we want each w_{j}to represent the state p_{j}in our accepting run.]

- ψ
_{start}asserts that the first position of v_{0}has value 1.

- ψ
_{end}asserts that the last position of some v_{j}, where q_{j}∈ F, has value 1.

- ψ
_{trans}asserts that the transition from w_{j}to w_{j+1}respects δ. [This will be a huge table, quite tedious to write down.]

The reader should convince herself that all of the above sentences are definable in M

_{univ}.

(=>) The proof is by induction on the formula φ(x

_{1},...,x

_{n}) defining S. The base case (i.e. atomic formulas) is very easy (left to the reader). Further, the case where φ is of the form ψ

_{1}∨ ψ

_{2}or ¬ ψ follows immediately from the properties that regular languages are closed under union and complementation. What remains is to prove this for the case where φ is of the form ∃x

_{n+1}ψ(x

_{1},...,x

_{n+1}). Suppose that the n+1-ary regular relation R defined by ψ is recognized by the automaton A = (Q,q

_{0},F,δ). To construct an automaton A' for S, one first applies the pumping lemma to show that: there exists a number K such that if (s

_{1},...,s

_{n+1}) ∈ R, where |s

_{n+1}| > |s

_{j}| for 1 ≤ j ≤ n, then there exists another string s'

_{n+1}such that (s

_{1},...,s'

_{n+1}) ∈ R and |s'

_{n+1}| ≤ max{|s

_{j}| : 1 ≤ j ≤ n} + K. We construct A' as follows. First, make K+1 isomorphic copies of A (with different labels), where the ith copy is denoted by A

^{i}= (Q

^{i}= {q

_{0}

^{i},...,q

_{l}

^{i}},q

_{0}

^{i},F

^{i},δ

^{i}). The states of A' are the union of the Q

^{i}s. The start states consist of all the q

_{0}

^{i}s. The final states are the union of the F

^{i}s. The transition function δ' works as follows: whenever δ

^{i}(q

_{j}

^{i},(c

_{1},...,c

_{n+1})) = q

_{h}

^{i}, where c

_{m}∈ (Σ &cup {#}), we put δ'(q

_{j}

^{i},(c

_{1},...,c

_{n}) = q

_{h}

^{i}. Note that A' is non-deterministic. It is easy to check that A' recognizes S. (QED)