# Finite State Transducers

A DFA, on input a string, produces a single bit answer: accept or reject. Here we define a more general kind of finite automata (Finite State Transducers or FST), often useful in applications, that can produce arbitrarily long strings as output. Moreover, the output is produced in a streaming fashion, reading the input in a single pass, and producing the output string on the fly.

There are several possible ways to formalize FSTs as a mathematical definition or a computer program. Here we choose to define FSTs by modifying our definition of DFA as follows:

• Beside the usual input alphabet Σ, we also specify an output alphabet Γ, so that FSTs can be used to convert strings between different character sets.
• Instead of a set of final states F ⊆ Q, we specify an output function γ : Q → Γ * , which at every state q produces γ(q) as part of the output string.

In summary, an FST is a define by a tuple T = (Q, Σ, Γ, δ, s, γ) where Q, Σ, δ, s are just as in the definition of DFAs, Γ is a finite alphabet, and γ : Q → Γ *  is a function. As for DFAs, we make the definition of FST parametric with respect to the type of state, and define a function `evalFST` that maps each FST T to the corresponsing function fT : Σ *  → Γ * , the function computed by T.

``````import DFA (extendDelta)

type FST st = ([st],[Char],[Char],st->Char->st,st,st->String)

evalFST :: FST st -> String -> String
evalFST (qs,sigma,gamma,delta,s,out) [] = out s
evalFST (qs,sigma,gamma,delta,s,out) (x:xs) =
let s1 = delta s x
fst1 = (qs,sigma,gamma,delta,delta s x,out)
in (out s) ++ (evalFST fst1 xs)``````

The first line imports the `extendDelta` function from the `DFA` module, as we will need it later on. Since the transition function δ of an FST has the same type as the transition funciton of a DFA, we can apply `extendDelta` to it.

Let's define two example FSTs, and try out our evaluation function. For convenience, we define `bits` as an abbreviation for the binary alphabet. The first FST flips all the bits. The second FST removes repeted occurrences of 0.

``````bits = ['0','1']

flipFST = ([0,1,2],bits,bits,delta,2,out) where
delta q '0' = 1
delta q '1' = 0
out 0 = "0"
out 1 = "1"
out 2 = ""

data ABCD = A | B | C | D

nodup0FST = ([A,B,C,D],bits,bits,delta,A,out) where
delta A '0' = C
delta B '0' = C
delta _ '0' = D
delta _ '1' = B
out A = ""
out B = "1"
out C = "0"
out D = ""``````
``````ghci> evalFST flipFST "000111001101"
"111000110010"
ghci> evalFST nodup0FST "000111001101"
"011101101"``````

As you may expect, FSTs can be composed together: given two FSTs T0 and T1, we can combine them into a single FST that computes the function composition (T1 ∘ T0)(x) = T1(T0(x)).

``````composeFST :: FST st1 -> FST st0 -> FST (Maybe (st0,st1))
composeFST (qs1,sigma1,gamma1,delta1,s1,out1)
(qs0,sigma0,gamma0,delta0,s0,out0) | (gamma1 == sigma0)
= (qs,sigma0,gamma1,delta,Nothing,out)
where qs = [Nothing] ++ [Just (q0,q1) | q0 <- qs0, q1 <- qs1]
delta Nothing x = delta (Just (s0,s1)) x
delta (Just (q0,q1)) x =
Just (delta0 q0 x, (extendDelta delta1) q1 (out0 q0))
out Nothing =
let fst1 = (qs1,sigma1,gamma1,delta1,s1,out1)
in evalFST fst1 (out0 s0)
out (Just (q0,q1)) =
let fst1 = (qs1,sigma1,gamma1,delta1,q1,out1)
n = length (out1 q1)
in drop n (evalFST fst1 (out0 q0))``````

Let us go over the definition one piece at a time. We define the set of states of the composed transducer as the cartesian product of the sets of states of T0 and T1, plus a new (start) state. A simple way to add a new state to `st` is to use the type `Maybe st`. We have already seen the `Maybe` type constructor, which is typically used to represent an optional value of some type. Here we use `Maybe` just as a trick to introduce a new state `Nothing`, in addition to all the old states `Just (q0,q1)`.

The transition function `delta` and output function `out` are defined for this new set of states. The transition function δ treats the new start state `Nothing` identically to the pair of start states `Just (s0,s1)`. In general, on input a symbol `x` the current state of the transducer `Just (q_0,q_1)` is updated by letting the first transducer read `x`, and the second transducer read the string `out0 q0` produced by the first transducer before reading the `x`.

The output function is defined using `evalFST`. Let us first look at the output associated to the start state `Nothing`. On input the empty string ϵ, the first transducer produces a string `out0 s0`, which is passed to the second transducer. So, the final output can be computed as the output of the second transducer `fst1` on input `out0 s0`. The output function on other states is defined similarly, except that we use a transducer `fst1` with propertly modified start state `q1`, and we remove the first n characters of the output to avoid duplications.

Time to try out our composition function. Let us start with something simple: flipping a string twice:

``````ghci> evalFST (composeFST flipFST flipFST) "01010011"
"01010011"``````

Now, say we want to define a transducer `nodup1` that removes all the repeated occurrences of the bit 1. We could define it from scratch, similarly to the definition of `nodup0`. But an even simpler solution can be obtained by reusing `nodup0` in combination with the `flip` transducer: we first flip all the bits, remove repeated 0s, and then flip the bits again.

``````ghci> let nodup1FST = flipFST `composeFST` nodup0FST `composeFST` flipFST
ghci> evalFST nodup1FST "000111001101"
"000100101"``````

For future reference, you can find the code with the definition of FST, evaluation and composition functions at FST.hs. The example FSTs are in lec3.hs.