# Finite State Transducers

A DFA, on input a string, produces a single bit answer: accept or reject. Here we define a more general kind of finite automata (Finite State Transducers or FST), often useful in applications, that can produce arbitrarily long strings as output. Moreover, the output is produced in a streaming fashion, reading the input in a single pass, and producing the output string on the fly.

There are several possible ways to formalize FSTs as a mathematical definition or a computer program. Here we choose to define FSTs by modifying our definition of DFA as follows:

• Beside the usual input alphabet Σ, we also specify an output alphabet Γ, so that FSTs can be used to convert strings between different character sets.
• Instead of a set of final states F ⊆ Q, we specify an output function γ : Q → Γ*, which at every state q produces γ(q) as part of the output string.

In summary, an FST is defined by a tuple T = (Q, Σ, Γ, δ, s, γ) where Q, Σ, δ, s are just as in the definition of DFAs, Γ is a finite alphabet, and γ : Q → Γ* is a function. As for DFAs, we make the definition of FST parametric with respect to the type of state, and define a function `evalFST` that maps each FST T to the corresponsing function fT : Σ* → Γ*, the function computed by T.

``````type FST st = ([st],[Char],[Char],st->Char->st,st,st->String)

evalFST :: FST st -> String -> String
evalFST (qs,sigma,gamma,delta,s,out) [] = out s
evalFST (qs,sigma,gamma,delta,s,out) (x:xs) =
let s1 = delta s x
fst1 = (qs,sigma,gamma,delta,delta s x,out)
in (out s) ++ (evalFST fst1 xs)``````

Let’s define two example FSTs, and try out our evaluation function. For convenience, we define `bits` as an abbreviation for the binary alphabet. The first FST flips all the bits. The second FST removes repeted occurrences of 0.

``````bits = ['0','1']

flipFST = ([0,1,2],bits,bits,delta,2,out) where
delta q '0' = 1
delta q '1' = 0
out 0 = "0"
out 1 = "1"
out 2 = ""

data ABCD = A | B | C | D

nodup0FST = ([A,B,C,D],bits,bits,delta,A,out) where
delta A '0' = C
delta B '0' = C
delta _ '0' = D
delta _ '1' = B
out A = ""
out B = "1"
out C = "0"
out D = ""``````
``````ghci> evalFST flipFST "000111001101"
"111000110010"
ghci> evalFST nodup0FST "000111001101"
"011101101"``````

Now consider running an FST T on an input string w, and passing the result to a DFA M. Can these two computatios be combined into a single DFA? This is not obvious, as a DFA does not have enough memory to store the intermediate result produced by the FST. In order to combine the two computations, we need to run T and M in parallel. This can be done using a variant of the cartesian product construction we used to combine DFAs. Here is a function that combines a DFA `(qs,alpha,delta,s,fs)` with an FST `(qsT,alphaIn,alphaOut,deltaT,sT,out)`. into a single DFA `(qs',alphaIn,delta',s',fs')`. Of course, in order to compose the two automata we need the output alphabet of the FST to be the same as the input alphabet of the DFA.

``````composeDFAFST :: DFA st -> FST stT -> DFA (st,stT)
composeDFAFST
(qs,alpha,delta,s,fs)                 -- input DFA
(qsT,alphaIn,alphaOut,deltaT,sT,out)  -- input FST
| (alpha == alphaOut) =
(qs',alphaIn,delta',s',fs')       -- output DFA
where qs' = [ (q,qT) | q <- qs, qT <- qsT ]
deltaStar = foldl delta
s' = (deltaStar s (out sT),sT)
fs' = [ (q,qT) | q <- fs, qT <- qsT ]
delta' (q,qT) a =
let qT' = deltaT qT a
w = out qT'
in (deltaStar q w,qT')``````

Each new state keeps track of a DFA state `q` and an FST state `qT`. Initially, `qT=sT` is the start state of the FST, and `q` is the state reached by the DFA upon reading `out sT`. When reading an input symbol `a`, the states are updated by letting the FST read the `a` to move to a new state `qT' = deltaT qT a`, and letting the DFA read the string `w` output by the FST when entering this state.

We can also combine two FSTs together, to obtain a new FST that computes the composition of the two functions. Here the construction is slightly more complex, and requires the introduction of a new start state. A simple way to add a new state to `st` is to use the type `Maybe st`. We have already seen the `Maybe` type constructor, which is typically used to represent an optional value of some type. Here we use `Maybe` just as a trick to introduce a new state `Nothing`, in addition to all the old states `Just (q0,q1)`.

``````composeFST :: FST st1 -> FST st0 -> FST (Maybe (st0,st1))
composeFST
(qs1,sigma1,gamma1,delta1,s1,out1)
(qs0,sigma0,gamma0,delta0,s0,out0)
| (gamma1 == sigma0) =
(qs,sigma0,gamma1,delta,Nothing,out)
where qs = [Nothing] ++ [Just (q0,q1) | q0 <- qs0, q1 <- qs1]
delta1Star = foldl delta1
delta Nothing x = delta (Just (s0,s1)) x
delta (Just (q0,q1)) x =
Just (delta0 q0 x, delta1Star q1 (out0 q0))
out Nothing =
let fst1 = (qs1,sigma1,gamma1,delta1,s1,out1)
in evalFST fst1 (out0 s0)
out (Just (q0,q1)) =
let fst1 = (qs1,sigma1,gamma1,delta1,q1,out1)
n = length (out1 q1)
in drop n (evalFST fst1 (out0 q0))``````

Time to try out our composition function. Let us start with something simple: flipping a string twice:

``````ghci> evalFST (composeFST flipFST flipFST) "01010011"
"01010011"``````

Now, say we want to define a transducer `nodup1` that removes all the repeated occurrences of the bit 1. We could define it from scratch, similarly to the definition of `nodup0`. But an even simpler solution can be obtained by reusing `nodup0` in combination with the `flip` transducer: we first flip all the bits, remove repeated 0s, and then flip the bits again.

``````ghci> let nodup1FST = flipFST `composeFST` nodup0FST `composeFST` flipFST
ghci> evalFST nodup1FST "000111001101"
"000100101"``````

For future reference, you can find the code with the definition of FST, evaluation and composition functions at FST.hs.