Equivalence of Regular Expressions

# Equivalence of Regular Expressions

We give a behavioral specification for regular expressions, followed by a number of computations that check the equivalence of such expressions. No human intervention is involved in these proofs - BOBJ's circular coinductive rewriting algorithm (CCCRW) does all the work. We believe that CCCRW can always decide the equivalence of regular expressions, and we think we have a proof, but it is not yet written down. It is interesting to notice that as many as three circularities are generated for some of the examples below; these in effect are lemmas that are automatically proved by the algorithm as part of the main proof. For example, three of these are (a*b*)* = b*(a*b*)* (a*b*)* = a*b*(a*b*)* (a*b(a*b)*a*) + a* = a*(ba*)* The original version of this example was due to Grigore Rosu, inspired by some work of Bogdan Warinschi; it has since been simplified by Joseph Goguen. Note that the last computation uses conditional circular coinductive rewriting.

### 1. BOBJ Source Code

In the specification REGEXP, the symbol nil denotes the empty string, whereas empty denotes the empty regular language. Also LETTER is the theory of an individual letter, and a language is specified by providing a copy of LETTER each letter in its alphabet; note that these will have REGEXP as a shared subtheory. *** file: /groups/tatami/bobj/examples/rexp.bob ***> equivalence of regular languages bth REGEXP is sorts RegExp Letter . subsort Letter < RegExp . ops empty nil : -> RegExp . op __ : RegExp RegExp -> RegExp [assoc id: nil prec 20] . op _+_ : RegExp RegExp -> RegExp [assoc id: empty comm idem prec 30] . op _* : RegExp -> RegExp [prec 10] . op nil-in _ : RegExp -> Bool [prec 40] . vars L L' : RegExp . eq empty L = empty . eq nil-in L L' = nil-in L and nil-in L' . eq nil-in L + L' = nil-in L or nil-in L' . eq nil-in L* = true . eq nil-in nil = true . eq nil-in empty = false . end bth LETTER is pr REGEXP . op x : -> Letter . eq nil-in x = false . op x? : RegExp -> Bool . vars L L' : RegExp . var X : Letter . eq x?(L L') = x?(L) or nil-in L and x?(L') . eq x?(L + L') = x?(L) or x?(L') . eq x?(L*) = x?(L) . eq x?(X) = x == X . eq x?(nil) = false . eq x?(empty) = false . op -x : RegExp -> RegExp [prec 1] . eq -x(L L') = -x(L) L' + if (nil-in L) then -x(L') else empty fi . eq -x(L + L') = -x(L) + -x(L') . eq -x(L*) = -x(L) L* . eq -x(x) = nil . eq -x(X) = empty if X =/= x . eq -x(nil) = empty . eq -x(empty) = empty . end bth LANGUAGE is pr LETTER * (op x to a, op x? to a?, op -x to -a) + LETTER * (op x to b, op x? to b?, op -x to -b) . end set cred trace on cred a* + nil == a* . cred aa* + nil == a* . cred a*a* == a* . cred (a+nil) a* == a* . cred (a+nil)* == a* . cred a*a == aa* . cred a(ba)* == (a b)*a . cred (a*b)*a* == a*(ba*)* . cred (a+b)* == (a*b*)* . ***> L = aL + b implies L = a*b open . op L : -> RegExp . cred L == a*b if (a?(L) == true) and (-a(L) == L) and (b?(L) == true) and (-b(L) == nil) and (nil-in L == false) . close

### 2. BOBJ Output

The output shows that some of these computations are really rather complex. awk% bobj \|||||||||||||||||/ --- Welcome to BOBJ --- /|||||||||||||||||\ BOBJ version 0.9.205 built: Thu Nov 14 23:26:56 PST 2002 University of California, San Diego Sun Nov 17 15:48:27 PST 2002 BOBJ> in rexp ========================================== ***> equivalence of regular languages ========================================== bth REGEXP ========================================== bth LETTER ========================================== bth LANGUAGE ========================================== c-reduce in LANGUAGE : (a *) + nil == a * using cobasis for LANGUAGE: op b? : RegExp -> Bool op a? : RegExp -> Bool op nil-in _ : RegExp -> Bool [prec 40] op -a : RegExp -> RegExp op -b : RegExp -> RegExp --------------------------------------- reduced to: (a *) + nil == a * ----------------------------------------- add rule (C1) : (a *) + nil = a * ----------------------------------------- target is: (a *) + nil == a * expand by: op b? : RegExp -> Bool reduced to: true nf: false ----------------------------------------- target is: (a *) + nil == a * expand by: op a? : RegExp -> Bool reduced to: true nf: true ----------------------------------------- target is: (a *) + nil == a * expand by: op nil-in _ : RegExp -> Bool [prec 40] reduced to: true nf: true ----------------------------------------- target is: (a *) + nil == a * expand by: op -a : RegExp -> RegExp reduced to: true nf: a * ----------------------------------------- target is: (a *) + nil == a * expand by: op -b : RegExp -> RegExp reduced to: true nf: empty ----------------------------------------- result: true c-rewrite time: 290ms parse time: 11ms ========================================== c-reduce in LANGUAGE : (a (a *)) + nil == a * reduced to: (a (a *)) + nil == a * ----------------------------------------- add rule (C1) : (a (a *)) + nil = a * ----------------------------------------- target is: (a (a *)) + nil == a * expand by: op b? : RegExp -> Bool reduced to: true nf: false ----------------------------------------- target is: (a (a *)) + nil == a * expand by: op a? : RegExp -> Bool reduced to: true nf: true ----------------------------------------- target is: (a (a *)) + nil == a * expand by: op nil-in _ : RegExp -> Bool [prec 40] reduced to: true nf: true ----------------------------------------- target is: (a (a *)) + nil == a * expand by: op -a : RegExp -> RegExp reduced to: true nf: a * ----------------------------------------- target is: (a (a *)) + nil == a * expand by: op -b : RegExp -> RegExp reduced to: true nf: empty ----------------------------------------- result: true c-rewrite time: 240ms parse time: 9ms ========================================== c-reduce in LANGUAGE : (a *) (a *) == a * reduced to: (a *) (a *) == a * ----------------------------------------- add rule (C1) : (a *) (a *) = a * ----------------------------------------- target is: (a *) (a *) == a * expand by: op b? : RegExp -> Bool reduced to: true nf: false ----------------------------------------- target is: (a *) (a *) == a * expand by: op a? : RegExp -> Bool reduced to: true nf: true ----------------------------------------- target is: (a *) (a *) == a * expand by: op nil-in _ : RegExp -> Bool [prec 40] reduced to: true nf: true ----------------------------------------- target is: (a *) (a *) == a * expand by: op -a : RegExp -> RegExp deduced using (C1) : true nf: a * ----------------------------------------- target is: (a *) (a *) == a * expand by: op -b : RegExp -> RegExp reduced to: true nf: empty ----------------------------------------- result: true c-rewrite time: 232ms parse time: 6ms ========================================== c-reduce in LANGUAGE : (a + nil) (a *) == a * reduced to: (a + nil) (a *) == a * ----------------------------------------- add rule (C1) : (a + nil) (a *) = a * ----------------------------------------- target is: (a + nil) (a *) == a * expand by: op b? : RegExp -> Bool reduced to: true nf: false ----------------------------------------- target is: (a + nil) (a *) == a * expand by: op a? : RegExp -> Bool reduced to: true nf: true ----------------------------------------- target is: (a + nil) (a *) == a * expand by: op nil-in _ : RegExp -> Bool [prec 40] reduced to: true nf: true ----------------------------------------- target is: (a + nil) (a *) == a * expand by: op -a : RegExp -> RegExp reduced to: true nf: a * ----------------------------------------- target is: (a + nil) (a *) == a * expand by: op -b : RegExp -> RegExp reduced to: true nf: empty ----------------------------------------- result: true c-rewrite time: 246ms parse time: 7ms ========================================== c-reduce in LANGUAGE : (a + nil) * == a * reduced to: (a + nil) * == a * ----------------------------------------- add rule (C1) : (a + nil) * = a * ----------------------------------------- target is: (a + nil) * == a * expand by: op b? : RegExp -> Bool reduced to: true nf: false ----------------------------------------- target is: (a + nil) * == a * expand by: op a? : RegExp -> Bool reduced to: true nf: true ----------------------------------------- target is: (a + nil) * == a * expand by: op nil-in _ : RegExp -> Bool [prec 40] reduced to: true nf: true ----------------------------------------- target is: (a + nil) * == a * expand by: op -a : RegExp -> RegExp deduced using (C1) : true nf: a * ----------------------------------------- target is: (a + nil) * == a * expand by: op -b : RegExp -> RegExp reduced to: true nf: empty ----------------------------------------- result: true c-rewrite time: 228ms parse time: 4ms ========================================== c-reduce in LANGUAGE : (a *) a == a (a *) reduced to: (a *) a == a (a *) ----------------------------------------- add rule (C1) : (a *) a = a (a *) ----------------------------------------- target is: (a *) a == a (a *) expand by: op b? : RegExp -> Bool reduced to: true nf: false ----------------------------------------- target is: (a *) a == a (a *) expand by: op a? : RegExp -> Bool reduced to: true nf: true ----------------------------------------- target is: (a *) a == a (a *) expand by: op nil-in _ : RegExp -> Bool [prec 40] reduced to: true nf: false ----------------------------------------- target is: (a *) a == a (a *) expand by: op -a : RegExp -> RegExp deduced using (C1) : (a (a *)) + nil == a * ----------------------------------------- target is: (a *) a == a (a *) expand by: op -b : RegExp -> RegExp reduced to: true nf: empty ----------------------------------------- add rule (C2) : (a (a *)) + nil = a * ----------------------------------------- target is: (a (a *)) + nil == a * expand by: op b? : RegExp -> Bool reduced to: true nf: false ----------------------------------------- target is: (a (a *)) + nil == a * expand by: op a? : RegExp -> Bool reduced to: true nf: true ----------------------------------------- target is: (a (a *)) + nil == a * expand by: op nil-in _ : RegExp -> Bool [prec 40] reduced to: true nf: true ----------------------------------------- target is: (a (a *)) + nil == a * expand by: op -a : RegExp -> RegExp reduced to: true nf: a * ----------------------------------------- target is: (a (a *)) + nil == a * expand by: op -b : RegExp -> RegExp reduced to: true nf: empty ----------------------------------------- result: true c-rewrite time: 311ms parse time: 7ms ========================================== c-reduce in LANGUAGE : a ((b a) *) == ((a b) *) a reduced to: a ((b a) *) == ((a b) *) a ----------------------------------------- add rule (C1) : a ((b a) *) = ((a b) *) a ----------------------------------------- target is: a ((b a) *) == ((a b) *) a expand by: op b? : RegExp -> Bool reduced to: true nf: false ----------------------------------------- target is: a ((b a) *) == ((a b) *) a expand by: op a? : RegExp -> Bool reduced to: true nf: true ----------------------------------------- target is: a ((b a) *) == ((a b) *) a expand by: op nil-in _ : RegExp -> Bool [prec 40] reduced to: true nf: false ----------------------------------------- target is: a ((b a) *) == ((a b) *) a expand by: op -a : RegExp -> RegExp reduced to: (b a) * == (b ((a b) *) a) + nil ----------------------------------------- target is: a ((b a) *) == ((a b) *) a expand by: op -b : RegExp -> RegExp reduced to: true nf: empty ----------------------------------------- add rule (C2) : (b a) * = (b ((a b) *) a) + nil ----------------------------------------- target is: (b a) * == (b ((a b) *) a) + nil expand by: op b? : RegExp -> Bool reduced to: true nf: true ----------------------------------------- target is: (b a) * == (b ((a b) *) a) + nil expand by: op a? : RegExp -> Bool reduced to: true nf: false ----------------------------------------- target is: (b a) * == (b ((a b) *) a) + nil expand by: op nil-in _ : RegExp -> Bool [prec 40] reduced to: true nf: true ----------------------------------------- target is: (b a) * == (b ((a b) *) a) + nil expand by: op -a : RegExp -> RegExp reduced to: true nf: empty ----------------------------------------- target is: (b a) * == (b ((a b) *) a) + nil expand by: op -b : RegExp -> RegExp deduced using (C1) : true nf: ((a b) *) a ----------------------------------------- result: true c-rewrite time: 434ms parse time: 18ms ========================================== c-reduce in LANGUAGE : (((a *) b) *) (a *) == (a *) ((b (a *)) *) reduced to: (((a *) b) *) (a *) == (a *) ((b (a *)) *) ----------------------------------------- add rule (C1) : (((a *) b) *) (a *) = (a *) ((b (a *)) *) ----------------------------------------- target is: (((a *) b) *) (a *) == (a *) ((b (a *)) *) expand by: op b? : RegExp -> Bool reduced to: true nf: true ----------------------------------------- target is: (((a *) b) *) (a *) == (a *) ((b (a *)) *) expand by: op a? : RegExp -> Bool reduced to: true nf: true ----------------------------------------- target is: (((a *) b) *) (a *) == (a *) ((b (a *)) *) expand by: op nil-in _ : RegExp -> Bool [prec 40] reduced to: true nf: true ----------------------------------------- target is: (((a *) b) *) (a *) == (a *) ((b (a *)) *) expand by: op -a : RegExp -> RegExp reduced to: ((a *) b (((a *) b) *) (a *)) + (a *) == (a *) ((b (a *)) *) ----------------------------------------- target is: (((a *) b) *) (a *) == (a *) ((b (a *)) *) expand by: op -b : RegExp -> RegExp deduced using (C1) : true nf: (a *) ((b (a *)) *) ----------------------------------------- add rule (C2) : ((a *) b (((a *) b) *) (a *)) + (a *) = (a *) ((b (a *)) *) ----------------------------------------- target is: ((a *) b (((a *) b) *) (a *)) + (a *) == (a *) ((b (a *)) *) expand by: op b? : RegExp -> Bool reduced to: true nf: true ----------------------------------------- target is: ((a *) b (((a *) b) *) (a *)) + (a *) == (a *) ((b (a *)) *) expand by: op a? : RegExp -> Bool reduced to: true nf: true ----------------------------------------- target is: ((a *) b (((a *) b) *) (a *)) + (a *) == (a *) ((b (a *)) *) expand by: op nil-in _ : RegExp -> Bool [prec 40] reduced to: true nf: true ----------------------------------------- target is: ((a *) b (((a *) b) *) (a *)) + (a *) == (a *) ((b (a *)) *) expand by: op -a : RegExp -> RegExp reduced to: ((a *) b (((a *) b) *) (a *)) + (a *) == (a *) ((b (a *)) *) ----------------------------------------- target is: ((a *) b (((a *) b) *) (a *)) + (a *) == (a *) ((b (a *)) *) expand by: op -b : RegExp -> RegExp deduced using (C1) : true nf: (a *) ((b (a *)) *) ----------------------------------------- add rule (C3) : ((a *) b (((a *) b) *) (a *)) + (a *) = (a *) ((b (a *)) *) ----------------------------------------- target is: ((a *) b (((a *) b) *) (a *)) + (a *) == (a *) ((b (a *)) *) expand by: op b? : RegExp -> Bool reduced to: true nf: true ----------------------------------------- target is: ((a *) b (((a *) b) *) (a *)) + (a *) == (a *) ((b (a *)) *) expand by: op a? : RegExp -> Bool reduced to: true nf: true ----------------------------------------- target is: ((a *) b (((a *) b) *) (a *)) + (a *) == (a *) ((b (a *)) *) expand by: op nil-in _ : RegExp -> Bool [prec 40] reduced to: true nf: true ----------------------------------------- target is: ((a *) b (((a *) b) *) (a *)) + (a *) == (a *) ((b (a *)) *) expand by: op -a : RegExp -> RegExp deduced using (C3) : true nf: (a *) ((b (a *)) *) ----------------------------------------- target is: ((a *) b (((a *) b) *) (a *)) + (a *) == (a *) ((b (a *)) *) expand by: op -b : RegExp -> RegExp deduced using (C1) : true nf: (a *) ((b (a *)) *) ----------------------------------------- result: true c-rewrite time: 824ms parse time: 25ms ========================================== c-reduce in LANGUAGE : (a + b) * == ((a *) (b *)) * reduced to: (a + b) * == ((a *) (b *)) * ----------------------------------------- add rule (C1) : (a + b) * = ((a *) (b *)) * ----------------------------------------- target is: (a + b) * == ((a *) (b *)) * expand by: op b? : RegExp -> Bool reduced to: true nf: true ----------------------------------------- target is: (a + b) * == ((a *) (b *)) * expand by: op a? : RegExp -> Bool reduced to: true nf: true ----------------------------------------- target is: (a + b) * == ((a *) (b *)) * expand by: op nil-in _ : RegExp -> Bool [prec 40] reduced to: true nf: true ----------------------------------------- target is: (a + b) * == ((a *) (b *)) * expand by: op -a : RegExp -> RegExp deduced using (C1) : ((a *) (b *)) * == (a *) (b *) (((a *) (b *)) *) ----------------------------------------- target is: (a + b) * == ((a *) (b *)) * expand by: op -b : RegExp -> RegExp deduced using (C1) : ((a *) (b *)) * == (b *) (((a *) (b *)) *) ----------------------------------------- add rule (C2) : ((a *) (b *)) * = (a *) (b *) (((a *) (b *)) *) ----------------------------------------- add rule (C3) : ((a *) (b *)) * = (b *) (((a *) (b *)) *) ----------------------------------------- target is: ((a *) (b *)) * == (a *) (b *) (((a *) (b *)) *) expand by: op b? : RegExp -> Bool reduced to: true nf: true ----------------------------------------- target is: ((a *) (b *)) * == (a *) (b *) (((a *) (b *)) *) expand by: op a? : RegExp -> Bool reduced to: true nf: true ----------------------------------------- target is: ((a *) (b *)) * == (a *) (b *) (((a *) (b *)) *) expand by: op nil-in _ : RegExp -> Bool [prec 40] reduced to: true nf: true ----------------------------------------- target is: ((a *) (b *)) * == (a *) (b *) (((a *) (b *)) *) expand by: op -a : RegExp -> RegExp reduced to: true nf: (a *) (b *) (((a *) (b *)) *) ----------------------------------------- target is: ((a *) (b *)) * == (a *) (b *) (((a *) (b *)) *) expand by: op -b : RegExp -> RegExp reduced to: true nf: (b *) (((a *) (b *)) *) ----------------------------------------- target is: ((a *) (b *)) * == (b *) (((a *) (b *)) *) expand by: op b? : RegExp -> Bool reduced to: true nf: true ----------------------------------------- target is: ((a *) (b *)) * == (b *) (((a *) (b *)) *) expand by: op a? : RegExp -> Bool reduced to: true nf: true ----------------------------------------- target is: ((a *) (b *)) * == (b *) (((a *) (b *)) *) expand by: op nil-in _ : RegExp -> Bool [prec 40] reduced to: true nf: true ----------------------------------------- target is: ((a *) (b *)) * == (b *) (((a *) (b *)) *) expand by: op -a : RegExp -> RegExp reduced to: true nf: (a *) (b *) (((a *) (b *)) *) ----------------------------------------- target is: ((a *) (b *)) * == (b *) (((a *) (b *)) *) expand by: op -b : RegExp -> RegExp reduced to: true nf: (b *) (((a *) (b *)) *) ----------------------------------------- result: true c-rewrite time: 596ms parse time: 7ms ========================================== ***> L = aL + b implies L = a*b ========================================== open ========================================== op L : -> RegExp ========================================== c-reduce in LANGUAGE : L == (a *) b if (a?(L) == true) and (-a(L) == L) and (b?(L) == true) and (-b(L) == nil) and ((nil-in L) == false) using cobasis for LANGUAGE: op b? : RegExp -> Bool op a? : RegExp -> Bool op nil-in _ : RegExp -> Bool [prec 40] op -a : RegExp -> RegExp op -b : RegExp -> RegExp --------------------------------------- reduced to: L == (a *) b ----------------------------------------- add rule (C1) : L = (a *) b if (a?(L) == true) and (-a(L) == L) and (b?(L) == true) and (-b(L) == nil) and ((nil-in L) == false) ----------------------------------------- target is: L == (a *) b if (a?(L) == true) and (-a(L) == L) and (b?(L) == true) and (-b(L) == nil) and ((nil-in L) == false) expand by: op b? : RegExp -> Bool reduced to: true nf: true ----------------------------------------- target is: L == (a *) b if (a?(L) == true) and (-a(L) == L) and (b?(L) == true) and (-b(L) == nil) and ((nil-in L) == false) expand by: op a? : RegExp -> Bool reduced to: true nf: true ----------------------------------------- target is: L == (a *) b if (a?(L) == true) and (-a(L) == L) and (b?(L) == true) and (-b(L) == nil) and ((nil-in L) == false) expand by: op nil-in _ : RegExp -> Bool [prec 40] reduced to: true nf: false ----------------------------------------- target is: L == (a *) b if (a?(L) == true) and (-a(L) == L) and (b?(L) == true) and (-b(L) == nil) and ((nil-in L) == false) expand by: op -a : RegExp -> RegExp deduced using (C1) : true nf: (a *) b ----------------------------------------- target is: L == (a *) b if (a?(L) == true) and (-a(L) == L) and (b?(L) == true) and (-b(L) == nil) and ((nil-in L) == false) expand by: op -b : RegExp -> RegExp reduced to: true nf: nil ----------------------------------------- result: true c-rewrite time: 393ms parse time: 42ms ========================================== close BOBJ> q Bye. awk%
To BOBJ examples homepage
To Tatami project homepage
Maintained by Joseph Goguen
Last modified: Sun Aug 17 19:17:19 PDT 2003