Equivalence of Regular Expressions
Equivalence of Regular Expressions
We give a behavioral specification for regular expressions, followed by a
number of computations that check the equivalence of such expressions. No
human intervention is involved in these proofs - BOBJ's circular coinductive
rewriting algorithm (CCCRW) does all the work. We believe that CCCRW can
always decide the equivalence of regular expressions, and we think we
have a proof, but it is not yet written down. It is interesting to notice
that as many as three circularities are generated for some of the examples
below; these in effect are lemmas that are automatically proved by the
algorithm as part of the main proof. For example, three of these are
(a*b*)* = b*(a*b*)*
(a*b*)* = a*b*(a*b*)*
(a*b(a*b)*a*) + a* = a*(ba*)*
The original version of this example was due to Grigore Rosu, inspired by some work of Bogdan Warinschi; it has since been simplified by
Joseph Goguen. Note that the last computation uses conditional circular
coinductive rewriting.
1. BOBJ Source Code
In the specification REGEXP, the symbol nil denotes the
empty string, whereas empty denotes the empty regular language. Also
LETTER is the theory of an individual letter, and a language is
specified by providing a copy of LETTER each letter in its alphabet;
note that these will have REGEXP as a shared subtheory.
*** file: /groups/tatami/bobj/examples/rexp.bob
***> equivalence of regular languages
bth REGEXP is sorts RegExp Letter .
subsort Letter < RegExp .
ops empty nil : -> RegExp .
op __ : RegExp RegExp -> RegExp [assoc id: nil prec 20] .
op _+_ : RegExp RegExp -> RegExp [assoc id: empty comm idem prec 30] .
op _* : RegExp -> RegExp [prec 10] .
op nil-in _ : RegExp -> Bool [prec 40] .
vars L L' : RegExp .
eq empty L = empty .
eq nil-in L L' = nil-in L and nil-in L' .
eq nil-in L + L' = nil-in L or nil-in L' .
eq nil-in L* = true .
eq nil-in nil = true .
eq nil-in empty = false .
end
bth LETTER is pr REGEXP .
op x : -> Letter .
eq nil-in x = false .
op x? : RegExp -> Bool .
vars L L' : RegExp . var X : Letter .
eq x?(L L') = x?(L) or nil-in L and x?(L') .
eq x?(L + L') = x?(L) or x?(L') .
eq x?(L*) = x?(L) .
eq x?(X) = x == X .
eq x?(nil) = false .
eq x?(empty) = false .
op -x : RegExp -> RegExp [prec 1] .
eq -x(L L') = -x(L) L' + if (nil-in L) then -x(L') else empty fi .
eq -x(L + L') = -x(L) + -x(L') .
eq -x(L*) = -x(L) L* .
eq -x(x) = nil .
eq -x(X) = empty if X =/= x .
eq -x(nil) = empty .
eq -x(empty) = empty .
end
bth LANGUAGE is pr LETTER * (op x to a, op x? to a?, op -x to -a) +
LETTER * (op x to b, op x? to b?, op -x to -b) .
end
set cred trace on
cred a* + nil == a* .
cred aa* + nil == a* .
cred a*a* == a* .
cred (a+nil) a* == a* .
cred (a+nil)* == a* .
cred a*a == aa* .
cred a(ba)* == (a b)*a .
cred (a*b)*a* == a*(ba*)* .
cred (a+b)* == (a*b*)* .
***> L = aL + b implies L = a*b
open .
op L : -> RegExp .
cred L == a*b if (a?(L) == true) and (-a(L) == L) and
(b?(L) == true) and (-b(L) == nil) and
(nil-in L == false) .
close
2. BOBJ Output
The output shows that some of these computations are really rather complex.
awk% bobj
\|||||||||||||||||/
--- Welcome to BOBJ ---
/|||||||||||||||||\
BOBJ version 0.9.205 built: Thu Nov 14 23:26:56 PST 2002
University of California, San Diego
Sun Nov 17 15:48:27 PST 2002
BOBJ> in rexp
==========================================
***> equivalence of regular languages
==========================================
bth REGEXP
==========================================
bth LETTER
==========================================
bth LANGUAGE
==========================================
c-reduce in LANGUAGE : (a *) + nil == a *
using cobasis for LANGUAGE:
op b? : RegExp -> Bool
op a? : RegExp -> Bool
op nil-in _ : RegExp -> Bool [prec 40]
op -a : RegExp -> RegExp
op -b : RegExp -> RegExp
---------------------------------------
reduced to: (a *) + nil == a *
-----------------------------------------
add rule (C1) : (a *) + nil = a *
-----------------------------------------
target is: (a *) + nil == a *
expand by: op b? : RegExp -> Bool
reduced to: true
nf: false
-----------------------------------------
target is: (a *) + nil == a *
expand by: op a? : RegExp -> Bool
reduced to: true
nf: true
-----------------------------------------
target is: (a *) + nil == a *
expand by: op nil-in _ : RegExp -> Bool [prec 40]
reduced to: true
nf: true
-----------------------------------------
target is: (a *) + nil == a *
expand by: op -a : RegExp -> RegExp
reduced to: true
nf: a *
-----------------------------------------
target is: (a *) + nil == a *
expand by: op -b : RegExp -> RegExp
reduced to: true
nf: empty
-----------------------------------------
result: true
c-rewrite time: 290ms parse time: 11ms
==========================================
c-reduce in LANGUAGE : (a (a *)) + nil == a *
reduced to: (a (a *)) + nil == a *
-----------------------------------------
add rule (C1) : (a (a *)) + nil = a *
-----------------------------------------
target is: (a (a *)) + nil == a *
expand by: op b? : RegExp -> Bool
reduced to: true
nf: false
-----------------------------------------
target is: (a (a *)) + nil == a *
expand by: op a? : RegExp -> Bool
reduced to: true
nf: true
-----------------------------------------
target is: (a (a *)) + nil == a *
expand by: op nil-in _ : RegExp -> Bool [prec 40]
reduced to: true
nf: true
-----------------------------------------
target is: (a (a *)) + nil == a *
expand by: op -a : RegExp -> RegExp
reduced to: true
nf: a *
-----------------------------------------
target is: (a (a *)) + nil == a *
expand by: op -b : RegExp -> RegExp
reduced to: true
nf: empty
-----------------------------------------
result: true
c-rewrite time: 240ms parse time: 9ms
==========================================
c-reduce in LANGUAGE : (a *) (a *) == a *
reduced to: (a *) (a *) == a *
-----------------------------------------
add rule (C1) : (a *) (a *) = a *
-----------------------------------------
target is: (a *) (a *) == a *
expand by: op b? : RegExp -> Bool
reduced to: true
nf: false
-----------------------------------------
target is: (a *) (a *) == a *
expand by: op a? : RegExp -> Bool
reduced to: true
nf: true
-----------------------------------------
target is: (a *) (a *) == a *
expand by: op nil-in _ : RegExp -> Bool [prec 40]
reduced to: true
nf: true
-----------------------------------------
target is: (a *) (a *) == a *
expand by: op -a : RegExp -> RegExp
deduced using (C1) : true
nf: a *
-----------------------------------------
target is: (a *) (a *) == a *
expand by: op -b : RegExp -> RegExp
reduced to: true
nf: empty
-----------------------------------------
result: true
c-rewrite time: 232ms parse time: 6ms
==========================================
c-reduce in LANGUAGE : (a + nil) (a *) == a *
reduced to: (a + nil) (a *) == a *
-----------------------------------------
add rule (C1) : (a + nil) (a *) = a *
-----------------------------------------
target is: (a + nil) (a *) == a *
expand by: op b? : RegExp -> Bool
reduced to: true
nf: false
-----------------------------------------
target is: (a + nil) (a *) == a *
expand by: op a? : RegExp -> Bool
reduced to: true
nf: true
-----------------------------------------
target is: (a + nil) (a *) == a *
expand by: op nil-in _ : RegExp -> Bool [prec 40]
reduced to: true
nf: true
-----------------------------------------
target is: (a + nil) (a *) == a *
expand by: op -a : RegExp -> RegExp
reduced to: true
nf: a *
-----------------------------------------
target is: (a + nil) (a *) == a *
expand by: op -b : RegExp -> RegExp
reduced to: true
nf: empty
-----------------------------------------
result: true
c-rewrite time: 246ms parse time: 7ms
==========================================
c-reduce in LANGUAGE : (a + nil) * == a *
reduced to: (a + nil) * == a *
-----------------------------------------
add rule (C1) : (a + nil) * = a *
-----------------------------------------
target is: (a + nil) * == a *
expand by: op b? : RegExp -> Bool
reduced to: true
nf: false
-----------------------------------------
target is: (a + nil) * == a *
expand by: op a? : RegExp -> Bool
reduced to: true
nf: true
-----------------------------------------
target is: (a + nil) * == a *
expand by: op nil-in _ : RegExp -> Bool [prec 40]
reduced to: true
nf: true
-----------------------------------------
target is: (a + nil) * == a *
expand by: op -a : RegExp -> RegExp
deduced using (C1) : true
nf: a *
-----------------------------------------
target is: (a + nil) * == a *
expand by: op -b : RegExp -> RegExp
reduced to: true
nf: empty
-----------------------------------------
result: true
c-rewrite time: 228ms parse time: 4ms
==========================================
c-reduce in LANGUAGE : (a *) a == a (a *)
reduced to: (a *) a == a (a *)
-----------------------------------------
add rule (C1) : (a *) a = a (a *)
-----------------------------------------
target is: (a *) a == a (a *)
expand by: op b? : RegExp -> Bool
reduced to: true
nf: false
-----------------------------------------
target is: (a *) a == a (a *)
expand by: op a? : RegExp -> Bool
reduced to: true
nf: true
-----------------------------------------
target is: (a *) a == a (a *)
expand by: op nil-in _ : RegExp -> Bool [prec 40]
reduced to: true
nf: false
-----------------------------------------
target is: (a *) a == a (a *)
expand by: op -a : RegExp -> RegExp
deduced using (C1) : (a (a *)) + nil == a *
-----------------------------------------
target is: (a *) a == a (a *)
expand by: op -b : RegExp -> RegExp
reduced to: true
nf: empty
-----------------------------------------
add rule (C2) : (a (a *)) + nil = a *
-----------------------------------------
target is: (a (a *)) + nil == a *
expand by: op b? : RegExp -> Bool
reduced to: true
nf: false
-----------------------------------------
target is: (a (a *)) + nil == a *
expand by: op a? : RegExp -> Bool
reduced to: true
nf: true
-----------------------------------------
target is: (a (a *)) + nil == a *
expand by: op nil-in _ : RegExp -> Bool [prec 40]
reduced to: true
nf: true
-----------------------------------------
target is: (a (a *)) + nil == a *
expand by: op -a : RegExp -> RegExp
reduced to: true
nf: a *
-----------------------------------------
target is: (a (a *)) + nil == a *
expand by: op -b : RegExp -> RegExp
reduced to: true
nf: empty
-----------------------------------------
result: true
c-rewrite time: 311ms parse time: 7ms
==========================================
c-reduce in LANGUAGE : a ((b a) *) == ((a b) *) a
reduced to: a ((b a) *) == ((a b) *) a
-----------------------------------------
add rule (C1) : a ((b a) *) = ((a b) *) a
-----------------------------------------
target is: a ((b a) *) == ((a b) *) a
expand by: op b? : RegExp -> Bool
reduced to: true
nf: false
-----------------------------------------
target is: a ((b a) *) == ((a b) *) a
expand by: op a? : RegExp -> Bool
reduced to: true
nf: true
-----------------------------------------
target is: a ((b a) *) == ((a b) *) a
expand by: op nil-in _ : RegExp -> Bool [prec 40]
reduced to: true
nf: false
-----------------------------------------
target is: a ((b a) *) == ((a b) *) a
expand by: op -a : RegExp -> RegExp
reduced to: (b a) * == (b ((a b) *) a) + nil
-----------------------------------------
target is: a ((b a) *) == ((a b) *) a
expand by: op -b : RegExp -> RegExp
reduced to: true
nf: empty
-----------------------------------------
add rule (C2) : (b a) * = (b ((a b) *) a) + nil
-----------------------------------------
target is: (b a) * == (b ((a b) *) a) + nil
expand by: op b? : RegExp -> Bool
reduced to: true
nf: true
-----------------------------------------
target is: (b a) * == (b ((a b) *) a) + nil
expand by: op a? : RegExp -> Bool
reduced to: true
nf: false
-----------------------------------------
target is: (b a) * == (b ((a b) *) a) + nil
expand by: op nil-in _ : RegExp -> Bool [prec 40]
reduced to: true
nf: true
-----------------------------------------
target is: (b a) * == (b ((a b) *) a) + nil
expand by: op -a : RegExp -> RegExp
reduced to: true
nf: empty
-----------------------------------------
target is: (b a) * == (b ((a b) *) a) + nil
expand by: op -b : RegExp -> RegExp
deduced using (C1) : true
nf: ((a b) *) a
-----------------------------------------
result: true
c-rewrite time: 434ms parse time: 18ms
==========================================
c-reduce in LANGUAGE : (((a *) b) *) (a *) == (a *) ((b (a *)) *)
reduced to: (((a *) b) *) (a *) == (a *) ((b (a *)) *)
-----------------------------------------
add rule (C1) : (((a *) b) *) (a *) = (a *) ((b (a *)) *)
-----------------------------------------
target is: (((a *) b) *) (a *) == (a *) ((b (a *)) *)
expand by: op b? : RegExp -> Bool
reduced to: true
nf: true
-----------------------------------------
target is: (((a *) b) *) (a *) == (a *) ((b (a *)) *)
expand by: op a? : RegExp -> Bool
reduced to: true
nf: true
-----------------------------------------
target is: (((a *) b) *) (a *) == (a *) ((b (a *)) *)
expand by: op nil-in _ : RegExp -> Bool [prec 40]
reduced to: true
nf: true
-----------------------------------------
target is: (((a *) b) *) (a *) == (a *) ((b (a *)) *)
expand by: op -a : RegExp -> RegExp
reduced to: ((a *) b (((a *) b) *) (a *)) + (a *) == (a *) ((b (a *))
*)
-----------------------------------------
target is: (((a *) b) *) (a *) == (a *) ((b (a *)) *)
expand by: op -b : RegExp -> RegExp
deduced using (C1) : true
nf: (a *) ((b (a *)) *)
-----------------------------------------
add rule (C2) : ((a *) b (((a *) b) *) (a *)) + (a *) = (a *) ((b (a
*)) *)
-----------------------------------------
target is: ((a *) b (((a *) b) *) (a *)) + (a *) == (a *) ((b (a *))
*)
expand by: op b? : RegExp -> Bool
reduced to: true
nf: true
-----------------------------------------
target is: ((a *) b (((a *) b) *) (a *)) + (a *) == (a *) ((b (a *))
*)
expand by: op a? : RegExp -> Bool
reduced to: true
nf: true
-----------------------------------------
target is: ((a *) b (((a *) b) *) (a *)) + (a *) == (a *) ((b (a *))
*)
expand by: op nil-in _ : RegExp -> Bool [prec 40]
reduced to: true
nf: true
-----------------------------------------
target is: ((a *) b (((a *) b) *) (a *)) + (a *) == (a *) ((b (a *))
*)
expand by: op -a : RegExp -> RegExp
reduced to: ((a *) b (((a *) b) *) (a *)) + (a *) == (a *) ((b (a *))
*)
-----------------------------------------
target is: ((a *) b (((a *) b) *) (a *)) + (a *) == (a *) ((b (a *))
*)
expand by: op -b : RegExp -> RegExp
deduced using (C1) : true
nf: (a *) ((b (a *)) *)
-----------------------------------------
add rule (C3) : ((a *) b (((a *) b) *) (a *)) + (a *) = (a *) ((b (a
*)) *)
-----------------------------------------
target is: ((a *) b (((a *) b) *) (a *)) + (a *) == (a *) ((b (a *))
*)
expand by: op b? : RegExp -> Bool
reduced to: true
nf: true
-----------------------------------------
target is: ((a *) b (((a *) b) *) (a *)) + (a *) == (a *) ((b (a *))
*)
expand by: op a? : RegExp -> Bool
reduced to: true
nf: true
-----------------------------------------
target is: ((a *) b (((a *) b) *) (a *)) + (a *) == (a *) ((b (a *))
*)
expand by: op nil-in _ : RegExp -> Bool [prec 40]
reduced to: true
nf: true
-----------------------------------------
target is: ((a *) b (((a *) b) *) (a *)) + (a *) == (a *) ((b (a *))
*)
expand by: op -a : RegExp -> RegExp
deduced using (C3) : true
nf: (a *) ((b (a *)) *)
-----------------------------------------
target is: ((a *) b (((a *) b) *) (a *)) + (a *) == (a *) ((b (a *))
*)
expand by: op -b : RegExp -> RegExp
deduced using (C1) : true
nf: (a *) ((b (a *)) *)
-----------------------------------------
result: true
c-rewrite time: 824ms parse time: 25ms
==========================================
c-reduce in LANGUAGE : (a + b) * == ((a *) (b *)) *
reduced to: (a + b) * == ((a *) (b *)) *
-----------------------------------------
add rule (C1) : (a + b) * = ((a *) (b *)) *
-----------------------------------------
target is: (a + b) * == ((a *) (b *)) *
expand by: op b? : RegExp -> Bool
reduced to: true
nf: true
-----------------------------------------
target is: (a + b) * == ((a *) (b *)) *
expand by: op a? : RegExp -> Bool
reduced to: true
nf: true
-----------------------------------------
target is: (a + b) * == ((a *) (b *)) *
expand by: op nil-in _ : RegExp -> Bool [prec 40]
reduced to: true
nf: true
-----------------------------------------
target is: (a + b) * == ((a *) (b *)) *
expand by: op -a : RegExp -> RegExp
deduced using (C1) : ((a *) (b *)) * == (a *) (b *) (((a *) (b *)) *)
-----------------------------------------
target is: (a + b) * == ((a *) (b *)) *
expand by: op -b : RegExp -> RegExp
deduced using (C1) : ((a *) (b *)) * == (b *) (((a *) (b *)) *)
-----------------------------------------
add rule (C2) : ((a *) (b *)) * = (a *) (b *) (((a *) (b *)) *)
-----------------------------------------
add rule (C3) : ((a *) (b *)) * = (b *) (((a *) (b *)) *)
-----------------------------------------
target is: ((a *) (b *)) * == (a *) (b *) (((a *) (b *)) *)
expand by: op b? : RegExp -> Bool
reduced to: true
nf: true
-----------------------------------------
target is: ((a *) (b *)) * == (a *) (b *) (((a *) (b *)) *)
expand by: op a? : RegExp -> Bool
reduced to: true
nf: true
-----------------------------------------
target is: ((a *) (b *)) * == (a *) (b *) (((a *) (b *)) *)
expand by: op nil-in _ : RegExp -> Bool [prec 40]
reduced to: true
nf: true
-----------------------------------------
target is: ((a *) (b *)) * == (a *) (b *) (((a *) (b *)) *)
expand by: op -a : RegExp -> RegExp
reduced to: true
nf: (a *) (b *) (((a *) (b *)) *)
-----------------------------------------
target is: ((a *) (b *)) * == (a *) (b *) (((a *) (b *)) *)
expand by: op -b : RegExp -> RegExp
reduced to: true
nf: (b *) (((a *) (b *)) *)
-----------------------------------------
target is: ((a *) (b *)) * == (b *) (((a *) (b *)) *)
expand by: op b? : RegExp -> Bool
reduced to: true
nf: true
-----------------------------------------
target is: ((a *) (b *)) * == (b *) (((a *) (b *)) *)
expand by: op a? : RegExp -> Bool
reduced to: true
nf: true
-----------------------------------------
target is: ((a *) (b *)) * == (b *) (((a *) (b *)) *)
expand by: op nil-in _ : RegExp -> Bool [prec 40]
reduced to: true
nf: true
-----------------------------------------
target is: ((a *) (b *)) * == (b *) (((a *) (b *)) *)
expand by: op -a : RegExp -> RegExp
reduced to: true
nf: (a *) (b *) (((a *) (b *)) *)
-----------------------------------------
target is: ((a *) (b *)) * == (b *) (((a *) (b *)) *)
expand by: op -b : RegExp -> RegExp
reduced to: true
nf: (b *) (((a *) (b *)) *)
-----------------------------------------
result: true
c-rewrite time: 596ms parse time: 7ms
==========================================
***> L = aL + b implies L = a*b
==========================================
open
==========================================
op L : -> RegExp
==========================================
c-reduce in LANGUAGE : L == (a *) b if (a?(L) == true) and (-a(L) ==
L) and (b?(L) == true) and (-b(L) == nil) and ((nil-in L) ==
false)
using cobasis for LANGUAGE:
op b? : RegExp -> Bool
op a? : RegExp -> Bool
op nil-in _ : RegExp -> Bool [prec 40]
op -a : RegExp -> RegExp
op -b : RegExp -> RegExp
---------------------------------------
reduced to: L == (a *) b
-----------------------------------------
add rule (C1) : L = (a *) b if (a?(L) == true) and (-a(L) == L) and
(b?(L) == true) and (-b(L) == nil) and ((nil-in L) == false)
-----------------------------------------
target is: L == (a *) b if (a?(L) == true) and (-a(L) == L) and
(b?(L) == true) and (-b(L) == nil) and ((nil-in L) == false)
expand by: op b? : RegExp -> Bool
reduced to: true
nf: true
-----------------------------------------
target is: L == (a *) b if (a?(L) == true) and (-a(L) == L) and
(b?(L) == true) and (-b(L) == nil) and ((nil-in L) == false)
expand by: op a? : RegExp -> Bool
reduced to: true
nf: true
-----------------------------------------
target is: L == (a *) b if (a?(L) == true) and (-a(L) == L) and
(b?(L) == true) and (-b(L) == nil) and ((nil-in L) == false)
expand by: op nil-in _ : RegExp -> Bool [prec 40]
reduced to: true
nf: false
-----------------------------------------
target is: L == (a *) b if (a?(L) == true) and (-a(L) == L) and
(b?(L) == true) and (-b(L) == nil) and ((nil-in L) == false)
expand by: op -a : RegExp -> RegExp
deduced using (C1) : true
nf: (a *) b
-----------------------------------------
target is: L == (a *) b if (a?(L) == true) and (-a(L) == L) and
(b?(L) == true) and (-b(L) == nil) and ((nil-in L) == false)
expand by: op -b : RegExp -> RegExp
reduced to: true
nf: nil
-----------------------------------------
result: true
c-rewrite time: 393ms parse time: 42ms
==========================================
close
BOBJ> q
Bye.
awk%
To BOBJ examples homepage
To Tatami project homepage
Maintained by Joseph Goguen
Last modified: Sun Aug 17 19:17:19 PDT 2003