Theory Comps Review Notes The main topics of the theory part of the comps are Automata Theory, Computability, Logic, and Complexity: Automata Theory ---------------- Chomsky's Hierarchy finite state automata <-> regular languages push-down automata <-> context-free grammars linear bound automata <-> context-sensitive grammars Turing machine <-> recursively enumerable languages Finite state automata (FSA) Structure of DFA (deterministic finite automata) M = (Q,Sigma,q_0,delta,F) Q is the set of all machine states Sigma is the input alphabet q_0 is the starting state delta is the transition function: Q x Sigma -> Q F is the set of final states NFA (non-deterministic finite automata) Any NFA can be written as a DFA. Know the procedure. Regular languages Closure properties (are regular languages closed under...) 1. union: yes 2. concatenation: yes 3. Kleene closure: yes. These are true by definition of regular languages. 4. complementatin: yes. Make all final states non-final and vice versa. 5. intersection: yes. This is a consequence of 1 and 4. 6. substitution: yes 7. homomorphism: yes. This is a consequence of 6. 8. inverse homomorphism: yes 9. quotient: yes Pumping lemma (for regular languages) This is used to show that a language is not regular. The basic idea is to show that, given an infinite language L, it is not possible to find a middle part which can be pumped in every possible way, so that the pumped result is also in L. Specifically, take your language L and assume towards a contradiction that it is regular. Then pick an n such that for all z in L, |z| >= n. Now figure out a z = uvw, and show that pumping v can lead to a z which isn't in L. This is the contradiction which says that L is not regular. Myhill-Nerode theorem I doubt this will be on the comps. NFA minimization I didn't see any problems on this either. Push-down automata (PDA) Structure of PDAs M = (Q,Sigma,Gamma,delta,q_0,Z_0,F) Q is the set of machine states Sigma is the input alphabet Gamma is the stack alphabet delta is the transition function: Q x Sigma x Gamma -> Q x Gamma* q_0 is the initial state Z_0 is the initial start stack symbol L(M) is the language accepted by a PDA due to entering of an F state N(M) is the language accepted by a PDA due to an empty stack Context-free grammars G = (V,T,P,S) V is the set of non-terminal variables T is the set of terminals P is the set of productions S is the starting variable Chomsky Normal Form All productions are of form A -> BC | a Greibach Normal Form All productions are of form A -> a@ where @ is a series of non-terminals or null. Pumping lemma for CFGs For a CFL L, choose an n such that for |z| >= n for each z in L. Now, you should be able to write z = uvwxy where: 1. |vx| >= 1 2. |vwx| <= n 3. uv^iwx^iy is in L for every i >= 0. To prove that a language L is not context-free, choose your n and z so that the first two conditions of the list are satisfied, but the last condition fails. Ogden's lemma This is very similar to the pumping lemma except in addition to choosing n, you should n or more output characters to be "distinguished". Here: 1. v and x together should contain at least 1 distinguished space 2. vwx should contain at most n distinguished positions 3. the same as three above The proof that a language L is not context-free follows the same general procedure. CFG closure properties (are CFGs closed under...) 1. union: yes 2. concatenation: yes 3. Kleene closure: yes. 4. complementatin: no. This a consequence of 1 and 5. 5. intersection: no. 6. substitution: yes 7. homomorphism: yes. A consequence of 6. 8. inverse homomorphism: yes Turing machine (TM) Basic structure M = (Q,Sigma,Gamma,delta,B,F) Q is the set of states Sigma is the set of input symbols. It is a subset of Gamma Gamma is the set of tape symbols delta is the transition function: Q x Gamma -> Q x Gamma x {L,R} B is the blank symbol F is the set final states Turing machine "primitives" Storage in the finite control A state in Q can be used to store information. Multiple tracks A Turing machine can be divided into a k-track, single head Turing machine. Checking off symbols In a multiple track machine a check-mark track can be kept. Shifting over A Turing machine can add blank spaces to the tape by shifting all nonblank characters to the right. Subroutines Two-way infinite tape Multiple tapes running in parallel Often these multi-tape machines consist of an input tape, an output tape, and some work tapes. Classifying languages It is important to be able to classify languages as regular, context-free, recursive, or recursively enumerable. Here are ways to prove that: L is regular -- use closure properties. L is not regular -- use the pumping lemma. L is context-free -- use closure properties. L is not context-free -- use the pumping lemma for CFGs. L is recursive -- show that L and ~L are both r.e. L is not recursive -- reduce halting problem to machine which makes L. L is r.e. -- build a Turing machine which enumerates L. L is not r.e. -- reduce co-halting problem to machine which makes L. Computability ---------------- If you can reduce the halting problem to problem X, problem X is not computable (L(X) is not recursive). Reductions A -> B reduce problem A to problem B This means that A is a special case of B. B is thus, potentially a "harder" problem. The way to show that A reduces to B, is to show that you can use algorithm B to build algorithm A in the following manner. To the input of A (which we'll call x) apply a transformation R, so you get a R(x). This R(x) is to be used as the input to algorithm B and the yes/no answer returned by B is to either become the yes/no answer of A or the reversal of the yes/no answer of A. This transformation R must be doable in LOGSPACE (or, more easily PTIME). Rice's theorem Suppose that C is a proper, non-empty subset of the set of all recursively enumerable languages. Then the following problem is undecidable: Given a Turing machine M, is L(M) in C? Examples: C could be the set of all empty languages, finite languages, etc. Diagonalization Another proof method is to show that a assuming you have an algorithm A which is correct, it leads to a contradiction. An example is the proof that the halting problem is undecidable. Assume that an algorithm exists which given a machine M and an input x, is able to return "yes" if M halt, "no" if it doesn't. The essence of a diagonalization proof is to take the assumed algorithm (we'll call it M_H here), and construct an algorithm from it creates a contradiction when the algorithm is run on itself. We'll construct a machine D which works like this: D(M): if M_H() = "yes" then loop forever else halt. When you try D(D), you get a contradiction. Logic ---------------- Understand vocabularies and models, and how the two relate. Understand first-order logic. Validity means an expression is true for any model. Satisfiability means an expression can be made true my at least some models. Complexity ---------------- See the first problem of Heather's third homework assignment in Winter '94. Some important class relations: TIME(f(n)) is a proper subset of TIME(f(2n+3)^3) SPACE(f(n)) is a proper subset of SPACE(f(n)log f(n)) TIME(f(n)) is a subset of NTIME(f(n)) SPACE(f(n)) is a subset of NSPACE(f(n)) NTIME(f(n)) is a subset of SPACE(f(n)) NSPACE(f(n)) is a subset of TIME(k^log n + f(n)) NSPACE(f(n)) is a subset of SPACE(f^2(n)) PSPACE = NPSPACE NSPACE(f(n)) = coNSPACE(f(n)) P is a subset of coNP coNP is a subset of PSPACE