* Completeness & An NP-Complete Problem
* Cook-Levin Theorem: SAT is NP-complete
* Additional NP-complete problems

COMPLETENESS

A language L is complete for a class C if L is in C and L' <= L for every L' in C. (Here <= means "reduces to".)

Suppose <= is the many-one reduction "<=m".

Claim: A_TM is complete for the class of Turing-recognizable languages.

Proof: Recall that A_TM = {<M,w> : M is a TM that accepts w}. A_TM is Turing-recognizable. Consider any Turing-recognizable language L. There is a TM M that recognizes it. We give the reduction f: f(w) = <M,w>. Clearly f is computable, and w is in L iff <M,w> is in A_TM. End Proof

Consider NP-completeness with the reduction given by <=p (polynomial-time many-one reduction). We can design an analogous language A_NP.

A_NP = {<M,w,1^t> : M is a non-deterministic Turing machine that accepts w within t steps}

Here t is written in unary (as 1^t), so that t is at most the length of the input.

Theorem: A_NP is complete for NP.

Proof: A_NP is in NP because, given <M,w,1^t> in A_NP, there is an accepting branch of length at most t; this branch serves as a certificate and can be verified in time polynomial in the input length. Let L be any language in NP. There exists an NDTM M that decides L in polynomial time, say p(n) time. For any w, we set f(w) = <M,w,1^{p(|w|)}>. Clearly w is in L iff f(w) is in A_NP, and f can be computed in polynomial time. End Proof

COOK-LEVIN THEOREM

A Boolean formula is a formula composed of variables, negations, ANDs, and ORs.

SAT = {f : f is a Boolean formula that has a satisfying assignment}

Theorem: SAT is NP-complete.

Proof: There are two standard proofs, one using a reduction from Turing machines and the other using a reduction from circuits. (NP can also be defined in terms of circuits.) We give only the first proof here.

We first prove that SAT is in NP using the verifier definition. For any formula f in SAT, a certificate is simply a satisfying assignment: it assigns one truth value per variable, and it can be checked by evaluating f, which takes polynomial time.

We next prove that every language in NP poly-time reduces to SAT. Let L be a language in NP, decided by a poly-time NDTM N. Let n^k be the running time of N. Given an input w of length n, we build a Boolean formula whose satisfying assignments correspond to the accepting branches of N on w.
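The certificate check from the first half of the proof (SAT is in NP) can be sketched in Python. This is only an illustration under assumed conventions that do not come from the notes: a formula is a nested tuple tagged "var", "not", "and", or "or", and the names evaluate and verify_sat are hypothetical. The verifier makes one pass over the formula, so its running time is polynomial in the formula size.

```python
from typing import Dict, Tuple

# Assumed encoding (not from the notes): a formula is a nested tuple
#   ("var", name) | ("not", f) | ("and", f1, f2, ...) | ("or", f1, f2, ...)
Formula = Tuple


def evaluate(formula: Formula, assignment: Dict[str, bool]) -> bool:
    """Evaluate a Boolean formula under a truth assignment."""
    op = formula[0]
    if op == "var":
        return assignment[formula[1]]
    if op == "not":
        return not evaluate(formula[1], assignment)
    if op == "and":
        return all(evaluate(sub, assignment) for sub in formula[1:])
    if op == "or":
        return any(evaluate(sub, assignment) for sub in formula[1:])
    raise ValueError(f"unknown connective: {op!r}")


def verify_sat(formula: Formula, certificate: Dict[str, bool]) -> bool:
    """Verifier for SAT: accept iff the certificate satisfies the formula."""
    return evaluate(formula, certificate)


# (x OR NOT y) AND (y OR z)
example = (
    "and",
    ("or", ("var", "x"), ("not", ("var", "y"))),
    ("or", ("var", "y"), ("var", "z")),
)
good_certificate = {"x": False, "y": False, "z": True}
bad_certificate = {"x": False, "y": True, "z": False}
```

With these definitions, verify_sat(example, good_certificate) accepts and verify_sat(example, bad_certificate) rejects. The verifier never searches for an assignment; it only checks the one it is handed.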
We encode branch computations of N as a Boolean formula in a manner similar to what we did for PCP. Let T = n^k + 3. Consider a T x T grid in which each row stores a configuration. In the formula, we have variables for each cell and encode the following:

(a) the first row stores the start configuration;
(b) the ith row yields the (i+1)st row; and
(c) the last row is an accepting configuration.

Since we are not sure that N runs in time exactly n^k, we relax (b) and (c) a bit:

(b') the ith row either is the same as the (i+1)st row or yields the (i+1)st row; and
(c') some row is an accepting configuration.

Note that we can instead assume wlog that N runs in exactly n^k time, if needed, or we can stipulate that once N is in the accepting state, it stays in the accepting state.

(1) For each cell (i,j) and each c in Q + Gamma + Sigma + {#} (here + denotes union), we have a variable v_{ijc}, which is true iff cell (i,j) contains c. We first need a formula stating that, for every i,j, exactly one v_{ijc} is true.

f_cell = AND_ij [ (OR_c v_{ijc}) AND (AND_{a != b} NOT(v_{ija} AND v_{ijb})) ]

(2) The first row is the start configuration.

f_first = v_{11#} AND v_{12q_0} AND (AND_j v_{1(j+2)w_j}) AND v_{1T#}

Here w = w_1 ... w_n and j ranges over 1, ..., n. (For the first row to be exactly the start configuration, the remaining cells (1,n+3), ..., (1,T-1) must also be constrained to hold the blank symbol.)

(3) Some row is accepting.

f_accept = OR_ij v_{ijq_a}

(4) Encoding the transitions. In a manner similar to PCP, we encode that the cells in a row either match the cells above or differ according to the transition function. In particular, we consider the 2x3 window of six cells (i,j), (i,j+1), (i,j+2), (i+1,j), (i+1,j+1), and (i+1,j+2). We allow the following matches, written as top row / bottom row, where d is the transition function of N:

  aqb/rac  if (r,c,L) is in d(q,b)
  qab/crb  if (r,c,R) is in d(q,a)
  qab/cab
  abq/abc
  abc/abc
  abc/dbc
  abc/abq

We can represent the disjunction of these possibilities for the window at (i,j) as a formula f_ij, and we add the formula

f_move = AND_ij f_ij

Claim: If the top row of the grid is the start configuration and each 2x3 window is legal, then each row of the grid is a configuration that legally follows the preceding configuration.
Proof: Every cell that is not adjacent to a state symbol does not change. The 2x3 window with a state symbol in the top center exactly encodes a legal move. So if all windows are legal, the bottom row is a configuration that legally follows the top row. End Proof of Claim

Let f = f_cell AND f_first AND f_accept AND f_move. We thus obtain that N accepts a string w iff the formula f we have constructed is satisfiable.

We need to show that the construction is poly-time. Essentially, we need to bound the size of f. Gamma, Q, and Sigma are fixed (they depend on N, not on w), so each cell contributes a constant number of variables and a constant-size subformula. The size of f is therefore O(T^2) = O(n^{2k}). End Proof of Theorem

Theorem: 3SAT is NP-complete.

Proof: 3SAT is in NP because it is a special case of SAT. For hardness, one attempt would be to convert an arbitrary SAT formula to a 3SAT formula. Unfortunately, arbitrary formulae may incur an exponential blowup when converted to CNF. However, one can easily see that in the above proof for SAT, f is essentially a CNF formula, and converting CNF to 3CNF is quite easy. A clause (v_1 OR v_2 OR ... OR v_m) with m > 3 can be written, using new variables w_1, ..., w_{m-3}, as

(v_1 OR v_2 OR w_1) AND (~w_1 OR v_3 OR w_2) AND (~w_2 OR v_4 OR w_3) AND ... AND (~w_{m-3} OR v_{m-1} OR v_m)

If the first formula is satisfied with v_i true, then the second formula is satisfied by setting w_j to true for all j <= i-2 and to false thereafter. If the first formula is not satisfied (every v_i is false), then no assignment of the w_j's satisfies the second formula. Note that this is just a linear blowup. End Proof
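The clause-splitting step above can be sketched in Python. This is a minimal sketch under assumed conventions that are not part of the notes: literals are nonzero integers (k for variable k, -k for its negation), and the names split_clause, to_3cnf, and satisfiable are hypothetical. Clauses with at most three literals are left unchanged rather than padded to exactly three, and the brute-force satisfiable check is exponential; it is there only to compare the two formulas on small inputs.

```python
from itertools import product
from typing import List, Tuple

Clause = List[int]  # assumed encoding: k is variable k, -k is NOT variable k


def split_clause(clause: Clause, next_var: int) -> Tuple[List[Clause], int]:
    """Split one clause into clauses of at most 3 literals.

    next_var is the first unused variable index; the new variables
    w_1, ..., w_{m-3} are numbered from it. Returns the new clauses and
    the updated first unused index.
    """
    m = len(clause)
    if m <= 3:
        return [list(clause)], next_var
    w = list(range(next_var, next_var + m - 3))
    out = [[clause[0], clause[1], w[0]]]              # (v_1 OR v_2 OR w_1)
    for j in range(1, m - 3):
        out.append([-w[j - 1], clause[j + 1], w[j]])  # (~w_j OR v_{j+2} OR w_{j+1})
    out.append([-w[-1], clause[-2], clause[-1]])      # (~w_{m-3} OR v_{m-1} OR v_m)
    return out, next_var + m - 3


def to_3cnf(clauses: List[Clause], num_vars: int) -> Tuple[List[Clause], int]:
    """Apply split_clause to every clause; returns (clauses, total variables)."""
    result: List[Clause] = []
    next_var = num_vars + 1
    for clause in clauses:
        pieces, next_var = split_clause(clause, next_var)
        result.extend(pieces)
    return result, next_var - 1


def satisfiable(clauses: List[Clause], num_vars: int) -> bool:
    """Brute-force check over all assignments (small inputs only)."""
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True
    return False


# One long clause (v_1 OR ... OR v_5), then the same clause with every v_i forced false.
sat_cnf = [[1, 2, 3, 4, 5]]
unsat_cnf = [[1, 2, 3, 4, 5], [-1], [-2], [-3], [-4], [-5]]
sat_3cnf, sat_vars = to_3cnf(sat_cnf, 5)
unsat_3cnf, unsat_vars = to_3cnf(unsat_cnf, 5)
```

A clause with m > 3 literals becomes m-2 clauses using m-3 new variables, which matches the linear blowup noted in the proof. On the two small formulas here, satisfiable gives the same answer before and after conversion.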