CSE 250A LECTURE NOTES

January 29, 2001
 
 

ANNOUNCEMENTS

Today's handout is the description of the second project assignment.  Your reports for the first project are due later today, not now in class.

A general comment on the current project: it is very difficult to show convincingly that one algorithm is systematically faster than another, especially when the algorithms both require exponential time, and running times have very high variance.  Perhaps the most important aspect of the current project is to design, carry out, and report on credible experiments.

On Wednesday this week we will start five minutes early, at 1:45pm.  Next week, Wed's lecture will be replaced by a makeup lecture on Friday February 9.
 
 

MIXING AND MATCHING CSP METHODS

Applying the AC-3 algorithm at every step of search is typically too expensive computationally.  A heuristic combination of ideas is often recommended in practice: A further idea for search rearrangement is called dependency-directed backtracking (DDB).  When a failure is detected, the idea is to determine which values of which currently instantiated variables cause the failure.  Then, search returns to one of these variables, and the offending combination of values is never tried again.  Unfortunately DDB is complicated to implement and the time saved by pruning the search tree further is less than the time required to perform the DDB.

Efficient algorithms for repairing broken solutions are a very important area of research for applications.  Often, a large scheduling problem will be solved and then one or a few constraints will be added or changed.  We want to find a new solution without starting from scratch, and without changing the old solution too much.  Randomized algorithms like the ones we will see next are often used for this repair task.  See the paper SPIKE: Intelligent Scheduling of Hubble Space Telescope Observations by Mark D. Johnston and Glenn E. Miller, for example.
 
 

THE DAVIS-PUTNAM-LOVELAND PROCEDURE

DPLL is an algorithm for propositional satisfiability.  This is the fastest known algorithm for satisfiability testing that is not just sound, but also complete, unlike all randomized local search methods.  A version was first published in 1960 by Martin Davis and Hilary Putnam and the algorithm is often called the Davis-Putnam method.  However the method in its present form is really due to Davis, G. Logemann, and Donald Loveland in 1962.

DPLL is depth-first search with backtracking and unit propagation.  Unit propagation means repeated applications of the following two inference rules to the clauses of the sentence to be satisfied:

x    -x | y1 | ... | yn            -x    x | y1 | ... | yn
-----------------------            -----------------------
    y1 | ... | yn                        y1 | ... | yn

The first of these rules says that if some clause consists of a variable alone, say x, then any other clause containing not x can be simplified to consist of just its other disjuncts.  The second rule is similar for a clause consisting of a negated variable alone.

Definitions: A literal is a variable or a negated variable.  A unit clause is one that consists of a single literal.  The first rule essentially infers that x =T and propagates consequences of this constraint.  The second rule is similar for x = F.  Performing unit propagation with x, i.e. with x = T, causes further unit propagation if and only if n = 1, i.e. (not x) or y is a binary clause.

With good data structures, unit propagation can always be done in time linear in the size of the entire set of clauses to be satisfied.
 
 

DPLL HEURISTICS

Because each variable has a domain of size exactly two, we cannot directly apply the "most constrained" heuristic for ordering variables.  Instead, a good heuristic is to choose a variable that causes the greatest amount of constraint propagation.  For efficiency, this is approximated by counting how many binary clauses each variable appears in,
because one application of unit propagation causes another only if the first application is to a binary clause.

Given which variable should be instantiated next, no useful heuristic is known for which value of that variable should be investigated first.  Therefore it is beneficial to choose a variable that causes many unit propagations for both assignments to it.  One widely used heuristic scoring function for variables is

score(x) = 1024*pc(x)*nc(x) + pc(x) + nc(x) + 1
where pc(x) and nc(x) are the numbers of times x appears positively and negatively in clauses of length 2.

For difficult 3SAT problems with clause to variable ratio around 4.26, the running time of a good implementation of the DPLL method is empirically O(2v/19.5).  Problems with 400 or more variables are not solvable in reasonable time.
 
 

LOCAL SEARCH

Now we start investigating a major new topic, algorithms that do not directly attempt to find globally optimal solutions to search problems.

A*, backtracking search for CSPs, and all the other search methods we have seen so far are systematic and global.  With A* for example, at each step, we expand the node that appears best among all previously discovered nodes.

But in general, global algorithms have infeasible space complexity for large problems.  One response is to consider iterative deepening variants, and indeed an IDA* method exists.

Another response is to consider search methods that ignore global information.  The general framework is:

choose a random initial state
while not termination-condition do
    expand current state to find its children
    evaluate the h value of each child state
    let new current state be child with lowest h value
 
 


Copyright (c) by Charles Elkan, 2001.