A general comment on the current project: it is very difficult to show convincingly that one algorithm is systematically faster than another, especially when the algorithms both require exponential time, and running times have very high variance. Perhaps the most important aspect of the current project is to design, carry out, and report on credible experiments.
On Wednesday this week we will start five minutes early, at 1:45pm.
Next week, Wed's lecture will be replaced by a makeup lecture on Friday
February 9.
Efficient algorithms for repairing broken solutions are a very important
area of research for applications. Often, a large scheduling problem
will be solved and then one or a few constraints will be added or changed.
We want to find a new solution without starting from scratch, and without
changing the old solution too much. Randomized algorithms like the
ones we will see next are often used for this repair task. See the
paper SPIKE:
Intelligent Scheduling of Hubble Space Telescope Observations by
Mark D. Johnston and Glenn E. Miller, for example.
DPLL is depth-first search with backtracking and unit propagation. Unit propagation means repeated applications of the following two inference rules to the clauses of the sentence to be satisfied:
x -x | y1 | ... | yn
-x x | y1 | ... | yn
-----------------------
-----------------------
y1 | ... | yn
y1 | ... | yn
The first of these rules says that if some clause consists of a variable alone, say x, then any other clause containing not x can be simplified to consist of just its other disjuncts. The second rule is similar for a clause consisting of a negated variable alone.
Definitions: A literal is a variable or a negated variable. A unit clause is one that consists of a single literal. The first rule essentially infers that x =T and propagates consequences of this constraint. The second rule is similar for x = F. Performing unit propagation with x, i.e. with x = T, causes further unit propagation if and only if n = 1, i.e. (not x) or y is a binary clause.
With good data structures, unit propagation can always be done in time
linear in the size of the entire set of clauses to be satisfied.
Given which variable should be instantiated next, no useful heuristic is known for which value of that variable should be investigated first. Therefore it is beneficial to choose a variable that causes many unit propagations for both assignments to it. One widely used heuristic scoring function for variables is
score(x) = 1024*pc(x)*nc(x) + pc(x) + nc(x) + 1where pc(x) and nc(x) are the numbers of times x appears positively and negatively in clauses of length 2.
For difficult 3SAT problems with clause to variable ratio around 4.26,
the running time of a good implementation of the DPLL method is empirically
O(2v/19.5).
Problems with 400 or more variables are not solvable in reasonable time.
A*, backtracking search for CSPs, and all the other search methods we have seen so far are systematic and global. With A* for example, at each step, we expand the node that appears best among all previously discovered nodes.
But in general, global algorithms have infeasible space complexity for large problems. One response is to consider iterative deepening variants, and indeed an IDA* method exists.
Another response is to consider search methods that ignore global information. The general framework is:
choose a random initial state
while not termination-condition do
expand current state to find its children
evaluate the h value of each child state
let new current state be child with lowest h value