CSE 250A LECTURE NOTES

February 5, 2001
 
 

ANNOUNCEMENTS

Today's handout consists of some notes about the reports submitted for the web crawler project.  Most of the important general issues for the second project are similar to those for the first project.  I'll discuss the notes briefly.  There are at least two additional important issues: Remember that this week, Wednesday's lecture will be replaced by a makeup lecture on Friday.
 
 

MIXING RANDOMNESS AND GREEDINESS

Forgetting about the metaphor from physics, the advantage of simulated annealing seems to be that it combines a random walk with greedy local search, and does not bother to evaluate all children of the current state.  Let's consider a 3SAT algorithm that does this combination directly:
choose a random initial truth assignment
while not termination condition do
    randomly choose an unsatisfied clause
    flip one variable of this clause as follows:
        with fixed probability p
             select the variable randomly
        or, with probability 1-p:
             select the variable greedily, i.e. select
             the variable that yields the best h value
This algorithm is called Walksat.  Note that at each step, the randomly chosen clause is satisfied, but other clauses may become unsatisfied.

The parameter p is called the "mixing probability" and determined approximately by experiment for a given class of CNF formulas.  For random hard 3SAT problems (those with the ratio of clauses to variables around 4.25) p = 0.5 works well.  For 3SAT formulas with more structure, as generated in many applications, slightly more greediness, i.e. p < 0.5, is often better.

Because of its random walk component, Walksat is less likely than GSAT to become stuck in the region around a local optimum.  Therefore, it is more likely to find a solution without any restarts and the termination condition, which says when to restart with a new random assignment, should be more stringent.  Empirically, restarting after O(n2) flips, where n is the number of variables, works well.

Walksat is the best known algorithm for 3SAT.  It outperforms simulated annealing and is capable of solving problems with around 2000 variables in a few hours.
 
 

HOW TO TUNE A LOCAL SEARCH METHOD

As discussed in class, the idea is to choose settings for the parameter of the algorithm in order to minimize the ratio m/s, where m is the long-term average of the h values achieved, and s is the standard deviation of these values.

For details see the paper Evidence for Invariants in Local Search by David McAllester, Bart Selman, and Henry Kautz, AAAI'97.
 
 

METAPHORICAL ALGORITHMS

Search algorithms, usually local and randomized, have been proposed based on numerous phenomena from physics, biology, sociology, and economics.  Adapted from Andrew Moore of CMU, here is a list: Rigorous experimental or mathematical results have not been established for any of these approaches.  In general, there is no reason to expect that whatever works best in one arena (ant colonies for example) will also work best in another arena (human societies or computation for example).  A metaphor can be inspirational, but there is no reason to expect it to be literally true or more than partially useful.

For example, after simulated annealing evaluates a randomly chosen neighbor, the algorithm always moves to the neighbor if it is better, following an intuition from physics.  But Walksat does better by sometimes not moving to a neighbor even if it is better.

There are still major opportunities for new heuristic approaches to search and combinatorial optimization.  Some important issues include:

The fundamental computational issue seems to be how to trade-off improvement along multiple conflicting dimensions.  Research should be guided by experimentation and logical argument more than by loose analogies.

The word "metaphorical" is not automatically pejorative.  What it suggests is the need for additional thought, because the evaluation of metaphorical methods cannot be purely quantitative,  Competing metaphors are always available; which
are better and why?

One issue with metaphor-based methods is the balance between focus on the nature of the method and focus on the nature of the task.  What are the fundamental properties of a task that cause certain methods to be suitable?  For example, there is a basic difference between offline search, as done by A*, and online search, as done in the first 250A project, where costs incurred already are sunk.
 



Copyright (c) by Charles Elkan, 2001.