Reviewer 1: The authors describe algorithms for fast "funny matrix" multiplication, meaning the usual max-product operation in graphical models. They show that by reordering the search using sorted vectors, they can speed up the operation (on average over independent orderings) by at least a small exponent factor, and sometimes much more. This seems like a very interesting and exciting idea, is novel to the best of my knowledge, and I certainly plan to give it a try. The paper is well written and clear, with intuitive examples to pull the reader along. I particularly liked that the authors took the time to explain when their algorithm would work well, and when not; and to include examples in which their algorithm did not help or had only modest improvement. I wish I had more useful suggestions, but I have only some minor issues to address. 1. I do not like the style of figures interrupting the text (latex "here"), and would prefer floating 2. The bulleted list in 1.1 moves between "list form" and "conversational form" (e.g., one bullet is giving an example) 3. Eqn. (17) appears to be wrong (perhaps an old notation?); it should maximize over x_j rather than y_j. Reviewer 2: The authors present an approach for speeding the message-passing operation either when the factors of the model decompose and/or when parts that are not evidence dependent can be precomputed. The core idea is that given pre-sorting of vectors, some maximization computations can be carried out efficiently. The authors show that their method results in expected run-time theoretical benefits as well as empirical gains in efficiency. The authors tackle an important problem and develop an interesting tool that has the potential of being useful in practice. The authors actually due a good job of covering a lot of ground. However, the central high-level concern is the state of this journal submission relative to 2+ works (the plus being assigned to a workshop presentation) that have appeared in the past by the same authors (and are properly cited). While I am not of those who think that a journal paper needs to be dramatically more significant than its conference predecessors, I believe that in this case the submission does not contain sufficient content to warrant a JMLR publication. On the one hand this paper is not really a theoretical one so that the proof appendix does not constitue a significant addition. On the other hand, except from cosmetic points, the main experiment added to the paper (Protein design) is inconclusive. Overall, I would have expected something more substantial such as an empirical comparison to other approaches, possibly including approximate methods (see relevant comments below) Additional comments: Though not properly exposed, the method proposed by the authors is beneficial mostly when the state space is large. Aside from making the scope more limited, this also brings to mind alternative MAP approaches that typically benefit from large state spaces. To name just a couple of the most recent one I am aware of: Alphabet SOUP method by Gould et al. and the clamping method by Eaton and Ghahramani. More generally, the idea of clamping which typically benefits from large state spaces appears widely in the CSP community. The second central concern is that while the benefit for exact inference has merits, many of the domains suggested by the authors actually benefit from approximate inference. Thus, from a practical point of view, which is the core contribution of the paper (despite that some theorists might be excited by the sqrt(N) factor), the performance of the method within approximate propagation method is as, if not more, important than its performance in a junction tree. The non-impressive results within Sontag's algorithm which is similar to many recent propagation based developments, further highlights the need for such a comparison. Presentation-wise, the claim that the improvements ``shall increase the class of problems for which inference via max-product belief propagation is tractable'' is a gross over-statement that is not justified by the experiments. More importantly, the authors do not properly discuss the main limitation of their method in my view which is that it is effective only when the state-space is quite large.