SOCS 2018, first day

IJCAI 2018 and SOCS 2018 are being held in Stockholm from July 13th to July 19th as a part of the Federated AI Meeting. In this and other posts I will give some short comments on interesting things I’ve heard. If anything is wrong or misrepresented, please don’t hesitate to get in touch with me.

On the second day of the meeting I attended SOCS 2018, the 11th Symposium on Combinatorial Search. The proceedings for SOCS 2018 are available here.

Holte’s History of A*

Robert Holte gave a very interesting invited talk on his own view of the history of A* search.

The start of A* was in the mid-60s, on a computer called the SDS 940. By then, the idea of a state evaluation function was already old (dating back to ideas on chess playing from the mid-40s). However, it was not until 1966 that it was used in single-agent search, in the Graph Traverser, which would now be called greedy best-first search.

A* was created in the late 60s for the Shakey the Robot project (1966-1972), and published in 1968. A quote by Nils Nilsson from 1974:

The problem of efficiently searching a graph has essentially been solved and thus no longer occupies AI researchers.

A rebuttal quote by Ira Pohl from 1977:

What could be meant by this I am not quite sure

Ed Feigenbaum and others put it more harshly:

recent work makes it clear that heuristic search theory is far from complete

Patrick Hall, IJCAI 1971:

[A* is] a somewhat disguised example of branch-and-bound.

Essentially, the research directions in the 70s for heuristic search are still active today.

Judea Pearl, 1983:

Research on Search and heuristics is hailed by some for proving that AI is capable of breeding genuine scientific results and is shunned by others for not being where the real action truly is

So why is A* so contentious?

  • Limitations and exaggerated claims
  • Critics claiming that A* is not interesting
  • Overlooked opportunities

In particular, Holte warns against over-selling the work in the introduction, since the original A* paper could be read as saying that A* optimally solves graph traversal and therefore solves many problems in engineering. Also, the optimality theorem holds only under specific conditions: for example, bi-directional search can expand fewer nodes, the successor function must be a black box, and the result is contingent on only the heuristic function being used to guide the search.

The original paper from 1968 has the optimality claim. There was a “correction” published in 1972 that is actually wrong. The real proof of optimality is the Dechter and Pearl paper in the Journal of the ACM from 1984.

Out of the application papers from 1969-79 (35 in total) that referenced A*, only 2 actually used A*. In particular, Shakey ended up not using A*; it used BFS. Shakey actually had two parts:

  • Navigation (A*, BFS): How to go from position A to position B
  • Planning (STRIPS): How to push a box from position A to position B

The planning part never used any ideas from heuristic search.

A recent interesting development is that A* can be generalized from additive path costs (a path's cost is the sum of its edge lengths) to general cost algebras.
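
As a minimal sketch (my own, not from the talk) of what such a generalization looks like, here is a Dijkstra variant parameterised by a cost algebra, i.e. a combination operator and its identity. The graph and the two algebras below are purely illustrative examples:

```python
import heapq

def generic_dijkstra(graph, source, combine, identity):
    """Dijkstra parameterised by a cost algebra (combine, identity).

    graph: dict mapping node -> list of (neighbour, edge_cost) pairs.
    combine: associative operation extending a path cost by an edge cost.
    identity: cost of the empty path.
    Correctness requires the algebra to be isotone (extending a path
    never decreases its cost), which both examples below satisfy.
    """
    best = {source: identity}
    frontier = [(identity, source)]
    while frontier:
        cost, node = heapq.heappop(frontier)
        if cost != best.get(node):
            continue  # stale queue entry, a better cost was found later
        for neighbour, edge_cost in graph[node]:
            new_cost = combine(cost, edge_cost)
            if neighbour not in best or new_cost < best[neighbour]:
                best[neighbour] = new_cost
                heapq.heappush(frontier, (new_cost, neighbour))
    return best

graph = {
    'a': [('b', 3), ('c', 1)],
    'b': [('d', 1)],
    'c': [('b', 1), ('d', 5)],
    'd': [],
}

# Additive algebra: classical shortest paths.
additive = generic_dijkstra(graph, 'a', lambda p, e: p + e, 0)
# Bottleneck algebra: minimise the largest edge on the path.
bottleneck = generic_dijkstra(graph, 'a', max, 0)
```

Swapping the algebra changes the notion of "shortest" without touching the search itself, which is the point of the generalization.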

In general, one should both

  • Try to find new areas to apply results in,
  • Not over-sell the applicability of results.

Search in Road Networks and Navigation Meshes

Shortest path problems are well studied. There are classical algorithms such as Dijkstra and A*. But for really large graphs, pre-processing can speed up the search significantly.

Contraction hierarchies (CH) is an abstraction technique that, for a reasonably low pre-processing cost, gives very fast queries. A CH essentially adds optimality-preserving short-cut paths. Creating these also gives an ordering of the nodes, which can be used in a bi-directional Dijkstra.

The goal of the work is to decouple the search algorithm from the CH. The key insight is that an optimal path first increases in the ordering and then decreases, leading to an apex node. This can be used in several ways, for example by creating, for each node and out-edge combination, a bounding box of the nodes it is the apex for. The pre-processing can be done at different levels, leading to a trade-off between the pre-processing gain and its cost.

Implementations are available.

Very interesting and lots of insights combined with experiments.

An important question: why is it not ok to solve these queries with A*, which takes around 1 second, and why are these advanced techniques needed to get the time down to around 1 microsecond? The answer is two-fold: for applications such as Google Maps it is very important for reasons of scale, and for games the allowed time-slice is very, very short.

Fast k-Nearest Neighbor On A Navigation Mesh

The traditional question is finding the k nearest neighbours in a plane. The new set-up asks: what if there are polygonal obstacles?

The previous state of the art is from 2004 and uses something called incremental visibility graphs, but unfortunately these take a long time to build (quadratic in the number of vertices) and are created anew for each query.

The work presented extends something called Polyanya (Cui et al. 2017), an online optimal algorithm for shortest paths on a navigation mesh, to be useful for kNN. The base Polyanya algorithm seems very interesting to look into more.

The simplest version is to run Polyanya once per potential candidate, which is ok if the number of candidates is low. If the candidates are many, Polyanya can be modified to search for them in combination by changing the search heuristic in one way; for clustered candidates, the search heuristic is modified in another way.

Results are very fast compared to previous state of the art.

Implementations are available.

k-Nearest Neighbors on Road Networks: Euclidean Heuristic Revisited

The key insight here is that the Euclidean heuristic is surprisingly effective for road networks.

Code is available.

Online Selection and Refinement

A Regression-Based Methodology for Online Algorithm Selection

The goal of algorithm selection is to choose the best algorithm for a problem. This is typically modeled as a classification problem, where problem instances are classified into the different algorithms.

Online algorithm selection improves the classification after each instance has been classified and tested. The problem is that each such choice only gives feedback for a single instance-algorithm pair. This turns the problem into a contextual multi-armed bandit.

Tests are run against ASlib, which is a standard benchmark set for algorithm selection.

The online strategies are simple greedy, epsilon-greedy, and a variant of UCB where the prediction is adjusted by the standard deviation. With training data, the exploration in UCB incurs extra cost during the run, and in the end it is not actually better than greedy.
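
To give a feel for these online strategies, here is my own much-simplified sketch of epsilon-greedy selection, using plain running means per algorithm where the paper uses regression models over instance features:

```python
import random

def epsilon_greedy_selection(algorithms, instances, run, epsilon=0.1):
    """Pick an algorithm per instance: usually the one with the best
    running mean cost so far, with probability epsilon a random one.
    `run(algorithm, instance)` is assumed to return the observed cost."""
    totals = {a: 0.0 for a in algorithms}
    counts = {a: 0 for a in algorithms}
    for instance in instances:
        untried = [a for a in algorithms if counts[a] == 0]
        if untried:
            choice = untried[0]  # try every algorithm at least once
        elif random.random() < epsilon:
            choice = random.choice(algorithms)  # explore
        else:
            # exploit: pick the best running mean seen so far
            choice = min(algorithms, key=lambda a: totals[a] / counts[a])
        cost = run(choice, instance)
        totals[choice] += cost
        counts[choice] += 1
    return {a: totals[a] / counts[a] for a in algorithms if counts[a]}

# Toy example: "fast" always costs 1, "slow" always costs 5.
means = epsilon_greedy_selection(
    ['fast', 'slow'], range(20),
    run=lambda alg, inst: 1 if alg == 'fast' else 5,
    epsilon=0.0)  # pure greedy, for determinism
```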

Without training data, it is possible to use a set of algorithms to outperform the single best algorithm.

Online Refinement of Cartesian Abstraction Heuristics

The general question here is whether a heuristic used in planning/search, which is typically computed during pre-processing, can be refined online.

The planning set-up uses Cartesian product abstractions (CPA). For example, we know that a package is in a truck, and that the truck is in one of the locations A and B. This is an abstraction since we don’t know the concrete ground state. The CPA is modified during search by refinement (splitting states), merging (joining states), and changing the costs.

In many cases this gives dramatically fewer nodes to search through; however, more than 80% of the time is then spent handling the CPA updates. By adjusting the amount of CPA modification (in particular, avoiding merging states), it is possible to perform better than the standard algorithms.

Making Hill-Climbing Great Again through Online Relaxation Refinement and Novelty Pruning

This presentation is in the classical planning setting. It uses something called k novelty pruning, which removes states where no new k-tuple of facts exists. A new insight here is that combining novelty-pruning-based search with hill climbing is effective.

Merge-and-Shrink Heuristics for Classical Planning: Efficient Implementation and Partial Abstractions

How should a transition system be represented? One approach is to use an adjacency matrix. Another is to store lists of transitions.

The latter representation (edge lists) can be improved by merging locally equivalent labels: if two labels have the same transitions in a state, their transition lists can be merged. It is important to take care that transformations such as shrinking, merging, and label reduction are handled correctly.
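
The core merging idea can be sketched as follows (an illustrative sketch of mine, not the paper's actual data structure): group labels that induce exactly the same transitions, so each distinct transition set is stored only once.

```python
from collections import defaultdict

def merge_equivalent_labels(label_transitions):
    """Group labels with identical transition sets so that each
    distinct set is stored only once.

    label_transitions: dict label -> frozenset of (src, dst) pairs.
    Returns: dict mapping a tuple of equivalent labels -> shared set."""
    groups = defaultdict(list)
    for label, transitions in label_transitions.items():
        groups[transitions].append(label)
    return {tuple(sorted(labels)): transitions
            for transitions, labels in groups.items()}

merged = merge_equivalent_labels({
    'load':   frozenset({(0, 1)}),
    'unload': frozenset({(1, 0)}),
    'drive':  frozenset({(0, 1)}),  # same transitions as 'load'
})
```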

The new representation helps in improving the performance of many algorithms.

Hardness & Tractability

A Suboptimality Bound for $2^k$ Grid Path Planning

Grid path planning is used in robotics and video games. Grids are typically 4-connected (north, west, east, and south) or 8-connected (N, S, W, E, NW, NE, SW, SE). These correspond to 2^2 moves for 4-connected and 2^3 for 8-connected. The more general 2^k connectivity allows jumps further and further away.
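
My reading of the 2^k move sets can be sketched as follows: starting from the 4 cardinal directions, each increase of k inserts, between every pair of neighbouring directions, their vector sum. Treat this as illustrative rather than the paper's exact definition:

```python
def directions_2k(k):
    """Generate the 2^k move directions for 2^k-connected grid
    planning. k=2 gives the 4 cardinal moves, k=3 the 8-connected
    moves, and each further increment doubles the set by inserting
    the vector sum of every pair of adjacent directions."""
    assert k >= 2
    dirs = [(0, 1), (1, 0), (0, -1), (-1, 0)]  # k = 2: N, E, S, W
    for _ in range(k - 2):
        refined = []
        for i, d in enumerate(dirs):
            nxt = dirs[(i + 1) % len(dirs)]
            refined.append(d)
            refined.append((d[0] + nxt[0], d[1] + nxt[1]))  # in-between move
        dirs = refined
    return dirs
```

For k=4 this produces 16 directions, including knight-like jumps such as (1, 2).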

Important to note is that the 2^k paths are a subset of the any-angle paths. Bounds compared with any-angle path finding were known for k=2 and k=3. The paper shows the relation to any-angle paths for all values of k.

An interesting observation is that if placements are in the corners of the cells, increasing k yields improving bounds. For placements in the center of cells, the upper bound is fixed regardless of the value of k.

A Refined Understanding of Cost-optimal Planning with Polytree Causal Graphs

Polytree causal graphs are DAGs whose underlying undirected graphs are trees. The paper shows some new results on when planning for polytrees is tractable, namely when the depth and the domain size are bounded.

One interesting observation on the way to the result is that when searching for a cost-optimal plan, it is sufficient to find the shortest such plan. Ordering the nodes using a depth-ordering, the graph can be partitioned into k+1 parts by taking a node and removing its k outgoing edges (the polytree requirement prohibits diamond shapes). This partition has some nice properties with regard to finding the shortest plans.

Clearly a result and topic where one must read the paper.

On the Complexity of Quantum Circuit Compilation

The main result is that Quantum circuit (QC) compilation (QCC) is NP-Complete.

Compared with classical circuits, quantum hardware has many more ad-hoc constraints on layout; for example, there are spatial, temporal, and crosstalk constraints. Compilation has to add swaps to move qubits into the right positions to satisfy these constraints. Another problem is that the quantum effects only last for a bounded time, after which the state falls apart. This sets a limit on the maximum length of a computation, giving an upper bound on the depth of the resulting circuit.
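
To illustrate the swap-insertion part, here is a naive greedy routine of my own (not the paper's method) on a linear coupling graph: before each 2-qubit gate, one of its qubits is swapped towards the other until they sit on adjacent slots. Real compilers optimise the resulting depth much harder.

```python
def route_on_line(gates, n_qubits):
    """Naively route 2-qubit gates on a line of physical qubit slots.

    gates: list of (logical_a, logical_b) pairs, applied in order.
    Returns the emitted SWAP/GATE operations on physical slots."""
    slot = list(range(n_qubits))  # slot[q] = physical slot of logical q
    ops = []
    for a, b in gates:
        while abs(slot[a] - slot[b]) > 1:
            # move qubit a one step towards b, swapping with the occupant
            step = 1 if slot[b] > slot[a] else -1
            target = slot[a] + step
            other = slot.index(target)  # logical qubit sitting at `target`
            ops.append(('SWAP', slot[a], target))
            slot[a], slot[other] = slot[other], slot[a]
        ops.append(('GATE', slot[a], slot[b]))
    return ops

# One gate between qubits 0 and 2 on a 3-qubit line needs one SWAP first.
ops = route_on_line([(0, 2)], 3)
```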

Optimality for the QCC problem means that the depth of the compiled circuit is minimal.

As a caveat to the result, current hardware has planar graph topologies, while the result uses non-planar graphs. It is an open question whether compilation for planar graphs is also NP-complete.


Benchmarks for Pathfinding in 3D Voxel Space

Benchmarks for path finding in 3D voxel space are extracted from the game Warframe. Warframe is currently played by more than 16M players, even though it is more than 5 years old.

The maps are procedurally generated. The voxels are about half the size of the player. Paths generally need to be found in less than 1 ms, though sometimes 100 ms is acceptable. The typical player's computer is a quad-core machine with 8 GiB of memory.

Normally the filled voxels are stored in a sparse octree. The nodes are ordered along a so-called Morton / z-order curve, which helps with caching. Planning typically works at different levels in the tree, so the effective voxels are not uniform in size.
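
The Morton / z-order code of a 3D coordinate is obtained by interleaving the bits of its components, so that voxels that are close in space tend to be close in memory. A straightforward (non-optimised) sketch:

```python
def morton3(x, y, z, bits=10):
    """3D Morton / z-order code: interleave the low `bits` bits of
    x, y and z. Bit i of x lands at position 3*i, of y at 3*i + 1,
    and of z at 3*i + 2."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i + 2)
    return code
```

Production implementations replace the loop with a handful of shift-and-mask "magic number" operations, but the mapping is the same.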

Paths are typically computed player to player, and players are expected to move around.

There are many possible problem definitions:

  • Any angle
  • Hierarchical
  • Uniform voxel grid
  • Visibility graph

For simplicity a uniform voxel grid with 26-connected graph is used. Diagonals are ok if cardinal directions are free.
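The movement rule can be sketched as follows (my own reading of "diagonals are ok if cardinal directions are free"; the voxel representation is an illustrative set of blocked coordinates, not the benchmark's format):

```python
from itertools import product

def neighbours(voxel, blocked):
    """26-connected neighbours of a voxel: a (possibly diagonal) move
    is allowed only if the target is free and every cardinal step the
    move is composed of is free as well."""
    x, y, z = voxel
    result = []
    for dx, dy, dz in product((-1, 0, 1), repeat=3):
        if (dx, dy, dz) == (0, 0, 0):
            continue  # staying put is not a move
        target = (x + dx, y + dy, z + dz)
        if target in blocked:
            continue
        cardinal_steps = [(x + dx, y, z), (x, y + dy, z), (x, y, z + dz)]
        # only the axes the move actually uses need to be checked
        if any(step in blocked
               for step, d in zip(cardinal_steps, (dx, dy, dz)) if d):
            continue
        result.append(target)
    return result

# Blocking one cardinal neighbour also forbids the diagonals through it.
free = neighbours((0, 0, 0), blocked={(1, 0, 0)})
```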

To get interesting test problems, instances with no solution or where the solution is trivial (i.e., no obstacles) are discarded.

There are many interesting questions on how to use the data, for example creating sequences of instances with moving opponents.