IJCAI 2018, third day

IJCAI 2018 is being held in Stockholm from July the 13th to July the 19th as a part of the Federated AI Meeting. In this and other posts I will give some short comments on interesting things I’ve heard. If anything is wrong or misrepresented, please don’t hesitate to get in touch with me.

On the sitxh day the main IJCAI conference continued. The conference has many parallel tracks, comments here are from sessions that I thought might be interesting for me.

The third day of IJCAI includes an incited talk by Max Tegmark that I very much looked forward to.

Knowledge Representation, Constraints and Satisfiability

Exploiting Justifications for Lazy Grounding of Answer Set Programs

Answer set programming (ASP) is a knowledge representation system somewhat akin to logic programming. The models of the program are the consequences of the rules, and does not include “negation as failure”. Solving uses grounding of facts and then turning it into for example SAT systems.

A problem is that grounding can explode the size of grounding. Here the idea is to interleave the phases of grounding and solving. To avoid some exponential problems, justification analysis is used.

Pseudo-Boolean Constraints from a Knowledge Representation Perspective

Pseudo-Boolean constraints (PBC) are linear sums of Boolean variables. Two specific cases are normalized PBC which use only non-negative weights and CARD where the comparison operator is defined to be only >. Two other formalisms to note are CNF (conjunctive normal form) and OBDD (binary decision diagrams.).

This paper uses PBC in the context of knowledge representation. PBC are more succinct than CNF but less than OBDD. The paper goes through a lot of knowledge transformation operations and investigates if they are possible to do.

Novel Algorithms for Abstract Dialectical Frameworks based on Complexity Analysis of Subclasses and SAT Solving

The context of this paper is argumentation frameworks (AF) which is a graph-based representation where arguments are nodes and edges are attacks. Abstract dialectical frameworks (ADF) add propositional formulas to the nodes adding acceptance conditions. On these models semantics are added and reasoned with.

The complexity of reasoning tasks on ADFs is one level higher than for AFs in the polynomial hierarchy.

The paper defines certain ADF subclasses that are easier, and implements reasoning on ADFs using SAT solving.

The subclasses used are

Bipolar
Acyclic
Concise

and variants using a k that allows some deviation. For example it was known that bipolar drops the complexity class, and k-bipolar still allows this. On the other hand, k-acyclic does not have the benefit that acyclic has.

Stratified Negation in Limit Datalog Programs

Datalog is a quite interesting formalisms, since it has fairly efficient solvers. My personal interest is in the use in the Rust compiler for lifetime analysis.

The use here is for something called declarative analytics, which requires extending base datalog. Previously the authors extended datalog with arithmetic integers with limit predicates that keep only min or max values. The rules have only one numeric argument, except for the min/max limiters.

This paper extends limit datalog with negation.

Classification Transfer for Qualitative Reasoning Problems

Classification is to, given a set or relations X, find the complexity for all subsets of CSP(Y). The context of this and the contents of the paper are very technical, so I can’t summarize it more here.

One interesting thing to note is that analysis is done on rectangle algebras, which might be interesting to look more into.

Simpler and Faster Algorithm for Checking the Dynamic Consistency of Conditional Simple Temporal Networks

I’ve been interested in Simple Temporal Networks (STN) ever since I heard about it as a part of the domain store in IBM/ILOG CP Optimizer. An STN uses real-valued time-point variables and temporal constraints on these.

The context here is in scheduling in hospitals, where some planning is dependent on the outcome of certain nodes. This leads to Conditional STNs (CSTN). These have nodes that are so-called observation time points that have associated propositional names. While going through the network, the scenario (combination of propositional name values). Edges in the STN can use formulas on the propositional names.

The paper presents new algorithms for making consistency checks in a CSTN. Dynamic consistency for a CSTN means that there exists a strategy for executing the time-points so that all the relevant constraints (those involved in the execution) are satisfied.

The algorithm is simpler and faster than previous algorithm, and also uses a better semantics.

Nicole Immorlica gave an invited talk on her research are in the interaction of computer science and economics. In particular, she is interested in markets and how they are designed. In particular, she is interested in markets where the objective is to maximize the benefit to society.

Money is often useful to use in designing such markets. A simple example of money based market is a simple auction, where th point is to allocate something to the person who values it the most.

However, how to design markets without using money? Some examples

Traffic congestion control by who values their time
Online markets where virtual rewards drive online content
The market for presidents where everyone is allocated a vote and are allowed to “pay” with this vote
Market for schools with raffle tickets, where students have to buy in to risk

For road traffic, the natural question is how well congestion avoidance routes traffic. Looking at simple examples it is possible to see that routing more intelligently could lower the average but this is not an equilibrium (agents could increase their own benefit by not following the routing). A worst-case result is a network with two roads. one has a 60 minute delay, and the other has a delay that is proportional to the number of drivers on that road. The equilibrium will be when enough drivers drive on the dynamic road so that the delay reaches 60 minutes, while the optimum is to make some people use the fixed delay road instead. One nice result is that at worst, congestion equilibrium routing is at most 33% worse than the optimimum.

Online content uses votes, views, and likes to reward online content. Furthermore, sites uses badges for accomplishments and leader boards. In total, this is about fame. To show that this works, extracting data from Stack Overflow it is possible to show that users value the action the badge rewards since they perform more of the action in the lead-up to getting the badge. The example used was the electoral badge, which is given to users voting on enough questions.

Reward mechanisms that maximize contributions are similar to all-pay auctions. This can be used to show that for low-fame-value settings in an idealized setting leader boards with cut-offs are optimal, but badges can be better. If the fame is very important, then full leader boards are very good.

Are our current elections giving us the leaders we want? At this question, she showed us a caricature of Trump by Edel Rodriguez. In Majority rule, the electorate has one vote per person. But this sometimes gives the wrong results, since people may have different value on the preference (some have strong opinions, some don’t really care). One possibility is to use quadratic voting, which in abstract one can pay for v votes with v^2 dollars. For two-party systems, quadratic voting maximizes value to society. The reasoning from the formula

value x Pr[outcome] - cost

can be analyses to show that each rational person would vote with their value. To solve the problem of money it can be possible to give voters sets of points to vote on issues with.

For school allocation using raffle tickets, the use of raffle tickets makes people pay for their valuation using risk of losing some places to gain a place at another place. The setting is a set of students with school preferences and a set of schools with capacities and priorities (such as closeness, sibling preferences). The priorities are skipped here, and schools are duplicated to account for capacity. This leads to a matching problem. Properties needed are fairness, simplicity (it should be explainabler to a fifth grader), and value.

The latter is the important problem here, since it is hard to know the value of different peoples valuation of their preferences. Using stable-marriage style allocation (deffered acceptance) with preference orders does not give as good results as possible. The mechanism used is a lottery. A single ticket raffle is when each student gets one ticket, and can put it into the bucket of it’s choice. Each school draws a ticket, and gives the place to the student drawn. This is then iterated until all allocations are done. This gives more value in aggregate.

Further markets that are interesting are public housing, online dating, social networks, personal data, … In all of these it is important to

Constraints and Satisfiability

A Fast Algorithm for Generalized Arc Consistency of the Alldifferent Constraint

This paper has a new algorithm that prunes many values that the full domain consistent algorithm would remove but does so faster. The base is the same as Régin’s algorithm, but identifies edges directly during the maximal matching algorithm instead of running Strongly Connected Components on the residual graph. The worst case complexity is the same as Régin’s algorithm, but in practice it is much faster.

This could either be used as an alternative or in a staged propagation setting. Hue algorithm is implemented in Minion and compared with the built-in version.

Compact-MDD: Efficiently Filtering (s)MDD Constraints with Reversible Sparse Bit-sets

This is a paper that is close to my interests, since I worked on extensional constraints and MDD representations in my licentiate.

The new algorithm adapts the compact table bitwise operations from the table extensional constraints from CP2016 to propagate an MDD constraints.

In addition, the paper has a really elegant and cool construction of MDDs that constructs a non-deterministic MDD! It does so by splitting the MDD in two, compacting them both, and then connecting them using a non-deterministic layer. I searched a lot for ways to construct non-deterministic MDDs and minimizing NFAs in 2008, but never managed to find anything, so this result makes me happy. The benefit is that the MDDs have much fewer nodes in general, but sometimes a bit more edges.

Since the graph of nodes in the MDD is Berge-acyclic, it can efficiently be propagated in a top-down and then a bottom-up phase.

Core-Guided Minimal Correction Set and Core Enumeration

The context for this paper is finding minimal unsatisfiable sets (MUS, clauses that are unsatisfiable) and minimal correction sets (MCS, formulas to be changed to make clauses satisfiable). A system to find these sets is described.

The algorithm enumerates all MCSs in increasing order of cardinality. The algorithm enumerates correction sets using MUSs and turning them into correction sets. These are not guaranteed to be minimal, but since MCSs are generated in cardinality order, it is simple to see if the set already exists.

A Reactive Strategy for High-Level Consistency During Search

This paper is in the context of abstract CSP research, and how to choose between different levels of consistency such as GAC, SAC, SGAC, and so on.

As a CP system developer, this framework is not really that interesting to me. Consistency is not the question of choice for me, since I view constraints through the lens of propagators as weakly monotonic functions on the domain lattice. However, the base question on how to apply different levels of effort in propagation is interesting. Also, “singleton (generalized) arc consistency”/shaving can be used also in propagation based solvers.

The main idea is to keep information on the statistics of the backtrack depth. Stronger levels of consistency is applied when the number of backtracks reaches some threshold.

The results are fairly impressive, and i would be really interested in if it is possible to use the idea to for example change the level of propagation dynamically in a standard constraint programming solver such as Gecode.

Reduced Cost Fixing for Maximum Satisfiability

This paper won a best paper award at CP2017. The context is in solving maximum satisfiability problem. The solution choice is about combining SAT with a Integer Programming solver and using reduced costs from the IP to solve the MaxSAT.

Dynamic Dependency Awareness for QBF

Proof complexity can give lower bounds for the size of proof of unsatisfiability. Traces from SAT solvers can be viewed as proofs, so proof complexity gives a lower bound for the solvers.

Here the question is about QBF, where quantification is done over the CNF formula. Sometimes it is possible to re-order quantifiers based on their dependencies on other variables. Normally, this is done statically by computing the dependency information and re-writing the formula. Here dynamic re-writing is done inside the solver.

The paper shows that dynamic handling of dependency information can be exponentially better.

Intelligible Intelligence & Beneficial Intelligence

It is not enough to make our technology powerful, but we also need to steer it and to direct it at a destination.

Max Tegmark wants to define intelligence in a way that is not carbon based life form discriminating. Self driving car were a dream, now we have self driving rockets. Recently we could not recognize a face, now we can generate faces. Recently we could not solve Go, but now a computer can teach itself to play Go crushing not just human players but also human researchers writing .

The question is, how far can we go? For each task, Given a landscape of tasks and a rising sea level of AI capability, how far will it be able to take over. It is important to not be on the coastline where AI will take over soon

In particular, developments in AI could lead to further improvements in AI.

So, are we going to get a super intelligence or at least an artificail general intelligence (AGI)? We don’t know, but we need to discuss the possibility. Surveys of researchers say that we may get AGI in a few decades.

If we build technology that makes humans unnecessary, what could possibly go wrong?

So, how do we steer AI towards something that will benefit us.

Fire, things went wrong we make fire extinguishers
Cars invented, then crash, we make seat belts, better brakes, …
Nuclear power, synthetic biology, AGI, we need to get it right first!

Some principles.

We need to invest in AI safety research. The problem is not that an AI would turn evil or the problem of consciousness. The real problem is if the goal of the AI is not aligned with ours.

The only way to trust an AGI is to understand it. This leads to the problem of intelligible intelligence.

Good Old Fashioned AI (GOFAI) we could understand, we wrote the algorithms.
Machine Learning (ML) is problematic since it is hard to understand and many don’t think we ever will
Max Tegmark believes that a combination may emerge that can benefit from both parts. Could we use ML to find the right GOFAI algorithms?

Max Tegmark has been doing research both using ML in physics and using physics ideas in ML. This has resulted in papers that make strong connections between physics concepts and deep neural network concepts, and draws conclusions and gives guidance based on this.

There appears to be a connection between the set of problems neural networks can solve (they can solve very few of all problems due to the No Free Lunch theorem), but fortunately it seems to be the case that the problems they can solve are the ones that we actually care about.

As an example, taking the DNN playing Atari Breakout and extracting the functions. Then analyzing those functions some nice equations could be extracted that actually have interpretable meaning!

Extending this, simple worlds with more and more complex physical behaviors. Then training a DNN to predict behavior in these. However, accuracy is not very great, especially they were not very good. Also, there was no good explanation that could be extracted. Instead using methods to find optimal explanations, and using heuristic approaches for these was quite a bit better. An important thing here was to not use Mean Square Errors, but instead Minimum Description Length.

Another point that is very important is that AI-generated wealth must make everyone better off, not just some.

Also, it is important to ban lethal autonomous weapon systems (LAWS). An important point is that LAWS would be used in completely new ways. Comparing spears with cruise missiles, cruise missiles are used in completely different ways? It is important to ban ways to harming people and instead make sure to help people. Comparing fields

Biology: bioweapons vs new cures
Chemistry: chemical weapons vs new materials
Robotics AI: slaughterbots vs robotic workers

Many have pledged to not personally be a part of building LAWS systems. Many countries have pledged including Chine but surprisngly not Sweden yet.

Let’s build AI that empowers us, not AI that overpowers us!

Sister Conference Best Papers: Knowledge Representation and Reasoning

The Intricacies of Three-Valued Extensional Semantics for Higher-Order Logic Programs

This paper tackles the problem of specifying the semantics of a higher order logic programming language with negation, and won best paper at ICLP 2017. Unfortunately this is not possible with some definitions of negation, but possible with some other definitions (for example, with stratified negation).

As an example, consider the

p(R) :- R.

q(R) :- ~w(R).
w(R) :- ~R.

Both of the predicates p and q should have the same extensional semantics. For 3-valued sematics, it is unlikely that it exxists.

Further development was published at this years ICLP.

Attributed Description Logics: Reasoning on Knowledge Graphs

Wikidata is an open knowledge graph that from teh Wikimedia Foundation. It can, similar to Wikipedia, be edited by anyone.

The graph has facts with attributes. For example, edges between people and awards they have gotten. Sometimes with attributes on when. Sometimes with attributes on why.

Description logics are used to write formulas over knowledge graphs. Attributed description logics extends this to attributed knowledg graphs.

Planning and Scheduling

Variable-Delay Controllability

Simple Temporal Networks (STNs) can be extended to also handle uncertainty. This paper handles cases with variable-delay by transforming it into a STNU with fixed-delay.

Hierarchical Expertise Level Modeling for User Specific Contrastive Explanations

Classical planning is about creating a plan that can allow an agent to achieve a goal. Human-aware planning needs to make plans that take the humans into account. In particular, this work tries to create explanations that explain to the human why the agent is doing as it does. A contrastive explanation can be used to contrast the current plan against an alternative plan. These explanations should also be tailored towards the currents users expertise level.

The human is modeled as an abstract version of the agents model of the plan. These abstractions can be described in a lattice. Given a question, the goal is to find a small explanation in the abstraction lattice to show a contrastive explanation for the question. Finding the minimal explanation is too expensive, so it is done using heuristic search.

Novel Structural Parameters for Acyclic Planning Using Tree Embeddings

Propositional planning is PSPACE-complete. Identifying tractable fragments is interesting both theoretically and practically (solving easier sub-problems in planning systems).

From this years SoCS it is known that bounded polytrees are tractable. Here more general graphs are analyzed using treedepth. Treedepth is defined for a graph if it is possible to embed the graph into a tree where all edges go up-down (possibly skipping levels), but not sideways. The paper has versions of this for up and down directions of arcs. Tractability is then analyzed for the cases based on up and down depth.

Completeness-Preserving Dominance Techniques for Satisficing Planning

In classical planning for optimizing a goal, it is possible to add dominance pruning by checking if a state is “closer” to the goal than another state.

This paper defines a version of dominance that can also work for planning where satisfaction of the goal is the only result needed.

Emergency Response Optimization using Online Hybrid Planning

Emergency response planning (ERP) is exactly what it says, planning emergency responses by responders given emergency scenes. Typically the response time is optimized for. Here the question is how to do ERP in an online fashion.

The paper defines the problem using the RDDL language, proposes an algorithm that translates the problem to a mixed integer linear program that is solved using Gurobi, and tests it using real-world data. An interesting feature was doing stress tests by doubling the amount of emergencies in the data.