“Correlated Equilibria in Satisfaction Games”, Prof. Langford B. White, University of Adelaide

Date
12/10/2018

Time
14:00

Location
Institute of Telecommunications
E389 Seminarraum 1 (CG0118)

Loading Map....

Correlated Equilibria in Satisfaction Games

In this presentation I will introduce the idea of correlated equilibria in satisfaction games. A satisfaction game is a multi agent system where agents (players) do not compete to maximise some utility (payoff) function, but rather, they try to satisfy set theoretic constraints imposed on their own actions by those of the other agents. When there is a notion of utility such as a resource, then satisfaction can be defined in terms of being able to meet some minimum desired level of the resource rather than maximising it. A satisfaction equilibrium (SE) is a stable state whereby all agents are satisfied (Ross et al, 2006 ; Perlaza et al 2012). The notion of SE can be generalised (GSE) to the case where some subset of agents are satisfied and the remainder are not, but cannot become so by any choice of actions (Goodewardena et al 2017).

Correlated equilibrium (CE) is a generalisation of Nash’s strategic equilibrium (NE) due to Aumann (1997). The information structure in CE is different from NE. A CE is a mixed strategy (probability distribution over all players’ action) which is known to all players. An action profile is drawn from the distribution and each player’s action is private information – a player does not know the others’ actions, just that they will maximise their (conditional) expected reward given their recommended action. Aumann argues that CE is a “natural expression of Bayesian rationality”. CE results when no player can gain by unilateral deviation from their given action. The set of CE is a convex polytope which contains all (pure and mixed) NE.

I have applied to concept of CE to satisfaction games. A new definition of GSE is needed because the information structure is different that that in the current approaches. I have proven that with a binary utility (satisfied or not) that the set of CE correspond to mixed action profiles by which every player maximises its probability of satisfaction. This is a true expression of Bayesian rationality in satisfaction games.

Regret Matching (RM) (Hart and Mas-Colell, 2000) is an iterative algorithm which “approaches” the set of CE almost surely. I have compared RM to the existing algorithms for SE/GSE. For a resource allocation game, RM achieves higher levels of satisfaction, better utility and more efficient resource allocation. An adaptive version based on “reinforcement learning” (Hart and Mas-Colell, 2001) is also under current study.

Prof. Langford B. WhiteUniversity of Adelaide