cours:jeuxsto

title:: Algorithms for Stochastic Games
Algorithmes pour les jeux stochastiques
manager:: Stéphane Le Roux
ects:: 3
period:: 2
periodpref:: 2 (must)
format:: 8 x 3h over 1 period
hours:: 24
weeks:: 8
hours-per-week:: 3
lang:
themes:: Automata/Games, Verification, Algorithms
year:: 2025, 2026

[jeuxsto]

Algorithms for Stochastic Games
Algorithmes pour les jeux stochastiques

Language:

Period:

2.

Duration:

24h (3h/week).

ECTS:

3.

Themes: Automata/Games, Verification, Algorithms

Manager:

Stéphane Le Roux.

Lecturers for 2025-2026

Stéphane Le Roux, first term
Xavier Allamigeon, second term

Objective of the course

To survey stochastic games (the notion of value, complexity classes, strategy implementation, etc.) and to cover some recent advances with an algorithmic flavor.

Contents (2026-2027)

First term (12 hours)

Turn-based finite games

Zero-sum two-player games, mixed strategies, optimal strategies
Multi-player games, Nash equilibrium, subgame perfect equilibrium
Infinite turn-based games on well-founded trees
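
For the finite zero-sum turn-based case, the value can be computed by backward induction on the game tree. The following is only a minimal sketch: the tree encoding (a leaf is a number, an internal node is a pair of an owner label and a list of children) and the name `backward_induction` are assumptions made for illustration, not notation from the course.

```python
def backward_induction(node):
    """Value of a finite zero-sum perfect-information game tree.

    Hypothetical encoding: a leaf is a number (payoff to the maximizer);
    an internal node is a pair (owner, children) with owner "max" or "min".
    """
    if isinstance(node, (int, float)):
        return node  # leaf: the payoff itself
    owner, children = node
    values = [backward_induction(child) for child in children]
    # The owner of the node picks the child that is best for them.
    return max(values) if owner == "max" else min(values)


# The maximizer moves first, then the minimizer picks a leaf.
tree = ("max", [("min", [3, 5]), ("min", [2, 9])])
value = backward_induction(tree)  # max(min(3, 5), min(2, 9)) = 3
```

Recording the maximizing or minimizing child at every node, including nodes off the optimal path, would give a subgame perfect pair of strategies.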

Finite games in normal form

Zero-sum two-player games and optimal strategies
The von Neumann minimax theorem
Multi-player games, Nash equilibrium
Nash's theorem
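
As a small illustration of the minimax theorem, a 2x2 zero-sum game can be solved in closed form: either there is a saddle point in pure strategies, or both players mix so as to make the opponent indifferent. The sketch below assumes the row player maximizes; the function name `solve_2x2` and the payoff matrices are hypothetical examples.

```python
from fractions import Fraction


def solve_2x2(A):
    """Return (value, p, q) for a 2x2 zero-sum game; the row player maximizes.

    p and q are the probabilities that the row and the column player
    put on their first action in an optimal strategy.
    """
    (a, b), (c, d) = [[Fraction(x) for x in row] for row in A]
    maximin = max(min(a, b), min(c, d))  # what the row player secures with a pure action
    minimax = min(max(a, c), max(b, d))  # what the column player secures with a pure action
    if maximin == minimax:
        # Saddle point: pure strategies are already optimal.
        p = Fraction(1) if min(a, b) == maximin else Fraction(0)
        q = Fraction(1) if max(a, c) == minimax else Fraction(0)
        return maximin, p, q
    # No saddle point: each player mixes to make the opponent indifferent.
    denom = a + d - b - c
    p = (d - c) / denom
    q = (d - b) / denom
    return (a * d - b * c) / denom, p, q


# Matching pennies: value 0, both players mix uniformly.
value, p, q = solve_2x2([[1, -1], [-1, 1]])
```

For larger matrices the closed form no longer applies and the value is usually obtained by linear programming.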

Borel determinacy

Some properties of infinite trees
The notion of Borel set
Turn-based games on infinite trees
Representation of strategies
Statement of Borel determinacy
Part of the proof

The determinacy of Blackwell games

The notion of Borel-measurable map
Blackwell games
Statement of Blackwell determinacy
Part of the proof

Games on graphs

Turn-based games
Concurrent games

Second term (12 hours)

Stochastic games: basic elements

The one-player case: Markov decision processes.
Problems in finite horizon, with discount, with a stopping time, and with mean payoff.
Bellman’s dynamic programming equation. Positional strategies versus history-dependent strategies.
The zero-sum two-player case: Shapley’s extension of the Bellman equation.
Miscellaneous examples, some of them unexpected: risk-sensitive problems, how "log-glasses" turn positive linear dynamics (population dynamics) into games, matrix scaling problems, nonnegative tensors.
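
The Bellman and Shapley equations above can be sketched on a toy discounted turn-based game, where each state is owned by a maximizer or a minimizer. The encoding of the game, the names `shapley_operator` and `value_iteration`, the unnormalized discounted payoff, and the example data are all assumptions chosen for illustration. A Markov decision process is the special case where every state is owned by "max".

```python
def shapley_operator(game, v, gamma):
    """One application of the discounted Shapley operator to the vector v.

    Hypothetical encoding:
    game[state] = (owner, {action: (reward, {next_state: probability})}).
    """
    out = {}
    for s, (owner, actions) in game.items():
        vals = [r + gamma * sum(p * v[t] for t, p in nxt.items())
                for r, nxt in actions.values()]
        out[s] = max(vals) if owner == "max" else min(vals)
    return out


def value_iteration(game, gamma, tol=1e-10, max_iter=100_000):
    """Iterate the operator from 0 until successive iterates differ by less than tol."""
    v = {s: 0.0 for s in game}
    for _ in range(max_iter):
        w = shapley_operator(game, v, gamma)
        if max(abs(w[s] - v[s]) for s in game) < tol:
            return w
        v = w
    return v


# Toy game: "a" belongs to the maximizer, "b" to the minimizer, "c" is absorbing.
game = {
    "a": ("max", {"stop": (3.0, {"c": 1.0}),
                  "play": (0.0, {"b": 0.5, "c": 0.5})}),
    "b": ("min", {"pay": (4.0, {"c": 1.0}),
                  "back": (1.0, {"a": 1.0})}),
    "c": ("max", {"loop": (0.0, {"c": 1.0})}),
}
v = value_iteration(game, gamma=0.5)
# Fixed point by hand: v(c) = 0, v(a) = 3 (stop), v(b) = 1 + 0.5 * 3 = 2.5 (back).
```

Since the operator is a contraction of factor gamma in the sup-norm, the iteration converges geometrically from any starting vector.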

The mean-payoff problem

The operator approach. The ergodic equation (additive eigenproblem). When it is solvable, the value of the mean-payoff game exists and coincides with the limit value of finite-horizon games and of discounted games.
Ergodicity notions for Markov decision processes. Bather’s theorem on communicating Markov decision processes.
Extensions of Bather’s theorem to the two-player case. Ergodicity notions for stochastic games. The role of dominions.
Non-ergodic turn-based games. Kohlberg’s theorem on the existence of invariant half-lines.
Blackwell optimal policies (optimal for all discount rates close enough to zero) exist for turn-based games.
Non-ergodic concurrent games. Existence of the uniform value. Approach by Bewley, Kohlberg, Mertens and Neyman, using the semialgebraic character of the discounted value. Generalization to games definable in o-minimal structures.
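
As a sketch of why solvability of the ergodic equation yields the mean-payoff value, here is the argument in the one-player (Markov decision process) case. The notation is assumed for illustration: rewards $r_i^a$, transition probabilities $P_{ij}^a$, a vector $u$ and a scalar $\lambda$. In the two-player turn-based case the same reasoning applies with Shapley's operator, which is also monotone and commutes with the addition of constants.

```latex
% Dynamic programming operator of an MDP with states i, j and actions a
[T(v)]_i = \max_{a} \Big( r_i^a + \sum_{j} P_{ij}^a \, v_j \Big)

% Ergodic equation (additive eigenproblem): find \lambda \in \mathbb{R} and u \in \mathbb{R}^n with
T(u) = \lambda \mathbf{1} + u

% T commutes with the addition of constants, so by induction on k
T^k(u) = k \lambda \mathbf{1} + u

% T is nonexpansive in the sup-norm, hence
\| T^k(0) - T^k(u) \|_\infty \le \| u \|_\infty
\quad\Longrightarrow\quad
\lim_{k \to \infty} \frac{T^k(0)}{k} = \lambda \mathbf{1}
```

Here $T^k(0)$ is the value vector of the $k$-stage problem with zero terminal payoff, so the mean payoff per stage is $\lambda$ whatever the initial state.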

Algorithms for mean-payoff games and games with a small discount rate

Value iteration and relative value iteration. Contraction properties in various norms and seminorms. Dobrushin ergodicity coefficient and Birkhoff’s contraction theorem. Example of the stochastic shortest path problem (contraction theorem of Bertsekas and Tsitsiklis). Parameterized complexity bounds for ergodic games.
The theorem of Ye, Hansen, Miltersen and Zwick: policy iteration with a fixed discount factor is strongly polynomial for zero-sum two-player turn-based games.
Reduction of the mean-payoff problem to the discounted problem with small discount rate. Bounds on the Blackwell threshold.
The complexity of concurrent games.
Mean-payoff games and tropical geometry. Embedding mean-payoff games in nonarchimedean linear programs. Link between the complexity of mean-payoff games and the complexity of linear programming.
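
Relative value iteration can be sketched as follows, assuming the ergodic equation T(u) = lambda + u is solvable and the iteration converges (it may fail to, for instance on periodic chains). The game encoding, the function names, the stopping rule based on the span of T(v) - v, and the example are assumptions made for illustration.

```python
def bellman(game, v):
    """One application of the undiscounted Shapley operator T.

    Hypothetical encoding:
    game[state] = (owner, {action: (reward, {next_state: probability})}).
    """
    out = {}
    for s, (owner, actions) in game.items():
        vals = [r + sum(p * v[t] for t, p in nxt.items())
                for r, nxt in actions.values()]
        out[s] = max(vals) if owner == "max" else min(vals)
    return out


def relative_value_iteration(game, ref, tol=1e-9, max_iter=10_000):
    """Return (mean payoff estimate, bias vector normalized to 0 at state ref)."""
    v = {s: 0.0 for s in game}
    for _ in range(max_iter):
        tv = bellman(game, v)
        diff = [tv[s] - v[s] for s in game]
        lo, hi = min(diff), max(diff)  # when the ergodic equation is solvable, lo <= lambda <= hi
        v = {s: tv[s] - tv[ref] for s in game}  # renormalize so that v[ref] = 0
        if hi - lo < tol:  # span seminorm of T(v) - v is small
            return (lo + hi) / 2, v
    raise RuntimeError("no convergence within max_iter")


# One-player example (all states owned by "max") with mean payoff 8/3.
mdp = {
    0: ("max", {"A": (1.0, {0: 0.5, 1: 0.5}),
                "B": (0.0, {1: 1.0})}),
    1: ("max", {"go": (4.0, {0: 0.5, 1: 0.5})}),
}
lam, u = relative_value_iteration(mdp, ref=0)
# By hand: action B is optimal in state 0, lambda = 8/3 and u = (0, 8/3).
```

Subtracting the value at a reference state keeps the iterates bounded, whereas plain value iteration would grow linearly at rate lambda.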

Schedule

Full lecturing team

Stéphane Le Roux
Xavier Allamigeon
Nathanaël Fijalkow
Stéphane Gaubert