Verified Lean algorithms for solving tabular MDPs and proving their properties. The focus of this project is on two main goals:
-
Basic algorithms that can solve robust and risk-averse MDPs of moderate size.
-
Proofs of correctness of algorithms and fundamental MDP properties which can be used independently to prove structural results, such as the optimality of certain policy class.
- Definitions: probability space and definition
- Definitions: probability, expectation, conditional properties
- Independent of random variables
- Tower property, law of the unconscious statistician
- Quantile definition and basic properties
- Quantile under monotone transformation
- Definition (non-constructive)
- Practical implementation O(n^2) and correctness
- Fast practical implementation O(n log n) and correctness
- VaR is positively homogeneous and monotone
- VaR is translation (cash) invariant
- VaR under monotone transformation
- Definition of MDP
- Definition of policies (history, Markov, stationary)
- Definition of value function (history-dependent)
- Histories and manipulation
- Probability space over histories
- Return and optimal return using histories
- History-dependent value function and dynamic program
- Markov optimal value function and optimal policy
- DP algorithms
- History-dependent utility functions
- Augmented value function dynamic program
- VaR computation from utility function
- VaR DP decomposition as in Hau et al., 2023
- Overview of tactics: https://github.com/madvorak/lean4-tactics
- Comprehensive list of tactics: https://seasawher.github.io/mathlib4-help/tactics/
- Loogle: https://loogle.lean-lang.org/
- Moogle: https://www.moogle.ai/
- Blueprint: https://github.com/PatrickMassot/leanblueprint
- Lean packages and extensions: https://reservoir.lean-lang.org/
- Notations: https://github.com/leanprover-community/lean4-mode/blob/master/data/abbreviations.json
- Resource for Probability: https://korivernon.com/documents/MathematicalStatisticsandDataAnalysis3ed.pdf