• A Markov decision process (MDP), also called a stochastic dynamic program or stochastic control problem, is a model for sequential decision making when...
    35 KB (5,156 words) - 11:15, 25 May 2025
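How such a model is solved can be sketched with value iteration on a made-up two-state, two-action MDP; the transition probabilities, rewards, and discount factor below are invented for illustration, not taken from the article:

```python
# Value iteration for a hypothetical 2-state, 2-action MDP.  All numbers
# (transitions P[s][a][s2], rewards R[s][a], discount gamma) are illustrative.
P = [[[0.9, 0.1], [0.2, 0.8]],
     [[0.5, 0.5], [0.1, 0.9]]]
R = [[1.0, 0.0],
     [0.0, 2.0]]
gamma = 0.9
states, actions = range(2), range(2)

V = [0.0, 0.0]
for _ in range(500):
    # Bellman optimality backup: V(s) = max_a [ R(s,a) + gamma * E[V(s')] ]
    V = [max(R[s][a] + gamma * sum(P[s][a][s2] * V[s2] for s2 in states)
             for a in actions)
         for s in states]

# Greedy policy with respect to the (now essentially converged) values.
policy = [max(actions,
              key=lambda a: R[s][a] + gamma * sum(P[s][a][s2] * V[s2]
                                                  for s2 in states))
          for s in states]
```

Because gamma < 1, each sweep contracts the error, so the loop converges to the optimal value function and the extracted greedy policy is optimal for this toy model.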
  • A partially observable Markov decision process (POMDP) is a generalization of a Markov decision process (MDP). A POMDP models an agent's decision process in which it...
    22 KB (3,306 words) - 13:42, 23 April 2025
  • Markov chain
    In probability theory and statistics, a Markov chain or Markov process is a stochastic process describing a sequence of possible events in which the probability...
    96 KB (12,900 words) - 11:52, 1 June 2025
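The memorylessness described here is easy to see in simulation: sampling the next state needs only the current state. A minimal sketch, assuming a made-up two-state weather chain (states and probabilities are illustrative):

```python
import random

# A made-up two-state weather chain; states and probabilities are
# illustrative only.
T = {"sunny": {"sunny": 0.8, "rainy": 0.2},
     "rainy": {"sunny": 0.4, "rainy": 0.6}}

def step(state):
    """Sample the next state: it depends only on the current state
    (the Markov property), not on how the chain got here."""
    r, acc = random.random(), 0.0
    for nxt, p in T[state].items():
        acc += p
        if r < acc:
            return nxt
    return nxt          # guard against floating-point rounding

random.seed(0)
chain = ["sunny"]
for _ in range(10):
    chain.append(step(chain[-1]))
```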
  • A partially observable Markov decision process (POMDP) is a Markov decision process in which the state of the system is only partially...
    10 KB (1,231 words) - 16:39, 29 May 2025
  • Markov property
    In probability theory and statistics, the term Markov property refers to the memoryless property of a stochastic process, which means that its future evolution...
    8 KB (1,124 words) - 20:27, 8 March 2025
  • The decentralized partially observable Markov decision process (Dec-POMDP) is a model for coordination and decision-making among multiple agents. It is a...
    3 KB (501 words) - 23:27, 25 June 2024
  • Gauss–Markov theorem, Gauss–Markov process, Markov blanket, Markov boundary, Markov chain, Markov chain central limit theorem, Additive Markov chain, Markov additive...
    2 KB (229 words) - 07:10, 17 June 2024
  • Reinforcement learning (category Markov models)
    The environment is typically stated in the form of a Markov decision process (MDP), as many reinforcement learning algorithms use dynamic programming...
    69 KB (8,194 words) - 13:01, 17 June 2025
  • Andrey Markov
    Markov decision process, Markov's inequality, Markov brothers' inequality, Markov information source, Markov network, Markov number, Markov property, Markov process...
    10 KB (1,072 words) - 21:36, 10 June 2025
  • For any finite Markov decision process, Q-learning finds an optimal policy in the sense of maximizing...
    29 KB (3,835 words) - 15:13, 21 April 2025
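The tabular Q-learning update behind this optimality result can be sketched on a toy problem; the corridor environment (positions 0..4, reward only at the right end) and the hyperparameters are invented for illustration:

```python
import random

# Tabular Q-learning on a hypothetical corridor MDP; environment and
# hyperparameters are illustrative, not from the article.
N, ACTIONS = 5, [-1, +1]                 # move left / move right
alpha, gamma, eps = 0.5, 0.9, 0.1
Q = {(s, a): 0.0 for s in range(N) for a in ACTIONS}

def env_step(s, a):
    s2 = max(0, min(N - 1, s + a))
    return s2, (1.0 if s2 == N - 1 else 0.0), s2 == N - 1

random.seed(1)
for _ in range(200):                     # episodes
    s, done = 0, False
    while not done:
        a = random.choice(ACTIONS) if random.random() < eps \
            else max(ACTIONS, key=lambda x: Q[(s, x)])
        s2, r, done = env_step(s, a)
        # Off-policy update: bootstrap from the *greedy* next action.
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, x)] for x in ACTIONS)
                              - Q[(s, a)])
        s = s2

policy = [max(ACTIONS, key=lambda x: Q[(s, x)]) for s in range(N)]
```

After enough episodes the greedy policy moves right everywhere, which is optimal in this toy corridor.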
  • In probability theory, a Markov reward model or Markov reward process is a stochastic process which extends either a Markov chain or a continuous-time Markov chain by adding...
    3 KB (275 words) - 03:33, 13 March 2024
  • using decision theory, decision analysis, and information value theory. These tools include models such as Markov decision processes, dynamic decision networks...
    281 KB (28,734 words) - 22:57, 20 June 2025
  • Michael Katehakis
    He is noted for his work on Markov decision processes, the Gittins index, the multi-armed bandit, Markov chains and other related fields. Katehakis...
    10 KB (966 words) - 01:50, 18 January 2025
  • Multi-armed bandit
    The bandit problem is formally equivalent to a one-state Markov decision process. The regret ρ after T rounds...
    67 KB (7,667 words) - 19:30, 22 May 2025
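The regret ρ after T rounds can be made concrete with an ε-greedy sketch of a hypothetical 3-armed Bernoulli bandit; the arm means, horizon, and ε below are invented for illustration:

```python
import random

# epsilon-greedy play of a hypothetical 3-armed Bernoulli bandit.
# Pseudo-regret after T rounds sums the per-round gap between the best
# arm's mean and the chosen arm's mean.
means = [0.2, 0.5, 0.8]
T, eps = 2000, 0.1
counts = [0, 0, 0]
values = [0.0, 0.0, 0.0]        # running mean reward per arm
pseudo_regret = 0.0

random.seed(0)
for _ in range(T):
    if random.random() < eps:
        arm = random.randrange(3)                        # explore
    else:
        arm = max(range(3), key=lambda i: values[i])     # exploit
    reward = 1.0 if random.random() < means[arm] else 0.0
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean
    pseudo_regret += max(means) - means[arm]
```

A good policy keeps the regret growing slowly with T by concentrating pulls on the best arm.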
  • State–action–reward–state–action (SARSA) is an algorithm for learning a Markov decision process policy, used in the reinforcement learning area of machine learning...
    6 KB (716 words) - 19:17, 6 December 2024
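The defining feature of SARSA, bootstrapping from the action the policy actually takes next rather than from the greedy one, fits in a single update rule; the Q-table entries below are illustrative numbers:

```python
# One SARSA update for a transition (s, a, r, s2, a2), where a2 is the action
# the current (e.g. epsilon-greedy) policy actually selected in s2.  This
# on-policy bootstrap distinguishes SARSA from Q-learning, which would use
# the max over actions in s2 instead.
def sarsa_update(Q, s, a, r, s2, a2, alpha=0.5, gamma=0.9):
    Q[(s, a)] += alpha * (r + gamma * Q[(s2, a2)] - Q[(s, a)])

Q = {(0, 0): 0.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 1.0}
sarsa_update(Q, s=0, a=1, r=0.0, s2=1, a2=1)   # bootstraps from Q[(1, 1)]
```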
  • reinforcement learning if the environment is stochastic and a Markov decision process (MDP) is used. Research in learning automata can be traced back...
    6 KB (763 words) - 19:33, 15 May 2024
  • using methods like Markov decision processes (MDPs) and dynamic programming. Puterman, Martin L. (1994). Markov Decision Processes: Discrete Stochastic Dynamic...
    1 KB (152 words) - 17:39, 25 May 2025
  • An example of a one-pass algorithm is the Sondik partially observable Markov decision process. Given any list as an input: Count the number of elements. Given...
    3 KB (415 words) - 14:59, 12 December 2023
  • Stochastic game (redirect from Markov game)
    Lloyd Shapley in the early 1950s. They generalize Markov decision processes to multiple interacting decision makers, as well as strategic-form games to dynamic...
    16 KB (2,434 words) - 19:57, 8 May 2025
  • In reinforcement learning, the environment is typically represented as a Markov decision process (MDP). Many reinforcement learning algorithms use dynamic programming...
    140 KB (15,572 words) - 23:09, 20 June 2025
  • probability distribution (and the reward function) associated with the Markov decision process (MDP), which, in RL, represents the problem to be solved. The transition...
    6 KB (614 words) - 16:21, 27 January 2025
  • allows efficient description of Markov Decision Processes (MDPs) and Partially Observable Markov Decision Processes (POMDPs) by representing everything...
    30 KB (3,469 words) - 20:23, 6 June 2025
  • framework of Markov decision processes with incomplete information, which ultimately led to the notion of a partially observable Markov decision process. In 1995...
    6 KB (551 words) - 02:14, 2 June 2025
  • Bellman equation
    Optimality condition in optimal control theory; Markov decision process – mathematical model for sequential decision making under uncertainty; Optimal control...
    28 KB (4,008 words) - 22:01, 1 June 2025
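Written out in its standard MDP form (with discount factor γ, reward R, and transition kernel P; notation assumed, not quoted from the article), the Bellman optimality equation is:

```latex
V^{*}(s) = \max_{a}\Big[\, R(s,a) + \gamma \sum_{s'} P(s' \mid s, a)\, V^{*}(s') \Big]
```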
  • It estimates the state value function of a finite-state Markov decision process (MDP) under a policy π. Let V^π...
    12 KB (1,565 words) - 20:36, 20 October 2024
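The TD(0) estimator referred to here can be sketched on a tiny deterministic chain evaluated under a fixed policy; the states, reward, and step size are invented for illustration:

```python
# TD(0) evaluation of a fixed policy on a hypothetical 3-state chain that
# moves right deterministically and pays +1 on entering terminal state 2.
alpha, gamma = 0.1, 1.0
V = [0.0, 0.0, 0.0]             # V[2] is terminal and stays 0

for _ in range(500):            # episodes
    s = 0
    while s != 2:
        s2 = s + 1
        r = 1.0 if s2 == 2 else 0.0
        # TD(0): nudge V[s] toward the bootstrapped target r + gamma * V[s2]
        V[s] += alpha * (r + gamma * V[s2] - V[s])
        s = s2
```

Here the true values of both non-terminal states are 1, and the estimates approach them as episodes accumulate.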
    which is, for the example, E(3) = 3.5 slots. Control theory; Markov chain; Markov decision process. Tanenbaum & Wetherall 2010, p. 395; Rosenberg et al., RFC 3261...
    23 KB (3,342 words) - 08:44, 17 June 2025
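The quoted E(3) = 3.5 can be checked directly: under binary exponential backoff, after c collisions a sender waits a uniformly random number of slots from 0 to 2^c − 1, whose mean is (2^c − 1)/2:

```python
def expected_backoff(c):
    """Mean delay in slots after c collisions, assuming the sender waits a
    uniformly random number of slots from 0 to 2**c - 1 (binary exponential
    backoff)."""
    slots = range(2 ** c)
    return sum(slots) / len(slots)      # equals (2**c - 1) / 2

print(expected_backoff(3))              # 3.5, matching E(3) in the snippet
```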
  • American scholar of computer science whose research has focused on Markov decision processes, queuing theory, computer networks, peer-to-peer networks, Internet...
    5 KB (401 words) - 07:30, 13 September 2024
  • Viable Product); Markov decision process, a probabilistic model that is widely used in artificial intelligence; Mask data preparation, a process in electronic...
    3 KB (381 words) - 14:02, 9 June 2025
  • Thomas Dean (computer scientist)
    the anytime algorithm and was the first to apply the factored Markov decision process to robotics. He has authored several influential textbooks on artificial...
    24 KB (2,264 words) - 09:45, 29 October 2024
  • Monte Carlo tree search (category Optimal decisions)
    Adaptive Multi-stage Sampling (AMS) algorithm for the model of Markov decision processes. AMS was the first work to explore the idea of UCB-based exploration...
    39 KB (4,658 words) - 04:19, 5 May 2025