• A Markov decision process (MDP), also called a stochastic dynamic program or stochastic control problem, is a model for sequential decision making when...
    35 KB (5,156 words) - 19:43, 21 March 2025
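    As a quick illustration of the tuple (S, A, P, R, γ) that defines a finite MDP, the Python sketch below builds a hypothetical two-state, two-action process; the transition probabilities and rewards are invented for this example, not taken from the article.

        import random

        # Hypothetical finite MDP (S, A, P, R, gamma); all numbers are invented.
        GAMMA = 0.9
        # P[s][a] -> list of (probability, next_state); each row sums to 1.
        P = {0: {0: [(0.8, 0), (0.2, 1)], 1: [(0.1, 0), (0.9, 1)]},
             1: {0: [(0.5, 0), (0.5, 1)], 1: [(0.3, 0), (0.7, 1)]}}
        # R[s][a] -> expected immediate reward.
        R = {0: {0: 0.0, 1: 1.0}, 1: {0: 2.0, 1: 0.5}}

        def step(s, a):
            """Sample a transition (reward, next state) from the dynamics."""
            r, acc = random.random(), 0.0
            for p, s_next in P[s][a]:
                acc += p
                if r < acc:
                    return R[s][a], s_next
            return R[s][a], P[s][a][-1][1]

        print(step(0, 1))  # e.g. (1.0, 1)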
  • A partially observable Markov decision process (POMDP) is a generalization of a Markov decision process (MDP). A POMDP models an agent's decision process in which it...
    22 KB (3,306 words) - 13:42, 23 April 2025
  • Markov chain
    In probability theory and statistics, a Markov chain or Markov process is a stochastic process describing a sequence of possible events in which the probability...
    96 KB (12,900 words) - 21:01, 27 April 2025
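    A Markov chain can be simulated directly from a transition matrix, since the next state depends only on the current one. A minimal sketch, assuming a hypothetical two-state weather chain with invented probabilities:

        import random

        # Hypothetical two-state weather chain; each row sums to 1.
        T = {"sunny": [("sunny", 0.9), ("rainy", 0.1)],
             "rainy": [("sunny", 0.5), ("rainy", 0.5)]}

        def simulate(start, n):
            """Sample n steps; each step looks only at the current state."""
            path, s = [start], start
            for _ in range(n):
                r, acc = random.random(), 0.0
                for nxt, p in T[s]:
                    acc += p
                    if r < acc:
                        s = nxt
                        break
                path.append(s)
            return path

        print(simulate("sunny", 10))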
  • A partially observable Markov decision process (POMDP) is a Markov decision process in which the state of the system is only partially...
    10 KB (1,231 words) - 22:12, 5 May 2025
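    Because the state is only partially observed, a POMDP agent maintains a belief, i.e. a probability distribution over states, and updates it by Bayes' rule after each action and observation. A minimal sketch, assuming hypothetical transition and observation matrices:

        # Belief update: b'(s') ∝ O(o | s') · Σ_s T(s' | s, a) · b(s)
        # T[a][s][s2]: transition probabilities; O[s][o]: observation probabilities.
        T = {0: [[0.7, 0.3], [0.2, 0.8]]}   # invented numbers
        O = [[0.9, 0.1], [0.4, 0.6]]

        def belief_update(b, a, o):
            n = len(b)
            predicted = [sum(T[a][s][s2] * b[s] for s in range(n)) for s2 in range(n)]
            unnorm = [O[s2][o] * predicted[s2] for s2 in range(n)]
            z = sum(unnorm)
            return [x / z for x in unnorm]

        print(belief_update([0.5, 0.5], a=0, o=1))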
  • The decentralized partially observable Markov decision process (Dec-POMDP) is a model for coordination and decision-making among multiple agents. It is a...
    3 KB (501 words) - 23:27, 25 June 2024
  • Markov property
    In probability theory and statistics, the term Markov property refers to the memoryless property of a stochastic process, which means that its future evolution...
    8 KB (1,124 words) - 20:27, 8 March 2025
  • Andrey Markov
    Markov decision process, Markov's inequality, Markov brothers' inequality, Markov information source, Markov network, Markov number, Markov property, Markov process...
    10 KB (1,072 words) - 15:39, 28 November 2024
  • Reinforcement learning (category Markov models)
    The environment is typically stated in the form of a Markov decision process (MDP), as many reinforcement learning algorithms use dynamic programming...
    68 KB (8,115 words) - 06:57, 5 May 2025
  • Gauss–Markov theorem, Gauss–Markov process, Markov blanket, Markov boundary, Markov chain, Markov chain central limit theorem, Additive Markov chain, Markov additive...
    2 KB (229 words) - 07:10, 17 June 2024
  • In probability theory, a Markov reward model or Markov reward process is a stochastic process which extends either a Markov chain or continuous-time Markov chain by adding...
    3 KB (275 words) - 03:33, 13 March 2024
  • For any finite Markov decision process, Q-learning finds an optimal policy in the sense of maximizing...
    29 KB (3,835 words) - 15:13, 21 April 2025
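    The tabular update behind that guarantee is short: after each transition (s, a, r, s'), move Q(s, a) toward r + γ·max_a' Q(s', a'). The sketch below runs it on a made-up deterministic toy environment; the environment, step size, and schedule are assumptions for illustration only.

        import random
        from collections import defaultdict

        ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1
        ACTIONS = [0, 1]
        Q = defaultdict(float)

        def step(s, a):
            """Toy deterministic environment, invented for this sketch."""
            s_next = (s + a + 1) % 3
            return (1.0 if s_next == 0 else 0.0), s_next

        def choose(s):
            """Epsilon-greedy exploration."""
            if random.random() < EPS:
                return random.choice(ACTIONS)
            return max(ACTIONS, key=lambda a: Q[(s, a)])

        s = 0
        for _ in range(5000):
            a = choose(s)
            r, s_next = step(s, a)
            # Off-policy target: best next action, not the one actually taken.
            best_next = max(Q[(s_next, b)] for b in ACTIONS)
            Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
            s = s_next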
  • using methods like Markov decision processes (MDPs) and dynamic programming. Puterman, Martin L. (1994). Markov decision processes: discrete stochastic dynamic...
    1 KB (152 words) - 18:14, 13 December 2024
  • Multi-armed bandit
    The bandit problem is formally equivalent to a one-state Markov decision process. The regret ρ after T rounds...
    67 KB (7,669 words) - 11:51, 22 April 2025
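    To make the one-state-MDP view concrete, the sketch below plays a hypothetical Bernoulli bandit with epsilon-greedy arm selection and reports the empirical regret after T rounds (the best arm's expected total reward minus what was actually collected); the arm means are invented.

        import random

        means = [0.3, 0.5, 0.7]      # hypothetical arm means; best is 0.7
        counts, values = [0] * 3, [0.0] * 3
        EPS, T = 0.1, 10000
        collected = 0.0

        for t in range(T):
            if t < 3 or random.random() < EPS:
                arm = random.randrange(3)                     # explore
            else:
                arm = max(range(3), key=lambda i: values[i])  # exploit
            r = 1.0 if random.random() < means[arm] else 0.0
            counts[arm] += 1
            values[arm] += (r - values[arm]) / counts[arm]    # running mean
            collected += r

        print("regret after T rounds ~", T * max(means) - collected)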
  • An example of a one-pass algorithm is the Sondik partially observable Markov decision process. Given any list as an input: Count the number of elements. Given...
    3 KB (415 words) - 14:59, 12 December 2023
  • probability distribution (and the reward function) associated with the Markov decision process (MDP), which, in RL, represents the problem to be solved. The transition...
    6 KB (614 words) - 16:21, 27 January 2025
  • State–action–reward–state–action (SARSA) is an algorithm for learning a Markov decision process policy, used in the reinforcement learning area of machine learning...
    6 KB (716 words) - 19:17, 6 December 2024
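    SARSA is on-policy: its update for the quintuple (s, a, r, s', a') uses the action a' the current policy actually selects in s', where Q-learning would use the greedy max instead. A minimal sketch of the update, with hypothetical arguments:

        from collections import defaultdict

        def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
            """Move Q(s,a) toward r + gamma * Q(s',a') for the action actually taken."""
            Q[(s, a)] += alpha * (r + gamma * Q[(s_next, a_next)] - Q[(s, a)])

        Q = defaultdict(float)
        sarsa_update(Q, s=0, a=1, r=1.0, s_next=2, a_next=0)
        print(Q[(0, 1)])  # 0.1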
  • In reinforcement learning, the environment is typically represented as a Markov decision process (MDP). Many reinforcement learning algorithms use dynamic programming...
    140 KB (15,513 words) - 09:56, 4 May 2025
  • using decision theory, decision analysis, and information value theory. These tools include models such as Markov decision processes, dynamic decision networks...
    278 KB (28,572 words) - 12:40, 19 April 2025
  • Among Markov decision process algorithms, the Monte Carlo POMDP (MC-POMDP) is the particle filter version for the partially observable Markov decision process...
    755 bytes (74 words) - 09:16, 21 January 2023
  • framework of Markov decision processes with incomplete information, which ultimately led to the notion of a partially observable Markov decision process. In 1995...
    6 KB (551 words) - 16:59, 16 November 2024
  • Michael Katehakis
    He is noted for his work on Markov decision processes, the Gittins index, the multi-armed bandit, Markov chains, and other related fields. Katehakis...
    10 KB (966 words) - 01:50, 18 January 2025
  • Bellman equation
    Optimality condition in optimal control theory; Markov decision process – Mathematical model for sequential decision making under uncertainty; Optimal control...
    27 KB (4,004 words) - 16:37, 13 August 2024
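    Value iteration makes the link between the Bellman equation and MDPs concrete: it repeatedly applies the Bellman optimality backup V(s) ← max_a [R(s,a) + γ Σ_s' P(s'|s,a) V(s')] until the values stop changing. The sketch below reuses the hypothetical two-state MDP from the first example above.

        GAMMA = 0.9
        P = {0: {0: [(0.8, 0), (0.2, 1)], 1: [(0.1, 0), (0.9, 1)]},
             1: {0: [(0.5, 0), (0.5, 1)], 1: [(0.3, 0), (0.7, 1)]}}
        R = {0: {0: 0.0, 1: 1.0}, 1: {0: 2.0, 1: 0.5}}

        V = {s: 0.0 for s in P}
        while True:
            # One Bellman optimality backup per state.
            V_new = {s: max(R[s][a] + GAMMA * sum(p * V[s2] for p, s2 in P[s][a])
                            for a in P[s])
                     for s in P}
            if max(abs(V_new[s] - V[s]) for s in P) < 1e-9:
                break
            V = V_new

        print(V_new)   # optimal state values for the toy MDP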
  • reinforcement learning if the environment is stochastic and a Markov decision process (MDP) is used. Research in learning automata can be traced back...
    6 KB (763 words) - 19:33, 15 May 2024
  • Stochastic game (redirect from Markov game)
    Lloyd Shapley in the early 1950s. They generalize Markov decision processes to multiple interacting decision makers, as well as strategic-form games to dynamic...
    16 KB (2,440 words) - 17:02, 20 March 2025
  • American scholar of computer science whose research has focused on Markov decision processes, queuing theory, computer networks, peer-to-peer networks, Internet...
    5 KB (401 words) - 07:30, 13 September 2024
  • time are represented by probability distributions describing a Markov decision process, and the cycle of perception and action is treated as an information...
    17 KB (1,910 words) - 17:46, 10 February 2025
  • allows efficient description of Markov decision processes (MDPs) and partially observable Markov decision processes (POMDPs) by representing everything...
    30 KB (3,469 words) - 00:04, 7 January 2025
  • Monte Carlo tree search (category Optimal decisions)
    Adaptive Multi-stage Sampling (AMS) algorithm for the model of Markov decision processes. AMS was the first work to explore the idea of UCB-based exploration...
    39 KB (4,658 words) - 04:19, 5 May 2025
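    The UCB-based selection rule at the heart of UCT-style Monte Carlo tree search picks the child maximizing its mean value plus an exploration bonus that shrinks as the child is visited more. A minimal sketch, with invented statistics:

        import math

        def ucb_select(children, c=1.414):
            """children: list of (total_value, visit_count), all counts > 0."""
            total = sum(n for _, n in children)
            def score(wn):
                w, n = wn
                return w / n + c * math.sqrt(math.log(total) / n)
            return max(range(len(children)), key=lambda i: score(children[i]))

        print(ucb_select([(3.0, 10), (2.0, 4), (0.5, 1)]))  # index of best child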
  • game of Magic: The Gathering. Planning in a partially observable Markov decision process. The problem of planning air travel from one destination to another...
    14 KB (1,586 words) - 03:29, 24 March 2025
  • observability, planning corresponds to a partially observable Markov decision process (POMDP). If there is more than one agent, we have multi-agent...
    20 KB (2,247 words) - 11:27, 25 April 2024