Markov decision process (MDP), also called a stochastic dynamic program or stochastic control problem, is a model for sequential decision making when...
35 KB (5,156 words) - 19:43, 21 March 2025
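The entry above describes the MDP model for sequential decision making. As a minimal illustration of how such a model is solved, here is a value-iteration sketch on a tiny hypothetical two-state MDP; all states, actions, transition probabilities, rewards, and the discount factor are made-up assumptions, not taken from the article:

```python
# Value iteration on a hypothetical 2-state, 2-action MDP.
# P[s][a] is a list of (probability, next_state, reward) triples.
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(0.7, 1, 1.0), (0.3, 0, 0.0)]},
    1: {0: [(1.0, 0, 0.5)], 1: [(1.0, 1, 2.0)]},
}
gamma = 0.9  # discount factor

V = {0: 0.0, 1: 0.0}
for _ in range(200):  # repeatedly apply the Bellman optimality operator
    V = {
        s: max(
            sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
            for outcomes in P[s].values()
        )
        for s in P
    }

print(V)  # state values, converged to the fixed point of the Bellman equation
```

Because the Bellman operator is a gamma-contraction, the loop converges geometrically; 200 iterations leave an error on the order of 0.9^200.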
observable Markov decision process (POMDP) is a generalization of a Markov decision process (MDP). A POMDP models an agent decision process in which it...
22 KB (3,306 words) - 13:42, 23 April 2025
In probability theory and statistics, a Markov chain or Markov process is a stochastic process describing a sequence of possible events in which the probability...
96 KB (12,900 words) - 21:01, 27 April 2025
to expected rewards. A partially observable Markov decision process (POMDP) is a Markov decision process in which the state of the system is only partially...
10 KB (1,231 words) - 22:12, 5 May 2025
The decentralized partially observable Markov decision process (Dec-POMDP) is a model for coordination and decision-making among multiple agents. It is a...
3 KB (501 words) - 23:27, 25 June 2024
probability theory and statistics, the term Markov property refers to the memoryless property of a stochastic process, which means that its future evolution...
8 KB (1,124 words) - 20:27, 8 March 2025
Markov decision process Markov's inequality Markov brothers' inequality Markov information source Markov network Markov number Markov property Markov process...
10 KB (1,072 words) - 15:39, 28 November 2024
Reinforcement learning (category Markov models)
dilemma. The environment is typically stated in the form of a Markov decision process (MDP), as many reinforcement learning algorithms use dynamic programming...
68 KB (8,115 words) - 06:57, 5 May 2025
Gauss–Markov theorem Gauss–Markov process Markov blanket Markov boundary Markov chain Markov chain central limit theorem Additive Markov chain Markov additive...
2 KB (229 words) - 07:10, 17 June 2024
theory, a Markov reward model or Markov reward process is a stochastic process which extends either a Markov chain or continuous-time Markov chain by adding...
3 KB (275 words) - 03:33, 13 March 2024
this choice by trying both directions over time. For any finite Markov decision process, Q-learning finds an optimal policy in the sense of maximizing...
29 KB (3,835 words) - 15:13, 21 April 2025
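The Q-learning entry above refers to the tabular update rule that, for any finite MDP, converges to an optimal policy. A minimal sketch of that rule follows; the two-state environment, its `step` dynamics, and all hyperparameters are illustrative assumptions:

```python
import random

random.seed(0)

# Tabular Q-learning on a hypothetical 2-state chain:
# action 1 moves to state 1 (which pays reward 1), action 0 moves to state 0.
n_states, n_actions = 2, 2
alpha, gamma, epsilon = 0.5, 0.9, 0.1
Q = [[0.0] * n_actions for _ in range(n_states)]

def step(s, a):
    """Toy deterministic dynamics: action 1 advances, reward 1 on reaching state 1."""
    s2 = 1 if a == 1 else 0
    r = 1.0 if s2 == 1 else 0.0
    return s2, r

s = 0
for _ in range(2000):
    # epsilon-greedy action selection
    if random.random() < epsilon:
        a = random.randrange(n_actions)
    else:
        a = max(range(n_actions), key=lambda i: Q[s][i])
    s2, r = step(s, a)
    # off-policy update: bootstrap from the max over next-state actions
    Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
    s = s2

print(Q)
```

The `max(Q[s2])` term is what makes Q-learning off-policy: it evaluates the greedy continuation regardless of which action the behavior policy takes next.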
using methods like Markov decision processes (MDPs) and dynamic programming. Puterman, Martin L. (1994). Markov decision processes: discrete stochastic dynamic...
1 KB (152 words) - 18:14, 13 December 2024
Multi-armed bandit (redirect from Bandit process)
played. The bandit problem is formally equivalent to a one-state Markov decision process. The regret ρ {\displaystyle \rho } after T {\displaystyle T} rounds...
67 KB (7,669 words) - 11:51, 22 April 2025
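The bandit entry above defines the regret ρ after T rounds as the expected shortfall against always pulling the best arm. A small sketch computing that quantity for an epsilon-greedy learner on a hypothetical Bernoulli bandit (arm means and parameters are invented for illustration):

```python
import random

random.seed(1)

# Hypothetical Bernoulli bandit: these arm means are illustrative assumptions.
means = [0.2, 0.5, 0.8]
T, epsilon = 10000, 0.1
counts = [0] * len(means)
values = [0.0] * len(means)  # running average reward per arm
total_reward = 0.0

for t in range(T):
    # epsilon-greedy: explore a random arm, else pull the best-looking one
    if random.random() < epsilon:
        arm = random.randrange(len(means))
    else:
        arm = max(range(len(means)), key=lambda i: values[i])
    reward = 1.0 if random.random() < means[arm] else 0.0
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean
    total_reward += reward

# Regret after T rounds: optimal expected reward minus what was collected.
regret = T * max(means) - total_reward
print(regret)
```

With fixed epsilon the regret grows linearly in T (roughly epsilon times the average arm gap per round); schedules that shrink epsilon, or UCB-style rules, bring it down to logarithmic growth.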
example of a one-pass algorithm is the Sondik partially observable Markov decision process. Given any list as an input: Count the number of elements. Given...
3 KB (415 words) - 14:59, 12 December 2023
probability distribution (and the reward function) associated with the Markov decision process (MDP), which, in RL, represents the problem to be solved. The transition...
6 KB (614 words) - 16:21, 27 January 2025
State–action–reward–state–action (SARSA) is an algorithm for learning a Markov decision process policy, used in the reinforcement learning area of machine learning...
6 KB (716 words) - 19:17, 6 December 2024
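The SARSA entry above names the algorithm after its update quintuple (state, action, reward, next state, next action). A minimal sketch of that on-policy update, with purely illustrative state names and values:

```python
# SARSA update (on-policy): bootstraps from the action a2 the policy actually
# chose next, unlike Q-learning's max over next actions. Values are illustrative.
def sarsa_update(Q, s, a, r, s2, a2, alpha=0.1, gamma=0.9):
    Q[(s, a)] = Q.get((s, a), 0.0) + alpha * (
        r + gamma * Q.get((s2, a2), 0.0) - Q.get((s, a), 0.0)
    )
    return Q

Q = {("s0", "right"): 0.0, ("s1", "right"): 1.0}
sarsa_update(Q, "s0", "right", 0.5, "s1", "right")
print(Q[("s0", "right")])  # moves toward 0.5 + 0.9 * 1.0 by step alpha = 0.1
```

Because the target uses the behavior policy's own next action, SARSA evaluates the policy it is following, including its exploration.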
Machine learning (section Decision trees)
reinforcement learning, the environment is typically represented as a Markov decision process (MDP). Many reinforcement learning algorithms use dynamic programming...
140 KB (15,513 words) - 09:56, 4 May 2025
using decision theory, decision analysis, and information value theory. These tools include models such as Markov decision processes, dynamic decision networks...
278 KB (28,572 words) - 12:40, 19 April 2025
of Markov decision process algorithms, the Monte Carlo POMDP (MC-POMDP) is the particle filter version for the partially observable Markov decision process...
755 bytes (74 words) - 09:16, 21 January 2023
framework of Markov decision processes with incomplete information, which ultimately led to the notion of a partially observable Markov decision process. In 1995...
6 KB (551 words) - 16:59, 16 November 2024
University. He is noted for his work on Markov decision processes, the Gittins index, the multi-armed bandit, Markov chains and other related fields. Katehakis...
10 KB (966 words) - 01:50, 18 January 2025
Bellman equation (section A dynamic decision problem)
Optimality condition in optimal control theory Markov decision process – Mathematical model for sequential decision making under uncertainty Optimal control...
27 KB (4,004 words) - 16:37, 13 August 2024
reinforcement learning if the environment is stochastic and a Markov decision process (MDP) is used. Research in learning automata can be traced back...
6 KB (763 words) - 19:33, 15 May 2024
Stochastic game (redirect from Markov game)
Lloyd Shapley in the early 1950s. They generalize Markov decision processes to multiple interacting decision makers, as well as strategic-form games to dynamic...
16 KB (2,440 words) - 17:02, 20 March 2025
American scholar of computer science whose research has focused on Markov decision processes, queuing theory, computer networks, peer-to-peer networks, Internet...
5 KB (401 words) - 07:30, 13 September 2024
time are represented by probability distributions describing a Markov decision process, and the cycle of perception and action treated as an information...
17 KB (1,910 words) - 17:46, 10 February 2025
allows efficient description of Markov Decision Processes (MDPs) and Partially Observable Markov Decision Processes (POMDPs) by representing everything...
30 KB (3,469 words) - 00:04, 7 January 2025
Monte Carlo tree search (category Optimal decisions)
Adaptive Multi-stage Sampling (AMS) algorithm for the model of Markov decision processes. AMS was the first work to explore the idea of UCB-based exploration...
39 KB (4,658 words) - 04:19, 5 May 2025
game of Magic: The Gathering. Planning in a partially observable Markov decision process. The problem of planning air travel from one destination to another...
14 KB (1,586 words) - 03:29, 24 March 2025
observability, planning corresponds to a partially observable Markov decision process (POMDP). If there is more than one agent, we have multi-agent...
20 KB (2,247 words) - 11:27, 25 April 2024