A Markov decision process (MDP), also called a stochastic dynamic program or stochastic control problem, is a model for sequential decision making when...
35 KB (5,156 words) - 11:15, 25 May 2025
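An MDP is usually specified as a tuple of states, actions, transition probabilities, and rewards. The sketch below encodes a tiny invented two-state "machine maintenance" MDP (none of the names or numbers come from the articles above) to make the structure concrete: given the current state and a chosen action, the next state is sampled from a distribution that depends only on that state–action pair.

```python
import random

# Minimal MDP sketch (invented two-state example, not from any listed article).
# P[s][a] -> list of (next_state, probability); R[s][a] -> expected reward.
P = {
    "good": {"run": [("good", 0.9), ("worn", 0.1)],
             "repair": [("good", 1.0)]},
    "worn": {"run": [("worn", 1.0)],
             "repair": [("good", 1.0)]},
}
R = {
    "good": {"run": 5.0, "repair": 2.0},
    "worn": {"run": 1.0, "repair": 0.0},
}

def step(state, action, rng=random):
    """Sample a transition; the distribution depends only on (state, action)."""
    next_states, probs = zip(*P[state][action])
    next_state = rng.choices(next_states, weights=probs)[0]
    return next_state, R[state][action]
```

Reinforcement-learning agents interact with exactly this kind of interface: call `step`, observe the next state and reward, and update a policy.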
A partially observable Markov decision process (POMDP) is a generalization of a Markov decision process (MDP). A POMDP models an agent's decision process in which it...
22 KB (3,306 words) - 13:42, 23 April 2025
In probability theory and statistics, a Markov chain or Markov process is a stochastic process describing a sequence of possible events in which the probability...
96 KB (12,900 words) - 11:52, 1 June 2025
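The defining feature of a Markov chain is that the distribution of the next event depends only on the current state, not on the path taken to reach it. A short sketch, using an invented two-state weather chain purely for illustration:

```python
import random

# Markov chain sketch: the next state's distribution depends only on the
# current state (the Markov property). The weather chain is an invented toy.
transitions = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}

def simulate(start, n_steps, rng=random):
    """Simulate a path through the chain, one transition per step."""
    path = [start]
    for _ in range(n_steps):
        row = transitions[path[-1]]
        path.append(rng.choices(list(row), weights=list(row.values()))[0])
    return path
```

An MDP extends this picture by letting an agent's action choice influence which transition row is used at each step.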
to expected rewards. A partially observable Markov decision process (POMDP) is a Markov decision process in which the state of the system is only partially...
10 KB (1,231 words) - 16:39, 29 May 2025
probability theory and statistics, the term Markov property refers to the memoryless property of a stochastic process, which means that its future evolution...
8 KB (1,124 words) - 20:27, 8 March 2025
The decentralized partially observable Markov decision process (Dec-POMDP) is a model for coordination and decision-making among multiple agents. It is a...
3 KB (501 words) - 23:27, 25 June 2024
Gauss–Markov theorem Gauss–Markov process Markov blanket Markov boundary Markov chain Markov chain central limit theorem Additive Markov chain Markov additive...
2 KB (229 words) - 07:10, 17 June 2024
Reinforcement learning (category Markov models)
dilemma. The environment is typically stated in the form of a Markov decision process (MDP), as many reinforcement learning algorithms use dynamic programming...
69 KB (8,194 words) - 13:01, 17 June 2025
Markov decision process Markov's inequality Markov brothers' inequality Markov information source Markov network Markov number Markov property Markov process...
10 KB (1,072 words) - 21:36, 10 June 2025
this choice by trying both directions over time. For any finite Markov decision process, Q-learning finds an optimal policy in the sense of maximizing...
29 KB (3,835 words) - 15:13, 21 April 2025
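The Q-learning rule that yields an optimal policy for finite MDPs is the tabular update Q(s,a) ← Q(s,a) + α(r + γ·maxₐ′ Q(s′,a′) − Q(s,a)). A minimal sketch on an invented two-state, two-action table (the states, actions, and reward are made up for illustration):

```python
# Tabular Q-learning update sketch:
#   Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
ALPHA, GAMMA = 0.1, 0.9

def q_update(Q, s, a, r, s_next):
    """One Q-learning step toward the bootstrapped target."""
    target = r + GAMMA * max(Q[s_next].values())
    Q[s][a] += ALPHA * (target - Q[s][a])

# Two states, two actions, all Q-values initialized to zero (invented example).
Q = {s: {a: 0.0 for a in ("left", "right")} for s in (0, 1)}
q_update(Q, 0, "right", 1.0, 1)   # reward 1 for moving right from state 0
```

The `max` over next actions is what makes Q-learning off-policy: it learns about the greedy policy regardless of how exploratory the behavior was.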
theory, a Markov reward model or Markov reward process is a stochastic process which extends either a Markov chain or continuous-time Markov chain by adding...
3 KB (275 words) - 03:33, 13 March 2024
using decision theory, decision analysis, and information value theory. These tools include models such as Markov decision processes, dynamic decision networks...
281 KB (28,734 words) - 22:57, 20 June 2025
University. He is noted for his work on Markov decision processes, the Gittins index, the multi-armed bandit, Markov chains and other related fields. Katehakis...
10 KB (966 words) - 01:50, 18 January 2025
Multi-armed bandit (redirect from Bandit process)
played. The bandit problem is formally equivalent to a one-state Markov decision process. The regret ρ after T rounds...
67 KB (7,667 words) - 19:30, 22 May 2025
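The regret ρ after T rounds is the gap between what an oracle always playing the best arm would earn in expectation and what the player actually earned. A small sketch computing expected-reward regret from a sequence of arm pulls (the arm means are invented for illustration):

```python
# Expected-reward regret sketch for a multi-armed bandit:
#   rho = T * mu_star - sum over pulled arms of their mean reward,
# where mu_star is the best arm's mean. Arm means here are invented.
def regret(arm_means, pulls):
    """pulls: list of arm indices chosen over the T rounds."""
    mu_star = max(arm_means)
    return sum(mu_star - arm_means[a] for a in pulls)
```

Every round spent on a suboptimal arm adds that arm's mean gap to the regret, which is why exploration strategies aim to identify the best arm quickly.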
State–action–reward–state–action (SARSA) is an algorithm for learning a Markov decision process policy, used in the reinforcement learning area of machine learning...
6 KB (716 words) - 19:17, 6 December 2024
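SARSA's name spells out the quintuple it learns from: the update target uses the action a′ the policy actually takes in s′, rather than Q-learning's max over next actions. A sketch on an invented toy table:

```python
# On-policy SARSA update sketch:
#   Q(s,a) <- Q(s,a) + alpha * (r + gamma * Q(s',a') - Q(s,a))
# Unlike Q-learning, the target uses the next action a' actually taken.
ALPHA, GAMMA = 0.5, 0.9

def sarsa_update(Q, s, a, r, s_next, a_next):
    Q[(s, a)] += ALPHA * (r + GAMMA * Q[(s_next, a_next)] - Q[(s, a)])

# Invented two-state, two-action table for illustration.
Q = {(s, a): 0.0 for s in (0, 1) for a in ("stay", "go")}
Q[(1, "go")] = 2.0
sarsa_update(Q, 0, "go", 1.0, 1, "go")
```

Because the target follows the behavior policy, SARSA evaluates the policy it is actually executing, exploration included.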
reinforcement learning if the environment is stochastic and a Markov decision process (MDP) is used. Research in learning automata can be traced back...
6 KB (763 words) - 19:33, 15 May 2024
using methods like Markov decision processes (MDPs) and dynamic programming. Puterman, Martin L. (1994). Markov decision processes: discrete stochastic dynamic...
1 KB (152 words) - 17:39, 25 May 2025
example of a one-pass algorithm is the Sondik partially observable Markov decision process. Given any list as an input: Count the number of elements. Given...
3 KB (415 words) - 14:59, 12 December 2023
Stochastic game (redirect from Markov game)
Lloyd Shapley in the early 1950s. They generalize Markov decision processes to multiple interacting decision makers, as well as strategic-form games to dynamic...
16 KB (2,434 words) - 19:57, 8 May 2025
Machine learning (section Decision trees)
reinforcement learning, the environment is typically represented as a Markov decision process (MDP). Many reinforcement learning algorithms use dynamic programming...
140 KB (15,572 words) - 23:09, 20 June 2025
probability distribution (and the reward function) associated with the Markov decision process (MDP), which, in RL, represents the problem to be solved. The transition...
6 KB (614 words) - 16:21, 27 January 2025
allows efficient description of Markov Decision Processes (MDPs) and Partially Observable Markov Decision Processes (POMDPs) by representing everything...
30 KB (3,469 words) - 20:23, 6 June 2025
framework of Markov decision processes with incomplete information, which ultimately led to the notion of a partially observable Markov decision process. In 1995...
6 KB (551 words) - 02:14, 2 June 2025
Bellman equation (section A dynamic decision problem)
Optimality condition in optimal control theory Markov decision process – Mathematical model for sequential decision making under uncertainty Optimal control...
28 KB (4,008 words) - 22:01, 1 June 2025
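The Bellman optimality equation V(s) = maxₐ [R(s,a) + γ·Σ P(s′|s,a)V(s′)] can be solved for a finite MDP by repeatedly applying it as an update (value iteration). A sketch on an invented two-state MDP, chosen so the fixed point is easy to check by hand:

```python
# Value-iteration sketch applying the Bellman optimality update
#   V(s) <- max_a [ R(s,a) + gamma * sum_s' P(s'|s,a) * V(s') ]
# on an invented two-state MDP (not taken from the articles above).
GAMMA = 0.9

P = {  # P[s][a] = list of (next_state, probability)
    0: {"a": [(0, 1.0)], "b": [(1, 1.0)]},
    1: {"a": [(1, 1.0)], "b": [(0, 1.0)]},
}
R = {0: {"a": 0.0, "b": 1.0}, 1: {"a": 2.0, "b": 0.0}}

def value_iteration(n_sweeps=200):
    V = {s: 0.0 for s in P}
    for _ in range(n_sweeps):
        V = {s: max(R[s][a] + GAMMA * sum(p * V[s2] for s2, p in P[s][a])
                    for a in P[s])
             for s in P}
    return V
```

For this toy MDP the optimal policy parks in state 1 collecting reward 2 forever, so V(1) = 2/(1 − γ) = 20 and V(0) = 1 + γ·20 = 19.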
methods. It estimates the state value function of a finite-state Markov decision process (MDP) under a policy π. Let V^π...
12 KB (1,565 words) - 20:36, 20 October 2024
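Temporal-difference learning estimates V^π from sampled transitions with the TD(0) update V(s) ← V(s) + α(r + γV(s′) − V(s)). A sketch on a couple of invented transitions (states, rewards, and step size are made up for illustration):

```python
# TD(0) sketch: update the value estimate toward the one-step bootstrapped
# target r + gamma * V(s'). The transitions below are invented.
ALPHA, GAMMA = 0.5, 1.0

def td0_update(V, s, r, s_next):
    V[s] += ALPHA * (r + GAMMA * V[s_next] - V[s])

V = {"A": 0.0, "B": 0.0, "terminal": 0.0}
td0_update(V, "B", 1.0, "terminal")   # V[B]: 0.0 -> 0.5
td0_update(V, "A", 0.0, "B")          # V[A]: 0.0 -> 0.25, bootstrapping on V[B]
```

The second update illustrates bootstrapping: state A's estimate moves toward B's current estimate rather than waiting for a full return.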
which is, for the example, E(3) = 3.5 slots. Control theory Markov chain Markov decision process Tanenbaum & Wetherall 2010, p. 395 Rosenberg et al. RFC3261...
23 KB (3,342 words) - 08:44, 17 June 2025
American scholar of computer science whose research has focused on Markov decision processes, queuing theory, computer networks, peer-to-peer networks, Internet...
5 KB (401 words) - 07:30, 13 September 2024
Viable Product) Markov decision process, a probabilistic model that is widely used in artificial intelligence Mask data preparation, a process in electronic...
3 KB (381 words) - 14:02, 9 June 2025
the anytime algorithm and was the first to apply the factored Markov decision process to robotics. He has authored several influential textbooks on artificial...
24 KB (2,264 words) - 09:45, 29 October 2024
Monte Carlo tree search (category Optimal decisions)
Adaptive Multi-stage Sampling (AMS) algorithm for the model of Markov decision processes. AMS was the first work to explore the idea of UCB-based exploration...
39 KB (4,658 words) - 04:19, 5 May 2025