Markov decision process (MDP), also called a stochastic dynamic program or stochastic control problem, is a model for sequential decision making when...
35 KB (5,156 words) - 19:43, 21 March 2025
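The entry above describes the MDP model for sequential decision making. As a minimal illustration of how such a model is solved, here is a value-iteration sketch on a tiny hypothetical two-state MDP; all states, actions, transition probabilities, rewards, and the discount factor are made-up assumptions, not taken from the article:

```python
# Value iteration on a hypothetical 2-state, 2-action MDP.
# P[s][a] is a list of (probability, next_state, reward) triples.
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(0.7, 1, 1.0), (0.3, 0, 0.0)]},
    1: {0: [(1.0, 0, 0.5)], 1: [(1.0, 1, 2.0)]},
}
gamma = 0.9  # discount factor

V = {0: 0.0, 1: 0.0}
for _ in range(200):  # repeatedly apply the Bellman optimality operator
    V = {
        s: max(
            sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
            for outcomes in P[s].values()
        )
        for s in P
    }

print(V)  # state values, converged to the fixed point of the Bellman equation
```

Because the Bellman operator is a gamma-contraction, the loop converges geometrically; 200 iterations leave an error on the order of 0.9^200.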
observable Markov decision process (POMDP) is a generalization of a Markov decision process (MDP). A POMDP models an agent decision process in which it...
22 KB (3,306 words) - 13:42, 23 April 2025
In probability theory and statistics, a Markov chain or Markov process is a stochastic process describing a sequence of possible events in which the probability...
96 KB (12,900 words) - 21:01, 27 April 2025
to expected rewards. A partially observable Markov decision process (POMDP) is a Markov decision process in which the state of the system is only partially...
10 KB (1,231 words) - 22:12, 5 May 2025
The decentralized partially observable Markov decision process (Dec-POMDP) is a model for coordination and decision-making among multiple agents. It is a...
3 KB (501 words) - 23:27, 25 June 2024
probability theory and statistics, the term Markov property refers to the memoryless property of a stochastic process, which means that its future evolution...
8 KB (1,124 words) - 20:27, 8 March 2025
Markov decision process Markov's inequality Markov brothers' inequality Markov information source Markov network Markov number Markov property Markov process...
10 KB (1,072 words) - 15:39, 28 November 2024
Reinforcement learning (category Markov models)
dilemma. The environment is typically stated in the form of a Markov decision process (MDP), as many reinforcement learning algorithms use dynamic programming...
68 KB (8,115 words) - 06:57, 5 May 2025
Gauss–Markov theorem Gauss–Markov process Markov blanket Markov boundary Markov chain Markov chain central limit theorem Additive Markov chain Markov additive...
2 KB (229 words) - 07:10, 17 June 2024
theory, a Markov reward model or Markov reward process is a stochastic process which extends either a Markov chain or continuous-time Markov chain by adding...
3 KB (275 words) - 03:33, 13 March 2024
this choice by trying both directions over time. For any finite Markov decision process, Q-learning finds an optimal policy in the sense of maximizing...
29 KB (3,835 words) - 15:13, 21 April 2025
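The Q-learning entry above refers to the tabular update rule that, for any finite MDP, converges to an optimal policy. A minimal sketch of that rule follows; the two-state environment, its `step` dynamics, and all hyperparameters are illustrative assumptions:

```python
import random

random.seed(0)

# Tabular Q-learning on a hypothetical 2-state chain:
# action 1 moves to state 1 (which pays reward 1), action 0 moves to state 0.
n_states, n_actions = 2, 2
alpha, gamma, epsilon = 0.5, 0.9, 0.1
Q = [[0.0] * n_actions for _ in range(n_states)]

def step(s, a):
    """Toy deterministic dynamics: action 1 advances, reward 1 on reaching state 1."""
    s2 = 1 if a == 1 else 0
    r = 1.0 if s2 == 1 else 0.0
    return s2, r

s = 0
for _ in range(2000):
    # epsilon-greedy action selection
    if random.random() < epsilon:
        a = random.randrange(n_actions)
    else:
        a = max(range(n_actions), key=lambda i: Q[s][i])
    s2, r = step(s, a)
    # off-policy update: bootstrap from the max over next-state actions
    Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
    s = s2

print(Q)
```

The `max(Q[s2])` term is what makes Q-learning off-policy: it evaluates the greedy continuation regardless of which action the behavior policy takes next.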
using methods like Markov decision processes (MDPs) and dynamic programming. Puterman, Martin L. (1994). Markov decision processes: discrete stochastic dynamic...
1 KB (152 words) - 18:14, 13 December 2024
Multi-armed bandit (redirect from Bandit process)
played. The bandit problem is formally equivalent to a one-state Markov decision process. The regret ρ {\displaystyle \rho } after T {\displaystyle T} rounds...
67 KB (7,669 words) - 11:51, 22 April 2025
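The bandit entry above defines the regret ρ after T rounds as the expected shortfall against always pulling the best arm. A small sketch computing that quantity for an epsilon-greedy learner on a hypothetical Bernoulli bandit (arm means and parameters are invented for illustration):

```python
import random

random.seed(1)

# Hypothetical Bernoulli bandit: these arm means are illustrative assumptions.
means = [0.2, 0.5, 0.8]
T, epsilon = 10000, 0.1
counts = [0] * len(means)
values = [0.0] * len(means)  # running average reward per arm
total_reward = 0.0

for t in range(T):
    # epsilon-greedy: explore a random arm, else pull the best-looking one
    if random.random() < epsilon:
        arm = random.randrange(len(means))
    else:
        arm = max(range(len(means)), key=lambda i: values[i])
    reward = 1.0 if random.random() < means[arm] else 0.0
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean
    total_reward += reward

# Regret after T rounds: optimal expected reward minus what was collected.
regret = T * max(means) - total_reward
print(regret)
```

With fixed epsilon the regret grows linearly in T (roughly epsilon times the average arm gap per round); schedules that shrink epsilon, or UCB-style rules, bring it down to logarithmic growth.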
example of a one-pass algorithm is the Sondik partially observable Markov decision process. Given any list as an input: Count the number of elements. Given...
3 KB (415 words) - 14:59, 12 December 2023
probability distribution (and the reward function) associated with the Markov decision process (MDP), which, in RL, represents the problem to be solved. The transition...
6 KB (614 words) - 16:21, 27 January 2025
State–action–reward–state–action (SARSA) is an algorithm for learning a Markov decision process policy, used in the reinforcement learning area of machine learning...
6 KB (716 words) - 19:17, 6 December 2024
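The SARSA entry above names the algorithm after its update quintuple (state, action, reward, next state, next action). A minimal sketch of that on-policy update, with purely illustrative state names and values:

```python
# SARSA update (on-policy): bootstraps from the action a2 the policy actually
# chose next, unlike Q-learning's max over next actions. Values are illustrative.
def sarsa_update(Q, s, a, r, s2, a2, alpha=0.1, gamma=0.9):
    Q[(s, a)] = Q.get((s, a), 0.0) + alpha * (
        r + gamma * Q.get((s2, a2), 0.0) - Q.get((s, a), 0.0)
    )
    return Q

Q = {("s0", "right"): 0.0, ("s1", "right"): 1.0}
sarsa_update(Q, "s0", "right", 0.5, "s1", "right")
print(Q[("s0", "right")])  # moves toward 0.5 + 0.9 * 1.0 by step alpha = 0.1
```

Because the target uses the behavior policy's own next action, SARSA evaluates the policy it is following, including its exploration.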
Machine learning (section Decision trees)
reinforcement learning, the environment is typically represented as a Markov decision process (MDP). Many reinforcement learning algorithms use dynamic programming...
140 KB (15,513 words) - 09:56, 4 May 2025
using decision theory, decision analysis, and information value theory. These tools include models such as Markov decision processes, dynamic decision networks...
278 KB (28,572 words) - 12:40, 19 April 2025
of Markov decision process algorithms, the Monte Carlo POMDP (MC-POMDP) is the particle filter version for the partially observable Markov decision process...
755 bytes (74 words) - 09:16, 21 January 2023
framework of Markov decision processes with incomplete information, which ultimately led to the notion of a partially observable Markov decision process. In 1995...
6 KB (551 words) - 16:59, 16 November 2024
University. He is noted for his work on Markov decision processes, the Gittins index, the multi-armed bandit, Markov chains and other related fields. Katehakis...
10 KB (966 words) - 01:50, 18 January 2025
Bellman equation (section A dynamic decision problem)
Optimality condition in optimal control theory Markov decision process – Mathematical model for sequential decision making under uncertainty Optimal control...
27 KB (4,004 words) - 16:37, 13 August 2024
reinforcement learning if the environment is stochastic and a Markov decision process (MDP) is used. Research in learning automata can be traced back...
6 KB (763 words) - 19:33, 15 May 2024
Stochastic game (redirect from Markov game)
Lloyd Shapley in the early 1950s. They generalize Markov decision processes to multiple interacting decision makers, as well as strategic-form games to dynamic...
16 KB (2,440 words) - 17:02, 20 March 2025
American scholar of computer science whose research has focused on Markov decision processes, queuing theory, computer networks, peer-to-peer networks, Internet...
5 KB (401 words) - 07:30, 13 September 2024
time are represented by probability distributions describing a Markov decision process, and the cycle of perception and action treated as an information...
17 KB (1,910 words) - 17:46, 10 February 2025
allows efficient description of Markov Decision Processes (MDPs) and Partially Observable Markov Decision Processes (POMDPs) by representing everything...
30 KB (3,469 words) - 00:04, 7 January 2025
Monte Carlo tree search (category Optimal decisions)
Adaptive Multi-stage Sampling (AMS) algorithm for the model of Markov decision processes. AMS was the first work to explore the idea of UCB-based exploration...
39 KB (4,658 words) - 04:19, 5 May 2025
game of Magic: The Gathering. Planning in a partially observable Markov decision process. The problem of planning air travel from one destination to another...
14 KB (1,586 words) - 03:29, 24 March 2025
observability, planning corresponds to a partially observable Markov decision process (POMDP). If there is more than one agent, we have multi-agent...
20 KB (2,247 words) - 11:27, 25 April 2024