value iteration - meaning and definition. What is value iteration
Diclib.com

What (who) is value iteration - definition

MODEL FOR DECISION MAKING UNDER UNCERTAINTY
Markov Decision Process; Value iteration; Policy iteration; Markov decision problems; Markov Decision Processes; Markov decision processes; Algorithms for solving Markov decision processes; Methods for solving Markov decision processes
  • Example of a simple MDP with three states (green circles) and two actions (orange circles), with two rewards (orange arrows).

Iterated function         
  • [Figure caption, beginning truncated in source] … F is iterated indefinitely, then A and K are the starting points of two infinite spirals.
MATHEMATICAL OPERATION OF COMPOSING A FUNCTION WITH ITSELF REPEATEDLY
Picard sequence; Function iteration; Function Iteration; Iterated map; Fractional iteration; Iterative functional-differential equation; Iteration orbit; Iterative function
In mathematics, an iterated function is a function $X \to X$ (that is, a function from some set $X$ to itself) which is obtained by composing another function $f : X \to X$ with itself a certain number of times. The process of repeatedly applying the same function is called iteration.
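As a minimal sketch of the definition above, the n-fold composition of a function with itself can be built as follows. The helper name `iterate` and the example function `double` are illustrative choices, not part of the source.

```python
from typing import Callable, TypeVar

X = TypeVar("X")


def iterate(f: Callable[[X], X], n: int) -> Callable[[X], X]:
    """Return f composed with itself n times (n = 0 gives the identity)."""
    if n < 0:
        raise ValueError("n must be non-negative")

    def f_n(x: X) -> X:
        # Apply f repeatedly: x, f(x), f(f(x)), ...
        for _ in range(n):
            x = f(x)
        return x

    return f_n


def double(x: int) -> int:
    return 2 * x


print(iterate(double, 3)(5))  # 5 -> 10 -> 20 -> 40
```

Building the iterate as a closure keeps the original function untouched and makes the zero-fold case (the identity map) fall out of the loop naturally.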
Markov decision process         
In mathematics, a Markov decision process (MDP) is a discrete-time stochastic control process. It provides a mathematical framework for modeling decision making in situations where outcomes are partly random and partly under the control of a decision maker.
Value (economics)         
  • Value or price
MEASURE OF THE BENEFIT PROVIDED BY A GOOD OR SERVICE TO AN ECONOMIC AGENT
Monetary value; Value for money; Economic value; Theory of value(economics); Financial value
In economics, economic value is a measure of the benefit provided by a good or service to an economic agent. It is generally measured through units of currency, and the interpretation is therefore "what is the maximum amount of money a specific actor is willing and able to pay for the good or service?"

Wikipedia

Markov decision process

In mathematics, a Markov decision process (MDP) is a discrete-time stochastic control process. It provides a mathematical framework for modeling decision making in situations where outcomes are partly random and partly under the control of a decision maker. MDPs are useful for studying optimization problems solved via dynamic programming. MDPs were known at least as early as the 1950s; a core body of research on Markov decision processes resulted from Ronald Howard's 1960 book, Dynamic Programming and Markov Processes. They are used in many disciplines, including robotics, automatic control, economics and manufacturing. The name of MDPs comes from the Russian mathematician Andrey Markov as they are an extension of Markov chains.

At each time step, the process is in some state $s$, and the decision maker may choose any action $a$ that is available in state $s$. The process responds at the next time step by randomly moving into a new state $s'$, and giving the decision maker a corresponding reward $R_a(s, s')$.

The probability that the process moves into its new state $s'$ is influenced by the chosen action. Specifically, it is given by the state transition function $P_a(s, s')$. Thus, the next state $s'$ depends on the current state $s$ and the decision maker's action $a$. But given $s$ and $a$, it is conditionally independent of all previous states and actions; in other words, the state transitions of an MDP satisfy the Markov property.
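The single time step described above can be sketched in a few lines. The two states, two actions, probabilities, and reward rule below are invented purely for illustration; only the roles of $P_a(s, s')$ and $R_a(s, s')$ come from the text.

```python
import random

# Hypothetical MDP: P[a][s][s2] plays the role of P_a(s, s2).
P = {
    "stay": {"s0": {"s0": 0.9, "s1": 0.1}, "s1": {"s0": 0.1, "s1": 0.9}},
    "move": {"s0": {"s0": 0.2, "s1": 0.8}, "s1": {"s0": 0.8, "s1": 0.2}},
}


def reward(a: str, s: str, s2: str) -> float:
    """Hypothetical R_a(s, s2): pay 1.0 whenever the process lands in s1."""
    return 1.0 if s2 == "s1" else 0.0


def step(s: str, a: str, rng: random.Random) -> tuple[str, float]:
    """Sample the next state from P_a(s, .) and return it with its reward.

    Only the current state and action are used, never the history,
    which is exactly the Markov property.
    """
    next_states = list(P[a][s])
    weights = [P[a][s][s2] for s2 in next_states]
    s2 = rng.choices(next_states, weights=weights, k=1)[0]
    return s2, reward(a, s, s2)


rng = random.Random(0)
state = "s0"
for action in ["move", "stay", "move"]:
    state, r = step(state, action, rng)
    print(action, state, r)
```

Passing the random generator in explicitly makes runs reproducible, which is convenient when comparing policies on the same sequence of random draws.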

Markov decision processes are an extension of Markov chains; the difference is the addition of actions (allowing choice) and rewards (giving motivation). Conversely, if only one action exists for each state (e.g. "wait") and all rewards are the same (e.g. "zero"), a Markov decision process reduces to a Markov chain.
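The reduction to a Markov chain can be illustrated with a small sketch. It assumes a hypothetical MDP whose only action is "wait" and whose rewards are all zero (so they are omitted); the state names and probabilities are made up, and the single action is shared by all states for simplicity.

```python
# Hypothetical MDP with a single action; P[a][s][s2] stands for P_a(s, s2).
P = {
    "wait": {
        "sunny": {"sunny": 0.8, "rainy": 0.2},
        "rainy": {"sunny": 0.4, "rainy": 0.6},
    }
}


def to_markov_chain(P: dict) -> dict:
    """With exactly one action there is nothing to choose, so the
    transition function of that action is the chain's transition matrix."""
    if len(P) != 1:
        raise ValueError("reduction shown here needs exactly one action")
    (only_action,) = P
    return P[only_action]


def propagate(chain: dict, dist: dict) -> dict:
    """Advance a distribution over states by one step of the chain."""
    out = {s: 0.0 for s in chain}
    for s, p in dist.items():
        for s2, q in chain[s].items():
            out[s2] += p * q
    return out


chain = to_markov_chain(P)
print(propagate(chain, {"sunny": 1.0, "rainy": 0.0}))
```

Once the action dimension is dropped, the usual Markov-chain tools (such as propagating a distribution forward in time) apply directly.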