C# Class AForge.MachineLearning.Sarsa

Sarsa learning algorithm.
The class provides an implementation of the Sarsa algorithm, also known as on-policy Temporal Difference control.

Public methods

Method Description
GetAction ( int state ) : int

Get next action from the specified state.

The method returns an action according to current exploration policy.

Sarsa ( int states, int actions, IExplorationPolicy explorationPolicy )

Initializes a new instance of the Sarsa class.

Action estimates are randomized when this constructor is used.

Sarsa ( int states, int actions, IExplorationPolicy explorationPolicy, bool randomize )

Initializes a new instance of the Sarsa class.

The randomize parameter specifies whether the initial action estimates are randomized with small values. Randomizing the action values can be useful when greedy exploration policies are used: it ensures that the same action is not always chosen.

UpdateState ( int previousState, int previousAction, double reward ) : void

Update the Q-function's value for the previous state-action pair.

Updates the Q-function's value for the previous state-action pair when the next state is terminal.

UpdateState ( int previousState, int previousAction, double reward, int nextState, int nextAction ) : void

Update the Q-function's value for the previous state-action pair.

Updates the Q-function's value for the previous state-action pair when the next state is non-terminal.

Method details

GetAction() public method

Get next action from the specified state.
The method returns an action according to current exploration policy.
public GetAction ( int state ) : int
state int Current state to get an action for.
Returns int
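
To illustrate what "according to the current exploration policy" can mean, here is a minimal self-contained sketch of epsilon-greedy selection over one state's action estimates. `PolicySketch` and `ChooseEpsilonGreedy` are hypothetical names used only for illustration; they are not part of AForge, where the policy is whatever `IExplorationPolicy` was passed to the constructor.

```csharp
using System;

public static class PolicySketch
{
    // Hypothetical helper: picks an action index from one state's action estimates.
    // With probability epsilon a random action is explored; otherwise the action
    // with the highest estimate is exploited (the lowest index wins on ties).
    public static int ChooseEpsilonGreedy(double[] actionEstimates, double epsilon, Random rng)
    {
        if (rng.NextDouble() < epsilon)
        {
            return rng.Next(actionEstimates.Length);
        }

        int best = 0;
        for (int i = 1; i < actionEstimates.Length; i++)
        {
            if (actionEstimates[i] > actionEstimates[best])
            {
                best = i;
            }
        }
        return best;
    }
}
```

With `epsilon` set to 0 the choice is purely greedy; with `epsilon` set to 1 every choice is random.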

Sarsa() public method

Initializes a new instance of the Sarsa class.
Action estimates are randomized when this constructor is used.
public Sarsa ( int states, int actions, IExplorationPolicy explorationPolicy )
states int Number of possible states.
actions int Number of possible actions.
explorationPolicy IExplorationPolicy Exploration policy.

Sarsa() public method

Initializes a new instance of the Sarsa class.
The randomize parameter specifies whether the initial action estimates are randomized with small values. Randomizing the action values can be useful when greedy exploration policies are used: it ensures that the same action is not always chosen.
public Sarsa ( int states, int actions, IExplorationPolicy explorationPolicy, bool randomize )
states int Number of possible states.
actions int Number of possible actions.
explorationPolicy IExplorationPolicy Exploration policy.
randomize bool Whether to randomize the action estimates.
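
The effect of the randomize flag can be sketched with a small stand-alone table builder. `EstimateInitSketch` and `Create` are hypothetical names, and the 0.1 scale for the "small values" is an assumption made for illustration, not a value taken from this reference.

```csharp
using System;

public static class EstimateInitSketch
{
    // Hypothetical helper: builds a states x actions table of action estimates.
    // When randomize is true, each entry gets a small random value in [0, 0.1);
    // the 0.1 scale is an assumption chosen for illustration.
    public static double[][] Create(int states, int actions, bool randomize, Random rng)
    {
        var estimates = new double[states][];
        for (int s = 0; s < states; s++)
        {
            estimates[s] = new double[actions]; // zero-initialized by default
            if (randomize)
            {
                for (int a = 0; a < actions; a++)
                {
                    estimates[s][a] = rng.NextDouble() / 10.0;
                }
            }
        }
        return estimates;
    }
}
```

With an all-zero table, a purely greedy selection that breaks ties by lowest index would return action 0 in every state until an update changes the table; small random values break those ties.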

UpdateState() public method

Update the Q-function's value for the previous state-action pair.
Updates the Q-function's value for the previous state-action pair when the next state is terminal.
public UpdateState ( int previousState, int previousAction, double reward ) : void
previousState int Previous state.
previousAction int Action that led from the previous state to the next state.
reward double Reward received by taking the specified action from the previous state.
Returns void
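
In textbook SARSA notation, the terminal overload corresponds to the usual update with the value of the next state-action pair taken as zero, which is why it needs no nextState or nextAction argument. The symbols below (learning rate \alpha, discount factor \gamma) follow the standard formulation and are assumptions here; this reference does not show how the class names or exposes them.

```latex
% General SARSA update for a transition (s, a, r, s', a'):
Q(s, a) \leftarrow Q(s, a) + \alpha \bigl[ r + \gamma\, Q(s', a') - Q(s, a) \bigr]

% Terminal next state: there is no future value, so Q(s', a') = 0 and the update reduces to
Q(s, a) \leftarrow Q(s, a) + \alpha \bigl[ r - Q(s, a) \bigr] = (1 - \alpha)\, Q(s, a) + \alpha\, r
```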

UpdateState() public method

Update the Q-function's value for the previous state-action pair.
Updates the Q-function's value for the previous state-action pair when the next state is non-terminal.
public UpdateState ( int previousState, int previousAction, double reward, int nextState, int nextAction ) : void
previousState int Previous state.
previousAction int Action that led from the previous state to the next state.
reward double Reward received by taking the specified action from the previous state.
nextState int Next state.
nextAction int Next action.
Returns void
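
The two overloads can be sketched together in a small stand-alone class. This is a minimal illustration of the textbook SARSA rule, not the AForge source: `TabularSarsaSketch`, `GetEstimate`, and the `learningRate` and `discountFactor` constructor parameters are hypothetical names introduced for the example.

```csharp
using System;

// Hypothetical stand-in that mirrors the two documented overloads;
// it follows the textbook SARSA rule and is not the AForge source.
public class TabularSarsaSketch
{
    private readonly double[,] q;
    private readonly double learningRate;   // alpha (assumed name)
    private readonly double discountFactor; // gamma (assumed name)

    public TabularSarsaSketch(int states, int actions, double learningRate, double discountFactor)
    {
        q = new double[states, actions]; // all estimates start at zero
        this.learningRate = learningRate;
        this.discountFactor = discountFactor;
    }

    public double GetEstimate(int state, int action)
    {
        return q[state, action];
    }

    // Terminal case: the target is the reward alone.
    public void UpdateState(int previousState, int previousAction, double reward)
    {
        double old = q[previousState, previousAction];
        q[previousState, previousAction] = old + learningRate * (reward - old);
    }

    // Non-terminal case: the target bootstraps from the estimate of the action
    // actually chosen in the next state, which is what makes the method on-policy.
    public void UpdateState(int previousState, int previousAction, double reward,
                            int nextState, int nextAction)
    {
        double old = q[previousState, previousAction];
        double target = reward + discountFactor * q[nextState, nextAction];
        q[previousState, previousAction] = old + learningRate * (target - old);
    }
}
```

In an episode loop, the next action is chosen first (for example with GetAction on the next state) and then passed to the five-argument overload; the three-argument overload is used for the final transition into a terminal state.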