Method | Description
---|---
GetAction ( int state ) : int | Gets the next action for the specified state. The method returns an action according to the current exploration policy.
Sarsa ( int states, int actions, IExplorationPolicy explorationPolicy ) : System | Initializes a new instance of the Sarsa class. Action estimates are randomized when this constructor is used.
Sarsa ( int states, int actions, IExplorationPolicy explorationPolicy, bool randomize ) : System | Initializes a new instance of the Sarsa class. The randomize parameter specifies whether initial action estimates should be randomized with small values. Randomizing action values may be useful when greedy exploration policies are used; in that case, randomization ensures that actions of the same type are not always chosen.
UpdateState ( int previousState, int previousAction, double reward ) : void | Updates the Q-function's value for the previous state-action pair when the next state is terminal.
UpdateState ( int previousState, int previousAction, double reward, int nextState, int nextAction ) : void | Updates the Q-function's value for the previous state-action pair when the next state is non-terminal.

public GetAction ( int state ) : int

Parameter | Type | Description
---|---|---
state | int | Current state to get an action for.
Returns | int |

public Sarsa ( int states, int actions, IExplorationPolicy explorationPolicy ) : System

Parameter | Type | Description
---|---|---
states | int | Number of possible states.
actions | int | Number of possible actions.
explorationPolicy | IExplorationPolicy | Exploration policy.
Returns | System |

public Sarsa ( int states, int actions, IExplorationPolicy explorationPolicy, bool randomize ) : System

Parameter | Type | Description
---|---|---
states | int | Number of possible states.
actions | int | Number of possible actions.
explorationPolicy | IExplorationPolicy | Exploration policy.
randomize | bool | Whether to randomize the initial action estimates.
Returns | System |

public UpdateState ( int previousState, int previousAction, double reward ) : void

Parameter | Type | Description
---|---|---
previousState | int | Previous state.
previousAction | int | Action that led from the previous state to the next state.
reward | double | Reward received for taking the specified action from the previous state.
Returns | void |

public UpdateState ( int previousState, int previousAction, double reward, int nextState, int nextAction ) : void

Parameter | Type | Description
---|---|---
previousState | int | Previous state.
previousAction | int | Action that led from the previous state to the next state.
reward | double | Reward received for taking the specified action from the previous state.
nextState | int | Next state.
nextAction | int | Next action.
Returns | void |
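
The difference between the two `UpdateState` overloads, and the order in which `GetAction` and `UpdateState` are typically called, may be easier to see in code. What follows is a minimal, self-contained Python sketch of the standard tabular SARSA rule, not the library's implementation. The class name `SarsaSketch`, the epsilon-greedy policy standing in for `IExplorationPolicy`, the `learning_rate` and `discount_factor` parameters, the scale of the random initial estimates, and the toy chain environment in `run_episode` are all assumptions made for illustration.

```python
import random


class SarsaSketch:
    """Tabular SARSA with method names mirroring the table above (hypothetical)."""

    def __init__(self, states, actions, epsilon=0.1, randomize=True,
                 learning_rate=0.25, discount_factor=0.95, seed=None):
        self.actions = actions
        self.epsilon = epsilon                  # assumed epsilon-greedy policy
        self.learning_rate = learning_rate      # assumed, not listed in the table
        self.discount_factor = discount_factor  # assumed, not listed in the table
        self.rng = random.Random(seed)
        # randomize=True: small random initial estimates, so a greedy policy
        # does not keep breaking ties in favour of the same action.
        self.q = [[self.rng.random() / 10 if randomize else 0.0
                   for _ in range(actions)] for _ in range(states)]

    def get_action(self, state):
        # Explore with probability epsilon, otherwise take the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.actions)
        row = self.q[state]
        return max(range(self.actions), key=row.__getitem__)

    def update_state(self, prev_state, prev_action, reward,
                     next_state=None, next_action=None):
        # Terminal next state: no future value to bootstrap from.
        target = reward
        if next_state is not None:
            # Non-terminal: add the discounted estimate of the next pair.
            target += self.discount_factor * self.q[next_state][next_action]
        q_old = self.q[prev_state][prev_action]
        self.q[prev_state][prev_action] = q_old + self.learning_rate * (target - q_old)


def run_episode(agent, goal=3, max_steps=100):
    """Toy chain world (assumed): action 1 moves right, action 0 stays put."""
    state = 0
    action = agent.get_action(state)
    for _ in range(max_steps):
        next_state = state + 1 if action == 1 else state
        if next_state == goal:
            agent.update_state(state, action, 1.0)  # terminal form
            return
        next_action = agent.get_action(next_state)
        agent.update_state(state, action, 0.0, next_state, next_action)
        state, action = next_state, next_action
```

Calling `run_episode(SarsaSketch(states=4, actions=2, seed=0))` repeatedly on the same agent moves the estimates for the "move right" action toward the goal reward. Note that the next action is chosen before the update and then actually executed, which is what makes the rule on-policy.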