C# Class Accord.MachineLearning.RouletteWheelExploration

Roulette wheel exploration policy.

The class implements the roulette wheel exploration policy. According to the policy, action a at state s is selected with the following probability:

$$p(s, a) = \frac{Q(s, a)}{\sum_{b} Q(s, b)}$$

where Q(s, a) is the estimate (usefulness) of action a at state s, and the sum in the denominator runs over all actions b available at state s.
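As a worked illustration of the formula above, assume a state s with three actions whose estimates are 1, 3 and 6 (values invented purely for this example). The selection probabilities would then be:

$$\sum_{b} Q(s, b) = 1 + 3 + 6 = 10, \qquad p(s, a_1) = \frac{1}{10} = 0.1, \quad p(s, a_2) = \frac{3}{10} = 0.3, \quad p(s, a_3) = \frac{6}{10} = 0.6$$

The probabilities sum to 1, and an action with a larger estimate is chosen proportionally more often, but no action with a positive estimate is ever excluded.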

The exploration policy may be applied only when all action estimates (usefulness values) are positive, that is, strictly greater than 0.
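The selection rule described above can be sketched in a few lines of C#. This is a minimal standalone illustration, not the library's own source: the class name RouletteWheelSketch, the explicit Random parameter, and the argument checks are assumptions made for this example.

```csharp
using System;

// Hypothetical helper for illustration only; not part of Accord.NET.
public static class RouletteWheelSketch
{
    // Returns the index of the chosen action. Index i is picked with
    // probability estimates[i] / sum(estimates).
    public static int ChooseAction(double[] estimates, Random rng)
    {
        if (estimates == null || estimates.Length == 0)
            throw new ArgumentException("At least one estimate is required.", nameof(estimates));

        double sum = 0.0;
        foreach (double q in estimates)
        {
            // The policy is only defined for strictly positive estimates
            // (this check also rejects NaN).
            if (!(q > 0.0))
                throw new ArgumentException("Estimates must be strictly positive.", nameof(estimates));
            sum += q;
        }

        // Spin the wheel: pick a point in [0, sum).
        double spin = rng.NextDouble() * sum;

        // Walk the cumulative sum until it passes the spin point.
        double cumulative = 0.0;
        for (int i = 0; i < estimates.Length; i++)
        {
            cumulative += estimates[i];
            if (spin < cumulative)
                return i;
        }

        // Guard against floating-point rounding on the last slice.
        return estimates.Length - 1;
    }
}
```

Calling RouletteWheelSketch.ChooseAction(new double[] { 1, 3, 6 }, new Random()) many times should return index 2 in roughly 60% of the draws. Rejecting non-positive estimates up front mirrors the restriction stated above; with a zero or negative estimate the slices of the wheel no longer form a valid probability distribution.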

Inheritance: IExplorationPolicy


Public Methods

Method	Description
ChooseAction ( double[] actionEstimates ) : int	Choose an action. The method chooses an action based on the provided estimates. The estimates can be any measure of an action's usefulness (expected total reward, discounted reward, etc.).
RouletteWheelExploration ( ) : System	Initializes a new instance of the RouletteWheelExploration class.

Method Details

ChooseAction() public method

Choose an action.

The method chooses an action based on the provided estimates. The estimates can be any measure of an action's usefulness (expected total reward, discounted reward, etc.).

public ChooseAction ( double[] actionEstimates ) : int
actionEstimates	double[]	Action estimates.
Returns	int

RouletteWheelExploration() public method

Initializes a new instance of the RouletteWheelExploration class.

public RouletteWheelExploration ( ) : System
Returns	System