Sequential drug decision problems in long-term medical conditions: a case study of primary hypertension. Eunju Kim BA, MA, MSc


FOR i = 1:T
    IF i == 1
        FOR s = 1:HSi
            FOR a = 1:SS
                V{i}(s,a) = sum(mTransition{i}(s,:,a).*mReward{i}(:,a));
            END
        END
    ELSE
        FOR s = 1:HSi
            FOR a = 1:SS
                V{i}(s,a) = sum(mTransition{i}(s,:,a).*mReward{i}(:,a)) + ...
                    DR^(t-1)*(mTransition{i}(s,:,a).* (:));
            END
        END
    END
    % Return the optimal value and the location of the optimal value (i.e., optimal solution) in each column (i.e., for each state).
    [(:),(:)] = max(V{i}(:,:),[],1);
END

Figure ‎3.. Pseudo-code of the DP used for the simple hypothetical case
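The recursion in the pseudo-code above (expected immediate reward plus discounted expected future value, followed by a maximisation over actions) can be illustrated with a small sketch. This is a generic backward-induction formulation written under assumptions, not the thesis's own implementation: the function name `backward_induction`, the array layouts, the per-period discount factor and the two-state example data are all hypothetical.

```python
import numpy as np


def backward_induction(P, R, discount):
    """Finite-horizon dynamic programming (illustrative sketch).

    P[t] has shape (S, S, A): P[t][s, s2, a] = Pr(s2 | s, a) in period t.
    R[t] has shape (S, A):    reward for arriving in state s2 under action a.
    Both layouts are assumptions made for this example.
    """
    T = len(P)
    S, _, A = P[0].shape
    v_next = np.zeros(S)                  # value after the final period
    values = np.zeros((T, S))
    policy = np.zeros((T, S), dtype=int)
    for t in reversed(range(T)):
        # Expected immediate reward: sum over s2 of P[s, s2, a] * R[s2, a]
        immediate = np.einsum("ija,ja->ia", P[t], R[t])
        # Expected future value: sum over s2 of P[s, s2, a] * v_next[s2]
        future = np.einsum("ija,j->ia", P[t], v_next)
        q = immediate + discount * future
        policy[t] = q.argmax(axis=1)      # best action for each state
        values[t] = q.max(axis=1)         # value of that best action
        v_next = values[t]
    return values, policy


# Hypothetical data: 2 states, 2 actions, 2 periods.
# Action 0 keeps the current state, action 1 switches state.
P_one = np.zeros((2, 2, 2))
P_one[:, :, 0] = np.eye(2)
P_one[:, :, 1] = np.array([[0.0, 1.0], [1.0, 0.0]])
# Arriving in state 1 yields a reward of 1, arriving in state 0 yields 0.
R_one = np.array([[0.0, 0.0], [1.0, 1.0]])

values, policy = backward_induction([P_one, P_one], [R_one, R_one], discount=0.5)
```

In this toy example the best action is to move to (or stay in) state 1 in every period, so `policy` selects action 1 from state 0 and action 0 from state 1.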

3.6.4 Simulated Annealing

As an example of a meta-heuristic, SA was applied to the simple hypothetical SDDP, using the same search space as the enumeration algorithm. The process started with π₁=(drug₁,drug₂,drug₃) and then iterated by randomly sampling the search space for better policies (see Figure ‎3.). No neighbourhood was defined for this simple hypothetical SDDP because the search space was very small. If a new sequential treatment policy nPolicy was better than the current policy cPolicy (initially π₁) in terms of total treatment benefit, the search re-started from the newly found policy and continued to iterate in this way until no further improvement was found. The initial temperature InitTemp was set to 1 and the cooling rate CoolSched was assumed to be 0.8, so the temperature was multiplied by 0.8 at each cooling step. The SA algorithm stopped when any of the following occurred: the temperature T reached 0.001 (StopTemp); new policies were rejected five consecutive times (MaxConsRej); five consecutive successes were recorded (MaxSuccess); or the total number of iterations reached 100 (MaxTries).
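The search described above can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the thesis's implementation: the reward function `toy_reward` is a hypothetical stand-in for the semi-Markov model, the temperature is assumed to cool after every iteration, and worse policies are assumed to be accepted with the standard Metropolis probability exp(Δ/(k·T)). Only the parameter values (initial temperature 1, cooling factor 0.8, stopping temperature 0.001, limits of 5, 5 and 100, and k = 1) come from the text.

```python
import itertools
import math
import random


def simulated_annealing(policies, reward, rng, init_temp=1.0,
                        cool=lambda t: 0.8 * t, stop_temp=0.001,
                        max_cons_rej=5, max_success=5, max_tries=100, k=1.0):
    """Maximise reward over a small, fully enumerable policy set (sketch)."""
    current = policies[0]                      # start from the first policy
    current_reward = reward(current)
    best, best_reward = current, current_reward
    temp = init_temp
    tries = cons_rej = cons_success = 0
    while (temp > stop_temp and cons_rej < max_cons_rej
           and cons_success < max_success and tries < max_tries):
        tries += 1
        candidate = rng.choice(policies)       # no neighbourhood: sample anywhere
        cand_reward = reward(candidate)
        delta = cand_reward - current_reward
        # Assumed Metropolis rule: always accept improvements, sometimes
        # accept a worse policy, less often as the temperature falls.
        if delta > 0 or rng.random() < math.exp(delta / (k * temp)):
            current, current_reward = candidate, cand_reward
            cons_success += 1
            cons_rej = 0
            if current_reward > best_reward:
                best, best_reward = current, current_reward
        else:
            cons_rej += 1
            cons_success = 0
        temp = cool(temp)                      # assumed: cool every iteration
    return best, best_reward, tries


# Hypothetical search space: every ordering of three drugs.
policies = list(itertools.permutations(("drug1", "drug2", "drug3")))
benefit = {"drug1": 1.0, "drug2": 2.0, "drug3": 3.0}   # made-up values


def toy_reward(policy):
    # Hypothetical stand-in for the model: earlier positions weigh more.
    return sum(benefit[d] * w for d, w in zip(policy, (3, 2, 1)))


best, best_reward, tries = simulated_annealing(policies, toy_reward,
                                               random.Random(0))
```

Because the function keeps track of the best policy seen so far, the returned policy is never worse than the starting policy, although with so few iterations it is not guaranteed to be the global optimum.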

% Main parameter settings.
def = struct('CoolSched',@(T) (.8*T),... % Cooling schedule.
    'InitTemp',1,...                     % Initial temperature.
    'MaxConsRej',5,...                   % Max no. of consecutive rejections.
    'MaxSuccess',5,...                   % Max no. of consecutive success.
    'MaxTries',100,...                   % Max no. of total tries.
    'StopTemp',0.001,...                 % Stopping temperature.
    'k',1);                              % Boltzmann constant.

% Initial solution and the reward from the initial policy.
cPolicy = 1; [OldReward] = function_SemiMarkov(cPolicy);

WHILE ~Finished;
    itry = itry+1; % An iteration counter.
    % Stop / decrement T criteria.
