Sequential drug decision problems in long-term medical conditions: a case study of primary hypertension. Eunju Kim, BA, MA, MSc





% Evaluate the total net benefit of each candidate drug sequence.
for a = 1:A
    Policy = SS(a,:);
    TotalReward(a,1) = function_SemiMarkov(Policy);
end

% Decide the optimal solution based on the total net benefits of the six
% sequential treatment policies.
[MaxReward, OptSol] = max(TotalReward(:,1));



Figure 3. Pseudo-code of the enumeration used for the simple hypothetical case
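The enumeration above can be sketched in Python as follows. This is an illustrative translation, not the thesis code: `semi_markov_reward` is a made-up placeholder standing in for `function_SemiMarkov`, and the drug weights are invented purely so the example runs.

```python
# Hypothetical sketch of exhaustive enumeration over drug sequences:
# score every candidate policy, then keep the one with the highest
# total net benefit.
from itertools import permutations

def semi_markov_reward(policy):
    # Placeholder for function_SemiMarkov: a toy rule that rewards
    # higher-weighted drugs used earlier in the sequence.
    weights = {"A": 3.0, "B": 2.0, "C": 1.0}
    return sum(weights[d] / (t + 1) for t, d in enumerate(policy))

# The six sequential treatment policies: all 3! orderings of three drugs.
policies = list(permutations(["A", "B", "C"]))
total_reward = [semi_markov_reward(p) for p in policies]

# Decide the optimal solution: the policy with the maximum total reward.
opt_sol = max(range(len(policies)), key=lambda a: total_reward[a])
max_reward = total_reward[opt_sol]
```

Because every policy is evaluated, enumeration is exact but scales factorially with the number of drugs, which motivates the dynamic programming approach in the next subsection.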


3.6.3 Dynamic programming


DP used Bellman's value function in Equations 3.4-3.5, where the search space was constructed from the set of individual drugs SS (see Figure 3.). To allow the transition probabilities to depend on disease history, the health state transition space HS was constructed from all possible health state combinations up to the time period considered: the number of possible health states therefore increased from 1 at t1 to 27 at t4. All matrices of potential transitions mTransition and one-step rewards mReward were calculated before the optimisation procedure started. The problem-solving procedure started from the last decision period i. Policy iteration compared the estimates of the value function V{i}(s,a), where a∈SS, for each s∈HSi and identified the best solution for each health state s in each period i. Once the optimal solution in the last period was identified, the optimal solution for the preceding period was selected assuming the optimal policy was followed thereafter. This process continued backwards until the optimal drug sequence was identified for the first time period. DP was limited in its ability to exclude infeasible solutions under the decision rules assumed in the hypothetical SDDP (i.e., no repetition of a drug after treatment failure and continued use of the current drug after treatment success): because the optimal solution was calculated backwards in time, medical and drug-use history was not known when each decision was evaluated, so health states or drugs used in previous time periods could not be considered when determining the optimal drug at the current time.


T = 3; % The number of time periods.

SS = 3; % The number of possible treatment options. They are the same in SS1, SS2, SS3 and SS4.

HS = [3^0, 3^1, 3^2, 3^3]; % The number of possible health states (with disease history) at t, where HS1=3^0, HS2=3^1, HS3=3^2 and HS4=3^3.

% Calculate all possible transition probabilities from st to st+1 by drug a∊SS using 'function_Transition'.
[mTransition{t}(s,s',a)] = function_Transition;

% Calculate all immediate rewards using 'function_Reward'.
[mReward{t}(s,a)] = function_Reward(mUtility,mCost);

DR = 0.8; % Discount rate.

% Start the problem solving procedure from the last stage i.
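The backward recursion that follows this setup can be sketched in Python. This is a minimal illustration under stated assumptions, not the thesis implementation: the rewards and transition probabilities below are invented placeholders, and the toy model is memoryless (unlike the history-dependent HS space described above); only the Bellman structure is meant to match.

```python
# Hedged sketch of backward induction for a small 3-drug, 3-period problem.
T = 3        # decision periods
S = 3        # toy health states per period (assumption: memoryless)
A = 3        # drugs in the search space SS
DR = 0.8     # discount factor

# reward[s][a]: placeholder one-step net benefit of drug a in state s.
reward = [[1.0, 0.5, 0.2],
          [0.4, 0.9, 0.3],
          [0.1, 0.2, 0.8]]
# trans[a][s][s2]: placeholder transition probabilities (uniform here).
trans = [[[1.0 / S] * S for _ in range(S)] for _ in range(A)]

def solve():
    V = [0.0] * S                 # terminal value: V_{T+1} = 0
    policy = []
    for t in reversed(range(T)):  # start from the last stage, work backwards
        Vt, Pt = [0.0] * S, [0] * S
        for s in range(S):
            # Bellman update: immediate reward plus discounted expected value.
            q = [reward[s][a]
                 + DR * sum(trans[a][s][s2] * V[s2] for s2 in range(S))
                 for a in range(A)]
            Pt[s] = max(range(A), key=lambda a: q[a])
            Vt[s] = q[Pt[s]]
        V, policy = Vt, [Pt] + policy
    return V, policy

V1, opt_policy = solve()   # period-1 values and the optimal drug per (t, s)
```

Because the recursion runs backwards from the last stage, each update sees only the current state, which is exactly the limitation noted above: history-dependent rules such as non-repetition of failed drugs cannot be enforced without expanding the state to include that history.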

