SDDPs involve patients experiencing a series of health-related and drug-related events over a finite time. Finite sets of mutually exclusive health states h∈H and drugs a∈A are defined. Note that a drug a defined here refers to a single treatment regimen, which could be a single drug (potentially with different doses) or a combination of more than one drug in circumstances where multiple drugs may be prescribed at the same time. Given a health state st∈H at time t, the selected drug xt∈A has an impact on both the next health state transition st→st+1 and the costs/benefits associated with this transition.
As the health state transitions are not fully known in advance, a transition probability function can be used to represent the probability that a patient is in state st+1 at time t+1 if drug xt is chosen for health state st at time t. For simplicity, the transition probability is often assumed to follow the Markov property, i.e., the process is memoryless: the future state st+1 depends only upon the present state st and drug xt, not on the sequence of events that preceded them. Mathematically, such an SDDP can be defined as P(T, H, HS, A, SS, DS, Ω, p, r, f), as follows:
- T=(t1,t2,…,tn) represents the time dimension with n periods at which possible changes in health state and drug use occur.
- H represents a set of l possible health states H={h1,h2,…,hl}.
- HS represents the health state space, which is the set of possible health state combinations, or disease pathways, within the time dimension, HS={θ=(s1,s2,…,sn,sn+1)}, where st∈H. Since all patients start with the same health state s1, the number of potential health state combinations within T is:
|HS| = l^n. Equation 2.1.
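As a quick check of Equation 2.1, the pathways can be enumerated directly; the sketch below uses illustrative values of l and n (not taken from the text), with health states encoded as integers:

```python
# Sketch: counting disease pathways for a toy SDDP (illustrative numbers).
# A pathway theta = (s1, ..., s_{n+1}) has n+1 states; with the starting
# state s1 fixed, the n remaining positions each take l values, giving l**n.
from itertools import product

l, n = 3, 4                      # l health states, n time periods (assumed)
H = list(range(l))               # health states h1..hl encoded as 0..l-1
s1 = 0                           # all patients start in the same state

# Enumerate every pathway (s1, s2, ..., s_{n+1}) with s1 fixed.
pathways = [(s1,) + rest for rest in product(H, repeat=n)]
print(len(pathways))             # l**n = 3**4 = 81
```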
- A represents a set of m possible drug alternatives A={a1,a2,…,am}.
- A drug switch occurs when two different drugs (ai and aj, where i≠j) are used sequentially in two adjacent time periods within T. Theoretically, the maximum number of drug switches that can occur within T is n−1 (i.e., if a drug switch occurs between every pair of adjacent time periods), but in reality the number of drug switches within T may be smaller, since the same drug ai can be used continuously for more than one time period whilst it remains effective and without any AE. In this definition, it is assumed that a drug can be used continuously for more than one time period, but a drug cannot be re-used once it has been replaced with another drug.
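The switch count can be made concrete with a small sketch (the drug names and the helper name n_switches are illustrative, not from the text); with n periods there are only n−1 adjacent pairs, hence at most n−1 switches:

```python
# Sketch: counting drug switches in a per-period drug sequence (toy data).
# A switch is two different drugs used in adjacent time periods, so a
# sequence of n periods has at most n-1 switches.
def n_switches(x):
    return sum(x[t] != x[t + 1] for t in range(len(x) - 1))

print(n_switches(("a1", "a1", "a2", "a2", "a3")))  # 2 switches over n = 5
print(n_switches(("a1", "a2", "a3", "a4", "a5")))  # 4 = n - 1, the maximum
```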
- SS represents the search space, which is the set of possible drug sequences SS={π=(d1,d2,…,dm)}, where di∈A. Each π represents one potential sequential treatment policy, i.e., a permutation of the m possible drug alternatives within A. Once T is defined, the number of possible drug sequences is:
|SS| = m!, where m≤n;
|SS| = m!/(m−n)!, where m>n.
Equation 2.2.
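Equation 2.2 can be sanity-checked by direct enumeration; the values of m and n and the helper name n_sequences below are illustrative assumptions:

```python
# Sketch: the size of the search space SS (toy numbers).
# A policy pi orders the m drugs without re-use; when m > n, only the first
# n positions can ever be reached, so orderings are truncated to length n.
from itertools import permutations
from math import factorial

def n_sequences(m, n):
    # m! when m <= n; m!/(m-n)! when m > n
    return factorial(m) if m <= n else factorial(m) // factorial(m - n)

drugs = ["a1", "a2", "a3"]
print(n_sequences(3, 4))                  # m <= n: 3! = 6
print(len(list(permutations(drugs))))     # matches the 6 permutations
print(n_sequences(5, 2))                  # m > n: 5!/3! = 20
```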
- Decision space DS={πsx=((s1,x1),(s2,x2),…,(sn,xn))} represents the set of health state transitions within T together with the associated drug used for each health state, where st∈H and xt∈A. Unlike the sequential treatment policy π=(d1,d2,…,dm), the drug sequence used across the time periods, πx=(x1,x2,…,xn), allows repetition of the same drug depending on st (i.e., a drug can be used continuously for more than one time period if the drug is effective). Depending on the decision of when to switch drugs within T, a specific π=(d1,d2,…,dm) gives rise to a set of variations πx=(x1,x2,…,xn). For example, considering three drugs 1, 2 and 3 over five time periods, one of the six possible sequential treatment policies, π=(1,2,3), has 11 variations πx during T, as follows:
πx = {(1,1,1,1,1), (1,1,1,1,2), (1,1,1,2,2), (1,1,2,2,2), (1,2,2,2,2), (1,1,1,2,3),
(1,1,2,3,3), (1,2,3,3,3), (1,1,2,2,3), (1,2,2,3,3), (1,2,2,2,3)}.
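These variations can be generated programmatically. The sketch below (the function name variations is an assumption, not from the text) encodes the rule that at each period the patient either keeps the current drug or switches to the next drug in π, and reproduces the 11 sequences listed above:

```python
# Sketch: enumerating the variations pi_x of a fixed policy pi.
# Start on pi[0]; at each subsequent period either keep the current drug
# or switch to the next drug in pi (no re-use, no skipping back).
def variations(pi, n):
    seqs = [[pi[0]]]
    for _ in range(n - 1):
        nxt = []
        for s in seqs:
            i = pi.index(s[-1])
            nxt.append(s + [pi[i]])             # keep the current drug
            if i + 1 < len(pi):
                nxt.append(s + [pi[i + 1]])     # switch to the next drug
        seqs = nxt
    return [tuple(s) for s in seqs]

v = variations((1, 2, 3), 5)
print(len(v))                  # 11
print((1, 1, 2, 2, 3) in v)    # True
```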
- Ω represents the set of transition probability matrices ω for st+1, which depend upon either st and xt alone, where the Markovian assumption is used, or upon the full history (s1,x1,s2,x2,…,st,xt), where a non-Markovian assumption is used. The probability that pathway θ=(s1,s2,…,sn,sn+1) occurs for a given drug sequence πx is:
p(θ|πx) = ω1 × ω2 × … × ωn, Equation 2.3.
where, under the Markovian assumption,
ωt = ω(st+1|st,xt). Equation 2.4.
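Under the Markovian assumption, the pathway probability is simply a product of one-step transition probabilities. A minimal sketch, in which the transition table, state names and drug names are all invented for illustration:

```python
# Sketch: Equation 2.3 under the Markov assumption, with a made-up
# transition table. omega[(s, x)] maps (current state, chosen drug) to a
# probability distribution over the next state s_{t+1}.
omega = {
    ("well", "a1"): {"well": 0.7, "sick": 0.3},
    ("sick", "a1"): {"well": 0.4, "sick": 0.6},
    ("well", "a2"): {"well": 0.8, "sick": 0.2},
    ("sick", "a2"): {"well": 0.5, "sick": 0.5},
}

def pathway_prob(theta, pi_x):
    # p(theta | pi_x) = product over t of omega(s_{t+1} | s_t, x_t)
    p = 1.0
    for t in range(len(pi_x)):
        p *= omega[(theta[t], pi_x[t])][theta[t + 1]]
    return p

theta = ("well", "well", "sick", "well")    # s1..s_{n+1} for n = 3
pi_x = ("a1", "a1", "a2")
print(round(pathway_prob(theta, pi_x), 3))  # 0.7 * 0.3 * 0.5 = 0.105
```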
- The objective function f(π) is the probability-weighted sum of the reward functions r(θ) over all possible health state combinations:
f(π) = Σθ∈HS p(θ|πx) r(θ). Equation 2.5.
In a maximisation problem, the optimal drug sequence π*∈SS satisfying f(π*)≥f(π) ∀ π∈SS is called the globally optimal solution of P(T, H, HS, A, SS, DS, Ω, p, r, f). Figure 2 illustrates the possible health state transitions and associated drug uses of a simple SDDP diagrammatically, where the Markovian assumption is made.
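Putting the pieces together, the globally optimal sequence can in principle be found by exhaustive search over SS. The sketch below is a toy instance with invented numbers; for simplicity it evaluates each policy with one πx that switches drugs every period (rather than optimising over all variations πx), and rewards each "well" state reached:

```python
# Sketch: brute-force search for pi* over a tiny SDDP (all values assumed).
# f(pi) = sum over pathways theta of p(theta | pi_x) * r(theta), where
# r(theta) counts the "well" states reached along the pathway.
from itertools import permutations, product

H = ["well", "sick"]
A = ["a1", "a2"]
n = 2
omega = {
    ("well", "a1"): {"well": 0.9, "sick": 0.1},
    ("sick", "a1"): {"well": 0.7, "sick": 0.3},
    ("well", "a2"): {"well": 0.8, "sick": 0.2},
    ("sick", "a2"): {"well": 0.2, "sick": 0.8},
}
reward = {"well": 1.0, "sick": 0.0}      # partial reward for the state reached

def f(pi_x, s1="well"):
    total = 0.0
    for tail in product(H, repeat=n):    # all pathways theta with s1 fixed
        theta, p, r = (s1,) + tail, 1.0, 0.0
        for t in range(n):
            p *= omega[(theta[t], pi_x[t])][theta[t + 1]]
            r += reward[theta[t + 1]]
        total += p * r
    return total

best = max(permutations(A, n), key=f)    # exhaustive search over SS
print(best, round(f(best), 3))           # ('a2', 'a1') 1.66
```

Here drug a1 rescues "sick" patients more reliably, so the search prefers using a2 first and keeping a1 in reserve; even this toy case shows why sequencing, not just drug choice, matters.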
1) st represents the health state at t; xt represents the drug choice at t; ω(st+1|st,xt) represents the transition probability from st to st+1 where xt is used; r(st+1|st,xt) represents the partial reward where xt is used for st at t; r(θ) represents the sum of the partial rewards of using drug sequence πx along a specific disease pathway θ=(s1,s2,…,sn,sn+1).
Figure 2. Graphical presentation of an SDDP with the Markovian assumption