Chapter 8. Discussion
This chapter summarises what was achieved in this study. The advantages and disadvantages of the proposed methods, and the potential trade-offs between model validity and computational complexity, are discussed based on the model results in the previous chapter. Implications of the hypertension SDDP analysis are discussed from both a methodological and a decision-maker's perspective. Future research directions are suggested based on the limitations of this study.
8.2 Summary of research
This thesis was concerned with solving large and complex SDDPs associated with long-term medical conditions, using simulation-based optimisation approaches. The nature of the SDDP was described and defined mathematically. A classification of modelling structures for economic evaluation was proposed to guide the appropriate selection of modelling approaches for SDDPs in the context of the economic evaluation of long-term medical conditions. A hypothetical SDDP case-study was undertaken to test the proposed model structures. Lastly, the proposed methods were applied to a comprehensive case study of SDDP in primary hypertension.
This research found that the computational complexity of SDDPs arises from a range of factors: 1) the number of relevant health states, 2) the number of potential drug treatment options, 3) the number of times that a treatment change may occur, particularly where a time-sliced modelling approach is adopted, 4) whether the transition probabilities between health states depend on historic health states and drug use and 5) the relevant clinical rules to be incorporated (e.g., contraindication of certain drugs in the presence of certain health states). This suggests that, when modelling such situations, there is a trade-off between computational complexity and model validity. That is, increasing the number of health states, treatment options and time periods, considering the patient's medical history and incorporating real-world treatment decision rules into the model may improve model validity, but will inevitably increase the computational complexity of the underlying evaluation model. A related trade-off exists between the effort devoted to the underlying evaluation model and to the optimisation model. Given limited time to spend on solving the overall SDDP, one option is to apply a more sophisticated optimisation algorithm to a simpler evaluation model. However, it is also important to capture those elements of complexity that make the SDDP approach necessary, to ensure its added value is realised. In that case, a more efficient optimisation algorithm that guarantees a good solution while saving computational time is required.
Because classic mathematical programming has limited capacity for dealing with the complexities required by SDDPs, this study proposed approximate optimisation methods using simulation. Successive decision tree and semi-Markov models can be used as the underlying evaluation model if the programming software selected is fast and capable of handling the partially-relaxed Markovian assumption of the SDDP. However, conventional software for decision tree and Markov models, such as Microsoft Excel with Visual Basic for Applications (VBA) or TreeAge (2015 TreeAge Software, Inc), is limited in its ability to build the underlying evaluation model for an SDDP and to link the evaluation model with the optimisation model. The hypertension SDDP model was therefore built in Matlab, which enabled fast and efficient matrix calculation and provided the flexibility to store historical information and reuse it within the framework of the decision tree and the Markov model. Matlab also facilitates parallel computation, which reduced the computational time substantially.
Using historical information to guide treatment choice naturally extends the standard memoryless decision tree or Markov model and increases the computational complexity. In the hypertension SDDP model, state aggregation was employed to construct a fully observable successive decision tree with reduced computational complexity. State aggregation alleviates the curse of dimensionality by pooling states that are similar with respect to transition probabilities or rewards. This allowed all the information necessary to calculate the transition probabilities to be stored and used to determine the optimal solution in the next period. Setting a maximum number of drug switches on clinical grounds also contributed greatly to reducing the computational complexity and time where the number of health states would otherwise increase exponentially.
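The idea of state aggregation can be sketched as follows. This is an illustrative example only, written in Python rather than the Matlab used for the actual model; the SBP band cut-points, the state encoding and the cap of three drug switches are assumptions for illustration, not the thesis model's actual definitions.

```python
# Illustrative sketch of state aggregation (hypothetical encoding):
# detailed states that behave alike with respect to transition
# probabilities are pooled into one aggregate state, shrinking the
# state space while retaining the information needed for decisions.

def aggregate_state(sbp, n_prior_drugs):
    """Map a detailed (SBP level, number of prior drugs) pair to a
    coarse aggregate state. Cut-points are assumed for illustration."""
    if sbp < 140:
        band = "controlled"
    elif sbp < 160:
        band = "stage1"
    else:
        band = "stage2"
    # Capping the drug-history dimension mirrors the model's maximum
    # number of drug switches, which bounds state-space growth.
    switches = min(n_prior_drugs, 3)
    return (band, switches)

# Two distinct detailed histories collapse to the same aggregate state:
assert aggregate_state(145, 1) == aggregate_state(155, 1) == ("stage1", 1)
```

Because transitions and rewards are assumed similar within each band, the aggregate state retains what is needed to compute transition probabilities without enumerating every detailed history.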
The choice of search method in the hypertension SDDP optimisation model depends on whether it is possible to search all possible solutions in a bounded time. This study did not attempt to define the bounded time explicitly, because it is problem-specific and constitutes a separate research area in computer science, which is beyond the scope of this thesis. The total computational time can be approximated by testing a small number of policies and calculating the computational time per policy. For example, for a decision problem with 5,000 possible policies and a computational time per policy (including PSA runs) of 1 hour, the total computational time, which increases linearly with the number of policies, will be approximately 5,000 hours (i.e., 208.33 days). This approach appears naïve but enables an informed judgement about which approach is feasible to develop first. If the estimated computational time of enumeration is not prohibitive, enumeration can guarantee the optimal solution; otherwise, faster and more efficient computational options should be considered. If parallel computation is possible (e.g., using 12 processors, as in the hypertension SDDP model), the computational time of enumeration can be reduced to 416.67 hours (i.e., 17.36 days). In most cases, however, enumeration remains infeasible; thus, a range of heuristics or meta-heuristics can be used, as introduced in this thesis. Although they do not guarantee the optimal solution, the results of the hypertension SDDP model showed that SA and GA are capable of identifying good solutions in reasonable computational times where the given decision problem is large and complex.
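The back-of-envelope calculation above can be expressed as a small helper. This is a sketch (in Python, for illustration only), assuming, as the text does, that run time scales linearly with the number of policies and divides evenly across processors:

```python
# Sketch: approximate total enumeration time from a small pilot run,
# assuming linear scaling in the number of policies and an even split
# of work across processors.

def estimate_enumeration_time(n_policies, hours_per_policy, n_processors=1):
    """Return the approximate wall-clock time (hours, days) to
    enumerate all policies."""
    total_hours = n_policies * hours_per_policy / n_processors
    return total_hours, total_hours / 24

# Worked example from the text: 5,000 policies at 1 hour each.
hours, days = estimate_enumeration_time(5000, 1.0)        # 5000 h, 208.33 days
par_hours, par_days = estimate_enumeration_time(5000, 1.0, 12)  # 416.67 h, 17.36 days
```

This reproduces the figures in the text and makes it easy to re-run the feasibility check for a different problem size or processor count.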
As a general rule, the performance of SA depends on the choice of a number of key tuning parameters, such as the initial and stopping temperatures, the neighbourhood structure, the cooling schedule and the maximum number of iterations within a single temperature. Searching within a neighbourhood of the current solution is a useful compromise, but there may be an implicit trade-off between the computational time spent searching the decision space and the quality of the solution. The hypertension SDDP model indicated that SA performed better where a slow cooling schedule was applied, which agrees with general findings in the SA literature. Unfortunately, slow cooling schedules may not be feasible in some applications because they increase the computational time; therefore, faster cooling schedules are often adopted in SA applications instead[208].
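The tuning parameters listed above can be made concrete with a minimal SA sketch. This is a generic, illustrative implementation in Python, not the hypertension SDDP model's algorithm; the objective, neighbourhood and parameter values are hypothetical.

```python
import math
import random

# Minimal simulated-annealing sketch (illustrative only). The key
# tuning parameters named in the text appear as arguments: initial and
# stopping temperatures, cooling rate and iterations per temperature.
def simulated_annealing(objective, initial, neighbour,
                        t_start=10.0, t_stop=0.01,
                        cooling=0.95, iters_per_temp=20, seed=0):
    rng = random.Random(seed)
    current = best = initial
    f_cur = f_best = objective(initial)
    t = t_start
    while t > t_stop:
        for _ in range(iters_per_temp):
            cand = neighbour(current, rng)
            f_cand = objective(cand)
            # Always accept improvements; accept worse moves with a
            # probability that falls as the temperature cools.
            if f_cand <= f_cur or rng.random() < math.exp((f_cur - f_cand) / t):
                current, f_cur = cand, f_cand
                if f_cur < f_best:
                    best, f_best = current, f_cur
        t *= cooling  # geometric cooling schedule
    return best, f_best

# Toy usage: minimise (x - 3)^2 over the integers 0..10.
best, value = simulated_annealing(
    objective=lambda x: (x - 3) ** 2,
    initial=0,
    neighbour=lambda x, rng: min(10, max(0, x + rng.choice([-1, 1]))))
```

A `cooling` value closer to 1 gives the slow schedule that performed better in the hypertension model, at the cost of more temperature levels and hence more objective evaluations.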
The strength of GA lies in maintaining a population of solutions and recombining good solutions to obtain new ones that improve, on average, over the generations; it is therefore important to select the population for the next generation and to mate its members in an intelligent manner. In particular, the hypertension SDDP model showed that selecting suitable crossover and mutation rates is important to balance preserving the good elements of the population against maintaining its diversity. Population size also affects the performance of the GA. If there are too many policies in the population, the algorithm slows down and the computational time becomes long; if the population size is too small, the algorithm may converge prematurely without finding an appropriate solution.
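The roles of the crossover rate, mutation rate and population size can be seen in a minimal GA sketch. Again this is generic and illustrative (Python, bit-string encoding, tournament selection), not the encoding or operators of the hypertension SDDP model.

```python
import random

# Minimal genetic-algorithm sketch (illustrative only). The crossover
# rate, mutation rate and population size discussed in the text appear
# as tunable arguments.
def genetic_algorithm(fitness, n_bits=10, pop_size=30,
                      crossover_rate=0.8, mutation_rate=0.02,
                      generations=50, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]

    def tournament(pop):
        # Binary tournament: keep the fitter of two random members.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        nxt = []
        while len(nxt) < pop_size:
            p1, p2 = tournament(pop), tournament(pop)
            if rng.random() < crossover_rate:   # single-point crossover
                cut = rng.randint(1, n_bits - 1)
                c1, c2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            else:
                c1, c2 = p1[:], p2[:]
            for child in (c1, c2):              # bit-flip mutation
                for i in range(n_bits):
                    if rng.random() < mutation_rate:
                        child[i] = 1 - child[i]
                nxt.append(child)
        pop = nxt[:pop_size]
    return max(pop, key=fitness)

# Toy usage: "one-max" -- maximise the number of 1-bits.
best = genetic_algorithm(fitness=sum)
```

A high crossover rate exploits good building blocks, while the small mutation rate re-injects diversity; shrinking `pop_size` speeds each generation but raises the risk of the premature convergence noted above.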
The case-study of primary hypertension illustrates general rules in the relationship between the key parameter settings and the performance of SA and GA. However, it does not guarantee that the parameter settings used in the hypertension SDDP model will be equally effective for other SDDPs, because such settings are problem-specific. In the absence of an enumeration gold standard, it is more difficult to judge the best parameter settings for a given problem. In this case, a general approach to finding the most effective set of parameters is to compare the performance of scenarios with many different parameter sets[376-379]. The tuning procedure can start from values previously used for similar problems (as the hypertension SDDP model did). Varying the parameter values will show whether the performance of SA or GA is sensitive to the parameter setting and, if so, which setting leads to the best performance. For a large and complex problem, the tuning procedure usually starts with a smaller prototype of the problem and then gradually scales up with an improved set of parameters, because tuning on the full-scale problem could be prohibitively lengthy.
The RL approach applied here works alongside the decomposition method. The strength of RL is that it combines the strengths of both the forward method (i.e., simulation) and the backward method (i.e., DP). The backward approach provides the globally optimal solution by balancing the immediate reward and the future reward, although there is a concern about a logical conflict between backward computation and the nature of real clinical decision-making, which is based on the patient's medical history. In contrast, the forward approach is able to use historical information to identify the optimal solution in the next period, but decision-making based on step-by-step incremental computation cannot consider the impact of future situations on the optimal solution. Using both forward and backward mechanisms enables RL to progress toward a desirable goal by repeatedly observing future situations and actions. Another benefit of RL is that it reduces the size of the decision space and allows more freedom in drug choice than constructing a search space with pre-set drug sequences, although this did not lead to a better solution than the other applied heuristic methods in the case of primary hypertension.
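The combination of forward simulation and backward value propagation can be illustrated with a minimal Q-learning sketch. The toy MDP below is entirely hypothetical (a short chain with a reward at the terminal state), not the hypertension model; it only shows how each forward, simulated step bootstraps from a backward, DP-style lookahead via max over Q(s', a).

```python
import random

# Minimal Q-learning sketch (illustrative only). Each update blends a
# forward simulated transition with a backward bootstrap through
# max_a Q(s', a), mirroring the forward/backward combination in the text.
def q_learning(n_states=5, n_actions=2, episodes=2000,
               alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        while s < n_states - 1:          # terminal state: n_states - 1
            # Epsilon-greedy forward step (simulation).
            a = (rng.randrange(n_actions) if rng.random() < epsilon
                 else max(range(n_actions), key=lambda a: Q[s][a]))
            # Toy dynamics: action 1 advances, action 0 stays; a reward
            # of 1 is received on reaching the terminal state.
            s_next = s + 1 if a == 1 else s
            r = 1.0 if s_next == n_states - 1 else 0.0
            # Backward component: bootstrap from the best future value.
            Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])
            s = s_next
    return Q

Q = q_learning()
```

After training, the learned values favour the advancing action in every state, i.e. the future reward has been propagated backwards through the forward trajectories.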
Two types of randomness exist in SDDPs: randomness in evaluating the value of the objective function and randomness in the heuristic search. The former comes from the underlying evaluation model, which describes the problem using random variables with known probability distributions (e.g., distributions for the initial SBP level and the SBP-lowering effects). In this case, the objective function depends strongly on the probabilistic structure of the model; thus, the final value of the objective function is usually estimated by averaging over a number of replications generated by Monte Carlo simulation. In theory, increasing the sample size reduces the variance of the estimated objective function value, in line with the statistical properties of Monte Carlo sampling; however, multiple replications of simulation runs may be extremely time-consuming where the given problem is large and complex.
The latter randomness relates to the stochastic search, which may not be purely random but may be biased by the applied heuristic rules. Random search allows spontaneous moves into unsearched areas in pursuit of a better solution. Due to the stochastic nature of meta-heuristics, there is always the possibility that the optimal solution lies in an unexplored region; thus, the optimal or near-optimal solutions obtained from meta-heuristics should be treated as probabilistic.