The hypertension SDDP model applied the standard forms of SA, GA and RL. However, there is evidence that the performance of a heuristic can be improved by modifying it and/or integrating it with other heuristics. For SA, the average performance can be improved by extending the neighbourhood[390-392] or by sampling locally optimal configurations in a more efficient way[393, 394]. A dynamic cooling rate, which decreases linearly or geometrically, can be applied to increase the speed of convergence without compromising the solution quality[180, 395]. There is also increasing interest in parallel implementations, which use a number of parallel processors during the annealing procedure[396-398]. Parallel processing can be used to generate different chains of solutions at the same temperature, or to allow all the processors to test several random neighbours for acceptance independently[53]; this blurs the structural difference between a single-solution based method and population-based meta-heuristics.
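The linear and geometric cooling mentioned above can be illustrated with a minimal sketch. It assumes that the quantity being decreased is the SA temperature and that moves are accepted with the usual Metropolis rule for a minimisation problem; the function names and parameters (`t0`, `step`, `alpha`) are hypothetical and are not taken from the hypertension SDDP model.

```python
import math


def linear_cooling(t0, step, k):
    """Temperature after k iterations when a fixed amount is subtracted each time."""
    return max(t0 - step * k, 0.0)


def geometric_cooling(t0, alpha, k):
    """Temperature after k iterations when multiplied by alpha (0 < alpha < 1) each time."""
    return t0 * alpha ** k


def acceptance_probability(delta, temperature):
    """Metropolis rule (assumed): delta is the new cost minus the current cost."""
    if delta <= 0:
        return 1.0  # improving or equal moves are always accepted
    if temperature <= 0:
        return 0.0  # at zero temperature, worsening moves are never accepted
    return math.exp(-delta / temperature)
```

With the same starting temperature, the geometric schedule drops quickly at first and then flattens, whereas the linear schedule reaches zero after a fixed number of iterations; which of the two suits a given problem is an empirical question.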
To enhance its performance, GA often makes use of a local search method as a form of selection, crossover and mutation. The strength of a local search method is that it explores a promising area of the search space in a more structured way, whereas the strength of GA is that it identifies promising areas of the search space. When GA is hybridised with local search, the local search improves each solution until a local optimum is reached, and the GA operators then sample the search space with new solutions, which are in turn processed by the local search[399-401]. Independent sub-populations of chromosomes can also be used as parallel processors to explore the search space[402-404]. In this case, the sub-populations constitute the local mating pools in a population, and the best individuals in the sub-populations are redistributed to the next generation.
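The alternation between GA operators and local search described above can be sketched as follows. This is an illustrative toy only: it assumes bit-string solutions, a "count the ones" fitness, hill climbing as the local search and elitist survivor selection, none of which comes from the hypertension SDDP model, and all function names are hypothetical.

```python
import random


def local_search(x, fitness, neighbours):
    """Hill-climb from x until no neighbour has a higher fitness (a local optimum)."""
    while True:
        best = max(neighbours(x), key=fitness)
        if fitness(best) <= fitness(x):
            return x
        x = best


def memetic_ga(fitness, neighbours, random_solution, crossover, mutate,
               pop_size=10, generations=20, seed=0):
    """GA in which every new individual is refined by local search."""
    rng = random.Random(seed)
    population = [local_search(random_solution(rng), fitness, neighbours)
                  for _ in range(pop_size)]
    for _ in range(generations):
        offspring = []
        for _ in range(pop_size):
            p1, p2 = rng.sample(population, 2)            # pick two parents
            child = mutate(crossover(p1, p2, rng), rng)   # GA operators sample the space
            offspring.append(local_search(child, fitness, neighbours))
        # Keep the best pop_size individuals of parents and offspring (elitist, assumed).
        population = sorted(population + offspring, key=fitness, reverse=True)[:pop_size]
    return population[0]


# Toy problem (assumed): maximise the number of ones in an 8-bit tuple.
def onemax(bits):
    return sum(bits)


def flip_neighbours(bits):
    return [bits[:i] + (1 - bits[i],) + bits[i + 1:] for i in range(len(bits))]


def random_bits(rng, n=8):
    return tuple(rng.randint(0, 1) for _ in range(n))


def uniform_crossover(a, b, rng):
    return tuple(rng.choice(pair) for pair in zip(a, b))


def bit_mutate(bits, rng, rate=0.1):
    return tuple(1 - b if rng.random() < rate else b for b in bits)
```

On this toy problem hill climbing alone already reaches the optimum, so the example shows only the control flow; on a multimodal problem the GA operators would be what moves the search between basins of attraction.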
The performance of SA and GA can also be improved by changing the structure of the neighbourhood. In the current hypertension SDDP model, the choice of neighbourhood for SA and the crossover rule for GA were based on the numbering of the drug sequences in their list order. However, it was noted that some movements within the neighbourhood did not reflect the actual similarity of the drugs used, particularly where the neighbourhood was large (i.e., where the size of the neighbourhood was 100 or 200 in SA) or for policies located near the borderline between policies starting with a different initial drug (e.g., policy number 1032, which starts with D followed by CCB+ACEI/ARB, BB+CCB+ACEI/ARB and D+CCB+ACEI/ARB, and policy number 1033, which starts with BB followed by D, CCB and ACEI/ARB). For SA, an alternative analysis would regard two drug sequences X1-X2-X3-X4 and Y1-Y2-Y3-Y4 as neighbours if exactly three of the equations X1=Y1, X2=Y2, X3=Y3, X4=Y4 are satisfied. A wider neighbourhood would allow two or three of the equations to be satisfied. In GA, a possible offspring of the two sequences given might be X1-Y2-Y3-X4; there are 14 different ways of choosing Xs and Ys here such that the offspring is not the same as either parent.
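The position-wise neighbourhood and the count of 14 offspring can be checked with a short sketch. Sequences are represented here as tuples of drug labels, which is an assumed encoding, and the function names are hypothetical.

```python
from itertools import product


def matches(x, y):
    """Number of positions at which the two drug sequences agree."""
    return sum(a == b for a, b in zip(x, y))


def are_neighbours(x, y, min_matches=3):
    """Neighbours if at least min_matches positions agree, but not all of them.

    min_matches=3 gives the 'exactly three equations' rule for four-drug
    sequences; min_matches=2 gives the wider 'two or three' neighbourhood.
    """
    return min_matches <= matches(x, y) < len(x)


def offspring(x, y):
    """All position-wise mixes of x and y that differ from both parents."""
    children = set(product(*zip(x, y)))  # at each position take the drug from x or from y
    return sorted(children - {tuple(x), tuple(y)})


parent_x = ("X1", "X2", "X3", "X4")
parent_y = ("Y1", "Y2", "Y3", "Y4")
```

When the parents differ at every position, each of the four positions can come from either parent, giving 2^4 = 16 combinations, and removing the two parents leaves 14. If the parents share a drug at some position, fewer distinct offspring are possible.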
The RL applied in the hypertension SDDP model uses a look-up table, which stores the Q-values associated with every state and action, and selects the action that yields the best Q-value for each state. For large-scale problems with millions of state-action pairs, however, a look-up table is inefficient, not only because of the memory required but also because of the time and data needed to fill it accurately[200]. To overcome this problem, function approximation using regression is commonly used. For example, a regression model is fitted to the objective function using the available data related to the function, and its predictions are then used to find a better solution.
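As a rough illustration of replacing the look-up table with a fitted function, the sketch below approximates the Q-value by a linear function of state-action features and adjusts the weights by one gradient step on the squared prediction error. The linear form, the feature vector and the parameters `alpha` and `gamma` are assumptions made for illustration; the source does not specify which regression method would be used.

```python
def q_value(weights, features):
    """Approximate Q-value: a weighted sum of the state-action features."""
    return sum(w * f for w, f in zip(weights, features))


def update_weights(weights, features, reward, next_q_max, alpha=0.1, gamma=0.9):
    """One gradient step moving the prediction towards the Q-learning target.

    reward      -- immediate reward observed after taking the action
    next_q_max  -- best approximate Q-value available in the next state
    alpha       -- learning rate (assumed value)
    gamma       -- discount factor (assumed value)
    """
    target = reward + gamma * next_q_max
    error = target - q_value(weights, features)
    return [w + alpha * error * f for w, f in zip(weights, features)]
```

Only one weight per feature is stored, rather than one entry per state-action pair, so memory no longer grows with the number of pairs; the price is that the quality of the approximation depends on how well the chosen features describe the problem.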