Abstract:
Survival analysis methods describe and predict the time until an event occurs. The models account for censoring, in which the true event time is unknown because the observation was withdrawn from the study. Statistical methods assume that censoring is non-informative, that is, the reason an observation is withdrawn is unrelated to the event under study. This paper investigates how informative censoring affects the performance of statistical methods. In particular, the log-rank test, which is used to compare hazard functions, has low sensitivity for small samples or multimodal event time distributions. To overcome these shortcomings, we propose a method for computing regularized criteria that use a priori information about the distribution of events over time and evaluate the differences between hazard functions at all time points. The regularization method was integrated into the survival tree method and improved prediction quality on four medical datasets. The proposed method also outperformed the existing statistical methods and an existing survival tree implementation on all datasets.
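
For orientation, the standard two-sample log-rank statistic referred to above can be sketched as follows. This is the textbook, unregularized statistic, not the regularized criteria proposed in this paper; the function name `logrank_statistic` and the toy data are illustrative assumptions.

```python
import numpy as np


def logrank_statistic(time, event, group):
    """Textbook two-sample log-rank chi-square statistic (1 degree of freedom).

    time  : observed times (event or censoring time)
    event : 1 if the event was observed, 0 if the observation was censored
    group : 0/1 group label for each observation
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    group = np.asarray(group, dtype=bool)

    observed_minus_expected = 0.0
    variance = 0.0
    # Iterate over distinct times at which at least one event occurred.
    for t in np.unique(time[event]):
        at_risk = time >= t                  # risk set just before time t
        n = at_risk.sum()                    # total at risk
        n1 = (at_risk & group).sum()         # at risk in group 1
        events_at_t = (time == t) & event
        d = events_at_t.sum()                # events at t, both groups
        d1 = (events_at_t & group).sum()     # events at t in group 1

        # Observed minus expected events in group 1 under the null hypothesis.
        observed_minus_expected += d1 - d * n1 / n
        # Hypergeometric variance; undefined (and skipped) when n == 1.
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0
    return observed_minus_expected ** 2 / variance


# Toy data (hypothetical): group 1 tends to have later events; one censored case.
times = [2, 3, 5, 7, 8, 11, 12, 15]
events = [1, 1, 1, 0, 1, 1, 1, 1]
groups = [0, 0, 0, 0, 1, 1, 1, 1]
print(logrank_statistic(times, events, groups))
```

The statistic sums observed-minus-expected event counts over all event times with equal weight, which is one reason it can lose sensitivity when samples are small or the group differences are concentrated in only part of the time axis.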