Statistical issues in survival analysis (Part XVVVI)

2 min read · Apr 29, 2024

April 24, 2024

The motivation was the assessment of continuous risk scores based on biomarker distributions. The authors defined the precision-recall curve as a plot of the true positive rate (also known as recall or sensitivity) against the positive predictive value (also known as precision, the conditional probability that a subject with a positive test result actually has the disease) over all possible cutoff values. The curve can be summarized by the area under it. It has been put forth as a competitor to the ROC curve and its AUC, which are thought to overestimate performance. The existing method has two issues. First, it assumes that the event status is known for all subjects in the study, which is not the case under censoring. Second, it assumes independent censoring. Their paper therefore aimed to address these limitations by proposing a novel nonparametric estimation method for the time-dependent precision-recall curve and the area under it.
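To make the definition concrete, here is a minimal sketch of a precision-recall curve for the simple case where disease status is fully observed (no censoring). The function names, the toy data, and the "score >= cutoff" positivity rule are assumptions for illustration, and the trapezoidal rule is only one of several ways to summarize the area; none of this is taken from the paper.

```python
def precision_recall_points(scores, labels):
    """Return (cutoff, recall, precision) triples, highest cutoff first.

    A subject tests positive when its score is >= the cutoff; labels are
    1 (diseased) or 0 (not diseased) and are assumed fully observed.
    """
    n_diseased = sum(labels)
    if n_diseased == 0:
        raise ValueError("need at least one diseased subject")
    points = []
    for cutoff in sorted(set(scores), reverse=True):
        # Disease labels of the subjects flagged positive at this cutoff.
        flagged = [y for s, y in zip(scores, labels) if s >= cutoff]
        true_pos = sum(flagged)
        recall = true_pos / n_diseased        # true positive rate
        precision = true_pos / len(flagged)   # positive predictive value
        points.append((cutoff, recall, precision))
    return points


def area_under_pr(points):
    """Trapezoidal area over the recall range covered by the points."""
    area = 0.0
    for (_, r0, p0), (_, r1, p1) in zip(points, points[1:]):
        area += (r1 - r0) * (p0 + p1) / 2.0
    return area


# Toy data, made up for illustration only.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 1, 0, 1, 0, 0]
points = precision_recall_points(scores, labels)
area = area_under_pr(points)
for cutoff, recall, precision in points:
    print(f"cutoff={cutoff:.1f} recall={recall:.2f} precision={precision:.2f}")
print(f"trapezoidal area: {area:.3f}")
```

At the lowest cutoff every subject tests positive, so recall is 1 and precision equals the prevalence. This is why the precision-recall curve reacts to rare events in a way the ROC curve does not.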

In their methods, they proposed a time-dependent precision-recall curve, defined as a plot of the time-dependent true positive rate (TPR) against the time-dependent positive predictive value (PPV) over all possible cutoff values. They defined empirical point estimates of the TPR and PPV using weights based on the conditional survival function given the biomarker, which they estimated with the Beran estimator. They also estimated the variance of the area under the time-dependent precision-recall curve by bootstrapping, which in turn yields confidence intervals around the estimate. In simulations, they compared their method to the Yuan method, and overall their method performed well against it on several metrics.
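The general idea of weighting subjects by their estimated probability of having had the event can be sketched as follows. This is a hedged illustration, not the authors' estimator: the weights are taken as given (in practice they would come from a conditional survival estimator such as Beran's), the function names and toy numbers are hypothetical, and the bootstrap is shown for the TPR at a single cutoff rather than for the area under the curve.

```python
import random


def td_tpr_ppv(markers, weights, cutoff):
    """Weighted TPR and PPV at one cutoff for a fixed time horizon t.

    weights[i] is an assumed estimate of P(event by time t) for subject i:
    1 for an event observed before t, 0 for a subject still event-free
    at t, and a value in (0, 1) for a subject censored before t.
    Returns None when either ratio is undefined.
    """
    flagged = [w for m, w in zip(markers, weights) if m > cutoff]
    total_weight = sum(weights)
    if not flagged or total_weight == 0:
        return None
    weighted_true_pos = sum(flagged)
    tpr = weighted_true_pos / total_weight   # share of cases flagged
    ppv = weighted_true_pos / len(flagged)   # share of flagged that are cases
    return tpr, ppv


def bootstrap_ci(rows, statistic, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap interval for statistic(resampled rows)."""
    rng = random.Random(seed)
    n = len(rows)
    values = []
    for _ in range(n_boot):
        sample = [rows[rng.randrange(n)] for _ in range(n)]
        value = statistic(sample)
        if value is not None:   # skip resamples where it is undefined
            values.append(value)
    if not values:
        raise ValueError("statistic undefined in every resample")
    values.sort()
    lower = values[int((alpha / 2) * len(values))]
    upper = values[min(len(values) - 1, int((1 - alpha / 2) * len(values)))]
    return lower, upper


def tpr_at_cutoff_one(sample):
    """Statistic for the bootstrap: weighted TPR at cutoff 1.0."""
    ms, ws = zip(*sample)
    result = td_tpr_ppv(ms, ws, cutoff=1.0)
    return None if result is None else result[0]


# Toy markers and weights, made up for illustration only.
markers = [2.1, 1.7, 1.5, 1.2, 0.9, 0.6, 0.4, 0.3]
weights = [1.0, 1.0, 0.6, 0.0, 1.0, 0.2, 0.0, 0.0]
tpr, ppv = td_tpr_ppv(markers, weights, cutoff=1.0)
ci = bootstrap_ci(list(zip(markers, weights)), tpr_at_cutoff_one)
print(f"TPR={tpr:.3f} PPV={ppv:.3f} bootstrap CI for TPR={ci}")
```

One simplification to note: this sketch resamples subjects together with their fixed weights. A fuller bootstrap would re-estimate the weights from the censored data within each resample.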

In a real data analysis, they used the freely available Mayo Clinic PBC dataset. The differences between their method and the Yuan method were less distinct here than in the simulations. Their AUC estimate came out lower than the Yuan estimate, although, as they state, the difference was not statistically significant. That makes the two methods hard to tell apart on real data, even though their method did so well on several simulation metrics. Nevertheless, they produced an R package, tdPRC. Their method was developed under right censoring only, so they proposed extending it to other censoring types.

Written by,

Usha Govindarajulu, PhD

Keywords: survival analysis, nonparametric, right-censoring, time-dependent, precision-recall

References

Beyene KM, Chen D-G, and Kifle YG (2024). “A novel nonparametric time-dependent precision-recall curve estimator for right-censored survival data.” Biometrical Journal.

https://doi.org/10.1002/bimj.202300135

https://onlinelibrary.wiley.com/cms/asset/3920ba1e-62ed-49b2-83e1-f24049a2d157/bimj2572-fig-0001-m.jpg
