XGBoost Imputation for Time Series Data

Data quality plays an important role in data-driven biomedical informatics research, because the effectiveness of this research relies heavily on the completeness of the data collected. Missing values, however, are commonly encountered in practice, and they impede researchers from building accurate models and making sound decisions. Simply removing the data instances that have missing values is one candidate strategy, but it risks introducing bias or even yielding incorrect models [1]. Imputing missing values accurately is therefore a prerequisite for training good machine learning models.