Investigation on the Effect of Data Imbalance on Prediction of Liquefaction

Abstract

Data imbalance biases the learning of classification techniques. A major reason for the limited success of pattern recognition techniques in predicting liquefaction potential is the imbalance between the liquefaction and nonliquefaction data classes. This study proposes a support vector data description (SVDD) strategy to compensate for the underrepresented minority class. SVDD is used to generate virtual data points for the minority class that share the characteristics of the real (nonvirtual) samples. An adaptive neuro-fuzzy inference system (ANFIS) classifier is then employed to determine the liquefaction threshold. The ANFIS predictions are assessed using the coefficient of determination (COD) and compared with those of the Bayesian updating method. For the liquefied data, the approach is as efficient as the Bayesian method, while the recognition rate for the nonliquefied data improves substantially.
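The idea of generating virtual minority samples inside a data description boundary can be illustrated with a minimal sketch. It is not the authors' implementation: it replaces the kernel-based SVDD with a crude hypersphere whose center is the minority-class mean and whose radius is a quantile of the distances to that center. Candidate points are produced by adding small noise to real minority samples and are kept only if they fall inside the sphere. The function names, `quantile`, `noise_scale`, and `max_rounds` are all hypothetical choices made for illustration, and the features are assumed to be standardized beforehand.

```python
import numpy as np


def fit_sphere(minority, quantile=0.9):
    """Crude stand-in for SVDD (assumption): center = mean, radius = distance quantile."""
    center = minority.mean(axis=0)
    dists = np.linalg.norm(minority - center, axis=1)
    return center, np.quantile(dists, quantile)


def generate_virtual_samples(minority, n_new, noise_scale=0.1, seed=0, max_rounds=100):
    """Jitter real minority samples and keep only candidates inside the sphere."""
    rng = np.random.default_rng(seed)
    center, radius = fit_sphere(minority)
    spread = minority.std(axis=0)  # per-feature scale for the noise
    accepted, count = [], 0
    for _ in range(max_rounds):
        # Pick random real samples as seeds and perturb them.
        base = minority[rng.integers(0, len(minority), size=n_new)]
        candidates = base + rng.normal(0.0, noise_scale, size=base.shape) * spread
        # Accept only candidates that the boundary describes as minority-like.
        inside = np.linalg.norm(candidates - center, axis=1) <= radius
        accepted.append(candidates[inside])
        count += int(inside.sum())
        if count >= n_new:
            break
    # May return fewer than n_new points if max_rounds is exhausted.
    return np.concatenate(accepted)[:n_new], center, radius
```

In a pipeline like the one described above, the returned virtual points would be appended to the real minority samples before training the classifier, so that both classes are represented in comparable numbers. A kernel-based SVDD would give a tighter, non-spherical boundary than this sketch.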