Recognition Rate Versus Substitution Rate Curve: An Objective Utility Assessment Criterion of Simulated Training Data

Data augmentation is beneficial when the measured training data are insufficient to train a robust deep model. One promising technique is to use simulated data generated by physics-based engines. For example, few-shot learning of synthetic aperture radar (SAR) targets can benefit from simulated SAR images. However, the characteristics of the simulated training data significantly affect the performance of the trained model. Therefore, it is of great significance to evaluate the utility of simulated data objectively and effectively. A recognition rate versus substitution rate curve (RSC)-based assessment criterion is proposed, consisting of a substitution rate (SR)-based dataset allocation stage and an RSC-based evaluation stage. First, differential dataset allocation is performed under a progressive SR to obtain paired reference and comparison training sets. Then, in the RSC-based evaluation stage, the reference and comparison classifiers are trained under different SRs using the same network and parameter configuration. AconvNet and AlexNet are selected as the backbones of the evaluation network. In particular, k-fold cross-validation is applied to alleviate selection bias. The difference between the integrals of the RSCs is defined as the RSC score of the simulated dataset. Experiments conducted on the measured and simulated moving and stationary target acquisition and recognition (MSTAR) database demonstrate the rationality and validity of the proposed RSC criterion. Specifically, multisource simulated datasets are adopted, including adversarial autoencoder-generated and electromagnetic simulation datasets. The proposed RSC criterion shows promising utility evaluation ability, flexibility, and extensibility compared with traditional full-reference image-quality assessment criteria.
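The scoring step described above can be sketched in a few lines. This is a minimal illustration only: the SR grid, recognition-rate values, and function names are hypothetical, since the abstract specifies only that the RSC score is the difference between the integrals of the reference and comparison recognition-rate curves.

```python
import numpy as np

# Hypothetical recognition rates measured at a progressive set of
# substitution rates (SRs). The reference classifier is trained on
# measured data only; in the comparison classifier, an SR fraction of
# the measured training samples is replaced by simulated ones.
srs = np.array([0.0, 0.25, 0.5, 0.75, 1.0])               # substitution rates
rr_reference  = np.array([0.95, 0.95, 0.95, 0.95, 0.95])  # reference RSC
rr_comparison = np.array([0.95, 0.93, 0.90, 0.85, 0.78])  # comparison RSC

def _trapezoid(y, x):
    """Numerical integral of y over x by the trapezoidal rule."""
    return float(np.sum((y[1:] + y[:-1]) / 2.0 * np.diff(x)))

def rsc_score(srs, rr_ref, rr_cmp):
    """RSC score: difference between the integrals of the reference and
    comparison curves. A smaller score indicates that substituting
    simulated data degrades recognition less, i.e., higher utility."""
    return _trapezoid(rr_ref, srs) - _trapezoid(rr_cmp, srs)

print(rsc_score(srs, rr_reference, rr_comparison))
```

In practice each recognition-rate value would be the k-fold cross-validated accuracy of an AconvNet or AlexNet classifier trained at that SR; the numbers here are placeholders.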