NTCIR13 MedWeb Task: Multi-label Classification of Tweets using an Ensemble of Neural Networks

This paper describes our approach to the Medical Natural Language Processing for Web Document (MedWeb) task at NTCIR13. We used multi-language learning to integrate the task's multi-language inputs into a single neural network. We then built two neural networks with multi-language learning, a hierarchical attention network (HAN) and a deep character-level convolutional neural network (CharCNN), and combined their outputs to exploit the advantages of each. The combination was carried out by ensembling, specifically bagging. The ensemble that used the negative log-likelihood (NLL) and hinge loss functions produced the best results, with 88.0% accuracy.
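As a rough illustration of the bagging step, the sketch below draws a bootstrap sample of training indices and aggregates per-label probabilities from several ensemble members into a multi-label prediction. It is a minimal sketch under stated assumptions, not the system described here: the function names, the averaging rule, the 0.5 threshold, and the three-label example values are all hypothetical.

```python
import random
from typing import List, Sequence


def bootstrap_sample(n_examples: int, rng: random.Random) -> List[int]:
    """Draw n_examples training indices with replacement (one 'bag')."""
    return [rng.randrange(n_examples) for _ in range(n_examples)]


def aggregate(member_probs: Sequence[Sequence[float]],
              threshold: float = 0.5) -> List[int]:
    """Average per-label probabilities over ensemble members, then threshold.

    Averaging and the 0.5 threshold are assumptions made for illustration.
    """
    n_members = len(member_probs)
    n_labels = len(member_probs[0])
    mean = [sum(p[j] for p in member_probs) / n_members
            for j in range(n_labels)]
    return [1 if m >= threshold else 0 for m in mean]


# Hypothetical per-label probabilities for one tweet from two members,
# e.g. one HAN and one CharCNN, each trained on its own bootstrap sample.
han_probs = [0.9, 0.2, 0.6]
charcnn_probs = [0.7, 0.1, 0.3]
labels = aggregate([han_probs, charcnn_probs])

# One bag of indices for a hypothetical training set of 5 tweets.
bag = bootstrap_sample(5, random.Random(0))
```

Each label is decided independently here, which suits a multi-label setting where a single tweet may carry several labels at once.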