Today's Recommendations

2018 - arXiv

A Hierarchical Multi-task Approach for Learning Embeddings from Semantic Tasks

Much effort has been devoted to evaluating whether multi-task learning can be leveraged to learn rich representations that can be used in various Natural Language Processing (NLP) downstream applications. However, there is still a limited understanding of the settings in which multi-task learning has a significant effect. In this work, we introduce a hierarchical model trained in a multi-task learning setup on a set of carefully selected semantic tasks. The model is trained in a hierarchical fashion to introduce an inductive bias, supervising a set of low-level tasks at the bottom layers of the model and more complex tasks at the top layers. This model achieves state-of-the-art results on a number of tasks, namely Named Entity Recognition, Entity Mention Detection and Relation Extraction, without hand-engineered features or external NLP tools such as syntactic parsers. The hierarchical training supervision induces a set of shared semantic representations at the lower layers of the model. We show that as we move from the bottom to the top layers, the hidden states tend to represent increasingly complex semantic information.
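
To make the hierarchical supervision concrete, here is a minimal PyTorch sketch: a lower encoder layer feeds a per-token NER head, and an upper layer built on top of it feeds a sentence-level relation head. The layer types, dimensions and task heads are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of hierarchical multi-task supervision:
# a low-level task (NER) is supervised from the lower encoder layer,
# a higher-level task (relation classification) from the upper layer.
# All sizes and heads are illustrative placeholders.
import torch
import torch.nn as nn

class HierarchicalMTL(nn.Module):
    def __init__(self, vocab_size=10000, emb_dim=128, hidden=256,
                 n_ner_tags=9, n_rel_labels=5):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Bottom encoder layer: supervises the "simple" task (NER).
        self.lower = nn.LSTM(emb_dim, hidden, batch_first=True,
                             bidirectional=True)
        # Top encoder layer: consumes lower hidden states and
        # supervises the "complex" task (relation classification).
        self.upper = nn.LSTM(2 * hidden, hidden, batch_first=True,
                             bidirectional=True)
        self.ner_head = nn.Linear(2 * hidden, n_ner_tags)
        self.rel_head = nn.Linear(2 * hidden, n_rel_labels)

    def forward(self, token_ids):
        x = self.embed(token_ids)
        low, _ = self.lower(x)           # shared low-level representation
        ner_logits = self.ner_head(low)  # per-token NER supervision here
        high, _ = self.upper(low)        # richer representation on top
        # Pool over tokens for a sentence-level relation decision.
        rel_logits = self.rel_head(high.mean(dim=1))
        return ner_logits, rel_logits

model = HierarchicalMTL()
tokens = torch.randint(0, 10000, (2, 12))   # batch of 2 sentences
ner_logits, rel_logits = model(tokens)
print(ner_logits.shape, rel_logits.shape)   # (2, 12, 9) (2, 5)
```

Summing the NER and relation losses during training is what induces the shared representations the abstract describes; the lower layer is pushed toward features useful for both tasks.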

2017 - BMC Bioinformatics

A neural network multi-task learning approach to biomedical named entity recognition

Background: Named Entity Recognition (NER) is a key task in biomedical text mining. Accurate NER systems require task-specific, manually annotated datasets, which are expensive to develop and thus limited in size. Since such datasets contain related but different information, an interesting question is whether they can be used together to improve NER performance. To investigate this, we develop supervised, multi-task, convolutional neural network models and apply them to a large number of varied existing biomedical named entity datasets. Additionally, we investigate the effect of dataset size on performance in both single- and multi-task settings.

Results: We present a single-task model for NER, a multi-output multi-task model and a dependent multi-task model. We apply the three models to 15 biomedical datasets covering multiple named entity types, including Anatomy, Chemical, Disease, Gene/Protein and Species. Each dataset represents a task. The results from the single-task model and the multi-task models are then compared for evidence of benefits from multi-task learning. With the multi-output multi-task model we observed an average F-score improvement of 0.8% over the single-task model, from an average baseline of 78.4%. Although there was a significant drop in performance on one dataset, performance improved significantly on five datasets, by up to 6.3%. For the dependent multi-task model we observed an average improvement of 0.4% over the single-task model. There were no significant drops in performance on any dataset, and performance improved significantly on six datasets, by up to 1.1%. The dataset-size experiments found that as dataset size decreased, the multi-output model's performance improved relative to the single-task model's. Using 50%, 25% and 10% of the training data resulted in average drops of approximately 3.4%, 8% and 16.7% respectively for the single-task model, but only approximately 0.2%, 3.0% and 9.8% for the multi-task model.

Conclusions: Our results show that, on average, the multi-task models produced better NER results than the single-task models trained on a single NER dataset. We also found that multi-task learning is particularly beneficial for small datasets. Across the various settings the improvements are significant, demonstrating the benefit of multi-task learning for this task.
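
As an illustration of the multi-output multi-task setup described above, here is a minimal PyTorch sketch: a single shared convolutional encoder with one tagging head per dataset, so each dataset trains the shared weights plus its own head. Vocabulary size, widths and tag counts are illustrative assumptions, not the paper's exact architecture.

```python
# Minimal sketch of a multi-output multi-task NER model:
# one shared convolutional encoder, one tagging head per dataset,
# so each dataset (task) updates the shared weights plus its own head.
# Vocabulary size, widths, and tag counts are illustrative placeholders.
import torch
import torch.nn as nn

class MultiOutputNER(nn.Module):
    def __init__(self, vocab_size=10000, emb_dim=100, channels=200,
                 tags_per_dataset=(5, 3, 7)):  # e.g. Disease, Species, Gene
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Shared 1-D convolution over the token sequence.
        self.conv = nn.Conv1d(emb_dim, channels, kernel_size=3, padding=1)
        # One independent output head per dataset/task.
        self.heads = nn.ModuleList(
            nn.Linear(channels, n_tags) for n_tags in tags_per_dataset)

    def forward(self, token_ids, task_id):
        x = self.embed(token_ids).transpose(1, 2)     # (B, emb, T) for conv
        h = torch.relu(self.conv(x)).transpose(1, 2)  # back to (B, T, ch)
        return self.heads[task_id](h)                 # per-token tag logits

model = MultiOutputNER()
batch = torch.randint(0, 10000, (4, 20))   # 4 sentences from dataset 1
logits = model(batch, task_id=1)
print(logits.shape)                        # (4, 20, 3)
```

Training alternates over datasets, routing each batch through the head matching its task; the shared encoder is what lets small datasets benefit from the larger ones.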

Paper Keywords

neural network, differential equation, deep learning, convolutional neural network, software development, deep neural network, feature selection, open source, optical network, learning approach, geographic information system, distributed generation, stochastic differential equation, power allocation, open source software, deep convolutional neural network, named entity recognition, loss function, human action, space telescope, personal area network, mental health, feature learning, wireless personal area network, elastic optical network, multi-task learning, network reconfiguration, learning problem, Hubble Space Telescope, learning task, software developer, high rate, water pressure, doubly stochastic, open source tool, open source software development, free and open source, capacity allocation, low rate, open source software project, group lasso, path protection, sparse learning, backup path, low rate wireless, open source GIS, spare capacity, optimal reconfiguration, backward doubly stochastic differential equation, big bang-big crunch algorithm, survivable routing, multi-task deep learning, fine guidance sensor, multi-task learning approach, open source software tool, deep multi-task learning, pose-invariant face, high water pressure, shared backup path protection, multi-task convolutional neural network, multi-task learning model, open source GIS software, multi-task feature learning, backup path protection, approximating optimal, multi-task learning method, software movement, multi-task network, spare capacity allocation, multi-task CNN, multi-task sparse, multi-task regression, multi-task allocation, feature learning algorithm, multiple related task, multi-task multi-view, multi-task learning algorithm, soils contaminated