Learning relations using semantic-based vector similarity

The number of electronic medical documents is growing rapidly every day. Although these documents carry a great deal of information, processing them manually is becoming increasingly difficult. Our work takes small steps towards automatic knowledge extraction from medical documents using deep learning and similarity-based methods. Our goal here is to identify, in an unsupervised manner, relations between known medical concepts by employing a deep learning strategy based on Word2Vec. The current solution requires concept annotations, as it evaluates the similarities between concepts to identify the relationship between them. The experiments suggest that the strategy we considered (including part-of-speech (POS) tags as part of the information associated with concepts and relations) represents an important step towards a fully unsupervised learning strategy. Although POS tags alone are not sufficiently good predictors, adding other meta-information and sufficient training data (in both quantity and quality) may enhance the relation identification process, allowing for a meta-learning strategy.
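To make the similarity-based idea concrete, the following is a minimal sketch, not the method evaluated in this work. It assumes that each token is keyed together with its POS tag (for example `aspirin_NN`) and that a candidate relation is scored by its average cosine similarity to the two annotated concepts. The toy vectors, the `token_POS` key format, the scoring rule, and the function names `cosine` and `rank_relations` are all illustrative assumptions; in practice the vectors would come from a trained Word2Vec model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical toy embeddings keyed by "token_POS"; the values are made up
# for illustration and are NOT taken from any trained model.
toy_vectors = {
    "aspirin_NN": [0.9, 0.1, 0.2],
    "headache_NN": [0.8, 0.3, 0.1],
    "treats_VBZ": [0.85, 0.2, 0.15],
    "causes_VBZ": [0.1, 0.9, 0.3],
}


def rank_relations(concept_a, concept_b, candidates, vectors):
    """Rank candidate relation tokens by their mean similarity to both concepts.

    Assumption: averaging the two cosine scores is just one simple way to
    combine them; other combinations are equally plausible.
    """
    scored = []
    for relation in candidates:
        score = (
            cosine(vectors[concept_a], vectors[relation])
            + cosine(vectors[concept_b], vectors[relation])
        ) / 2
        scored.append((relation, score))
    # Highest-scoring candidate first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = rank_relations(
    "aspirin_NN", "headache_NN", ["treats_VBZ", "causes_VBZ"], toy_vectors
)
for relation, score in ranking:
    print(f"{relation}: {score:.3f}")
```

With these toy vectors, `treats_VBZ` ranks above `causes_VBZ` only because the made-up numbers were chosen that way; the sketch shows the mechanics of the comparison, not an empirical result.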