Analysis with Deeply Learned Distributed Representations of Variable Length Texts

Learning good semantic vector representations for phrases, sentences, and paragraphs is a challenging and ongoing area of research in natural language processing and understanding. In this project, we survey and implement several deep-learning and deep-learning-inspired approaches and evaluate them on a range of sentiment-labeled datasets and analysis tasks. In doing so, we demonstrate new state-of-the-art performance on the IMDB Large Movie Review Dataset [5] using highly tuned paragraph vectors [4], and highly competitive performance on the Stanford Sentiment Treebank dataset [8] using Deep Recursive-NNs and LSTMs on both the binary and fine-grained classification tasks. Finally, we compare and analyze each model’s performance on our selection of sentiment analysis tasks.