HyperEmbed: Tradeoffs Between Resources and Performance in NLP Tasks with Hyperdimensional Computing enabled Embedding of n-gram Statistics

Recent advances in Deep Learning have led to significant performance gains on several NLP tasks; however, the models have become increasingly demanding computationally. This paper therefore addresses computationally efficient algorithms for NLP tasks. In particular, it investigates distributed representations of n-gram statistics of texts. The representations are formed using hyperdimensional computing-enabled embedding. These representations then serve as features used as input to standard classifiers. We investigate the applicability of the embedding on one large and three small standard classification datasets using nine classifiers. The embedding achieved F1 scores on par with conventional n-gram statistics while reducing time and memory requirements several-fold: for example, for one of the classifiers on a small dataset, memory was reduced 6.18 times, while training and testing were sped up 4.62 and 3.84 times, respectively. For many classifiers on the large dataset, memory was reduced about 100 times and training and testing were sped up by more than 100 times. More importantly, the distributed representations formed via hyperdimensional computing break the strict dependency between the dimensionality of the representation and the parameters of the n-gram statistics, thus opening room for tradeoffs.
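The abstract does not spell out the embedding procedure, so the following minimal Python sketch illustrates one standard hyperdimensional-computing recipe for mapping character n-gram statistics into a single fixed-dimensional vector: random bipolar item hypervectors for characters, cyclic shifts (permutations) to encode a character's position within an n-gram, elementwise multiplication to bind the n-gram, and summation to bundle all n-grams of a text. The class name, parameter values, and overall structure are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

class HDNgramEmbedder:
    """Minimal sketch (not the authors' code) of a hyperdimensional-computing
    embedding of character n-gram statistics into a D-dimensional vector.

    Each character gets a random bipolar item hypervector; an n-gram is the
    elementwise product of its characters' vectors, each cyclically shifted
    by its position; a text is the sum of its n-gram hypervectors.
    """

    def __init__(self, dim=1000, n=3, seed=42):
        self.dim, self.n = dim, n
        self.rng = np.random.default_rng(seed)
        self.items = {}  # character -> random bipolar hypervector

    def _item(self, ch):
        # Assign (and cache) a random {-1, +1} hypervector per character.
        if ch not in self.items:
            self.items[ch] = self.rng.choice((-1, 1), size=self.dim)
        return self.items[ch]

    def embed(self, text):
        emb = np.zeros(self.dim)
        for i in range(len(text) - self.n + 1):
            v = np.ones(self.dim)
            for pos, ch in enumerate(text[i:i + self.n]):
                # Cyclic shift encodes the position; multiplication binds.
                v *= np.roll(self._item(ch), pos)
            emb += v  # bundle (superimpose) all n-gram hypervectors
        return emb
```

The resulting fixed-size vectors can be fed to any standard classifier (e.g., scikit-learn models) in place of raw n-gram count features. Note that the dimensionality `dim` is chosen independently of the n-gram size and alphabet, which is exactly the decoupling between representation size and n-gram statistics parameters that the abstract highlights.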
