A Comparative Study on SOM-Based Visualization of Potential Technical Solutions Using Fuzzy Bag-of-Words and Co-occurrence Probability of Technical Words

The Self-Organizing Map (SOM) is a powerful tool for visualizing mutual connections among various objects. In previous work, SOM-based visualization was applied to reveal potential technical solutions in Japanese patent documents, where meaningful pairs of technical words are implied in the SOMs. Before the SOMs were applied, the text documents were quantified into numerical vectors based on the co-occurrence frequency of technical words within sentences; SOMs were then constructed to summarize the word features given by co-occurrence probability vectors or correlation coefficient vectors. Recently, a fuzzy bag-of-words model was proposed for handling the sparsity of word feature values and was shown to be useful in document classification. In this paper, a comparative study is performed on using fuzzy bag-of-words in conjunction with the previous feature values, with the goal of revealing potential technical solutions in patent documents.
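
As a rough illustration of the quantification step described above, the following Python sketch builds one co-occurrence probability vector per technical word from tokenized sentences. The abstract does not give the exact formula, so the definition used here is an assumption: the probability is the number of sentences containing both words divided by the number of sentences containing the first word. The function name `cooccurrence_probabilities`, the toy sentences, and the vocabulary are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def cooccurrence_probabilities(sentences, vocabulary):
    """Return {w_i: [P(w_j | w_i) for w_j in vocabulary]}.

    Assumed definition (not confirmed by the source):
    P(w_j | w_i) = (#sentences with both words) / (#sentences with w_i).
    """
    vocab = list(vocabulary)
    index = {word: k for k, word in enumerate(vocab)}
    n = len(vocab)
    occur = [0] * n                      # sentences containing each word
    joint = [[0] * n for _ in range(n)]  # sentences containing both words

    for sentence in sentences:
        # Count each vocabulary word at most once per sentence.
        present = sorted({index[w] for w in sentence if w in index})
        for i in present:
            occur[i] += 1
        for i, j in combinations(present, 2):
            joint[i][j] += 1
            joint[j][i] += 1

    vectors = {}
    for i, word in enumerate(vocab):
        vectors[word] = [
            joint[i][j] / occur[i] if occur[i] else 0.0 for j in range(n)
        ]
    return vectors


# Hypothetical toy data: each sentence is a list of technical words.
sentences = [
    ["coating", "corrosion", "steel"],
    ["coating", "corrosion"],
    ["steel", "welding"],
]
vocabulary = ["coating", "corrosion", "steel", "welding"]
vectors = cooccurrence_probabilities(sentences, vocabulary)
print(vectors["coating"])  # [0.0, 1.0, 0.5, 0.0]
```

Each resulting row could then serve as the feature vector of one technical word when training a SOM. A fuzzy bag-of-words variant would replace or smooth these sparse values using word similarity, which this sketch does not attempt.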