Paper information - All parts are not created equal: SIAM-LSA

All Parts are Not Created Equal: SIAM-LSA

Peter Wiemer-Hastings
peterwh@cti.depaul.edu
DePaul University School of Computer Science, Telecommunications, and Information Systems
243 S. Wabash, Chicago IL 60604

The ability to assess the similarity of objects in the world is fundamentally important to our survival. Many theories have been proposed for modeling human similarity judgments. Most of these theories involve comparing the sets of features of the compared items to determine the overlap between them. Many of them completely ignore the structure of the objects and the relationships between the parts. Goldstone (1994) showed that such systems fail to account for human similarity ratings of structured data. His SIAM system used a (non-learning) connectionist architecture to create correspondences between objects and their features in different scenes. Excitatory connections reinforced coherent mappings between objects (e.g. ObjectA to ObjectC and ObjectB to ObjectD). Inhibitory connections fought against redundant or contradictory mappings. Likewise, connections between the features of objects either supported or inhibited each other and the corresponding object–object connections. SIAM’s connectionist architecture allowed it to take into account the structure of the scenes and the objects as well as the similarity of the features.

Goldstone examined similarity ratings of visual scenes. His approach represented a scene as a spatially related set of objects (for example, pairs of schematic butterflies). Each object has a set of parts, each of which has some value. For example, one of Goldstone’s butterflies could be represented as: (object1 (head square) (tail zig-zag) (body-shading white) (wing-shading checkered)).

In previous research, we have explored the use of Latent Semantic Analysis (LSA) for judging the semantic similarity of a given sentence to a set of alternative target sentences. Although LSA has been shown to match the reliability of raters with intermediate domain knowledge, the correlation between LSA and human ratings is still somewhat disappointing, generally below 0.5 in a number of studies (Wiemer-Hastings, Wiemer-Hastings, & Graesser, 1999). In recent research, we have pursued the general hypothesis that including structural knowledge would improve the correspondence between human and LSA ratings. We found that by performing syntactic analysis of the source and target sentences and separately comparing their subjects, objects, and verbs with LSA, we could reduce the error by over 10% (Wiemer-Hastings & Zipitria, 2001).

In the current research, we explored the use of SIAM to combine the analysis of the structural aspects of the sentences with the semantic similarity ratings provided by LSA. To map this approach to sentences, we broke the inputs into subject, verb, object, and indirect object parts. Thus, a simple representation of the sentence “The dog bit a man” as an object would be: (object1 (verb bit) (subject The dog) (object a man)). The advantage of SIAM-LSA over the previous model (Structured LSA, or SLSA) is that its connectionist architecture allows the different components to “compete” for correspondence, instead of relying on a direct mapping of subject, verb, and object segments.

Our basic hypothesis was that SIAM-LSA would provide a closer match to human ratings than SLSA. A secondary hypothesis was that providing a salience value to give differential weight to the different structural components of the sentences would better match human ratings. In our experiment, we compared human ratings with the basic SIAM-LSA system and the system augmented with salience values.

Our results did not support the basic hypothesis. In fact, SIAM-LSA performed worse than LSA or SLSA. When we included empirically derived weights which accentuated verb and object matches but completely devalued subject matches, the ratings correlated with human ratings r = 0.59, another 10% reduction in the error over SLSA. In accordance with Resnik (1993), this suggests that humans essentially ignore the role of syntactic subjects when matching sentence meanings.

References

Goldstone, R. (1994). Similarity, Interactive Activation, and Mapping. Journal of Experimental Psychology,

Resnik, P. (1993). Selection and Information: A class-based approach to lexical relationships. Ph.D. thesis, University of Pennsylvania, Philadelphia, PA.

Wiemer-Hastings, P., Wiemer-Hastings, K., & Graesser, A. (1999). How Latent is Latent Semantic Analysis? In Proceedings of the Sixteenth International Joint Congress on Artificial Intelligence, pp. 932–937, San Francisco. Morgan Kaufmann.

Wiemer-Hastings, P., & Zipitria, I. (2001). Rules for Syntax, Vectors for Semantics. In Proceedings of the 23rd Annual Conference of the Cognitive Science Society, Mahwah, NJ. Erlbaum.
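
As a rough illustration of the structured comparison described in the paper text above, the following Python sketch represents each sentence as a set of role-labeled parts and combines part-by-part similarities using salience weights. It is only a sketch under stated assumptions: `toy_similarity` is a hypothetical bag-of-words cosine standing in for LSA, the function and variable names are invented for illustration, and the weight values are illustrative rather than the empirically derived ones. The direct role-to-role mapping used here resembles the SLSA-style comparison; it does not implement SIAM's connectionist competition for correspondences.

```python
import math
from collections import Counter


def toy_similarity(a: str, b: str) -> float:
    """Hypothetical stand-in for LSA: cosine over bag-of-words counts."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def structured_similarity(s1, s2, salience, sim=toy_similarity):
    """Salience-weighted average of role-by-role part similarities."""
    total = sum(salience.values())
    if total == 0:
        return 0.0
    weighted = sum(
        weight * sim(s1.get(role, ""), s2.get(role, ""))
        for role, weight in salience.items()
    )
    return weighted / total


# "The dog bit a man" versus a sentence that differs only in its subject.
source = {"subject": "the dog", "verb": "bit", "object": "a man"}
target = {"subject": "the cat", "verb": "bit", "object": "a man"}

# Illustrative weights only (assumed values, not taken from the paper).
uniform = {"subject": 1.0, "verb": 1.0, "object": 1.0}
no_subject = {"subject": 0.0, "verb": 1.0, "object": 1.0}

if __name__ == "__main__":
    print("uniform weights:   ", structured_similarity(source, target, uniform))
    print("subject devalued:  ", structured_similarity(source, target, no_subject))
```

With these toy inputs, setting the subject weight to zero removes the only mismatching part from the score, which is the kind of effect a salience scheme that devalues subject matches would have.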

Peter Wiemer-Hastings | P. Wiemer-Hastings

[1] Peter Wiemer-Hastings, et al. Rules for Syntax, Vectors for Semantics, 2001.

[2] Philip Resnik, et al. Measuring Verb Similarity, 2000.

[3] P. Resnik. Selection and information: a class-based approach to lexical relationships, 1993.

[4] Brian Falkenhainer, et al. The Structure-Mapping Engine *, 2003.

[5] Kenneth D. Forbus, et al. MAC/FAC: A Model of Similarity-Based Retrieval, 1995, Cogn. Sci.

[6] Robert L. Goldstone. Similarity, interactive activation, and mapping, 1994.

[7] R. Shepard. The analysis of proximities: Multidimensional scaling with an unknown distance function. II, 1962.

[8] R. Shepard. The analysis of proximities: Multidimensional scaling with an unknown distance function. I, 1962.

[9] Peter M. Wiemer-Hastings, et al. How Latent is Latent Semantic Analysis?, 1999, IJCAI.

[10] D. Gentner, et al. Respects for similarity, 1993.

[11] D. Medin, et al. Birds of a Feather Flock Together: Similarity Judgments with Semantically Rich Stimuli, 1997.

[12] Richard A. Harshman, et al. Indexing by Latent Semantic Analysis, 1990, J. Am. Soc. Inf. Sci.

[13] T. Landauer, et al. A Solution to Plato's Problem: The Latent Semantic Analysis Theory of Acquisition, Induction, and Representation of Knowledge, 1997.

[14] A. Tversky. Features of Similarity, 1977.