Assessing System Agreement and Instance Difficulty in the Lexical
This paper presents a comparative evaluation of the systems that participated in the Spanish and English lexical sample tasks of SENSEVAL-2. It focuses on pairwise comparisons among systems to assess how much they agree, and on measuring the difficulty of the test instances included in these tasks.
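As a rough illustration of the two quantities named in the abstract, the sketch below computes a chance-corrected agreement score for every pair of systems and a simple per-instance difficulty score. The abstract does not say which measures the paper actually uses, so everything here is an assumption: Cohen's kappa stands in for pairwise agreement, difficulty is taken to be the fraction of systems that mislabel an instance, and the function names, system names, and sense labels are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(a, b):
    """Cohen's kappa between two equal-length label sequences (assumed measure)."""
    if len(a) != len(b) or not a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(a)
    # Observed agreement: share of instances given the same label by both systems.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: computed from each system's own label distribution.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        # Both systems always emit the same single label; treat as full agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def pairwise_kappa(outputs):
    """Kappa for every unordered pair of systems in a {name: labels} mapping."""
    return {
        (s, t): cohen_kappa(outputs[s], outputs[t])
        for s, t in combinations(sorted(outputs), 2)
    }


def instance_difficulty(outputs, gold):
    """Fraction of systems that mislabel each instance (assumed difficulty measure)."""
    systems = list(outputs.values())
    return [
        sum(labels[i] != answer for labels in systems) / len(systems)
        for i, answer in enumerate(gold)
    ]


# Toy data, invented for illustration only; not taken from SENSEVAL-2.
gold = ["bank1", "bank2", "bank1", "bank2"]
outputs = {
    "A": ["bank1", "bank2", "bank1", "bank1"],
    "B": ["bank1", "bank2", "bank2", "bank1"],
    "C": ["bank2", "bank1", "bank2", "bank1"],
}

kappas = pairwise_kappa(outputs)
difficulty = instance_difficulty(outputs, gold)
```

On this toy data, an instance that every system gets wrong receives a difficulty of 1.0, and a pair of systems that agree no more often than chance predicts receives a kappa of 0.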