Feature selection for automatic taxonomy induction

Most existing automatic taxonomy induction systems exploit one or more features to induce a taxonomy; nevertheless there is no systematic study examining which are the best features for the task under various conditions. This paper studies the impact of using different features on taxonomy induction for different types of relations and for terms at different abstraction levels. The evaluation shows that different conditions need different technologies or different combination of the technologies. In particular, co-occurrence and lexico-syntactic patterns are good features for is-a, sibling and part-of relations; contextual, co-occurrence, patterns, and syntactic features work well for concrete terms; co-occurrence works well for abstract terms.