Summarizing biological literature with BioSumm

BioSumm is a summarization environment that supports user queries on online repositories of scientific publications by providing abstract descriptions of focused document groups. The summarization approach is driven by a grading function which evaluates the occurrences of domain dictionary terms. The demonstrated system enables users to query and download research papers from online databases (e.g., PubMed) and local repositories. The (possibly large) retrieved document collection is then partitioned into document clusters devoted to homogeneous topics. Finally, documents in a cluster are summarized by extracting sentences relevant for a specific application domain. In the demo the considered domain is the interaction of human genes and proteins.