A method of extracting the number of trial participants from abstracts describing randomized controlled trials

We have developed a method for extracting the number of trial participants from abstracts describing randomized controlled trials (RCTs); the number of trial participants may indicate the reliability of the trial. The method relies on statistical natural language processing: the number of interest is identified by binary supervised classification using a support vector machine. The method was evaluated on 223 abstracts in which the number of trial participants had been identified manually to serve as a gold standard. Automatic extraction produced 2 false-positive and 19 false-negative classifications. The algorithm extracted the number of trial participants with an accuracy of 97% and an F-measure of 0.84. It may improve the selection of relevant articles for question answering, and hence may assist in decision-making.
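The abstract does not give the feature set or the exact evaluation formulas, so the following is only a minimal sketch of how the task could be framed, not the authors' implementation. It assumes that every integer token in an abstract is a candidate to be labelled "participant count" or "other" by a binary classifier, that the neighbouring words serve as context features, and that the F-measure is the balanced F1 score. The names `Candidate`, `candidates`, `scores`, and the `window` parameter are hypothetical, and the counts in the usage lines are made up for illustration; they are not the paper's results.

```python
import re
from dataclasses import dataclass

# Tokens are either numbers (optionally with a decimal part) or alphabetic words.
TOKEN = re.compile(r"\d+(?:\.\d+)?|[A-Za-z]+")


@dataclass
class Candidate:
    value: int    # the integer found in the text
    left: tuple   # up to `window` lowercased tokens before it
    right: tuple  # up to `window` lowercased tokens after it


def candidates(text, window=2):
    """Return every integer token with its surrounding context tokens.

    Each candidate would then be passed to a binary classifier
    (an SVM in the paper; the classifier itself is not shown here).
    """
    tokens = TOKEN.findall(text)
    found = []
    for i, tok in enumerate(tokens):
        if tok.isdigit():  # skips decimals such as "3.5"
            left = tuple(t.lower() for t in tokens[max(0, i - window):i])
            right = tuple(t.lower() for t in tokens[i + 1:i + 1 + window])
            found.append(Candidate(int(tok), left, right))
    return found


def scores(tp, fp, fn, tn):
    """Precision, recall, balanced F-measure (F1) and accuracy from counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Illustrative usage with an invented sentence and invented counts.
for c in candidates("A total of 120 patients aged 18 to 65 years received 3.5 mg."):
    print(c.value, c.left, c.right)
print(scores(tp=8, fp=2, fn=2, tn=88))
```

Because most numbers in an abstract are not the participant count, true negatives dominate the candidate set. Under that assumption accuracy can be high while the F-measure, which ignores true negatives, stays noticeably lower.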

Grace Chung | Marie J Hansen | Nana Ø Rasmussen
