Spá: A Web-Based Viewer for Text Mining in Evidence Based Medicine

Summarising the evidence about medical interventions is an immense undertaking, in part because unstructured Portable Document Format (PDF) documents remain the main vehicle for disseminating scientific findings. Clinicians and researchers must therefore manually extract and synthesise information from these PDFs. We introduce Spá, a web-based viewer that enables automated annotation and summarisation of PDFs via machine learning. To illustrate its functionality, we use Spá to semi-automate the assessment of bias in clinical trials. Because Spá has a modular architecture, the tool may also be useful in other domains with a PDF-based literature, including law, physics, and biology.