The average American radiologist interprets at least 1,777 mammograms each year, or approximately one new mammogram every 70 minutes [1]. Because radiologists read so many mammograms, and because the correct interpretation of a screening mammogram is often a matter of life or death for the woman involved, various attempts have been made to streamline the mammography reporting process and to introduce consistent structure and terminology into mammography reports. One important advance is the BI-RADS assessment coding scheme (Figure 1), a seven-level classification that summarizes a report by assigning it to a single category based on the radiologist’s overall assessment of the case. The BI-RADS assessment codes are designed to be translatable across physicians and institutions, and to serve as a basis for clinical follow-up. They also provide a convenient tool for researchers, since the codes are machine-interpretable and can be used in lieu of unstructured text-based diagnoses in large-scale clinical studies.
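To illustrate what machine-interpretable means in practice, the following minimal sketch represents the seven assessment categories as an enumeration and tallies a list of codes without parsing any free text. The category names paraphrase the commonly used BI-RADS wording and are not taken from Figure 1; the helper `tally` and the sample codes are hypothetical and serve only as an illustration.

```python
from collections import Counter
from enum import IntEnum


class BiRads(IntEnum):
    """Seven BI-RADS assessment categories (names paraphrased, an assumption)."""
    INCOMPLETE = 0
    NEGATIVE = 1
    BENIGN = 2
    PROBABLY_BENIGN = 3
    SUSPICIOUS = 4
    HIGHLY_SUGGESTIVE_OF_MALIGNANCY = 5
    KNOWN_BIOPSY_PROVEN_MALIGNANCY = 6


def tally(codes):
    """Count reports per category; raises ValueError for a code outside 0-6."""
    counts = Counter(BiRads(code) for code in codes)
    return {category.name: counts.get(category, 0) for category in BiRads}


# Made-up codes for illustration only, not data from any study.
example_codes = [1, 2, 1, 0, 4, 1, 2]
summary = tally(example_codes)
```

Because each report carries a single integer code, aggregating a large collection reduces to counting, which is the property that makes the codes attractive for large-scale studies.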
[1] Ian H. Witten, et al. The WEKA data mining software: an update. SKDD, 2009.
[2] Slobodan Vucetic. Substring selection for biomedical document classification. TMBIO '06, 2006.
[3] Chih-Jen Lin, et al. LIBLINEAR: A Library for Large Linear Classification. J. Mach. Learn. Res., 2008.
[4] Nello Cristianini, et al. Classification using String Kernels. 2000.
[5] Chih-Jen Lin, et al. A sequential dual method for large scale multi-class linear svms. KDD, 2008.
[6] Chih-Jen Lin, et al. A Practical Guide to Support Vector Classification. 2008.
[7] Gerard Salton, et al. Term-Weighting Approaches in Automatic Text Retrieval. Inf. Process. Manag., 1988.
[8] D. Miglioretti, et al. Physician workload in mammography. AJR. American journal of roentgenology, 2008.