Automated generation of radiologic descriptions on brain volume changes from T1-weighted MR images: Initial assessment of feasibility

The Alzheimer's Disease Neuroimaging Initiative

Research output: Contribution to journal › Article › peer-review


Purpose: To examine the feasibility and potential difficulties of automatically generating radiologic reports (RRs) to articulate the clinically important features of brain magnetic resonance (MR) images.

Materials and Methods: We focused on examining brain atrophy by using magnetization-prepared rapid gradient-echo (MPRAGE) images. The technology was based on multi-atlas whole-brain segmentation that identified 283 structures, from which larger superstructures were created to represent the anatomic units most frequently used in RRs. Through two layers of data-reduction filters, based on anatomic and clinical knowledge, raw images (~10 MB) were converted to a few kilobytes of human-readable sentences. The tool was applied to images from 92 patients with memory problems, and the results were compared to RRs independently produced by three experienced radiologists. The mechanisms of disagreement were investigated to understand where the machine-human interface succeeded or failed.

Results: The automatically generated sentences had low sensitivity (mean: 24.5%) and precision (mean: 24.9%) values; these were significantly lower than the inter-rater sensitivity (mean: 32.7%) and precision (mean: 32.2%) of the radiologists. The causes of disagreement were divided into six error categories: mismatch of anatomic definitions (7.2 ± 9.3%), data-reduction errors (11.4 ± 3.9%), translator errors (3.1 ± 3.1%), differences in the spatial extent of the anatomic terms used (8.3 ± 6.7%), segmentation quality (9.8 ± 2.0%), and threshold for sentence-triggering (60.2 ± 16.3%).

Conclusion: These error mechanisms raise interesting questions about the potential of automated report generation and the quality of image reading by humans. The most significant discrepancy between the human and automatically generated RRs was caused by the sentence-triggering threshold (the degree of abnormality), which was fixed at a z-score > 2.0 for the automated generation, whereas the thresholds used by the radiologists varied among anatomical structures.
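The fixed sentence-triggering threshold and the sensitivity and precision comparison described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The structure names, volumes, and reference values are invented for illustration, the z-score is assumed to be computed against a reference group's mean and sample standard deviation, and the threshold is assumed to apply to the absolute z-score. Sensitivity and precision are assumed to follow their usual set-based definitions, with one radiologist's findings treated as the reference.

```python
from statistics import mean, stdev

Z_THRESHOLD = 2.0  # fixed threshold reported in the abstract


def z_score(value, reference_values):
    """Z-score of one measurement against a reference group (assumed definition)."""
    return (value - mean(reference_values)) / stdev(reference_values)


def triggered_structures(patient_volumes, reference_volumes, threshold=Z_THRESHOLD):
    """Return the structures whose absolute z-score exceeds the threshold.

    Using the absolute value is an assumption: it flags both shrinkage
    (negative z) and enlargement (positive z, e.g. ventricles).
    """
    flagged = set()
    for name, volume in patient_volumes.items():
        if abs(z_score(volume, reference_volumes[name])) > threshold:
            flagged.add(name)
    return flagged


def sensitivity_precision(predicted, reference):
    """Set-based sensitivity and precision of predicted findings vs. a reference."""
    true_positives = len(predicted & reference)
    sensitivity = true_positives / len(reference) if reference else 0.0
    precision = true_positives / len(predicted) if predicted else 0.0
    return sensitivity, precision


# Hypothetical volumes (arbitrary units); none of these numbers come from the study.
reference_volumes = {
    "hippocampus": [3.0, 3.2, 3.4, 3.1, 3.3],
    "lateral_ventricle": [20, 22, 24, 21, 23],
    "frontal_lobe": [150, 155, 160, 152, 158],
}
patient_volumes = {
    "hippocampus": 2.6,
    "lateral_ventricle": 30,
    "frontal_lobe": 153,
}

automated = triggered_structures(patient_volumes, reference_volumes)
radiologist = {"hippocampus", "frontal_lobe"}  # hypothetical human report
sensitivity, precision = sensitivity_precision(automated, radiologist)
```

In this toy case the automated pass flags the hippocampus and the lateral ventricle, while the hypothetical radiologist mentions the hippocampus and the frontal lobe. Only one finding overlaps, which shows how a single global cut-off can diverge from a reader whose effective threshold differs by structure.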

Original language: English (US)
Article number: 7
Journal: Frontiers in Neurology
Issue number: JAN
State: Published - 2019


Keywords

  • 3D T1 weighted image
  • Automated generation
  • Brain atlas
  • Brain atrophy
  • Dementia
  • Radiologic description

ASJC Scopus subject areas

  • Neurology
  • Clinical Neurology
