Schöckel, L. et al. Developments in X-ray contrast media and the potential impact on computed tomography. Invest. Radiol. 55, 592–597 (2020).
Kanal, K. M. et al. U.S. diagnostic reference levels and achievable doses for 10 adult CT examinations. Radiology 284, 120–133 (2017).
Taschetta-Millane, M. The evolving computed tomography market. Imaging Technology News https://www.itnonline.com/article/evolving-computed-tomography-market (2024).
Hudnall, C. Maximum capacity: overloaded radiologists are grappling with solutions to a booming volume crisis. American College of Radiology https://www.acr.org/Practice-Management-Quality-Informatics/ACR-Bulletin/Articles/April-2024/Maximum-Capacity (2024).
Milburn, J. How will we solve our radiology workforce shortage? American College of Radiology https://www.acr.org/Practice-Management-Quality-Informatics/ACR-Bulletin/Articles/March-2024/How-Will-We-Solve-Our-Radiology-Workforce-Shortage (2024).
Rimmer, A. Radiologist shortage leaves patient care at risk, warns royal college. BMJ 359, j4683 (2017).
Paschali, M. et al. Foundation models in radiology: what, how, why, and why not. Radiology 314, e240597 (2025).
Zhang, S. et al. A multimodal biomedical foundation model trained from fifteen million image–text pairs. NEJM AI 2, AIoa2400640 (2025).
Chaves, J. M. et al. A clinically accessible small multimodal radiology model and evaluation metric for chest X-ray findings. Nat. Commun. 16, 3108 (2025).
Tu, T. et al. Towards generalist biomedical AI. NEJM AI 1, AIoa2300138 (2024).
Wu, C., Zhang, X., Zhang, Y., Wang, Y. & Xie, W. Towards generalist foundation model for radiology by leveraging web-scale 2D & 3D medical data. Nat. Commun. 16, 7866 (2025).
Chen, Z. et al. CheXagent: towards a foundation model for chest X-ray interpretation. In AAAI 2024 Spring Symposium on Clinical Foundation Models (AAAI, 2024).
Udare, A. et al. Radiologist productivity analytics: factors impacting abdominal pelvic CT exam reporting times. J. Digit. Imaging 35, 87–97 (2022).
Liu, D. et al. Fully automated CT-based adiposity assessment: comparison of the L1 and L3 vertebral levels for opportunistic prediction. Abdom. Radiol. 48, 787–795 (2023).
Blankemeier, L. et al. Opportunistic incidence prediction of multiple chronic diseases from abdominal CT imaging using multi-task learning. In Proc. 25th International Conference on Medical Image Computing and Computer-Assisted Intervention 309–318 (Springer, 2022).
Zambrano Chaves, J. M. et al. Opportunistic assessment of ischemic heart disease risk using abdominopelvic computed tomography and medical record data: a multimodal explainable artificial intelligence approach. Sci. Rep. 13, 21034 (2023).
Cao, K. et al. Large-scale pancreatic cancer detection via non-contrast CT and deep learning. Nat. Med. 29, 3033–3043 (2023).
Wang, Y.-R. et al. Screening and diagnosis of cardiovascular disease using artificial intelligence-enabled cardiac magnetic resonance imaging. Nat. Med. 30, 1471–1480 (2024).
Langlotz, C. P. The future of AI and informatics in radiology: 10 predictions. Radiology 309, e231114 (2023).
Artificial Intelligence and Machine Learning (AI/ML)-Enabled Medical Devices (US Food and Drug Administration, 2023).
Radford, A. et al. Learning transferable visual models from natural language supervision. In Proc. 38th International Conference on Machine Learning 8748–8763 (PMLR, 2021).
Schuhmann, C. et al. LAION-5B: an open large-scale dataset for training next-generation image–text models. Adv. Neural Inf. Process. Syst. 35, 25278–25294 (2022).
Larson, D. B., Magnus, D. C., Lungren, M. P., Shah, N. H. & Langlotz, C. P. Ethics of using and sharing clinical imaging data for artificial intelligence: a proposed framework. Radiology 295, 675–682 (2020).
Hyland, S. L. et al. MAIRA-1: a specialised large multimodal model for radiology report generation. Preprint at https://arxiv.org/abs/2311.13668 (2023).
Huang, S.-C. et al. PENet—a scalable deep-learning model for automated diagnosis of pulmonary embolism using volumetric CT imaging. npj Digit. Med. 3, 61 (2020).
Christensen, M., Vukadinovic, M., Yuan, N. & Ouyang, D. Vision–language foundation model for echocardiogram interpretation. Nat. Med. 30, 1481–1488 (2024).
Polevikov, S. Med-Gemini by Google: a boon for researchers, a bane for doctors. AI Health Uncut https://sergeiai.substack.com/p/googles-med-gemini-im-excited-and (2024).
Fleming, S. L. et al. MedAlign: a clinician-generated dataset for instruction following with electronic medical records. Proc. AAAI Conf. Artif. Intell. 38, 22021–22030 (2024).
Liebl, H. et al. A computed tomography vertebral segmentation dataset with anatomical variations and multi-vendor scanner data. Sci. Data 8, 284 (2021).
Wasserthal, J. et al. TotalSegmentator: robust segmentation of 104 anatomic structures in CT images. Radiol. Artif. Intell. 5, e230024 (2023).
Cherti, M. et al. Reproducible scaling laws for contrastive language–image learning. In Proc. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition 2818–2829 (IEEE, 2023).
Löffler, M. T. et al. A vertebral segmentation dataset with fracture grading. Radiol. Artif. Intell. 2, e190138 (2020).
Carreira, J. & Zisserman, A. Quo vadis, action recognition? A new model and the kinetics dataset. In Proc. 2017 IEEE Conference on Computer Vision and Pattern Recognition 6299–6308 (IEEE, 2017).
Denny, J. C. et al. Systematic comparison of phenome-wide association study of electronic medical record data and genome-wide association study data. Nat. Biotechnol. 31, 1102–1111 (2013).
Liu, Z. et al. A ConvNet for the 2020s. In Proc. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition 11976–11986 (IEEE, 2022).
Liu, Z. et al. Swin transformer: hierarchical vision transformer using shifted windows. In Proc. 2021 IEEE/CVF International Conference on Computer Vision 10012–10022 (IEEE, 2021).
Li, Y., Wehbe, R. M., Ahmad, F. S., Wang, H. & Luo, Y. Clinical-Longformer and Clinical-BigBird: transformers for long clinical sequences. Preprint at https://arxiv.org/abs/2201.11838 (2022).
Delbrouck, J.-B. et al. Improving the factual correctness of radiology report generation with semantic rewards. In Findings of the Association for Computational Linguistics: EMNLP 2022 4348–4360 (Association for Computational Linguistics, 2022).
Zhang, T., Kishore, V., Wu, F., Weinberger, K. Q. & Artzi, Y. BERTScore: evaluating text generation with BERT. In International Conference on Learning Representations (ICLR, 2020).
Lin, C.-Y. ROUGE: a package for automatic evaluation of summaries. In Proc. Text Summarization Branches Out 74–81 (Association for Computational Linguistics, 2004).
Papineni, K., Roukos, S., Ward, T. & Zhu, W.-J. BLEU: a method for automatic evaluation of machine translation. In Proc. 40th Annual Meeting of the Association for Computational Linguistics 311–318 (Association for Computational Linguistics, 2002).
Isensee, F., Jaeger, P. F., Kohl, S. A. A., Petersen, J. & Maier-Hein, K. H. nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation. Nat. Methods 18, 203–211 (2021).
Codella, N. C. F. et al. MedImageInsight: an open-source embedding model for general domain medical imaging. Preprint at https://arxiv.org/abs/2410.06542 (2024).
Yang, L. et al. Advancing multimodal medical capabilities of Gemini. Preprint at https://arxiv.org/abs/2405.03162 (2024).
Hamamci, I. E. et al. Developing generalist foundation models from a multimodal dataset for 3D computed tomography. Preprint at https://arxiv.org/abs/2403.17834 (2024).
Niu, C. et al. Medical multimodal multitask foundation model for lung cancer screening. Nat. Commun. 16, 1523 (2025).
Pai, S. et al. Vision foundation models for computed tomography. Preprint at https://arxiv.org/abs/2501.09001 (2025).
Huang, S.-C. et al. Self-supervised learning for medical image classification: a systematic review and implementation guidelines. npj Digit. Med. 6, 74 (2023).
Tang, Y. et al. Self-supervised pre-training of Swin transformers for 3D medical image analysis. In Proc. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition 20730–20740 (IEEE, 2022).
He, K. et al. Masked autoencoders are scalable vision learners. In Proc. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition 16000–16009 (IEEE, 2022).
Laurençon, H., Tronchon, L., Cord, M. & Sanh, V. What matters when building vision-language models? In Proc. 38th International Conference on Neural Information Processing Systems 87874–87907 (NeurIPS, 2024).
Li, Z. et al. Monkey: image resolution and text label are important things for large multi-modal models. In Proc. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition 26763–26773 (IEEE, 2024).
Chen, T., Kornblith, S., Norouzi, M. & Hinton, G. A simple framework for contrastive learning of visual representations. In Proc. International Conference on Machine Learning 1597–1607 (PMLR, 2020).
Van den Oord, A., Li, Y. & Vinyals, O. Representation learning with contrastive predictive coding. Preprint at https://arxiv.org/abs/1807.03748 (2018).
Reis, E. P. et al. Automated abdominal CT contrast phase detection using an interpretable and open-source artificial intelligence algorithm. Eur. Radiol. 34, 6680–6687 (2024).
Van Uden, C. et al. Exploring the versatility of zero-shot CLIP for interstitial lung disease classification. Preprint at https://arxiv.org/abs/2306.01111 (2023).
Loshchilov, I. & Hutter, F. Decoupled weight decay regularization. In International Conference on Learning Representations (ICLR, 2019).
Kirkpatrick, J. et al. Overcoming catastrophic forgetting in neural networks. Proc. Natl Acad. Sci. USA 114, 3521–3526 (2017).
Chronic Kidney Disease in the United States, 2023 (Centers for Disease Control and Prevention, 2023).
By the Numbers: Diabetes in America (Centers for Disease Control and Prevention, 2022).
Facts about Hypertension (Centers for Disease Control and Prevention, 2023).
What is Coronary Heart Disease? (US Department of Health and Human Services, 2023).
Gu, J., Sanchez, R., Chauhan, A., Fazio, S. & Wong, N. Lipid treatment status and goal attainment among patients with atherosclerotic cardiovascular disease in the United States: a 2019 update. Am. J. Prev. Cardiol. 10, 100336 (2022).
Wright, N. C. et al. The recent prevalence of osteoporosis and low bone mass in the United States based on bone mineral density at the femoral neck or lumbar spine. J. Bone Miner. Res. 29, 2520–2526 (2014).
Johnson, A. E. W. et al. MIMIC-IV, a freely accessible electronic health record dataset. Sci. Data 10, 1 (2023).
Hu, E. J. et al. LoRA: low-rank adaptation of large language models. In International Conference on Learning Representations (ICLR, 2022).
Van Veen, D. et al. Adapted large language models can outperform medical experts in clinical text summarization. Nat. Med. 30, 1134–1142 (2024).
Van Veen, D. et al. RadAdapt: radiology report summarization via lightweight domain adaptation of large language models. In Proc. The 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks 449–460 (Association for Computational Linguistics, 2023).
Ronneberger, O., Fischer, P. & Brox, T. U-Net: convolutional networks for biomedical image segmentation. In Proc. Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015 234–241 (Springer, 2015).
Hatamizadeh, A. et al. UNETR: transformers for 3D medical image segmentation. In Proc. 2022 IEEE/CVF Winter Conference on Applications of Computer Vision 574–584 (IEEE, 2022).
Xue, C. et al. AI-based differential diagnosis of dementia etiologies on multimodal data. Nat. Med. 30, 2977–2989 (2024).
Yang, A. et al. Qwen3 technical report. Preprint at https://arxiv.org/abs/2505.09388 (2025).