A vision–language foundation model for precision oncology

Sammut, S.-J. et al. Multi-omic machine learning predictor of breast cancer therapy response. Nature 601, 623–629 (2022).

Vanguri, R. S. et al. Multimodal integration of radiology, pathology and genomics for prediction of response to PD-(L)1 blockade in patients with non-small cell lung cancer. Nat. Cancer 3, 1151–1164 (2022).

Acosta, J. N., Falcone, G. J., Rajpurkar, P. & Topol, E. J. Multimodal biomedical AI. Nat. Med. 28, 1773–1784 (2022).

Boehm, K. M., Khosravi, P., Vanguri, R., Gao, J. & Shah, S. P. Harnessing multimodal data integration to advance precision oncology. Nat. Rev. Cancer 22, 114–126 (2022).

Lipkova, J. et al. Artificial intelligence for multimodal data integration in oncology. Cancer Cell 40, 1095–1110 (2022).

Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616, 259–265 (2023).

Kim, C. et al. Transparent medical image AI via an image–text foundation model grounded in medical literature. Nat. Med. 30, 1154–1165 (2024).

Singhal, K. et al. Large language models encode clinical knowledge. Nature 620, 172–180 (2023).

Zhou, Y. et al. A foundation model for generalizable disease detection from retinal images. Nature 622, 156–163 (2023).

Xu, H. et al. A whole-slide foundation model for digital pathology from real-world data. Nature 630, 181–188 (2024).

Chen, R. J. et al. Towards a general-purpose foundation model for computational pathology. Nat. Med. 30, 850–862 (2024).

Vorontsov, E. et al. A foundation model for clinical-grade computational pathology and rare cancers detection. Nat. Med. 30, 2924–2935 (2024).

Wang, X. et al. A pathology foundation model for cancer diagnosis and prognosis prediction. Nature 634, 970–978 (2024).

Christensen, M., Vukadinovic, M., Yuan, N. & Ouyang, D. Vision–language foundation model for echocardiogram interpretation. Nat. Med. 30, 1481–1488 (2024).

Huang, Z., Bianchi, F., Yuksekgonul, M., Montine, T. J. & Zou, J. A visual–language foundation model for pathology image analysis using medical Twitter. Nat. Med. 29, 2307–2316 (2023).

Lu, M. Y. et al. A visual-language foundation model for computational pathology. Nat. Med. 30, 863–874 (2024).

Lu, M. Y. et al. A multimodal generative AI copilot for human pathology. Nature 634, 466–473 (2024).

Radford, A. et al. Learning transferable visual models from natural language supervision. In Proc. Int. Conf. Machine Learning (eds Meila, M. & Zhang, T.) 8748–8763 (PMLR, 2021).

Schuhmann, C. et al. LAION-5B: an open large-scale dataset for training next generation image-text models. Adv. Neural Inf. Process. Syst. 35, 25278–25294 (2022).

Bhinder, B., Gilvary, C., Madhukar, N. S. & Elemento, O. Artificial intelligence in cancer research and precision medicine. Cancer Discov. 11, 900–915 (2021).

Wang, W. et al. Image as a foreign language: BEiT pretraining for vision and vision-language tasks. In Proc. IEEE/CVF Conf. Computer Vision Pattern Recognition (eds Brown, M. S., Li, F.-F., Mori, G. & Sato, Y.) 19175–19186 (IEEE, 2023).

Gamper, J. & Rajpoot, N. Multiple instance captioning: learning representations from histopathology textbooks and articles. In Proc. IEEE/CVF Conf. Computer Vision Pattern Recognition (eds Brown, M. S., Sukthankar, R., Tan, T. & Zelnik, L.) 16549–16559 (IEEE, 2021).

Sun, Y. et al. PathMMU: a massive multimodal expert-level benchmark for understanding and reasoning in pathology. In Eur. Conf. Computer Vision (eds Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T. & Varol, G.) 56–73 (Springer, 2025).

Kim, J.-H., Jun, J. & Zhang, B.-T. Bilinear attention networks. In Adv. Neural Inf. Process. Syst. (eds Bengio, S., Wallach, H., Larochelle, H., Grauman, K., Cesa-Bianchi, N. & Garnett, R.) 1571–1581 (PMLR, 2018).

Nguyen, B. D. et al. Overcoming data limitation in medical visual question answering. In Proc. Medical Image Computing Computer Assisted Intervention–MICCAI 2019: 22nd Int. Conf. (eds Shen, D. et al.) 522–530 (Springer, 2019).

Li, L. H., Yatskar, M., Yin, D., Hsieh, C.-J. & Chang, K.-W. VisualBERT: a simple and performant baseline for vision and language. Preprint at https://arxiv.org/abs/1908.03557 (2019).

Naseem, U., Khushi, M., Dunn, A. G. & Kim, J. K-PathVQA: knowledge-aware multimodal representation for pathology visual question answering. IEEE J. Biomed. Health Inf. 28, 1886–1895 (2024).

He, X., Zhang, Y., Mou, L., Xing, E. & Xie, P. PathVQA: 30000+ questions for medical visual question answering. Preprint at https://arxiv.org/abs/2003.10286 (2020).

Barbano, C. A. et al. Unitopatho, a labeled histopathological dataset for colorectal polyps classification and adenoma dysplasia grading. In 2021 IEEE Int. Conf. Image Processing (ICIP) (eds alZahir, S., Labeau, F. & Mock, K.) 76–80 (IEEE, 2021).

Brancati, N. et al. BRACS: a dataset for breast carcinoma subtyping in H&E histology images. Database 2022, baac093 (2022).

Veeling, B. S., Linmans, J., Winkens, J., Cohen, T. & Welling, M. Rotation equivariant CNNs for digital pathology. In Proc. Medical Image Computing Computer Assisted Intervention, MICCAI 2018: 21st Int. Conf. (eds Frangi, A., Schnabel, J., Davatzikos, C., Alberola-López, C. & Fichtinger, G.) 210–218 (Springer, 2018).

Kriegsmann, K. et al. Deep learning for the detection of anatomical tissue structures and neoplasms of the skin on scanned histopathological tissue sections. Front. Oncol. 12, 1022967 (2022).

Kumar, N. et al. A multi-organ nucleus segmentation challenge. IEEE Trans. Med. Imaging 39, 1380–1391 (2019).

Silva-Rodríguez, J., Colomer, A., Sales, M. A., Molina, R. & Naranjo, V. Going deeper through the gleason scoring scale: an automatic end-to-end system for histology prostate grading and cribriform pattern detection. Comput. Methods Programs Biomed. 195, 105637 (2020).

Borkowski, A. A. et al. Lung and colon cancer histopathological image dataset (lc25000). Preprint at https://arxiv.org/abs/1912.12142 (2019).

Brummer, O., Pölönen, P., Mustjoki, S. & Brück, O. Integrative analysis of histological textures and lymphocyte infiltration in renal cell carcinoma using deep learning. Preprint at bioRxiv https://doi.org/10.1101/2022.08.15.503955 (2022).

Kather, J. N. et al. Predicting survival from colorectal cancer histology slides using deep learning: a retrospective multicenter study. PLoS Med. 16, e1002730 (2019).

Arunachalam, H. B. et al. Viable and necrotic tumor assessment from whole slide images of osteosarcoma using machine-learning and deep-learning models. PLoS One 14, e0210706 (2019).

Han, C. et al. Multi-layer pseudo-supervision for histopathology tissue semantic segmentation using patch-level classification labels. Med. Image Anal. 80, 102487 (2022).

Kather, J. N. et al. Pan-cancer image-based detection of clinically actionable genetic alterations. Nat. Cancer 1, 789–799 (2020).

Xu, F. et al. Predicting axillary lymph node metastasis in early breast cancer using deep learning on primary tumor biopsy slides. Front. Oncol. 11, 759007 (2021).

Roetzer-Pejrimovsky, T. et al. The digital brain tumour atlas, an open histopathology resource. Sci. Data 9, 55 (2022).

Atkins, M. B. et al. The state of melanoma: emergent challenges and opportunities. Clin. Cancer Res. 27, 2678–2697 (2021).

Thompson, A. K., Kelley, B. F., Prokop, L. J., Murad, M. H. & Baum, C. L. Risk factors for cutaneous squamous cell carcinoma recurrence, metastasis, and disease-specific death: a systematic review and metaanalysis. JAMA Dermatol. 152, 419–428 (2016).

VisioMel. Visiomel Challenge: Predicting Melanoma Relapse (2023) (accessed 1 April 2023); https://www.drivendata.org/competitions/148/visiomel-melanoma/page/674/.

Ikezogwo, W. et al. Quilt-1m: one million image-text pairs for histopathology. Adv. Neural Inf. Process. Syst. 36, 37995–38017 (2024).

Zhang, S. et al. Large-scale domain-specific pretraining for biomedical vision-language processing. Preprint at https://arxiv.org/abs/2303.00915 (2023).

Hellmann, M. D. et al. Nivolumab plus ipilimumab in advanced non-small-cell lung cancer. N. Engl. J. Med. 381, 2020–2031 (2019).

Gandhi, L. et al. Pembrolizumab plus chemotherapy in metastatic non-small-cell lung cancer. N. Engl. J. Med. 378, 2078–2092 (2018).

Samstein, R. M. et al. Tumor mutational load predicts survival after immunotherapy across multiple cancer types. Nat. Genet. 51, 202–206 (2019).

Cristescu, R. et al. Pan-tumor genomic biomarkers for PD-1 checkpoint blockade-based immunotherapy. Science 362, eaar3593 (2018).

Bagaev, A. et al. Conserved pan-cancer microenvironment subtypes predict response to immunotherapy. Cancer Cell 39, 845–865 (2021).

Cui, H. et al. scGPT: toward building a foundation model for single-cell multi-omics using generative AI. Nat. Methods 21, 1470–1480 (2024).

Mok, T. S. et al. Pembrolizumab versus chemotherapy for previously untreated, PD-L1-expressing, locally advanced or metastatic non-small-cell lung cancer (KEYNOTE-042): a randomised, open-label, controlled, phase 3 trial. Lancet 393, 1819–1830 (2019).

Johnson, D. B., Nebhan, C. A., Moslehi, J. J. & Balko, J. M. Immune-checkpoint inhibitors: long-term implications of toxicity. Nat. Rev. Clin. Oncol. 19, 254–267 (2022).

Bray, F. et al. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 74, 229–263 (2024).

Bruni, D., Angell, H. K. & Galon, J. The immune contexture and immunoscore in cancer prognosis and therapeutic efficacy. Nat. Rev. Cancer 20, 662–680 (2020).

Herbst, R. S. et al. Atezolizumab for first-line treatment of PD-L1-selected patients with NSCLC. N. Engl. J. Med. 383, 1328–1339 (2020).

Shazeer, N. et al. Outrageously large neural networks: the sparsely-gated mixture-of-experts layer. In Int. Conf. Learning Representations (eds Bengio, Y. & LeCun, Y.) 1–19 (OpenReview.net, 2017).

Bao, H. et al. Vlmo: unified vision-language pre-training with mixture-of-modality-experts. Adv. Neural Inf. Process. Syst. 35, 32897–32912 (2022).

Esser, P. et al. Scaling rectified flow transformers for high-resolution image synthesis. In Forty-first Int. Conf. Machine Learning (eds Salakhutdinov, R., Kolter, Z., Heller, K., Weller, A., Oliver, N., Scarlett, J. & Berkenkamp, F.) 12606–12633 (PMLR, 2024).

Sun, Y. et al. PathAsst: a generative foundation AI assistant towards artificial general intelligence of pathology. In AAAI Conf. Artificial Intelligence (ed. Wooldridge, M.) 5034–5042 (AAAI, 2024).

Li, J., Li, D., Xiong, C. & Hoi, S. C. H. BLIP: bootstrapping language-image pre-training for unified vision-language understanding and generation. In Int. Conf. Machine Learning (eds Chaudhuri, K., Jegelka, S., Song, L., Szepesvari, C., Niu, G. & Sabato, S.) 12888–12900 (PMLR, 2022).

Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: pre-training of deep bidirectional transformers for language understanding. In North American Chapter Assoc. Comp. Linguistics (eds Burstein, J., Doran, C., Pedersen, T. & Solorio, T.) 4171–4186 (ACL, 2019).

Ramesh, A. et al. Zero-shot text-to-image generation. In Int. Conf. Machine Learning (eds Meila, M. & Zhang, T.) 8821–8831 (PMLR, 2021).

Peng, Z., Dong, L., Bao, H., Ye, Q. & Wei, F. BEiT v2: masked image modeling with vector-quantized visual tokenizers. Preprint at https://arxiv.org/abs/2208.06366 (2022).

Wang, X. et al. Transformer-based unsupervised contrastive learning for histopathological image classification. Med. Image Anal. 81, 102559 (2022).

Shen, Y., Luo, Y., Shen, D. & Ke, J. RandStainNA: learning stain-agnostic features from histology slides by bridging stain augmentation and normalization. In Int. Conf. Medical Image Computing and Computer-Assisted Intervention (eds Wang, L., Dou, Q., Fletcher, P. T., Speidel, S. & Li, S.) 212–221 (Springer, 2022).

Kang, M., Song, H., Park, S., Yoo, D. & Pereira, S. Benchmarking self-supervised learning on diverse pathology datasets. In 2023 IEEE/CVF Conf. Computer Vision Pattern Recognition (CVPR) (eds Chellappa, R., Matas, J., Quan, L. & Shah, M.) 3344–3354 (IEEE, 2023).

Loshchilov, I. & Hutter, F. Decoupled weight decay regularization. In Int. Conf. Learning Representations (ed. Sainath, T.) 1–18 (OpenReview.net, 2019).

Ilse, M., Tomczak, J. & Welling, M. Attention-based deep multiple instance learning. In Int. Conf. Machine Learning (eds Dy, J. & Krause, A.) 2127–2136 (PMLR, 2018).

Weinstein, J. N. et al. The cancer genome atlas pan-cancer analysis project. Nat. Genet. 45, 1113–1120 (2013).

Kefeli, J. & Tatonetti, N. TCGA-reports: a machine-readable pathology report resource for benchmarking text-based AI models. Patterns 5, 100933 (2024).

Achiam, J. et al. GPT-4 technical report. Preprint at https://arxiv.org/abs/2303.08774 (2023).

Callahan, A. et al. The Stanford Medicine data science ecosystem for clinical and translational research. JAMIA Open 6, ooad054 (2023).

Lu, M. Y. et al. Data-efficient and weakly supervised computational pathology on whole-slide images. Nat. Biomed. Eng. 5, 555–570 (2021).
