Thursday, January 9, 2025

A vision–language foundation model for precision oncology

  • Sammut, S.-J. et al. Multi-omic machine learning predictor of breast cancer therapy response. Nature 601, 623–629 (2022).

  • Vanguri, R. S. et al. Multimodal integration of radiology, pathology and genomics for prediction of response to PD-(L)1 blockade in patients with non-small cell lung cancer. Nat. Cancer 3, 1151–1164 (2022).

  • Acosta, J. N., Falcone, G. J., Rajpurkar, P. & Topol, E. J. Multimodal biomedical AI. Nat. Med. 28, 1773–1784 (2022).

  • Boehm, K. M., Khosravi, P., Vanguri, R., Gao, J. & Shah, S. P. Harnessing multimodal data integration to advance precision oncology. Nat. Rev. Cancer 22, 114–126 (2022).

  • Lipkova, J. et al. Artificial intelligence for multimodal data integration in oncology. Cancer Cell 40, 1095–1110 (2022).

  • Moor, M. et al. Foundation models for generalist medical artificial intelligence. Nature 616, 259–265 (2023).

  • Kim, C. et al. Transparent medical image AI via an image–text foundation model grounded in medical literature. Nat. Med. 30, 1154–1165 (2024).

  • Singhal, K. et al. Large language models encode clinical knowledge. Nature 620, 172–180 (2023).

  • Zhou, Y. et al. A foundation model for generalizable disease detection from retinal images. Nature 622, 156–163 (2023).

  • Xu, H. et al. A whole-slide foundation model for digital pathology from real-world data. Nature 630, 181–188 (2024).

  • Chen, R. J. et al. Towards a general-purpose foundation model for computational pathology. Nat. Med. 30, 850–862 (2024).

  • Vorontsov, E. et al. A foundation model for clinical-grade computational pathology and rare cancers detection. Nat. Med. 30, 2924–2935 (2024).

  • Wang, X. et al. A pathology foundation model for cancer diagnosis and prognosis prediction. Nature 634, 970–978 (2024).

  • Christensen, M., Vukadinovic, M., Yuan, N. & Ouyang, D. Vision–language foundation model for echocardiogram interpretation. Nat. Med. 30, 1481–1488 (2024).

  • Huang, Z., Bianchi, F., Yuksekgonul, M., Montine, T. J. & Zou, J. A visual–language foundation model for pathology image analysis using medical Twitter. Nat. Med. 29, 2307–2316 (2023).

  • Lu, M. Y. et al. A visual-language foundation model for computational pathology. Nat. Med. 30, 863–874 (2024).

  • Lu, M. Y. et al. A multimodal generative AI copilot for human pathology. Nature 634, 466–473 (2024).

  • Radford, A. et al. Learning transferable visual models from natural language supervision. In Proc. Int. Conf. Machine Learning (eds Meila, M. & Zhang, T.) 8748–8763 (PMLR, 2021).

  • Schuhmann, C. et al. LAION-5B: an open large-scale dataset for training next generation image-text models. Adv. Neural Inf. Process. Syst. 35, 25278–25294 (2022).

  • Bhinder, B., Gilvary, C., Madhukar, N. S. & Elemento, O. Artificial intelligence in cancer research and precision medicine. Cancer Discov. 11, 900–915 (2021).

  • Wang, W. et al. Image as a foreign language: BEiT pretraining for vision and vision-language tasks. In Proc. IEEE/CVF Conf. Computer Vision Pattern Recognition (eds Brown, M. S., Li, F.-F., Mori, G. & Sato, Y.) 19175–19186 (IEEE, 2023).

  • Gamper, J. & Rajpoot, N. Multiple instance captioning: learning representations from histopathology textbooks and articles. In Proc. IEEE/CVF Conf. Computer Vision Pattern Recognition (eds Brown, M. S., Sukthankar, R., Tan, T. & Zelnik, L.) 16549–16559 (IEEE, 2021).

  • Sun, Y. et al. PathMMU: a massive multimodal expert-level benchmark for understanding and reasoning in pathology. In Eur. Conf. Computer Vision (eds Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T. & Varol, G.) 56–73 (Springer, 2025).

  • Kim, J.-H., Jun, J. & Zhang, B.-T. Bilinear attention networks. In Adv. Neural Inf. Process. Syst. (eds Bengio, S., Wallach, H., Larochelle, H., Grauman, K., Cesa-Bianchi, N. & Garnett, R.) 1571–1581 (PMLR, 2018).

  • Nguyen, B. D. et al. Overcoming data limitation in medical visual question answering. In Proc. Medical Image Computing Computer Assisted Intervention–MICCAI 2019: 22nd Int. Conf. (eds Shen, D. et al.) 522–530 (Springer, 2019).

  • Li, L. H., Yatskar, M., Yin, D., Hsieh, C.-J. & Chang, K.-W. VisualBERT: a simple and performant baseline for vision and language. Preprint at https://arxiv.org/abs/1908.03557 (2019).

  • Naseem, U., Khushi, M., Dunn, A. G. & Kim, J. K-PathVQA: knowledge-aware multimodal representation for pathology visual question answering. IEEE J. Biomed. Health Inf. 28, 1886–1895 (2024).

  • He, X., Zhang, Y., Mou, L., Xing, E. & Xie, P. PathVQA: 30000+ questions for medical visual question answering. Preprint at https://arxiv.org/abs/2003.10286 (2020).

  • Barbano, C. A. et al. UniToPatho, a labeled histopathological dataset for colorectal polyps classification and adenoma dysplasia grading. In 2021 IEEE Int. Conf. Image Processing (ICIP) (eds alZahir, S., Labeau, F. & Mock, K.) 76–80 (IEEE, 2021).

  • Brancati, N. et al. BRACS: a dataset for breast carcinoma subtyping in H&E histology images. Database 2022, baac093 (2022).

  • Veeling, B. S., Linmans, J., Winkens, J., Cohen, T. & Welling, M. Rotation equivariant CNNs for digital pathology. In Proc. Medical Image Computing Computer Assisted Intervention, MICCAI 2018: 21st Int. Conf. (eds Frangi, A., Schnabel, J., Davatzikos, C., Alberola-López, C. & Fichtinger, G.) 210–218 (Springer, 2018).

  • Kriegsmann, K. et al. Deep learning for the detection of anatomical tissue structures and neoplasms of the skin on scanned histopathological tissue sections. Front. Oncol. 12, 1022967 (2022).

  • Kumar, N. et al. A multi-organ nucleus segmentation challenge. IEEE Trans. Med. Imaging 39, 1380–1391 (2019).

  • Silva-Rodríguez, J., Colomer, A., Sales, M. A., Molina, R. & Naranjo, V. Going deeper through the gleason scoring scale: an automatic end-to-end system for histology prostate grading and cribriform pattern detection. Comput. Methods Programs Biomed. 195, 105637 (2020).

  • Borkowski, A. A. et al. Lung and colon cancer histopathological image dataset (LC25000). Preprint at https://arxiv.org/abs/1912.12142 (2019).

  • Brummer, O., Pölönen, P., Mustjoki, S. & Brück, O. Integrative analysis of histological textures and lymphocyte infiltration in renal cell carcinoma using deep learning. Preprint at bioRxiv https://doi.org/10.1101/2022.08.15.503955 (2022).

  • Kather, J. N. et al. Predicting survival from colorectal cancer histology slides using deep learning: a retrospective multicenter study. PLoS Med. 16, e1002730 (2019).

  • Arunachalam, H. B. et al. Viable and necrotic tumor assessment from whole slide images of osteosarcoma using machine-learning and deep-learning models. PLoS One 14, e0210706 (2019).

  • Han, C. et al. Multi-layer pseudo-supervision for histopathology tissue semantic segmentation using patch-level classification labels. Med. Image Anal. 80, 102487 (2022).

  • Kather, J. N. et al. Pan-cancer image-based detection of clinically actionable genetic alterations. Nat. Cancer 1, 789–799 (2020).

  • Xu, F. et al. Predicting axillary lymph node metastasis in early breast cancer using deep learning on primary tumor biopsy slides. Front. Oncol. 11, 759007 (2021).

  • Roetzer-Pejrimovsky, T. et al. The digital brain tumour atlas, an open histopathology resource. Sci. Data 9, 55 (2022).

  • Atkins, M. B. et al. The state of melanoma: emergent challenges and opportunities. Clin. Cancer Res. 27, 2678–2697 (2021).

  • Thompson, A. K., Kelley, B. F., Prokop, L. J., Murad, M. H. & Baum, C. L. Risk factors for cutaneous squamous cell carcinoma recurrence, metastasis, and disease-specific death: a systematic review and meta-analysis. JAMA Dermatol. 152, 419–428 (2016).

  • VisioMel. VisioMel Challenge: Predicting Melanoma Relapse (2023) (accessed 1 April 2023); https://www.drivendata.org/competitions/148/visiomel-melanoma/page/674/.

  • Ikezogwo, W. et al. Quilt-1M: one million image-text pairs for histopathology. Adv. Neural Inf. Process. Syst. 36, 37995–38017 (2024).


  • Zhang, S. et al. Large-scale domain-specific pretraining for biomedical vision-language processing. Preprint at https://arxiv.org/abs/2303.00915 (2023).

  • Hellmann, M. D. et al. Nivolumab plus ipilimumab in advanced non-small-cell lung cancer. N. Engl. J. Med. 381, 2020–2031 (2019).

  • Gandhi, L. et al. Pembrolizumab plus chemotherapy in metastatic non-small-cell lung cancer. N. Engl. J. Med. 378, 2078–2092 (2018).

  • Samstein, R. M. et al. Tumor mutational load predicts survival after immunotherapy across multiple cancer types. Nat. Genet. 51, 202–206 (2019).

  • Cristescu, R. et al. Pan-tumor genomic biomarkers for PD-1 checkpoint blockade-based immunotherapy. Science 362, eaar3593 (2018).

  • Bagaev, A. et al. Conserved pan-cancer microenvironment subtypes predict response to immunotherapy. Cancer Cell 39, 845–865 (2021).

  • Cui, H. et al. scGPT: toward building a foundation model for single-cell multi-omics using generative AI. Nat. Methods 21, 1470–1480 (2024).

  • Mok, T. S. et al. Pembrolizumab versus chemotherapy for previously untreated, PD-L1-expressing, locally advanced or metastatic non-small-cell lung cancer (KEYNOTE-042): a randomised, open-label, controlled, phase 3 trial. Lancet 393, 1819–1830 (2019).

  • Johnson, D. B., Nebhan, C. A., Moslehi, J. J. & Balko, J. M. Immune-checkpoint inhibitors: long-term implications of toxicity. Nat. Rev. Clin. Oncol. 19, 254–267 (2022).

  • Bray, F. et al. Global cancer statistics 2022: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 74, 229–263 (2024).

  • Bruni, D., Angell, H. K. & Galon, J. The immune contexture and immunoscore in cancer prognosis and therapeutic efficacy. Nat. Rev. Cancer 20, 662–680 (2020).

  • Herbst, R. S. et al. Atezolizumab for first-line treatment of PD-L1-selected patients with NSCLC. N. Engl. J. Med. 383, 1328–1339 (2020).

  • Shazeer, N. et al. Outrageously large neural networks: the sparsely-gated mixture-of-experts layer. In Int. Conf. Learning Representations (eds Bengio, Y. & LeCun, Y.) 1–19 (OpenReview.net, 2017).

  • Bao, H. et al. VLMo: unified vision-language pre-training with mixture-of-modality-experts. Adv. Neural Inf. Process. Syst. 35, 32897–32912 (2022).


  • Esser, P. et al. Scaling rectified flow transformers for high-resolution image synthesis. In Forty-first Int. Conf. Machine Learning (eds Salakhutdinov, R., Kolter, Z., Heller, K., Weller, A., Oliver, N., Scarlett, J. & Berkenkamp, F.) 12606–12633 (PMLR, 2024).

  • Sun, Y. et al. PathAsst: a generative foundation AI assistant towards artificial general intelligence of pathology. In AAAI Conf. Artificial Intelligence (ed. Wooldridge, M.) 5034–5042 (AAAI, 2024).

  • Li, J., Li, D., Xiong, C. & Hoi, S. C. H. BLIP: bootstrapping language-image pre-training for unified vision-language understanding and generation. In Int. Conf. Machine Learning (eds Chaudhuri, K., Jegelka, S., Song, L., Szepesvari, C., Niu, G. & Sabato, S.) 12888–12900 (PMLR, 2022).

  • Devlin, J., Chang, M.-W., Lee, K. & Toutanova, K. BERT: pre-training of deep bidirectional transformers for language understanding. In North American Chapter Assoc. Comp. Linguistics (eds Burstein, J., Doran, C., Pedersen, T. & Solorio, T.) 4171–4186 (ACL, 2019).

  • Ramesh, A. et al. Zero-shot text-to-image generation. In Int. Conf. Machine Learning (eds Meila, M. & Zhang, T.) 8821–8831 (PMLR, 2021).

  • Peng, Z., Dong, L., Bao, H., Ye, Q. & Wei, F. BEiT v2: masked image modeling with vector-quantized visual tokenizers. Preprint at https://arxiv.org/abs/2208.06366 (2022).

  • Wang, X. et al. Transformer-based unsupervised contrastive learning for histopathological image classification. Med. Image Anal. 81, 102559 (2022).

  • Shen, Y., Luo, Y., Shen, D. & Ke, J. RandStainNA: learning stain-agnostic features from histology slides by bridging stain augmentation and normalization. In Int. Conf. Medical Image Computing and Computer-Assisted Intervention (eds Wang, L., Dou, Q., Fletcher, P. T., Speidel, S. & Li, S.) 212–221 (Springer, 2022).

  • Kang, M., Song, H., Park, S., Yoo, D. & Pereira, S. Benchmarking self-supervised learning on diverse pathology datasets. 2023 IEEE/CVF Conf. Computer Vision Pattern Recognition (CVPR) (eds Chellappa, R., Matas, J., Quan, L. & Shah, M.) 3344–3354 (IEEE, 2022).

  • Loshchilov, I. & Hutter, F. Decoupled weight decay regularization. In Int. Conf. Learning Representations (ed. Sainath, T.) 1–18 (OpenReview.net, 2019).

  • Ilse, M., Tomczak, J. & Welling, M. Attention-based deep multiple instance learning. In Int. Conf. Machine Learning (eds Dy, J. & Krause, A.) 2127–2136 (PMLR, 2018).

  • Weinstein, J. N. et al. The cancer genome atlas pan-cancer analysis project. Nat. Genet. 45, 1113–1120 (2013).

  • Kefeli, J. & Tatonetti, N. TCGA-reports: a machine-readable pathology report resource for benchmarking text-based AI models. Patterns 5, 100933 (2024).

  • Achiam, J. et al. GPT-4 technical report. Preprint at https://arxiv.org/abs/2303.08774 (2023).

  • Callahan, A. et al. The Stanford Medicine data science ecosystem for clinical and translational research. JAMIA Open 6, ooad054 (2023).

  • Lu, M. Y. et al. Data-efficient and weakly supervised computational pathology on whole-slide images. Nat. Biomed. Eng. 5, 555–570 (2021).
