Academic Conference

Literature Review | Progress and Challenges of AI in the Era of Large Models

2025-07-04 | 4 min | Heer Medical Brand Center

Literature review: Progress and challenges of pathology AI in the era of large models

The era of large models has brought unprecedented opportunities for artificial intelligence, but it also presents numerous challenges.

In March 2025, the Department of Pathology at Shanghai Jiao Tong University Affiliated Ruijin Hospital, in collaboration with the Department of Pathology at China-Japan Friendship Hospital, jointly published "Progress and Challenges of Pathology AI in the Era of Large Models" in the Chinese Journal of Pathology. This article reviewed the research progress of international pathology large models, analyzed the key technologies and core algorithms for building large models in the pathology field, explored the potential value of large models in clinical practice, education, and research, and summarized the challenges encountered in practical applications along with directions for improvement.

Citation: Da Q, Wang S, Wang W, et al. Progress and challenges of pathology AI in the era of large models[J]. Chinese Journal of Pathology, 2025, 54(03): 305-309.

1 Overview of Pathology Large Models

Pathology large models refer to large-scale models applied in the pathology field. By training on massive pathology data, they effectively assist in pathological diagnosis, feature analysis, and medical decision-making. Pathology large models can generally be divided into three categories: language models, image models, and multimodal models (e.g., language-image models).

  • Language models primarily process text. In pathology applications, they are mainly used for information extraction, knowledge Q&A, summarization, and report text generation tasks.
  • Image models are foundation models trained on pathology images. They encode pathological image features, converting each image into feature vectors that feed downstream image-based tasks such as tumor region identification, benign/malignant classification or grading, prognosis prediction, and quantitative analysis.
  • Multimodal models, such as language-image models, are trained on paired pathology images and pathology report texts, enabling cross-modal application tasks such as image-based text Q&A.
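
The image-model idea above (encode images into feature values, then model downstream tasks on those features) can be illustrated with a minimal sketch. It assumes a slide is tiled into patches, each patch is encoded, and the patch features are mean-pooled into one slide-level vector. The function names and the `toy_encoder` are hypothetical stand-ins, not the method of any model named in the article; a real pipeline would call a pretrained encoder instead.

```python
from statistics import mean, pvariance

def tile(image, size):
    """Split a 2-D grid of pixel intensities into non-overlapping size x size patches."""
    rows, cols = len(image), len(image[0])
    patches = []
    for r in range(0, rows - size + 1, size):
        for c in range(0, cols - size + 1, size):
            patches.append([row[c:c + size] for row in image[r:r + size]])
    return patches

def toy_encoder(patch):
    """Hypothetical stand-in for a pretrained image model: patch -> 2-D feature vector."""
    pixels = [p for row in patch for p in row]
    return [mean(pixels), pvariance(pixels)]

def slide_embedding(image, size):
    """Mean-pool the patch features into a single slide-level feature vector."""
    features = [toy_encoder(patch) for patch in tile(image, size)]
    return [mean(dim) for dim in zip(*features)]

# Toy 4x4 "slide": left half dark (0.0), right half bright (1.0).
slide = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
patches = tile(slide, 2)
embedding = slide_embedding(slide, 2)
```

The resulting `embedding` is what a downstream classifier or survival model would consume. Mean pooling is only one possible choice for combining patch features.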

Since 2023, with the rapid development of histopathology and molecular pathology, combined with innovations in computer vision and bioinformatics, multiple large models have been released globally, including PLIP, Virchow, CONCH, UNI, Prov-GigaPath, PRISM, and PathChat.

Figure 1. Overview of pathology large models

2 Research and Development of Pathology Large Models

The R&D pipeline for pathology large models encompasses data preprocessing, model construction, algorithm optimization, computing power integration, and downstream task fine-tuning. Large models show new trends in four areas: datasets, algorithms, computing power, and collaboration.

  • Data | Large model training datasets are massive and widely distributed

    Large model training draws on pathology images and text at the petabyte (PB) scale, with image data reaching hundreds of trillions of pixels. Multimodal, cross-scale samples covering a broad disease spectrum lay the foundation for performance improvement.

  • Algorithm | Large model algorithm architectures are highly scalable

    Large models often employ self-supervised learning strategies such as DINO and DINOv2, allowing training on unlabeled data, greatly expanding available data sources, reducing dependence on manually annotated data, and enhancing model generalization capabilities.

  • Computing Power | Training computing power shows a distributed parallel trend

    Large model training is built on distributed and parallel computing architectures, combining model and data parallel strategies to construct large-scale computing clusters.

  • Collaboration Model | Team division of labor and collaboration

    Self-supervised pathology large models are first pre-trained on large-scale unlabeled data and then fine-tuned on small labeled samples for each downstream task. This workflow spans data processing, algorithm construction, and computing power support, and so encourages collaboration across multiple teams.
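
To make the self-supervised strategy mentioned under Algorithm more concrete, here is a minimal sketch of a DINO-style self-distillation objective: a student network is trained to match the centered, sharpened output of a teacher network on another view of the same unlabeled image, and the teacher's weights follow the student as an exponential moving average. The temperatures, momentum, and toy logits are illustrative assumptions, and the function names are hypothetical; this is not the training code of any model named in the article.

```python
import math

def log_softmax(logits, temperature):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    log_total = peak + math.log(sum(math.exp(x - peak) for x in scaled))
    return [x - log_total for x in scaled]

def self_distillation_loss(student_logits, teacher_logits, center,
                           student_temp=0.1, teacher_temp=0.04):
    """Cross-entropy from the centered, sharpened teacher distribution to the student."""
    centered = [t - c for t, c in zip(teacher_logits, center)]
    teacher_probs = [math.exp(lp) for lp in log_softmax(centered, teacher_temp)]
    student_log_probs = log_softmax(student_logits, student_temp)
    return -sum(p * lp for p, lp in zip(teacher_probs, student_log_probs))

def ema_update(teacher_params, student_params, momentum=0.996):
    """Teacher weights track the student as an exponential moving average."""
    return [momentum * t + (1.0 - momentum) * s
            for t, s in zip(teacher_params, student_params)]

# Toy outputs for two augmented views of the same unlabeled patch.
teacher_view = [2.0, 0.5, 0.1]
center = [0.0, 0.0, 0.0]
loss_agree = self_distillation_loss([2.0, 0.5, 0.1], teacher_view, center)
loss_disagree = self_distillation_loss([0.1, 0.5, 2.0], teacher_view, center)
new_teacher = ema_update([1.0, 1.0], [0.0, 2.0], momentum=0.9)
```

No labels appear anywhere in the loss, which is why this family of methods can use unannotated slides. The loss is lower when the student agrees with the teacher (`loss_agree`) than when it does not (`loss_disagree`).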

Figure 2. R&D workflow of pathology large models

3 Advantages of Pathology Large Models

  • At the data level, AI large models can efficiently process massive multi-dimensional data and fuse multimodal data, including text, whole-slide images (WSI), and genomics. Because these modalities carry complementary (orthogonal) information, combining them strengthens the associations among medical data and supports more accurate assessment of clinical trajectories and prognoses.
  • At the algorithm level, large models are evolving from visual encoders to vision-language models, and their zero-shot classification capabilities allow a single model to serve multiple downstream tasks.
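
The zero-shot classification mentioned above can be sketched as follows, assuming a vision-language model that embeds images and text prompts into a shared vector space: the image is assigned to the class whose prompt embedding is most similar to it, with no task-specific training. The embedding values and class names below are made up for illustration, and `zero_shot_classify` is a hypothetical helper; a real system would obtain these vectors from the model's image and text encoders.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def zero_shot_classify(image_vec, prompt_vecs):
    """Return the label whose prompt embedding is closest to the image embedding."""
    scores = {label: cosine(image_vec, vec) for label, vec in prompt_vecs.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Made-up embeddings standing in for text-encoder outputs of class prompts.
prompt_vecs = {
    "tumor": [0.9, 0.1, 0.2],
    "normal tissue": [0.1, 0.8, 0.3],
}
# Made-up embedding standing in for the image-encoder output of one patch.
image_vec = [0.8, 0.2, 0.1]

label, scores = zero_shot_classify(image_vec, prompt_vecs)
```

Adding a new class only requires adding a new prompt, which is why one model can cover several downstream tasks.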

4 Application Scenarios of Pathology Large Models

Pathology large models demonstrate outstanding performance in downstream tasks, mainly in the following areas:

  • Cancer typing diagnosis.
  • Prediction of tumor-related biomarkers.
  • Prediction of disease prognosis and patient survival.
  • Multimodal retrieval and automatic pathology image report generation.
  • Multimodal generative AI assistants.
  • Few-shot and zero-shot downstream tasks.

5 Limitations and Challenges of Pathology Large Models

  • Training Datasets

    Data collection, processing, and usage face numerous challenges: pathology scanners produce inconsistent image formats, training data are non-standardized and imbalanced in volume, data types are diverse, and annotation is costly. As a result, the value of available data is underused, which hinders development.

  • Training Costs

    Pre-training a pathology large model carries extremely high R&D and deployment costs: it demands massive computing resources, hardware deployment, and system maintenance, and it consumes significant energy. According to the article, the minimum hardware configuration for a pathology large model is eight NVIDIA V100 GPUs with 32 GB of VRAM each.

  • Inherent Model Limitations

    1) Insufficient generalization capability of models for diverse pathology sample data; 2) Model interpretability during decision-making is not transparent; 3) Model accuracy and stability are affected by data quality; 4) Large model "hallucinations"; 5) Comprehensive evaluation methods for large model accuracy and safety are still insufficient.

  • Large Model Regulations

    Physicians should balance AI assistance with clinical skills to avoid over-reliance, as model updates may lag behind medical advances, potentially leading to diagnostic bias.
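
As a rough check on the hardware figure cited under Training Costs above, the sketch below compares the aggregate memory of eight 32 GB GPUs with the memory needed to hold training state. It assumes mixed-precision training with the Adam optimizer, a common rule of thumb of about 16 bytes per parameter (half-precision weights and gradients plus full-precision master weights and two optimizer moments), and a hypothetical one-billion-parameter model. Activations, which often dominate for large images, are not counted, so this is a lower bound rather than a sizing guide.

```python
def training_state_gb(num_params, bytes_per_param=16):
    """Approximate memory (GB) for weights, gradients, and Adam state.

    bytes_per_param=16 is an assumed rule of thumb for mixed-precision Adam:
    2 (fp16 weights) + 2 (fp16 gradients) + 4 (fp32 master weights)
    + 4 (first moment) + 4 (second moment).
    """
    return num_params * bytes_per_param / 1e9

# Figure from the article: 8 GPUs with 32 GB of VRAM each.
cluster_gb = 8 * 32

# Hypothetical model size chosen only for illustration.
hypothetical_params = 1_000_000_000
state_gb = training_state_gb(hypothetical_params)

# Memory left for activations and buffers under these assumptions.
headroom_gb = cluster_gb - state_gb
```

Under these assumptions the training state alone takes a small share of the 256 GB total, leaving the remainder for activations, data buffers, and framework overhead.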

6 Future Outlook for Pathology Large Models

Pathology large models are expected to expand healthcare horizons through full-lifecycle health data management in the future, reflected in:

  • Multimodal data integration and interdisciplinary fusion, providing comprehensive diagnosis and treatment recommendations, integrating genomic data for precision medicine;
  • Personalized precision medicine, customizing plans based on patient conditions, improving efficacy and quality of life;
  • Real-time dynamic monitoring of pathological processes, adjusting treatment plans promptly, and monitoring pathology images to predict relapse risk;
  • Serving as educational and training tools, enhancing skills and knowledge, simulating diagnoses with case feedback, and improving learning outcomes through interactive learning;
  • In global health and prevention, analyzing pathology data to reveal trends and risks, providing evidence for policy-making and prevention, and promoting health equity through cross-regional applications.
