Acquired genetic and cell-state changes in IDH-mutant glioma progression

June 4, 2026

51

Human glioma samples

Frozen glioma tissue specimens with the IDH1 or IDH2 mutation were collected with informed consent from the following tissue source sites: Saint Joseph’s Hospital, the Luxembourg Institute of Health-NORLUX (LIH-NORLUX), the MD Anderson Cancer Center (MDACC), Seoul National University (SNU) Hospital and the Pitié-Salpêtrière Hospital. Sample collection was approved by each tissue source site’s Institutional Review Board (IRB). The IRB protocol numbers of respective institutes are as follows; MDACC, 2012-0441; LIH-NORLUX, 201201/06; SNU, H-2004-049-1116; Pitié-Salpêtrière Hospital, 96-900; and Saint Joseph’s Hospital, 2020-NHSR-0084. IDH mutation status and chromosome 1p and 19q co-deletion status were obtained from pathology reports and confirmed by DNA-seq in this study when available. Detailed clinical information, including patient sex, age at diagnosis and treatment information, is provided in Supplementary Table 1. Astrocytoma samples with evidence of CDKN2A homozygous deletion were assigned grade 4 status in accordance with the 2021 WHO diagnostic criteria¹. Adjacent fragments from each glioma sample were aliquoted for single-nucleus assays and bulk tissue nucleic acid isolation when sufficient material was available.

Tissue processing and nucleus isolation

Isolation of nuclei from frozen glioma samples was performed by three separate laboratories using two different protocols. Frozen glioma samples from the same patient were always processed together. Laboratory 1 and laboratory 2 processed the cohorts from MDACC and Pitié-Salpêtrière Hospital using a previously described protocol⁴⁶. In brief, frozen samples were thawed and mechanically dissociated in 0.49% CHAPS with salts and Tris (ST) buffer (10 mM Tris-HCL pH 7.5, 146 mM NaCl, 1 mM CaCl₂ and 21 mM MgCl₂). Single-nucleus suspensions were filtered using a 40 μm strainer, centrifuged at 500g for 5 min and resuspended in ST buffer with 0.01% BSA (Sigma). Final nucleus suspensions were stained by Trypan blue and counted using a haemocytometer. The number of nuclei was then determined for use in a 10x Genomics workflow (that is, targeted nucleus recovery) or a FACS-sorting for Smart-seq2 workflow.

Laboratory 3 processed the cohorts from LIH-NORLUX, SNU and Saint Joseph’s Hospital with a protocol using EZ lysis buffer (Millipore Sigma). In brief, frozen samples were thawed and mechanically dissociated in Nuclei EZ lysis buffer via dounce homogenization. The solutions were incubated on ice for 5 min and mixed 1–2 times during incubation. Single-nucleus suspensions were filtered through a 70 μm strainer and centrifuged at 500g for 5 min at 4 °C, resuspended in Nuclei EZ lysis buffer and incubated on ice for 5 min. The solutions were centrifuged at 500g for 5 min at 4 °C and resuspended 3 times in 1% BSA and 0.2 U μl^–1 RNase inhibitor and PBS buffer. The final suspension of nuclei was stained with DAPI, filtered through a 40 μm strainer and counted using a Countess II automated cell counter (Thermo Fisher Scientific). The number of nuclei was then determined for use in 10x Genomics 3′ RNA or Multiome workflow.

Bulk DNA-seq

Bulk DNA-seq was performed in a tissue source site-specific manner. For samples from the MD Anderson Cancer Center, DNA was extracted from each frozen glioma sample and blood sample corresponding to the patients using a DNeasy Blood & Tissue kit (Qiagen). The genomic DNA (100–250 ng) was acoustically sheared using an ultrasonicator (Covaris), targeting 150 bp fragments. Library preparation was performed using a KAPA HyperPrep kit (KAPA Biosystems) followed by clean-up using AMPure XP beads (Beckman Coulter). Exome capture was performed using a custom exome bait (manufactured by Twist Biosciences to The Broad Institute’s specification). Captured libraries were sequenced with 150-bp paired-end sequencing on a NovaSeq 6000 (Illumina). For the samples from Pitié-Salpêtrière Hospital, after the DNA was fragmented using a LE220 ultrasonicator (Covaris) and size selected, library preparation and capture were performed using a Twist Human Core Exome kit (Twist Bioscience) according to the manufacturer’s protocol. Sequencing was performed on a NovaSeq 6000 (Illumina). WGS data were generated for frozen samples from Saint Joseph’s Hospital, SNU and LIH-NORLUX. In brief, DNA was extracted from each glioma sample using an AllPrep DNA/RNA Minikit (Qiagen) for samples with sufficient tumour tissue and matched normal blood when it was available. DNA was sheared to 400 bp using a LE220 ultrasonicator (Covaris) and size-selected using AMPure XP beads (Beckman Coulter). Whole-genome libraries were prepared and sequenced with 150-bp paired-end sequencing on a NovaSeq 6000 (Illumina). Whole-exome sequencing was additionally performed for Saint Joseph’s Hospital samples using Agilent SureSelect Human All Exon v7 capture kit, followed by 150-bp paired-end sequencing on a NovaSeq 6000 (Illumina). The WGS data for the SNU cohort was previously reported¹⁴.

snRNA-seq and ATAC–seq

A 10x Chromium Single Cell 3′ Reagent kit v.3 (10x Genomics) was used according to the manufacturer’s protocol. In brief, nuclei were loaded on a Chromium chip (10x Genomics) with a target cell recovery of 6,000–8,000 nuclei and processed in a Chromium Controller. Single-nucleus samples were partitioned into gel beads-in-emulsion (GEMs), followed by RNA reverse transcription with barcoding. Libraries were created by breaking GEMs and pooling barcoded fractions, cDNA amplification, fragmentation and attachment of a sample index and adapter and sequenced on a Nextseq500 or Novaseq (Illumina). For three samples in the Saint Joseph’s Hospital cohort, a 10x Chromium Single Cell 3′ Reagent kit v.3 was used. For LIH-NORLUX, SNU and the remainder of the Saint Joseph’s Hospital samples, nuclei were loaded on a Chromium chip with a target cell recovery of 6,000 nuclei for 10x single-cell Multiome ATAC and gene expression according to the manufacturer’s protocol.

Nucleus sorting for Smart-seq2

Single-nucleus samples stained using Vybrant DyeCycle Ruby stain (Thermo Fisher Scientific) were sorted on a FACS Aria Fusion sorter (Becton Dickinson) with a 640 nm laser (670/14 filter). After doublets were discriminated, singlet nuclei were selected with Ruby positive and were sorted into 96-well plates containing TCL buffer (Qiagen) with 1% β-mercaptoethanol. After sorting, plates were immediately frozen on dry ice and stored at –80 °C before Smart-seq2 experiments.

The Smart-seq2 protocol was performed as previously published with slight modification for single-nucleus profiling^3,47. In brief, on 96-well plates, RNA derived from single nuclei was first purified with Agencourt RNAClean XP beads (Beckman Coulter). Then, Oligo-dT primed reverse transcription was performed using Maxima H Minus reverse transcriptase (Thermo Fisher Scientific) and locked TSO oligonucleotides (Qiagen). This was followed by PCR amplification (22 cycles) using KAPA HiFi HotStart ReadyMix (KAPA Biosystems) with subsequent Agencourt AMPure XP bead purification. Libraries were tagmented using a Nextera XT Library Prep kit (Illumina) with custom barcode adapters. Pooled libraries were sequenced on a NextSeq 500 or Novaseq 6000 sequencer (Illumina).

Somatic variant detection and analysis

DNA-seq alignment, fingerprinting, somatic variant detection (Mutect2) and copy number segmentation were performed in accordance with the Genome Analysis Toolkit (GATK) best practices using GATK (v.4.0.10.1), as previously described^11,14. DNA fingerprint analysis using ‘CrosscheckFingerprints’ (Picard) confirmed that all samples belonging to a patient came from the same individual, which indicated that there were no sample mismatches in this study. Patient glioma samples without matched normal blood were analysed for CNAs, and tumour-only single-nucleotide variant detection was performed with a panel of normal references according to GATK best practices. Tumour-only somatic variants were used only to confirm IDH1 mutation status and to identify any hypermutation events. To identify samples with increased small deletion mutation burden, the following thresholds were applied: the recurrence-specific small deletion mutation burden needed to be >0.2 mut per Mb sequenced and a >0.1 mut per Mb increase when comparing all small deletion variants in the matched recurrence versus the initial glioma sample. Mutational signature estimation was performed using COSMIC signatures as previously published^11,48, including the SigProfilerExtractor tool (v.1.2.1). For samples with matched blood, treatment-associated DNA hypermutation was determined by an elevated mutation burden (>10 mut per Mb) and COSMIC SBS11 signal (alkylating agent-associated signature). For samples without matched blood, there were no samples with evidence of hypermutation. Samples with acquired increases in the proportion of the genome with SCNAs were defined as samples with a >50% increase in total SCNA burden at matched recurrence relative to the initial sample. In samples without bulk DNA samples at both time points, acquired SCNA burden increase was determined using the snRNA-inferred CNA profiles. Focal PDGFRA amplifications were determined by a combination of GATK CNA detection, DNA methylation array, Sequenza⁴⁹ and AmpliconArchitect, which is part of AmpliconSuite-pipeline (v.0.1344.2), which was run using default settings⁵⁰.

snRNA-seq analysis

All snRNA-seq 10x data were preprocessed with cellranger count (cellranger v.6.1.2) using the GRCh38-2020-A reference transcriptome downloaded from the 10x website and with ‘include introns’ set to true. Filtered feature count matrices were loaded into R using Seurat⁵¹ (v.4.1.3) with the Read10X function. To retain high-quality nuclei, nuclei with <1,000 genes detected, >10,000 genes detected or >5% mitochondrial reads were filtered out. DoubletFinder (v.2.0.3) was then used to identify nuclei doublets independently in each sample, and the doublets were subsequently removed⁵². The expression data were then processed by normalization, scaling, PCA based on the 5,000 most highly variable genes, Harmony batch correction with the batch variable set to patient (all samples belonging to a patient were processed in the same 10x run) and laboratory (for astrocytomas), UMAP and clustering using the Louvain algorithm in Seurat with resolution set to 0.6 (ref. ⁵³). Cell-type annotation was performed on the basis of gene expression markers for each cluster and confirmed via mapping to a normal reference brain atlas (Azimuth) that provided a per-nuclei cell-type prediction. This process identified clusters with a small number of nuclei expressing multiple cell-type signatures. These nuclei were presumed to be residual doublet populations and were removed. Azimuth integration provided classifications at the cell class level (that is, neuronal versus non-neuronal) and a more granular subclass (astrocyte, oligodendrocyte precursor cell, endothelial, among others) that confirmed appropriate assignment by expression clustering.

Malignant and non-malignant cell-type assignment

Malignant status was confirmed by integrating expression-based clusters with snRNA-inferred copy number. Specifically, large-scale CNAs were determined for each sample separately using the software inferCNV of the Trinity CTAT Project (https://github.com/broadinstitute/inferCNV), with the reference nuclei set to myeloid and oligodendrocytes from that sample. For one sample without sufficient myeloid or oligodendrocyte nuclei, its matched initial sample’s reference nuclei were used to infer copy number. Predicted copy number altered regions based on a moving average of a 101 gene window were determined with the hidden Markov model (HMM) parameter enabled. The HMM-predicted CNA levels were extracted to define three metrics used in defining malignant and non-malignant nuclei:

1)

CNA signal per chromosome: the proportion of genes gained and lost per chromosome that produces two chromosome-specific metrics (for example, the proportion of chromosome 1 with copy gain and the proportion of chromosome 1 with copy loss) that reflected the CNA burden per chromosome.
2)

CNA signal per nucleus: the mean of all per-nucleus chromosome CNA signals, which reflects the total CNA burden of that nucleus independent of gains or losses.
3)

CNA correlation: the Pearson’s correlation between the CNA signal per chromosome values and the average per chromosome CNA profile of all expression-based malignant nuclei or bulk DNA CNA profile from the same glioma sample.

To assign an integrated expression and to inferCNV malignant cell definition, different strategies were used for oligodendrogliomas and astrocytomas, respectively. For oligodendrogliomas, the presence or absence of the tumour-type-defining chromosome 1p and 19q co-deletion was used to confirm malignant status. Specifically, nuclei that were annotated as malignant by gene expression clusters that had >0.15 proportion of chromosome 1 or chromosome 19 with copy number loss were retained as malignant, whereas nuclei not meeting this criterion were classified as ‘unresolved’. For the nuclei annotated as non-malignant by expression clustering, that were not assigned to the neuronal class by Azimuth integration and had >0.15 proportion of chromosome 1 and chromosome 19 with copy number loss (that is, a clonal CNA event) were classified as unresolved. For astrocytomas, nuclei annotated as malignant by gene expression that had a >0.3 CNA correlation coefficient or a per nucleus CNA signal of >0.15 were retained as malignant, whereas nuclei not meeting this criterion were classified as unresolved. For the nuclei annotated as non-malignant by expression clustering, that were not assigned to the neuronal class by Azimuth integration and had >0.3 CNA correlation or >0.15 per nucleus CNA signal were classified as unresolved. Nuclei classified as unresolved were excluded from downstream analyses. Furthermore, for samples in which bulk DNA CNA data that passed quality control or had purity metrics were available, a per-chromosome metric representing the fraction of the chromosome gained or lost in bulk DNA was determined and correlated with the per-chromosome CNA signal (inferCNV) of each nucleus, which provided an additional layer of assessment.

Deriving metaprograms from gene expression data

To capture the heterogeneity among nuclei of the same type, we used NMF⁵⁴, a method previously applied to identify variability in single-cell and single-nucleus expression data^18,19,20. NMF was performed on the relative expression values of each sample independently after setting negative values to zero. The NMF algorithm requires the pre-definition of the k parameter, which represents the expected number of latent features in the data. As k varies between samples and is largely unknown, we applied the NMF algorithm to each sample using a range of values (3–10). Each of these NMF programs was summarized by the top 50 genes based on NMF coefficients. The process of deriving metaprograms from the NMF programs has been previously detailed¹⁸ and is described in brief here. The method first eliminates NMF programs that are either not robust (that is, they do not recur in or across samples) or redundant in a sample (that is, they significantly overlap with other NMF programs in the same sample). The robust NMF programs are then clustered based on Jaccard similarity using a clustering algorithm that iteratively assesses the degree of overlap between programs, combining highly overlapping ones into a cluster. Each cluster is defined by the top 50 most recurrent genes to form a metaprogram.

The algorithm identified 12 malignant metaprograms, which were annotated using functional enrichment analysis (for example, GO terms, mSigDB Hallmark gene sets and gene sets derived from normal brain development datasets). Metaprograms were excluded if they reflected quality control issues (for example, mitochondrial or ribosomal genes). Metaprograms were also excluded that were not detected in both tumour types with a minimum 25% contribution from at least one tumour type. The metaprograms derived from the primary analysis included OPC, AC1, AC2, MES and CC. We repeated the metaprogram analysis for the Undifferentiated population and identified a NPC metaprogram, which was added to the malignant metaprograms used for cell-state assignment.

To provide an orthogonal description of the gene-based metaprograms, we modified the NMF-based algorithm by considering single-nucleus pathway activities instead of gene expression. To do this, we identified 6,086 gene sets collected from Hallmark, Gene Ontology and KEGG that had a minimum of 15 genes that also overlapped with the snRNA data. Pathways were scored independently for each malignant nuclei in a sample. The NMF-based approach to determine metaprograms was then applied as defined above for the pathway activities scores. This resulted in 11 pathway-based metaprograms (PMPs) that were named according to the most represented pathway activities, which included pathways such as mitochondrial energy production as well as glycolysis and stress PMPs shown in Extended Data Fig. 5i.

Cell-state scoring and assignment of nuclei to states

The Seurat function AddModuleScore was used to score each 50-gene metaprogram against a background set of genes to derive a score. Malignant nuclei were scored for the NMF metaprograms independently for each sample. To facilitate cell-state classification, we generated 20 shuffled expression matrices by sampling each time 5,000 nuclei and shuffling the expression values for each gene. We then scored each shuffled matrix for the NMF metaprograms, thereby producing 100,000 normally distributed scores for each expression program. These served as null distributions for cell-state classification. For each nucleus, we computed a P value for each of the expression programs with a Z-test (R’s pnorm function) using the statistics of the null distributions that we previously generated. We adjusted all P values for multiple testing using the Holm method. Each cell was classified into a specific state if the adjusted P value computed for that state was <0.05. Nuclei that achieved an adjusted P < 0.05 for multiple states were assigned to the state for which they scored maximally. Nuclei that did not achieve an adjusted P < 0.05 for any of the states were assigned an Undifferentiated state. Unlike the other metaprograms that are positively defined by metaprogram expression, the Undifferentiated state is defined by the absence of high AC-like, MES-like, NPC-like and OPC-like expression. Although we observed that Undifferentiated nucleus transcriptional profiles are consistent with an intermediate or transitional stem-like state, it is possible that rare states not commonly found across patients with an IDH mutation are assigned to the Undifferentiated state. Nuclei were assigned a ‘cycling’ state on top of their cellular state if they achieved an adjusted P < 0.05 for the cell cycle metaprogram and non-cycling otherwise. Cell-state assignment for the external scRNA-seq, internal Smart-seq2 cohorts, and myeloid nuclei used a threshold of a nominal P < 0.05. Separately, malignant nuclei were scored for previously published malignant and non-malignant signatures using Seurat’s AddModuleScore function. Following gene set score calculation, the IDH-mutant hierarchy coordinates for stemness and lineage scores were calculated as previously described^10,24. In brief, the lineage scores were determined by the maximum AC-like and OC-like gene set program, whereas the stemness score reflected the difference in the stemness gene set and lineage score.

Pseudobulk snRNA analyses

Pseudobulk transcriptional profiles for PCA and differential expression from snRNA data were generated with the R package presto’s collapse_counts function (https://github.com/immunogenomics/presto). Pseudobulk profiles were constructed for malignant state-specific profiles with a minimum number of 20 nuclei per sample per state. Differential expression for pseudobulk profiles was performed using presto’s pseudobulk_deseq2 function with a design matrix that included patient, tumour type and time point. Enrichment of differentially expressed genes was performed using the R package fgsea (https://github.com/ctlab/fgsea/) and gene sets from MsigDB’s Hallmark gene sets. The pseudobulk profiles were log₂-normalized, and the top 1,000 most variable genes were used as input for PCA.

Analysis of external IDH-mutant scRNA-seq and snRNA-seq datasets

Re-analysis of external cohorts were separately preprocessed from raw counts. When unavailable, raw counts were recovered from normalized counts using a script from GitHub (https://github.com/immunitastx/recover-counts). Cell-type identification was either provided with the study or was determined via dimensionality reduction and clustering followed by copy number inference to confirm malignant status. Derivation of gene expression metaprograms for malignant cells or nuclei and malignant-cell-state classification was performed as described above for the CARE IDH-mutant cohort. The gene expression metaprograms for CARE IDH wild-type glioblastoma were determined by the same laboratories for 121 IDH wild-type glioblastomas using the same profiling and analytical methods^23,55. The Jaccard similarity index was used to assess overlap of specific genes used across the metaprograms derived from different cohorts.

Bulk RNA deconvolution and cell cycle scoring

Cell-type abundances were estimated from bulk RNA-seq gene expression matrices using CIBERSORTx^14,56. In brief, a reference signature matrix was created from the CARE IDH-mutant snRNA dataset for samples with accompanying bulk RNA-seq using the ‘Create Signature Matrix’ module on the CIBERSORTx webserver (https://cibersortx.stanford.edu) with default parameters, except the minimum expression was set to 0.5 per instructions for 10x data. We downsampled our larger single-nucleus gene expression matrix and used the major cell types and the malignant-cell states to construct the signature matrix. Transcriptionally similar cell types (for example, excitatory and inhibitory neurons, mural and endothelial cells) were collapsed following initial assessment of CIBERSORTx deconvolution performance. Cell-type abundances were then imputed using the ‘Impute Cell Fractions’ module in single-cell mode (S-mode batch correction) and 100 permutation parameters for TCGA and GLASS bulk RNA expression datasets^13,14. Cell-type abundance inference performance in bulk RNA samples was assessed using Pearson’s correlation coefficient for glioma samples for which both snRNA and bulk RNA-seq were collected from the same tumour sample. Malignant cell cycle and myeloid metaprograms in bulk RNA-seq were assessed using a ssGSEA score from the R package GSVA⁵⁷.

snATAC–seq analysis

snATAC–seq data were available for a subset of the CARE IDH-mutant cohort from 10x Multiome data. snATAC–seq data were processed using Cell Ranger ARC (v.2.0.2). The resulting fragment files were imported by ArchR⁵⁸ and Arrow files were created with minimum quality metrics of transcriptional start site (TSS) enrichment of 4 and 1,000 fragments. The ArchR function addDoubletScores with k = 10, and filterDoublets was used to identify and remove doublets. Following doublet removal, nuclei were filtered down those that passed quality control for both snRNA and snATAC. The addIterativeLSI function was used to compute an iterative latent semantic indexing dimensionality reduction for a tile matrix of both all nuclei and then separately restricted to only malignant nuclei. For malignant nuclei, Harmony batch correction was performed by sample. During preprocessing, ArchR created a gene activity score, which reflects the regulatory region chromatin accessibility around a gene. For differential accessibility analyses, cell-state-specific comparisons of both gene activity scores and peak matrices were performed using a Wilcoxon rank-sum test adjusted for bias in TSS enrichment and the log₁₀[number of fragments], as implemented by ArchR. TF activity was predicted on a per-cell basis using chromVAR to calculate a z-score based on the per-cell accessibility of a given TF motif (cisbp) that deviates from the expected accessibility based on the average of all nuclei. The R package Copy-scAT was used to determine focal PDGFRA CNA in the snATAC data³⁶. This analysis was restricted to samples with a focal PDGFRA amplification determined by WGS and, in one case for which the whole genome was unavailable, DNA methylation array. In brief, the Copy-scAT function ‘identifyDoubleMinutes’ was used to detect focal copy number amplifications per malignant nucleus from a read depth normalized and scaled fragment matrix. This method uses a mean-variance changepoint analysis to detect amplified signal across each chromosome and amplified regions containing the PDGFRA locus were extracted for downstream analysis.

Single-nucleus genotyping of bulk WGS-derived mutation calls in hypermutant samples

To detect bulk DNA-seq variants in the single-nucleus data, VarTrix was used (v.1.1.22) to extract single-nucleus variant information (https://github.com/10XGenomics/vartrix). VarTrix takes an input variant call format (VCF) file, single nucleus bam file, fasta file and outputs whether the variant had no call (no reads detected), ref/ref (only reference allele detected), alt/alt (only alternative allele detected) or alt/ref (both alleles detected) in a single cell. VarTrix was run both on snRNA and snATAC data and only applied to patients who had at least one glioma sample with hypermutation owing to the few callable sites in samples with only whole-exome data and non-hypermutant samples. All analyses were restricted to nuclei with at least 20 callable sites and at least one variant consistent with a mutation. A callable site was defined as any reference or alternative allele detected at a variant for the associated bulk DNA VCF. The C>T mutation burden metric was defined as the total number of detected C>T variants in the single-nucleus data divided by the total number of callable sites for that cell. A threshold of C>T mutation burden of high versus low was determined per platform (that is, RNA or ATAC). For each platform, a background C>T mutation burden was established by running VarTrix using hypermutant VCF from a patient on its matched non-hypermutant initial sample. Among the hypermutant samples, all nuclei with a C>T mutation burden two times the maximum non-hypermutant (initial) nuclei C>T mutation burden value were assigned as C>T high and all others as C>T low.

Measuring the transcriptional distance between matched pairs

To quantify the transcriptional distance between matched longitudinal pairs, state-controlled pseudobulk profiles were constructed. State was controlled for by downsampling both samples to 25 nuclei per state, provided that at least 25 nuclei were present in each sample for that state. This process ensured that both samples had an equal number of nuclei and were balanced in terms of gene expression across the different cellular states. The pseudobulk profiles were generated by averaging gene expression for each gene across the selected nuclei. Finally, Euclidean distance was calculated between the two resulting vectors, which was used as the measure of transcriptional distance.

Patient-derived GSCs

IDH-mutant GSCs T394NS and T407NS were derived from patient tumours (P61T1 and P61T2, respectively in this cohort) and expanded in orthotopic xenograft models as previously described³⁸. Cells were cultured as non-adherent neurospheres in serum-free Neurobasal medium (ThermoFisher, Gibco 21103049) supplemented with 1× B-27 (ThermoFisher, 17504044), 2 mM UltraGlutamine (Lonza, BE17-605E/U1), 0.1 U ml^–1 heparin (Sigma-Aldrich, H3149), 20 ng ml^–1 bFGF (Miltenyi, 130-093-839), 20 ng ml^–1 EGF (Provitro, 1325960500) and 100 U ml^–1 penicillin–streptomycin at 37 °C, 5% CO₂. Lines were authenticated by short tandem repeat profiling and routinely tested mycoplasma-negative.

IDH1 mutation validation in human MGG152 by Sanger sequencing

Genomic DNA was extracted from IDH-mutant GSC MGG152 cells³⁷ using a QIAamp DNA Micro kit (Qiagen 56304). Next, 500-bp IDH1 amplicons were amplified by using IDH1 sense primers 5′-AATGAGCTCTATATGCCATCACTG-3′ and antisense primers 5′-TTCATACCTTGCTTAATGGGTGT-3′ with 200 ng genomic DNA template. The PCR conditions were as follows: 5 min at 95 °C, 40 cycles of 30 s at 95 °C, 30 s at 61 °C, and 30 s at 72 °C, followed by a final extension step for 7 min at 72 °C. PCR products were purified using a QIAquick PCR Purification kit (Qiagen 28104). By using both IDH1 sense primers and antisense primers listed above, purified PCR amplicons were sequenced on Applied Biosystems 3730xL DNA Analyzers at Yale DNA Sequencing Core.

Immunoblotting for MGG152 CDKN2A
^–/– experiments

IDH1^R132H and CDKN2A^−/− in cells was confirmed via immunoblotting (IDH1^R132H Dianova, DIA-H09 and p16 (CDKN2A), Cell Signaling 80772T). β-Actin (Cell Signaling, 3700S) was used as a loading control.

Co-culture of mouse macrophage and human IDH1-mutant GSC MGG152 cells

Primary mouse macrophages were isolated from the peritoneal cavity of stimulated C57BL/6 (wild-type) mice (3 males and 3 females, 10 weeks of age), pooled, aliquoted and cryopreserved for downstream use. On day 1 of the co-culture experiment, frozen mouse primary macrophages were rapidly thawed, and 250,000 cells were seeded in 6-well plates in DMEM medium (ThermoFisher, Gibco 11965092) with 10% FBS (ThermoFisher, Gibco A5670701) supplemented with 10 ng ml^–1 recombinant M-CSF (BioLegend, 576406). After 12 h, macrophages were washed with MGG152 complete medium 4 times to remove residual FBS, and 450,000 dissociated MGG152 cells were seeded into the wells with macrophages for co-culture. MGG152 complete medium consisted of Neurobasal medium (ThermoFisher, Gibco 21103049) with 0.5% N-2 supplement (100×) (ThermoFisher, Gibco 17502048), 2% B-27 supplement (50×) (ThermoFisher, Gibco 17504044), 1.5% l-glutamine (Corning 25-005-CI), 0.5% antibiotic–antimycotic (Corning 30-004-CI), 20 ng ml^–1 human FGF (PeproTech AF-100-26), 40 ng ml^–1 human EGF (PeproTech AF-100-15), and 10 ng ml^–1 recombinant M-CSF. The same composition for the medium was used for both monoculture and co-culture conditions. Both monoculture and co-culture cells were irradiated using Irradiator X-RAD 320 (Precision X-ray) with 2 Gy dose on each of the subsequent 3 days. After 72 h of recovery period, the cells were collected, dissociated and submitted for 10x scRNA-seq (10x Genomics).

PDGFRA inhibitor dose–response assay

T394NS and T407NS GCSs were dissociated with accutase and seeded into ultralow-attachment 384-well plates at 500 cells per well in 25 µl complete medium, then placed on an orbital shaker for 24 h to allow uniform sphere formation. Inhibitors targeting the PDGFRA pathway, dasatinib (inhibitor of SRC, ABL and PDGFR) and CP-673451 (selective PDGFRα and PDGFRβ inhibitor) from Selleck Chemicals, were prepared as 10 mM DMSO stocks (Sigma-Aldrich) and applied in a twofold, nine-point dilution series spanning 20 µM to 40 nM; DMSO controls were included on every plate. Each condition was tested with four technical replicates and repeated in two independent experiments. Viability was measured after 3 and 6 days after treatment using CellTiter-Glo 3D (Promega) according to the manufacturer’s instructions, and luminescence was recorded on a CLARIOstar plate reader (BMG Labtech). Signals were background-subtracted and normalized to DMSO controls, and dose–response curves were fit in GraphPad Prism using a four-parameter logistic (variable-slope) model with log[inhibitor] versus normalized response to derive IC₅₀ values.

PDGFRA inhibitor treatment

Based on IC₅₀ determination, 3 × 10⁶ IDH-mutant GSCs were seeded per T75 flask in 15 ml NSC medium and treated with either dasatinib or CP-673451 at 1 µM or 5 µM; DMSO served as the vehicle control with matched final concentrations across all conditions. GSCs were exposed continuously for 6 days, after which spheres were dissociated to single cells, washed in PBS and viability assessed with Trypan blue. Each condition was set up in triplicate with one replicate subjected to FACS sorting of viable cells for scRNA-seq.

scRNA-seq using 10x Genomics

PDGFRA inhibitor experiments

From each condition, one replicate was FACS-enriched for viable cells and prepared using a 10x Genomics Chromium GEM-X Single Cell 3′ Gene Expression kit (v.4). Cells were loaded onto a Chromium chip with a target recovery of about 6,000–8,000 cells and run on a Chromium Controller. Individual cells were encapsulated in GEMs, in which reverse transcription and barcode incorporation occurred. Libraries were generated by disrupting the emulsions, pooling the barcoded material, amplifying cDNA, fragmenting it and adding sample indices and adapters. Library size and concentration were assessed with an Agilent High Sensitivity DNA kit on a Bioanalyzer 2100. The pooled single-cell libraries were sequenced on an Illumina NovaSeq 6000.

CDKN2A
^–/– experiments

Parental MGG152, MGG152 non-targeting gRNA control, MGG152 CDKN2A^−/− sg1 and MGG152 CDKN2A^−/− sg3 cells were collected and dissociated into single cells. The single-cell suspension of all four cell lines was washed individually 3 times with 0.1% BSA in PBS and resuspended at 1 × 10⁶ cells per ml. Single cells were processed through a 10x Chromium 3′ Single Cell Platform using Chromium Single Cell 3′ Library, Gel Bead and Chip kits (v.3) following the manufacturers protocol. In brief, 10,000 cells were loaded into each channel of the chip to be partitioned into GEMs in a Chromium instrument, followed by barcoded reverse transcription of RNA in the droplets. This was followed by amplification, fragmentation and addition of an adaptor and sample index. Libraries from four 10x channels were pooled together (twice) and sequenced on one lane of an Illumina Next-Seq 1000/2000 sequencer with paired-end reads: read 1, 28 nucleotides; read 2, 90 nucleotides; index 1, 10 nucleotides; and index 2, 10 nucleotides.

Co-culture experiments

Mouse macrophage and human MGG152 cells were dissociated and processed using the 10x Chromium Single Cell 3′ v4 On-Chip Multiplexing (OCM) protocol. In total, 5,000 cells per experimental group were loaded onto the microfluidic system and all four conditions (MGG152-only, MGG152 + macrophage, MGG152-only with irradiation, and MGG152 + macrophage with irradiation) were processed together for each replicate. The resulting libraries were sequenced on an Illumina NovaSeq X.

IDH-mutant organoids

Three IDH-mutant astrocytoma patient-derived organoids were established from freshly resected tumour specimens obtained from the Yale University Department of Neurosurgery, following a previously described protocol⁵⁹. Tumour tissue diagnosis was performed by neuropathological assessment and confirmed with molecular information provided by clinical whole-exome sequencing. All three organoid models maintained the IDH1^R132H mutation as confirmed by immunohistochemistry. For irradiation experiments, organoids were divided and allocated to either a single 10 Gy dose on day 0 or an untreated control condition, followed by a 7-day recovery period before collection. Organoids were dissociated using a Miltenyi Biotec tumour dissociation kit (Miltenyi Biotec 130-095-929) and processed through the 10x Chromium Single Cell 3′ v4 protocol. Owing to low number of organoids per condition, and therefore the low number of cells per experimental condition, all cells from each model were loaded. The resulting libraries were sequenced on an Illumina NovaSeq X. Two organoid models (GBO60 and GBO74) had sufficient myeloid cells to be analysed in Extended Data Fig. 10i.

scRNA-seq analysis for perturbation experiments

All in vitro perturbation 10x Genomics Single Cell 3′ data were analysed using Cell Ranger (v.9.1). This version was selected because Cell Ranger v.9.0 and later are required to analyse 10x OCM libraries. For OCM libraries, the reference transcriptome was the hybrid GRCh38_and_mm10-2020-A, per the manufacturer’s instructions. For the remaining libraries, the GRCh38-2020-A reference transcriptome was used, consistent with the human glioma samples.

Survival analyses

Survival analyses, including Kaplan–Meier plots and Cox proportional hazards models, were conducted using the R packages survminer and survival. Multiple-comparison correction was applied where indicated. Wilcoxon rank-sum and Wilcoxon signed-rank tests were two-sided unless otherwise noted.

Statistical methods

All data analyses were conducted with R (v.4.2.0 and above), Python (v.3.6 and above) and PostgreSQL (v.14.4).

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.