Non-invasive profiling of the tumour microenvironment with spatial ecotypes

Human subjects

All human samples included in this study were collected with informed consent for research use and received approval from the Institutional Review Boards of Yale University School of Medicine and Washington University School of Medicine, in accordance with the principles of the Declaration of Helsinki (2013). These samples, obtained from a total of 123 human subjects, were divided into four cohorts.

Cohort 1

Intact and dissociated tumour samples were collected from seven patients (four with colon cancer and three with melanoma) at the time of surgery. Each sample underwent bulk RNA sequencing and the dissociated tumour samples also underwent scRNA-seq.

Cohort 2

Matched tumour and plasma cfDNA samples were collected from 23 patients with metastatic melanoma, with matched PBMCs also collected for seven. Tumour samples for each patient were profiled by ST and/or whole-genome EM-seq, depending on availability (Visium, Visium HD and/or EM-seq). A further 23 plasma cfDNA samples were collected from healthy individuals. All PBMC and plasma cfDNA samples were profiled by whole-genome EM-seq. Matched melanoma samples are graphically depicted in Fig. 4d and a full inventory is provided in Supplementary Table 17.

Cohort 3

Plasma samples were collected from 78 patients with melanoma treated at Yale Cancer Center, including seven from cohort 2, who received ICI monotherapy (30 received anti-PD-1 and five received anti-CTLA-4) or combination therapy (43 received anti-PD-1 and anti-CTLA-4). Samples were collected before treatment initiation (before or on the first day of ICI cycle 1) and underwent whole-genome EM-seq. ICI response was classified as either durable clinical benefit or no durable benefit by a board-certified medical oncologist, reflecting each patient’s disease response six months after ICI initiation. Progression-free survival was determined from the start of ICI treatment.

Cohort 4

Plasma samples were collected from ten patients with melanoma treated at Siteman Cancer Center, including eight from cohort 2, who received immune checkpoint inhibitor (ICI) monotherapy (seven received anti-PD-1) or combination therapy (three received anti-PD-1 and anti-CTLA-4). Samples were collected before or during treatment and underwent whole-genome EM-seq. ICI response was classified as described for cohort 3.

All clinical features, including age and sex, were documented using electronic medical records from Siteman Cancer Center (cohorts 1, 2 and 4) and Yale Cancer Center (cohorts 2 and 3). De-identified clinical characteristics are provided in Supplementary Tables 12, 17, 19 and 20 for the four cohorts. Details of sample processing and sequencing are provided in the Supplementary Methods.

Data collection and processing

scRNA-seq

Single-cell RNA-seq atlases of carcinomas and melanomas, either generated in this work or obtained from published studies as preprocessed data^{4,27,36,59,60,61,62,63,64,65,66} (Supplementary Fig. 1a and Supplementary Table 2), were annotated for B cells, plasma cells, CD4⁺ T cells, CD8⁺ T cells, NK cells, macrophages (including monocytes), dendritic cells, fibroblasts, endothelial cells, epithelial or melanoma cells and other (unclassified) cell types. For publicly available datasets with author-supplied annotations (breast cancer, colon cancer, liver cancer, squamous cell carcinoma, melanoma and brain metastasis), annotations were mapped to the above cell-type labels. Cohort 1 scRNA-seq data, as well as publicly available data without cell-type annotations (bladder cancer, lung cancer, ovarian cancer, prostate cancer and pancreatic cancer), were analysed using Seurat (v.4.3.0)⁶⁷ as described below. For quality control, cells with fewer than 200 detected genes or more than 25% of reads mapped to mitochondrial genes were excluded. Raw counts were imported and cells clustered following SCTransform with the glmGamPoi method, FindVariableFeatures, ScaleData, RunPCA, FindNeighbors and FindClusters. Cell-type annotations were then manually assigned to clusters based on the expression of canonical lineage markers (Supplementary Fig. 1c): MS4A1, CD19, CD79A and CD79B for B cells, IGKC and MZB1 for plasma cells, CD3D, CD4 and IL7R for CD4⁺ T cells, CD3D, CD8A and CD8B for CD8⁺ T cells, GNLY and NCAM1 for NK cells, CD68 and CD14 for macrophages/monocytes, CD1C for dendritic cells, COL1A1, COL3A1, PDGFRA and FAP for fibroblasts, PECAM1 and VWF for endothelial cells, EPCAM for epithelial cells and SOX9, MET, MITF and MLANA for melanoma cells. Small clusters with multilineage marker expression were considered potential doublets or multiplets and eliminated from further analysis (3% of filtered cells from cohort 1, on average).

Visium and legacy ST

Processed data from 54 Visium (standard) and 54 legacy ST profiles of carcinoma and melanoma samples were downloaded from 10x Genomics (https://www.10xgenomics.com/resources/datasets) and 12 previous studies^{27,60,66,68,69,70,71,72,73,74,75,76} (Supplementary Fig. 1a and Supplementary Table 1). Legacy ST refers to the predecessor of 10x Visium, a lower-resolution ST assay with 100 μm spot diameter reported in ref. ⁷⁷. For quality control of publicly available data, genes expressed in fewer than five spots and spots expressing fewer than 200 unique genes were omitted. Processing details of Visium data generated in this study are provided in ‘10x Visium (standard)’ in Supplementary Methods.

MERSCOPE

Preprocessed MERSCOPE profiles of 15 FFPE human tumour specimens, spanning melanoma and six distinct carcinomas, were downloaded from Vizgen (MERSCOPE FFPE Human Immuno-oncology program; https://info.vizgen.com/merscope-ffpe-solution). Three ovarian cancer samples were excluded owing to substantial tissue fragmentation (Supplementary Table 8). For quality control, genes expressed in fewer than five cells and cells with fewer than 300 total transcripts were excluded from each remaining sample.

For cell-type annotation, transcripts were downsampled to 300 per cell, and cells were clustered with Seurat (v.4.3.0)⁶⁷ using the following steps: NormalizeData, FindVariableFeatures (nfeatures = 300), ScaleData, RunPCA, FindNeighbors and FindClusters (resolution = 1). Cell-type annotations were then assigned by cluster based on the expression of canonical markers (Supplementary Fig. 1b) as described for scRNA-seq above, with reclustering of individual clusters, particularly those containing mixed lymphocyte groups, performed for greater granularity as needed.

To remove ambient and improperly segmented mRNAs, publicly available tumour scRNA-seq atlases (‘scRNA-seq’ above) were used to identify genes commonly expressed in each cell type. Specifically, for each cell type, genes expressed in at least 5% of cells in three or more cancer types were identified, resulting in a whitelist for each cell type. Genes in each cell type that were absent from the corresponding whitelist were then set to zero expression in the MERSCOPE data. Subsequently, cells with fewer than five detectably expressed genes were excluded, resulting in a final dataset of 5.6 million evaluable cells from 12 samples (Supplementary Fig. 1a and Supplementary Tables 1 and 8). For MERSCOPE data generated in this study, see ‘Vizgen MERSCOPE’ in Supplementary Methods.

Xenium

Space Ranger results from 11 samples (nine carcinomas and two melanomas) profiled with Xenium V1 (n = 5) and Xenium Prime (n = 6) were downloaded from https://www.10xgenomics.com/resources/datasets (Supplementary Tables 1 and 8). Cell-type annotation by canonical marker expression within clusters (Supplementary Fig. 1b) and subsequent postprocessing were done following the workflow described in ‘MERSCOPE’ above, with two modifications, owing to the lower average number of detected genes in Xenium compared with MERSCOPE: omission of the downsampling step to 300 transcripts per cell; and variable feature selection using FindVariableFeatures with nfeatures = 200. One sample lacking annotatable CD4⁺ T cells was excluded from downstream analysis (Supplementary Table 8). For quality control, cells with fewer than 20 detected genes or with greater than 10% mitochondrial transcript content were omitted.

Visium HD

Space Ranger results from five carcinoma samples profiled with Visium HD (bins of 8 μm × 8 μm) were downloaded from https://www.10xgenomics.com/resources/datasets (Supplementary Tables 1 and 8). Cell-type annotation and postprocessing were performed as described for Xenium data (Supplementary Fig. 1b). For these samples, Visium HD bins generally yielded robust cell-type discrimination, in line with a previous report⁷⁸. One sample lacking annotatable CD8⁺ T cells was excluded from downstream analysis (Supplementary Table 8). For quality control of publicly available data, bins with fewer than 20 detected genes or with greater than 10% mitochondrial transcript content were omitted. Processing details for Visium HD data generated in this study, which were used for SE deconvolution as described below in section ‘Paired tumour and plasma from patients with melanoma’, are provided in ‘10x Visium HD’ in the Supplementary Methods.

Analysis of tumour versus adjacent stroma

Integration of single-cell and spatial transcriptomes

CytoSPACE (v.1.0.3)²⁵ was used to align scRNA-seq data to ST data from the same cancer type, reconstructing transcriptome-wide spatially resolved expression profiles of single cells (Fig. 1c). Alignment was done separately for each ST sample, with source data enumerated in Supplementary Tables 1 and 2. To eliminate potential bias arising from different total unique molecular identifiers (UMIs), raw counts from droplet-based scRNA-seq data were downsampled to 1,500 total UMIs per cell, whereas transcripts per million (TPM) data from Smart-seq2 samples (melanoma) were used without downsampling. The lap_CSPR solver and recommended settings were applied for all analyses, including the default mode for bulk ST (Visium and legacy ST), single-cell mode for single-cell ST data (MERSCOPE) and an average of five cells per spot for Visium data and 20 cells per spot for legacy ST data.

Differential expression analysis

For the results presented in Fig. 1d and Extended Data Fig. 1a–d,e,g, we analysed scRNA-seq data mapped to ST samples, as described above. We also analysed MERSCOPE data directly (without scRNA-seq integration) as a form of reciprocal validation (Extended Data Fig. 1a–c and Supplementary Table 4). As input to the analyses described below, UMI-based scRNA-seq data were normalized for each cell type separately using SCTransform from Seurat (v.4.3.0)⁶⁷; Smart-seq2 data from melanoma samples were normalized to log₂[TPM]; and MERSCOPE data were normalized using NormalizeData from Seurat (v.4.3.0)⁶⁷.

To study transcriptome-wide variation in TME cell types localized to the tumour or adjacent stroma, and given broad cancer coverage and sample availability, CytoSPACE-enhanced Visium data were selected as the primary discovery cohort. Differential expression between tumour and adjacent stroma (annotated as described in Supplementary Methods) was first determined for each cell type and sample separately using the wilcoxauc function from presto (v.1.1.0; https://github.com/immunogenomics/presto)⁷⁹. Log₂-transformed fold changes (LFCs) for each gene were then aggregated by median to avoid bias, and corresponding meta P-values were calculated using Stouffer’s approach⁸⁰ following conversion of two-sided P-values to z-scores. The calculations were done first across sample replicates, then across all samples within each cancer type, and finally across all cancer types in the discovery cohort. Meta P-values were adjusted using the Benjamini–Hochberg method to derive Q-values⁸¹. Significantly differentially expressed genes between tumour and adjacent stroma were identified as genes with: significant differential expression (per-cancer LFC > 0.05 and Q < 0.05) in at least three cancer types; a pan-cancer Q < 0.05; and a pan-cancer median LFC > 0.02. Genes with conserved pan-cell-type enrichment were omitted from this analysis and examined elsewhere (Extended Data Fig. 1e; see below). Among the remaining genes, up to 400 HUGO protein-coding genes (https://www.genenames.org) with the highest LFC in tumour (n $\le $ 200) or adjacent stroma (n $\le $ 200) were visualized across all held-out samples mapped to scRNA-seq data (Fig. 1d and Extended Data Fig. 1d). For cell types with fewer than 200 genes in either region, the minimum number per compartment was selected for balance (Supplementary Table 3). For visualization, data were scaled per column to a maximum absolute LFC of one and genes were ordered by the resulting median enrichment balanced across platforms (Fig. 1d, Extended Data Fig. 1d and Supplementary Table 3).

To cross-validate CytoSPACE-enhanced Visium against single-cell ST data directly (without scRNA-seq integration), we repeated the above analysis using the 500 genes covered by the MERSCOPE panel (section ‘MERSCOPE’ above). We then identified up to 50 HUGO protein-coding genes by median LFC (tumour, n $\le $ 25; adjacent stroma, n $\le $ 25) for each TME cell type and repeated this step independently for each platform. Cross-platform concordance was quantified by Spearman correlation of median LFCs and by the directionality of expression (higher in tumour or adjacent stroma) (Extended Data Fig. 1a–c). All results are detailed in Supplementary Table 4.

To identify spatially polarized genes with conservation across TME cell types, genes with differential expression (pan-cancer LFC > 0.05 and Q < 0.05) in more than 50% of cell types (n = 5) in the Visium discovery cohort were ranked by average LFC across all cell types. For balanced representation, the minimum number of top-ranking genes per compartment was selected for visualization (tumour, n = 138; adjacent stroma, n = 138) (Extended Data Fig. 1e and Supplementary Table 6).

Spatial EcoTyper framework

Despite experimental advances enabling high-resolution expression profiling of cells in situ, leveraging such data to systematically profile the co-association of cell states into SEs and discover conserved SEs across specimens and cancer types has remained challenging. The spatial organization of cell states and their relative abundances in an ecotype can vary across regions and between samples, and even expression profiles of individual cells sharing the same phenotypic state exhibit natural variability. Furthermore, technical drop-out and sample- or platform-specific batch effects all pose obstacles to SE discovery.

With these considerations in mind, we developed Spatial EcoTyper (Fig. 2a and Extended Data Fig. 2). At its core, the framework relies upon a network integration technique to identify common patterns of ST variation shared across samples. This is achieved by adapting similarity network fusion (SNF), a previously described approach for multi-omics data integration across patients³². By introducing a series of carefully constructed spatial GEPs, our approach mitigates technical drop-out while providing stability under biological variation. Once defined, SEs can be robustly recovered in a supervised manner from non-spatial data using unique cell states and molecular signatures that are learnt from spatial data.

Spatial EcoTyper consists of five key components, described in detail in the following sections.

Determination of sample-level spatial clusters. In each single-cell ST sample, spatial expression data are encoded into cell-type-specific GEPs of spatial neighbourhoods (SNs), spatial covariation among the SNs is computed by SNF and SNs are clustered over the resulting network (Fig. 2a; steps 1–4, Extended Data Fig. 2a).
Identification of conserved SEs. Spatial clusters discovered from individual single-cell ST samples are represented by GEPs of their associated cell states, and clusters with similar GEPs are aggregated across samples into conserved SEs (Extended Data Fig. 2; steps 1–9, Extended Data Fig. 2b).
Discovery of conserved SE-specific cell states. Cell states uniquely enriched in each SE and conserved across samples are identified using a specialized variant of NMF (Extended Data Fig. 6a).
Recovery of conserved SE-specific cell states. An NMF model is developed to recover SE-specific cell states in external single-cell or spatial expression datasets (Extended Data Fig. 6b).
Deconvolution of SEs from bulk RNA-seq. The approach from the previous component is generalized to the task of recovering SE abundances from bulk RNA-seq, using a training cohort of pseudo-bulk mixtures (Fig. 3 and Extended Data Fig. 8).

Determination of sample-level spatial clusters

The Spatial EcoTyper framework, schematically illustrated in Extended Data Fig. 2, begins by identifying clusters of SNs in each single-cell ST sample (steps 1–3, Extended Data Fig. 2a). Although ST data should be generated by the same assay, the discovery phase is applicable to diverse single-cell ST platforms. In this work, we used tumour samples profiled by MERSCOPE for discovery (Supplementary Table 8), with normalization performed as described in the ‘Spatial EcoTyper discovery cohort’ section below. To assess reproducibility, we also applied discovery mode to Xenium Prime data as described in ‘Robustness of spatial ecotype discovery to single-cell ST platform’ in Supplementary Methods (Extended Data Fig. 6k–n).

Assembly of cell-type-specific SN expression profiles

Spatial proximity is explicitly used by Spatial EcoTyper in two ways: when analysing individual cells, and when analysing distinct cell types. To accomplish the former, cell-type-specific GEPs are first aggregated by SNs centred along a regular grid (step 1, Extended Data Fig. 2a). This involves constructing a vector for each cell type c, denoted snGEP^c, by averaging the normalized GEPs of the nearest up to k cells of cell type c located within radius r of the centre of each SN (selected to be 50 μm in practice; section ‘SN radius’ below). The snGEP^c vectors for all SNs are then concatenated into matrix E^c with g genes (rows) and m snGEP^c vectors (columns). Crucially, the latter is consistently ordered left-to-right by SN coordinates, enabling co-registration across cell types. In this way, for any given SN with coordinates i, j, snGEP vectors for cell types x and y will occupy the same column index in E^x and E^y, respectively (step 1, Extended Data Fig. 2a). To identify SEs containing multiple cell types, SNs characterized by a single cell type are eliminated by default. Furthermore, from each E^c matrix, genes expressed in fewer than five SNs, SNs with no cells of type c and SNs expressing fewer than five genes are excluded with associated entries set to NA.

The snGEP^c vector serves as a fundamental data unit for Spatial EcoTyper, analogous to a ‘spatial meta-cell’ (Supplementary Fig. 5). It mitigates technical drop-out in single-cell gene expression profiling by aggregating over multiple cells while simultaneously reducing the influence of cell-type abundance. Hence, the snGEP^c representation is suitable for ecotype detection based on cell-state variation, rather than shifts in local cell-type composition alone.

SN similarity network construction

To incorporate the spatial proximity of distinct cell types, a pairwise similarity matrix A^c of dimension m × m is constructed for each matrix E^c (step 2, left; Extended Data Fig. 2a). In detail, Spatial EcoTyper first performs dimension reduction on matrix E^c to identify the top 20 principal components using the RunPCA function from Seurat (v.4.3.0)⁶⁷. Pairwise similarities between all E^c columns are then calculated as inverted Euclidean distance, yielding matrix A^c for each cell type c. Given the typically large number of SNs, we retained only the top α highest similarities in each row and column of A^c, setting all other values to zero to create a sparse matrix. Although α = 50 was used in this work, we note that our results were robust to a range of empirically tested values of α (data not shown). This step maintains key edges in the similarity network and enhances scalability. For any instance in which the given cell type c was not represented in both SNs, the corresponding entry in A^c was assigned as NA.

SN similarity network fusion

Because all SNs are co-registered across cell types, Spatial EcoTyper fuses all A^c matrices into a single similarity matrix A of dimension m × m using SNF (step 2, right; Extended Data Fig. 2a). Matrix A combines shared patterns of transcriptional covariance across colocalized cell types. To achieve network fusion in practice, we implemented an enhanced version of the SNF function from the SNFtool R package (v.2.3.1), adding support for sparse matrices and missing values while otherwise preserving the original functionality, then we applied this updated function to our set of A^c matrices. We then performed a rank normalization over the columns of the resulting matrix to transform similarity values per column into a standard space. Ranked values per column were subsequently converted to zero minimum and unit maximum.

Spatial clustering and cluster profiling

Given the fused similarity matrix A, Spatial EcoTyper groups SNs into clusters (step 3; Extended Data Fig. 2a), which will become candidates for SEs when considered across multiple samples as described in section ‘Identification of conserved SEs’ below. Clustering each input sample prior to multisample SE discovery serves two related purposes. First, it reduces the dimensionality of the data by grouping SNs with similar spatial covariance patterns. Second, it simplifies cross-sample integration by de-noising the data and minimizing drop-out. To cluster matrix A, we leveraged standard processing procedures optimized within Seurat (v.4.3.0)⁶⁷, sequentially applying RunPCA, FindNeighbors and FindClusters (Louvain) functions. For single-sample analyses, the resulting spatial clusters represent sample-level SEs. For integrative analysis across samples, a higher clustering resolution is recommended to enhance robustness to parameter variation (section ‘Louvain clustering resolution’ below). In this work, we selected a resolution of 30, reducing the dimensionality from tens of thousands of individual SNs to hundreds of SN clusters per sample. Extended Data Fig. 2a shows the robustness of SE discovery to different clustering resolutions (1–50).

In practice, SNs in pre-annotated domains can be balanced to ensure equal representation before clustering. To obtain an equal number of SNs from tumour and adjacent stromal regions in this work, we uniformly downsampled the one with more SNs (for example, tumour) before integrative analysis.

Identification of conserved SEs

Beyond sample-level SE analysis, a key strength of Spatial EcoTyper lies in its ability to identify SEs conserved across a variety of conditions, such as samples, patients and cancer types. To identify such SEs, Spatial EcoTyper uses a variant of the sample-level process described above (section ‘Determination of sample-level spatial clusters’), using SN clusters rather than SNs as the fundamental units (steps 4–9; Extended Data Fig. 2).

Assembly of cell-type-specific spatial cluster gene expression profiles

Following clustering of SNs in a given input sample (section ‘Spatial clustering and cluster profiling’ above), each cell is assigned to the cluster of its spatially nearest SN. To minimize batch effects across samples, row-based standardization is then applied to each single-cell GEP, normalizing gene expression to zero mean and unit variance per gene. Spatial EcoTyper then computes the average cell-type-specific GEP for each cell type c and SN cluster, referred to as ccGEP^c.

Once defined, ccGEP^c vectors are aggregated per cell type c for a given input sample into matrix E′^c, with g genes (rows) and s SN clusters (columns), with genes restricted to those with non-zero expression in at least some SN in each sample (step 4; Extended Data Fig. 2a). Importantly, all E′^c matrices are co-registered across cell types, with any given SN cluster occupying the same column index across E′^c matrices. Moreover, to ensure sufficient representation and well-defined computations per sample, we require a minimum of three SN clusters containing the cell type c for inclusion of a sample into each E′^c. In preparation for cross-sample integration, the feature space of each E′^c is reduced to the top 200 variable genes (by default), where highly variable genes per matrix are computed according to their rank product of variances across all input samples. In other words, for each cell type c, the variance of each gene across SN cluster ccGEP^c vectors is computed per sample, with genes then assigned a rank by variance. Ranks are then aggregated across samples by geometric mean, and the top highly variable genes are selected per E′^c from the result.

Cross-sample SN similarity network construction

When E′^c matrices have been created for all input samples (step 5; Extended Data Fig. 2b), they are concatenated column-wise across samples yielding E*^c, stratified by cell type c (step 6; Extended Data Fig. 2b). Similarity networks are then computed by Spearman correlation over the columns of E*^c for each cell type c, yielding a set of c pairwise similarity matrices A*^c describing the similarity of E*^c columns across sample-level SN clusters (step 7; Extended Data Fig. 2b). For any pairwise comparison of SN clusters in which cell type c is not represented in both clusters, the corresponding entry in A*^c is assigned NA. To minimize any remaining batch effects, Spatial EcoTyper standardizes similarity matrices A*^c by performing rank normalization independently on each submatrix $\bfA_ij^* c$, which represents the similarities of SN clusters between samples i and j. The normalization is performed by converting the non-NA entries of each column in $\bfA_ij^* c$ to ranks and rescaling the ranks to the unit interval.

Cross-sample SN similarity network fusion

In step 8 (Extended Data Fig. 2b), Spatial EcoTyper fuses all A*^c matrices across cell types into a single similarity matrix A* using the enhanced SNF function as described above in ‘SN similarity network fusion’. The resulting multisample matrix encodes the conservation of spatial community structures across cell types and SNs.

Clustering of sample-level SN clusters into SEs

In the final step, to group sample-level SN clusters into cross-sample SEs, NMF clustering is applied to A*, with the number of clusters (rank) set according to the following procedure (step 9; Extended Data Fig. 2b). In this work, NMF clustering of A* was tested for ranks ranging from 2 to 50, with 50 runs per rank using the Brunet method⁸², with optimal threshold selected as the highest rank for which the cophenetic coefficient exceeded 0.95 and subsequently showed the greatest drop (Extended Data Fig. 4b). NMF results derived from the selected rank, here identified as 11, were used to group sample-level SN clusters and corresponding SNs and single cells into SEs. This number was further reduced to nine final SEs by excluding candidate ecotypes that were devoid of SE-specific cell states (section ‘Discovery of conserved SE-specific cell states’ below) and that did not exhibit maximal within-cluster similarity. Similarity between two clusters was calculated as the average value of the block of A* corresponding to rows of the first cluster and columns of the second. Notably, NMF has well-established performance characteristics for robustly clustering high-dimensional genomic data encompassing hundreds to thousands of data points^6,70. However, for step 3 of the multi-sample workflow (Extended Data Fig. 2a), we used the more efficient Louvain clustering (Seurat), owing to the large SNF matrices arising from single-sample analysis. For robustness testing, see Extended Data Fig. 4a.

Assembly of cell-type-specific SE gene expression profiles

Having assigned single cells to SEs, Spatial EcoTyper then determines SE cell-state gene expression profiles (csGEPs). For each cell type c, NMF is performed on single-cell GEP G^c and corresponding SE label matrix H^c to derive csGEPs. Here, G^c represents a gene-by-cell matrix for each cell type c. To construct G^c, GEPs are normalized with NormalizeData from Seurat (v.4.3.0)⁶⁷, standardized to zero mean and unit variance per gene in each sample, and then posneg transformed as described previously⁶. To ensure balanced representation, an equal number of cells (at least 300, and up to 5,000 cells) are randomly selected from each sample-SE pair. Owing to the computational constraints of NMF, a maximum of 25,000 cells is selected through random down-sampling. GEPs of the selected cells are concatenated column-wise into csGEP matrix G^c. The SE label matrix H^c is a binary cell by SE matrix indicating SE membership for each cell in G^c. NMF is then applied to solve the equation:

$$G^c=W^c\times H^c,$$

where W^c represents the basis matrix containing csGEPs. To refine csGEPs for cell-state recovery, the top 50 genes are chosen per SE based on the largest positive delta compared with the second-highest expression across SEs. Each basis matrix is then reduced to the selected genes. These refined basis matrices enable the recovery of SEs and their cell states from independent data using NMF (section ‘Recovery of SE-specific cell states’ below).

Discovery of conserved SE-specific cell states

Although SEs are derived from spatial covariation in cell states across cell types, shared across samples, not every cell state associated with an SE need be specific to that SE. To identify and validate cell states specifically enriched in each SE and conserved across discovery samples, we performed leave-one-sample-out cross-validation (LOOCV), repeating the csGEP construction as described above (‘Assembly of cell-type-specific SE gene expression profiles’ above) for each training fold. We then used the resulting NMF basis matrices to predict cell-state labels on the held-out sample. For label assignment, NMF prediction output matrices H^c were standardized to unit sum per column, yielding a probability matrix encoding the probability of each single cell being localized in each SE. Cells were then assigned to the SE-associated cell state with the highest probability. For each LOOCV iteration, the enrichment of each cell state in each SE was assessed by its ability to correctly assign cells to that SE using an F1 score.

Because the csGEP construction involves subsampling of cells, this LOOCV process was repeated 20 times to ensure robustness, with F1 scores averaged across all repetitions. Cell states were considered specific to an SE only if the associated F1 score exceeded the second highest F1 score for the cell state across other SEs by at least 0.1. Otherwise, the cell state was deemed either broadly distributed across multiple SEs or not conserved across samples. Using this approach, 38 SE-specific cell states were identified, specific to nine SEs (Fig. 2f, Extended Data Fig. 6a and Supplementary Table 10). Two SEs identified initially were excluded from further analysis owing to a lack of specific cell states, and the remaining SEs were renumbered accordingly from 1 to 9 based on their average distance across discovery samples to the tumour margin (Fig. 2d and Extended Data Fig. 4c).

Recovery of SE-specific cell states

After identifying conserved SE-specific cell states through the above LOOCV process, we used all discovery-cohort samples to prepare an ensemble basis matrix W*^c for each cell type c. Specifically, for each cell type, we repeated the process described above in ‘Assembly of cell-type-specific SE gene expression profiles’ 50 times and then averaged the resulting basis matrices to produce W*^c. For feature selection, genes showing the highest and most specific expression in each cell state were identified from each basis matrix as described, and then genes that were selected for the same cell state in more than half of the repetitions were retained in the ensemble matrix.

The resulting ensemble matrices are a core component of the Spatial EcoTyper framework and can be used to recover SE-specific cell states from external single-cell-scale transcriptomics datasets using NMF, as described previously⁶. To recover SE-specific cell states from a query dataset, NMF is applied with single-cell-scale gene expression matrix G^c and the ensemble matrix W*^c as input, to yield a probability matrix H^c denoting the probability of each cell belonging to each SE. Cells are then assigned to SE-specific cell states when the prediction probability exceeds 0.6, otherwise they are designated to a null class, referred to as non-SE. In practice, single-cell-scale GEPs in G^c should be normalized (to counts per million (CPM), TPM or by SCTransform⁶⁷, as appropriate) and then scaled to zero mean and unit variance per gene.

Deconvolution of SEs from bulk RNA-seq

To enable profiling of SEs in bulk expression data, the Spatial EcoTyper framework includes an NMF model trained over simulated bulk RNA-seq prepared by aggregation of scRNA-seq data into pseudo-bulk mixtures for which ground-truth SE proportions are known.

Construction of pseudo-bulk mixtures

Previously described publicly available scRNA-seq data (section ‘scRNA-seq’ above) from ten cancer types were used to create pseudo-bulk mixtures. First, cells were annotated as described in section ‘Recovery of cell states and SEs in ST and scRNA-seq validation datasets’ below, labelled according to the parent SE of their assigned cell state or, if unassigned, designated non-SE, for a total of ten label classes. The fractional composition of each pseudo-bulk was generated by random sampling of a value per label class from the Gaussian distribution N(μ = 2, σ = 1), with negative values set to zero and the resulting values normalized to unit sum.

Pseudo-bulks were assembled separately by cancer type, with GEPs constructed by aggregating 1,000 cells randomly selected to satisfy the predefined fractions of cell states. For UMI- and plate-seq-based data, raw counts and TPM values were respectively summed across selected cells. We generated 100 pseudo-bulks per cancer type, and the resulting GEPs were normalized using the NormalizeData function from Seurat (v.4.3.0)⁶⁷. To mitigate cancer type and batch differences, GEPs were further normalized to zero mean and unit variance per gene in each cancer type.

NMF model training for bulk deconvolution

The resulting profiles were concatenated into gene by sample matrix E^P, with genes limited to those detected across all cancer types, and pseudo-bulk SE fractional abundances were encoded in sample by label matrix H^P. From these, a basis matrix was derived by application of NMF followed by feature selection as described above (section ‘Assembly of cell-type-specific SE gene expression profiles’). The resulting basis matrix W^B constitutes another core component of the Spatial EcoTyper framework and can be used to deconvolve SE fractional abundances from bulk gene expression data. For a given bulk expression dataset, predictions are performed by NMF as described in section ‘Discovery of conserved SE-specific cell states’, excluding the final classification step to yield H^P, in which the values represent SE abundances across input bulk samples. In practice, to perform deconvolution, input data should be normalized to TPM or CPM as appropriate, log₂-adjusted and then normalized across samples to zero mean and unit variance per gene.

Spatial EcoTyper discovery cohort

MERSCOPE samples were selected for SE discovery owing to their high spatial resolution and the availability of uniformly processed samples across multiple cancer types (Supplementary Table 8). Before analysis, MERSCOPE samples were preprocessed as described in section ‘MERSCOPE’ above, then standardized using NormalizeData from Seurat (v.4.3.0)⁶⁷. For SE discovery and characterization, we focused on nine main TME cell types with strong representation across cancer types and tumour samples: B cells, plasma cells, CD4⁺ T cells, CD8⁺ T cells, NK cells, macrophages, dendritic cells, fibroblasts and endothelial cells. Malignant cells were not included owing to significant differences across tumour types. Other TME cell types, such as smooth muscle cells, pericytes and neutrophils, were not confidently detected in the MERSCOPE dataset, probably because of limitations in transcript capture inherent to MERSCOPE and the 500-gene panel that we analysed.

To capture spatial microenvironments from both tumour and adjacent stroma (annotated as described in Supplementary Methods), we selected samples in which each region included more than 5% of the total TME (immune and/or stromal) cells, yielding two melanomas, two colon cancer and two liver cancer samples, and one breast cancer, one prostate cancer and one ovarian cancer sample (Extended Data Fig. 3f and Supplementary Table 8). Five of these samples, each from a different cancer type, were used for SE discovery (Extended Data Fig. 3f and Supplementary Table 8).

Selection of Spatial EcoTyper parameters

SN radius

When applying Spatial EcoTyper to individual samples, we consistently observed a spatial gradient resembling the physical distance of SNs to the tumour margin (Fig. 2b). To assess robustness, Spatial EcoTyper analyses were performed with ten different SN radii, ranging from 10 μm to 100 μm, on a MERSCOPE melanoma sample (melanoma 1). For each SN radius, the relationship between gene expression similarity and physical distance of SNs to the tumour margin was evaluated. To do this, we used the procedure described in ‘Cells, meta-cells, and Spatial EcoTyper embeddings vs. distance to the margin’ in Supplementary Methods, with the exception that PCA was applied to the spatial embedding produced by Spatial EcoTyper (step 2; Extended Data Fig. 2) and SNs (rather than single cells) were used as the unit of analysis. To strike a balance between SN granularity and correlation with distance to the margin, a radius of 50 µm was selected for SE discovery (Extended Data Fig. 3g).

Louvain clustering resolution

A key parameter in the Spatial EcoTyper discovery pipeline (step 3; Extended Data Fig. 2a) is the resolution for Louvain clustering, which groups SNs into clusters in each sample. To ensure robustness, 11 different resolutions ranging from 1 to 50 were tested, and all resulting spatial clusters were grouped into ten clusters following the multisample discovery pipeline. The similarity between clusters derived at different resolutions was evaluated using the average adjusted Rand index (ARI), comparing each resolution with the others (Extended Data Fig. 4a). The discovered clusters showed high overall consistency, with results being more stable when a resolution higher than 15 was used. The resolution of 30, which had the highest average ARI, was selected for SE discovery in our analysis (Extended Data Fig. 4a).

Recovery of cell states and SEs in ST and scRNA-seq validation datasets

Single-cell-scale ST recovery

To validate SEs and their associated cell states, we analysed nine samples profiled by MERSCOPE (five discovery samples and four held-out samples) and 12 held-out samples profiled by different ST platforms: Xenium V1 (n = 5), Xenium Prime (n = 4) and Visium HD (n = 3) (Supplementary Table 8). The latter, drawn from publicly available data described in section ‘Data collection and processing’ above, were selected for consistency with the TME content threshold required for the MERSCOPE discovery cohort (section ‘Spatial EcoTyper discovery cohort’ above). In all held-out samples, SE-specific cell states were recovered using the approach described above in section ‘Recovery of SE-specific cell states’, assigning single cells (or 8-µm² bins from Visium HD) to either non-SE or SE-specific cell states, which allowed for further grouping into the respective SEs. The same procedure was also applied to the MERSCOPE discovery cohort but with LOOCV to avoid overfitting (section ‘Discovery of conserved SE-specific cell states’ above).

Bulk ST recovery

For the analysis presented in Extended Data Fig. 6j, 26 Visium and 48 legacy ST samples were selected, each containing at least five spots located more than 500 µm away from the tumour margin (Supplementary Methods). Spatial spots were normalized to zero mean and unit variance per gene across all spots in each sample. We applied the SE cell-state recovery models (section ‘Recovery of SE-specific cell states’ above) to obtain an H*^c matrix for each cell type c, and then averaged the matrices across cell types to estimate relative SE levels across spots. Each spot was then assigned to the dominant SE.

Single-cell RNA-seq recovery

For the analyses presented in Extended Data Fig. 7b, we queried SE content in scRNA-seq profiles from 144 tumour samples spanning ten cancer types (Supplementary Table 2). For each cell type from each carcinoma type, the scRNA-seq data were normalized using SCTransform from Seurat (v.4.3.0)⁶⁷, and log₂ TPM data were used for melanoma Smart-seq2 profiles. Cell-state recovery was then performed as described above (‘Recovery of SE-specific cell states’), and the abundance of cell state i of parent cell type c (for example, CD8⁺ T cells) in sample s was then determined as the fraction of cells assigned to state i out of the total cells of cell type c in sample s. We repeated the above process for scRNA-seq profiles of 64 brain metastases from melanoma and five types of carcinoma³⁶, using NormalizeData with Seurat (v.4.3.0)⁶⁷ before SE recovery (Extended Data Fig. 7c and Supplementary Table 2).

Metrics and analyses for SE recovery in ST and scRNA-seq data

Cell-state colocalization in single-cell-scale ST data

The spatial colocalization patterns of SE-specific cell states were assessed using single-cell-scale ST data from four platforms, with cell states recovered from each sample as described above (‘Recovery of cell states and SEs in ST and scRNA-seq validation datasets’) (Extended Data Fig. 6c and Supplementary Table 8). For each single cell (or 8-µm bin for Visium HD), the fractional abundances of neighbouring cell states within a radius of 50 µm were determined, resulting in an N × S matrix, F, where F_ij denotes the fraction of cells within a 50-µm radius of cell N_i that are of state S_j. Single-cell-level fractions were subsequently averaged in each cell state, producing an S × S colocalization matrix, L, where L_ij represents the average fractional abundance of cell state S_i near cell state S_j. To control for biases, cell-state assignments were shuffled 10,000 times and colocalization matrices were recomputed, yielding 10,000 random colocalization matrices, L^rand. The colocalization matrix L was then normalized by subtracting the average of the L^rand matrices and dividing by their standard deviation for each element:

$$L_ij^\prime =\frac{L_ij-\,\mu _L_ij^\rmr\rma\rmn\rmd}{\rm\sigma _L_ij^\rmr\rma\rmn\rmd}.$$

This resulted in a matrix L′, where L_ij′ represents the colocalization index between cell state S_i and cell state S_j. Finally, the colocalization indexes from multiple samples in each dataset were integrated using Stouffer’s approach⁸⁰, with L′ capped at an absolute value of 5 per sample to prevent any single sample from disproportionately influencing the meta-analysis (Extended Data Fig. 6c–g).

Cell-state co-association in scRNA-seq data

Cell-state abundances were determined in scRNA-seq tumour atlases as described above in ‘Recovery of cell states and SEs in ST and scRNA-seq validation datasets’. Abundances were computed under four schemes: (i) including all SE and non-SE states; (ii) excluding non-SE states; third, the same as (i) except treating zero abundance as missing values (NA); and (iv) the same as (ii) except treating zero abundance as NA. Pairwise Pearson correlations between cell states were then calculated across all scRNA-seq samples for each abundance matrix, using the cor function in R with pairwise complete observations. The final co-association values were obtained by averaging the correlations across the four schemes (Extended Data Fig. 7a–c).

Significance of cell-state colocalization and co-association

Permutation experiments were done to assess the significance of cell-state cooccurrence indices, whether for colocalization in ST data (L′ from ‘Cell-state colocalization in single-cell-scale ST data’ above) or for co-associations in scRNA-seq data (‘Cell-state co-association in scRNA-seq data’ above). Let square matrix C represent all pairwise co-occurrence indices between SE cell states. Let Θ_w represent the average of all co-occurrence indices in C for cell states within SE class w. Θ_w was compared with 10,000 corresponding scores $\theta _w^\rmrand$ obtained by randomly shuffling the order of all columns in C, then determining the average co-occurrence index for all cell-state indices corresponding to SE w. The mean co-occurrence score Θ_w was then normalized by subtracting the average of $\theta _w^\rmrand$ and dividing by its standard deviation, yielding a two-sided z-score. For scRNA-seq co-association analyses, the z-score was directly converted into a P-value, and the process was repeated for each SE (Extended Data Fig. 7b,c). For ST colocalization analyses, each ST sample was analysed individually. Stouffer’s method⁸⁰ was then used to aggregate SE-specific z-scores across all ST samples in a dataset, resulting in a meta-z-score for each SE, which was directly converted into a P-value (Extended Data Fig. 6d–f). To incorporate SE2, which is composed of a single cell state, colocalization testing was applied to individual SE2 cells. Otherwise, self-comparisons of cell states were excluded from significance testing.

Spatial autocorrelation

The spatial coherence of SE-specific cell states was evaluated using Moran’s I across 21 single-cell-scale ST datasets, with cell states recovered independently for each sample using Spatial EcoTyper (Extended Data Fig. 6j). We constructed a k-nearest-neighbour graph (k = 3) using spdep (v.1.3.11)⁸³ over all annotated TME cells. We converted the resulting graph into a spatial weights matrix using the nb2listw function (with default parameters) and then calculated Moran’s I for each SE class using the moran function, where cells belonging to the SE were encoded as 1 and all others as 0.

To control for bias, we performed 1,000 permutation experiments in which SE labels were randomly shuffled within each cell type. Moran’s I was recalculated for each permutation to generate null distributions for every SE. Observed Moran’s I values were then normalized into z-scores by subtracting the mean of the null distribution and dividing by its standard deviation.

Distance to tumour margin

Following SE recovery from ST data as described above in section ‘Recovery of cell states and SEs in ST and scRNA-seq validation datasets’, the distance of each SE to the tumour margin (in micrometres) was computed by first averaging the Euclidean distance to the nearest tumour margin of all SNs (single-cell-scale ST) or spots (bulk ST) assigned to each SE in each sample, then averaging the resulting quantities by SE across samples in each cohort (Extended Data Fig. 6i,j). Positive and negative distances were used for cells and spots localized to the tumour region and adjacent stroma, respectively (see, for example, Fig. 1c). Finally, for each cohort, the consistency between expected distances (SE-specific distances to the tumour margin derived from the MERSCOPE discovery cohort) and predicted distances was evaluated using Pearson correlation (Extended Data Fig. 6i,j).

Characterization of SEs and associated cell states

Identification of SE cell-state markers

To identify SE-specific cell-state markers, we analysed scRNA-seq data from 144 tumours and ten cancer types (Supplementary Table 2), with all cells grouped into SE-specific cell states or the non-SE null class. The scRNA-seq data were normalized as described in section ‘Differential expression analysis’ above. Differential expression analysis was performed by comparing each cell state with all of the other cells of the same cell type using the wilcoxauc function from the presto package (v.1.0.0; https://github.com/immunogenomics/presto)⁷⁹. LFCs were extracted from each cancer type and then aggregated across the ten cancer types by median, yielding pan-cancer LFCs.

To identify the markers most specific to each cell state i, we selected genes whose pan-cancer LFC in i was at least 0.1 higher than in any other state. Among these, the top 30 genes by pan-cancer LFC were considered to be marker genes (Fig. 2g and Supplementary Table 10). Markers for all SE cell types annotated in at least five cancer types in the combined scRNA-seq tumour atlas were determined (all except B and NK cells).

Reproducibility of cell-state markers

To assess the reproducibility of cell-state marker genes, we parcelled all ten scRNA-seq atlases into discovery (n = 5 cancer types corresponding to those used for the MERSCOPE discovery cohort) and validation (n = 5 remaining cancer types) cohorts, each with non-overlapping cancer types. Marker genes identified in the discovery cohort were evaluated in the validation cohort (Extended Data Fig. 7d) and compared with markers derived from the validation cohort or all ten cancer types using the Jaccard similarity index (Extended Data Fig. 7e).

Annotation of cell states

SE-specific cell states (n = 38) were annotated based on top markers and 135 previously reported reference cell states (Supplementary Table 10). Specifically, we assessed similarity to each of the 135 reference states by computing enrichment scores using AddModuleScore in Seurat (v.4.3.0)⁶⁷. For each reference state, we then averaged enrichment scores across all cells of the state and tested whether the mean score was significantly higher than that of randomly sampled cells by permutation testing over 1,000 iterations. Of the 38 SE-specific states, 18 showed significant overlap with at least one reference state and were annotated with the most associated reference state by significance (Supplementary Table 10). To augment these assignments, all 38 SE cell states were also annotated based on the corresponding marker gene with the highest LFC compared with other cell states of the same cell type (Fig. 2g, Supplementary Table 10).

Identification of SE consensus markers

To identify SE-specific genes with conservation across cell types, termed consensus markers (Fig. 2h), the top 1,000 markers with positive pan-cancer LFCs—or all positive markers if fewer than 1,000 were available—were selected for each SE cell state from the analysis of scRNA-seq data described above (‘Identification of SE cell-state markers’). Consensus SE markers were then defined as genes with at least 80% conservation across all evaluable cell states in a SE (equivalent to 100% conservation for ecotypes with fewer than five states) for a minimum of 20 markers per SE. For SE2, which comprises a single-cell state, we limited consensus markers to those with statistically significant conservation (Q < 0.05) in at least three cancer types. To eliminate overlap among consensus markers, genes associated with multiple SEs were assigned to the SE with the highest number of significantly conserved cancer types. In cases where markers overlapped between SE2 and other SEs, genes were preferentially assigned to non-SE2 states. Given that SE3 had fewer than 20 genes following these steps (n = 16), we augmented it by including genes at a relaxed cell-type-conservation threshold of 60%, selected in order of decreasing conservation across cancer types, until the 20-marker minimum was satisfied. Consensus markers and normalized expression values are provided in Supplementary Table 11 (see also Supplementary Methods).

Biological pathways associated with SE consensus markers

To identify biological pathways associated with SE consensus markers, we performed overlap analysis using the enricher function from the clusterProfiler (v.4.14.6) R package⁸⁴. Consensus marker sets were individually evaluated against hallmark (H) and biological process (C5:BP) gene sets from MSigDB⁸⁵. Pathways with significant overlap (Q < 0.1) were retained, and for each SE, pathways showing the strongest overlap relative to other SEs were selected (Fig. 2h and Extended Data Fig. 7h).

Association between SEs and carcinoma ecotypes

To study the relationship between SEs and previously defined CEs⁶, SEs and CEs were recovered from the same scRNA-seq data across ten cancer types (Supplementary Table 1) using SE-specific and previously published⁶ recovery methods, respectively. The fraction of cells in each SE i that were also assigned to CE j was computed for each dataset, resulting in an overlap matrix O with rows representing nine SEs and columns representing nine CEs (excluding CE7 because of its low validation rate in a previous study⁶). To control for potential biases from different abundances, permutation experiments were performed by shuffling the cell-state assignments 10,000 times. In each iteration, the matrix O was recomputed, yielding $O_i,j^\rmrand$. The matrix O was then normalized using the mean and variance of $O_i,j^\rmrand$ (as described in section ‘Cell-state colocalization’ above), producing a normalized matrix O′, where $O_i,j^\prime $ represents the overlap index between SE i and CE j. Finally, the overlap indexes across the ten cancer types were aggregated using Stouffer’s method⁸⁰ (Extended Data Fig. 7j).

Validation of SE deconvolution

Cross-validation over training pseudo-bulks

To evaluate the Spatial EcoTyper deconvolution model (section ‘Deconvolution of SEs from bulk RNA-seq’ above), we first applied a LOOCV procedure in which we trained the NMF model on pseudo-bulk GEPs from nine cancer types (n = 900 mixtures) and applied the trained model to pseudo-bulk GEPs from the remaining cancer type (n = 100 mixtures; see section ‘Construction of pseudo-bulk mixtures’ above). Consistency between predicted and ground-truth SE abundances was assessed by Pearson correlation (Fig. 3a, Extended Data Fig. 8a–c and ‘Benchmarking of SE deconvolution‘ in Supplementary Methods).

Paired bulk RNA-seq and scRNA-seq

We further evaluated the Spatial EcoTyper deconvolution model using paired scRNA-seq and bulk RNA-seq data from cohort 1 (Fig. 3a,d, Extended Data Fig. 8d and Supplementary Table 12). SE cell states were first recovered from scRNA-seq data (section ‘Recovery of cell states and SEs in ST and scRNA-seq validation datasets’ above) and SE abundances defined as the number of cells assigned to each SE over total evaluable cells per sample. Because mitochondrial quality-control filtering (section ‘Data collection and processing’ above) disproportionately removed cancer cells from a minority of samples, cancer and TME abundances were rescaled to match their proportions in the mitochondrial-unfiltered data, and SE proportions were adjusted accordingly. Next, SE abundances were inferred from bulk RNA-seq using the Spatial EcoTyper deconvolution model (section ‘NMF model training for bulk deconvolution’ above). Owing to the limited number of samples, which could bias the centring and unit variance normalization of gene expression required for SE deconvolution, we combined cohort 1 and TCGA RNA-seq data from matched cancer types, removed batch effects between the two datasets using the Combat function from the sva R package (v.3.46.0) with default parameters⁸⁶, and then normalized the batch-corrected gene expression to zero mean and unit variance per gene across all samples. This process was conducted for melanoma and colon cancer separately, and for intact and digested bulk RNA-seq datasets separately (‘Bulk RNA sequencing’ in Supplementary Methods). The resulting data were used to infer SE abundances with Spatial EcoTyper (Extended Data Fig. 8e and ‘Benchmarking of SE deconvolution’ in Supplementary Methods).

Paired bulk RNA-seq and Visium ST

We downloaded bulk RNA-seq data as FASTQ files (n = 54 samples) and preprocessed Visium ST profiles (n = 103 samples) from 47 patients spanning five carcinoma types from the HTAN⁴⁴. These data were processed and subjected to quality control as described in ‘Bulk RNA sequencing’ and ‘Paired bulk RNA-seq and ST quality control’, respectively, in Supplementary Methods, resulting in matched pairs from 42 of 47 patients, comprising 46 bulk RNA-seq and 88 Visium samples. Next, bulk RNA-seq data (log₂ TPM) were scaled to unit variance per gene across samples for input to the Spatial EcoTyper deconvolution model (section ‘NMF model training for bulk deconvolution’ above). The log₂ CPM data from each Visium sample were scaled to mean of zero and unit variance per gene across all spots. Deconvolution was then performed for each sample separately to determine SE abundances across Visium spots. To obtain sample-level SE abundances accounting for geographic variation in cell density, we estimated cell counts per spot using CytoSPACE (v.1.0.3)²⁵. We then used these count estimates to compute a weighted average of SE levels across all spots and renormalized the resulting SE levels to unit sum in each sample. We averaged SE levels across samples per modality for each patient. The resulting SE abundances from Visium and matched bulk RNA-seq were then compared (Fig. 3b–d, Extended Data Fig. 8h,i). We also repeated this analysis, replacing bulk RNA-seq data with pseudo-bulk profiles constructed from Visium samples as described in Supplementary Methods, but without batch correction (Extended Data Fig. 8h).

Paired Visium and single-cell-scale ST

To assess whether spot-level deconvolution from bulk Visium data is consistent with single-cell-scale SE recovery (Fig. 3e–f and Extended Data Fig. 9a,b), we prospectively generated paired Visium data (quality control and processing as described in ‘10x Visium (standard)’ in Supplementary Methods) and MERSCOPE data (‘Vizgen MERSCOPE’ in Supplementary Methods) from adjacent melanoma sections (melanoma 3 from patient WU2109; Supplementary Tables 12, 17 and 20). We also downloaded matched Visium and Visium HD (8-μm² bins) data for adjacent colon cancer sections (‘Visium HD, Sample P2 CRC’ and ‘Visium CytAssist v2, Sample P2 CRC’) from 10x Genomics (https://www.10xgenomics.com/platforms/visium/product-family/dataset-human-crc) and preprocessed them as described in section ‘Data collection and processing’ above. To co-register adjacent tissue sections profiled by Visium (standard) ST and paired single-cell-scale ST data (MERSCOPE or Visium HD), we manually selected four reference points at the edges of distinct morphological structures visible in both datasets. These references were used to learn a linear affine transformation function, which was subsequently applied to transform all coordinates from Visium into the coordinate space of the paired single-cell-scale ST dataset.

SE abundances were inferred for all Visium spots as described above (‘Paired bulk RNA-seq and Visium ST’). SEs were recovered from all single-cell-scale ST data as described in section ‘Recovery of cell states and SEs in ST and scRNA-seq validation datasets’ above. Two strategies were used to overcome potential imprecision in the co-registration procedure. First, SE abundances in single-cell-scale ST data corresponding to each co-registered Visium spot were estimated as the fraction of cells or bins assigned to each SE within a SN of 50 µm radius, requiring at least five cells or bins for robustness. Second, SE abundances in both datasets were smoothed by averaging across each co-registered spot and its six nearest neighbours. Non-SE cells, comprising cancer cells, non-SE TME cells and low-confidence cells (for example, because of limited gene detection; section ‘Recovery of SE-specific cell states’ above), were excluded from analysis.

Concordance between platforms was determined by Spearman correlation, adjusting for background dependencies between SE levels in the paired single-cell-scale ST sample (Fig. 3e–f and Extended Data Fig. 9a,b). To do this, pairwise Spearman correlations were computed between the levels of each Visium-derived SE i and each single-cell-scale-ST-derived SE j, conditioning on SE i in the single-cell-scale ST data, for all non-matching SE pairs, using the pcor function in the R package ppcor (v.1.1)⁸⁷. Direct Spearman correlations were calculated for all matching SE pairs. The P-values of the resulting correlation coefficients were transformed into signed –log₁₀ Q-values indicating the polarity of the correlation following Benjamini–Hochberg correction⁸¹.

Large-scale assessment of SE levels in human tumours

Overall survival and pathway analysis

We applied the Spatial EcoTyper deconvolution model (section ‘NMF model training for bulk deconvolution’ above) to infer SE levels from 7,076 bulk tumour RNA-seq profiles across 17 cancer types from TCGA, including melanoma and 16 carcinomas (Supplementary Table 13). SE deconvolution was performed separately for each cancer type using TPM data obtained from the PanCanAtlas (https://gdc.cancer.gov/about-data/publications/pancanatlas), which were log₂-adjusted. To investigate SE prognostic associations, a Cox regression analysis was conducted to examine the association between SE abundance and patient overall survival, adjusting for age and sex, using the survival R package (v.3.6.4). This analysis was done separately for each cancer type (Fig. 3g and Supplementary Table 14). To determine the pan-cancer survival associations of SEs, a meta-analysis was done by combining the resulting z-scores from each SE across all 17 cancer types using Stouffer’s method⁸⁰. For clarity, all z-scores and meta z-scores were converted to directional –log₁₀ P-values (Fig. 3g and Supplementary Table 14). For pathway analysis details (Extended Data Fig. 9d), see ‘Pathways associated with inferred SE abundance in TCGA’ in Supplementary Methods. For the analysis in Extended Data Fig. 9e, MHC levels were computed by first averaging the log₂ expression of MHC-I (HLA-A/B/C) and MHC-II genes (HLA-D*) separately, then averaging the resulting quantities together. Stromal levels were assessed using ESTIMATE (v.1.0.13)⁸⁸.

Associations with immunotherapy response

We obtained publicly available bulk tumour RNA-seq data from patients with melanoma and carcinoma treated with ICIs, including anti-PD-1, anti-PD-L1 and combinations of anti-PD-1 and anti-CTLA-4 therapies, after tumour sample collection. All patients were grouped into responders (partial or complete response) and non-responders (stable or progressive disease) based on collected clinical information. To mitigate within-dataset heterogeneity, patients who received prior immunotherapy or chemotherapy were separated into independent datasets. A minimum of five responders and five non-responders was required for each dataset, resulting in 1,249 total patients from 15 datasets from 12 studies^{51,89,90,91,92,93,94,95,96,97,98,99}, representing four cancer types (melanoma and three carcinoma types) (Supplementary Table 15). All expression data were normalized to TPM before analysis.

Using the Spatial EcoTyper deconvolution model, we predicted SE abundances across tumours in each dataset. We also evaluated the activity of publicly available transcriptional features associated with immunotherapy response¹⁰⁰, including carcinoma ecotypes⁶, T-cell dysfunction⁵⁸, T-cell exclusion, microsatellite instability (MSI)⁵⁸, tumour immune dysfunction and exclusion (TIDE)⁵⁸, immune resistance signatures¹⁰¹, IMPRES¹⁰², TLS signatures^57,103, cytolytic score (GZMA and PRF1)¹⁰⁴, MHC-I signature (HLA-A, HLA-B, HLA-C, B2M and CASP8), PD-L1 (CD274), 18-gene inflammatory signatures⁵⁵, combined tumour and immune signals (MAP4K and TBX3)¹⁰⁵, an M1 macrophage signature⁵⁶ and an IFNG signature⁵⁵. The activity of carcinoma ecotypes from ref. ⁶, T-cell dysfunction, exclusion, MSI and TIDE from ref. ⁵⁸, immune resistance signatures from ref. ¹⁰¹ and IMPRES from ref. ¹⁰² was evaluated using their respective algorithms with default settings. For the remaining features, average gene expression was computed using log₂ TPM data.

The association between each feature and ICI response was assessed using a z-score derived from a two-sided Wilcoxon rank-sum test, within each dataset. These z-scores were then combined across datasets using Liptak’s method¹⁰⁶, weighted by the square root of sample sizes. The resulting combined z-scores were converted to two-sided P-values (Supplementary Table 16). For the analysis in Fig. 3h, data from all four cancer types were included, whereas the comparison in Fig. 5b was restricted to melanoma datasets.

SE levels versus TMB and CD274 expression in ICI-treated patients

Of the pretreatment bulk RNA-seq tumour profiles analysed in ‘Associations with immunotherapy response’, 465 patients from four studies with melanoma (n = 150), non-small cell lung cancer (n = 43) or bladder cancer (n = 272) have whole-exome-sequencing-derived TMB data available (Supplementary Table 15). TMB values, defined as the total number of non-synonymous mutations per patient, and CD274 (encoding PD-L1) expression levels were log₂-transformed. For each dataset, univariate Cox proportional hazards regression models were fitted to evaluate the association between standardized feature levels (SE7, SE8, SE4, TMB and CD274 expression) and overall survival. The resulting HRs and their associated standard errors were pooled across datasets within each cancer type, and across cancer types, using a nested random-effects meta-analysis implemented in the rma.mv function of the metafor R package¹⁰⁷ (v.4.8.0), with default parameters. Specifically, for each covariate, we used rma.mv to combine the log-hazard ratios and their corresponding variances (standard error squared) across outer and inner grouping factors (cancer type and dataset, respectively), then extracted pooled HRs, 95% confidence intervals and associated P-values to generate the forest plot in Fig. 3i. Because all covariates were standardized, each HR reflects the association with overall survival for the same (1 s.d.) change in the predictor, enabling direct comparison of their relative influence. For the analysis in Extended Data Fig. 9f, multivariable Cox proportional HR models were applied to each dataset, including standardized levels of SE8, SE7 or SE4 jointly with TMB and CD274 expression. The resulting log-HRs and standard errors were combined using the same nested random-effects framework described above.

Liquid EcoTyper framework

Current analyses of the tumour microenvironment rely on invasive solid-tumour biopsies, which are prone to sampling bias and generally restricted to a single diagnostic biopsy. This limitation hinders the application of SEs as biomarkers in clinical settings. To address these challenges, we developed Liquid EcoTyper, a deep-learning framework for non-invasive profiling of SEs using plasma cfDNA methylation profiles.

Liquid EcoTyper is built around a CpG set binary network (CSBN), in which informative CpG sets and associated weights are learnt simultaneously within a unified framework to enable multivariate prediction of SE levels (Fig. 4a). This CSBN approach draws on the gene set binary network model originally introduced for predicting cellular developmental potential from single-cell RNA-seq data¹⁰⁸. Notably, sample methylation profiles are encoded at the CpG set level for model inference, analogous to gene sets in the previous study. This representation improves robustness to batch effects and technical dropout in methylation sequencing data, and enhances generalizability across data types, including both tumour and plasma methylation profiles.

Liquid EcoTyper network architecture

Input and output

As input, the model takes an n × s matrix X containing the preprocessed methylation levels of n CpGs over s samples (section ‘Methylation data preprocessing’ below). At evaluation, the model yields an s × l matrix Ŷ containing the predicted levels of l classes over s samples. Model classes include SEs, non-SE (section ‘Recovery of SE-specific cell states’ above) and a ‘background’ class representing DNA not derived from the tumour compartment (Fig. 4a).

CpG set binary network

Model input X is passed to a core binary module in which m CpG sets are learnt in binary n × m matrix W^B. As described previously¹⁰⁸, W^B has a continuous equivalent W used during training for model initialization and back-propagation that undergoes binarization at each forward pass. Here, CpG sets encoded within W^B are scored simply by averaging normalized methylation values over the selected CpGs per sample:

$$S:=\rms\rmc\rmo\rmr\rme(X,W^B)\in \mathbbR^s\times m,$$

$$S_\bullet ,1\le j\le m=\fracX^\rm\top W_\bullet ,j^B\parallel W_\bullet ,j^B\parallel _1,$$

where$\,S_\bullet ,j$ and $W_\bullet ,j^B$ denote the j-th columns of the respective matrices. Scores are standardized across samples by batch normalization, yielding s × m score matrix S^norm.

Prediction layer

The CpG set scores encoded in S^norm are then passed through a linear layer to produce s × l matrix Q. This matrix is then transformed using the sigmoid function σ and rescaled to yield the final output prediction Ŷ:

$$P=\rm\sigma (Q)+0.001\in \mathbbR^s\times l,\hatY_1\le i\le s,\bullet =\fracP_i,\bullet \parallel P_i,\bullet \parallel _1.$$

where $\hatY_i,\bullet $ and $P_i,\bullet $ denote the i-th rows of the respective matrices.

Liquid EcoTyper training

The Liquid EcoTyper model was implemented and trained using PyTorch 2.2.0. In practice, the model feature space is of size n = 38,431 CpGs and the CpG set binary module included m = 400 CpG sets.

Loss function

For model training we define a custom loss function incorporating both the mean cross-SE Pearson correlation per sample and the mean cross-sample Pearson correlation per SE. In combination with the model structure, this loss function prioritizes robust maintenance of linear relationships, particularly of SE levels across samples—essential for clinical utility—while discouraging overfitting to training values. In detail, we define prediction loss as follows:

$$\beginarrayc\rmP\rmr\rme\rmd\rmL\rmo\rms\rms(\hatY,Y)=1-\frac1s\mathop\sum \limits_i=1^s\rmP\rme\rma\rmr\rms\rmo\rmn(\hatY_i^\rm\top ,Y_i^\rm\top )\\ \,\,\,\,\,\,\,\,\,+\,1-\frac1l\mathop\sum \limits_j=1^l\rmP\rme\rma\rmr\rms\rmo\rmn(\hatY_j,Y_j),\endarray$$

where Ŷ and Y denote predicted and true sample level matrices, respectively, and subscripts indicate matrix columns. Along with the prediction loss, we also include a term penalizing CpG set size as described previously⁷⁸, designed to provide additional regularization to the model. Full model loss is then computed as

$$\rmL\rmo\rms\rms(\hatY,Y,W^B)=\rmP\rmr\rme\rmd\rmL\rmo\rms\rms(\hatY,Y)+\lambda \parallel \frac1n(W^B)^\rm\top (W^B)\odot I\parallel _F,$$

where $\lambda $ denotes the CpG set size penalty weight, I denotes the m × m identity matrix, ⊙ denotes the Hadamard product, and $\parallel \bullet \parallel _F$ denotes the Frobenius norm. In practice, $\lambda $ was set to $\sqrt10$ to balance loss terms.

Model regularization

To support robustness to technical drop-out in EM-seq data, drop-out was applied to model input matrix X during training at a rate of 0.5.

Model initialization and updates

Model weights were given default initialization except binary weight matrix W, which here was initialized with values sampled from the Gaussian distribution with mean μ = –0.125 and standard deviation σ = 0.055 to give a highly sparse initial matrix targeting 400–500 CpGs initially selected per set. At each iteration, model parameters were updated using PyTorch’s NAdam optimizer with learning rate lr = 0.001 and cross-epoch gradient accumulation given the stabilizing role of inertia in training binary neural networks^109,110.

Model training, evaluation and stopping

Models were trained over ten random splits (80% training, 20% validation) of the simulated training cohort (section ‘Simulation of plasma cfDNA with tumour contribution’ below). Each model was trained for 40 epochs, with model performance by epoch evaluated over training and validation sets using the PredLoss function described above (‘Loss function’ above). Performance on the validation split was used for early stopping. Final model weights were selected corresponding to the epoch yielding best validation performance.

Model ensembling

For all evaluations outside the training framework, the outputs of all ten models (from ten random folds) are averaged to yield a single ensembled prediction matrix.

Initial feature selection

To prepare a reduced model feature space for efficient training, informative CpGs per SE were selected from the TCGA melanoma cohort, which includes paired methylation and bulk RNA-seq profiles (n = 461 tumours) obtained from the TCGA PanCanAtlas (https://gdc.cancer.gov/about-data/publications/pancanatlas). First, the set of CpGs was reduced to those detected across all TCGA melanoma tumours. For each SE, we then performed differential methylation analysis in the training cohort to identify differentially methylated CpGs. Specifically, for each SE class, we grouped samples with inferred SE abundance from paired bulk RNA-seq data above the 75th or below the 25th quantile. We then assessed each CpG for differential methylation between groups. We computed the significance of group difference by the Wilcoxon rank-sum test and the magnitude by absolute value of the difference of group means, $\Delta =|\mu _75-\mu _25|$, selecting CpGs satisfying P < 0.05 and $\Delta $ > 0.1. If more than 5,000 CpGs met both criteria for an SE, we selected the top 5,000 by differential methylation magnitude. Selection of up to 5,000 CpGs was done separately for positive and negative differential methylation, allowing for a total of up to 10,000 CpGs to be selected per SE. To ensure adequate features for SE recovery from methylation data, we required a minimum of 1,000 informative CpGs per SE. In practice, ecotype SE6 did not meet this criterion for the melanoma model and was omitted. In total, this yielded a feature space of 38,431 CpGs.

Methylation data preprocessing

Mapping EM-seq CpGs to HM450K probe IDs

Methylation observations from EM-seq data, given in terms of CpG locations (chromosome, start and end coordinates), were matched to HM450 probe IDs using the GRCh38 HM450 manifest downloaded from https://zwdzwd.github.io/InfiniumAnnotation. In brief, the manifest was reduced to entries with defined values for CpG location as well as HM450 probe ID. Any duplicate entries per CpG location were subsequently dropped, yielding a one-to-one mapping.

Normalization and imputation

Data were subset to model features with methylation values subtracted from one and normalized to zero mean and unit variance over model CpGs per sample. Missing values were then imputed per CpG by dataset mean when defined and replaced by zeros otherwise.

Simulation of plasma cfDNA with tumour contribution

Because Liquid EcoTyper requires training data with known composition, and because of the limited availability of paired tumour and plasma cfDNA data, we simulated a training dataset by combining plasma cfDNA and tumour methylation profiles (Fig. 4b and Extended Data Fig. 10a). In doing so, we considered several factors. First, different malignant and TME cell states may exhibit variable kinetics of DNA shedding into circulation. We therefore prioritized cross-patient relative levels of SEs during training (‘Loss function’ above). Second, and relatedly, real biopsy samples, including those from patients with metastatic cancer, are prone to sampling bias, so plasma and ground-truth tumour SEs need not match exactly. Third, even cfDNA samples from patients with cancer with undetectable circulating tumour DNA by mutation analysis may include some tumour (and TME) methylation signal. Therefore, to isolate TME-derived signal during model training, we leveraged healthy donor plasma cfDNA to serve as a background class. Finally, inter-subject variation is much lower among cfDNA samples from healthy controls than from patients with cancer in our data (Extended Data Fig. 10b). We therefore considered it justifiable to combine methylomes from different healthy individuals to increase variation and boost the size of the training cohort (‘Generation of additional background cfDNA profiles’ below).

Accordingly, we designed simulated mixtures to contain both ‘background’ cfDNA and tumour tissue contributions, with the background compartment derived from plasma cfDNA profiled from cohort 2 healthy individuals (n = 23 in total; partitioned into n = 18 for training cohort and n = 5 for test cohort) and the tumour compartment derived from the TCGA melanoma methylation cohort (n = 461 in total; partitioned into n = 346 for training cohort and n = 115 for test cohort) (Fig. 4b,c and Extended Data Fig. 10a). Paired bulk RNA-seq data from TCGA tumour samples enabled the recovery of tissue SE levels through bulk deconvolution, as described in section ‘NMF model training for bulk deconvolution’ above.

Generation of additional background cfDNA profiles

Given the rationale outlined above, to expand the pool of healthy cfDNA profiles, we prepared new profiles from the samples of each cohort separately to reach the same number of samples as available from tumour tissue. For each new profile, we randomly selected three samples from the appropriate cohort (training or test), perturbing the methylation profiles of each sample with noise to reduce collinearity across new mixtures, then combining according to randomly generated fractions.

In detail, fractional composition was generated by sampling from the unit interval uniformly at random for each sample, then rescaling the three values to sum to one, yielding mixing fractions ϕ₁, ϕ₂ and ϕ₃. Noise perturbations were applied multiplicatively to raw methylation values, with noise level parameters ν₁, ν₂ and ν₃ generated by sampling from the normal distribution. Given sample methylation profiles M₁, M₂ and M₃, perturbed profiles $\hatM_1,\hatM_2\,\rmand\,\hatM_3$ were computed as follows:

$$\hatM_i=min((1+s\nu _i)M_i,1),$$

where s denotes a scaling parameter, set here to s = 0.02, and min operates element-wise to cap resulting methylation values at 1. With these mixing parameters defined, the resulting new healthy cfDNA profile M_new is given by:

$$M_\rmn\rme\rmw=\mathop\sum \limits_i=1^3\phi _i\hatM_i.$$

Combination of background and tumour compartments

With the resulting background cfDNA profiles now matching the tumour tissue profiles in number, we generated the final sets of simulated samples by matching background and tumour profiles one-to-one in the training and test cohorts (Extended Data Fig. 10a). For each simulated sample, we generate mixing fractions ϕ_T and ϕ_B of tumour and background, respectively, by sampling ϕ_B uniformly at random from a desired range, then computing ϕ_T = 1 − ϕ_B as the complement. This range was selected here as [0.2, 0.6] to weight fractions in favour of tumour contribution while preserving substantial background for regularization to support model extensibility to both tumour tissue and plasma cfDNA methylation profiles. Of note, tumour samples contain highly variable malignant cell content (‘Validation of simulated samples’ below), capturing a broad range of both malignant and TME DNA (Extended Data Fig. 10b). As such, this approach enables sufficient exposure of the model to diverse tumour fractions during training, facilitating generalizability. The resulting simulated samples are then constructed as:

$$M_\rmn\rme\rmw=\phi _TM_T+\phi _BM_B,$$

where M_T denotes the raw methylation profile of the selected tumour tissue sample and M_B denotes the raw methylation profile of the selected background profile, newly generated as described above.

Ground-truth composition of simulated samples

Tumour-specific SE levels used for simulating cfDNA above were inferred by applying the Spatial EcoTyper deconvolution model to tumour RNA-seq profiles as described above in section ‘NMF model training for bulk deconvolution’. Following this, any SE with inadequate feature support in Liquid EcoTyper (for example, SE6 in the melanoma-specific model) was omitted and the remaining sample-level deconvolution results were rescaled to unit sum. To define ground-truth SE levels in simulated plasma cfDNA samples, we then scaled the SE levels in corresponding tumour samples by ϕ_T, with the ground-truth background level given by ϕ_B, for each simulated sample.

Validation of simulated samples

To determine whether simulated profiles are a suitable proxy for authentic cfDNA profiles, we performed a direct assessment against real cfDNA samples from healthy individuals and patients with melanoma. Profiles from cohort 2 healthy individuals (n = 23), cohort 3 patients with melanoma (n = 78) and simulated training and test cohorts (n = 461) were jointly embedded by PCA. This embedding revealed tight clustering of healthy plasma cfDNA methylomes, with some melanoma cfDNA samples overlapping the healthy region but many others scattering out further (Extended Data Fig. 10b, left). Simulated cfDNA methylomes overlapped real melanoma cfDNA methylomes (Extended Data Fig. 10b, left).

Given these results, we hypothesized that the embedding distribution was organized by sample tumour content. Thus, we extended the visualization to colour by ctDNA percentage (Extended Data Fig. 10b, right). For real cfDNA samples, ctDNA percentage was given by AVENIO for patients with melanoma (n = 60; Supplementary Table 26) and set to zero for healthy individuals. We computed an effective ctDNA percentage for simulated profiles by multiplying the total tumour fraction by the tumour purity using consensus measurement of purity estimations¹¹¹, available for 456 of 461 samples. The resulting embedding revealed a gradient of tumour content across the embedding for both real and simulated melanoma cfDNA samples, with samples harbouring the lowest tumour content placed closer to healthy plasma (Extended Data Fig. 10b, right).

Performance assessment of Liquid EcoTyper

Application to held-out simulated data

We tested Liquid EcoTyper’s ability to recover ground-truth levels of SEs from plasma methylation profiles by application to the held-out test cohort of simulated data (n = 115; section ‘Simulation of plasma cfDNA with tumour contribution’ above). Performance for each SE individually, as well as for recovery of the underlying cross-correlation structure of SE levels, was assessed by Spearman correlation between ground-truth and predicted levels (Fig. 4c and Extended Data Fig. 10c).

Extension to carcinomas

To evaluate generalizability, we prepared Liquid EcoTyper models for 13 carcinoma types profiled by TCGA, each with paired bulk RNA-seq and methylation arrays for more than 100 tumour samples. Colon adenocarcinoma (COAD) and rectum adenocarcinoma (READ) were combined as colorectal cancer (CRC). For each carcinoma type, we repeated the procedures described in sections ‘Initial feature selection’, ‘Simulation of plasma cfDNA with tumour contribution’, ‘Model training’ and ‘Application to held-out simulated data’ above. As detailed in ‘Initial feature selection’, any SE with initial feature set selection resulting in fewer than 1,000 associated CpGs was excluded from model training and evaluation (SE1 from two carcinomas, SE3 from one carcinoma, SE6 from four and SE9 from one). We aggregated model performance results across carcinoma types by average Spearman correlation coefficients between predicted (Liquid EcoTyper) and expected SE levels for held-out test sets (Extended Data Fig. 10d, left).

We also evaluated a pan-carcinoma leave-one-out framework, training Liquid EcoTyper models over 12 carcinoma types at a time (150 randomly selected tumour samples per type to balance representation, totalling 1,800 training samples) and evaluating performance on each held-out carcinoma type in turn. For each pan-carcinoma model, we followed the same process as described above, with the exception that the initial CpG feature set for each cohort was selected according to a consensus mechanism across included carcinoma types. For each SE and methylation direction, CpGs were ranked by selection frequency across per-carcinoma feature sets and the top 5,000 were included. With this approach, all SEs had adequate CpG feature set coverage. To promote a fair comparison with carcinoma-specific models (above), for each held-out carcinoma type c, we excluded any SEs from model evaluation that were also excluded from the model exclusively trained on carcinoma c. Results were aggregated as described above (Extended Data Fig. 10d, centre).

Paired tumour and plasma from patients with melanoma

We assessed concordance of Liquid EcoTyper predictions for plasma cfDNA collected from 23 patients with melanoma (cohort 2) (Fig. 4d) against: first, SE levels inferred from matched tumour Visium or Visium HD by Spatial EcoTyper (15 patients); second, Liquid EcoTyper predictions from matched tumour EM-seq (20 patients); and third, Liquid EcoTyper predictions from matched PBMC EM-seq (Supplementary Table 17), as shown in Fig. 4e–i and Extended Data Fig. 11b.

For five tumour samples, EM-seq was performed on both FFPE and fresh-frozen sections (Supplementary Table 17). Given that SE levels inferred by Liquid EcoTyper were largely concordant across replicates (Extended Data Fig. 11a), we averaged SE levels across replicates for subsequent analyses. SE levels for Visium data were inferred as described above in ‘Paired bulk RNA-seq and Visium ST’ following quality control as described in ‘10x Visium (standard)’ in Supplementary Methods. For cases in which Visium replicates of adjacent sections were available (two patients, each with two samples profiled by Visium), we averaged SE levels across replicates (Supplementary Table 17). Two Visium HD samples were also included in the comparison with paired plasma EM-seq data, subject to quality control, as described in ‘10x Visium HD data’ in Supplementary Methods. To avoid platform-specific variation in analysing Visium and Visium HD data jointly, SE abundances for Visium HD were inferred for 16-µm bins using the same Spatial EcoTyper deconvolution protocol as described for Visium (section ‘Paired bulk RNA-seq and Visium ST’ above).

Liquid EcoTyper outputs were renormalized to unit sum excluding inferred plasma cfDNA background levels. We then evaluated the consistency of SE levels between tumour ST and plasma cfDNA EM-seq (Fig. 4e–h), as well as between tumour EM-seq and plasma cfDNA EM-seq (Fig. 4e,i and Extended Data Fig. 11b), and between PBMC EM-seq and both tumour and plasma EM-seq (Fig. 4i). To emphasize relative ordering, SE levels were ranked and normalized to the 0–1 range in each compartment, with concordance then quantified by Spearman correlation for each SE. All of the SE levels are provided in Supplementary Table 18.

To visualize the most prominent SE signals in Fig. 4h (centre), the levels of each SE (SE7 or SE4) in the Visium data were first separately standardized to mean zero and unit variance across all spots and samples. Next, for each sample, the 95th percentile of SE7 and SE4 levels together was determined, and SE levels were binarized by setting all values greater than this threshold to one and all other values to zero. The bar plots in Fig. 4h (bottom) show the means of the binarized data.

Validation of Liquid EcoTyper through learnt CpG sets

To further assess the ability of Liquid EcoTyper to learn biologically grounded methylation profiles of spatial cellular ecosystems, we performed a series of experiments leveraging the CpG sets learnt by the model to determine whether model predictions successfully target each SE accurately, specifically and in accordance with known biological features.

Concordance with ground-truth associations

We first assessed whether Liquid EcoTyper successfully learns and preserves the associations of each CpG set with ground-truth SE composition. For each CpG set extracted from the model, we averaged the methylation levels of its component CpGs in each training cohort sample, then compared the resulting quantities to ground-truth SE levels, for each SE, by Pearson correlation. We then replaced ground-truth with predicted SE levels and recomputed the associations for each CpG set–SE pair. Concordance between associations with ground-truth and predicted SE levels was evaluated using Pearson correlation across all CpG sets, separately for each SE (Extended Data Fig. 10e).

Model specificity

To assess the ability of Liquid EcoTyper to target each SE specifically for predictions, we evaluated the impact of ablating the top associated CpG sets per SE on model performance. For each SE, we selected all CpG sets with ground-truth Pearson correlations greater than 0.25 in absolute value among the training cohort (as determined from ‘Concordance with ground-truth associations’ above) and set the corresponding entries of the binary matrix W^B encoding the learnt CpG sets to zero for each ensemble model. We then applied the resulting model to the held-out test cohort methylomes, assessing Liquid EcoTyper specificity for the SE in terms of performance loss. We defined a performance loss index as the fractional difference in Spearman correlations ρ of predicted versus ground-truth SE levels between the original (unablated) and ablated Liquid EcoTyper models:

$$\frac\rho _\rmo\rmr\rmi\rmg\rmi\rmn\rma\rml-\rho _\rma\rmb\rml\rma\rmt\rme\rmd\rho _\rmo\rmr\rmi\rmg\rmi\rmn\rma\rml.$$

Performance loss was calculated for all SEs following the ablation of each SE individually and comparing the loss for the ablated SE (on-target) with that for other SEs (off-target) (Extended Data Fig. 10f). For ease of interpretability, performance loss index values were capped to a range of 0 to 1, defined as performance losses of 0% (ρ_ablated ≥ ρ_original) and 100% (ρ_ablated ≤ 0), respectively.

Biological interpretability

Hypomethylated CpGs proximal to the transcription start site (TSS) (for example, promoters, the first exon and the first intron) are generally associated with elevated gene expression¹¹². Because Liquid EcoTyper learns CpG sets that include promoters and gene bodies, and because the Illumina Infinium HumanMethylation450 (HM450) BeadChip array is enriched for regulatory and TSS-proximal CpGs¹¹³, we proposed that, on average, learnt CpGs within these regions would reflect this general relationship. Therefore, to investigate links between CpG methylation and SE classes (Extended Data Fig. 10g–i), we focused on model CpGs (section ‘Initial feature selection’ above) that overlap with promoter regions, defined as the 1-kilobase region upstream of the TSS, and the gene body, with hg38 coordinates obtained from the HGNC track at https://genome.ucsc.edu/cgi-bin/hgTables.

For each SE, we constructed a ranked gene list from model CpGs and ground-truth CpG set associations among the training cohort. To do so, each CpG set was first converted to a gene set comprising all of the unique HUGO gene symbols overlapping the selected CpGs in either promoter regions or gene body at least once. For each gene g within a given SE, we then computed the average ground-truth SE association (‘Concordance with ground-truth associations’ above) over all CpG sets in which gene g is represented, resulting in gene-level methylation scores for each SE. The gene-level scores for each SE—along with non-SE and background components—were converted into rank space with higher ranks denoting lower scores. This yielded a rank matrix R consisting of 8,860 evaluable genes (rows) by 10 components (columns). To prioritize SE-specific features, we then calculated, for each gene in R, the delta in rank space between each SE and the maximum rank among other components (the remaining SEs, non-SE and background). This resulted in a new matrix D, comprising deltas for each gene for all ten components. We then removed non-SE and background components from R and D, averaged the resulting matrices to exploit complementary information (ensuring that SEs of the same class were averaged), and median-subtracted each column to yield equal numbers of positive and negative ranks for each SE. Finally, we subjected the resulting SE-specific ranked lists to fgsea (v.1.25.1)¹¹⁴, testing the enrichments of SE-specific consensus genes (Extended Data Fig. 10g,h and Supplementary Table 11). Global significance (Extended Data Fig. 10i) was determined by calculating the number of top-ranked diagonal matches (row or column) across all possible permutations of the ranked lists, yielding an exact permutation P-value.

Consistency of associations

To assess faithfulness of model predictions to the learnt CpG set associations, we applied the procedure outlined above in ‘Concordance with ground-truth associations’ to compute CpG set associations with predicted SE levels for melanoma plasma cfDNA collected as part of cohorts 2, 3, and 4, split by institution (in total, 79 samples collected at Yale and 17 at Washington University in St. Louis; Supplementary Table 21). We then compared the resulting CpG associations between institutions using Pearson correlation for each SE (Supplementary Fig. 8).

Robustness of Liquid EcoTyper

To evaluate the robustness of plasma SE recovery to technical variation, we performed the following two experiments.

Plasma SE recovery versus sequencing depth

For the analysis in Extended Data Fig. 11c, we performed a down-sampling experiment using cohort 2 plasma samples with paired tumour ST profiling. Aligned read pairs were randomly downsampled using samtools¹¹⁵ (v.1.18) at predefined fractions to achieve target sequencing depths of 20×, 15×, 10× and 5× for each sample. Down-sampled BAM files were subsequently processed through the remainder of the pipeline described in ‘Enzymatic methyl sequencing’ in Supplementary Methods. Samples with an original sequencing depth below a given target were retained at their original depth. Liquid EcoTyper performance was then re-evaluated as described above in ‘Paired tumour and plasma from patients with melanoma’ using down-sampled plasma samples. Performance remained robust down to 10× depth (the minimum depth in this study), with a significant but modest decline observed at 5× depth (Extended Data Fig. 11c).

Plasma SE recovery versus CpG imputation approach

To evaluate whether the approach for imputing missing CpG methylation values (‘Methylation data preprocessing’ above) influences Liquid EcoTyper predictions (Extended Data Fig. 11d,e), we tested an alternative imputation strategy in which missing values were imputed uniformly at random from the range [0,1] before data normalization. Although the default approach leverages other methylation profiles in the cohort to infer missing values, this alternative uses no prior knowledge and instead adds random noise. Applied to cohort 2 plasma samples with paired tumour ST profiling, the random-imputation approach had only a small effect on samples with low coverage (higher imputation fraction), and there was no significant global difference in SE recovery performance, emphasizing robustness.

Association between liquid SE levels and clinical outcomes in ICI-treated patients

To evaluate the clinical relevance of SE levels in plasma cfDNA from patients with melanoma, we applied Liquid EcoTyper as described above in ‘Paired tumour and plasma from patients with melanoma’ to infer SE levels in 78 pretreatment plasma samples from cohort 3 (Fig. 5a and Supplementary Tables 19 and 26) and 10 plasma samples from cohort 4 (Supplementary Tables 20 and 26).

Liquid SEs versus ICI response

The association between each SE and ICI response was assessed using a two-sided Wilcoxon rank-sum test, comparing patients with DCB to those with NDB in each dataset (Fig. 5c,e, Extended Data Fig. 12a,g and Supplementary Tables 19, 20 and 27) and by AUC (Extended Data Fig. 12f and Supplementary Table 27).

Liquid SEs versus survival

For Kaplan–Meier analyses (Fig. 5d,f and Extended Data Fig. 12b–d), we dichotomized patients based on the median level of each liquid SE (SE7, SE8 and SE4) in cohort 3 (Supplementary Table 18). Differences in OS and PFS were assessed using a two-sided log-rank test with the survival (v.3.6.4) R package. Associations with OS and PFS were also evaluated using both univariate and multivariable Cox regression models, including models combining continuous liquid SE7, SE8 or SE4 levels with other clinical indices, such as sex, age, ICI type, melanoma subtype and BRAF mutation status. Results are provided in Extended Data Fig. 12e and Supplementary Table 28. Additional Cox models evaluating the associations between OS and continuous liquid SE levels, ctDNA levels, TMB (log₂-transformed number of non-synonymous mutations per megabase) and PD-L1 percentages (tumour proportion score, TPS), both alone and in bivariable models comparing liquid SE7, SE8 or SE4 levels against each of the above covariates, are provided in Extended Data Fig. 12h and Supplementary Table 27. For PD-L1 TPS values reported as ranges (for example, 1–5), the median of the range was used. For PD-L1 TPS values reported as inequalities, less than x or more than x, we subtracted or added 0.5 to x, respectively (Supplementary Table 24). Finally, we performed a time-dependent AUC(t) analysis for comparative OS prediction using the timeROC function of the timeROC R package¹¹⁶ (v.0.4 with default parameters) (Supplementary Fig. 9). The 6–18-month interval after ICI initiation was selected because it captures the minimum time needed for gauging durable clinical benefit (6 months) and provides additional time for delayed benefit. It also avoids losing patients to follow-up and minimizes the sparsity of events in cohort 3.

Statistics and reproducibility

Unless otherwise noted, all statistical tests were two-sided. Two-group comparisons were conducted with Wilcoxon rank-sum or signed-rank tests as appropriate. For comparisons of SE levels imputed from tissue expression data (bulk, Visium ST and scRNA-seq), which are expected to be linearly related, Pearson correlations were applied to assess concordance. For tissue versus plasma comparisons of SE levels, Spearman correlations were used to assess the directional concordance of SE levels because strict assumptions of normality and linearity could not be guaranteed. For Cox regression models, the proportional hazards assumption was verified for each covariate by evaluating the Schoenfeld residuals.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.