Coordinated inheritance of extrachromosomal DNAs in cancer cells

Cell culture

The TR14 neuroblastoma cell line was a gift from J. J. Molenaar (Princess Máxima Center for Pediatric Oncology). Cell line identity for the master stock was verified by STR genotyping (IDEXX BioResearch). The GBM39-KT cell line was derived from a patient with glioblastoma undergoing surgery at Mayo Clinic, Rochester, Minnesota as described previously⁵⁵. Monoclonal spheroids were isolated from GBM39-KT cells by limiting dilution to generate GBM39-KT-D10. The CA718 cell line was derived from a patient with glioblastoma as described previously⁵ and was obtained from the University of California San Diego Moores Cancer Center. Parental SNU16, COLO 320DM, H716 and HCT116 cells were obtained from ATCC. The monoclonal SNU16m1 was a sub-line of the parental SNU16 cells generated from a single cell after lentiviral transduction and stable expression of dCas9-KRAB as we previously described⁴. SNU16 and SNU16m1 cells were maintained in Dulbecco’s modified Eagle’s medium/nutrient mixture F-12 (DMEM/F12 1:1; Gibco, 11320-082), 10% fetal bovine serum (FBS; Hyclone, SH30396.03) and 1% penicillin–streptomycin (Thermo Fisher Scientific, 15140-122). COLO 320DM cells were maintained in DMEM (Thermo Fisher Scientific, 11995073) supplemented with 10% FBS and 1% penicillin–streptomycin. GBM39-KT cells were maintained in DMEM/F12 1:1, B-27 supplement (Gibco, 17504044), 1% penicillin–streptomycin, GlutaMAX (Gibco, 35050061), human epidermal growth factor (EGF, 20 ng ml⁻¹; Sigma-Aldrich, E9644), human fibroblast growth factor (FGF, 20 ng ml⁻¹; Peprotech) and heparin (5 µg ml⁻¹; Sigma-Aldrich, H3149-500KU). TR14 cells were grown in RPMI 1640 with 20% FBS and 1% penicillin–streptomycin. For the mitotic cell imaging experiments in Fig. 2, SNU16m1 cells were grown in RPMI 1640 with 10% FBS. H716 cells were grown in ATCC-formulated RPMI 1640 (Gibco, A1049101) with 10% FBS and 1% penicillin–streptomycin–glutamine. COLO 320DM cells used for live-cell imaging, PC3 cells and HCT116 cells were cultured in DMEM (Corning, 10-013-CV) with 10% FBS and 1% penicillin–streptomycin–glutamine. All cells were cultured at 37 °C with 5% CO₂. All cell lines tested negative for mycoplasma contamination.

Chemicals

The BRD4 bivalent degrader was a gift from M. M. Hassan and N. S. Gray, and was resuspended in DMSO as a 10 mM stock⁵⁶. Triptolide (Millipore, 645900) was resuspended in DMSO as a 55 mM stock and was used at a final concentration of 10 µM. Actinomycin D (Millipore Sigma, SBR00013) was used at a final concentration of 5 µg ml⁻¹. DRB (Sigma-Aldrich, D1916) was resuspended in DMSO as a 70 mM stock and was used at a final concentration of 200 µg ml⁻¹. ZL-12A was synthesized as reported previously⁴⁶, resuspended in DMSO as a 20 mM stock and used at a final concentration of 50 µM for 3 h. In the pretreatment assay with triptolide, ZL-12A was added for 3 h, followed by a wash-off with 1× PBS and the addition of DMSO or triptolide (10 µM) for 3.5 h.

Genetic knockout of CIP2A

CIP2A-knockout cells were created from SNU16m1 cells as follows. We designed a guide RNA sequence targeting the protein-coding region of CIP2A using CHOPCHOP⁵⁷ (https://chopchop.cbu.uib.no), as well as a non-targeting control sgRNA (guide sequences are provided in Supplementary Table 1). To deliver each guide with CRISPR–Cas9 into cells, we mixed purified S. pyogenes Cas9 nuclease (Alt-R S.p. Cas9 Nuclease V3; IDT, 1081058) with each single-guide RNA (sgRNA; diluted to 30 µM in 1× TE buffer; Synthego) at a 1:6 molar ratio in Neon Resuspension Buffer R (Thermo Fisher Scientific) and incubated the mixture at room temperature for 10 min to form Cas9 ribonucleoprotein (RNP) complexes. SNU16m1 cells were collected and washed twice with 1× PBS before being resuspended in Buffer R with Cas9 RNPs for a final concentration of 300,000 cells per 10 µl Neon reaction with 0.71 µM Cas9 complexes. Transfection was performed using the Neon Transfection System (Thermo Fisher Scientific, MPK5000) according to the manufacturer’s protocol using 10 µl tips with the following parameters: 1,400 V, 20 ms, 2 pulses. Three Neon reactions per guide condition were combined, resulting in 900,000 cells for either the control or CIP2A-knockout genotype.

WGS

WGS libraries were prepared by DNA tagmentation. We first transposed genomic DNA with Tn5 transposase produced as previously described⁵⁸, in a 50 µl reaction with TD buffer⁵⁹, 50 ng DNA and 1 µl transposase. The reaction was performed at 50 °C for 5 min, and transposed DNA was purified using the MinElute PCR Purification Kit (Qiagen, 28006). Libraries were generated by 5–7 rounds of PCR amplification using the NEBNext High-Fidelity 2× PCR Master Mix (NEB, M0541L), purified using the SPRIselect reagent kit (Beckman Coulter, B23317) with double size selection (0.8× right, 1.2× left) and sequenced on the Illumina NextSeq 550 or the Illumina NovaSeq 6000 platform. Reads were trimmed of adapter content with Trimmomatic⁶⁰ (v.0.39) and aligned to the hg19 genome using BWA MEM⁶¹ (0.7.17-r1188), and PCR duplicates were removed using Picard’s MarkDuplicates (v.2.25.3). WGS data from bulk SNU16 cells were previously generated (SRR530826, Genome Research Foundation).

Analysis of ecDNA sequences in TCGA patient tumours

We performed ecDNA detection based on bulk WGS data from TCGA using the AmpliconArchitect (AA) method for genomic focal amplification analysis. The outputs of this method were previously published¹⁹. In brief, this approach for detecting ecDNA uses three general steps, which are wrapped into a workflow we call AmpliconSuite-pipeline (https://github.com/AmpliconSuite/AmpliconSuite-pipeline, v.1.1.1). First, given a BAM file, the analysis pipeline detects seed regions where copy-number amplifications exist (CN > 4.5 and size between 10 kb and 10 Mb). Second, AA performs joint analysis of copy number and breakpoint detection in the focally amplified regions, forming a copy-number-aware local genome graph. AA extracts paths representing genome structures and substructures from this graph that explain the changes in copy number. Last, a rule-based classification is performed using AmpliconClassifier (AC)⁶², based on the paths extracted by AA, to predict the mode of focal amplification. This includes assessing structural variant types, segment copy numbers and the structure of the genome paths extracted by AA. Moreover, AC identifies ecDNA cycles based on criteria such as cyclic path length and copy number, providing a comprehensive classification system for amplicons on the basis of their structural characteristics. For example, if the changes in copy number are explained predominantly by one or more circular genome paths featuring a structural variant enclosing them with a head-to-tail circularization, this is consistent with an ecDNA mode of amplification, whereas a breakage-fusion-bridge genome structure contains multiple foldbacks and multiple genomic segments arranged in a palindrome. The complete classification criteria and description of the AC tool are available in the supplementary information of ref. ⁶².
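The seed-region criteria in the first step can be illustrated with a minimal Python sketch. It is not the AmpliconSuite-pipeline implementation: the `Segment` type, the function names and the choice of inclusive size bounds are assumptions made for illustration, and the real pipeline applies additional filtering not shown here.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    """Hypothetical copy-number segment (not an AmpliconSuite data type)."""
    chrom: str
    start: int          # start coordinate, bp
    end: int            # end coordinate (exclusive), bp
    copy_number: float


MIN_CN = 4.5            # CN > 4.5, as stated in the text
MIN_SIZE = 10_000       # 10 kb
MAX_SIZE = 10_000_000   # 10 Mb


def is_seed(seg: Segment) -> bool:
    """Apply the copy-number and size thresholds to one segment.

    Treating the size bounds as inclusive is an assumption.
    """
    size = seg.end - seg.start
    return seg.copy_number > MIN_CN and MIN_SIZE <= size <= MAX_SIZE


def select_seeds(segments):
    """Keep only the segments that qualify as seed regions."""
    return [s for s in segments if is_seed(s)]
```

In this sketch, a 1.5 Mb segment at copy number 40 passes, whereas a 5 kb segment (too small) or a segment at copy number 3 (too low) is rejected.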

We used AA (v.1.0) outputs from a previous study¹⁹, and classified the focal amplification types present in these outputs using AC (v.0.4.14) with the ‘--filter_similar’ flag set and otherwise the default settings. The ‘--filter_similar’ option removes probable false-positive focal amplification calls that contain far greater-than-expected levels of overlapping structural variants and shared genomic boundaries between ecDNAs of unrelated samples. In brief, AC scores the structural similarity of focal amplifications. These scores consider both genomic interval overlap and shared breakpoint junctions, with breakpoints deemed to be shared if their total distance is less than a specified threshold (default = 250 bp). Moreover, AC computes similarity scores for amplicons from unrelated origins, establishing a background null distribution for comparison. The tool uses a β-distribution model to fit the empirical null distribution, providing an estimate of the statistical significance of the similarity score. Out of 8,810 AA amplicons in the ref. ¹⁹ TCGA dataset, 45 candidate focal amplifications were removed by this filter.

To predict the number of distinct ecDNA species present in a sample, we used the genome intervals reported by AC for each focal amplification. AC determines the number of distinct, genomically non-overlapping ecDNA species present by clustering ecDNA genome intervals if those regions are connected by structural variants or the boundaries of the regions are within 500 kb. If intervals do not meet these criteria, AC predicts them as being unconnected and reports them as separate ecDNA species. AC uses a list of oncogenes that combines genes in the ONGene database (https://pubmed.ncbi.nlm.nih.gov/28162959/) and COSMIC (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6450507/).
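The clustering rule described above can be sketched as a connected-components problem. The following Python example is an illustration under stated assumptions, not AC's code: intervals are hypothetical `(chrom, start, end)` tuples, structural-variant connections are passed in as pairs of interval indices (`sv_links`), and the 500 kb rule is interpreted as the gap between the nearest boundaries of two intervals on the same chromosome.

```python
from itertools import combinations

MAX_GAP = 500_000  # 500 kb, as stated in the text


def close(a, b, max_gap=MAX_GAP):
    """True if two (chrom, start, end) intervals are on the same chromosome
    and their nearest boundaries are within max_gap bp (overlap counts)."""
    if a[0] != b[0]:
        return False
    gap = max(a[1], b[1]) - min(a[2], b[2])  # negative when overlapping
    return gap <= max_gap


def count_species(intervals, sv_links=()):
    """Count clusters of intervals joined by proximity or by SV links.

    sv_links is a hypothetical list of (i, j) index pairs for intervals
    connected by a structural variant.
    """
    parent = list(range(len(intervals)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    for i, j in combinations(range(len(intervals)), 2):
        if close(intervals[i], intervals[j]):
            union(i, j)
    for i, j in sv_links:
        union(i, j)
    return len({find(i) for i in range(len(intervals))})
```

Two nearby intervals on chr8 plus one interval on chr10 would be counted as two species, or as one species if a structural variant links the chr10 interval to either chr8 interval.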

Paired scATAC-seq and scRNA-seq library generation

Single-cell paired RNA-seq and ATAC-seq libraries were generated on the 10x Chromium Single-Cell Multiome ATAC + Gene Expression platform according to the manufacturer’s protocol and sequenced on an Illumina NovaSeq 6000 system. Data for COLO 320DM were generated previously⁴ and published under Gene Expression Omnibus (GEO) accession GSE159986.

Paired scATAC-seq and scRNA-seq analysis

A custom reference package for hg19 was created using cellranger-arc mkref (10x Genomics, v.1.0.0). The single-cell paired RNA-seq and ATAC-seq reads were aligned to the hg19 reference genome using cellranger-arc count (10x Genomics, v.1.0.0).

Subsequent analyses of RNA data were performed using Seurat (v.3.2.3)⁶³, and those of ATAC-seq data were performed using ArchR (v.1.0.1)⁶⁴. Cells with more than 200 unique RNA features, less than 20% mitochondrial RNA reads and fewer than 50,000 total RNA reads were retained for further analyses. Doublets were removed using ArchR. Raw RNA counts were log-normalized using Seurat’s NormalizeData function and scaled using the ScaleData function. Dimensionality reduction for the ATAC-seq data was performed using Iterative Latent Semantic Indexing (LSI) with the addIterativeLSI function in ArchR.
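The per-cell quality filter above amounts to three threshold checks. The sketch below restates it in plain Python rather than Seurat; the function name and its arguments are hypothetical, and computing the mitochondrial percentage as mitochondrial reads over total RNA reads is an assumption.

```python
def passes_rna_qc(n_features, n_reads, n_mito_reads):
    """Return True if a cell meets the RNA thresholds given in the text.

    n_features:   number of unique RNA features detected in the cell
    n_reads:      total RNA reads for the cell
    n_mito_reads: RNA reads assigned to mitochondrial genes
    """
    if n_reads <= 0:
        return False
    mito_fraction = n_mito_reads / n_reads
    return (
        n_features > 200          # more than 200 unique RNA features
        and mito_fraction < 0.20  # less than 20% mitochondrial RNA reads
        and n_reads < 50_000      # fewer than 50,000 total RNA reads
    )
```

For example, a cell with 1,500 features, 8,000 reads and 400 mitochondrial reads (5%) is retained, whereas a cell with 60,000 reads is excluded.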

We next calculated amplicon copy numbers based on background ATAC-seq signals as we previously described and validated⁴,³². In brief, we determined read counts in large intervals across the genome using a sliding window of 3 Mb moving in 1 Mb increments across the reference genome. Genomic regions with known mapping artifacts were filtered out using the ENCODE hg19 blacklist. For each interval, insertions per bp were calculated and compared to 100 of its nearest neighbours with matched GC nucleotide content. The mean log₂[fold change] was computed for each interval. On the basis of a diploid genome, copy numbers were calculated using the formula $\mathrm{CN}=2\times 2^{\log_{2}[\mathrm{FC}]}$, where CN denotes copy number and FC denotes mean fold change compared with neighbouring intervals. To query the copy numbers of a gene, we obtained all genomic intervals that overlapped with the annotated gene sequence and computed the mean copy number of those intervals.
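The windowing and copy-number formula above can be made concrete with a short Python sketch. It is a simplified illustration, not the published analysis code: the function names are hypothetical, GC matching and blacklist filtering are assumed to have been done upstream, trailing partial windows are dropped by assumption, and all insertion rates are assumed to be positive (no pseudocount is applied).

```python
import math
from statistics import mean

WINDOW = 3_000_000  # 3 Mb window, as stated in the text
STEP = 1_000_000    # 1 Mb increments, as stated in the text


def sliding_windows(chrom_length, window=WINDOW, step=STEP):
    """Return (start, end) windows along one chromosome.

    Trailing partial windows are dropped, which is an assumption.
    """
    return [(s, s + window) for s in range(0, chrom_length - window + 1, step)]


def copy_number(insertions_per_bp, neighbour_insertions_per_bp, ploidy=2):
    """CN = 2 * 2^(mean log2 fold change) against GC-matched neighbours."""
    log2_fcs = [math.log2(insertions_per_bp / n)
                for n in neighbour_insertions_per_bp]
    return ploidy * 2 ** mean(log2_fcs)


def gene_copy_number(gene_start, gene_end, windows, window_cn):
    """Mean copy number of all windows overlapping the gene body."""
    overlapping = [cn for (s, e), cn in zip(windows, window_cn)
                   if s < gene_end and e > gene_start]
    return mean(overlapping)
```

As a worked case, a window with four times the insertion rate of its neighbours has a mean log₂ fold change of 2, giving CN = 2 × 2² = 8.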

For analyses presented in Extended Data Fig. 5a–c, we inferred cell cycle stage from each cell’s RNA-seq data using the CellCycleScoring function in Seurat and the gene sets for S and G2M phases included in the Seurat package. Copy-number correlations were then evaluated for cells grouped by their inferred cell cycle phase: G1, S or G2M.

scCircle-seq analysis

TR14 scCircle-seq data were previously generated⁶⁵ and deposited at the European Genome-Phenome Archive (EGA) under accession number EGAS00001007026. A detailed description of the single-cell extrachromosomal circular DNA and transcriptome sequencing (scEC&T-seq) protocol is available at Nature Protocol Exchange (https://doi.org/10.21203/rs.3.pex-2180/v1)⁶⁶. Single cells were sorted, genomic DNA and mRNA were separated by G&T-seq⁶⁷ and the genomic DNA of single cells was subjected to exonuclease digestion and rolling-circle amplification as described previously⁶⁵.

The processing of scCircle-seq reads has been described in detail previously⁶⁵. In brief, scCircle-seq sequencing reads were 3′ trimmed for quality and adapter sequences using Trim Galore (v.0.6.4)⁶⁸, and reads shorter than 20 nucleotides were removed. Reads were aligned to the human reference assembly hg19 using BWA MEM (v.0.7.15) with the default parameters⁶⁹. PCR and optical duplicates were removed using Picard (v.2.16.0). Sequencing coverage across mitochondrial DNA was used as an internal control to evaluate circular DNA enrichment. Cells that exhibited a sequence-read depth of less than 10 reads per bp over mitochondrial DNA, or in which less than 85% of mitochondrial DNA bases were captured, were excluded from further analyses⁶⁵.

Read counts from scCircle-seq BAM files were quantified in 1 kb bins across TR14 ecDNA regions (MYCN, CDK4, MDM2) as defined by ecDNA reconstruction analyses in TR14 bulk populations described previously⁴. To account for differences in sequencing depth among cells, read counts were normalized to library size.
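The binning and normalization described above can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, reads are represented by their start positions rather than parsed from a BAM file, and scaling to counts per million is an assumed convention that the text does not specify.

```python
BIN_SIZE = 1_000  # 1 kb bins, as stated in the text


def bin_read_counts(read_starts, region_start, region_end, bin_size=BIN_SIZE):
    """Count reads per bin across one ecDNA region.

    A read is assigned to the bin containing its start position, which
    is a simplification of BAM-based counting.
    """
    n_bins = -(-(region_end - region_start) // bin_size)  # ceiling division
    counts = [0] * n_bins
    for pos in read_starts:
        if region_start <= pos < region_end:
            counts[(pos - region_start) // bin_size] += 1
    return counts


def normalize_to_library_size(counts, library_size, scale=1_000_000):
    """Scale raw bin counts by the cell's total read count.

    The per-million scale factor is an assumption.
    """
    return [c * scale / library_size for c in counts]
```

Dividing by library size makes bins comparable between cells sequenced to different depths, so a cell with twice as many total reads does not appear to carry twice as much ecDNA.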

Analysis of copy-number correlations of amplified oncogenes in human tumour samples

Copy numbers computed for single cells using scATAC-seq as described above (see the ‘Paired scATAC-seq and scRNA-seq analysis’ section) were used to devise a statistical approach for predicting ecDNA. We reasoned that, due to the random segregation of individual ecDNA molecules, ecDNA focal amplifications would be characterized by not only elevated mean copy number but also inflated copy-number variance. Indeed, a decision rule requiring a mean copy number of ≥4 and a variance/mean ratio of ≥2.5 identified only known ecDNAs in validated cell lines (Extended Data Fig. 4a).
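The decision rule can be written out as a short Python sketch. The function name is hypothetical, and the use of the population variance (rather than the sample variance) is an assumption, because the text does not state which estimator was used.

```python
from statistics import mean, pvariance


def is_putative_ecdna(copy_numbers, min_mean=4.0, min_vmr=2.5):
    """Apply the mean and variance/mean thresholds to one amplified locus.

    copy_numbers: per-cell copy-number estimates for the locus.
    """
    m = mean(copy_numbers)
    if m <= 0:
        return False
    vmr = pvariance(copy_numbers) / m  # variance/mean ratio
    return m >= min_mean and vmr >= min_vmr
```

A locus with uniformly high copy number across cells (for example, 8 copies in every cell) has a variance/mean ratio of 0 and is not called, which matches the intuition that stably inherited chromosomal amplifications show little cell-to-cell variation.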

We applied this statistical approach to a curated dataset of 41 tumours (from triple-negative breast cancer (TNBC), high-grade serous ovarian cancer (HGSC) and glioblastoma) with publicly available scATAC-seq or scDNA-seq data³⁴,³⁵,³⁶. For TNBC and HGSC tumours profiled with scDNA-seq in ref. ³⁵, we used the author-provided single-cell copy numbers available on Zenodo (https://doi.org/10.5281/zenodo.6998936). Processed scATAC-seq data for glioblastoma samples were obtained from ref. ³⁴ and ref. ³⁶ (GEO accession number GSE163655), and copy numbers were computed as described above (see the ‘Paired scATAC-seq and scRNA-seq analysis’ section) in 3 Mb genomic windows. Putative ecDNAs were predicted using the decision rule determined from validated cell lines, and copy numbers were determined for oncogenes by averaging copy numbers of windows overlapping with the oncogene of interest. Copy-number correlations were computed across oncogenes, only considering cells where the oncogene was amplified with a copy number of ≥4.
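The correlation step can be sketched as follows. Two choices here are assumptions rather than details given in the text: the correlation is computed as a Pearson coefficient, and a cell is included for a pair of oncogenes only if both have a copy number of ≥4 in that cell. The function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation; returns None if either input has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)


def amplified_correlation(cn_a, cn_b, min_cn=4.0):
    """Correlate two oncogenes across cells in which both are amplified.

    cn_a, cn_b: per-cell copy numbers, in the same cell order.
    Returns None when fewer than three cells pass the filter.
    """
    pairs = [(a, b) for a, b in zip(cn_a, cn_b)
             if a >= min_cn and b >= min_cn]
    if len(pairs) < 3:
        return None
    xs, ys = zip(*pairs)
    return pearson(xs, ys)
```

Restricting to amplified cells avoids a trivial positive correlation driven by cells that carry neither amplification.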

ecDNA isolation by CRISPR–CATCH

Molecular isolation of ecDNA by CRISPR–CATCH was performed as previously described¹⁸. In brief, molten 1% certified low-melting-point agarose (Bio-Rad, 1613112) in PBS was equilibrated to 45 °C. In total, 1 million cells were pelleted per condition, washed twice with cold 1× PBS, resuspended in 30 µl PBS and briefly heated to 37 °C. Then, 30 µl agarose solution was added to cells, mixed, transferred to a plug mould (Bio-Rad, 1703713) and incubated on ice for 10 min. Solid agarose plugs containing cells were ejected into 1.5 ml Eppendorf tubes, suspended in buffer SDE (1% SDS, 25 mM EDTA at pH 8.0) and placed onto a shaker for 10 min. The buffer was removed and buffer ES (1% N-lauroylsarcosine sodium salt solution, 25 mM EDTA at pH 8.0, 50 µg ml⁻¹ proteinase K) was added. Agarose plugs were incubated in buffer ES at 50 °C overnight. The next day, proteinase K was inactivated with 25 mM EDTA with 1 mM PMSF for 1 h at room temperature with shaking. Plugs were then treated with RNase A (1 mg ml⁻¹) in 25 mM EDTA for 30 min at 37 °C and washed with 25 mM EDTA with a 5 min incubation. Plugs not directly used for ecDNA enrichment were stored in 25 mM EDTA at 4 °C.

To perform in vitro Cas9 digestion, agarose plugs containing DNA were washed three times with 1× NEBuffer 3.1 (New England BioLabs) with 5 min incubations. Next, DNA was digested in a reaction with 30 nM sgRNA (Synthego) and 30 nM spCas9 (New England BioLabs, M0386S) after pre-incubation of the reaction mix at room temperature for 10 min. Cas9 digestion was performed at 37 °C for 4 h, followed by overnight digestion with 3 µl proteinase K (20 mg ml⁻¹) in a 200 µl reaction. The next day, proteinase K was inactivated with 1 mM PMSF for 1 h with shaking. The plugs were then washed with 0.5× TAE buffer three times with 5 min incubations. The plugs were loaded into a 1% certified low-melting-point agarose gel (Bio-Rad, 1613112) in 0.5× TAE buffer with ladders (CHEF DNA Size Marker, 0.2–2.2 Mb; Saccharomyces cerevisiae ladder, Bio-Rad, 1703605; CHEF DNA size marker, 1–3.1 Mb; Hansenula wingei ladder, Bio-Rad, 1703667) and pulsed-field gel electrophoresis was performed using the CHEF Mapper XA System (Bio-Rad) according to the manufacturer’s instructions and using the following settings: 0.5× TAE running buffer, 14 °C, two-state mode, run time duration of 16 h 39 min, initial switch time of 20.16 s, final switch time of 2 min 55.12 s, gradient of 6 V cm⁻¹, included angle of 120° and linear ramping. The gel was stained with 3× GelRed (Biotium) with 0.1 M NaCl on a rocker for 30 min covered from light and imaged. The bands were then extracted and DNA was isolated from agarose blocks using beta-Agarase I (New England BioLabs, M0392L) according to the manufacturer’s instructions. All guide sequences are provided in Supplementary Table 1.

Short-read sequencing of ecDNA isolated by CRISPR–CATCH

Sequencing of ecDNA isolated by CRISPR–CATCH was performed as previously described¹⁸. In brief, we transposed DNA with Tn5 transposase produced as previously described⁵⁸ in a 50 µl reaction with TD buffer⁵⁹, 10 ng DNA and 1 µl transposase. The reaction was performed at 50 °C for 5 min, and transposed DNA was purified using the MinElute PCR Purification Kit (Qiagen, 28006). The libraries were generated by 7–9 rounds of PCR amplification using NEBNext High-Fidelity 2× PCR Master Mix (NEB, M0541L), purified using the SPRIselect reagent kit (Beckman Coulter, B23317) with double size selection (0.8× right, 1.2× left) and sequenced on the Illumina NextSeq 550 or the Illumina NovaSeq 6000 platform. Sequencing data were processed as described above for WGS. CRISPR–CATCH sequencing data for SNU16m1 (bands 30–34) and COLO 320DM (bands a–m) used in Extended Data Fig. 3 were generated previously⁴ and deposited at the NCBI Sequence Read Archive (SRA) under BioProject accession PRJNA670737; CRISPR–CATCH sequencing data for SNU16 (MYC, FGFR2 and enhancer ecDNAs) used in Fig. 4 were generated previously¹⁸ and deposited at the NCBI SRA under BioProject accession PRJNA777710.

Metaphase DNA-FISH

TR14 neuroblastoma cells were grown to 70% confluency in a 15 cm dish and treated with KaryoMAX Colcemid (Gibco) for 4 h. A mitotic shake-off was performed and the medium of the cells was collected. The remaining cells were washed with PBS and treated with trypsin-EDTA 0.05% (Gibco) for 2 min. The cells were washed again with the collected medium and centrifuged at 300g for 10 min. The pellet was resuspended in 0.075 M KCl and left at 37 °C for 20 min. The sample was centrifuged at 300g for 5 min. The cell pellet was resuspended carefully in 10 ml Carnoy’s solution and centrifuged at 300g for 5 min. This wash step was repeated four times using 5 ml of Carnoy’s solution. The remaining pellet was resuspended in 400 µl of Carnoy’s solution. Then, 12 µl of the suspension was dropped on preheated slides from a height of approximately 15 cm. The slides were held over a heated water bath (55 °C) for 1 min. The slides were aged overnight at room temperature. Slides were prepared for staining according to the probe manufacturer’s protocol (DNA-FISH metaphase chromosome spreads, Arbor Biosciences). Before staining, the slides were first washed in PBS, followed by a wash in 65 °C SSCT (5 ml 20× SSC, 500 µl 10% Tween-20, and brought up to 50 ml with molecular-grade H₂O) for 15 min. The slides were next washed twice for 2 min with room temperature SSCT. Dehydration of the slides was performed in 70% and 90% ethanol for 5 min each. After air-drying, the slides were transferred into 0.07 N NaOH for 3 min for chemical denaturation. After two washes for 5 min in SSCT, the dehydration step was repeated, and the slides were air-dried. The probes used for staining were designed to target the MYCN, MDM2 and CDK4 genes using myTags (Arbor), conjugated as follows: CDK4-Alexa 488, MYCN-Atto 550, MDM2-Atto 633. Then, 10 µl of the hybridization buffer (in SSCT: 50% formamide, 10% dextran sulphate, 40 ng µl⁻¹ RNase A) was mixed with 1.5 µl of each resuspended probe. This mixture was heated to 70 °C for 5 min and stored on ice. Then, 14.5 µl of this mixture was added to the slide, which was covered by a cover glass and sealed with rubber cement. The slides were incubated in a hybridization chamber (Abbott Molecular) overnight at 37 °C. The next day, the rubber cement and cover glass were removed, and the sample was washed in prewarmed (37 °C) SSCT for 30 min. The slides were then washed at room temperature with 2× SSCT for 5 min each followed by a 5 min wash with PBS. The air-dried slide was stained with Hoechst (1:4,000 for 2 min) and washed with PBS for another 5 min. After drying, the slides were mounted using ProLong Glass Antifade Mountant (Thermo Fisher Scientific) and sealed with a coverglass. Imaging of TR14 metaphase spreads was done on the Leica Stellaris 8 system (Advanced Light Microscopy Facility, Max-Delbrück Center for Molecular Medicine) using a ×63 oil objective with a ×2 zoom. Excitation was done using the 405 nm, 488 nm, 561 nm and 538 nm lasers and detection was done using two HyD S and one HyD X and HyD R detectors. 4× line averaging was applied to each channel.

For the GBM39-KT, GBM39-KT-D10, SNU16, SNU16m1, CA718 and H716 cell lines, cells were treated with KaryoMAX Colcemid (Gibco) at 100 ng ml⁻¹ for 3 h, and single-cell suspensions were then collected by centrifugation and washed once in 1× PBS. The cells were treated with 0.75 M KCl hypotonic buffer for 20 min at 37 °C, and fixed with Carnoy’s fixative (3:1 methanol:glacial acetic acid) followed by three additional washes with the same fixative. The samples were then dropped onto humidified glass slides and air-dried. The glass slides were then briefly equilibrated in 2× SSC buffer and dehydrated in ascending ethanol concentrations of 70%, 85% and 100% for 2 min each. FISH probes (Empire Genomics) were diluted in hybridization buffer at a 1:6 ratio, applied to the samples and covered with a coverslip. The samples were denatured at 75 °C for 3 min and hybridized at 37 °C overnight in a humidified slide moat. The samples were washed with 0.4× SSC for 2 min, and 2× SSC 0.1% Tween-20 for another 2 min. The nuclei were stained with 4′,6-diamidino-2-phenylindole (DAPI) (50 ng ml⁻¹) diluted in 2× SSC for about a minute, and washed once briefly in double-distilled H₂O. Air-dried samples were mounted with ProLong Diamond. Images were acquired on a Leica DMi8 widefield microscope using a ×63 oil objective.

Metaphase DNA-FISH image analysis

Colocalization analysis for two- and three-colour metaphase FISH described in Fig. 1 and Extended Data Fig. 2 was performed using Fiji (v.2.1.0/1.53c)⁷⁰. Images were split into the individual FISH colours + DAPI channels, and the signal threshold was set manually to remove background fluorescence. Overlapping FISH signals were segmented using watershed segmentation. FISH signals were counted using particle analysis. xy coordinates of pixels containing FISH signals were saved along with image dimensions and coordinates of regions of interest (ROIs) as distinct particle identities (for example, distinct ecDNA molecules). Colocalization was then quantified in R. Each pixel containing FISH signal was assigned to the nearest overlapping ROI using xy coordinates. Unique ROIs in all colour channels were summarized such that ROIs in different channels that overlap with one another by one pixel or more in the same image were considered as colocalized.

Colocalization analysis for two-colour metaphase FISH data for ecDNAs in SNU16m1 cells described in Extended Data Fig. 10 was performed using Fiji (v.2.1.0/1.53c)⁷⁰. Images were split into the two FISH colours + DAPI channels, and the signal threshold was set manually to remove background fluorescence. Overlapping FISH signals were segmented using watershed segmentation. Colocalization was quantified using the ImageJ-Colocalization Threshold program, and individual and colocalized FISH signals were counted using particle analysis.

Immunofluorescence staining and DNA-FISH in mitotic cells

For assessing mitotic segregation of ecDNA in GBM39-KT, GBM39-KT-D10, TR14, SNU16m1, CA718 and H716 cells shown in Fig. 2 and Extended Data Figs. 5 and 7, asynchronous cells were grown on coverslips coated with either poly-l-lysine or poly-d-lysine (laminin for GBM39-KT and GBM39-KT-D10). Cells were washed once with PBS and fixed with cold 4% paraformaldehyde (PFA) at room temperature for 10–15 min. The samples were permeabilized with 0.5% Triton X-100 in PBS for 10 min at room temperature and then washed with PBS. The samples were then blocked with 3% BSA in PBS 0.05% Triton X-100 for 30 min at room temperature. The samples were incubated in primary antibody (Aurora kinase B polyclonal antibody, 1:200 dilution, A300-431A, Thermo Fisher Scientific; BRD4 antibody, 1:200, ab245285, Abcam; RNA polymerase II CTD repeat YSPTSPS (phosphorylated Ser2) antibody (3E10), ab252855, Abcam; CIP2A antibody, 1:400 dilution, NBP2-48710, Novus Biologicals; all diluted in 3% BSA) for either 1 h at room temperature or overnight at 4 °C. The samples were washed three times in PBS 0.05% Triton X-100. The samples were incubated in fluorophore-conjugated secondary antibody (1:500 in 3% BSA) for 1 h at room temperature (with all of the subsequent steps in the dark) and then washed three times in PBS 0.05% Triton X-100. Cells were washed once with PBS and refixed with cold 4% PFA for 20 min at room temperature. The coverslips were then washed once in 1× PBS, and incubated with freshly prepared 0.7% Triton X-100 in 1× PBS with 0.1 M HCl for 10 min on ice, followed by acid denaturation of DNA strands with 1.9 M HCl for 30 min at room temperature. They were then dehydrated in ascending ethanol concentrations of 70%, 85% and 100% for approximately 2 min each. FISH probes (Empire Genomics) were diluted 1:4 in hybridization buffer (Empire Genomics) and added to the sample with the addition of a slide. The samples were denatured at 75 °C for 3 min and then hybridized at 37 °C overnight in a humid and dark chamber. The samples were then washed with 0.4× SSC and then with 2× SSC 0.1% Tween-20 (all washes lasting approximately 2 min). DAPI (100 ng ml⁻¹) was applied to samples for 10 min. The samples were then washed again with 2× SSC 0.1% Tween-20 and then with 2× SSC. The samples were briefly washed in double-distilled H₂O and mounted with ProLong Gold. The slides were sealed with nail polish. The samples were imaged either on a DeltaVision Elite Cell Imaging System (Applied Precision), on an Olympus widefield microscope (IX-71; Olympus) controlled by the SoftWoRx software v.6.5.2 (Applied Precision) and a ×60 objective lens with a CoolSNAP HQ2 camera (Photometrics), or on a Leica DMi8 widefield microscope using a ×63 oil objective lens. z stacks were acquired and used to generate maximum-intensity projections (ImageJ or LAS X) for downstream analysis. Images acquired on the Leica DMi8 were subjected to deconvolution using either small-volume computational clearing or large-volume computational clearing before making maximum-intensity projections.

For assessing mitotic segregation of oncogene and enhancer ecDNAs in SNU16 cells as shown in Fig. 4, cells were seeded onto fibronectin-coated 22 × 22 coverslips contained in a six-well culture plate at about 70% confluence. Then, 24 h after cell seeding, the cells were fixed with 4% PFA and permeabilized with 1× PBS containing 0.25% Triton X-100. The samples were blocked with 3% BSA-1× PBS for 1 h at room temperature, followed by primary antibody incubation (Aurora B kinase antibody; A300-431A; Thermo Fisher Scientific) (1:200 in 3% BSA) overnight at 4 °C. The sample was washed three times in 1× PBS followed by incubation with a diluted anti-rabbit Alexa Fluor 647 antibody (donkey anti-rabbit IgG (H+L) highly cross-adsorbed secondary antibody, Alexa Fluor 647, A31573, Invitrogen; 1:500 dilution in 3% BSA) for 1 h at room temperature. The sample was then washed three times in 1× PBS and fixed with 4% PFA for 20 min at room temperature. DNA-FISH was performed as described in the ‘Metaphase DNA-FISH’ section, with the heat denaturation conditions changed to 80 °C for 20 min. Images were acquired on a Leica DMi8 widefield microscope using a ×63 oil objective, and each z plane was post-processed by small-volume computational clearing on LAS X before generating maximum-projection images.

Mitotic cell imaging analysis

To quantify fractions of ecDNAs segregated to each daughter cell in pairs of dividing cells as shown in Fig. 2 and Extended Data Figs. 5 and 7, ecDNA pixel intensity was quantified from maximum-intensity projections using ImageJ. ecDNA pixel intensity was measured using the ‘Integrated Density’ measurement from ImageJ. Before quantification, the background signal from FISH probes was removed uniformly for the entire image until all background signal from the daughter cell nuclei was removed. We further filtered out images with poor quality, those with overlapping nuclei that did not allow for accurate segmentation and those showing cells with unclear daughter cell pairings based on Aurora kinase B staining. To measure the fractions of ecDNAs segregated to daughter cells after inhibitor treatments, segmentation of daughter cells and measurement of DNA-FISH abundance were performed on maximum-intensity projections using AIVIA Software (Leica Microsystems). Individual machine-learning-based pixel classifiers were trained on the channels corresponding to the FISH probes of interest and DAPI to create confidence masks for FISH signal and nuclei, respectively. The confidence masks were used to create a recipe to segment individual FISH puncta and assign each punctum to a segmented daughter cell. The fractional inheritance of each ecDNA species was estimated by comparing the FISH area in the daughter cells of each corresponding pair. The abundances of proteins of interest (RNA Pol II pSer2, CIP2A and BRD4) were quantified using AIVIA software by measuring the pixel intensity values in the segmented nuclei.

To quantify the fractions of oncogene and enhancer ecDNAs segregated to daughter cells as shown in Fig. 4, the images were split into the different FISH colours + DAPI channels, and the signal threshold was set manually to remove background fluorescence using Fiji (v.2.1.0/1.53c)⁷⁰. Overlapping FISH signals were segmented using watershed segmentation. All FISH colour channels except for DAPI were stacked and ROIs were drawn manually to identify the two daughter cells, after which the colour channels were split again and image pixel areas occupied by FISH signals were analysed using particle analysis. Fractions of ecDNAs in each daughter cell were estimated by fractions of FISH pixels in the given daughter cell.

Intron RNA-FISH

Intron RNA-FISH was performed using the Stellaris RNA FISH system (LGC Biosearch Technologies), following the manufacturer’s protocol for adherent cells. The intron RNA-FISH probe was designed against the MYC intron 2 sequence (hg38) using the Stellaris Probe Designer tool (maximum number of probes = 48, oligo length = 20, minimum spacing length = 2). The final probe design for MYC intron 2 consisted of 31 probes and was tagged with the Quasar 570 fluorophore. Images were acquired on the Leica DMi8 system using a ×63 oil objective to obtain z stack images, which underwent small-volume computational clearing before making maximum-intensity projections. For the RNase-A-treated negative control, cells were first fixed in 3.7% PFA, followed by digestion with RNase A (Thermo Fisher Scientific, EN0531) diluted to a final concentration of 200 µg ml⁻¹ with 1× RNase-free PBS for 30 min at 37 °C. RNase A was washed off once with 1× RNase-free PBS before 70% ethanol permeabilization. Intron RNA-FISH staining was then continued as described in the manufacturer’s protocol for adherent cells.

Live-cell imaging

The live-cell imaging cell line was engineered from COLO 320DM cells obtained from ATCC. In brief, the engineering involved the following key steps: (1) CRISPR-mediated knock-in of a 96× TetO array into intergenic sites next to MYC, followed by puromycin selection for TetO-positive cells; (2) lentiviral infection of TetR-mNeonGreen, followed by sorting of mNeonGreen-positive cells using flow cytometry to enable labelling of the TetO-inserted MYC locus; (3) monoclonal expansion of cells and evaluation by microscopy to select for clones that form distinct mNeonGreen puncta with a good signal-to-noise ratio; (4) lentiviral infection of H2B-emiRFP670 to fluorescently label histone H2B protein, followed by sorting of emiRFP670 and mNeonGreen double-positive cells using flow cytometry. The final monoclonal cells were analysed using metaphase DNA-FISH to confirm good TetO labelling efficiency and that amplicons remained as ecDNA structures.

Cells were seeded onto poly-d-lysine coated 96-well glass-bottom plates 2 days before imaging. On the day of imaging, the medium was switched to FluoroBrite DMEM (Gibco, A1896701) supplemented with 10% FBS and 1× GlutaMax. Prolong live antifade reagent (Invitrogen, P36975) was used at 1:200 dilution to suppress photobleaching. Cells were imaged on a top stage incubator (Okolab) fitted onto a Leica DMi8 widefield microscope with a ×63 oil objective, with temperature (37 °C), humidity and CO₂ (5%) controlled throughout the imaging experiment.

Simulations of ecDNA segregation in pairs of daughter cells

To understand how co-segregation dynamics of ecDNAs in dividing cells may affect copy-number correlations in daughter cells, we simulated distributions of ecDNA copies among two daughter cells by random sampling using the sample function in R, for which the sample size is the total copy number of an ecDNA species multiplied by two (as a result of DNA replication). For a given fraction of one ecDNA species that co-segregates with the same fraction of another ecDNA species, the corresponding ecDNA copies were randomly distributed among two daughter cells but at the same ratio for both ecDNA species.

To compare observed ecDNA segregation with these simulations given a non-zero frequency of covalent fusions between two ecDNAs such as the low-level fusion events between different oncogene ecDNA species in various cell lines shown in Extended Data Fig. 2 or those between the enhancer and oncogene sequences shown in Fig. 4, the fraction of fused ecDNAs was treated as co-segregating ecDNAs in the simulations. To generate the expected distributions of enhancer and oncogene ecDNAs among daughter cells in Fig. 4, for each mitotic immunofluorescence and FISH image collected, the fractions of enhancer ecDNAs, oncogene ecDNAs and fused enhancer-oncogene ecDNAs were used to simulate 20 segregation events in which a fraction of ecDNAs corresponding to the fused molecules were perfectly co-segregated. The resulting copy-number correlations in simulated daughter cells represent the null distribution of ecDNAs explained by covalent fusion alone with no additional co-segregation between distinct ecDNA molecules.
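The sampling scheme described in this section can be sketched in a few lines. This is a minimal Python illustration, not the R code used in the study: the function names, the rounding of the co-segregating fraction to whole copies and the way the shared daughter ratio is drawn from the first species are all assumptions.

```python
import random


def binomial_half(n, rng):
    """Number of heads in n fair coin flips (binomial(n, 0.5))."""
    return bin(rng.getrandbits(n)).count("1") if n > 0 else 0


def segregate_pair(n_a, n_b, coseg_fraction, rng):
    """Distribute replicated copies of two ecDNA species to two daughters.

    n_a, n_b:        copy numbers before replication (hypothetical inputs).
    coseg_fraction:  fraction of each species treated as co-segregating,
                     for example the fraction of fused molecules.
    Returns ((a1, b1), (a2, b2)) for daughter 1 and daughter 2.
    """
    total_a, total_b = 2 * n_a, 2 * n_b          # copies after DNA replication
    linked_a = round(coseg_fraction * total_a)   # co-segregating copies (assumed rounding)
    linked_b = round(coseg_fraction * total_b)

    # Co-segregating copies: draw the split for species A, then apply the
    # same daughter-1 ratio to the co-segregating copies of species B.
    linked_a_d1 = binomial_half(linked_a, rng)
    ratio = linked_a_d1 / linked_a if linked_a else 0.5
    linked_b_d1 = round(ratio * linked_b)

    # Remaining copies of each species are distributed independently at random.
    a1 = linked_a_d1 + binomial_half(total_a - linked_a, rng)
    b1 = linked_b_d1 + binomial_half(total_b - linked_b, rng)
    return (a1, b1), (total_a - a1, total_b - b1)


rng = random.Random(7)
daughter_1, daughter_2 = segregate_pair(10, 30, 0.4, rng)
```

Repeating the call many times for a given `coseg_fraction` gives a simulated distribution of daughter copy-number pairs that can be compared with the observed correlation.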

ATAC-seq

ATAC-seq data for SNU16 were previously published under GEO accession GSE159986 (ref. ⁴). Adapter-trimmed reads were aligned to the hg19 genome using Bowtie2 (v.2.1.0). Aligned reads were filtered for quality using samtools (v.1.9)⁷¹, duplicate fragments were removed using Picard’s MarkDuplicates (v.2.25.3) and peaks were called using MACS2 (v.2.2.7.1)⁷² with a q-value cut-off of 0.01 and with a no-shift model.

ChIP–seq

ChIP–seq data for SNU16 were previously published under GEO accession GSE159986 (ref. ⁴). Paired-end reads were aligned to the hg19 genome using Bowtie2 (ref. ⁷³) (v.2.3.4.1) with the --very-sensitive option after adapter trimming with Trimmomatic⁶⁰ (v.0.39). Reads with MAPQ values of less than 10 were filtered using samtools (v.1.9) and PCR duplicates were removed using Picard’s MarkDuplicates (v.2.20.3-SNAPSHOT). The ChIP–seq signal was converted to bigwig format for visualization using deepTools bamCoverage⁷⁴ (v.3.3.1) with the following parameters: -bs 5 --smoothLength 105 --normalizeUsing CPM --scaleFactor 10.

Evolutionary modelling of ecDNA copy-number framework

ecDNA copy number was simulated over growing cell populations using a forward-time simulation implemented in Cassiopeia⁷⁵ (https://github.com/YosefLab/Cassiopeia). All simulations performed in this study were of two distinct ecDNA species in a growing cell population. Simulations were parameterized with (1) initial ecDNA copy numbers (the initial copy number for ecDNA species j is denoted as ${k}_{{\rm{init}}}^{j}$); (2) selection coefficients for cells carrying no ecDNA (s_−,−), both ecDNAs (s_+,+), or either ecDNA (s_−,+ or s_+,−; in this study, selection coefficients are treated as constant functions of the types of ecDNA species present in a cell); (3) a base birth rate (λ_base = 0.5); and (4) a co-segregation coefficient (γ). Optionally, a death rate (μ) can also be specified.

Starting with the parent cell, a birth rate is defined based on the selection coefficient acting on the cell, $s\in \{{s}_{-,-},\,{s}_{-,+},\,{s}_{+,-},\,{s}_{+,+}\}$, as λ₁ = λ_base × (1 + s). Then, a waiting time to a cell division event is drawn from an exponential distribution: t_b ~ exp(−λ₁). When a death rate is also specified, a time to a death event is also drawn from an exponential distribution: t_d ~ exp(−μ). If t_b < t_d, a cell division event is simulated and a new edge is added to the growing phylogeny with edge length t_b; otherwise, the cell dies and the lineage is stopped. This process continues until a user-defined stopping condition is met: either a target cell number (for example, 1 million) or a target time limit.
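The competing birth and death clocks for a single cell can be sketched as below. This is an illustrative Python sketch, not the Cassiopeia implementation: the function name `next_event` is hypothetical, the exponential distributions are parameterized by their rates (λ₁ and μ), and returning the death time for a dying lineage is an assumption, because the text only specifies the edge length for divisions.

```python
import math
import random


def next_event(selection_coefficient, base_birth_rate=0.5, death_rate=None, rng=random):
    """Draw the next event for one cell.

    Returns ("division", t_b) if the division clock fires first,
    otherwise ("death", t_d).
    """
    # Birth rate scaled by the selection coefficient acting on this cell.
    birth_rate = base_birth_rate * (1.0 + selection_coefficient)
    # A non-positive rate means the cell never divides (assumed convention).
    t_birth = rng.expovariate(birth_rate) if birth_rate > 0 else math.inf

    if death_rate is None or death_rate <= 0:
        return "division", t_birth

    t_death = rng.expovariate(death_rate)
    if t_birth < t_death:
        return "division", t_birth
    return "death", t_death


rng = random.Random(0)
event, waiting_time = next_event(0.2, rng=rng)  # no death rate: always a division
```

With these rates, a cell with s = 0.2 has a mean waiting time to division of 1 / (0.5 × 1.2) ≈ 1.67 time units.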

During a cell division, ecDNAs are split among daughter cells (d₁ and d₂) according to the co-segregation coefficient, γ, and the ecDNA copy numbers of the parent cell p. In this study, this co-segregation is simulated using two different strategies to determine the effects of co-segregation (see the ‘Cell-level co-segregation model of ecDNA co-evolution’ section below). In the following description, let ${n}_{j}^{(i)}$ indicate the copy number of ecDNA species j in daughter cell i and let N_j indicate the copy number of ecDNA species j in the parent cell.

Copies of ecDNA species 1 are randomly distributed between the daughter cells:

$${n}_{1}^{(1)} \sim {\rm{binomial}}(2{N}_{1},0.5)$$

$${n}_{1}^{(2)}=2{N}_{1}-{n}_{1}^{(1)}$$

where binomial is the binomial probability distribution. To simulate co-segregation, copies of the second ecDNA species are distributed to the daughter cells in proportion to the co-segregation coefficient γ and the copy number of the first ecDNA species in each daughter cell:

$${n}_{2}^{\left(1\right),\gamma }=\gamma \times 2{N}_{2}\times \frac{{n}_{1}^{\left(1\right)}}{2{N}_{1}}$$

$${n}_{2}^{\left(2\right),\gamma }=\gamma \times 2{N}_{2}\times \frac{{n}_{1}^{(2)}}{2{N}_{1}}$$

Then, the remainder of copies left over that were not passed with co-segregation are randomly distributed between daughter cells:

$${n}_{2}^{\left(1\right),r} \sim {\rm{binomial}}(2{N}_{2}-{n}_{2}^{\left(1\right),\gamma }-{n}_{2}^{(2),\gamma },0.5)$$

$${n}_{2}^{\left(2\right),r}=2{N}_{2}-{n}_{2}^{\left(1\right),r}-{n}_{2}^{\left(1\right),\gamma }-{n}_{2}^{\left(2\right),\gamma }$$

After this simulation, the output is a phylogeny T over a set of leaves L, with ecDNA copy numbers ${k}_{l}^{j}$ for ecDNA species j in leaf l.
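The division step defined by the equations above can be written as a short Python sketch. It is not the Cassiopeia code: the function names are hypothetical, the co-segregated counts are rounded down to whole copies (the equations give real-valued quantities) and the final count of species 2 in each daughter is taken to be the sum of its co-segregated and randomly assigned portions, which the equations imply but do not state.

```python
import random


def binomial_half(n, rng):
    """Number of heads in n fair coin flips (binomial(n, 0.5))."""
    return bin(rng.getrandbits(n)).count("1") if n > 0 else 0


def split_ecdna(n1_parent, n2_parent, gamma, rng):
    """Split replicated ecDNA copies of two species between two daughters.

    Returns ((n1_d1, n2_d1), (n1_d2, n2_d2)).
    """
    total1, total2 = 2 * n1_parent, 2 * n2_parent

    # Species 1: random binomial split of the 2 * N_1 replicated copies.
    n1_d1 = binomial_half(total1, rng)
    n1_d2 = total1 - n1_d1

    # Species 2, co-segregating part: proportional to gamma and to the
    # share of species 1 that each daughter received.
    if total1 > 0:
        coseg_d1 = int(gamma * total2 * n1_d1 / total1)  # rounded down (assumption)
        coseg_d2 = int(gamma * total2 * n1_d2 / total1)
    else:
        coseg_d1 = coseg_d2 = 0  # nothing to co-segregate with (assumption)

    # Species 2, remainder: random binomial split of the leftover copies.
    remainder = total2 - coseg_d1 - coseg_d2
    rand_d1 = binomial_half(remainder, rng)
    rand_d2 = remainder - rand_d1

    return (n1_d1, coseg_d1 + rand_d1), (n1_d2, coseg_d2 + rand_d2)


rng = random.Random(11)
d1, d2 = split_ecdna(5, 8, 0.5, rng)
```

With γ = 0 the two species segregate independently, and with γ = 1 and equal parental copy numbers each daughter receives identical counts of both species.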

Evolutionary modelling of ecDNA co-assortment trends

To simulate the trends of ecDNA copy-number dynamics, we used the evolutionary modelling framework described previously (see the ‘Evolutionary modelling of ecDNA copy-number framework’ section). We used the following fixed parameters: selection acting on individual ecDNA (s_−,+, s_+,−) of 0.2, selection acting on cells without ecDNA (s_−,−) of 0.0, a base birth rate (λ_base) of 0.5, and initial ecDNA copy numbers for both species (${k}_{{\rm{init}}}^{1}={k}_{{\rm{init}}}^{2}$) of 5 in the parental cell. We varied co-selection (s_+,+) and co-segregation (γ) between 0 and 1.0 and reported the fraction of cells carrying a copy number of both ecDNAs above a threshold m (by default 1) and the Pearson correlation between ecDNA copy numbers in cells:

$$C=\frac{1}{|L|}\sum _{l\in L}I({k}_{l}^{1} > m,{k}_{l}^{2} > m)$$

$$\rho ={\rm{Pearson}}({{\bf{k}}}_{L}^{1},{{\bf{k}}}_{L}^{2})$$

where ${k}_{l}^{i}$ is the copy number of ecDNA species i in leaf l and ${{\bf{k}}}_{L}^{i}$ is the vector of copy numbers of ecDNA species i across all cells.
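Both summary statistics are simple to compute from per-cell copy numbers. The following Python sketch uses only the standard library; the function names and the toy copy-number lists are illustrative and not taken from the study.

```python
import math


def co_occurrence(k1, k2, m=1):
    """Fraction of cells whose copy numbers of both species exceed m (statistic C)."""
    if len(k1) != len(k2) or not k1:
        raise ValueError("k1 and k2 must be non-empty and of equal length")
    both = sum(1 for a, b in zip(k1, k2) if a > m and b > m)
    return both / len(k1)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Four toy cells: only the second and third carry more than m = 1 copy of both species.
species_1 = [0, 2, 5, 3]
species_2 = [4, 2, 6, 0]
c_stat = co_occurrence(species_1, species_2, m=1)
rho = pearson(species_1, species_2)
```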

For the results presented in Fig. 3b–e and Extended Data Fig. 8b, we simulated populations of 1 million cells and reported the average co-occurrence and correlation across 10 replicates.

Inference of evolutionary parameters

Approximate Bayesian computation (ABC) was used to determine evolutionary parameters in cell line data, specifically selection acting on individual ecDNAs (s_−,+ and s_+,−, assumed to be equal between ecDNAs), the level of co-selection (s_+,+) and the co-segregation coefficient (γ). In brief, ABC takes a parameter set $\theta $ from a prior or proposal distribution and simulates a dataset ${y}_{0}$ from this parameter set. If the simulated dataset matches the observed dataset within a specified error tolerance ${\epsilon }$, then we accept the parameter set and update our posterior distribution $\pi (\theta |{y}_{0})$. In our case, we defined the priors over each parameter as follows:

$$\pi ({s}_{-,+}),\,\pi ({s}_{+,-}) \sim {Unif}(0,1)$$

$$\pi \left({s}_{+,+}\right) \sim {Unif}(0,2)$$

$$\pi \left(\gamma \right) \sim {Unif}(0,1)$$

We used the evolutionary model presented above (see the ‘Evolutionary modelling of ecDNA copy-number framework’ section) to simulate datasets ${y}_{0}$ from the proposed parameter set θ, with no death rate, a base birth rate λ_base = 0.5 and selection acting on cells without ecDNA s_−,− = 0.

Here our goal is to infer a posterior distribution over each evolutionary parameter given single-cell copy numbers observed from scATAC-seq data in a target cell line, denoted as y_obs (see the ‘Paired scATAC-seq and scRNA-seq analysis’ section above). To accomplish this, we chose to derive summary statistics describing the co-occurrence (proportion of cells carrying more than 2 copies of each gene amplified as ecDNA) and the Pearson correlation between the log-transformed copy numbers of ecDNAs for guiding our inference, denoted by C_obs and ρ_obs, respectively. In each round of ABC, we simulated a dataset y₀ of 500,000 cells and compared the summary statistics of this simulated dataset to the observed summary statistics using the following distance function:

$$D({y}_{{\rm{obs}}},{y}_{0})=|{C}_{{\rm{obs}}}-{C}_{0}|+|{\rho }_{{\rm{obs}}}-{\rho }_{0}|$$

where C₀ and ρ₀ are the simulated co-occurrence and Pearson correlation, respectively. We used a tolerance of ϵ = 0.05 as our target error, and each ABC instance was run for up to 3 days. Each simulation was initialized with a parental cell with equal initial copy numbers of both ecDNAs (${k}_{{\rm{init}}}^{1}={k}_{{\rm{init}}}^{2}$): in Fig. 3g this initial copy number was 5, although alternative initial conditions are explored in Extended Data Fig. 8f–h. We used the following summary statistics for each cell line: SNU16m1 (C_obs = 0.99, ρ_obs = 0.46); TR14 (C_obs = 0.96, ρ_obs = 0.26); GBM39-KT (C_obs = 0.67, ρ_obs = 0.36).
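The distance function and acceptance rule can be illustrated with a plain rejection-ABC loop. This Python sketch is deliberately simpler than the sequential Monte Carlo scheme used in the study: the function names are hypothetical, a single parameter `s_single` stands for the two individual-ecDNA coefficients assumed equal, and `simulate` is a placeholder for a forward simulation that returns the simulated summary statistics (C₀, ρ₀).

```python
import random


def abc_distance(c_obs, rho_obs, c_sim, rho_sim):
    """D = |C_obs - C_0| + |rho_obs - rho_0|."""
    return abs(c_obs - c_sim) + abs(rho_obs - rho_sim)


def sample_prior(rng):
    """Draw one parameter set from the uniform priors stated in the text."""
    return {
        "s_single": rng.uniform(0.0, 1.0),  # s_{-,+} = s_{+,-}
        "s_both": rng.uniform(0.0, 2.0),    # s_{+,+}
        "gamma": rng.uniform(0.0, 1.0),     # co-segregation coefficient
    }


def abc_rejection(simulate, c_obs, rho_obs, n_draws, rng, epsilon=0.05):
    """Keep the parameter sets whose simulated statistics fall within epsilon."""
    accepted = []
    for _ in range(n_draws):
        theta = sample_prior(rng)
        c_sim, rho_sim = simulate(theta, rng)
        if abc_distance(c_obs, rho_obs, c_sim, rho_sim) <= epsilon:
            accepted.append(theta)
    return accepted


def toy_simulate(theta, rng):
    """Placeholder simulator (not the evolutionary model): correlation equals gamma."""
    return 0.99, theta["gamma"]


# Observed statistics reported for SNU16m1: C_obs = 0.99, rho_obs = 0.46.
rng = random.Random(42)
posterior_sample = abc_rejection(toy_simulate, 0.99, 0.46, n_draws=2000, rng=rng)
```

With the placeholder simulator, the accepted draws are those with γ within 0.05 of 0.46; replacing `toy_simulate` with a real forward simulation is what makes the accepted set an approximate posterior sample.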

The specific implementation of this procedure was performed using a sequential Monte Carlo scheme (ABC-SMC) using the Python package pyabc (v.0.12.8). In brief, this approach performs sequential rounds of inference while computing a weight for the accepted parameters for each iteration. Further details of this procedure were reported previously⁷⁶–⁷⁹.

Cell-level co-segregation model of ecDNA co-evolution

Previously, we introduced co-segregation at the level of individual ecDNA elements inside each cell, where an ecDNA element carrying one species is linked to another element with a probability defined by the co-segregation parameter. Here, we introduce an alternative model, in which ecDNA co-segregation is implemented at the cellular level. In each cell division, if a cell is chosen for proliferation, the number of ecDNA copies in that cell is doubled. We first segregate both ecDNA species randomly and independently, each following a binomial distribution, and then pair the higher copy numbers of the two species into the same daughter cell with a probability $\gamma \in [0,1]$. More precisely, γ describes the likelihood of extreme copy-number correlation, and 1 − γ describes the likelihood of extreme copy-number anticorrelation. If γ = 0.5, both extreme scenarios are equally likely, which results in the modelling of standard random ecDNA proliferation without co-segregation.
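One way to read this cell-level rule is sketched below in Python. The function names are hypothetical, and the pairing step (the larger share of each species goes to the same daughter with probability γ and to opposite daughters otherwise) is an interpretation of the description rather than the authors' code.

```python
import random


def binomial_half(n, rng):
    """Number of heads in n fair coin flips (binomial(n, 0.5))."""
    return bin(rng.getrandbits(n)).count("1") if n > 0 else 0


def divide_cell_level(n1, n2, gamma, rng):
    """Cell-level co-segregation; returns two daughters as (species 1, species 2)."""
    total1, total2 = 2 * n1, 2 * n2        # copies double before division

    # Independent random binomial split of each species.
    share1 = binomial_half(total1, rng)
    share2 = binomial_half(total2, rng)
    high1, low1 = max(share1, total1 - share1), min(share1, total1 - share1)
    high2, low2 = max(share2, total2 - share2), min(share2, total2 - share2)

    if rng.random() < gamma:
        # Extreme correlation: both larger shares go to the same daughter.
        return (high1, high2), (low1, low2)
    # Extreme anticorrelation: the larger shares go to opposite daughters.
    return (high1, low2), (low1, high2)


rng = random.Random(21)
daughter_a, daughter_b = divide_cell_level(10, 10, 1.0, rng)
```

At γ = 0.5 the pairing is a fair coin flip, so the joint distribution of daughter copy numbers matches two independent binomial splits.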

In this model, the population growth is also modelled as a birth–death stochastic process and implemented by a standard Gillespie algorithm¹². We start from a small initial population (a single cell or three cells) carrying a certain number of ecDNA elements and record the exact number of ecDNA copies for each cell throughout the simulation. Cells are chosen for proliferation randomly but in proportion to their fitness (1 + s), where s is the selection coefficient. Neutral proliferation is defined relative to the fitness of cells without ecDNA (s = 0). If carrying ecDNA confers a fitness advantage, s > 0. For simplicity, in our models, we give a fixed selection coefficient for cells carrying either ecDNA and vary the selection coefficient for cells with both ecDNAs to investigate the impact of co-selection in ecDNA co-evolution. For reporting, we discretize the population into three subpopulations, named pure, mix and free (no) ecDNA cells (Fig. 3g), which represent cells carrying just one type of ecDNA, both types or no ecDNA at all, respectively. For the results presented in Extended Data Fig. 8c–e, we simulated populations of 10,000 cells and reported summary statistics across 500 replicates.
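The subpopulation labels and the fitness-proportional choice of the next dividing cell can be sketched as follows. This is an illustrative Python fragment, not the authors' Gillespie implementation; the function names and the parameter names `s_single` and `s_both` are assumptions.

```python
import random


def classify(cell):
    """Label a cell (n1, n2) as 'free', 'pure' or 'mix'."""
    n1, n2 = cell
    if n1 == 0 and n2 == 0:
        return "free"   # no ecDNA at all
    if n1 > 0 and n2 > 0:
        return "mix"    # both ecDNA species present
    return "pure"       # exactly one ecDNA species present


def fitness(cell, s_single, s_both):
    """Fitness 1 + s, with s set by which ecDNA species the cell carries."""
    s = {"free": 0.0, "pure": s_single, "mix": s_both}[classify(cell)]
    return 1.0 + s


def choose_dividing_cell(cells, s_single, s_both, rng):
    """Pick the index of the next cell to divide, proportional to fitness."""
    weights = [fitness(cell, s_single, s_both) for cell in cells]
    return rng.choices(range(len(cells)), weights=weights, k=1)[0]


population = [(0, 0), (4, 0), (0, 7), (3, 5)]
labels = [classify(cell) for cell in population]
rng = random.Random(13)
index = choose_dividing_cell(population, s_single=0.2, s_both=0.5, rng=rng)
```

In this toy population the weights are 1.0, 1.2, 1.2 and 1.5, so the cell carrying both species is chosen most often but not exclusively.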

Evolutionary modelling of drug intervention

The evolutionary model described previously (see the ‘Evolutionary modelling of ecDNA copy-number framework’ section) was used to evaluate the effect of pemigatinib treatment on SNU16m1 cells. To do so, we modified the framework to allow for a burn-in period to simulate population growth without drug and then introduced a perturbation to selection coefficients at a defined timepoint.

Specifically, we allowed the cell population to grow to 5,000 cells under the following conditions: a base birth rate (λ_base) of 0.5, a death rate (μ) of 2.5, an initial ecDNA copy number for both species (${k}_{{\rm{init}}}^{1}={k}_{{\rm{init}}}^{2}$) of 10, and the following selection coefficients: s_−,− = 0; s_−,+ = 0.15; s_+,− = 0.15; s_+,+ = 0.8 (here, let cells carrying only FGFR2 ecDNA be denoted by s_+,− and cells carrying only MYC ecDNA by s_−,+).

For the experiments presented in Extended Data Fig. 10a, in which we examine the dynamics of ecDNA copy number after pemigatinib treatment across a range of values, we simulated pemigatinib treatment by modulating the co-segregation level and selection pressures acting on cells after the 5,000 cell burn-in population was simulated. Specifically, we explored co-segregation parameters between 0 and 1, and selection pressure values ${s}_{+,+}={s}_{+,-}\in \{0,-\,0.1,\,-\,0.2,\,-\,0.3,\,-\,0.4,\,-\,0.5\}$. We then simulated 500,000 cells from the pre-treatment group of 5,000 cells while maintaining the same values for γ, μ, λ_base, s_−,− and s_−,+.
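The switch from burn-in to treatment amounts to replacing two entries in a table of selection coefficients. The Python sketch below shows one way to represent this; the dictionary layout (keys ordered as FGFR2 status, then MYC status) and the function names are assumptions, while the numerical values are those stated for the burn-in and treatment conditions.

```python
# Selection coefficients during the drug-free burn-in.
# Keys are (FGFR2 ecDNA present, MYC ecDNA present) as "+" or "-".
BURN_IN = {
    ("-", "-"): 0.0,
    ("-", "+"): 0.15,   # MYC ecDNA only
    ("+", "-"): 0.15,   # FGFR2 ecDNA only
    ("+", "+"): 0.8,    # both ecDNA species
}


def with_treatment(coefficients, treated_value):
    """Return a copy in which s_{+,+} and s_{+,-} are set to the treated value."""
    treated = dict(coefficients)
    treated[("+", "+")] = treated_value
    treated[("+", "-")] = treated_value
    return treated


def selection_for(n_fgfr2, n_myc, coefficients):
    """Look up the selection coefficient for a cell from its copy numbers."""
    key = ("+" if n_fgfr2 > 0 else "-", "+" if n_myc > 0 else "-")
    return coefficients[key]


# One treatment condition per explored selection pressure value.
treatment_grid = [with_treatment(BURN_IN, s) for s in (0.0, -0.1, -0.2, -0.3, -0.4, -0.5)]
treated = treatment_grid[1]
```

Only cells carrying FGFR2 ecDNA change fitness under treatment in this representation; cells carrying MYC ecDNA alone keep their burn-in coefficient.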

For the pulsed pemigatinib treatment simulations presented in Fig. 4i, we used the same base birth rate, initial copy numbers, death rate and selection coefficients for the burn-in period of 5,000 cells. To simulate the first round of pemigatinib treatment, selection pressure values were set to s_+,+ = s_+,− = −0.1, 100,000 cells were simulated from the initial 5,000 cell pre-treatment group, and 25,000 cells were sampled at random to continue for the drug holiday. During the drug holiday, 1.2 million cells were simulated according to the initial selection parameters from the 25,000 cells sampled from the simulated drug treatment, with a modified base birth rate of 0.4 to model recovery times after drug treatment. After the drug holiday, 200,000 cells were sampled at random and a further drug treatment was simulated up until at least 110 time units according to the same selection parameters used in the first round of simulated pemigatinib treatment. For time-dependent functions of copy number reported in Fig. 4i, the mean copy numbers of both ecDNA species were computed in time bins of 5 up until the introduction of pemigatinib and bins of 1 afterwards.
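The two-resolution time binning used for the copy-number curves can be sketched as below. This Python fragment is illustrative only: the function name, the (time, copy number) record format and the alignment of bins (to time 0 before treatment and to the treatment start afterwards) are assumptions.

```python
import math


def binned_means(records, treatment_start, pre_width=5.0, post_width=1.0):
    """Mean copy number per time bin.

    records:          iterable of (time, copy_number) pairs for one ecDNA species.
    treatment_start:  time at which the drug is introduced.
    Bins are pre_width wide before treatment and post_width wide afterwards.
    Returns {bin_start_time: mean_copy_number}, ordered by bin start.
    """
    bins = {}
    for time, copy_number in records:
        if time < treatment_start:
            start = math.floor(time / pre_width) * pre_width
        else:
            elapsed = time - treatment_start
            start = treatment_start + math.floor(elapsed / post_width) * post_width
        bins.setdefault(start, []).append(copy_number)
    return {start: sum(values) / len(values) for start, values in sorted(bins.items())}


# Toy records with the drug introduced at time 10.
toy_records = [(1, 10), (4, 20), (7, 30), (10.2, 8), (10.7, 4), (11.5, 2)]
means = binned_means(toy_records, treatment_start=10)
```

The finer bins after drug introduction resolve the faster copy-number changes expected once selection coefficients are perturbed.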

Evolutionary modelling of enhancer-only ecDNA

To examine the evolutionary principles of enhancer-only ecDNA, we used the previously described evolutionary model (see the ‘Evolutionary modelling of ecDNA copy-number framework’ section above) without death and fixed the following evolutionary parameters: s_+,− = 0.2, s_−,+ = 0, λ_base = 0.5 and ${k}_{{\rm{init}}}^{1}={k}_{{\rm{init}}}^{2}=5$. We simulated ten replicates of 1-million-cell populations with a co-selection coefficient s_+,+ varied over [0, 1] and a co-segregation coefficient γ varied over [0, 1]. In Fig. 4, we report the distribution of co-occurrence summary statistics C across these ten replicates.

Nanopore sequencing of SNU16 genomic DNA

Genomic DNA from approximately 2 million SNU16 cells was extracted using the MagAttract HMW DNA Kit (Qiagen, 67563) and prepared for long-read sequencing using the Ligation Sequencing Kit V14 (Oxford Nanopore Technologies SQK-LSK114) according to the manufacturer’s instructions. Libraries were sequenced on a PromethION (Oxford Nanopore Technologies) using a 10.4.1 flow cell (Oxford Nanopore Technologies FLO-PRO114M).

Base calling from raw POD5 data was performed using Dorado (Oxford Nanopore Technologies, v.0.2.1+c70423e). Reads were aligned to hg19 using Winnowmap2 (ref. ⁸⁰) (v.2.03) with the following parameters: -ax map-ont. Structural variants were called using Sniffles⁸¹ (v.2.0.7) using the following additional parameters: --output-rnames.

Pemigatinib treatment of SNU16m1 and COLO 320DM cell lines

SNU16m1 and COLO 320DM cells were treated with 5 μM pemigatinib (Selleckchem, S0088), or with an equal volume of DMSO. Fresh pemigatinib was replenished approximately every 3–4 days. Approximately 1 million SNU16m1 cells were sampled from the DMSO condition and 300,000 cells from the pemigatinib-treated conditions at day 0, 7, 14, 21, 28, 35 and 42; genomic DNA was extracted using the DNeasy Blood & Tissue Kit (Qiagen, 69504), and subjected to WGS (see the ‘WGS’ section above). Approximately 2 million COLO 320DM cells were sampled at day 14, genomic DNA was extracted using the Quick DNA MiniPrep kit (Zymo Research; D0325) and subjected to WGS using the same procedure as above. Copy numbers for oncogene regions were computed using cnvkit (v.0.9.6.dev0)⁸².

Chemotherapy treatment of SNU16m1 cell line

SNU16m1 cells were treated with 10 μM etoposide (Selleckchem, S1225), 20 μM fluorouracil (Selleckchem, S1209), 100 μM hydroxyurea (Selleckchem, S1896), or an equal volume of DMSO as a control for 20 days. A total of 2,300,000 SNU16m1 cells were plated in T-75 flasks for treatment with chemotherapeutic drugs and approximately 1,000,000 cells were seeded in T-25 flasks for treatment with the DMSO control. Fresh chemotherapy drug was replenished at least every 7 days. On day 20 of the experiment, the remaining cells were collected and genomic DNA was extracted using the Quick DNA MiniPrep kit (Zymo Research, D0325) and subjected to WGS and analysis (see the ‘WGS’ section above). Copy numbers for oncogene regions were computed using cnvkit (v.0.9.10)⁸².

Nutlin-3a treatment of TR14 cells and interphase DNA-FISH

A total of 175,000 TR14 cells was seeded per well in 12-well plates. Cells were treated either with 0.1% DMSO or with 1 µl nutlin-3a (Sigma Aldrich, SML0580) for 6 days, without an additional wash-out period.

The samples were fixed using Carnoy’s solution (3:1 methanol:acetic acid). Fixed samples on coverslips or slides were briefly equilibrated in 2× SSC buffer. They were then dehydrated in ascending ethanol concentrations of 70%, 90% and 100% for approximately 2 min each. FISH probes were diluted in probe hybridization buffer and added to the sample with the addition of a coverslip or slide. The samples were denatured at 78 °C for 5 min and then hybridized at 37 °C overnight in a humid and dark chamber. The samples were washed twice in 0.4× SSC with 0.3% IGEPAL CA-630 for 2 min with agitation for the first 10–15 s. They were then washed once in 2× SSC with 0.1% IGEPAL CA-630 at room temperature for 2 min, again with agitation for the first 10–15 s. DAPI (100 ng ml^−1) was applied to samples for 10 min. The samples were then washed again with 2× SSC and mounted with ProLong Antifade Mountant.

FISH and microscopy were performed as described above for TR14 (see the ‘Metaphase DNA-FISH image analysis’ section). Statistical significance was assessed using Wilcoxon rank-sum tests.

TP53 knockdown by shRNA

Lentiviruses were produced for TP53 knockdown using short hairpin RNA (shRNA) targeting TP53 (shTP53) or GFP (shGFP) as a control. The shTP53 pLKO.1 puro plasmid was a gift from Y. Yu, Johannes Kepler Universität Linz. The shGFP pLKO.1 control plasmid was obtained from the RNAi Consortium, Broad Institute. HEK293T cells were transfected using TransIT-LT1 (Mirus) in a 2:1:1 ratio of lentiviral plasmid, psPAX2 and pMD2.G plasmids (Addgene) according to the TransIT-LT1 manufacturer’s protocol. Viral supernatant was collected 48 and 72 h after transfection, pooled, filtered and stored at −80 °C.

TR14 cells were transduced for 1 day in the presence of 8 µg ml^−1 polybrene (Sigma-Aldrich). They were then grown in full medium for 1 day and selected with puromycin (2 μg ml^−1) for 5–7 days.

Western immunoblotting

A total of 800,000 cells was seeded in six-well plates and treated with either 0.1% DMSO or with the indicated concentration of nutlin-3a (Sigma Aldrich, SML0580) for 6 days, without an additional wash-out period. Whole-cell protein lysates were then prepared by lysing cells in radioimmunoprecipitation assay buffer supplemented with cOmplete Protease inhibitor (Roche) and PhosSTOP (Roche). Protein concentrations were determined using the bicinchoninic acid assay (Thermo Fisher Scientific). Then, 30 µg of protein was denatured in Laemmli buffer at 95 °C for 10 min. The lysates were loaded onto 16% Tris-glycine gels (Thermo Fisher Scientific) for electrophoresis. Proteins were transferred onto polyvinylidene fluoride membranes (Roche), blocked with 5% dry milk for 1 h and incubated with primary antibodies overnight at 4 °C, followed by secondary antibodies for 1 h at room temperature (MDM2 antibody (SMP14), Santa Cruz Biotechnology, sc-965, 1:200 dilution; p53 antibody (DO-1), Santa Cruz Biotechnology, sc-126, 1:500 dilution; goat anti-mouse IgG (H+L) secondary antibody, HRP, Invitrogen, 31430, 1:2,000 dilution; vinculin monoclonal antibody (VLN01), Invitrogen, MA5-11690, 1:250 dilution). Chemiluminescent signal was detected using SuperSignal West Femto Maximum Sensitivity Substrate (Thermo Fisher Scientific) and the Fusion FX7 imaging system (Vilber Lourmat) using ImageLab. Unprocessed western blot images are provided as source data.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.