Replication-stress-induced chromatin loops protect fork stability

Cell culture, treatments and thymidine analogue incorporation

Cell lines and culture

MRC5 SV40-immortalized human fibroblasts and HCT116 human colorectal cancer cells were cultured in a 1:1 mixture of Dulbecco’s modified Eagle’s medium (DMEM) and Ham’s F10 (Invitrogen), supplemented with 10% FCS (Biowest) and 1% penicillin–streptomycin (Sigma-Aldrich) (Supplementary Table 6). Cells were maintained at 37 °C in a humidified incubator with 5% CO₂.

Human RPE1-hTERT TP53^−/−shBRCA2 cells were cultured in DMEM/Nutrient Mixture F-12 (Invitrogen) supplemented with 10% FCS and 1% penicillin–streptomycin at 37 °C and 5% CO₂.

U2OS (WT and TKO) cells were cultured in DMEM supplemented with 10% FCS and 1% penicillin–streptomycin under the same incubation conditions.

HAP1 cells²⁵ were cultured in Iscove’s modified Dulbecco’s medium (IMDM) supplemented with 10% FCS and 1% penicillin–streptomycin. All cell lines used in this study were routinely tested for mycoplasma contamination and tested negative.

Thymidine analogue incorporation

For BrdU incorporation, asynchronously growing cells were incubated in their respective culture medium containing 100 µM BrdU for 20 or 40 min (Supplementary Table 7). Immediately after the BrdU pulse, cells were either fixed or subjected to replication stress by replacing the medium with fresh medium containing 1 mM or 4 mM HU.

For EdU incorporation, asynchronously growing cells were incubated in in their respective culture medium containing 10 µM EdU for 20 min. Immediately after the EdU pulse, cells were either fixed or subjected to replication stress by replacing the medium with medium containing appropriate drugs.

In experiments using HCT116 cells expressing CTCF-mAID2-mClover, the culture medium was replaced with medium containing 1 µM 5-Ph-IAA (HY-134653, MedChemExpress) for at least 2 h before the experiment to induce transient depletion of CTCF.

The G9a inhibitor UNC0642 (MedChemExpress) was added at a final concentration of 1 µM for 2 to 4 h before experiments.

Mirin (Sigma-Aldrich, M9948) was added at 100 µM for at least 2 h before any treatment. Analogously, the DNA2 inhibitor (DNA2i; Sigma-Aldrich, SML4192) was added at 5 µM for at least 2 h before treatment.

In experiments using RPE1 cells with a doxycycline-inducible shRNA against BRCA2, knockdown was induced by replacing the culture medium with medium containing 2 µg ml⁻¹ doxycycline for 48 h before the experiment.

Flow cytometry

Cell cycle analysis was performed on MRC5 cells. After the indicated treatments, cells were collected by trypsinization, washed in PBS and fixed in 70% ethanol at −20 °C. Fixed cells were permeabilized in 0.2% Triton X-100 (Sigma-Aldrich). To detect incorporated BrdU, permeabilized cells were incubated in 2.5 M HCl for 1 h at room temperature to denature the DNA, followed by neutralization and washing in PBS.

Cells were then incubated with a primary anti-BrdU antibody (anti-BrdU, 44, BD, 347580) for 1 h, followed by a 30 min incubation with an Alexa-Fluor-488-conjugated secondary antibody (goat anti-mouse). DNA was counterstained with DAPI. Single nuclei were selected by sequential gating using SSC-A versus FSC-A, followed by FSC-H versus FSC-W and SSC-H versus SSC-W on the BD Aria flow cytometer (BD Biosciences) to exclude doublets and debris. Quantification and analysis of cell cycle profiles were performed using FlowJo software (BD, FlowJo).

Western blot

Protein samples were separated on 4–12% NuPAGE Bis-Tris Gels (Novex, Life Technologies) and transferred onto polyvinylidene difluoride membranes (0.45 µm; Immobilon, Millipore). The membranes were blocked with 5% BSA in PBS for 1 h at room temperature and incubated overnight at 4 °C with primary antibodies diluted in blocking buffer.

The following primary antibodies were used: anti-CTCF⁵⁷ (rabbit, 1:500), anti-GFP (mouse, 1:1,000, Roche, 11814460001) and anti-α-tubulin (mouse, 1:1,000, Sigma-Aldrich, T6074). The next day, the membranes were washed in PBS containing 0.1% Tween-20, then incubated with secondary antibodies coupled to near-infrared dyes CF 680 or CF 770 (1:10,000; Sigma-Aldrich) for 1 h at room temperature (Supplementary Table 8).

Signals were detected using the Odyssey CLx infrared scanner (LI-COR Biosciences), and the band intensities were quantified using ImageJ (NIH).

DNA fibre analysis

DNA fibre analysis was performed essentially as described previously. In brief, cells were sequentially pulse-labelled with 10 µM EdU (Invitrogen) or 30 μM CldU (MP Biomedicals) followed by 250 µM IdU (Sigma-Aldrich). After labelling and treatments, as indicated in the figure panels, cells were collected and DNA fibres were prepared by spreading lysed nuclei on glass slides, as previously described³.

EdU was detected by a Click-iT reaction to conjugate Azide-PEG3-Biotin (Jena Bioscience, CLK-1167-25) to EdU-labelled DNA, followed by incubation with anti-biotin antibody (A150-109A, Bethyl Laboratories) diluted 1:100 in blocking buffer (PBS, 2% BSA, 0.1% Tween-20). CldU was detected using Anti-BrdU (Abcam, ab6326), IdU was detected using an anti-BrdU antibody (B44; 347580, BD Biosciences) diluted 1:100 in the same blocking buffer. Primary antibodies were then labelled with anti-mouse antibody conjugated with Alexa Fluor 488 (diluted 1:100 in blocking buffer) and anti-rat or anti-rabbit secondary antibody conjugated to Alexa Fluor 594 (diluted 1:100 in blocking buffer) for 1 h at room temperature (Supplementary Table 8). Fibres were visualized and imaged using the Metafer slide scanner (MetaSystems) equipped with a ×40 Plan-Neofluar 0.75 NA air objective. Replication tracks were quantified using ImageJ.

Chromatin fibre analysis (ChromStretch)

Chromatin fibre analysis (ChromStretch) was performed largely as described previously³. After the indicated treatments, a minimum of 3 × 10⁵ cells was collected and washed twice in cold 1× PBS. To facilitate chromatin isolation and spreading, cells were resuspended in hypotonic buffer (3 mM EDTA, 0.1 mM EGTA, 1 mM DTT and protease inhibitor cocktail). Nuclei were collected by centrifugation (1,800g for 4 min at 4 °C).

The nuclear pellet was spotted onto Superfrost microscope slides and allowed to settle for 10 min in a humid chamber. Excess buffer was removed by gently tilting the slides, which were then allowed to air-dry for a maximum of 5 min. The slides were transferred into a lysis chamber containing lysis buffer (25 mM Tris base, 0.1 mM EDTA, 0.1 mM EGTA, 1 mM DTT and 2% Triton X-100) and incubated for 20 min. Chromatin stretching was achieved by flowing the lysis buffer out of the chamber at a constant flow using a custom-designed and custom-built device. Stretched chromatin fibres were fixed in 2% formaldehyde for 15 min, then washed three times in PBS. EdU was labelled with Alexa Fluor 594-azide according to the manufacturer’s instructions for 30 min, followed by a PBS wash and blocking in 1× PBS with 5% BSA for 1 h. The slides were then incubated with primary antibodies overnight at 4 °C. Primary antibodies were mouse anti-H3K9me3 (EPR26601; Abcam, ab317790, 1:500 in 5%BSA/PBS) and rabbit anti-H3 (Abcam, ab1791; 1:500 in 5% BSA/PBS). Primary antibodies were then labelled with anti-mouse antibody conjugated with Alexa Fluor 488 (diluted 1:1,000 in blocking buffer) and anti-rabbit secondary antibody conjugated with Alexa Fluor 647 (diluted 1:1,000 in blocking buffer) for 1 h at room temperature (Supplementary Table 8). Chromatin fibres were imaged using the Leica ST5 confocal microscope equipped with an oil-immersion ×63 HC PL APO CS2 objective (NA 1.4). Quantification of the H3K9me3 signal overlapping with the EdU signal was performed using ImageJ.

High-content PLA

High-content PLA were performed as previously described³ using the Duolink In Situ reagents (Sigma-Aldrich) according to the manufacturer’s instructions. In brief, cells were seeded on coverslips 1 day before treatments. After pulse labelling with 10 µM EdU for 20 min and the indicated treatments (as described in the figure legends), cells were fixed.

Fixed cells were permeabilized with 0.1% Triton X-100 in PBS for 15 min at room temperature and blocked with 5% BSA for 1 h at room temperature. After blocking, EdU was biotinylated using copper-catalysed click chemistry by incubating the mix (25 µM picolyl-azide-PEG4-biotin (Jena Bioscience), 10 mM sodium l-ascorbate and 2 mM CuSO₄ in PBS) for 30 min at room temperature to allow antibody labelling of newly replicated DNA. Cells were washed once with PBS and incubated with the indicated primary (mouse anti-biotin and rabbit anti-H3K9me3, 1:1,000 (Extended Data Fig. 4); or rabbit anti-biotin and mouse anti-GFP, 1:1,000 (Extended Data Figs. 8d and 10i)) antibodies in 5% FBS overnight at 4 °C.

Proximity ligation was performed using the Duolink Proximity Ligation Assay kit according to the manufacturer’s instruction (Sigma-Aldrich). In brief, cells were washed three times with buffer A (0.01 M Tris-HCl pH 7.4, 0.15 M NaCl and 0.05% Tween-20) and incubated with Duolink In Situ PLA Probe Anti-Rabbit PLUS and Anti-Mouse MINUS for 1 h at 37 °C. Cells were washed three times with buffer A and incubated with the Duolink In Situ PLA Ligation Mix for 30 min at 37 °C. After three more washed with buffer A, cells were incubated with Duolink In Situ PLA Amplification mix (Red), and washed three times with buffer B (0.01 M Tris-HCl pH 7.4, 0.15 M NaCl) for 10 min at room temperature, followed by one wash with 0.01× buffer B. For the EdU-CTCF PLA (Extended Data Figs. 8d and 10i) cells were incubated with an anti-rabbit Alexa Fluor 488-conjugated secondary antibody for 15 min at room temperature for staining EdU-positive cells, and nuclei were stained with DAPI (0.1 μg ml⁻¹) for 15 min at room temperature. Cells were washed three times with PBS and mounted onto SuperFrost microscope slides using MOWIOL 4-88 mounting medium (Sigma-Aldrich).

Images were acquired using the Zeiss Axio Imager Z2 microscope coupled to MetaSystems Metafer5. Image analysis and quantification was carried out using Metafer MetaCyte. The number of PLA foci was quantified and the PLA spot intensity was calculated as the product of the mean intensity of each spot per nucleus. To analyse clustered PLA spots (Extended Data Fig. 4), the cell image analysis software CellProfiler was used to quantify the number of clustered spots per nucleus.

Hi-C/Rep-Hi-C

Hi-C and Rep-Hi-C were performed on 60–70 million cells. Cells were grown in 15-cm dishes to approximately 70% confluence and pulse-labelled with 100 µM BrdU for 20 min. Subsequently, cells were incubated or not with 4 mM HU for 1 h to induce replication stress. Cells were then cross-linked in 1% formaldehyde for 10 min at room temperature, quenched with 300 mM glycine for 5 min at room temperature, and further incubated for 15 min at 4 °C. Fixed cells were collected by scraping, snap-frozen in liquid nitrogen, and stored at −80 °C until use.

Cross-linked pellets were dounced and incubated in pre-cooled permeabilization buffer (10 mM Tris-HCl pH 8.0, 10 mM NaCl, 0.4% Igepal CA-630, and protease inhibitor cocktail) for 1 h. Cells were washed with 1× NEBuffer 2.1 and treated with 0.1% SDS for 10 min at 65 °C, followed by quenching with 1% Triton X-100 to permeabilize nuclei.

Open chromatin was digested overnight at 37 °C with 400 U DpnII/HindIII restriction enzyme in 1× NEBuffer 2.1 (NEBuffer 3.1 for HindIII). Overhangs from the restriction digest were biotinylated (using biotin-14-dATP for DpnII Hi-C/Rep-Hi-C or biotin-14-dUTP for HindIII Hi-C/Rep-Hi-C; Jena Bioscience) by a gap-filling reaction at 23 °C for 4 h, followed by blunt-end ligation at 16 °C for 4 h. Ligated DNA was de-cross-linked using a standard proteinase K protocol overnight. Protein-free DNA was purified using phenol–chloroform extraction, ethanol precipitation and concentration on Amicon columns to remove excess salts and concentrate DNA (adapted from a previous study¹⁵). The purified DNA was treated with RNase A (10 µg ml⁻¹) to avoid RNA contamination.

After DNA purification, unligated or dangling fragments bearing undesired biotinylated nucleotides were removed before sonication using a Covaris sonicator. DNA fragments were then size-selected to 200–300 bp using solid-phase reversible immobilization (SPRI) bead-based size selection (Ampure XP, Beckman Coulter). Before pull-down, size-selected DNA was end-repaired using T4 DNA polymerase, T4 PNK, Klenow DNA polymerase and dNTPs, followed by SPRI bead purification. The repaired DNA was A-tailed and ligated to Illumina adapters.

Adapter-ligated DNA was split into two fractions. The first fraction was used to generate conventional Hi-C libraries and was directly subjected to capture of biotinylated fragments using streptavidin-coated magnetic beads (MyOne Streptavidin beads, Invitrogen). DNA bound to the beads served as PCR template to generate indexed Hi-C libraries for Illumina sequencing.

The second fraction was processed for Rep-Hi-C. Adapter-ligated DNA was denatured at 95 °C for 5 min and immediately chilled on ice for 2 min to obtain single-stranded DNA. The single-stranded DNA was incubated with 0.5 mg anti-BrdU antibody (BD Pharmingen, 555627; 0.5 mg ml⁻¹ stock) per µg of DNA for 45 min at room temperature. Newly replicated DNA bound to BrdU antibodies was pulled-down using Protein G Dynabeads. DNA was eluted using proteinase K treatment and purified using SPRI beads. The resulting adapter-containing ssDNA was incubated with streptavidin-coated beads (Dynabeads MyOne Streptavidin C1 beads, Invitrogen) to capture biotin-labelled ligation junctions. Bead-bound DNA was used as PCR template to generate indexed Rep-Hi-C libraries, which were sequenced on the Illumina NovaSeq 6000 platform.

ChIC/Rep-ChIC

Cells were incubated in the presence of 100 µM BrdU for 20 min before being subjected or not to replication stress with 1 mM HU for 1 h. After treatment, 500,000 or 1,000,000 cells were collected for ChIC or Rep-ChIC, respectively.

ChIC was performed using the ChIC/CUT&RUN assay kit (Active Motif, 53180) according to the manufacturer’s instructions. In brief, nuclei were isolated, permeabilized and incubated with the appropriate antibodies: rabbit anti-H3K9me3 (EPR16601, Abcam, ab176916, 1 µg per 500,000 cells), rabbit anti-CTCF (Active Motif, 61311, 1 µg per 500,000 cells) or rabbit anti-FANCD2 (Abcam, ab108928, 1 µg per 500,000 cells) (Supplementary Table 8). After antibody binding, nuclei were incubated with pAG-MNase, and nuclease activity was activated to cleave chromatin near antibody-bound sites. Cut DNA fragments were released, collected and purified according to the manufacturer’s protocol. Purified DNA fragments were used to prepare sequencing libraries with the NEBNext Multiplex Oligos for Illumina (New England Biolabs).

Rep-ChIC was performed as for ChIC with an additional BrdU-pull down as follows. After end-repair, A-tailing and adapter ligation, DNA was denatured at 95 °C for 5 min and immediately cooled on ice for 2 min. The resulting single-stranded DNA was incubated with 0.5 µg anti-BrdU antibody (BD Pharmingen, 555627; 0.5 mg ml⁻¹) for 45 min at room temperature. Newly replicated DNA bound to BrdU antibodies was pulled-down using Protein G Dynabeads. DNA was eluted using proteinase K, purified using SPRI beads and the adapter-containing ssDNA was used as template to generate indexed Illumina sequencing libraries by PCR.

Fork-deg-seq

For Fork-deg-seq, a minimum of 10–15 million cells, after all treatments, were incubated with 100 µM BrdU for 40 min and then subjected to replication stress with 4 mM HU for 3, 5 or 8 h. For the ionizing radiation control (Fig. 4d), cells were incubated with 100 µM BrdU for 40 min before irradiation with 10 Gy of X-ray followed by a 15 min recovery. Cells were then cross-linked with 1% formaldehyde for 10 min at room temperature, quenched with 300 mM glycine for 5 min at room temperature and further incubated for 15 min at 4 °C. Fixed cells were collected by scraping, snap-frozen in liquid nitrogen and stored at −80 °C.

Cross-linked pellets were homogenized by douncing and incubated in pre-cooled permeabilization buffer (10 mM Tris-HCl pH 8.0, 10 mM NaCl, 0.4% Igepal CA-630, protease inhibitor cocktail) for 1 h. Cells were then treated with 0.1% SDS for 10 min at 65 °C and quenched with 1% Triton X-100. Biotinylated nucleotides were incorporated at DNA degradation sites by a gap-filling reaction, followed by ligation, de-cross-linking and DNA purification as described above for the Rep-Hi-C procedure.

Purified DNA was sonicated using a Covaris sonicator, and fragments were size-selected to 200–300 bp using SPRI beads. Before pull-downs, size-selected DNA was end-repaired, A-tailed and ligated to paired-end Illumina adapters as in Rep-Hi-C. Adapter-ligated DNA was first used to enrich replicating regions by capturing BrdU-labelled fragments, as in Rep-Hi-C. The resulting single-stranded DNA after BrdU capture was then used to capture degradation sites by pull-down with streptavidin beads, which bind to biotin incorporated at ligation/degradation sites. Bead-bound DNA was PCR-amplified to the desired number of cycles and sequenced on the Illumina NovaSeq 6000 platform.

3C–qPCR

For 3C–qPCR, a minimum of 5–10 million cells was treated and cross-linked as mentioned in the Rep-Hi-C protocol, but without the BrdU pulse. Samples were collected and cross-linked pellets were processed similarly to Rep-Hi-C until the restriction-digestion step, at which point chromatin was digested overnight at 37 °C with 400 U HindIII. Sticky-end ligation was performed at 16 °C for 4 h. Ligated DNA was de-cross-linked and purified as described previously for the Rep-Hi-C procedure.

After 3C DNA purification, specific chromatin interactions were quantified using qPCR using 16 sets of primer pairs designed across putative loop anchors distant to each other (in the range of few kilobases) for six loops across different chromosomes (schematics are provided in Extended Data Fig. 7b) (Supplementary Table 2). PCR products for the interactions ranged between 200 and 350 bp and, using 3C DNA as template, C_q values (representing interaction among two distant loop anchors) were calculated. C_q values were normalized to an internal interaction control, and the fold changes were calculated relative to the UT conditions (Extended Data Figs. 7c and 10h). Sanger sequencing of PCR products was used to confirm interactions (HindIII junction site) between primer pairs from two distinct loop anchors for HU-specific loops and induced by other replication-stress-inducing drugs. (schematics and representative junction sites are shown in Extended Data Fig. 7c (right)).

EM analysis

Replication fork architecture was analysed according to the standard protocol described previously²⁵. In brief, asynchronous cultures were treated with 1 mM HU for 3 h. Fork architecture was cross-linked in vivo by treatment with 10 µg ml⁻¹ 4,5′,8-trimethylpsoralen (Thermo Fisher Scientific, J63226-03) and pulse irradiation with 365 nm monochromatic ultraviolet light using a BLX312 ultraviolet crosslinker (Vilber Lourmat).

DNA was extracted with lysis buffer (1.28 M sucrose, 40 mM Tris–HCl pH 7.5, 20 mM MgCl₂, 4% Triton X-100) and further digested in digestion buffer (800 mM guanidine-HCl, 30 mM Tris–HCl pH 8.0, 30 mM EDTA pH 8.0, 5% Tween-20, 0.5% Triton X-100) in the presence of 1 mg ml⁻¹ proteinase K at 50 °C for 2 h. DNA was purified by phase separation using chloroform:isoamyl alcohol (24:1) and precipitated with 0.7 vol isopropanol, then washed with 70% ethanol, dried and resuspended in TE buffer.

Next, 12 μg of genomic DNA were digested with PvuII-HF (NEB, R3151) and concentrated using Microcon centrifugal filters (Merck) according to the manufacturer’s instructions. The benzyldimethylalkylammonium chloride (Sigma-Aldrich, 8219440100) method was used to spread DNA at the water–air interface, which was then collected on carbon-coated nickel grids. DNA was coated with an 8 nm layer of platinum using a high-vacuum sputter coater (EM ACE600, Leica Microsystems).

High-throughput, automated imaging was performed using an FEI TALOS transmission electron microscope equipped with a BM-Ceta camera and MAPS v.3.18 software. For each condition, at least 70 replication fork intermediates were analysed per experiment. ssDNA gaps and fork structures were processed and quantified using Fiji⁵⁸.

3D DNA FISH probe design

Probe sequences were designed using the PaintSHOP platform (https://paintshop.io/), and probe indices were selected on the basis of previously published iFISH resources⁵⁹. High-throughput probe production was performed following the iFISH protocol, with probes designed specifically for a HU-specific loop anchors (left and right) (Supplementary Table 3).

3D DNA FISH

Three-dimensional DNA FISH (3D DNA FISH) was performed according to the uniFISH procedure⁶⁰. In brief, after treatments, cells were fixed in methanol:acetic acid (3:1, v/v) at room temperature for 15 min. Fixed cells were washed with 0.05% Triton X-100 and incubated with RNase A (100 µg ml⁻¹ in 1× PBS) at 37 °C for 60 min.

Cells were dehydrated through a graded ethanol series and air-dried for 3 h. For pre-hybridization, morphologically preserved samples were incubated in 1× uniFISH pre-hybridization buffer (an application-directed, versatile DNA FISH platform for research and diagnostics)⁶⁰ at 37 °C for 60 min. The primary probe mix was prepared by adding 1 µl of a 1:2,500 working probe solution to 9 µl of 1.1× uniFISH first hybridization buffer. Coverslips were placed onto the probe mix, sealed with Fixogum rubber cement, denatured at 75 °C for 5 min and incubated at 37 °C overnight.

The next day, Fixogum was removed and coverslips were washed according to the uniFISH protocol. The secondary hybridization solution was prepared by mixing 1 µl of a 1:50 dilution of fluorescently labelled oligonucleotide with 99 µl of 1× uniFISH second hybridization buffer. The samples were incubated in this solution at 30 °C for 2 h in the dark, followed by washing in universal wash buffer at 30 °C for 30 min in the dark. Cells were counterstained with DAPI for 30 min and mounted.

DNA FISH signals were visualized and imaged using a Metafer slide scanner (MetaSystems) equipped with a 63× Plan-Neofluar oil objective (NA 0.75). Quantification of FISH signals and distances was performed using ImageJ and CellProfiler.

scEdU-seq on MRC5 cells treated with hydroxyurea

For scEdU-seq, MRC5 cells were first labelled with a single 15 min pulse of EdU, followed by treatment with DMSO, 0.5 mM HU or 1 mM HU, and subsequently fixed. Azide-PEG3-Biotin was conjugated to EdU-labelled DNA as previously described for scEdU-seq²¹. Cells were next stained with DAPI to assess cell cycle distribution and labelled with CellTrace dyes to distinguish between treatment conditions (mock labelling for DMSO, CellTrace Yellow (Invitrogen, C34567) for 0.5 mM HU; and CellTrace Far Red (Invitrogen, C34572) for 1 mM HU). Cells were sorted using the CytoFlex SRT cell sorter into 384-well plates for scEdU-seq processing.

Library preparation was performed as follows. Cells underwent digestion with proteinase K followed by genomic DNA digestion with NlaIII (NEB, R0125) DNA blunt-ending, A-tailing and adapter ligation, incorporating cell barcodes and unique molecular identifiers (UMIs). Pooled single-cell libraries were then bound to Dynabeads MyOne Streptavidin C1 (Invitrogen (65002)) to capture DNA replication fragments through the biotin-modified EdU.

Captured fragments were released by heat denaturation and filled in using Klenow DNA polymerase. Libraries were subsequently amplified by in vitro transcription (IVT), reverse transcription and PCR. The final libraries were sequenced on the Illumina NextSeq 2000 platform (P3 chemistry, 2 × 100 bp).

TrAEL-seq library

For TrAEL-seq, 1 × 10⁶ asynchronous cells were collected after treatment in the absence or presence of HU. Cells were washed in 5 ml buffer (10 mM Tris HCl (pH 7.5), 100 mM EDTA, 20 mM NaCl), embedded in CleanCut agarose and digested overnight at 50 °C in 500 μl digestion buffer (10 mM Tris HCl (pH 7.5), 100 mM EDTA, 20 mM NaCl, 1% sodium N-lauroyl sarcosine, 0.1 mg ml⁻¹ proteinase K) before washing with TE and PMSF. TrAEL-seq was performed using the updated multiplexing protocol as described⁶¹.

E/L repli-seq

Exponentially growing MRC5 cells were pulse-labelled with 100 µM BrdU for 20 min. Immediately after the BrdU pulse, cells were collected by trypsinization and fixed in 70% ethanol and incubated overnight at 4 °C in the presence of 15 µg ml⁻¹ Hoechst 33342. Cells were resuspended in 1× PBS and sorted into two fractions (100,000 cells per fraction): early (E) S phase cells and late (L) S phase cells. Sorted cells were incubated overnight in lysis buffer (50 mM Tris-HCl pH 8, 10 mM EDTA, 0.1% SDS, 50 μg ml⁻¹ RNase A, 100 μg ml⁻¹ proteinase K). DNA was purified from the lysates by phenol extraction. Purified DNA was sonicated using the Covaris sonicator, and fragments were size-selected to 200–300 bp using SPRI beads. Before pull-down, size-selected DNA was end-repaired, A-tailed and ligated to paired-end Illumina adapters as in Rep-Hi-C. Adapter-ligated DNA was first used to enrich replicating regions by capturing BrdU-labelled fragments, as in Rep-Hi-C. The resulting single-stranded DNA after BrdU capture was PCR-amplified to the desired number of cycles and sequenced on the Illumina NovaSeq 6000 platform.

BrdU-HU seq

Exponentially growing MRC5 cells were pulse-labelled with 100 µM BrdU for 20 min followed with a 1 h incubation in 1 mM HU. Immediately after the HU treatment, cells were collected by trypsinization and fixed in 70% ethanol and incubated overnight at 4 °C. Cells were incubated overnight in lysis buffer (50 mM Tris-HCl pH 8, 10 mM EDTA, 0.1% SDS, 50 μg ml⁻¹ RNase A, 100 μg ml⁻¹ proteinase K). DNA was then purified from the lysates by phenol extraction. From this step onwards, the rest of the sample preparation was performed as in the E/L repli-seq procedure.

Data analysis

ChIC/Rep-ChIC–seq data processing and analysis

Sequencing yielded H3K9me3 ChIC/Rep-ChIC datasets (four independent replicates), CTCF Rep-ChIC datasets, FANCD2 Rep-ChIC datasets and BrdU-IP libraries (all in FASTQ format): raw reads were quality-checked using FastQC. Sequenced reads were aligned to the human reference genome (hg38/GRCh38) using Bowtie2 (ref. ⁶²) with the default parameters. Alignment files were converted to BAM format, sorted and indexed using samtools (v.1.15.1)^63,64. Principal component analysis of the H3K9me3 ChIC/Rep-ChIC datasets was performed on read count matrices derived from the raw BAM files using multiBAMSummary and the plots were done using plotPCA function in deepTools (v.3.5.6)⁶⁵ (Extended Data Fig. 2c) to assess replicate concordance and overall data quality.

Genome-wide signal tracks were generated from BAM files using the bamCoverage function in deepTools⁶⁵ with a mapping-quality filter of MAPQ ≥ 30 to exclude poorly aligned reads. Unless otherwise specified, coverage was calculated in bins of fixed size and normalized to reads per kilobase per million mapped reads (RPKM) (RPKM (per bin) = number of reads per bin/(number of mapped reads (in millions) × bin length (kb)), yielding the raw signal of BigWig tracks for all ChIC/Rep-ChIC and BrdU-IP datasets. For ChIC/Rep-ChIC, peaks were called using SEACR⁶⁶ in norm mode with the relaxed threshold against the corresponding input tracks, and reads were subsequently filtered to those overlapping peak regions for downstream quantitative analyses.

Peak sets were further processed to generate genome browser–compatible formats. Peak were converted to BedGraph and BigWig formats using the bigWigToBedGraph⁶⁷ and bedGraphToBigWig utilities from the UCSC Genome Browser, as well as custom Python wrappers. ChIC/Rep-ChIC signal profiles over selected loci were visualized using the Python package gtracks (v.1.12.6), together with custom Python scripts (Figs. 1c and 3b and Extended Data Fig. 2d). Normalized BigWig tracks and the corresponding peak BED files were used as input for all visualizations to ensure consistent scaling across samples (Supplementary Table 9).

Tornado (heat map plus aggregate) plots of ChIC/Rep-ChIC signal were generated in two steps. First, significant peaks were called genome-wide using MACS3⁶⁸ (function bgpeakcall). For global tornado plots (Fig. 1d and Extended Data Fig. 2e,f), the most significant peaks were selected and signal matrices centred on peak summits were computed from scaled BigWig files using the computeMatrix function in deepTools⁶⁵ (reference-point mode, bin size 50 bp). The resulting matrices were then visualized as heat maps using plotHeatmap, which simultaneously reports the corresponding average signal profiles. In this context, tornado plots refer to heat maps in which each row represents a fixed-width window around a genomic feature (for example, a peak summit), rows are ordered by signal intensity, and an aggregate profile across all rows is shown below the heat map (Fig. 1d and Extended Data Fig. 2e).

HU- or APH-unique peaks enriched for BrdU were defined in a differential peak-calling framework. HU- or APH-unique peaks were then defined as regions present in the HU or APH peak sets but absent from the UT peak set at the same calling thresholds using custom Python codes. The peaks overlapping with BrdU signal were then identified to obtain the replication-induced de novo peaks using the bedtools intersect function. Tornado plots comparing UT versus HU or UT versus APH were generated by providing the condition-specific BigWig files and these unique peak sets to computeMatrix (reference-point mode, bin size 50 bp), followed by visualization with plotHeatmap (Fig. 1e and Extended Data Fig. 2g).

For fountain-centred analyses (Fig. 2d), HU-unique peaks from UT, HU and HU + G9ai samples were intersected with predefined fountain regions using standard BED intersection tools. Peaks overlapping IZs or TZs within fountain regions were classified accordingly. ChIC/Rep-ChIC tornado plots were then generated separately for IZ-associated and TZ-associated fountains using computeMatrix and plotHeatmap as described above, enabling direct comparison of replication-associated chromatin changes at IZs versus TZs.

Subtraction and coverage-based operations on genome-wide tracks were performed using the bigwigCompare and bigwigAverage functions from deepTools (v.3.5.6) on SEACR peak-called BigWig files. Whole-genome heat maps of HU–UT differences in H3K9me3 ChIC/Rep-ChIC signal were plotted over SEACR peak-defined regions using custom Python scripts (Extended Data Fig. 3d). For each genomic position within the analysed regions, aggregated values across samples or conditions were calculated and visualized as heat maps (Fig. 3h and Extended Data Fig. 8f–h).

To analyse H3K9me3 enrichment at replication IZs, IZ coordinates for MRC5 cells were derived using the OKseqHMM pipeline⁶⁹ applied to TrAEL-seq data. Initial signal coverage across IZs and their flanking regions was computed from normalized H3K9me3 BigWig files using the computeMatrix function in deepTools (v.3.5.6)⁶⁵ with a bin size of 500 bp, and enrichment profiles were visualized using custom Python scripts (Extended Data Fig. 3f). IZs were subdivided into early- and late-replicating zones on the basis of replication timing profiles obtained from in-house two-stage Repli-seq experiments in MRC5 cells, allowing H3K9me3 enrichment to be compared between early and late IZs under different replication stress conditions.

TrAEL-seq data processing

TrAEL-seq data were processed and mapped to the human reference genome GRCh38 using the revised pipeline available at GitHub (https://github.com/laurabiggins/TrAEL-seq)^14,61. This pipeline includes trimming of the poly(T) tail, separation of multiplexed sets, separation into T and no-T datasets, copy-number-aware UMI deduplication and mapping with Bowtie2. The resulting data were analysed using the OKseqHMM toolkit (originally developed for Okazaki fragment sequencing), which has been recently extended to other RFD mapping methods (such as TrAEL-seq) OKseqHMM (v.2.0)⁶⁹; the OKseqHMM toolkit separates reads by Watson or Crick strand and computes the RFD in 1 kb non-overlapping windows, defined as (C – W)/(C + W) where C and W are read counts on the Crick and Watson strands, respectively. The raw RFD signal was further smoothed using a 15 kb sliding window (1 kb step). Genomic regions with insufficient coverage (below the default read-depth threshold in OKseqHMM) were masked. Smoothed RFD profiles were segmented using the toolkit’s built-in four-state hidden Markov model to define replication IZs (up segments), TZs (down segments) and two intermediate flat states. Where indicated, origin efficiency metrics were also calculated at multiple scales for visualization. Biological replicates were processed independently through this pipeline and used for downstream analysis. These RFD profiles and segmentations were used for comparative analyses and visualization in genome browsers.

To calculate fork pausing, the signal in the UT condition was subtracted from the HU-treated condition. Meta-analysis was then performed, either retaining strand-specific information (Fig. 3j and Extended Data Fig. 8i) or combining the strands (Fig. 3i) as needed.

Repli-seq data processing

Repli-seq reads were mapped to the hg38 reference genome using Bowtie2 (ref. ⁶²) with the default parameters, and alignments were filtered to retain only uniquely mapped, properly paired reads. For each sample, genome-wide coverage was computed, and the replication timing profile was derived as the log₂-transformed ratio of early S phase to late S phase read counts in non-overlapping 5 kb windows. All initial processing (alignment, filtering, windowed coverage calculation) was performed using custom Python scripts and standard command-line tools. The resulting raw replication timing profiles were then post-processed in R by applying loess smoothing with a span of ~300 kb, following the recommendations of the original Repli-seq protocol. This pipeline produced a smoothed bedGraph file for each sample, containing final log₂[early/late] replication timing values for each 5 kb window. The bedGraph was subsequently converted to a bigWig format for efficient visualization and analysis (using UCSC command-line tools). These smoothed replication timing tracks (bedGraph and bigWig files) were used for all downstream visualizations and quantitative analyses involving replication timing data.

Hi-C and Rep-Hi-C data processing

In situ Hi-C libraries (and Rep-Hi-C libraries, where specified) were processed using the Juicer pipeline (v.1.6)⁷⁰ with the default parameters, and the merged_sort.txt was filtered for MAPQ score greater than equal to 30 but retaining all reads (including duplicates). This resultant file was used to generate .hic files for all downstream analysis. Juicer produced raw contact count matrices at multiple resolutions (1 Mb, 100 kb, 50 kb, 25 kb, 10 kb and 5 kb) from the FASTQ data. We first ran MultiQC (v.1.11)⁷¹ on the aligned reads for quality control, extracting summary statistics such as the number of short-range (≤20 kb), long-range (>20 kb) cis interactions and trans interactions for each sample. Next, we used HiCExplorer (v.3.7.2)^72,73 to convert the contact matrices into the cooler format: each sample’s contact data were consolidated into a single multi-resolution .mcool file (with all resolutions listed above), and individual .cool files were generated for key resolutions (for example, 10 kb and 50 kb). Hi-C contact matrices were visualized interactively using Juicebox (v.2.17)⁷⁰ at 1 Mb and 100 kb resolution to inspect data quality and global interaction patterns. For further analysis, we used the cooltools library¹⁷ with custom Python scripts to handle the cooler files for operations such as normalization, expected contact calculation and visualization.

Hi-C and Rep-Hi-C compartment analysis

A/B compartment analysis was performed on Hi-C and Rep-Hi-C contact matrices using principal component analysis of contact correlations. We computed the first eigenvector (EV1) at 50 kb resolution on Knight–Ruiz-normalized contact matrices using cooltools¹⁷. The genomic GC content profile was used to determine the orientation of EV1 (that is, the sign was flipped if necessary, so that positive EV1 values correspond to A compartments, which are typically gene rich and GC rich). Contiguous genomic regions with the same EV1 sign were defined as A or B compartment domains. To identify compartment switching, we compared the EV1 sign of each 50 kb bin between conditions. Any bin that exhibited an opposite EV1 sign in HU-treated cells compared with UT cells was classified as a switching bin (as a control, we also assessed compartment changes between Hi-C and Rep-Hi-C in the same condition, UT versus HU, to ensure that differences were treatment specific rather than technology specific). We calculated the proportion of the genome undergoing compartment switching as the percentage of non-masked 50 kb bins that changed compartment status between UT and HU.

To visualize how HU treatment affected compartmental contacts, we generated differential contact maps at 50 kb resolution by taking the log₂-transformed ratio of Knight–Ruiz-normalized contact frequencies in HU-treated versus UT cells. In these difference maps, red and green colouring indicates increased or decreased contact frequency in HU relative to UT, respectively (the colour scale was symmetric around zero, from red to green) (Fig. 2b, Extended Data Fig. 6a and Supplementary Fig. 2).

We also performed saddle plot analysis using cooltools¹⁷ to quantify genome-wide compartment interactions. For each sample, 50 kb bins were ranked by their EV1 values and binned into percentiles (excluding the extreme 2.5% of bins at each end of the EV1 distribution, as recommended by the cooltools¹⁷ pipeline, to avoid outliers). We then calculated an O/E contact matrix stratified by these EV1 percentiles for each chromosome arm. This resulted in a 50 × 50 saddle plot in which the diagonal (top-right to bottom-left) represents contacts among bins with similar compartment identity (A–A or B–B), and the off-diagonal represents A–B contacts. Saddle plots were generated for Hi-C and Rep-Hi-C in both UT and HU conditions (Fig. 2a and Extended Data Fig. 5b). From the saddle plot, we quantified compartment interaction strength by measuring the enrichment of contacts in the A–A and B–B regions. Specifically, we calculated the area within the square of strong A–A and B–B interactions (indicated by a dotted line on the saddle plot) using custom scripts, and used this as a metric of compartmentalization strength. To examine changes after HU treatment, we also created difference saddle plots by subtracting the UT saddle values from the HU saddle values (for both Hi-C and Rep-Hi-C). These difference plots directly highlight how HU alters A–A vs. B–B contact frequencies (with warmer colours indicating an increase in compartmental contacts and cooler colours a decrease, relative to UT).

Subcompartment analysis

Finer-scale subcompartment analysis was performed using CALDER (calibrated automatic local detection of subcompartments)⁷⁴ on the 50 kb binned contact data. CALDER⁷⁴ identified multiple subcompartment states hierarchically. In our analysis, we grouped the genome into eight subcompartments: starting from the conventional A and B compartments, each was divided into two subclasses (A1, A2 and B1, B2), and each subclass was further split into two (yielding A1.1, A1.2, A2.1, A2.2 and B1.1, B1.2, B2.1, B2.2). We took the first eight classifications from the CALDER⁷⁴ subcompartment tree. We then examined how subcompartment assignments changed with HU treatment. A Sankey diagram⁷⁵ was generated (Fig. 5c) to visualize the transitions of genomic regions between subcompartment states in UT versus HU, using custom Python scripts. This illustrated the differential compartment shifts induced by replication stress (HU) in both Hi-C and Rep-Hi-C datasets (Extended Data Fig. 5d).

TAD analysis

TADs were identified in the Hi-C and Rep-Hi-C contact matrices using the hicFindTADs function from HiCExplorer (v.3.7.2) with the default parameters. This algorithm produced a set of TAD coordinates (output as a BED file of domains). For downstream analyses, we classified each TAD as early-replicating or late-replicating on the basis of the overlap with replication timing data. Specifically, if ≥80% of a TAD’s genomic span coincided with regions labelled as early S phase in our replication timing profile, it was designated an early TAD; if ≥80% overlapped late S phase regions, it was designated a late TAD (TADs that did not meet either criterion was left unclassified). The replication timing profiles used for this classification were derived from our Repli-seq experiments in MRC5 cells. We also calculated insulation scores and boundary strength metrics using cooltools functions. These metrics were mapped to the TAD boundary coordinates obtained from hicFindTADs to assess whether HU treatment or other conditions affected TAD boundary strength, and whether there were differences in insulation at boundaries between early and late replicating TADs.

Aggregate analysis of fountains

We used the fountain-calling pipeline (the fun pipeline as described previously¹⁶) to identify fountains, which are specific interaction patterns in contact maps, in both UT and HU-treated Hi-C/Rep-Hi-C samples. This analysis was conducted at the 10 kb and 25 kb resolutions. The fountain-calling pipeline outputs the midpoint coordinates of all detected fountain regions. We merged the midpoints from all fountains identified (combining 10 kb and 25 kb resolution results from both UT and HU) using bedtools merge, yielding a unified set of fountain centre positions (referred to as the merged fountain file).

Using this merged list of fountain coordinates, we performed aggregate contact analysis to compare fountain features across different conditions (UT, HU, UT + G9ai, HU + G9ai). We applied the on-diagonal pileup function of cooltools (as described previously¹⁷) to aggregate the Hi-C or Rep-Hi-C contact matrices around each fountain centre. In these pileups, we extracted a square region of the contact matrix centred on the fountain midpoint (extending ±250 kb in both dimensions) and averaged the contact signal over all fountain locations genome-wide. For Rep-Hi-C samples, we computed O/E contact values using the corresponding Hi-C maps as the expected background, before aggregation. The result of this analysis is an average fountain contact map for each condition, reflecting the typical interaction pattern around fountain loci. To highlight differences between conditions, we also computed difference maps (for example, HU minus UT) of the aggregated fountain contact matrices.

Furthermore, to quantitatively compare fountain signals between conditions, we generated summary box plots from the aggregate difference matrices. For each aggregated fountain difference map, we flattened the matrix (taking each pixel’s O/E difference as an independent data point) and plotted the distribution of these values. This approach provides a genome-wide view of how fountain interaction strengths change under different treatments. All box plot generation and statistical analyses were performed using custom Python scripts (Fig. 2e,f and Extended Data Fig. 6d,f).

Corner plot analysis

In addition to the on-diagonal fountain pileups, we performed a complementary corner-plot aggregate analysis. This analysis focused on off-diagonal corner features of the contact matrix around fountain loci (for example, to capture any directional or asymmetric interaction patterns flanking the fountain centres). Using the same merged fountain list, we aggregated contact matrices with cooltools¹⁷ in a manner similar to the fountain analysis, but capturing the corner regions relative to each fountain midpoint. As with the fountain aggregate, we used a ±250 kb window for the corner pileup around each fountain and averaged across all fountains to produce an aggregate corner interaction map for each condition (UT, HU, UT + G9ai, HU + G9ai). We then computed difference maps between conditions for the corner aggregates. To summarize these differences, we flattened the aggregate corner difference matrices (taking each pixel’s value as a datapoint) and generated box plots, analogous to the fountain analysis above. This corner plot analysis allowed us to assess subtle changes in the spatial interaction pattern (off-diagonal signals) around fountain sites under different treatments. All analyses for corner plots were implemented with custom Python code using cooltools and standard libraries for plotting (Fig. 2g,h).

Aggregate analysis of chromatin loops

Chromatin loops were identified using the HICCUPS CPU⁷⁰ pipeline on the contact maps. We ran HICCUPS on the .hic files generated by Juicer at the 10 kb and 25 kb resolutions to call loop anchors in each condition. To determine which loops were specific to HU treatment, we compared loop sets between HU-treated and UT samples. HU-unique loops were defined by using the pairToPair utility from BEDTools (v.2.31.0) with the -neither option, which finds loop anchors present in HU that have no overlapping counterpart in the UT condition. We further assessed the reproducibility of HU-specific loops by intersecting the loop lists from biological replicate experiments (requiring at least 80% overlap in anchor positions, using bedtools intersect with -f 0.8). Only highly reproducible HU-specific loops were retained for downstream analyses.

We next performed APA on the sets of chromatin loops to evaluate their average contact enrichment. Using cooltools off-diagonal pileup functions, we generated APA plots for loops in MRC5 and HCT-116 cells (Fig. 3f,g). For each set of loops, we extracted a 10-kb-resolution contact submatrix centred on each loop (±250 kb padding around the loop centre) and averaged these submatrices to produce a mean contact enrichment map. This aggregate contact map highlights the typical loop peak (the high interaction frequency at the loop anchor pair) over local background interactions. We quantified the loop strength by comparing the contact intensity at the central loop pixel to the surrounding local background in the APA matrix. Specifically, loop strength was calculated as the ratio of the average contact frequency in the central 1–2 bins (around the loop anchor intersection) to the average contact in a surrounding donut-shaped area that represents background. These calculations were done for each cell line (MRC5, HCT-116), experiment type (Hi-C versus Rep-Hi-C), and condition (UT versus HU) using identical parameters, to allow direct comparisons. The numerical loop enrichment values are indicated on the APA plots (top right corners) for reference.

To investigate the binding orientation of CTCF at HU-specific loops, we analysed CTCF motifs at the loop anchors. We scanned the sequence of each HU-unique loop anchor for the CTCF binding motif using the JASPAR core vertebrate position weight matrix. For each anchor, we retained the strongest motif hit that overlapped a peak of CTCF enrichment in our HU Rep-ChIC dataset (CTCF Replicative-ChIC in HU-treated cells). Using the Logomaker Python package⁷⁶, we generated sequence logos of these retained CTCF motifs to verify their consensus. We then determined the relative orientation of CTCF motifs at each loop: if both anchors of a given loop carried motifs oriented in the same genomic direction (for example, both on the positive strand pointing the same way), the loop was classified as tandem; if the two anchor motifs were oriented towards one another (convergent orientation), the loop was classified as convergent (loops for which one or both anchors lacked a confidently mapped CTCF motif with HU-specific enrichment were excluded from this orientation analysis). We tallied the proportion of HU-specific loops with tandem versus convergent CTCF motif arrangements and plotted the percentages (Fig. 3e). This analysis aimed to determine whether HU-induced loops were predominantly supported by convergent CTCF binding (as is often the case for stable loops) or if other configurations were common under replication stress.

Fork-deg-seq processing

Fork-deg-seq libraries generated in HCT-116 cells were aligned to the human genome (hg38/GRCh38) using Bowtie2 (with the default settings for end-to-end alignment). Sequencing yielded paired-end reads (Illumina), which were subjected to quality control using FastQC⁷⁷. The resulting SAM files were converted to BAM, sorted and indexed using Samtools (v.1.15.1)⁶⁴. To account for background, each condition had a matched input control; we generated coverage tracks using deepTools bamCoverage (v.3.5.6), normalizing the signal in each sample against its corresponding input (–control BAM option) to produce a net enrichment bigWig track for Fork-deg-seq signal.

Fork-deg-seq data analysis

Downstream analysis of the Fork-deg-seq signal focused on the enrichment of fork degradation at HU-induced loop sites and its relationship with replication timing and DNA damage markers. First, to obtain a clean fork degradation signal track, we subtracted the input signal from each sample using deepTools (through the bigwigCompare function with –operation subtract). The resulting difference signal (sample minus input) was exported to bedGraph format using the UCSC utility bigWigToBedGraph. We filtered this bedGraph to retain only positive values (genomic regions with genuine fork degradation enrichment above the background), then converted it back to a bigWig file using bedGraphToBigWig. This processed Fork-deg-seq track represents the specific signal of nascent-strand degradation.

For aggregate analyses, we focused on the set of HU-specific chromatin loops (as defined in the loop analysis above). We used deepTools computeMatrix -scale-regions to extract Fork-deg-seq signal profiles centred on each loop. Signals were computed for a window spanning each loop anchor region with 5 kb flanking on each side; a bin size of 50 bp was used for the 5 h HU treatment samples, whereas a bin size of 500 bp was used for 3 h HU samples. This produced a matrix of Fork-deg signal values around each loop, which we then averaged across all loops to get an overall Fork-deg-seq enrichment profile over HU-specific loops.

We stratified this analysis by replication timing: using the HCT-116 replication timing data (dataset ID Int90617792)^9,13,78,79, we classified each fragile sites as early or late if at least 80% of the fragile site span fell in early replicating or late-replicating regions, respectively (fragile sites not meeting either threshold was excluded from this timed subset). We then computed separate aggregate Fork-deg-seq signal for early loops and late sites. The resulting profiles were plotted to compare fork degradation at early- versus late-replicating fragile site regions (Extended Data Fig. 10c–e). For the RPE-1 WT and shBRCA2 samples, the Fork-deg-seq enrichment matrices (computed as described above) were imported into Python, and the average profiles were plotted with error bars representing the standard error across loops (Extended Data Fig. 9j).

We next examined how the density of HU-specific loops in the genome relates to replication timing and fork degradation levels. We calculated a loop density track by counting the number of HU-specific loops per genomic interval. Specifically, we used bedtools genomecov -bga on the HU-specific loop coordinates to produce a base-by-base coverage track of loop occurrence, then identified contiguous loci with the same loop count (for example, regions with 0 loops, 1 loop, 2 loops). We considered a loop to cover a genomic region if it covers at least 70% of the bin. For each such locus, we determined the mean replication timing value by overlapping the locus with the replication timing data and averaging the timing signal within it. We grouped loci into two categories: those with ≥2 loops and those with <2 loops and plotted the distribution of replication timing values for each group as a histogram. This enabled us to see whether regions with high loop density tend to replicate earlier or later than regions with few or no loops (Fig 4c).

Finally, we integrated additional genomic datasets to provide context for the HU-specific loops and fork degradation sites (Supplementary Figs. 6e and 8 and Extended Data Fig. 10i). We gathered genome-wide data for FANCD2 binding (Fanconi anaemia protein, from chromatin immunoprecipitation–sequencing (ChIP–seq); NCBI SRA: PRJNA473287), γH2AX (a marker of DNA damage, from ChIP–seq; Gene Expression Omnibus (GEO): GSE60395), CTCF binding (from our Rep-CUT&RUN experiments), and genomic instability sites (single-nucleotide variants (SNVs) from a previous study⁵²).

Focusing on the HU-specific loops, we first aligned these datasets relative to loop anchor positions. To do this, we intersected loop anchor coordinates with CTCF peak locations (using bedtools) and identified the nearest CTCF binding site to each loop anchor (this helps to align signals because CTCF often demarcates loop anchors). We then used deepTools⁶⁵ to extract the signal intensity of each dataset (FANCD2, γH2AX⁵², Fork-deg-seq, CTCF and SNVs) in regions around the loop anchors, anchored at the nearest CTCF site. These signals were compiled into a matrix and plotted as a heatmap, with each row representing a loop (ordered by some criterion, for example, signal strength or genomic location) and the columns spanning the region around the anchor. The heat maps therefore display the co-occurrence of fork degradation, DNA damage (γH2AX), repair factors (FANCD2) and CTCF at HU-induced loop sites, providing insights into the molecular context of these structures. Moreover, as illustrative examples, we selected specific genomic regions on chromosome 16 and chromosome 8 and plotted the Fork-deg-seq signal tracks for all conditions (WT, CTCF knockdown, G9ai, DKD, with and without nucleases and shBRCA2) across those regions (Fig. 4d and Extended Data Figs. 9j and 10b). These custom Python-generated line plots highlight how fork degradation signals change in the presence of replication stress and different genetic perturbations, in relation to the local replication and looping landscape.

scEdU-seq data analysis

Filtering of scEdU-seq-positive cells and reads

Cells were filtered on the basis of their average counts per 100-kb genomic bin, applying lower and upper thresholds of 0.37 and 2.72, respectively. Deviations from Poisson behaviour were defined using a threshold of 0.1. For each cell, an exponential mixture model was fitted to the genomic distances between successive reads per chromosome using the R package flexmix. Reads with a posterior probability >0.5 for their nearest-neighbour distances were retained for S phase ordering.

Construction of S phase ordering

To refine the S phase trajectory, Gaussian kernel smoothing (s.d. = 8,333.333) was applied to the filtered reads, and pairwise overlap coefficients were computed between cells on a per-chromosome basis. These overlap coefficients were converted into distances and averaged across all chromosomes. The resulting distance matrix was embedded in one dimension using UMAP⁸⁰ (umap R package). The embedding process was repeated 100 times, and each resulting UMAP axis was standardized to a z score. To ensure consistent orientation, runs with a mean Spearman rank correlation below 0.85 relative to the others were discarded. Cells were considered robustly ordered if their placements formed a cluster with gaps <0.1 between successive positions and at least 80% of runs falling within the largest cluster.

DNA replication track visualization

For visualization, heat maps of the scEdU-seq signal were generated across single cells aligned along the inferred S phase ordering. A representative genomic region spanning both early- and late-replicating domains (for example, chromosome 2, 10–50 Mb) was used for display. The scEdU-seq signal was binned in 50-kb intervals and normalized to the maximum signal per chromosome, producing a normalized scale from 0 to 1.

Segmentation of single DNA replication forks and replication speed estimation

To identify genomic regions undergoing replication in single-pulse experiments, a two-state hidden Markov model was used to segment the genome into foreground and background regions. Specifically, a hidden semi-Markov model (R package mhsmm) was fitted for each cell, using exponential emission distributions and a gamma sojourn distribution based on inter-read distances. The widths of segmented replication forks were divided by the labelling duration (20 min) to calculate replication speeds (kb per min).

Cancer mutation analysis

The Catalogue of Somatic Mutations in Cancer (COSMIC) database⁸¹ release v95 annotated to hg38 was used to identify mutation burden at HU-unique loops in BRCA2 or BRCA1 proficient and deficient tumours. The mutation burden at HU unique loop anchors was then analysed as described previously⁸².

Genome-wide correlation

Genome-wide correlation plots were generated using Circos⁸³ through the Galaxy platform⁸⁴ correlation analysis was performed by dividing the genome into 1 Mb bins. Each bin was binarized according to whether it colocalized with HU-unique loops (1) or not (0). For each bin, the average signal or presence/absence of the second feature was calculated. Correlations between features were quantified using Spearman’s rank correlation to account for potential non-linear relationships. Both the standard Spearman correlation (rho) and a permutation-based Spearman test robust to ties were computed. To assess differences in the distribution of the second feature between bins overlapping HU-unique⁸⁵ loops and bins not overlapping, a Wilcoxon rank-sum test was applied. All analyses were performed in R Studio (v.2024.09.0) using base functions and the coin package⁸⁶ for permutation-based tests. Reported P values are two-sided, and significance was considered at P < 0.05.

Data visualization and statistical methods

Box plots

For ChromStretch analyses (Fig. 1b and Supplementary Fig. 7c,d), DNA FISH (Fig. 3a and Extended Data Fig. 7d), EM (Fig. 5d,e) and PLA (Extended Data Figs. 4d,e and 8d), box plots were generated with the box spanning the interquartile range (IQR; 25th–75th percentiles) and the median is indicated by the centre line. The whiskers represent the 10th and 90th percentiles, and datapoints outside this interval are shown individually as outliers.

All remaining box plots (Fig. 2f,h, Extended Data Figs. 7d,e,h, 9d,e and 10c–e,i,j and Supplementary Figs. 4b,c and 8a) were generated using standard Tukey parameters: the box limits represent the IQR, the median is marked by the centre line and the whiskers extend to the most extreme values within 1.5 × IQR from the quartiles, and points beyond this threshold are plotted as outliers.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.