Probing condensate microenvironments with a micropeptide killswitch

Generation of DNA constructs

A list of all of the oligonucleotides that were used to generate plasmids and used for quantitative PCR with reverse transcription (RT–qPCR) in this study is provided in Supplementary Table 1.

Generation of mammalian expression vectors for expression of GFP-fused HMGB1 and NPM1

Mammalian expression vectors pRK5-mEGFP-HMGB1-WT, -mutant and mutant-patchless were prepared in a previous study³¹ (Addgene, 194548, 194550 and 194553, respectively). To generate fsHMGB1-full-length with KS variants C16-to-A, M-to-E&D and 3F-to-E&D (Fig. 1e), cDNA fragments were ordered from Twist Biosciences, PCR amplified and assembled into BsrGI + SalI-digested pRK5-mEGFP-HMGB1-mutant plasmid using the NEBuilder HiFi DNA assembly master mix (NEB, E2621). Other KS variants (F-to-A, F-to-G, 0F, F12A, F13A, ΔKS 3F, 11G3F3G, KS Shuffled01–10 and ΔKS-TDP43 HP) were constructed by amplifying pRK5-mEGFP-HMGB1-mutant vector with primers flanking the KS sequence, and KS variant inserts were generated as a single-stranded oligonucleotide that was used to bridge double-stranded, PCR-amplified vector using the NEBuilder HiFi DNA assembly master mix.

To generate PiggyBac carrier vectors, the N-terminally eGFP-tagged coding sequence of human NPM1 (Addgene, 131818)⁶¹ was cloned into the backbone of the inducible Caspex expression vector (Addgene, 97421)⁶² linearized by restriction digest with NcoI (NEB, R0193) and BsrGI (NEB, R3575). The eGFP sequence was cloned from the same inducible Caspex expression vector. The KS variants were introduced into the C terminus of NPM1 through the overhang regions of the reverse PCR primers (Extended Data Fig. 2e).

pRK5-NPM1-WT and -KS plasmids were constructed by amplifying eGFP-NPM1-WT and -KS DNA fragments from PiggyBac carrier vectors described above and the fragments were assembled into AgeI + SalI-digested (NEB, R3552 and R0138, respectively) pRK5 backbone (mEGFP-HMGB1-WT) using the NEBuilder HiFi DNA assembly master mix.

pRK5-mEGFP-NPM1-TDP43-HP plasmid was constructed by amplifying the NPM1 sequence from Addgene plasmid 131818 and cloned using the NEBuilder HiFi DNA assembly into a backbone of the pRK5-meGFP (Addgene, 18696)⁶³ linearized by restriction digest with BsrGI and AccI (NEB, R0161). The TDP43-HP sequence (amino acids 318–343) was introduced into the C terminus of NPM1 through the overhang regions of the reverse PCR primer (Supplementary Fig. 2f).

GFP-nb–KS expression vectors

The initial design of the GFP-nb expression constructs comprised mCherry fused to SV40-NLS-(GGGGS)×2-linker-GFP-nb-(GGGGS)×2 linker, followed by 0–2 repeats of the killswitch peptide, in the pRK5 vector backbone. To generate this, mCherry was amplified from the pET45-mCherry-NPM1 vector (Addgene, 194546)³¹, and cDNA sequences for SV40-NLS-(GGGGS)×2-linker-GFP-nb-(GGGGS)×2-(0–2 repeats of killswitch) were ordered from Twist Biosciences. The LaG-16 variant of GFP-nb³⁵ (Addgene, 128788)⁶⁴ was used in this study. DNA fragments were PCR-amplified and assembled into AgeI + SalI-digested pRK5-mEGFP-HMGB1 plasmid (Addgene, 194550) using the NEBuilder HiFi DNA assembly master mix.

As expression of the mCherry–GFP-nb fusion protein had unintended effects on nucleoli (Extended Data Fig. 3a), two cleavage sites (P2A and T2A) were introduced between mCherry and GFP-nb to ensure minimal levels of fusion protein while enabling the use of mCherry fluorescence as a reporter for the GFP-nb expression level. Unless stated otherwise, the nanobody construct with cleavable mCherry was used in experiments using GFP-nb. Moreover, an HA tag was introduced between T2A and SV40-NLS so that localization of GFP-nb into the target condensate could be verified by immunofluorescence (Extended Data Fig. 3c). To generate these expression vectors, the pRK5-mCherry-GFP-nb vector was PCR-amplified using primers that introduce an HA tag before SV40-NLS, and an insert containing the GSG-P2A-GSG-T2A sequence was generated as a single-stranded oligonucleotide that was used to bridge the double-stranded, PCR-amplified pRK5-mCherry-GFP-nb vector using the NEBuilder HiFi DNA assembly master mix. pRK5-mCherry-P2A-T2A-HA-tag-SV40-NLS-GFP-nb-1×KS, -2×KS, -1×KS_F-to-A and -1×KS_F-to-G vectors were generated by amplifying the mCherry-P2A-T2A-HA-tag sequence from the vector described above and GFP-nb variant sequences from the previously prepared pRK5-mCherry-GFP-nb vectors, and the fragments were assembled into the AgeI + SalI-digested pRK5 vector backbone using the NEBuilder HiFi DNA assembly master mix. Vectors with 2×KS_F-to-A and 2×KS_F-to-G variants were generated by amplifying the vector with primers flanking the KS sequence; the 2×KS_F-to-A and 2×KS_F-to-G sequences were generated as single-stranded oligonucleotides that were used to bridge the double-stranded, PCR-amplified vector using the NEBuilder HiFi DNA assembly master mix. For experiments involving mCherry–RPL18 and mCherry–SURF6 (Fig. 3g and Supplementary Fig. 5), the mCherry reporter in the GFP-nb construct was replaced by TagBFP. To generate this vector, the mCherry and GFP-nb sequences were cut out by digestion with AgeI + SalI, and the TagBFP and P2A-T2A-HA-tag-SV40NLS-GFP-nb sequences were amplified by PCR and assembled back into the pRK5 backbone using the NEBuilder HiFi DNA assembly master mix. The TagBFP sequence was a gift from the E. Schulz laboratory.

Killswitch expression vectors with other nanobodies

Sequences for nanobodies against V5-tag (Addgene, 201475)⁶⁵, ALFA-tag (Addgene, 159986)⁶⁶ and VHH05-tag (Addgene, 171570)⁶⁷ were ordered as synthetic DNA from Twist Biosciences. mCherry-P2A-T2A-SV40NLS-linker sequences were cloned from the previously generated GFP-nb vector and assembled with the nanobody sequences using the NEBuilder HiFi DNA assembly master mix. Linker and killswitch sequences were introduced at the C termini of the nanobodies through the reverse PCR primers.

V5-tag, ALFA-tag or VHH05-tag was fused to the N terminus of eGFP–NPM1 by introducing sequences into the forward cloning primers when cloning eGFP–NPM1 from PiggyBac carrier vector into AgeI- and SalI-digested pRK5 backbone.

pUC19 repair templates

Repair templates with msfGFP flanked by 550–900 bp homology arms in the N termini of NPM1, TCOF1, HP1α, EWSR1 and FUS, and the C terminus of SRRM2 were generated in the pUC19 vector backbone (NEB, N3041S). msfGFP was amplified from the previously generated pRK5-mEGFP-HMGB1 plasmid (Addgene, 194550). DNA fragments for the 5′ and 3′ homology arms of NPM1, HP1α, SRRM2, EWSR1 and FUS were ordered from Twist Biosciences. The homology arms of TCOF1 were amplified from the gDNA of HCT-116 cells. The repair template of each target gene was assembled into pUC19 vector digested with SalI + KpnI (NEB, R3142) or HindIII + BamHI (NEB, R0104 and R0136, respectively) using the NEBuilder HiFi DNA assembly master mix.

gRNA–Cas9 expression vectors

gRNAs targeting the N or C termini of target genes were cloned into the sgRNA-Cas9 expression vector pX458 (Addgene, 48138)⁶⁸, in which eGFP was replaced with mCherry. The pX458 backbone was amplified as three separate fragments, the T2A-mCherry fragment was amplified from Addgene plasmid 161974⁶⁹, and the fragments were assembled to generate the pX458-mCherry vector using the NEBuilder HiFi DNA assembly master mix. Guide RNAs were cloned into pX458-mCherry vectors using DNA oligos (0.1 nmol) that were first phosphorylated for 30 min at 37 °C in T4 DNA ligase reaction buffer (NEB, M0202) with T4 polynucleotide kinase (NEB, M0201) in a total volume of 10 µl and then annealed by incubation at 95 °C for 5 min followed by cooling at room temperature for 20 min. The oligo duplex was diluted 1:200 and 1 µl of the diluted duplex was ligated into 50 ng of BbsI-digested (NEB, R0539) pX458-mCherry plasmid using T4 DNA ligase (NEB, M0202).

Generation of GFP-fusion-oncoprotein expression vectors

For expressing the N-terminally mEGFP-tagged fusion oncoprotein constructs (Fig. 4a), the fusion partner sequences of each of the fusion oncoproteins NUP98::HOXA9, NUP98::DDX10, SS18::SSX1, SS18::SSX2, NONO::TFE3, YAP::MAMLD1, TAZ::CAMTA1 and PAX::FOXO1 were amplified from U2OS cDNA. The cDNA was first amplified using primers without overhangs unless stated otherwise in Supplementary Table 1. The PCR products were then used as DNA templates for amplification with primers containing overhangs for the Gibson assembly reaction. EWS::FLI1 was cloned from Addgene plasmid 102813 (ref. ⁷⁰). CRTC1::MAML2 was cloned from Addgene plasmid 154265 (ref. ⁷¹). NUP98::DDX10 Mut2 was cloned from synthesized NUP98 sequences from Twist Biosciences that were joined to the DDX10 sequence by Gibson assembly. The KS sequence was introduced through the overhang of the reverse PCR primers. The fusion oncoproteins were then cloned using NEBuilder HiFi DNA assembly into the backbone of pRK5-meGFP (Addgene, 18696) linearized by restriction digest with BsrGI and AccI.

To generate expression plasmids with eGFP–BRD4::NUT, the eGFP sequence was amplified from pRK5-eGFP-NPM1 plasmid described above and cDNA for BRD4::NUT fusion protein was amplified from Addgene plasmid 171630 (ref. ⁴⁹) as two separate fragments, while generating KS and KS_F-to-G fusions in the C terminus with reverse primers. BRD4::NUT and eGFP fragments were assembled into AgeI + SalI-digested pRK5 vector backbone (pRK5-mEGFP-HMGB1) using the NEBuilder HiFi DNA assembly master mix.

For 1,6-hexanediol experiments, the mCherry-P2A-T2A sequence was fused N-terminally to eGFP–BRD4::NUT constructs by cloning it from previously prepared pRK5-mCh-P2A-T2A nanobody construct.

pRK5-mCherry constructs

To generate N-terminal mCherry fusions to SURF6, RPL18 and P300, mCherry sequence was cloned from previously prepared mCherry-P2A-T2A nanobody vectors and GGGGSGGGGS linker C-terminal to mCherry was introduced in the reverse primer. Coding sequences of SURF6, RPL18 and P300 were amplified from HEK293T cDNA and assembled with mCherry sequence into AgeI + SalI-digested pRK5 backbone using the NEBuilder HiFi assembly master mix.

To generate C-terminal mCherry fusion to CRM1, the mCherry sequence was cloned from previously prepared mCherry-P2A-T2A nanobody vectors and GGGGSGGGGS linker N-terminal to mCherry was introduced. The coding sequence of CRM1 was amplified from U2OS cDNA and assembled with mCherry sequence into AgeI + AccI-digested pRK5 backbone using the NEBuilder HiFi assembly master mix.

Vectors for LacO-LacI tethering assay

The CFP-LacI-MCS plasmid⁷² was digested with BamHI and XbaI, and a stop codon was introduced at the end of the GAPGSAGSAAGGSAIA linker sequence after CFP-LacI using T4 PNK (NEB, M0201S)-phosphorylated primers. CFP-LacI-NUT, -NUT-KS and -NUT-KS_F-to-G vectors were generated by digesting the CFP-LacI-MCS plasmid with AsiSI and BsiWI restriction enzymes; NUT sequences were cloned from pRK5-eGFP-BRD4::NUT vectors and assembled with the CFP-LacI backbone using NEBuilder HiFi DNA Assembly.

Constructs for NUP98::KDM5A experiments

The GFP–NUP98::KDM5A construct was generated by restriction-enzyme-guided removal of the IRES sequence from a plasmid for constitutive expression of NUP98::KDM5A as previously described⁵⁶. For generation of GFP–IRES–NUP98::KDM5A–KS/KS_F-to-A/KS_F-to-G constructs, the KS was amplified from OE_KS/F-to-A/F-to-G-mEGFP and introduced in frame into the GFP–IRES–NUP98::KDM5A vector using Gibson assembly. The TRE3G-mCherry-P2A-nb-KS was cloned into a lentiviral entry vector. The retroviral plasmid pCMV-gag/pol was acquired from Cell Biolabs.

52K expression vectors

To generate mammalian expression plasmids containing C-terminally tagged 52K fusion proteins (Fig. 5d–h and Supplementary Fig. 11d), plasmid backbones were linearized by restriction digest, and complete plasmids reassembled from linearized backbone and DNA fragments using Gibson assembly Master Mix (NEB, E2611) according to the manufacturer’s guidelines. All restriction enzymes were purchased from New England Biolabs and were compatible with digest reactions in rCutsmart buffer (NEB, B6004S). Gene fragments corresponding to all variants of KS were purchased as double-stranded DNA from Azenta Life Sciences using the FragmentGENE service (Supplementary Table 1). For cloning of GFP-tagged fusion proteins (Fig. 5a and Supplementary Fig. 11a), plasmids were assembled from p52K-GFP⁵⁷ linearized by digestion with BsrGI (NEB, R3575) and the corresponding KS gene fragment. For cloning of fusion proteins without GFP tags, p52K⁵⁷ was digested with NheI (NEB, R3131) and SalI (NEB, R0138) restriction enzymes to excise the existing 52K open reading frame and linearize the plasmid backbone. An open reading frame encoding 52K, lacking the stop codon and containing complementary sequence required for Gibson assembly was PCR-amplified using specific primers. A three-fragment assembly was performed using digested backbone, PCR-amplified open reading frame and the corresponding KS gene fragment. The mammalian expression plasmid encoding GFP-tagged minor capsid protein IIIa was previously described⁵⁷.

Generation of DNA constructs for protein purification

For the purification of msfGFP labelled fusion proteins (Fig. 1f, Extended Data Fig. 7 and Supplementary Fig. 1) we amplified the msfGFP sequence from previously generated pRK5-mEGFP-HMGB1 plasmid (Addgene, 194550) for msfGFP, msfGFP–KS, msfGFP–KS_F-to-G and msfGFP–NPM1. The NPM1 sequence was amplified from pHcRed-NPM1wt-C1 plasmid (Addgene, 131818). The amplified gene fragments were cloned into a pET22b-backbone (Addgene, 166439) linearized by restriction digest with PmlI and BsrGI, using the NEBuilder HiFi DNA assembly master mix.

Generation of DNA constructs for zebrafish experiments

A plasmid containing full-length Nanog-mNeonGreen was generated previously⁵⁸. The KS and KS_F-to-G control peptide sequences were cloned into the C-terminal end of Nanog-mNeonGreen. These plasmids were used as templates to synthesize RNA using the SP6 mMessage mMachine in vitro transcription kit (Invitrogen, AM1340) according to the manufacturer’s instructions.

Cell culture

Cells were cultured under standard conditions (37 °C and 5% CO₂) in sterile, TC-treated, non-pyrogenic, polystyrene tissue culture dishes (Corning). U2-OS (ATCC, HTB-96), HEK293T (ATCC, CRL-3216), HCT-116 (ATCC, CCL-247), MCF7 (ATCC, HTB-22), C2C12 (ATCC, CRL-1772), Lenti-X 293T (Takara Bio, 632180) and A673 (gifted by H. Kovar; CLS, 300454) cell lines were cultured in DMEM GlutaMAX (Gibco, 31966047). HAP1-SRRM2^tr0 (ref. ³⁷) and TC71 (DSMZ, ACC516) cells were cultured in IMDM (Gibco, 12440053). H3122 (CLS, 300484) and 1765-92 (gifted by P. Åman) cells were cultured in RPMI 1640 Medium (Thermo Fisher Scientific, 61870036). All media included 10% FBS (Gibco, 10438-026) and 100 U ml⁻¹ penicillin–streptomycin (Gibco, 15140148).

For experiments involving expression of 52K and KS variants, HEK293 (ATCC, CRL-1573) and HEK293T (ATCC, CRL-3216) cells were grown in DMEM (Corning, 10-013-CV) supplemented with 10% FBS (VWR, 89510-186) and 1% penicillin–streptomycin (Gibco, 15140-122).

V6.5 mES cells were cultured on irradiated primary mouse embryonic fibroblasts, previously seeded on 0.2% gelatin-coated plates, in KO DMEM (Gibco, 1082901) containing 15% FBS (Gibco, 10438-026), 100 U ml⁻¹ penicillin–streptomycin, 1× non-essential amino acids (Gibco, 11140050), 0.05 mM β-mercaptoethanol (Gibco, 21985023) and laboratory-purified recombinant leukaemia inhibitory factor (LIF). The identity of all cell lines was verified using morphological characteristics, but the lines have not been authenticated.

All of the cell lines tested negative for mycoplasma using the LookOut Mycoplasma PCR Detection Kit (Sigma-Aldrich, MP0035) or the PCR Mycoplasma Test Kit II (Applichem, A8994). Mycoplasma testing was performed on 0.2–1 ml of cell culture medium taken from tissue culture dishes containing confluent monolayers of cells on a routine basis at least twice a year.

For experiments involving expression of NUP98::KDM5A and KS variants, mouse fetal liver cells were cultured in DMEM/IMDM (50:50, v/v; Gibco, Life Technologies), supplemented with 10% heat-inactivated FBS (Sigma-Aldrich), 100 U ml⁻¹ penicillin, 100 µg ml⁻¹ streptomycin, 4 mM L-glutamine and 50 µM β-mercaptoethanol (all Gibco, Thermo Fisher Scientific) in the presence of 100 ng ml⁻¹ mSCF, 10 ng ml⁻¹ mIL-3 and 10 ng ml⁻¹ mIL-6 (all PeproTech). Ex vivo-isolated leukaemia cells were cultured in RPMI 1640 (Gibco, Life Technologies), supplemented with 10% FBS, 100 U ml⁻¹ penicillin, 100 µg ml⁻¹ streptomycin, 4 mM L-glutamine, 100 ng ml⁻¹ mSCF and 10 ng ml⁻¹ mIL-3. After 1 week, the medium of the ex vivo-isolated cells was switched to RPMI 1640 supplemented with 10% FBS, 100 U ml⁻¹ penicillin and 100 µg ml⁻¹ streptomycin, 4 mM L-glutamine, 1 mM sodium pyruvate (Sigma-Aldrich), 50 µM 2-mercaptoethanol (Gibco, Thermo Fisher Scientific) and 20 mM 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid (HEPES) (Sigma-Aldrich). Stable cell lines were established by continuous culture for over 4 weeks and the GFP–NUP98::KDM5A cell line was maintained using RPMI medium. Platinum-E (Cell Biolabs), HEK293T and Lenti-X 293T cells (Takara) were cultured in DMEM (Gibco, Thermo Fisher Scientific) supplemented with 10% FBS, 100 U ml⁻¹ penicillin, 100 µg ml⁻¹ streptomycin and 2 or 4 mM L-glutamine, respectively. Mouse leukaemia cells and HEK293T cells expressing nanobody constructs were incubated with doxycycline (Sigma-Aldrich, 24390-14-5) at 1 µg ml⁻¹ in growth medium to induce the expression. For proteasome inhibition (Fig. 4h–j and Supplementary Fig. 9a,b), murine leukaemia cells expressing the nanobody constructs were initially incubated with doxycycline as specified above for 3 h to induce expression. Cells were then treated with the UBA1 inhibitor TAK-243 (MLN7243) (MedChemExpress, HY-100487) at 0.25 µM, the NAE inhibitor pevonedistat (MLN4924) (MedChemExpress, HY-70062) at 1.25 µM and the proteasome inhibitor MG-132 (MedChemExpress, HY-13259) at 2.5 µM and incubated for 3 h before imaging. All cells were cultured at 37 °C under 5% CO₂ and 95% humidity.

Cell-viability assay

A total of 150,000 cells per well was seeded onto a six-well plate and transfected the next day with 500 ng of pRK5-msfGFP-HMGB1-mutant KS variants using FuGENE HD (Promega, E2311) according to the manufacturer’s instructions. GFP-expressing cells were sorted using the FACSAria II instrument (BD) the next day, 10,000 cells per well were seeded onto 96-well plates and the viability was measured 48 h later using CellTiter-Glo 2.0 reagents (Promega, G9242) (Fig. 1c and Extended Data Fig. 2d).

Generation of GFP knock-in cell lines

Generation of repair templates with msfGFP flanked by 550–900 bp homology arms in the N termini of NPM1, TCOF1, HP1α, EWSR1 and FUS, and the C terminus of SRRM2 into pUC19 vectors is described in the ‘pUC19 repair templates’ section. Linear repair template DNA fragments were generated by PCR (a list of the primers is provided in Supplementary Table 1), gel-extracted and purified using the QIAquick gel extraction kit (Qiagen, 28704).

A total of 350,000 HCT-116, U2OS, TC71 and 1765-92 cells per well was seeded onto six-well plates and transfected the next day with 2,400 ng of linearized repair template and 600 ng of pX458-mCherry-gRNA plasmid using Lipofectamine 3000 according to the manufacturer’s instructions. Cells were first selected for mCherry expression using the FACSAria II instrument (BD) 48 h after transfection, cultured for 4–6 days, after which GFP-expressing cells were sorted into single-cell clones onto 96-well plates.

A total of 500,000 mES cells per well was seeded feeder-free. The medium was supplemented with 2× LIF and the cells were transfected the next day with 4,000 ng of linearized repair template and 1,000 ng of pX458-mCherry-gRNA plasmid using Lipofectamine 3000 according to the manufacturer’s instructions. Cells were first selected for mCherry expression using the FACSAria II instrument (BD) 48 h after transfection and then cultured for 4–6 days on a 6 cm dish with feeder cells. mES cell colonies were hand-picked into 96-well plates with feeder cells.

Homozygous clones for NPM1, TCOF1, SRRM2 and HP1α were selected after verifying successful insertion into both alleles, and the knock-in of the WT allele of EWSR1 was genotyped using the primers listed in Supplementary Table 1. Owing to the complex karyotype of U2OS cells, only heterozygous knock-in of GFP–NPM1 was successful. Finally, the GFP-tagged proteins from the knock-in lines were verified by western blotting. Genotyping data for the cell lines generated in this study are included in Supplementary Figs. 13 and 15.

Generation of doxycycline-inducible eGFP–NPM1 overexpression systems in A673 cells

To generate a doxycycline-inducible overexpression system of eGFP–NPM1, we randomly integrated the coding sequences of NPM1 wild type, KS, KS_F-to-G, KS_F-to-A and KS_0F into A673 cells using the PiggyBac transposon system.

Carrier plasmids (described above) and PiggyBac transposase expression vector (SBI, PB210PA-1) were co-transfected into A673 cells using Lipofectamine 3000 (Thermo Fisher Scientific) according to the manufacturer’s instructions at a molar ratio of 5:1. The transfected bulk population was screened for integration by addition of 2 μg ml⁻¹ puromycin (Gibco) to the cell culture medium 24 h after transfection for a total of 5 days. The surviving cells were then used for experiments (Extended Data Fig. 2e–g).
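The 5:1 molar ratio translates into plasmid masses only once plasmid sizes are known, because a larger plasmid contributes fewer molecules per nanogram. The following is a minimal sketch of that conversion, assuming the ratio refers to carrier:transposase; the plasmid sizes (9,000 bp and 6,000 bp) and the total DNA amount (2,550 ng) are hypothetical placeholders, not values from this study.

```python
def split_mass_by_molar_ratio(total_ng, sizes_bp, molar_parts):
    """Split a total DNA mass between plasmids so that they reach the given molar ratio.

    Mass is proportional to (number of molecules x plasmid length), so each
    plasmid is weighted by size_bp * molar_part.
    """
    weights = [size * part for size, part in zip(sizes_bp, molar_parts)]
    total_weight = sum(weights)
    return [total_ng * weight / total_weight for weight in weights]


# Hypothetical inputs: 9,000 bp carrier, 6,000 bp transposase vector, 2,550 ng total DNA.
carrier_ng, transposase_ng = split_mass_by_molar_ratio(
    total_ng=2550.0, sizes_bp=[9000, 6000], molar_parts=[5, 1]
)
print(f"carrier: {carrier_ng:.0f} ng, transposase: {transposase_ng:.0f} ng")
```

With real plasmid lengths substituted, the same function gives the mass of each plasmid to pipette for any molar ratio.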

LacO–LacI tethering assay

U2OS 2-6-3 cells with a LacO array⁵¹ were seeded on eight-well chamber slides (Ibidi, 80826-90) at a density of 30,000 cells per well and transfected the next day with CFP-LacI (empty control) or CFP-LacI-NUT plasmids using FuGENE HD and 175 ng plasmid per well according to the manufacturer’s instructions. Then, 2 days after transfection, cells were fixed and staining for RNAPII was performed as described in the ‘Immunofluorescence’ section.

Image analysis of the LacO–LacI tethering assay was performed using ZEN Blue v.3.9 software using the zone of influence method. LacI-NUT foci were detected using the CFP signal (click thresholding; ValueLower: 16600; ValueUpper: 65535; Dilate: 1) and background regions were defined as rings surrounding the foci (ring distance: 5; thickness: 6). The mean CFP intensity and the mean AlexaFluor 647 intensity for RNAPII were measured, and enrichment of the RNAPII signal was calculated by dividing the mean signal at foci by the mean signal at the background ring element.
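The enrichment metric is a ratio of two mean intensities per focus. A minimal sketch in Python, assuming that per-focus and per-ring mean intensities have already been exported from the image-analysis software; the function name and the example values are hypothetical and are not part of the analysis pipeline used in this study.

```python
from statistics import mean


def enrichment(focus_mean, ring_mean):
    """Ratio of the mean signal at a focus to the mean signal in its background ring."""
    if ring_mean <= 0:
        raise ValueError("background ring mean must be positive")
    return focus_mean / ring_mean


# Hypothetical (focus, ring) mean AlexaFluor 647 intensities for three foci.
measurements = [(1200.0, 600.0), (900.0, 600.0), (1500.0, 500.0)]
ratios = [enrichment(focus, ring) for focus, ring in measurements]
mean_enrichment = mean(ratios)
print(ratios, mean_enrichment)
```

A ratio above 1 indicates that the RNAPII signal is higher at the tethered focus than in the surrounding ring.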

Transplantation-based models and NUP98::KDM5A cell line generation

GFP–NUP98::KDM5A mouse-model-derived cell lines were established as described previously⁵⁶. In brief, GFP–NUP98::KDM5A cell lines were generated by retroviral co-transduction of MSCV–eGFP–NUP98::KDM5A with MSCV–rtTA3–IRES–Nras(G12D)–EF1a–Luc2 of mouse fetal liver-derived HPSCs (C57BL/6, Ly5.2). Then, 3.74 × 10⁶ (6.1% GFP⁺) cells were transplanted into sublethally irradiated (4.5 Gy) recipient mice (C57BL/6, Ly5.1) through tail-vein injection (Fig. 4f). Disease progression was monitored by whole-body luminescence imaging as previously described. Mice were euthanized after disease onset and bone marrow and spleen cells were collected. Stable cell lines were established by continuous culture of bone marrow cells for 4 weeks without supplemented cytokines. All animal studies were performed according to ethical animal license protocols and were approved by the responsible authorities of the Austrian government (BMBWF-68.205/0199-V/3b/2018). For this, male and female C57BL/6J.SJL mice at the age of 10–12 weeks were used. Mice were kept under specific and opportunistic pathogen-free (SOPF) conditions with stringently controlled standard housing, in individually ventilated cages, and fed ad libitum with Sniff Haltungsfutter CHOW standard 10 mm pellets (catalog no. V1534-000). This study does not include any experiments in which animals were subjected to different treatment cohorts, for which sex-based analysis would be relevant.

Live-cell imaging

All live-cell imaging experiments were performed using the LSM880 Airyscan microscope equipped with a Plan-Apochromat ×63/1.40 oil differential interference contrast objective, while incubating cells at 37 °C and 5% CO₂. Cells were seeded onto eight-well chamber slides (Ibidi, 80826-90) at 30,000 cells per well, transfected 24 h later and imaged 24 h after transfection. U2OS and HEK293T cells were transfected using FuGENE HD; and HCT-116, TC71, 1765-92, H3122 and V6.5 mES cells with Lipofectamine 3000 according to the manufacturer’s instructions. Hoechst 33342 (0.2 µg ml⁻¹, Thermo Fisher Scientific, 62249) was added into cell culture medium for nuclear staining. To visualize nucleoli in living cells, RFP–fibrillarin fusion protein was expressed by transfecting cells with pTagRFP-C1-fibrillarin plasmid (Addgene, 70649)⁷³ together with plasmids for mEGFP-fsHMGB1-full-length variants.

For expressing NUP98::KDM5A and KS constructs, HEK293T cells were seeded onto eight-well chamber slides (Ibidi, 80826-90) and cultured until 70% confluency was reached. For transfection, 1 μg of plasmid DNA and 2.5% polyethyleneimine (Polyscience, 26292) were mixed in 200 µl of Opti-MEM I (Gibco, 31985062) and incubated for 20 min. The mixture was then added dropwise to each well. After overnight incubation, the medium was exchanged for fresh prewarmed growth medium before live-cell imaging. To visualize cell nuclei, cells were incubated with 5 µM DRAQ5 (Novus Biologicals, NBP2-81125) for 10 min before imaging.

In FRAP experiments, two regions of interest (ROIs) were determined: a rectangular ROI 1, and smaller, circular ROI 2 that covered the object to be bleached. GFP signal was bleached within ROI 2 using a 488 nm laser with 70–100% intensity, 5–20 iterations and GFP signal recovery was measured using 1–2 s intervals for 40–60 s. For co-FRAP assays with both GFP and mCherry signal, the mCherry signal was bleached using a 561 nm laser the same way as for the GFP signal. The laser intensity, number of iterations and size of ROIs varied between experiments, but were always identical within an experiment. Fluorescence intensities were acquired from 6–20 ROIs from separate condensates in each experiment, quantified using ZEN Black v.2.3 and reported as relative values to the pre-bleaching timepoint (Figs. 1b, 2e, 3g, 4b and 5b, Extended Data Figs. 1a, 2f, 3e, 4a, 6f, 8b, 9b, 10b,f, 12c,f and Supplementary Figs. 2d,g, 3b,e,h, 5b,e, 10d,g and 12b). Figures were generated using GraphPad PRISM9 and with R package ggplot2. In FRAP experiments with cells transfected with mCherry-P2A-T2A-GFP-nb vectors, an image was acquired from ROI 1 using Hoechst, mCherry and GFP channels before photobleaching, and the mCherry signal within nuclear area (defined by Hoechst signal) was used to quantify the mCherry expression level (Extended Data Figs. 3e and 4b and Supplementary Figs. 3c,f and 6b). To quantify the nuclear TagBFP intensity (Supplementary Fig. 5c,f), the mean intensity was measured from a square of 1.4 µm² manually placed at a nuclear region outside the nucleolus.
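Reporting FRAP intensities relative to the pre-bleaching timepoint amounts to dividing every frame of a trace by its pre-bleach value. A minimal sketch of that normalization, assuming one raw intensity trace per bleached ROI with the pre-bleach frame first; the function name and the example trace are hypothetical, and no background or acquisition-bleaching correction is applied here.

```python
def normalize_frap(intensities, n_prebleach=1):
    """Express a FRAP trace relative to the mean of its pre-bleach frames."""
    if n_prebleach < 1 or n_prebleach >= len(intensities):
        raise ValueError("n_prebleach must be at least 1 and shorter than the trace")
    baseline = sum(intensities[:n_prebleach]) / n_prebleach
    if baseline <= 0:
        raise ValueError("pre-bleach intensity must be positive")
    return [value / baseline for value in intensities]


# Hypothetical raw trace: one pre-bleach frame, then recovery after bleaching.
trace = [200.0, 40.0, 80.0, 120.0, 150.0]
relative = normalize_frap(trace)
print(relative)
```

Traces normalized this way share a common scale, so recovery curves from the 6–20 ROIs of an experiment can be averaged frame by frame.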

NuFANCI

The NuFANCI method was adopted from the FANCI method⁴¹. Endogenously msfGFP–NPM1-tagged cells were used for the NuFANCI experiment. Six million cells were plated in 10 cm dishes, cultured for 24 h, the medium was then changed and cells were transfected using Lipofectamine 3000 with the GFP-nb constructs. The actinomycin-D-treated cells were treated with 400 nM actinomycin D (Sigma-Aldrich, A1410-2MG) for 1 h before collection. Then, 24 h after transfection, the untransfected, actinomycin-D-treated and GFP-nb-transfected cells were trypsinized, collected and then pelleted into 1.5 ml Eppendorf tubes. The cells were fixed with 1 ml 1% formaldehyde diluted in cell culture medium from 16% formaldehyde (Thermo Fisher Scientific, 28906) for 10 min at room temperature with rotation. The fixation was stopped by adding 1 M glycine (Jena Bioscience, CSS-510) to a final concentration of 200 mM for 5 min at room temperature with rotation. The cells were washed twice with cold PBS, each with spin of 1,000×g at 4 °C for 3 min, and the pellets were kept on ice for sorting as soon as possible without first freezing the cells. Next, the fixed cells were sorted on the BD FACSAria Fusion system to collect mCherry⁺ cells from the transfected samples and mCherry⁻ cells from the untransfected samples into 15 ml Protein LoBind Tubes (Eppendorf, 0030122216) coated with FACS buffer (2% FBS, 2 mM EDTA, in PBS). Around 850,000 and 500,000 events were collected from the transfected and untransfected samples, respectively. The sorted cells were pelleted in 1.5 ml Protein LoBind Tubes (Eppendorf, 0030108116). Next, the pellets were thoroughly resuspended in 1 ml lysis buffer B0 (50 mM HEPES pH 7.5, 150 mM KCl, 1% IGEPAL CA-630, cOmplete protease inhibitor (Sigma-Aldrich, 11873580001) and PhosSTOP (Merck 4906837001)) supplied with a final concentration of 1 mM DTT, 1:1,000 RNase inhibitor (NEB, M0314L) and 2 µg ml⁻¹ DAPI. 
The samples were incubated on ice for 5 min, transferred to Covaris milliTUBE (Covaris, 520130) and then sonicated using Covaris E220 (PIP: 140; duty factor: 5; duration: 120 s). Then, 30 µl of each sample was reserved as the input material for MS. The rest of each sample was transferred to 1.5 ml Protein LoBind Tubes and sorted on the BD FACSAria Fusion system with an SSC threshold of 1000 into 1.5 ml Protein LoBind Tubes (the sorting strategy and quality control are shown in Supplementary Fig. 4). For the gating strategy for the sorting of nucleoli, three gates were used: (1) DAPI (uv-450/50-A) versus GFP (b-530/30-A) was used to identify the population containing nucleoli (GFP⁺DAPI^intermediate), determined by sorting different fractions outlined in Supplementary Fig. 4c and subsequent imaging (Supplementary Fig. 4d); (2) FSC-A versus SSC-A gate was used to exclude large events; (3) GFP (b-530/30) versus mCherry (yg-610/20) was used to sort for either mCherry⁻ (for the samples untransfected and actinomycin D) or mCherry⁺ (for the samples Nb, KS, KS_F-to-G and 2×KS). Gates were determined by comparing to mCherry⁻ samples (untransfected). Flow Cytometry data were collected and analysed using BD FACSDiva v.8.0.1; flow cytometry data visualization was performed using FlowJo. Around 400,000 events were collected from each sample. The collected nucleoli were centrifuged at 10,000×g for 10 min at 4 °C followed by one wash with cold PBS.
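The fixation and quenching steps in this procedure rest on simple dilution arithmetic (C1 × V1 = C2 × V2). A minimal sketch, assuming additive volumes and that the glycine stock is added directly to the 1 ml fixation mix; both function names are hypothetical helpers and the volumes they return were not reported in this form in the protocol.

```python
def stock_volume_for_final(final_conc, stock_conc, final_volume):
    """Volume of stock needed so that final_volume ends up at final_conc."""
    return final_conc * final_volume / stock_conc


def spike_volume(target_conc, stock_conc, sample_volume):
    """Volume of stock to add to an existing sample to reach target_conc.

    Accounts for the volume increase caused by the added stock.
    """
    if stock_conc <= target_conc:
        raise ValueError("stock must be more concentrated than the target")
    return target_conc * sample_volume / (stock_conc - target_conc)


# 1% formaldehyde in 1 ml (1,000 µl) from a 16% stock.
formaldehyde_ul = stock_volume_for_final(1.0, 16.0, 1000.0)
# 200 mM glycine final, added from a 1 M (1,000 mM) stock to a 1,000 µl sample.
glycine_ul = spike_volume(200.0, 1000.0, 1000.0)
print(formaldehyde_ul, glycine_ul)
```

Under these assumptions the fixative requires 62.5 µl of 16% formaldehyde per ml, and quenching requires 250 µl of 1 M glycine per ml of fixation mix.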

For MS sample preparation, the nucleolus samples were supplied to reach 1× buffer 4 (2× buffer 4: 100 mM Tris, pH 7.5, 50 mM NaCl and 4 mM MgCl₂) and incubated with 1,000 rpm shaking at 65 °C for 1 h and then 10 min at 95 °C. Each sample was then sonicated on the Qsonica Q700 sonicator equipped with microtip (Bioke, Q4417) with amplitude 5 for 10 s until there were no visible particles in the tube. Next, each sample was cooled on ice briefly before benzonase was added to a final concentration of 25 U µl⁻¹ (Thermo Fisher Scientific, 70-746-3) and the sample was incubated at 37 °C for 30 min with 1,000 rpm shaking. The samples were then supplied to reach 1× buffer 5 (2× buffer 5: 6 M GdmCl, 20 mM TCEP, 80 mM chloroacetamide) and incubated at 37 °C for 1 h with 1,000 rpm shaking. Next, 1 ml of ice-cold 100% acetone was added to each sample and the proteins were precipitated at −20 °C overnight. The next day, the samples were centrifuged at 20,000×g for 10 min at 4 °C, the supernatant was discarded, and the sample was washed once with ice-cold 100% ethanol and centrifuged at 20,000×g for 10 min at 4 °C; the supernatant was discarded and the pellet was air-dried briefly to remove most of the ethanol but not to complete dryness. Next, the input and nucleolus samples were resuspended in 50 and 30 µl 100 mM (NH₄)HCO₃, respectively. The samples were then sonicated on the Qsonica Q700 sonicator equipped with microtip with amplitude 5 for 10 s until there were no visible particles in the tube. The concentration of the samples was measured in duplicate on the Qubit 3 system using the Qubit protein assay kit (Thermo Fisher Scientific, Q33211). Around 2 mg of protein was obtained from each nucleolus sample (the NuFANCI sample preparation quality control log is outlined in Supplementary Table 2). Next, 150 ng of each sample was digested with 5 ng trypsin and 5 ng Lys-C, filled to a final volume of 20 µl with 100 mM (NH₄)HCO₃, overnight at 37 °C with 800 rpm shaking.
The peptides were acidified with formic acid to a final concentration of 2% and 150 ng of the digests was loaded onto Evotip Pure (Evosep) tips according to the manufacturer’s protocol. Peptide separation was carried out by nanoflow reversed-phase liquid chromatography (Evosep One, Evosep) with the Aurora Elite column (15 cm × 75 µm inner diameter, C18 1.7 µm beads, IonOpticks) using the 20 samples per day method (Whisper Zoom 20 SPD). The LC system was coupled online to a timsTOF SCP mass spectrometer (Bruker Daltonics) using the data-dependent acquisition with parallel accumulation serial fragmentation method. The MS data were processed using MaxQuant (v.2.6.6.0; Max Planck Institute for Biochemistry) and searched against the human UniProtKB proteome (UP000005640; revision 2024 09 11). Additional modified sequences as outlined in Supplementary Table 1 were used accordingly. The match between runs and label-free quantification features were used independently for the NuFANCI and input samples. The MS data have been deposited at the ProteomeXchange Consortium via the PRIDE partner repository⁷⁴ under dataset identifier PXD058854.

Proteomics analysis

MS data were acquired using TimsTOF SCP (Bruker). The raw peak files were processed using MaxQuant⁷⁵. Label-free quantification (LFQ) values were calculated separately for NuFANCI and input, with a match between runs applied separately for NuFANCI and input. Alphastats (v.0.6.9)⁷⁶ was used to process the MaxQuant output. Protein group matrices were used as an input for Alphastats, and data were preprocessed to remove contaminants and reversed proteins. Principal component analysis (PCA) of the NuFANCI and input proteomes was performed with the 500 most variable proteins using ANOVA, and LFQ values were standardized (Extended Data Fig. 5a). For the NuFANCI subset, PCA was performed with a VST-transformed matrix (Extended Data Fig. 9b). Correlation plots were calculated using the SciPy package⁷⁷ in Python v.3.10 and plotted with Seaborn (Extended Data Fig. 5c). Heat maps were plotted using log₂-transformed LFQ values and clustered by Euclidean distance using the Ward method and plotted using Seaborn (Fig. 3b and Extended Data Fig. 9e). Protein expression plots were generated using Seaborn (Extended Data Fig. 9f). Volcano plot data were calculated using the Alphastats diff_expression_analysis function set to t-test and then plotted with Seaborn (Fig. 3c and Extended Data Fig. 5g). A list of the differentially detected protein groups is provided in Supplementary Table 2.

Cell-penetrating-peptide experiments

Peptides (R₁₀, R₁₀MMMMNKLVLAQFFFSCL and R₁₀MMMMNKLVLAQGGGSCL) with N-terminal TAMRA labels were synthesized by Peptide Specialty Laboratories and reconstituted in DMSO into 2 mM stocks. GFP–NPM1 U2OS cells were seeded onto eight-well chamber slides (Ibidi, 80826-90) at a density of 60,000 cells per well. The next day, cells were washed twice with PBS and exposed to 3 µM peptides in PBS for 30 min at 37 °C. Wells were washed once with PBS and the cells were then kept in a 37 °C incubator for 3 h in the presence of 0.2 µg ml⁻¹ Hoechst 33342 (Thermo Fisher Scientific, 62249) in cell culture medium, followed by imaging on the LSM880 microscope. We noted that the R10 control peptide was often present in cytoplasmic foci and only rarely in the nucleolus. To facilitate cellular distribution of the R10 control peptide, cells were exposed to Texas Red filter light using the HXP lamp for 30 s, followed by a 30 s waiting period before image acquisition. Imaging of R10–KS and R10–KS_F-to-G was done without an additional illumination step.

To analyse nuclear and nucleolar TAMRA signals in cells treated with cell-penetrating peptides, nuclei were first segmented using Otsu thresholding (click thresholding: 12, 255) and nucleoli were detected on the basis of the GFP–NPM1 signal (click thresholding: 6, 255). Mean nuclear, nucleoplasmic (outside nucleoli) and nucleolar TAMRA signal intensities were measured.

For the images of R10–KS-peptide-treated cells, z-positions for dying cells were processed separately, as the aggregated cytoplasmic and extracellular TAMRA signal biased the use of max intensity projections. We note that, for the TAMRA intensity calculations, in some cases a nucleus could be detected at different z-positions and may be counted twice. For this reason, the number of dying cells, indicated by the presence of small, condensed nuclei and increased Hoechst staining intensity, was counted by visual inspection. Images were acquired from two biological replicates and included combined 356, 460 and 511 cells for peptides R10, R10–KS and R10–KS_F-to-G, respectively. Replicates were pooled for TAMRA intensity measurements (Extended Data Fig. 6d).

NPM1–R10 in vitro droplet fusion assay

For in vitro NPM1 droplet fusion assays, 30 μM of purified recombinant msfGFP–NPM1 was mixed with 2 μM of either TAMRA–R10, TAMRA–R10–KS, or TAMRA–R10–KS_F-to-G and 5% PEG 8000 in a PCR tube and pre-assembled in the tube for 5 min. The volume for each droplet was 5 μl, consisting of 0.75 μl msfGFP–NPM1 in storage buffer (50 mM Tris pH 7.5, 125 mM NaCl, 10% glycerol), 0.5 μl of TAMRA–peptide in DMSO (only DMSO for the DMSO control condition), 1.25 μl 20% PEG 8000 in water and 2.5 μl storage buffer. The resulting 5 μl was pipetted onto a chambered coverslip (Ibidi, 80800). Images were acquired after 3 min equilibration of the drop on the slide, with an LSM880 confocal microscope equipped with a Plan Apochromat ×63/1.40 NA oil DIC objective with a ×1 zoom. For each field of view, time-series imaging captured 300 consecutive images with a 0.26 s time interval. Quantification of droplet fusion events was based on three independent image series per condition.
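The stated volumes and final concentrations can be cross-checked with the dilution relation C₁V₁ = C₂V₂. The sketch below is illustrative only: the function names are hypothetical, and the working-stock concentrations it returns are inferred from the stated volumes rather than reported in the text.

```python
def final_concentration(stock, volume_added_ul, total_volume_ul):
    """Solve C1 * V1 = C2 * V2 for the final concentration C2."""
    return stock * volume_added_ul / total_volume_ul


def implied_stock(final, volume_added_ul, total_volume_ul):
    """Solve C1 * V1 = C2 * V2 for the stock concentration C1."""
    return final * total_volume_ul / volume_added_ul


# Component volumes (in microlitres) as listed for one 5 ul droplet.
VOLUMES_UL = {"msfGFP-NPM1": 0.75, "TAMRA-peptide": 0.5,
              "PEG 8000 (20%)": 1.25, "storage buffer": 2.5}
TOTAL_UL = sum(VOLUMES_UL.values())

# 20% PEG 8000 added as 1.25 ul of 5 ul gives the stated 5% final.
peg_final_percent = final_concentration(20.0, 1.25, TOTAL_UL)

# Inferred (not reported) working-stock concentrations, in uM.
npm1_stock_um = implied_stock(30.0, 0.75, TOTAL_UL)
peptide_stock_um = implied_stock(2.0, 0.5, TOTAL_UL)

# DMSO carried over from the peptide addition, in % (v/v).
dmso_percent = 100.0 * VOLUMES_UL["TAMRA-peptide"] / TOTAL_UL
```

Under these assumptions the four volumes sum to the stated 5 μl, and the peptide addition contributes 10% (v/v) DMSO to the reaction.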

In vitro droplet fusion analysis

The droplet fusion events were sub-tracked in Fiji (v.2.3.0/1.53f) to contain 40 slices: the one slice before the droplet fusion and 39 slices during the droplet fusion. Then, the fusing droplets were converted to a binary mask and the ROIs of the fusing droplets were measured for ℓ_major and ℓ_minor for each slice using Fiji’s built-in measurement function. The aspect ratio, relaxation time (τ), length scale (ℓ) and inverse capillary velocity (η/γ) were calculated as previously described^36,46,78,79. In brief, the aspect ratio of the fusing droplets was calculated by AR = ℓ_major/ℓ_minor. The time evolution of the aspect ratio was fit to the function AR = 1 + (AR_t0 − 1) × exp(−t/τ), where t is time, τ is relaxation time and AR_t0 is the aspect ratio at the first timepoint. The length scale of the fusion events was calculated by ℓ = (ℓ_minor,t0 × (ℓ_major,t0 − ℓ_minor,t0))^0.5, where t0 indicates the value at the first timepoint. Plots of τ versus ℓ were fit to a line of the form τ = (η/γ) × ℓ to determine the inverse capillary velocity η/γ, which is the ratio of viscosity (η) to surface tension (γ) (Extended Data Fig. 7).
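The calculations in this paragraph can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code used in the study. The function names are hypothetical, and the exponential fit is done here by least squares on log-transformed data with AR_t0 fixed to the first measured value, which is a simplification that may differ from the fitting routine actually used.

```python
import math


def aspect_ratio(l_major, l_minor):
    """AR = l_major / l_minor for one slice."""
    return l_major / l_minor


def length_scale(l_major_t0, l_minor_t0):
    """l = sqrt(l_minor,t0 * (l_major,t0 - l_minor,t0)) at the first timepoint."""
    return math.sqrt(l_minor_t0 * (l_major_t0 - l_minor_t0))


def fit_relaxation_time(times, aspect_ratios):
    """Estimate tau in AR = 1 + (AR_t0 - 1) * exp(-t / tau).

    Assumption: AR_t0 is fixed to the first value, so that
    ln((AR - 1) / (AR_t0 - 1)) = -(t - t0) / tau is a line through the
    origin whose slope is obtained by least squares.
    """
    t0, ar0 = times[0], aspect_ratios[0]
    if ar0 <= 1.0:
        raise ValueError("first aspect ratio must exceed 1")
    num = den = 0.0
    for t, ar in zip(times[1:], aspect_ratios[1:]):
        if ar <= 1.0:  # fully relaxed or noisy points carry no log information
            continue
        dt = t - t0
        num += dt * math.log((ar - 1.0) / (ar0 - 1.0))
        den += dt * dt
    if den == 0.0 or num >= 0.0:
        raise ValueError("no decaying points to fit")
    return -den / num  # tau = -1 / slope, with slope = num / den


def inverse_capillary_velocity(length_scales, relaxation_times):
    """Slope of the through-origin line tau = (eta / gamma) * l."""
    num = sum(l * tau for l, tau in zip(length_scales, relaxation_times))
    den = sum(l * l for l in length_scales)
    return num / den
```

With ℓ in µm and τ in s, the returned slope has units of s µm⁻¹. Each fusion event contributes one (ℓ, τ) pair to the final fit.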

Image analysis for live-cell imaging

Generation of FRAP curves

The recorded fluorescence intensity from each timepoint of each FRAP ROI was normalized to the signal intensity of the first timepoint. The replicates of each timepoint in the same FRAP series were plotted in GraphPad and RStudio with the error bar representing the s.d.

Calculation of the mobile and immobile fraction

The mobile fraction (Fig. 1e and Extended Data Figs. 1a, 2f and 3e) of the FRAP ROI was calculated using the signal intensity normalized as described above with the following equation:

$$\begin{array}{l}{\rm{mobile}}\,{\rm{fraction}}=\frac{{\rm{last}}\,{\rm{timepoint}}\,{\rm{signal}}\,{\rm{intensity}}-{\rm{after}}\,{\rm{bleach}}\,{\rm{signal}}\,{\rm{intensity}}}{1-{\rm{after}}\,{\rm{bleach}}\,{\rm{signal}}\,{\rm{intensity}}}\\ {\rm{immobile}}\,{\rm{fraction}}=1-{\rm{mobile}}\,{\rm{fraction}}\end{array}$$
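As a worked illustration of the equation above, the following sketch computes both fractions from a FRAP trace that has already been normalized to its first timepoint. The function name and the `bleach_index` argument (the index of the first post-bleach frame) are hypothetical and introduced only for this example.

```python
def mobile_and_immobile_fraction(normalized_trace, bleach_index):
    """Return (mobile, immobile) fractions for one normalized FRAP trace.

    normalized_trace: intensities divided by the first-timepoint intensity,
                      so the pre-bleach level is 1.
    bleach_index:     assumed index of the first frame after bleaching.
    """
    after_bleach = normalized_trace[bleach_index]
    last = normalized_trace[-1]
    if after_bleach >= 1.0:
        raise ValueError("post-bleach intensity must be below the pre-bleach level")
    mobile = (last - after_bleach) / (1.0 - after_bleach)
    return mobile, 1.0 - mobile
```

For example, a trace that drops to 0.2 after bleaching and recovers to 0.8 gives a mobile fraction of 0.75 and an immobile fraction of 0.25.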

Circularity

Live-cell images were acquired using a ×63 oil objective on a LSM880-airyscan under ZEN Black v.2.3 (Zeiss). For each condition, 8–31 regions were imaged, with a minimum of 26 nuclei captured. The resulting images were quantified in the image analysis module ZEN v.3.4 (Zeiss). In brief, within images, nuclei were identified by nuclear counterstaining using auto-intensity thresholds after smoothing (Gauss, 3.0). Nucleoli were segmented within nuclei by applying a fixed intensity threshold on GFP signal after faint smoothing (Gauss, 1.3) and using the rolling-ball algorithm. The maximum GFP area was set to 1,000, and the circularity score was extracted for each GFP object (Extended Data Fig. 2b).

Experiments related to NUP98::KDM5A

Raw files were imported into Arivis Vision 4D (Arivis) and an automated segmentation pipeline was designed manually. This pipeline consisted of median-based denoising, background correction, Cellpose deep-learning segmentation for cells and nuclei, intensity threshold segmentation for condensates, particle finder and size and sphericity filters. The segmented objects (cells, nuclei and condensates) were manually proofread, and settings were adjusted if necessary. For the data shown in Fig. 4h, ROIs were set based on the mCherry signal, and the number of condensates and mean GFP intensity were calculated for at least 100 individual cells per condition. The same approach was used for the data shown in Supplementary Fig. 9c, but at least 60 cells were analysed for each condition.

Immunofluorescence

For immunofluorescence experiments performed in GFP–NPM1 knock-in HCT116 cells (Fig. 3h and Supplementary Fig. 6a) and eGFP–BRD4::NUT-expressing cells (Fig. 3g,h and Supplementary Fig. 6g,h), cells were seeded on 8-well or 18-well chamber slides (Ibidi, 80826-90 and 81816) with 30,000 or 12,000 cells per well, transfected 24 h later and fixed 24 h after transfection with 4% PFA in PBS for 10–15 min. Cells were permeabilized with 0.5% Triton X-100 (Thermo Fisher Scientific, 85111) in PBS for 10–15 min and incubated in blocking buffer containing 1% BSA (BSA Fraction V, Gibco, 15260037) and 0.1% Triton X-100 in PBS, followed by overnight staining with primary antibodies at 4 °C with gentle agitation. Slides were washed five times with blocking buffer, incubated with secondary antibodies (AlexaFluor 647 donkey anti-mouse or anti-rabbit antibodies, Jackson ImmunoResearch, 715-605-150 and 711-605-152, 1:1,000) in blocking buffer for 1 h at room temperature, washed twice with blocking buffer, stained with 0.5 µg ml⁻¹ DAPI in PBS (Invitrogen, D1306) and washed three times with PBS. The following primary antibodies were used: 5.8S rRNA (Novus, NB100-662SS, 1:500), HA-tag (Cell Signaling, C29F4, 1:1,000), NEPRO (Santa Cruz, sc-376579, 1:100), RNAPII (Abcam, ab26721, 1:500) and H3K27Ac (Abcam, ab4729, 1:1,000). Imaging was performed using the LSM880 Airyscan microscope equipped with a Plan-Apochromat ×63/1.40 oil differential interference contrast objective.

For the immunofluorescence experiment of NEPRO (Fig. 3d), all of the steps were identical to those described above except for the sample fixation. The cells used for NEPRO immunofluorescence (IF) were fixed on the slide with 1% formaldehyde diluted in culture medium at room temperature for 10 min, quenched with a final concentration of 200 mM glycine for 5 min, washed once with PBS and then permeabilized.

For IF experiments performed in 52K, 52K–KS and 52K–KS_F-to-A expressing cells (Fig. 5g and Supplementary Fig. 11d), cells were grown on 12 mm glass coverslips (Electron Microscopy Sciences, 72196-12) in 24-well, non-pyrogenic, polystyrene plates. For transient expression of transgenes, mammalian expression plasmids were transfected into HEK293T or HEK293 cells using X-tremeGENE HP (Roche, 6366236001) according to the manufacturer’s instructions for a 3:1 reagent:plasmid ratio. Plasmids were transfected at a ratio of 1 µg DNA:3 µl X-tremeGENE HP:4 × 10⁵ cells and scaled up or down accordingly for all experiments. Then, 24 h after transfection, cells were fixed in 4% PFA in PBS at 37 °C for 10 min and washed once in PBS, followed by permeabilization with 0.5% Triton X-100 in PBS at room temperature for 10 min. The samples were blocked in 3% BSA in PBS (+0.05% sodium azide) for 1 h at room temperature, incubated with primary antibodies in 3% BSA in PBS (+0.05% sodium azide) for 1 h at room temperature, washed three times in 3% BSA in PBS (+0.05% sodium azide), followed by incubation with secondary antibodies and DAPI for 1 h at room temperature. The coverslips were then washed twice in PBS and mounted onto glass slides using ProLong Gold Antifade Reagent (Cell Signaling Technologies, 9071). The following primary antibodies were used: 52K (gift from P. Hearing⁸⁰; rabbit, polyclonal, 1:500) and DBP (gift from A. Levine⁸¹, mouse, B6-8, 1:400). AlexaFluor goat anti-rabbit 488 fluorophore-conjugated secondary antibody (Life Technologies, A-11008) or goat anti-mouse 488 fluorophore-conjugated secondary antibody (Life Technologies, A-11001) was used at a concentration of 1:1,000. Coverslips were imaged using a Leica DMi8 Thunder Imager and LAS X acquisition software. Images were processed in FIJI (v.1.53f51) using equivalent settings. Image analysis was performed using FIJI (v.1.53f51). Analysis of enrichment of IIIa in 52K condensates (Fig. 5h) is described below in Image Analysis.

Image analysis for IF

Pearson’s correlation coefficients from IF images

Pearson’s correlation coefficients between GFP and IF staining intensities within nuclei were analysed using ImageJ v.2.14.0/1.54f (Fig. 4d, Extended Data Fig. 8e and Supplementary Fig. 6b). First, nuclei were segmented into ROIs using thresholding on the DAPI channel and the AnalyzeParticles tool, and Pearson’s correlation coefficients between GFP and AF647 reported by Coloc2 plugin were collected and reported for each nucleus with mCherry or GFP expression. Thresholds for DAPI, mCherry and GFP channels varied between experiments, but were kept constant when thresholding images within each experiment.
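For readers unfamiliar with the quantity being reported, the sketch below computes a Pearson's correlation coefficient between two channels over the pixels of one nuclear ROI. It is a plain-Python illustration and does not reproduce the Coloc2 plugin. The function name and the flat-list image representation are assumptions made for this example.

```python
import math


def pearson_in_roi(channel_a, channel_b, roi_mask):
    """Pearson's r between two channels, restricted to pixels where roi_mask is true.

    channel_a, channel_b: flat sequences of pixel intensities (e.g. GFP, AF647).
    roi_mask:             flat sequence of booleans marking one nucleus.
    """
    a = [x for x, keep in zip(channel_a, roi_mask) if keep]
    b = [y for y, keep in zip(channel_b, roi_mask) if keep]
    n = len(a)
    if n < 2:
        raise ValueError("ROI must contain at least two pixels")
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        raise ValueError("a channel is constant within the ROI")
    return cov / math.sqrt(var_a * var_b)
```

Restricting the calculation to the ROI matters: pixels outside the nucleus are excluded by the mask and so do not contribute to the coefficient.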

Quantifying 5.8S rRNA intensity in nucleoli

Analysis of nucleolar 5.8S rRNA intensity in de-mixed and remaining nucleolar regions of GFP-nb–2×KS-expressing GFP–NPM1 HCT-116 cells (Fig. 3i) was performed manually with ImageJ by selecting mCherry⁺ cells where de-mixing of GFP–NPM1 was clearly visible. Measurements of GFP–NPM1 and AF647 intensities were performed within circular areas of 0.8 µm² that were positioned in de-mixed and remaining regions of nucleoli using the GFP–NPM1 intensity as illustrated in Fig. 3h.

Enrichment of GFP–IIIa in 52K condensates

Enrichment of GFP–IIIa in 52K condensates was analysed using ZEN software and the Zones of Influence analysis tool to identify 52K condensates with AlexaFluor 647 channel using click thresholding (value lower: 113; value upper: 246), and background as ring element (Segmentation Zoi Ring Distance: 3; Ring Thickness: 3) surrounding 52K objects. Enrichment of the mean GFP signal in 52K condensates over the mean background signal is displayed for individual 52K foci (Fig. 5h). Background GFP intensities are calculated as mean GFP signal at ring elements surrounding 52K foci, and are displayed in Fig. 5h.

Protein purification

Overexpression of recombinant protein in BL21 (DE3) (NEB, M0491S) was performed as described previously⁷². In brief, Escherichia coli pellets were resuspended in 50 ml of ice-cold buffer A (50 mM Tris pH 7.5, 500 mM NaCl) supplemented with cOmplete protease inhibitors (Sigma-Aldrich, 11697498001), 0.2% Triton X-100 (Thermo Fisher Scientific, 851110) and 5% DMSO (Sigma-Aldrich, D2650-100ml), and sonicated for 360 cycles (5 s on, 5 s off) on the Branson SFX150 sonicator. During sonication, the bacterial lysate was stirred continuously in an ice bucket in a cold room at 4 °C. Bacterial lysates were cleared by centrifugation at 15,000×g for 15 min at 4 °C. For protein purification, we used the Äkta avant 25 chromatography system. All 50 ml of the cleared lysate was loaded onto the cOmplete His-Tag purification column (Merck, 6781535001) pre-equilibrated in buffer A. The loaded column was washed with 15 column volumes (CV) of buffer A. Fusion protein was eluted in 10 CV of elution buffer (50 mM Tris pH 7.5, 500 mM NaCl, 250 mM imidazole) and diluted 1:1 in storage buffer (50 mM Tris pH 7.5, 125 mM NaCl, 1 mM DTT, 5% DMSO, 10% glycerol). The fractions enriched for GFP were pooled after His-affinity purification and manually loaded through an injection valve connected to a 500 μl capillary tube onto an equilibrated Superdex 200 increase 10/300 GL column (Cytiva, 28-9909-44). The loaded column was equilibrated with 0.15 CV of ice-cold SEC buffer (50 mM Tris pH 7.5, 125 mM NaCl, 5% DMSO, 1 mM DTT) supplemented with cOmplete protease inhibitors. Fusion proteins were eluted into 300 μl fractions with 1.1 CV of ice-cold SEC buffer supplemented with cOmplete protease inhibitors. Elution fractions were pooled. Eluates were further concentrated by centrifugation at 4,000×g for 30 min at 4 °C using 30 kDa MWCO Amicon Ultra centrifugal filters (Merck, UFC903024). The concentrated fraction was diluted 1:100 in storage buffer, reconcentrated and stored at −80 °C.

In vitro msfGFP droplet formation assay

For in vitro droplet-formation experiments (Fig. 1f), we measured the concentration of purified msfGFP-tagged proteins using the NanoDrop 2000 system (Thermo Fisher Scientific) and subsequently diluted protein to the required concentration in storage buffer (50 mM Tris pH 7.5, 125 mM NaCl, 1 mM DTT, 5% DMSO, 10% glycerol). The in vitro droplet-formation assay was performed as previously described^82,83. Protein preparations were mixed 1:1 with 2.5 μl 20% PEG 8000 in deionized water (w/v). The resulting 5 μl was pipetted onto a chambered coverslip (Ibidi, 80826-90). Images were acquired after 3 min equilibration of the drop on the slide, with an LSM880 confocal microscope equipped with a Plan Apochromat ×63/1.40 NA oil DIC objective with a ×1 zoom. Quantification of condensate formation was based on at least ten images acquired in at least two independent image series per condition.

Image analysis of in vitro droplet formation

Protein droplets were detected using the ZEN blue v.3.4 Image Analysis and Intellesis software packages. Using a previously trained Intellesis model in spectral mode, we achieved image segmentation of individual pixels into objects (droplet area) or background (image background). Relative amounts of condensed protein were calculated by dividing the sum of the GFP signal in objects defined as droplet area by the overall sum of the GFP signal in the field of view. All values were calculated using RStudio. Plots were generated using GraphPad PRISM9. To fit data to a sigmoidal curve, we applied the in-built nonlinear regression function (Sigmoidal; x is concentration) (Fig. 1g).
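The relative amount of condensed protein described above is a ratio of summed intensities. A minimal sketch is shown below. It assumes the segmentation step has already produced a per-pixel droplet mask, and the function name and flat-list inputs are hypothetical.

```python
def condensed_fraction(gfp_pixels, droplet_mask):
    """Sum of GFP signal inside droplet objects divided by the total GFP signal.

    gfp_pixels:   flat sequence of GFP intensities for one field of view.
    droplet_mask: flat sequence of booleans, true where a pixel was
                  classified as droplet area (assumed to come from the
                  segmentation step).
    """
    total = sum(gfp_pixels)
    if total <= 0:
        raise ValueError("field of view contains no GFP signal")
    in_droplets = sum(v for v, is_droplet in zip(gfp_pixels, droplet_mask) if is_droplet)
    return in_droplets / total
```

The result lies between 0 (no signal in droplets) and 1 (all signal in droplets) and can be plotted against protein concentration before curve fitting.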

RNA-seq

For experiments involving eGFP–BRD4::NUT (Fig. 4e), 450,000 HEK293T cells were seeded onto six-well plates and transfected the next day with 3 µg pRK5-eGFP–BRD4::NUT plasmids using 9 µl PEI STAR transfection reagent (Tocris, 7854). Total RNA was isolated from cells 24 h after transfection using the RNEasy mini kit with in-column DNase digestion (Qiagen). RNA-seq libraries were prepared using the KAPA HyperPrep Kit with RiboErase (Roche, KK8562) and sequenced in 100 bp paired-end mode on the Illumina NovaSeq2 system for 55–65 million fragments per sample.

Bulk RNA-seq analysis

Raw RNA-seq data were filtered and trimmed using TrimGalore v.0.6.10 (https://doi.org/10.5281/zenodo.7598955) with the default settings. Filtered data from HEK293T cells were mapped using the STAR aligner⁸⁴ to a custom human hg38 genome that included the cloned eGFP sequence. Count-read tables were generated by using the same program. Differential expression analysis was performed using the DEseq2 package⁸⁵ in R (v.4.4)⁸⁶. Differentially expressed genes were defined as having a fold change ≥ 1.2, Benjamini–Hochberg P ≤ 0.01 and a minimum mean read count across the experimental samples of 50 reads. The differentially expressed genes are listed in Supplementary Table 3.
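The three cut-offs can be written as a single predicate. The sketch below is illustrative and is not the study's code. It assumes that the fold-change cut-off applies in both directions (that is, |log₂ fold change| ≥ log₂ 1.2), and the function name and argument names are hypothetical.

```python
import math


def is_differentially_expressed(log2_fold_change, adjusted_p, mean_count,
                                min_fold_change=1.2, max_adjusted_p=0.01,
                                min_mean_count=50.0):
    """Apply the fold-change, adjusted-P and mean-read-count cut-offs to one gene.

    Assumption: the fold-change threshold is symmetric, so both up- and
    downregulated genes pass if |log2 FC| >= log2(min_fold_change).
    Genes with a missing adjusted P value (None or NaN) are rejected.
    """
    if adjusted_p is None or math.isnan(adjusted_p):
        return False
    return (abs(log2_fold_change) >= math.log2(min_fold_change)
            and adjusted_p <= max_adjusted_p
            and mean_count >= min_mean_count)
```

A gene must pass all three conditions. For instance, a twofold change with an adjusted P of 0.001 is still rejected if its mean read count is below 50.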

PCA was performed using the PCAPlot function from the DEseq2 package on the normalized read matrix that was transformed using the variance-stabilizing transformation function from the DEseq2 package and plotted using ggplot2 (Extended Data Fig. 9d). The distance matrix was calculated using the dist function in R using the Euclidean distance and visualized with a pheatmap in R (Extended Data Fig. 9e). Volcano plots were created using ggplot2 (Fig. 4e and Extended Data Fig. 9g).

For the reference data on BRD4::NUT targets from a previous study⁸⁷, raw data were downloaded from the Gene Expression Omnibus (GSE233302) and processed as described above. Differentially expressed genes were defined with a fold change cut-off of 1.2 and an adjusted P < 0.01. Only BRD4::NUT and control samples were used.

RT–qPCR

For experiments involving expression of NUP98::KDM5A and KS variants, total RNA was extracted using the Monarch Total RNA Miniprep Kit. Reverse transcription was performed using the RevertAID RT Kit (Thermo Fisher Scientific) and qPCR was done using the SsoAdvanced Universal SYBR Green Supermix (Bio-Rad) or AceQ Universal SYBR qPCR Master Mix (Vazyme, Red Maple Hi-tech Industry Park) on the Bio-Rad CFX-Connect Real-Time PCR Detection System. The results were normalized to GAPDH and analysed using the ${2}^{-\Delta \Delta {C}_{{\rm{t}}}}$ method (Supplementary Fig. 8e).
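As a worked example of the 2^(−ΔΔCt) calculation with GAPDH as the reference gene, consider the sketch below. The function and argument names are hypothetical and the Ct values in the usage note are made up for illustration.

```python
def relative_expression(ct_target_sample, ct_reference_sample,
                        ct_target_control, ct_reference_control):
    """Relative expression by the 2^(-ddCt) method.

    dCt  = Ct(target) - Ct(reference gene, here GAPDH), per condition
    ddCt = dCt(sample) - dCt(control)
    """
    delta_ct_sample = ct_target_sample - ct_reference_sample
    delta_ct_control = ct_target_control - ct_reference_control
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2.0 ** (-delta_delta_ct)
```

With illustrative Ct values of 24 (target) and 18 (GAPDH) in the sample and 26 and 18 in the control, ΔΔCt is −2 and the target is expressed fourfold higher in the sample than in the control.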

1,6-hexanediol treatments

Cleavable mCherry was included in eGFP–BRD4::NUT and GFP–NUP98::DDX10 expression vectors to better distinguish and compare transfected cells. U2OS cells were seeded on eight-well chamber slides (Ibidi, 80826-90) at density of 30,000 cells per well and transfected the next day with pRK5-mCherry-P2A-T2A-eGFP–BRD4::NUT plasmids using FuGENE HD and 150 ng plasmid per well according to the manufacturer’s instructions. The next day, cells were treated with 5% 1,6-hexanediol (Sigma-Aldrich, 240117) in cell culture medium for 5 or 15 min, fixed with 4% PFA for 10 min, washed and stored in PBS. Nuclei were stained using 0.5 µg ml⁻¹ DAPI in PBS (Invitrogen, D1306).

For the eGFP–BRD4::NUT 1,6-hexanediol experiments, images were analysed using ZEN Blue v.3.9 software. Nuclei were segmented using automatic Otsu thresholding and the mean intensity, s.d. and maximum intensities from the GFP and mCherry channels were measured for each nucleus. Nuclei with no expression (mean mCherry intensity < 1.5), nuclei with abnormally high expression (mCherry expression > 40) and cells with saturated GFP signals were excluded. Images were acquired from three biological replicate experiments and measurements were pooled and combined into single plots (Extended Data Fig. 9i).
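The exclusion criteria can be expressed as a per-nucleus filter. The sketch below is an illustration under stated assumptions: the upper mCherry cut-off is applied to the mean intensity, and saturation is detected as a maximum GFP value at the detector ceiling, with 255 used as a placeholder for 8-bit data. Neither detail is specified in the text, and the function name is hypothetical.

```python
def keep_nucleus(mean_mcherry, max_gfp,
                 min_mcherry=1.5, max_mcherry=40.0, saturation_value=255):
    """Return True if a nucleus passes the expression and saturation filters.

    Assumptions: both mCherry cut-offs act on the mean nuclear intensity,
    and a nucleus counts as saturated when its maximum GFP intensity
    reaches saturation_value (255 is a placeholder for 8-bit images).
    """
    if mean_mcherry < min_mcherry:   # no expression
        return False
    if mean_mcherry > max_mcherry:   # abnormally high expression
        return False
    if max_gfp >= saturation_value:  # saturated GFP signal
        return False
    return True
```

Applying the same filter to every replicate before pooling keeps the exclusion rule consistent across experiments.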

To analyse 1,6-hexanediol treated GFP–NUP98::DDX10-expressing cells, nuclei were first segmented using Otsu thresholding (click thresholding: 2, 255) and GFP–NUP98::DDX10 foci in nuclei were detected on the basis of the GFP signal (click thresholding: 9, 255). Nuclear background areas outside the foci were determined with inverted thresholds and the area was shrunk with Erode set to 2. To reliably measure diffuse GFP in the nucleus, GFP acquisition settings were set accordingly. However, due to the high contrast of GFP intensity in NUP98::DDX10 foci versus nucleoplasm, the GFP signal was often saturated at NUP98::DDX10 foci already in cells with low GFP expression. As a compromise, we excluded all nuclei with saturated foci detected in areas greater than 8 pixels. Nuclei with no expression, defined as mean mCherry intensity < 1.5, and nuclei with very high expression, defined as mCherry expression > 40, were excluded. Images were acquired from two biological replicate experiments. Measurements were pooled and combined into single plots (Extended Data Fig. 10i). Owing to the small regions of saturated GFP that were permitted, mean GFP intensities for NUP98::DDX10 samples were not plotted, s.d. values of GFP intensity should be interpreted with caution and, instead, the background GFP is considered most reliable.

Colony-formation assay for NUP98::KDM5A

Each experiment was performed in triplicate (Supplementary Fig. 8b). In brief, 25 × 10³ mouse leukaemia cells were seeded in methylcellulose (MethoCult M3434, Stem Cell Technologies) and colonies were scored and replated (10 × 10³ cells or 5 × 10³ cells) every 7 days.

Flow cytometry for NUP98::KDM5A

For characterization of cells derived from the NUP98::KDM5A colony-formation assay, cells were washed with PBS and resuspended in PBS with 0.5% FCS, and then stained for 30 min with 1:200 dilutions of the following antibodies (all from BioLegend): anti-mouse Gr-1/Ly-6C BV421 (RB6-8C5) and anti-mouse KIT APC (2B8). For gating strategy, live cells were discriminated on the basis of forward scatter height (FSC-H) and side scatter height (SSC-H). Single cells were gated on the basis of forward scatter height (FSC-H) and forward scatter area (FSC-A). mCherry⁺ cells were identified by their signal intensity in the ECD channel. Cellular staining for anti-mouse Gr-1/Ly-6C was assessed on the basis of the signal intensity in the Pacific Blue channel (BV421), while anti-mouse KIT staining was evaluated on the basis of the signal intensity in the APC channel. The samples were analysed on the BD FACSCanto II. Data analysis was performed using the FlowJo (FlowJo) software package (Supplementary Fig. 8c,d).

Competitive cell proliferation assay for NUP98::KDM5A

To investigate the effect of the KS on proliferation, a competition-based proliferation assay was performed using GFP–NUP98::KDM5A AML cells transfected with doxycycline-inducible nb–KS constructs at an infection rate of approximately 25–30% as described previously⁵⁶. In total, 5 × 10⁵ cells were seeded in 24-well plates in triplicate. Doxycycline was added to the medium and fluorescence expression of mCherry (bicistronically expressed with nb–KS constructs) was measured every 2–3 days using the Cytoflex S instrument (Beckman Coulter). The effect of the nb–KS constructs on cell fitness was monitored by time-resolved measurements of mixed populations of competing cells expressing nb–KS constructs (mCherry⁺) versus cells not expressing nb–KS constructs (mCherry⁻). Data analysis was performed using the FlowJo (FlowJo) software package and values were normalized to day 3 after doxycycline induction (Supplementary Fig. 7b,c).
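The normalization step can be illustrated as follows. The sketch assumes that the measured value at each timepoint is the fraction of mCherry⁺ cells in the mixed population, which the text implies but does not state explicitly. The function name and the example numbers are hypothetical.

```python
def normalize_to_reference_day(mcherry_fraction_by_day, reference_day=3):
    """Divide each timepoint's mCherry+ fraction by the value at the reference day.

    mcherry_fraction_by_day: dict mapping day after doxycycline induction
                             to the fraction of mCherry+ cells (assumed metric).
    """
    reference = mcherry_fraction_by_day[reference_day]
    if reference <= 0:
        raise ValueError("reference-day fraction must be positive")
    return {day: value / reference
            for day, value in sorted(mcherry_fraction_by_day.items())}
```

A normalized value below 1 at later timepoints indicates that construct-expressing cells were outcompeted by non-expressing cells. For example, fractions of 0.30, 0.15 and 0.075 on days 3, 6 and 9 normalize to 1, 0.5 and 0.25.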

Growth curves for NUP98::KDM5A

Cells were sorted for mCherry and seeded in biological triplicates and treated with doxycycline every 48 h or 72 h. Cell numbers were determined at regular intervals using the Intellicyt iQue Screener (Essen BioScience, Sartorius Group) and integrated with ForeCyte Software (Essen Bioscience; Standard Edition 10.0 (R1) v.10.0.8272; build date, 25 August 2022; Fig. 4g).

Viruses and infections

For retrovirus production related to NUP98::KDM5A experiments, Platinum-E cells were co-transfected with 20 µg transfer vector and 5 µg pCMV-gag/pol using polyethyleneimine (branched, molecular mass, 25,000 Da, Sigma-Aldrich). The viral supernatant was collected 48 h and 56 h after transfection, filtered (0.45 µm) and supplemented with recombinant mIL-3 (10 ng ml⁻¹), mIL-6 (10 ng ml⁻¹) (both PeproTech) and mSCF (100 ng ml⁻¹). Mouse progenitor cells were spinoculated with viral supernatants (1:2 diluted) for 45 min at 37 °C at 500×g in the presence of polybrene (4 µg ml⁻¹) (Merck Chemicals and Life Science). For lentivirus production, Lenti-X 293T cells were transfected with 4 µg transfer vector, 2 µg psPAX2 and 1 µg pMD2.G using PEI. Lentivirus was collected 48 h and 72 h after transfection and filtered (0.45 µm). Target cells were spinoculated after addition of virus supernatant (1:3 diluted) followed by centrifugation for 90 min at 37 °C at 1,000×g in the presence of polybrene (5 µg ml⁻¹).

For experiments related to 52K constructs, human adenovirus type C5 (Ad5) wild type was purchased from ATCC (VR-5), propagated on HEK293 cells, purified through caesium chloride density ultracentrifugation and stored in 40% glycerol at −20 °C for infections. The Ad5 Δ52K mutant pm8001 (Δ52K)⁸⁸ (gift from P. Hearing) was propagated on a transgenic cell line (A549) expressing WT 52K⁵⁷, purified by caesium chloride density ultracentrifugation and stored in 40% glycerol at −20 °C for infections. Viruses were purified using two sequential rounds of ultracentrifugation in caesium chloride density gradients. To achieve a cryoprotective solution for storage, virus was diluted as follows: two parts virus in caesium chloride, one part 5× viral dilution solution (40 mM Tris pH 8, 400 mM NaCl, 0.4% BSA in H₂O), two parts 100% glycerol. Viral titres were determined by infectious focus-forming assay as described in the ‘Adenoviral progeny production’ section. All infections were carried out using a multiplicity of infection of 10 unless stated otherwise and collected at the indicated hours after infection. To infect cells, virus was diluted in cell culture medium without FBS. After 2 h at 37 °C, culture medium containing 10% FBS was added. For virus-yield assays, the virus infection medium was removed after 2 h, and cells were washed once in PBS to remove excess virus before addition of culture medium.

Adenoviral progeny production

Infected cells were collected by scraping and were then lysed by four cycles of freeze–thawing in liquid nitrogen and a 37 °C water bath. Cells were collected at 48 h after infection unless stated otherwise. Cell debris was removed from the lysates by centrifugation at maximum speed at 4 °C for 5 min. For analysis of virus yield, lysates were diluted serially in DMEM supplemented with 2% FBS and 1% penicillin–streptomycin and used to infect A549 cells. The infection medium was removed 2 h after infection, cells were washed once in PBS to remove excess virus, and cells were overlaid with growth medium. Cells were incubated for 24 h before fixation in 4% paraformaldehyde and analysed by immunofluorescence confocal microscopy using immunostaining of the viral DNA-binding protein (DBP) as an indicator of infection. For each of the three independent replicates, three fields of view were captured and the percentage of DBP-positive cells was determined. The serial dilution resulting in the closest to 50% DBP-positive cells was selected for calculation of progeny production. The total cell number was determined by counting cells grown in parallel under equivalent conditions. The number of focus-forming units was calculated as the product of total cell number and the percentage of antigen-positive cells, adjusting for the Poisson distribution. Virus input was determined by collecting infected cells at 4 h after infection, the number of infectious units was determined and the mean of replicates was calculated (Fig. 5e). For complementation of Δ52K mutant virus infection, cells were infected and then subsequently transfected at 2 h after infection.
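One common way to apply a Poisson adjustment to a fraction of antigen-positive cells is sketched below. The exact form of the correction used in the study is not given in the text, so this is an assumed implementation. If infection events per cell follow a Poisson distribution with mean m, the fraction of infected cells is p = 1 − exp(−m), so m = −ln(1 − p). The function name and the `dilution_factor` argument are hypothetical.

```python
import math


def focus_forming_units(fraction_positive, total_cells, dilution_factor=1.0):
    """Estimate focus-forming units from the fraction of DBP-positive cells.

    Assumed model: infectious units per cell are Poisson distributed, so the
    mean number per cell is m = -ln(1 - p). The estimate is then
    m * total_cells, scaled back by the dilution of the lysate.
    """
    if not 0.0 <= fraction_positive < 1.0:
        raise ValueError("fraction_positive must be in [0, 1)")
    mean_units_per_cell = -math.log(1.0 - fraction_positive)
    return mean_units_per_cell * total_cells * dilution_factor
```

The correction matters most near the 50% positive dilution selected above: at p = 0.5 the Poisson-adjusted mean is about 0.69 infectious units per cell rather than 0.5, because some positive cells received more than one infectious unit.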

Zebrafish embryo manipulations

Zebrafish were maintained and raised under standard conditions, and according to Swiss regulations (canton Vaud, license number VD-H28). Wild-type (TLAB) fish were used for this study. Embryos were collected immediately after fertilization. The embryos were grown at 28 °C and the developmental stage was determined as described previously⁸⁹. In total, 180 pg of each RNA was injected into the yolk of one-cell-stage embryos. The embryos were allowed to develop in a 28 °C incubator and then manually dechorionated using forceps. Embryos were transferred to mounting medium (0.8% low-melting-point agarose containing 15% (v/v) OptiPrep (Sigma-Aldrich, D1156)) at 37 °C and mounted onto an Ibidi glass-bottom μ-dish (Ibidi, 81158-400).

Microscopy and image analysis for zebrafish embryos

Embryos were imaged on a Nikon spinning disc, using the Nikon SR HP Plan Apo ×100/1.35 Sil WD 0.3 objective, in a temperature-controlled chamber set at 28 °C. 4D imaging (x, y, z and time) was performed with a z-stack of 27 µm and a z-step of 0.3 µm at a time resolution of 2 min. Microscopy images were further analysed and processed using ImageJ. Whole nuclei at the 512-cell stage were first segmented in 3D and further analysis was performed using the 3D Objects Counter plugin⁹⁰. The generated data frames in .csv format were exported to RStudio software for plotting and visualization. Statistical analysis was performed using GraphPad Prism. Kruskal–Wallis tests with Dunn’s multiple-comparison correction were performed to calculate the significance between different groups.

SDS–PAGE and immunoblot analysis

For the generation of msfGFP knock-in cell lines

Cultured cells were washed twice in PBS and lysed in RIPA buffer for 30 min at 4 °C on an orbital shaker. Subsequently, the cell lysates were centrifuged for 10 min at 20,000×g. The cleared lysates were transferred to a new tube and quantified using a BCA assay (Thermo Fisher Scientific). Then, 20 μg of extracted protein was mixed with lithium dodecyl sulfate (LDS) loading buffer (Thermo Fisher Scientific, NP0007) containing 50 mM DTT and boiled at 98 °C for 5 min. The samples were then run on a 4–12% NuPAGE SDS gel (Thermo Fisher Scientific, NP0322BOX) and transferred onto a polyvinylidene fluoride membrane (Thermo Fisher Scientific, IB24002x3) using an iBlot2 Dry Gel Transfer Device (Invitrogen) according to the manufacturer’s instructions. Membranes were blocked with 5% skimmed milk in TBST for 1 h and then incubated with primary antibodies diluted in 2% milk and TBST overnight at 4 °C. Primary antibodies used in this study include TCOF1 (Santa Cruz, sc-374536, 1:750), GFP (Invitrogen, A11122, 1:2,000), NPM1 (Invitrogen, 32-5200, 1:2,000), HP1α (CST, 2616, 1:1,000), histone H3 (Abcam, ab1719, 1:10,000), GAPDH (CST, 14C10, 1:4,000) and HSP90 (BD, 610419, 1:2,000). HRP-conjugated secondary antibodies were used against the host species at 1:1,000 dilution. The proteins were visualized with HRP substrate SuperSignal West Dura (Thermo Fisher Scientific) and detected using Bio-Rad Universal Hood III and Image Lab 6.1 software (Supplementary Fig. 13).

For the adenovirus 52K experiments

SDS–PAGE and immunoblot analysis was performed using standard methods. In brief, protein samples were prepared using LDS loading buffer (Thermo Fisher Scientific, NP0007) supplemented with 25 mM DTT and boiled at 95 °C for 10 min. Equal amounts of protein lysates were separated by SDS–PAGE. Proteins were transferred onto methanol-activated polyvinylidene fluoride membrane (Millipore-Sigma, IPFL00010) at 30 V for 60–120 min and blocked in blocking buffer (5% milk in TBST supplemented with 0.05% sodium azide) for 1 h at room temperature. Membranes were incubated overnight at 4 °C with primary antibodies diluted in blocking buffer, washed for 30 min in TBST, incubated for 1 h at room temperature with HRP-conjugated secondary antibody diluted in blocking buffer and washed again for 30 min in TBST.

The following primary antibodies were used: anti-adenovirus type 5 antibody (raised against whole adenovirus capsids; recognizing late proteins Hexon, Penton, Fiber (Abcam, ab6982); rabbit, polyclonal, western blot, 1:10,000), antibody to 52K (gift from P. Hearing⁸⁰, rabbit, polyclonal, western blot, 1:10,000), IIIa (gift from P. Hearing⁹¹, rabbit, polyclonal, western blot, 1:10,000), DBP (gift from A. Levine⁸¹, mouse, B6-8, western blot, 1:1,000) and GAPDH (GeneTex, GTX100118, 43712, rabbit, polyclonal, western blot, 1:5,000). For immunoblot analysis, horseradish-peroxidase-conjugated goat anti-rabbit secondary antibody (Jackson Laboratories, 111-035-045) was used at a dilution of 1:10,000.

Proteins were visualized using the Pierce ECL Western Blotting Substrate (Thermo Fisher Scientific, 34577) and detected using a Syngene G-Box with GeneSys acquisition software. Images were processed and assembled in Adobe Illustrator CS6 (Fig. 5f and Supplementary Figs. 11f and 14).

Statistics and reproducibility

All experimental observations were confirmed by independent repeat experiments. No blinding or sample randomization was used during data capture or analysis. No data were excluded from data analysis. No statistical tests were performed to predetermine the sample size. Unless stated otherwise, the mean of numerical data is shown, with error bars representing the s.d. of the sample. Statistics were performed using GraphPad Prism v.9 and RStudio v.2024.04.2 and the multicomp package. All t-tests were two-sided. A normal distribution was assumed when determining statistical tests. Equal variance was not assumed, except for Tukey’s multiple-comparison tests. Exact P values were as follows: Fig. 1e: FRAP plots, P ≤ 0.0001 (Δkillswitch), P ≤ 0.0001 (F-to-E&D), P ≤ 0.0001 (F-to-A), P ≤ 0.0001 (F-to-G), P ≤ 0.0001 (0F), P ≤ 0.0001 (F12A), P = 0.02 (F13A), P ≤ 0.0001 (ΔKS-3F), P ≤ 0.0001 (11G3F3G), P ≥ 0.9999 (C16-to-A), P = 0.57 (M-to-E&D); GFP intensity plot, P = 0.97 (Δkillswitch), P = 0.56 (F-to-E&D), P = 0.68 (F-to-A), P = 0.53 (F-to-G), P = 0.27 (0F), P = 0.13 (F12A), P = 0.28 (F13A), P = 0.95 (ΔKS-3F), P = 0.19 (11G3F3G), P = 0.9999 (C16-to-A), P = 0.71 (M-to-E&D). Fig. 2f: NPM1 plot, P = 0.02 (Nb), P = 0.16 (KS), P = 0.29 (KS_F-to-A), P = 0.01 (2×KS); TCOF1 plot, P = 0.99 (Nb), P = 0.41 (KS), P = 0.18 (KS_F-to-A), P = 0.19 (2×KS); SRRM2 plot, P = 0.92 (Nb), P = 0.39 (KS), P ≤ 0.0001 (2×KS), P = 0.9998 (2×KS_F-to-G); HP1α plot, P = 0.85 (Nb), P = 0.99 (KS), P = 0.99 (2×KS), P = 0.76 (2×KS_F-to-G).
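As an illustration of a two-sample comparison in which equal variance is not assumed, the sketch below computes Welch's t statistic and the Welch–Satterthwaite degrees of freedom using only the Python standard library. That the unequal-variance t-tests in this study correspond to Welch's test is an assumption; the function name and the example values are hypothetical, and the analyses reported here were run in GraphPad Prism and RStudio, not with this code.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's unequal-variance t-test.

    variance() is the sample variance (n - 1 denominator), matching the
    s.d. of the sample shown as error bars.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each group.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical measurements for two conditions.
t_stat, dof = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
```

A two-sided P value would then be read from the t distribution with `dof` degrees of freedom, which requires a statistics library and is omitted here.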

AlphaFold models

AlphaFold models were predicted using AlphaFold (v.3)³³ in multimeric mode using the default parameters. Models were visualized with ChimeraX (v.1.6)⁹². Contact plots were generated using custom scripts.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.