Cell culture
hTERT-RPE1 cells (a telomerase-immortalized human retinal pigment epithelial cell line; CRL-4000, American Type Culture Collection) were grown in DMEM/F12/10% FCS, and TIG3 cells (a primary human embryonic lung fibroblast line; JCRB0506, JCRB Cell Bank)47 were grown in DMEM/10% FCS in a 5% O2/5% CO2 atmosphere. Cells were obtained directly from the respective source cell banks. No authentication was performed by the authors of this paper. Cells were regularly tested for mycoplasma contamination. 4-OHT (H7904, Sigma) was used at 100 nM for all ER–RAS induction experiments in vitro. Etoposide (E1383, Sigma) was used at 50 µM for the DNA damage experiments in RPE1 and TIG3 cells.
BrdU incorporation and SA-β-gal assays
Cellular proliferation by BrdU incorporation and SA-β-gal analysis have been previously described48. RPE1 and TIG3 cells were incubated with BrdU for 2 h for the BrdU incorporation assay.
Mice
HDTVi was performed as previously described3. In brief, at 6–8 weeks of age, 25 μg of appropriate vector and 5 μg of SB13 transposase-containing plasmid were diluted in sterile-filtered normal saline to a total volume of 10% of the body weight of the animal, before being injected into the lateral tail vein in under 10 s. Mice were randomized into control and experimental groups. C57BL/6 and Fox Chase SCID mice used in this study were purchased from Charles River. All mice used in these experiments were female, apart from the long-term monitoring cohort for identifying sex differences in tumour formation. All procedures were conducted in accordance with the UK Animals (Scientific Procedures) Act 1986, approved by the CRUK Cambridge Institute Animal Welfare and Ethical Review Body (AWERB) and conducted under the authority of the Project Licence number PP3912882.
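As a worked illustration of the dosing arithmetic above, the following minimal sketch computes the injection volume and the resulting plasmid concentrations. It assumes a saline density of about 1 g ml−1, so that 10% of body weight in grams corresponds to the same number of millilitres; the function name and the 20 g example body weight are hypothetical and are not taken from the protocol.

```python
def hdtvi_injection(body_weight_g, vector_ug=25.0, transposase_ug=5.0,
                    volume_fraction=0.10, saline_density_g_per_ml=1.0):
    """Return the injection volume (ml) and plasmid concentrations (ug/ml).

    Assumption: saline density ~1 g/ml, so mass and volume are interchangeable.
    """
    if body_weight_g <= 0:
        raise ValueError("body weight must be positive")
    # Total volume is 10% of body weight, converted from grams to millilitres.
    volume_ml = body_weight_g * volume_fraction / saline_density_g_per_ml
    return {
        "volume_ml": volume_ml,
        "vector_ug_per_ml": vector_ug / volume_ml,
        "transposase_ug_per_ml": transposase_ug / volume_ml,
    }


# Hypothetical 20 g mouse.
plan = hdtvi_injection(20.0)
print(plan)
```

For the hypothetical 20 g animal this gives a 2 ml injection, which shows how the plasmid concentration falls as body weight rises while the absolute DNA dose stays fixed.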
Mice were housed in individually ventilated cages (Tecniplast) at a temperature of 19–23 °C, humidity of 45–65%, with up to 75 air exchanges per hour in the cages, and a 12–12-h light–dark cycle with the lights on at 07:00. The maximum caging density was five mice from the same litter and sex starting from weaning. As bedding, Aspen woodchip (Datesand) was provided. Mice were fed a standardized mouse diet, either LabDiet 5R58 breeding and maintenance diet or 5053 high-fat diet (IPS), and provided drinking water ad libitum. All materials, including individually ventilated cages, lids, feeders, bottles, bedding and water, were autoclaved before use. Sentinel mice were negative for at least all Federation of Laboratory Animal Science Associations (FELASA)-relevant murine infectious agents as diagnosed by our health monitoring laboratory, Surrey Diagnostics.
Tumour monitoring
The health of mice and impact of internal tumours were judged by external signs (for example, abdominal distension or weight gain exceeding 10% of normal body weight) and clinical signs (for example, laboured breathing, rough hair coat, piloerection, inactivity, failure to eat or drink, fluid retention, neurological signs and digestive disturbances), aided by post-mortem assessment of morphological abnormalities in previously killed or deceased animals. To ensure early identification of health problems, animals with known or suspected pathologies received enhanced levels of surveillance (for example, hand checks). Primarily, mice were palpated, usually once a week, to detect liver tumours. In the majority of cases, the liver tumours were detected before the development of clinical signs, and the animal was humanely culled by a schedule one method to alleviate any potential suffering. Occasionally, mice developed clinical signs, as above, and were culled by a schedule one method to alleviate any further potential suffering. Limits specified by the project licence were not exceeded in any of the experiments conducted.
Plasmids
Predictive reporter plasmids for the in vitro experiments: NLS-mVenus-P2A-ER–RAS on either the pLNCX2 (retroviral, Clontech) or the pRRL.SIN-18 (lentiviral, described in ref. 49) backbone. The nuclear localization signal on all of these constructs is derived from SV40 large T-antigen (PKKKRKV). Plasmids for HDTVi: pPGK-SB13; pT/CAGGS-NRASG12V-IRES-mVenus, pT/CAGGS-NRASG12V/D38A-IRES-mVenus15, pT/CAGGS-mVenus-P2A-NRASG12V, pT/PGK-mVenus-P2A-NRASG12V, pT/UBC-mVenus-P2A-NRASG12V and UBC–mVenus-P2A.
Single-cell immune suspensions
Dissected livers were homogenized (130-105-807, Miltenyi Liver Dissociation Kit) and passed through a 70-μm filter. After centrifugation, samples were washed twice in PEB buffer (PBS, 5 μM EDTA and 0.5% BSA). Immune cells were enriched using an OptiPrep gradient (07820, STEMCELL Technologies). Immune cells along the gradient interface were washed and resuspended in FACS buffer (PBS, 5 mM EDTA and 5% BSA) and individually placed within a 96-well round-bottomed tissue culture plate. Pellets were incubated with TruStain FcX Fc-blocking solution (101319, BioLegend) and then treated with cell-surface panels of fluorophore-conjugated antibodies: (1) CD45–BV510 (563891, BD), CD3–AF647 (100209, BioLegend), CD4–BUV496 (612952, BD), CD8a–BV711 (100747, BioLegend) and NK1.1–BV421 (108731, BioLegend); (2) CD45–BV510 (563891, BD), CD11b–Super Bright 645 (64-0112-82, eBioscience), CD11c–BV421 (117329, BioLegend), Ly6C–PerCP-Cy5.5 (128011, BioLegend), F4/80–PE-Cy7 (123113, BioLegend), Gr-1–FITC (108405, BioLegend), CCR2–BV785 (150621, BioLegend), MHC-II–Spark UV 387 (107670, BioLegend) and PDL1–APC (124312, BioLegend). Samples for all flow cytometric studies were incubated with Fixable Viability Dye eFluor 780 (65-0865-14, eBioscience). Stained cells were analysed using an LSRFortessa Cell Analyzer (BD), and acquired results were analysed using FlowJo software (v10.9.0, FlowJo, BD). AccuCheck Counting Beads (PCB100, Invitrogen) were used for absolute cell number assessment.
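Bead-based absolute counting generally scales the ratio of cell events to bead events by the known number of beads added to the sample. The sketch below illustrates that general arithmetic only; it is not the authors' analysis script. The function name and all example numbers are hypothetical, and the number of beads added must be taken from the lot-specific bead concentration and the volume pipetted.

```python
def absolute_cells_per_ul(cell_events, bead_events, beads_added, sample_volume_ul):
    """Estimate cells per microlitre of sample from counting-bead events.

    cells/ul = (cell events / bead events) * beads added / sample volume
    """
    if bead_events <= 0 or sample_volume_ul <= 0:
        raise ValueError("bead events and sample volume must be positive")
    return cell_events / bead_events * beads_added / sample_volume_ul


# Hypothetical acquisition: 12,000 gated cells, 4,000 bead events,
# 50,000 beads spiked into 100 ul of sample.
print(absolute_cells_per_ul(12_000, 4_000, 50_000, 100.0))
```

Because cells and beads are acquired from the same tube, the estimate does not depend on how much of the sample the cytometer actually took up.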
Flow cytometry
mVenus quantification was performed using a MACSQuant VYB (Miltenyi Biotec) flow cytometer. When DNA content quantification was required, Hoechst 33342 (stock 10 µg ml−1) was added to the medium of adherent cells in culture to a final concentration of 1 ng ml−1. Cells were incubated in Hoechst-containing medium for 45 min before analysis.
Intrahepatic immune cells were prepared as above and then run on a BD Fortessa flow cytometer (Becton Dickinson); data were analysed using FlowJo v10. The gating strategy is provided in the Supplementary Information.
Protein expression by immunoblotting and immunofluorescence
Immunofluorescence and immunoblotting, using SDS–PAGE gels of various concentrations, were performed as previously described48.
The primary antibodies (and their dilutions) for immunoblotting included: anti-β-actin (A5441, Sigma; AC15, mouse monoclonal, 1:5,000); anti-HRAS (sc29, Santa Cruz Biotechnology; F235, mouse monoclonal, 1:1,500); anti-GFP (632377, Clontech; rabbit polyclonal, 1:1,000); anti-IL-6 (MAB2061, R&D Systems; clone #1936, mouse monoclonal, 1:250); anti-IL-8 (MAB208, R&D Systems; clone #6217, mouse monoclonal, 1:500); anti-cyclin A (c4710, Sigma; CY-A1, mouse monoclonal, 1:1,000); and anti-p21 (sc-6246, Santa Cruz; F5, mouse monoclonal, 1:1,000). The primary antibodies (and their dilutions) for immunofluorescence included: anti-IL-8 (MAB208, R&D Systems; clone #6217, mouse monoclonal, 1:250); anti-BrdU (555627, BD Biosciences; 3D4, 1:500); and anti-phospho-histone H2A.X (Ser139) (05-636, Merck; JBW301, mouse monoclonal, 1:200, pH 8.0 for formalin-fixed paraffin-embedded sections).
The secondary antibody used was goat anti-mouse IgG (Alexa Fluor 555, 1:1,000; A-11034, Thermo Fisher) in PBS-T. Cells were counter-stained with DAPI at 1 μM in the secondary antibody solution. Fluorescence images were obtained using a Leica DMI6000B epifluorescence light microscope or a Leica Stellaris 8 confocal microscope, with LAS X software versions 3.7.5.24914 or 4.7.0 (Leica), respectively. Uncropped immunoblot images can be found in the Supplementary Information.
IHC
Formalin-fixed paraffin-embedded mouse and human tissues underwent heat-induced epitope retrieval in citrate (pH 6) or Tris-EDTA (pH 9) buffer and were then stained with the primary antibodies listed below at the indicated concentrations. Staining was visualized manually using the ImmPRESS IHC detection kit according to the manufacturer’s instructions, followed by counterstaining with haematoxylin. Alternatively, automated chromogenic immunohistochemical staining was performed on a Leica Bond Max (Leica) using the polymer refine detection and refine red detection kits (Leica). All tissue sections were scanned on a Leica AT2 at ×20 or ×40 magnification and a resolution of 0.5 μm per pixel.
The following primary antibodies (and their dilutions) were used: anti-GFP (ab13970, Abcam; chicken polyclonal, 10 µg ml−1, pH 6.0); anti-RAS (ab52939, Abcam; EP1125Y, rabbit monoclonal, 1:1,000, pH 6.0); anti-p-ERK1/2 (9101, Cell Signaling Technology; rabbit polyclonal, 1:800, pH 6.0); anti-CK8 (MABT329, DSHB; TROMA-1, rat monoclonal, 2.98 µg ml−1); anti-CK19 (MABT913, DSHB; TROMA-III, rat monoclonal, 0.058 µg ml−1); anti-mouse nestin (MAB353, Chemicon; rat-401, mouse monoclonal, 1:200, pH 6.0); anti-human nestin (MAB5326, Chemicon; 10C2, mouse monoclonal, 1:120, pH 6.0); anti-AFP (sc-8399, Santa Cruz; C3, mouse monoclonal, 1:50, pH 6.0); anti-mouse DLK1 (FAB8634T, R&D Systems; 1168B, rabbit monoclonal, 1:200, pH 9.0); anti-human DLK1 (MAB1144, R&D Systems; 211309, mouse monoclonal, 4 µg ml−1, pH 9.0); anti-NOTCH1 (3608, Cell Signaling Technology; D1E11, rabbit monoclonal, 1:200, pH 6.0); anti-TGFβ (3709, Cell Signaling Technology; 56E4, rabbit monoclonal, 1:100, pH 6.0); anti-mouse CD4 (ab183685, Abcam; EPR19514, rabbit monoclonal, 0.3205 μg ml−1, pH 9.0); anti-mouse CD8α (98941, Cell Signaling Technology; D4W2Z, rabbit monoclonal, 1:200, pH 9.0); anti-mouse F4/80 (MCA497, Serotec; CLA3-1, rat monoclonal, 1:20, pH 6.0); anti-mouse FOXP3 (14-5773, eBioscience; FJK-16s, rat monoclonal, 5 μg ml−1, pH 9.0); anti-human CD4 (M7310, Dako; 4B12, mouse monoclonal, 1:50, pH 9.0); anti-human CD8 (RM-9116-S, Thermo Fisher Scientific; SP16, rabbit monoclonal, 1:100, pH 9.0); and anti-human CD68 (NCL-L-CD68, Novocastra; 514H12, mouse monoclonal, 1:50, pH 9.0).
The following horseradish peroxidase (HRP) polymer kits were used for manual IHC: M.O.M. ImmPRESS HRP Polymer Kit (MP-2400, Vector Laboratories); ImmPRESS HRP Horse Anti-Rabbit IgG Polymer Kit (MP-7401, Vector Laboratories); and ImmPRESS HRP Goat Anti-Rat IgG Polymer Kit (MP-7404, Vector Laboratories).
Image analysis and quantification
For in vitro slides, quantification of γH2AX was performed in Fiji (ImageJ2 v2.14.0). In brief, a nuclear mask was applied based on the DAPI channel, and then the mean γH2AX intensity was measured per cell.
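The mask-then-measure logic can be sketched in a few lines. The example below is an illustration in pure Python rather than the Fiji workflow itself: it assumes a simple fixed intensity threshold on the DAPI channel and 4-connected pixels to define each nucleus, neither of which is specified in the text, and the function name and toy images are hypothetical.

```python
from collections import deque


def mean_intensity_per_nucleus(dapi, signal, threshold):
    """Label DAPI-positive regions and return the mean signal per region.

    dapi, signal: equally sized 2D lists of pixel intensities.
    threshold: assumed fixed DAPI cut-off defining the nuclear mask.
    """
    rows, cols = len(dapi), len(dapi[0])
    labels = [[0] * cols for _ in range(rows)]
    means = {}
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if dapi[r][c] <= threshold or labels[r][c]:
                continue  # background pixel, or already part of a nucleus
            next_label += 1
            labels[r][c] = next_label
            queue = deque([(r, c)])
            total, count = 0.0, 0
            # Breadth-first flood fill over 4-connected neighbours.
            while queue:
                y, x = queue.popleft()
                total += signal[y][x]
                count += 1
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and dapi[ny][nx] > threshold
                            and not labels[ny][nx]):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            means[next_label] = total / count
    return means


# Toy 3 x 5 images containing two nuclei.
dapi = [[9, 9, 0, 0, 0],
        [9, 9, 0, 8, 8],
        [0, 0, 0, 8, 8]]
gh2ax = [[4, 6, 1, 1, 1],
         [5, 5, 1, 20, 22],
         [1, 1, 1, 18, 20]]
print(mean_intensity_per_nucleus(dapi, gh2ax, threshold=5))
```

Measuring within the DAPI mask restricts the readout to nuclear pixels, so cytoplasmic or background signal in the γH2AX channel does not contribute to the per-cell mean.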
For in vivo liver tissue sections, quantification of γH2AX was performed manually after scanning using an Axioscan 7 (Zeiss) at ×40 magnification. Random areas were selected and at least 100 NRAS+ or NRAS− cells per liver section were counted. Representative images were taken using a TCS SP5 confocal microscope (Leica). To measure the percentage of positive tissue area, image analysis was performed using HALO (Indica Labs, v3.3.2541) with the Area Quantification v1.0 algorithm following digitization of the tissue sections. IHC images were trained independently to provide the best accuracy for the positive area and all the slides were reviewed manually following analysis to assess accuracy. In brief, the total section area was highlighted using the Flood fill annotation tool, and a minimum tissue optical density of 0.035 was used to eliminate non-tissue areas. Percentage stain-positive tissue was used as the readout for statistical analysis performed using GraphPad Prism 10.2.1 (339).
Tumour scoring
Haematoxylin and eosin (H&E)-stained tissue sections were reviewed by a board-certified pathologist (S.J.A.) who was blinded to the experimental design. Tumours were graded according to the WHO classification of digestive system tumours50. Differentiation scores were assigned: DS1, well differentiated; DS2, moderately differentiated; DS3, poorly differentiated; and DS4, undifferentiated. For morphologically heterogeneous tumours, or where multiple lesions were present in the same liver, tumours were classified based on the worst grade.
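The worst-grade rule amounts to taking the maximum differentiation score across all lesions in a liver. A minimal sketch follows; the function name and the example scores are hypothetical and only illustrate the rule stated above.

```python
GRADES = {
    1: "well differentiated",
    2: "moderately differentiated",
    3: "poorly differentiated",
    4: "undifferentiated",
}


def classify_liver(lesion_scores):
    """Return the overall (DS label, description) for one liver.

    lesion_scores: differentiation scores (1-4), one per lesion or region.
    The liver is classified by its worst (highest) score.
    """
    if not lesion_scores:
        raise ValueError("at least one lesion score is required")
    for score in lesion_scores:
        if score not in GRADES:
            raise ValueError(f"invalid differentiation score: {score}")
    worst = max(lesion_scores)
    return f"DS{worst}", GRADES[worst]


# Hypothetical liver with three lesions scored DS1, DS3 and DS2.
print(classify_liver([1, 3, 2]))
```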
Bulk RNA-seq
RNA was extracted from five biological replicates per condition using the Qiagen RNeasy plus kit according to the manufacturer’s instructions and quality checked using a Bioanalyser Eukaryote Total RNA Nano Series II chip (5067-1511, Agilent). Libraries were prepared using the TruSeq Stranded mRNA Library Prep Kit (20020594, Illumina) according to the manufacturer’s instructions and sequenced using the HiSeq-4000 platform (Illumina). Reads were aligned to the human genome version GRCh38 (downloaded from https://www.ensembl.org/Homo_sapiens/Info/Index) using STAR51, and per-gene read counting was performed using the featureCounts function of the subread package in R52. Low-quality reads (mapping quality less than 20) and known adapter contamination were filtered out using Cutadapt53. Differential expression analysis was performed with edgeR54,55, comparing each of the induced samples with their uninduced equivalent. Differentially expressed genes were identified using edgeR’s glmTreat function using a fold change of 1.2 in either direction and a false discovery rate cut-off of 0.05.
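To make the false discovery rate cut-off concrete, the sketch below applies a Benjamini–Hochberg adjustment to a list of P values and keeps genes with an adjusted value below 0.05. This is an illustration under assumptions: the text does not state which adjustment method was used, the gene names and P values are invented, and the sketch does not reproduce glmTreat, which tests against the fold-change threshold directly rather than filtering on fold change afterwards.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical genes and raw P values.
genes = ["geneA", "geneB", "geneC", "geneD", "geneE"]
pvals = [0.001, 0.008, 0.039, 0.2, 0.6]
fdr = benjamini_hochberg(pvals)
significant = [g for g, q in zip(genes, fdr) if q < 0.05]
print(significant)
```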
Gene set enrichment and pathway analysis
Rank-based gene set enrichment analysis and generating the associated random-walk plots were performed using the fgsea R package56. Expression values were tested against gene sets curated as part of the MSigDB, a collection of gene sets representing coherently expressed signatures designed to represent well-defined biological states or processes57. Overlap-based pathway and gene ontology enrichment was performed using the web-based Enrichr platform58,59.
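The random-walk plots show a running-sum statistic that rises when a gene in the set is encountered in the ranked list and falls otherwise. The sketch below implements the classical weighted running sum to illustrate the idea; it is not the fgsea implementation, it omits the significance estimation that fgsea performs, and the gene names and scores are hypothetical.

```python
def running_enrichment(ranked_genes, scores, gene_set, weight=1.0):
    """Return (enrichment score, running-sum walk) for one gene set.

    ranked_genes: gene names sorted by decreasing score.
    scores: ranking metric for each gene, in the same order.
    """
    in_set = [g in gene_set for g in ranked_genes]
    hit_weights = [abs(s) ** weight if hit else 0.0
                   for s, hit in zip(scores, in_set)]
    total_hit = sum(hit_weights)
    n_miss = len(ranked_genes) - sum(in_set)
    if total_hit == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")
    running, walk = 0.0, []
    for hit, w in zip(in_set, hit_weights):
        # Step up in proportion to the score for hits, down uniformly for misses.
        running += w / total_hit if hit else -1.0 / n_miss
        walk.append(running)
    # The enrichment score is the maximum deviation of the walk from zero.
    es = max(walk, key=abs)
    return es, walk


# Hypothetical ranked list in which the gene set sits at the top.
genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
scores = [3.0, 2.0, 1.0, -1.0, -2.0, -3.0]
es, walk = running_enrichment(genes, scores, {"g1", "g2"})
print(es, walk)
```

A set concentrated at the top of the ranking gives a large positive score, whereas a set concentrated at the bottom drives the walk negative.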
All summary plots were generated in R, mostly using the ggplot2 package60. Upset plots were generated using the UpSetR package61, and heatmaps were generated using the pheatmap package, which also implements hierarchical clustering for the ordering of columns and rows where indicated.
Cancer Cell Line Encyclopedia and TCGA
Cancer Cell Line Encyclopedia expression data were downloaded from the DepMap Portal62. The liver cell lines were grouped into well-differentiated and poorly differentiated lines based on previous classification44,45. When projected into two dimensions, differentiation status of the cell lines was the primary driver of the first principal component. As such, genes were ranked from well to poorly differentiated based on their loadings along this principal component. TCGA expression and mutation data were downloaded from the GDC data portal25. Survival analysis and visualization of these data were performed using the survminer R package. For the diagnostic value of gene signatures, an intersect was taken between gene lists associated with the indicated Hoshida subclasses and either the Notch1-associated or Dlk1-associated branches in our data.
Human premalignant liver patient cohort
All biological samples were collected with informed consent from Addenbrooke’s Hospital, Cambridge, UK, according to procedures approved by the Office for Research Ethics Committees Northern Ireland (ORECNI; 20/NI/0109). All participants consented to publication of research results.
scRNA-seq and analysis
For hepatocyte scRNA-seq, livers were perfused with 0.05% collagenase in Hank’s balanced salt solution (HBSS) to partial dissociation, then cut into pieces with a razor blade or scalpel, in HBSS with 0.015% collagenase and 0.2% dispase. The resulting cell suspensions were incubated with 0.02% DNase in HBSS before red blood cell lysis (00-4333-57, eBioscience; 5 min on ice) and then washed with HBSS with 0.02% DNase (centrifuged for 7 min at 400g at 4 °C) to isolate hepatocytes. For RPE1 scRNA-seq, cells were trypsinized into single-cell suspension.
Cells isolated from the different conditions (RPE1) or mice (hepatocytes) were individually labelled with 1 μg of BioLegend TotalSeq Cell Hashing antibodies diluted in cell staining buffer (PBS, 3% FBS and 0.05% azide) for 30 min at 4 °C, and then washed three times with cell staining buffer (centrifuged for 7 min at 400g at 4 °C). Hepatocytes were flow sorted for mVenus positivity according to the gating strategy in Supplementary Information. In each cohort (Figs. 1 and 3), we used two mice per condition, except for non-oncogenic CAGGS–NRASG12V/D38A (one mouse) in the first cohort (Fig. 1). For RAS-induced RPE1 cells (day 6 post-4-OHT treatment), we used both individual subpopulations and a mixed population, with a mixed population (no 4-OHT treatment) as control. This allowed us to pool all conditions into the same experimental run. Cells were then pooled and resuspended to a concentration of 800 cells per microlitre for single-cell encapsulation using the Chromium Single Cell B Chip Kit (PN-1000073, 10X Genomics), followed by library prep using the Chromium Single Cell 3′ GEM Library & Gel Bead Kit v3 (PN-1000075, 10X Genomics) for the gene expression library and the Chromium Single Cell 3′ Feature Barcode Library Kit (PN-1000079, 10X Genomics) for the hashtag-oligo library. Both libraries were then pooled for paired-end sequencing on the HiSeq-4000 (OIS dataset and RPE1 dataset) or the Illumina NovaSeq 6000 platform (tumours dataset).
Hashtags used for each sample were: for the liver OIS dataset (TotalSeq-A anti-mouse), G12V-1 hashtag 1 (ACCCACCAGTAAGAC); G12V-2 hashtag 2 (GGTCGAGAGCATTCA); and D38A hashtag 3 (CTTGCCGCATGTCAT).
For the RPE1 dataset (TotalSeq-A anti-human), monoculture ‘S’ d6 hashtag 1 (GTCAACTCTTTAGCG); monoculture ‘M’ d6 hashtag 2 (TGATGGCCTATTGGG); monoculture ‘L’ d6 hashtag 3 (TTCCGCCTCTCTTTG); monoculture ‘XL’ d6 hashtag 4 (AGTAAGTTCAGCGTA); co-culture d0 hashtag 5 (AAGTATCGTTTCGCA); and co-culture d6 hashtag 6 (GGTTGCCAGATGTCA).
For the liver tumours dataset (TotalSeq-B anti-mouse), mVenus only-1 hashtag 1 (ACCCACCAGTAAGAC); mVenus only-2 hashtag 2 (GGTCGAGAGCATTCA); day 12-1 hashtag 3 (CTTGCCGCATGTCAT); day 12-2 hashtag 4 (AAAGCATTCTTCACG); day 30-1 hashtag 5 (CTTTGTCTTTGTGAG); day 30-2 hashtag 6 (TATGCTGCCACGGTA); tumour-1 hashtag 7 (GAGTCTGCCAGTATC); tumour-2 hashtag 8 (TATAGAACGCCAGGC); non-tumour-1 hashtag 9 (TGCCTATGAAACAAG); and non-tumour-2 hashtag 10 (CCGATTGTAACAGAC).
Resulting reads were aligned using the CellRanger pipeline to the mm10 genome assembly for the hepatocyte datasets and hg38 for the RPE1 dataset. Demultiplexing based on expression of hashtag oligos was performed using the CITE-seq-Count command, with no mismatches allowed. As all conditions to be compared were pooled into the same experimental run, direct analysis could be performed without the need for integration or batch correction. After quality-control filtering to remove low-quality sequenced cells, all downstream analysis, including pseudotime analysis, a technique that models single-cell transcriptional change as a continuum, was performed using the Seurat63,64, Monocle65 or dynverse66 implementations in R.
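Hashtag demultiplexing with no mismatches allowed reduces to an exact lookup of each hashtag read against the known barcode list. The sketch below illustrates this with the three liver OIS hashtags listed above. It is not the CITE-seq-Count implementation: the cell barcodes, the example reads and the `min_fraction` rule for calling a cell are hypothetical.

```python
from collections import Counter, defaultdict

# Hashtag sequences for the liver OIS dataset, as listed in the text.
OIS_HASHTAGS = {
    "ACCCACCAGTAAGAC": "G12V-1",
    "GGTCGAGAGCATTCA": "G12V-2",
    "CTTGCCGCATGTCAT": "D38A",
}


def count_hashtags(reads, hashtags):
    """Count hashtag reads per cell, keeping exact sequence matches only.

    reads: iterable of (cell_barcode, hashtag_sequence) pairs.
    """
    counts = defaultdict(Counter)
    for cell, tag in reads:
        sample = hashtags.get(tag)  # zero mismatches: any other sequence is dropped
        if sample is not None:
            counts[cell][sample] += 1
    return counts


def assign_sample(tag_counts, min_fraction=0.8):
    """Assign a cell to its dominant hashtag (hypothetical calling rule)."""
    total = sum(tag_counts.values())
    sample, top = tag_counts.most_common(1)[0]
    return sample if top / total >= min_fraction else "ambiguous"


# Hypothetical reads: cellA carries one hashtag plus a read with a
# single-base mismatch; cellB carries two different hashtags.
reads = (
    [("cellA", "ACCCACCAGTAAGAC")] * 3
    + [("cellA", "ACCCACCAGTAAGAT")]
    + [("cellB", "GGTCGAGAGCATTCA"), ("cellB", "CTTGCCGCATGTCAT")]
)
counts = count_hashtags(reads, OIS_HASHTAGS)
calls = {cell: assign_sample(c) for cell, c in counts.items()}
print(calls)
```

Cells carrying comparable counts of two hashtags, as for the hypothetical `cellB`, are typical of doublets and would normally be excluded during quality-control filtering.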
Statistical analysis
Statistical analyses were carried out in R (v4.1.1) or using the Prism 10 built-in analysis (v10.1.1). The number (n) of biologically independent samples is described in the figure legends and Methods, and the data points are shown with the bar charts. Tests used to assess statistical differences between conditions are described in the respective figure legends. See Source Data.
For the mouse scRNA-seq experiments, in each cohort (Figs. 1 and 3), we used two mice per condition, except for non-oncogenic CAGGS-NRASG12V/D38A (one mouse) in the first cohort (Fig. 1). The western blot in Fig. 2d was repeated in three independent experiments, and results were reproduced. Figure 4e shows representative images from a cohort of 13 patients with hepatitis C (further patient details are in Supplementary Table 3). The immunofluorescence in Extended Data Fig. 3b was repeated in three independent experiments. The IHCs in Extended Data Figs. 6 and 10 were repeated in the number of mice (n) indicated in the figures, and results were reproduced as shown in the associated quantifications.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.