Covariation MS uncovers a protein that controls cysteine catabolism

Mice

A heterogeneous cohort of 163 female DO mice (24 weeks, n = 110; 18 months, n = 10; 22 months, n = 29; 28 months, n = 14) were used, and details were reported previously¹⁰. Mice were originally from the Jackson Laboratory⁶¹, derived from eight founder strains: A/J, C57BL/6J, 129S1/SvImJ, NOD/ShiLtJ, NZO/H1LtJ, CAST/EiJ, PWK/PhJ and WSB/EiJ. The cohort were group-housed in a temperature-controlled (20–22 °C) room on a 06:00 to 18:00 light:dark cycle upon arrival and fed a chow diet during acclimatization before transferring to a thermoneutrality incubator (29 °C) and fed a rodent high fat diet (OpenSource Diets, D12492) with 60% kcal% fat, 20% kcal% carbohydrate and 20% kcal% protein (ref. ⁶²). Mice were fed ad libitum for eight weeks. All animal-related experiments were approved by Institutional Animal Care and Use Committee of the Beth Israel Deaconess Medical Center. Sample size for all animal work were chosen based on our previous work¹⁰. Randomization was performed to minimize batch effects. Researchers were not blinded to animal groups.

Tissue extraction

Mice were euthanized by rapid cervical dislocation, and tissues were extracted and frozen in less than 20 s following euthanasia using the freeze-clamping method⁶³. Freeze-clamped tissues were split into small pieces in liquid nitrogen and stored in a −80 freezer.

Proteomics sample preparation

DO samples from each tissue were randomly assigned to 12 batches for TMT-based proteomics. Tissue pellets were weighed while frozen and lysed in the lysis buffer containing 100 mM 4-(2-hydroxyethyl)-1-piperazineethanesulfonic acid (HEPES) pH 8.5, 8 M urea, 2% sodium dodecyl sulfate (SDS) and one Roche cOmplete protease inhibitors tablet per 15 ml. Lysis was performed with Tissuelyser II (Qiagen) in a 4 °C cold room to an initial concentration of ~ 4–10 mg protein per ml buffer. After centrifugation, a bicinchoninic acid (BCA) assay was performed to measure protein concentration. On the basis of this measurement, samples were diluted to 1 mg protein per ml buffer. For each tissue, a ‘bridge’ sample was generated by mixing 50 μg of each sample, in order to serve as a standard to reflect the average abundance of each protein across the entire cohort. Two-hundred micrograms of protein from each sample were treated with 5 mM tris(2-carboxyethyl)phosphine (TCEP) at 37 °C for 1 h to reduce protein disulfide bonds, followed by addition of 25 mM iodoacetamide for 25 min at room temperature in the dark to alkylate free thiols. Methanol–chloroform precipitation⁶⁴ was then performed to pellet proteins. The pellets were then resuspended in 200 mM N-(2-hydroxyethyl)piperazine-N′-(3-propanesulfonic acid) (EPPS) buffer pH 8, and digested with Lys-C and trypsin at an enzyme:protein ratio of 1:100 at 37 °C overnight. An additional round of 4 h trypsin digestion was performed the next day. The resulting mixture was then subjected to centrifugation and peptide quantification with microBCA (Thermo) kits. Based on this, 50 μg peptides from each sample were labelled by 100 μg of the TMTpro-16 reagents⁶⁵ for 1 h at room temperature following the streamlined-TMT protocol⁶⁶. Each TMT plex contained 15 randomized samples and a bridge sample. After a ratio check to confirm peptide loading in each TMT channel and TMT labelling efficiency, the reaction was quenched using 5 μl of 5% hydroxylamine for 15 min. All samples in a plex were then mixed based on the ratio check, desalted with Waters Sep-Pak cartridges, freeze-dried overnight using a speed-vac system. Three-hundred micrograms per TMT plex of dried peptides were resuspended in 10 mM ammonium bicarbonate pH 8.0, 5% acetonitrile (HPLC buffer A), and fractionated into 24 fractions with basic pH reversed-phase HPLC using an Agilent 300 extend C18 column. A 50-min linear gradient in 13–43% buffer B (10 mM ammonium bicarbonate, 90% acetonitrile, pH 8.0) at a flow rate of 0.25 ml min⁻¹ was used for fractionation. Each fraction was then purified with StageTips, dried in a speed-vac, and reconstituted in a solution containing 5% acetonitrile (ACN) and 4% formic acid. Sample preparation for other proteomics experiments in this work was conducted using the same workflow as described above, without the bridge channel.

LC–MS for proteomics

Two micrograms of peptides in each fraction were analysed by liquid chromatography–MS (LC–MS). Peptides were loaded onto a 100-μm capillary column packed in-house with 35 cm of Accucore 150 resin (2.6 μm,150 Å). Peptides were analysed by Orbitrap Eclipse Tribrid Mass Spectrometer (Thermo) coupled with an Easy-nLC 1200 (Thermo) using a 180-min gradient: 2%–23% ACN, 0.125% formic acid at 500 nl min⁻¹ flow rate. A FAIMSPro⁶⁷ (Thermo) device was used with compensation voltages at −40V/−60V/−80V. Data-dependent acquisition was used with a mass range of m/z 400–1,600 and 2 s cycles. MS1 resolution was set at 120,000, and singly charged ions were not further sequenced. MS2 was performed with standard automatic gain control (AGC) and 35% normalized collisional energy (NCE), and a dynamic exclusion window of 120 s and maximum ion injection time of 50 ms. Fragment ions were then selected for multi-notch SPS-MS3 method⁶⁸ with 45% NCE to quantify TMT reporter ions. For AP–MS experiments, samples were analysed using the same LC–MS system. A 60-min gradient, with 2%–25% ACN and 0.125% formic acid was used for analysis without the FAIMSPro device. MS1 was performed with 120,000 resolution, 375–1,500 m/z scan range, and 50 ms maximum injection time. MS2 was performed with a 0.7 Th isolation window, 30% normalized NCE, 35 ms maximun ion injection time and 120 s window of dynamic exclusion. Only species with 2–5 charges were selected for analysis, and priority was set to species with lower charge states.

Database searching

Database searching was conducted with the Comet algorithm⁶⁹ on Masspike, an in-house search engine reported previously¹⁷. All mouse (Mus musculus) entries from UniProt (http://www.uniprot.org, downloaded 29 July 2020) and the reversed sequences, as well as common contaminants (for example, keratins and trypsin) were used for searching, with the following parameters: 25 ppm precursor mass tolerance; 1.0 Da product ion mass tolerance; tryptic digestion (cleaving at lysine and arginine residues) with up to three missed cleavages. Methionine artificial oxidation (+15.9949 Da) was set as a variable modification. Carboxyamidomethylation (+57.0215) on cysteine was set as a static modification. For TMT-based experiments, TMTpro (+304.2071 Da) on lysine and peptide N terminus were set as additional static modifications. Peptides were filtered with a target-decoy^17,70,71 method to control the FDR to <1%. Parameters such as XCorr, ΔCn, missed cleavages, peptide length, charge state and precursor mass accuracy were used for filtering. Short peptides (<7 amino acids) were discarded. Proteins were assembled from peptides, and protein-level FDR was controlled, with the Picked FDR method⁷², to <1% combining all MS runs. For DO samples, an additional round of peptide filtering was applied, in order to remove peptides that are not shared across samples due to polymorphism⁷³. All founder strain protein sequences (A/J, C57BL/6J, 129S1/SvImJ, NOD/ShiLtJ, NZO/H1LtJ, CAST/EiJ, PWK/PhJ and WSB/EiJ) were downloaded from Ensembl⁷⁴ and in silico tryptic digested using the Protein Digestion Simulator (Pacific Northwest National Laboratory). A list containing peptides that are not shared across was used to filter out peptides from the DO experiments.

Proteomics quantification

For TMT-based quantification, peptide abundance was measured by TMT reporter ions. Each reporter was scanned using a 0.003 Da window, selecting m/z with the highest intensity. Isotopic impurities were corrected based on the manufacturer’s specifications, and TMT signal-to-noise ratio (S/N) was used for quantification. Peptides with summed S/N lower than 320 across 16 channels of each TMT plex or isolation specificity lower than 70% were discarded. Proteins were quantified by summing up the TMT S/N values of peptides, and protein quantification was normalized to ensure equal protein loading across all TMT channels. For each DO sample, protein relative abundance was presented as log₂ sample/bridge ratio, using the bridge sample in the same TMT plex as the biological sample. The values were analysed in both raw and median-centred formats^30,75. Both returned nearly identical results for subsequent bioinformatic analyses, and we reported values without median-centring in this work. For AP–MS experiments, peptides were quantified based on peak area of MS1, and proteins were quantified by summing up the abundance of peptides.

Metabolomics sample preparation

Metabolites from tissues were extracted with pre-chilled 80% methanol containing three internal standards (0.05 ng μl⁻¹ thymine-d₄, 0.05 ng μl⁻¹ inosine-¹⁵N₄, and 0.1 ng μl⁻¹ glycocholate-d₄), at a 4:1 buffer volume:sample mass ratio. Lysis was performed with Tissuelyser II (Qiagen) in a 4 °C cold room with three 30 s cycles using the highest power setting. Between every cycle, samples were chilled for 30 s. The mixture was spun in a 4 °C centrifuge at 18,000g for 15 min. The supernatant was collected for metabolomics analysis, and the pellet was saved for proteomics analysis. A ‘pool’ sample was generated by equally mixing all samples, representing the average metabolite abundance across all samples. Samples were then diluted tenfold with extraction buffer before loading to the LC–MS for analysis. For cell-based experiments, 12-well plates were used, and 100 μl of pre-chilled extraction buffer was used for every well. The mixture was then processed as described above.

LC–MS and quantification for metabolomics

Ten microlitres of metabolite extracts were loaded onto a Luna-HILIC column (Phenomenex) using an UltiMate-3000 TPLRS LC with 10% buffer A (20 mM ammonium acetate and 20 mM ammonium hydroxide in water) and 90% buffer B (10 mM ammonium hydroxide in 75:25 v/v acetonitrile/methanol). A 10-min linear gradient to 99% mobile phase A was used to analyse metabolites. Liquid chromatography was coupled with a Q-Exactive HF-X mass spectrometer (Thermo). Liver metabolites were analysed by an LC–MS system consists of a Vanquish LC coupled with an Orbitrap Exploris 120 mass spectrometer (Thermo) using the same column and gradient. Negative or positive ion mode was used with full scan analysis over 70–750 m/z at 60,000 resolution, 10⁶ AGC, and 100 ms maximum ion accumulation time. In-source CID was applied at 5.0 eV. Ion spray voltage was 3.8 kV, capillary temperature was 350 °C, probe heater temperature was 320 °C, sheath gas flow was set at 50, auxiliary gas was set at 15, and S-lens RF level was set at 40. Metabolite peaks were analysed using TraceFinder (Thermo) software through a targeted approach. Peaks were matched to a metabolite library of ~800 validated metabolites on the LC–MS system, including metabolic tracers and peak area was used to quantify metabolite abundance. To account for run-to-run variations, metabolite abundance was adjusted by the average peak area of three internal standards. For bile acids, standards for individual bile acid species were used to acquire the retention time for subsequent quantification. Total bile acid content was obtained by summing up the peak areas of individual species. For DO samples, the sample loading sequence was randomized, and a pool was run every ten biological samples. This pool was used to calculate a sample-to-pool ratio for every metabolite in the ten sample runs before the pool, representing the relative abundance of a metabolite in a sample compared to the average abundance of this metabolite across the entire cohort. Other targeted LC–MS analyses of metabolite extracts were performed on a Vanquish HPLC System coupled to an Exploris 120 mass spectrometer equipped with a HESI ion source (Thermo Fisher Scientific). The LC system was controlled by Chromeleon 7.3.1 (Thermo Fisher Scientific), and the MS is controlled by Xcalibur 4.7.69.37 (Thermo Fisher Scientific). When analysing hypotaurine, taurine, cysteine and cystine, metabolites were separated on the Luna-HILIC column as described above. When analysing bile acid and bile acid-taurine conjugates, metabolites were separated on an ACQUITY UPLC HSS T3 column (150 mm × 2.1 mm, particle size 1.8 μm) maintained at 35 °C. Solvent A: 5% acetonitrile in water with 0.1% formic acid; solvent B: 5% water in acetonitrile with 0.1% formic acid. A flow rate of 0.4 ml min⁻¹ was used and A/B gradient was as follows: being isocratic at 1% B for 5 min, linearly increasing to 99% B at 17.5 min, keeping at 99% B for 3.5 min, shifting back to 1% B in 0.1 min and holding at 1% B until 25 min. Mass spectrometer parameters: spray voltage positive electrospray ionization (ESI) mode +3.7 kV or negative ESI −2.5 kV; ion transfer tube temperature 275 °C; vaporizer temperature 320 °C; sheath gas 50 arbitrary units (a.u.); auxiliary gas 10 a.u.; sweep gas 1 a.u.; S-lens RF 70%; resolution 120,000; AGC target standard. The instrument was calibrated with FlexMix calibration solution (Thermo Fisher Scientific). When using Luna-HILIC column, each sample was analysed in ESI- mode; when using T3 column, each sample was analysed in ESI+ and ESI− switching mode. The m/z range was 70−800. Identity of metabolites had been confirmed with standards. Quantification of LC–MS data were performed through peak detection and integration functions in Freestyle software (Thermo Fisher Scientific) using mass ranges of calculated [M-H]⁻ m/z ± 5 ppm. In the case of quantifying stable isotope-incorporated metabolites, m/z window was manually examined to ensure exclusion of natural isotopes.

Hepatocyte isolation and transfection

Primary hepatocytes were isolated from 8- to 10-week-old male C57BL/6 mice by liver perfusion. Livers were perfused with liver digest medium (Invitrogen, 17703-034). The cell suspensions were filtered through a 70-µm cell strainer. Primary hepatocytes were collected by a Percoll (Sigma, P7828) gradient centrifugation. Cells were cultured in Dulbecco’s Modified Eagle Medium (DMEM) with 25 mM glucose, 10% fetal bovine serum (FBS), 2 mM sodium pyruvate, 1% penicillin/streptomycin, 1 μM dexamethasone and 100 nM insulin. After 12 h, plating medium was removed and incubated with maintenance medium (DMEM with 25 mM glucose, 0.2% bovine serum albumin (BSA), 2 mM sodium pyruvate, 1% penicillin/streptomycin). Cells were collected within 48 h. For transient transfection, hepatocytes were transfected with LRRC58 siRNAs (Sigma Aldrich, SASI_Mm02_00347085 and SASI_Mm01_00138492) or CDO1 (Sigma Aldrich, SASI_Mm01_00121551) using RNAiMAX (Invitrogen). SASI_Mm01_00138492: sense strand 5′-GUAUGACCCUCCGACUCUU[dT][dT]-3′; antisense strand 5′-AAGAGUCGGAGGGUCAUAC[dT][dT]-3′. SASI_Mm02_00347085: sense strand 5′-CUCAGAAGAUGAAGCCAGU[dT][dT]-3′; antisense strand 5′-ACUGGCUUCAUCUUCUGAG[dT][dT]-3′. SASI_Mm01_00121551: sense strand 5′-GAAGUUUAAUCUGAUGAUU[dT][dT]-3′; antisense strand 5′-AAUCAUCAGAUUAAACUUC[dT][dT]-3′.

Generation of LRRC58 overexpression Hep G2 line

Hep G2 cells were obtained from ATCC, authenticated by short tandem repeat profiling, and tested for mycoplasma contamination with mycoplasma-negative results. Cells were grown in EMEM (ATCC 30-2003), supplemented with 10% FBS (GeminiBio, 100–106) and 1% penicillin/streptomycin (Corning, 30-002-CI). Cells were detached using 0.25% trypsin (Gibco, 25200-056) and subcultured every 3–4 days. To generate the overexpression line, 12 μg of LRRC58 overexpression plasmid containing a Flag tag and an HA tag (BioPlex³⁶), 9 μg of psPAX2 (Addgene #12260), 4.5 μg of pMD2.G (Addgene #12259) were co-transfected into Lenti-X 293T cells with Lipofectamine 3000 according to the manufacturer’s instructions. The viral supernatant was filtered through a 0.45-μm polyethersulfone membrane (PES) filter 48 h post-transfection, and then infected into Hep G2 wild-type cell line in the presence of 8 μg ml⁻¹ Polybrene. Medium change was done 24 h post-transfection to complete medium. After incubation in complete medium for 24 h, cells were selected with 2 μg ml⁻¹ puromycin. After 5 days, cells were validated for LRRC58 expression by qPCR and proteomics.

Generation of LRRC58 knockdown in Hep G2 line

To generate LRRC58 knockdown cells, single guide RNA (sgRNA) oligonucleotide were designed as follows: forward: CACCGGCGCGCAGCTCTAAGAGCG; reverse: AAACCGCTCTTAGAGCTGCGCGCC. The following sgRNA oligonucleotides were used to generate the negative control cell line: forward: CACCGTTCGAAATGTCCGTTCGGT; reverse: AAACACCGAACGGACATTTCGAAC. DNA corresponding to sgRNAs was cloned into LentiCRISPRv2 (Addgene #52961). To generate the cell line, 12 μg of LentiCRISPRv2 containing LRRC58 sgRNA, 9 μg of psPAX2 (Addgene #12260), 4.5 μg of pMD2.G (Addgene #12259) were co-transfected into Lenti-X 293T cells with Lipofectamine 3000 according to the manufacturer’s instructions. The viral supernatant was filtered through a 0.45-μm PES filter 48 h post-transfection, and then infected into Hep G2 wild-type cell line in the presence of 8 μg ml⁻¹ Polybrene. Medium change was done 24 h post-transfection to complete medium. After incubation in complete medium for 24 h, cells were selected with 1 μg ml⁻¹ puromycin. After 5 days, cells were validated by qPCR and proteomics.

Immunoprecipitation of LRRC58 for immunoprecipitation–western blotting

Hep G2 LRRC58^OE cells lysed using radioimmunoprecipitation (RIPA) lysis buffer (Thermo Fisher, 89900) and protease inhibitors (Roche, 11836153001). The cell lysates were centrifuged at 21,000 rpm for 10 min at 4 °C and the supernatants were used for subsequent analysis. Samples were immunoprecipitated overnight at 4 °C with anti-Flag (Sigma Aldrich, F1804; 4 µg of antibody per sample) that was coated on protein G magnetic beads (Invitrogen, 10007D). After incubation, the beads were washed 6 times with 0.5% NP-40 in phosphate buffered saline (PBS) and 4 times in PBS. Sample was eluted using SDS–PAGE reducing sample buffer and heat samples at 95°C in a heating block for 10 min.

Immunoprecipitation of LRRC58 for AP–MS

Hep G2 LRRC58 overexpression cells were lysed in 0.5%. nonidet P-40 (NP-40) in PBS. The cell lysates were centrifuged at 21,000 rpm for 10 min at 4 °C and the supernatants were used for subsequent analysis. Samples were immunoprecipitated overnight at 4 °C with anti-Flag (Sigma Aldrich, F1804; 4 µg of antibody per sample) that was coated on protein G magnetic beads (Invitrogen, 10007D). A background control sample was generated by incubating the lysate with protein G magnetic beads that were not antibody-coated. After incubation, the beads were washed 5 times with 0.5% NP-40 in PBS and 5 times in PBS. Elution was achieved by 1% trifluoroacetic acid pH 2.5 at 60 °C for 5 min. Immunoprecipitated samples were dried in a speed-vac and then resuspended with 200 mM EPPS. Proteins were digested overnight at 37 °C with 4 μg trypsin and 4 μg Lys-C. Peptides were then purified by C18 stage tip and analysed by LC–MS as described above.

Western blotting

Samples were isolated using radioimmunoprecipitation (RIPA) lysis buffer (Thermo Fisher, 89900) or IP lysis buffer (Pierce) with protease inhibitors (Roche, 11836153001). Homogenates were centrifuged at 21,000g for 10 min at 4 °C, and the supernatants were used for subsequent analyses. Protein concentration was determined using the BCA assay (Pierce). Protein lysates were denatured in Laemmli buffer (60 mM Tris, pH 6.8, 2% SDS, 10% glycerol, 0.05% bromophenol blue, 100 mM DTT), resolved by 4–12% NuPAGE Bis-Tris SDS–PAGE (Invitrogen), and transferred to a polyvinylidene difluoride (PVDF) membrane using an iBlot 2 (Invitrogen). Primary antibodies anti-β-actin (13E5, Cell Signaling Technologies, 1:1,000); anti-CDO1 (12589-1-AP, Proteintech, 1:1,000); anti-CUL5 (Bethyl Laboratories, A302-173A, 1:1,000), anti-ELOB (Proteintech, 10779-1-AP, 1:500), anti-ELOC (Proteintech 12450-1-AP, 1:1,000), anti-vinculin (Cell Signaling Technologies, 4650, 1:1,000), anti-Flag M2-HRP (A8592, Sigma Aldrich, 1:1,000) were diluted in tris buffered saline containing 0.05% Tween (TBS-T) and 5% BSA. Membranes were incubated overnight with primary antibodies at 4 °C. For secondary antibody incubation, HRP-conjugated secondary antibodies (Promega anti-mouse W402B, 1:5,000 or 1:10,000, and anti-rabbit W401B, 1:5,000 or 1:10,000) were diluted in TBS-T containing 5% BSA. Results were visualized with enhanced chemiluminescence (ECL) western blotting substrates (Pierce and ThermoScientific SuperSignal West Pico PLUS 34580).

Immunoprecipitation of LRRC58 for Western blot

Flag–LRRC was immunoprecipitated from co-IP buffer with anti-Flag M2 magnetic agarose (Sigma Aldrich) and eluted with 3× Flag peptide (150 ng µl⁻¹, Sigma Aldrich). Western blots were performed as above.

Analysis of CDO1 degradation dynamics

Primary hepatocytes were plated at 2 × 10⁵ cells per ml into 6-well plates. Forty eight hours post-transfection cells were treated with cycloheximide for 0, 2, 4, 6, 8, 12 h at 20 μg ml⁻¹. Abundance of CDO1 was measured by western blotting as described above.

Generation of CDO1 post-translational stability reporter

Full-length CDO1 and CDO1 mutants were cloned into Cilantro 2 (gift from B. Ebert; Addgene plasmid #74450; http://n2t.net/addgene:74450; RRID:Addgene_74450) by Twist Bioscience. Lentivirus was packaged in Lenti-X 293T (Takara Bio) with pSpax2 and VSVG. Viral supernatant was filtered through a 0.45-µm filter. Hep G2 cells (ATCC) were spinfected in the presence of polybrene and selected in puromycin after 24 h. Empty Cilantro 2 vector was used as a negative control. After the indicated interventions, CDO1 post-translational stability was measured via flow cytometry on an LSR Fortessa flow cytometer (BD Biosciences) using the high-throughput sampler. Data were analysed with FlowJo v.10 by taking the ratio of the GFP mean fluorescence intensity to the mCherry mean fluorescence intensity in the mCherry⁺DAPI⁻ population.

Flag–LRRC overexpression vector

Flag-LRRC and Flag-LRRC58^∆256–291 were cloned by Twist Bioscience into pTwist Lenti CMV BSD.

Media

Cystine depletion medium was generated from a base of Dulbecco’s MEM with l-glutamine and glucose without cysteine and methionine (US Biologicals, D9813, additional bicarbonate added at 1.5 g l⁻¹) or Minimum Essential Medium Eagle without cystine, cysteine, glutamine or methionine (MP Biomedicals 091641454). As necessary, L-methionine (Sigma Aldrich M5308), l-glutamine (Sigma Aldrich G7513) and l-cystine dihydrochloride (Sigma Aldrich, C6727) were added to the final concentration used in EMEM (ATCC, 30-2003). Dialysed FBS was added at 10% (v/v)(Gibco) as well as 1% penicillin-streptomycin.

siRNA in Hep G2 cells

Cells were reverse transfected with siRNA targeting LRRC58 or scramble with Lipofectamine RNAimax per the manufacturer protocol and assayed 48 h after transfection (Sigma Aldrich, SASI_Hs02_0032-1028 and SASI_Hs02_0032-1029 for LRRC58 and Mission Universal Negative Control #1 for scramble) using RNAiMAX (Invitrogen). SASI_Hs02_0032-1028: sense strand 5′-CAUUAAGAUUCGAAAUAUU[dT][dT]-3′; antisense strand 5′-AAUAUUUCGAAUCUUAAUG[dT][dT]-3′. SASI_Hs02_0032-1029: sense strand 5′-GAAAUCUGCCUUCUCUGAA[dT][dT]-3′; antisense strand 5′-UUCAGAGAAGGCAGAUUUC[dT][dT]-3′.

Compounds used in reporter assays

Cells were treated with compounds including cycloheximide (Sigma Aldrich, 01810), l-cysteine (Thermo Scientific J63745.22), d-cysteine (Santa Cruz Biotechnology, sc-255054), MLN4924 (Selleck S7109), bortezomib (Selleck S1013), glutathione ethyl ester (Cayman Chemical 14953-50), diamide (Sigma Aldrich, D3648), taurine (Sigma Aldrich, T8691), hypotaurine (Sigma Aldrich H1384), hydrogen peroxide (Sigma Life Science, H1009), cysteine sulfinic acid (Sigma Aldrich, C4418) and dithiothreitol (Thermo Fisher, A39255).

Inhibition of proteasome and autophagy

Primary hepatocytes were plated 2 × 10⁵ cells per ml into 6-well plates, then cells were treated with 10 μM MG132 for 0, 1, 2, 4, 6 or 8 h, or 30 μM chloroquine for 0, 1, 2, 4, 6 or 8 h. Abundance of CDO1 was measured by western blotting as described above.

Inhibition of neddylation

Primary hepatocytes were plated at 2 × 10⁵ cells per ml into 6-well plates and treated with 10 μM MLN4924 for 0, 2, 4, and 6 h. Abundance of CDO1 was measured by Western blotting as described above.

Liver LRRC58 knockdown in vivo

Male mice of C57BL/6J background were purchased from the Jackson Laboratory at 11 weeks of age and allowed 1 additional week for acclimatization feeding standard chow diet. AAVs (serotype 8) that carry short hairpin RNA (shRNA) targeting LRRC58 and AAVs that carry scramble shRNA were prepared by VectorBuilder (VB230509-1233rsh: TTAGCTGCAAGGACCATTAAG and VB010000-0023jze: CCTAAGGTTAAGTCGCCCTCG, respectively). At 12 weeks, a dose of 1 × 10¹¹ genome copies per ml of AAV was administered as a bolus over 20 s by tail vein injection. Both scramble and LRRC58^KD cohorts were chow-fed for two weeks and were then subjected to cholesterol and triglycerides measurements, as well as metabolomics and proteomics. Mice were housed in a temperature-controlled (23 °C) room on a 12-h light-dark cycle. All animal-related experiments were approved by Institutional Animal Care and Use Committee of the Beth Israel Deaconess Medical Center.

Gene expression quantification by qPCR

Total mRNA was extracted from animal tissues and cultured cells using TRIzol (Invitrogen), purified with a PureLink RNA Mini Kit (Invitrogen) and quantified using a Nanodrop 2000 UV–visible spectrophotometer. cDNA was prepared using 2 μg total RNA by reverse transcription–PCR (RT–PCR) using a high-capacity cDNA reverse transcription kit (Applied Biosystems), according to the manufacturer’s instructions. Real-time quantitative PCR (qPCR) was performed on cDNA using SYBR Green probes. qPCR was performed on a 7900 HT Fast Real-Time PCR System (Applied Biosystems) using GoTaq qPCR Master Mix (Promega). Reactions were performed in a 384-well format using an ABI PRISM 7900HT real-time PCR system (Applied Biosystems). Fold changes in expression were calculated by the ΔΔC_t method using mouse RPLP0 as an endogenous control for mRNA expression. All fold changes are expressed normalized to the vehicle control. SYBR primer pair sequences were as follows. RPLP0: forward, 5′-AGATTCGGGATATGCTGTTGGC-3′; reverse,5′-TCGGGTCCTAGACCAGTGTTC-3′; LRRC58:forward, 5′- CGC GCC CTT CAG ACC C -3′: reverse, 5′-AGG TAT AAA CAT TCT AAA CTC CGC A-3′; CDO1: forward, 5′-GGGGACGAAGTCAACGTGG-3′; reverse, 5′-ACCCCAGCACAGAATCATCAG-3′.

Cholesterol measurements

Total cholesterol was measured by Infinity Cholesterol assay (Thermo Scientific, TR13421). Approximately 50 mg liver was homogenized in PBS to a concentration of 200 mg ml⁻¹ liver homogenate. Twenty microlitres of each homogenate was plated in 96-well clear-bottom UV plates (Thermo Scientific, 8404) and incubated at 37 °C in the presence of 20 μl of 0.25% deoxycholate for 5 min. Then, 200 μl of Infinity Cholesterol was added, followed by 15 min incubation at 37 °C. Absorbance was measured at 500 nm using a standard plate reader. Control Serum Wako I (Wako, 466-26201) was used in each assay as a standard for analysis of absorbances and quantification of cholesterol concentrations in Fig. 5d. Lipids were isolated from liver tissue utilizing 2:1 chloroform-methanol extraction. In brief, 20 µl of 2:1 chloroform-methanol per mg of liver tissue was used for homogenization followed by organic phase separation with 20% H₂O (of the total volume) via centrifugation. The organic lipid-containing fraction was extracted, then vacuum dried and redissolved to a final concentration of 30 µl mg⁻¹ in 1× reaction buffer. Total cholesterol was then determined using the Amplex Red Cholesterol Assay Kit (Thermo Fisher, A12216) following the manufacturer instructions and measured using a 96-well fluorescent plate reader (BMG Labtech, CLARIOstar) with excitation of 545 nm and emission at 590 nm in Fig. 5k.

Total triglycerides level measurements

Triglycerides were measured by Infinity Triglycerides assay (Thermo Scientific, TR22421). Approximately 50 mg of liver was homogenized in PBS to a concentration of 50 mg ml⁻¹ liver homogenate. 20 μl of each homogenate was plated in 96-well clear-bottom UV plates (Thermo Scientific, 8404) and incubated at 37 °C in the presence of 20 μl of 1% deoxycholate for 5 min. Then 200 μl of Infinity Triglycerides was added followed by 15 min incubation at 37 °C. Absorbance was measured at 500 nm using a standard plate reader. Control Serum Wako I (Wako, 466-26201) was used in each assay as a standard for analysis of absorbances and quantification of triglyceride concentrations.

Cholesterol measurements in cells

Cholesterol levels were measured using total cholesterol Assay Kit (Colorimetric, STA-384; Cell Biolabs) according to the manufacturer’s instructions. Cholesterol content was normalized to the cell number.

Cystine tracing

Tracing medium was prepared from DMEM powder with l-glutamine, glucose, and without cysteine or methionine (US Biological Life Sciences, D9813). This medium was then supplemented with l-methionine (0.01500 g l⁻¹), sodium bicarbonate (3.7 g l⁻¹), 0.2% heat-shocked BSA and 2% penicillin-streptomycin. Primary hepatocytes were transfected with LRRC58 siRNA (Sigma Aldrich, SASI_Mm02_00347085) using RNAiMAX (Invitrogen) following the protocol supplied by the manufacturer. Primary hepatocytes were plated in 12-well plates and cultured with plating medium (DMEM with 25 mM glucose, 10% FBS, 2 mM sodium pyruvate, 1% penicillin/streptomycin, 1 μM dexamethasone, and 100 nM insulin). The following morning, hepatocytes were incubated with maintenance medium (DMEM with 5 mM glucose, 0.2% BSA, 2 mM sodium pyruvate, 1% penicillin/streptomycin). Forty eight hours after transfection, the medium was changed to tracing medium with 0.03120 g l⁻¹ l-cystine (¹³C₆¹⁵N₂, Cambridge Isotope Laboratories, 1252803-65-8). Cells were incubated in the tracing medium for 30 min. Metabolites from each well of cells were then extracted on ice with 125 μl cold 80 % methanol containing three internal standards: 0.05 ng μl⁻¹ thymine-d₄, 0.05 ng μl⁻¹ inosine-¹⁵N₄ and 0.1 ng μl⁻¹ glycocholate-d₄. Extracts were then subjected to metabolomic analysis as described above.

Cystine tracing in vivo

Male mice on a C57BL/6J background were purchased from the Jackson Laboratory at 11 weeks of age and allowed 1 additional week for acclimatization feeding standard chow diet. AAVs (serotype 8) that carry shRNA targeting LRRC58 and AAVs that carry scramble shRNA were prepared by VectorBuilder (VB230509-1233rsh and VB010000-0023jze, respectively). At 12 weeks, a dose of 1 × 10¹¹ genome copies per ml of AAV was administered as a bolus over 20 s by tail vein injection. Both scramble and LRRC58^KD cohorts were chow-fed for 2 weeks. After injection, 1 mg of l-cystine (¹³C₆¹⁵N₂, Cambridge Isotope Laboratories, 1252803-65-8) was administered as a bolus over 20 s by tail vein injection for 30 min. Samples were subjected to metabolomics, and qPCR. Mice were housed in a temperature-controlled (23 °C) room on a 12-h light-dark cycle. All animal-related experiments were approved by Institutional Animal Care and Use Committee of the Beth Israel Deaconess Medical Center.

Cystine depletion and supplementation in primary hepatocytes

Medium was prepared from DMEM powder with l-glutamine, glucose, and without cysteine or methionine (US Biological Life Sciences- D9813). This medium was then supplemented with l-methionine (0.01500 g l⁻¹), sodium bicarbonate (3.7 g l⁻¹), 0.2% heat-shocked BSA and 2% penicillin-streptomycin. Primary hepatocytes were plated in 12-well plates and cultured with plating medium (DMEM with 25 mM glucose, 10% FBS, 2 mM sodium pyruvate, 1% penicillin/streptomycin, 1 μM dexamethasone, and 100 nM insulin). The following morning, hepatocytes were incubated with maintenance medium (DMEM with 5 mM glucose, 0.2% BSA, 2 mM sodium pyruvate, 1% penicillin/streptomycin). Twenty four hours after plating, the medium was changed to 0.03120 g l⁻¹ l-cystine, no cystine or 0.0624 g l⁻¹ l-cystine (Sigma Aldrich – C6727) overnight. Samples were extracted for proteomics and qPCR as described above.

LRRC58 protein quantification via parallel reaction monitoring (PRM)

Proteins were extracted from frozen cell pellets using 200 μl of extraction buffer (100 mM Tris pH 8.5, 1% SDS, 1× Roche cOmplete Protease Inhibitor Tablet) with 2 μl of DNase 1 (Thermo Scientific) and were extracted at room temperature in a Thermo mixer at 900 rpm. Extracts were centrifuged at 13,000g and the supernatant was assayed for protein concentration using a BCA assay. Following quantification, 50 μg of each sample was reduced with 5 mM DTT for 30 min before alkylating with 20 mM IAM for 20 min. Samples were raised to 190 μl with water before adding 10 μl of 1:1 water-rinsed Sera-Mag carboxylated paramagnetic beads (Cytivia). Proteins were precipitated with 200 μl of 95 proof ethanol (Fisher Scientific) and incubated at room temperature for 5 min. The beads were washed twice with 80% ethanol, before resuspending in 75 μl of 100 mM EPPS (pH 8) with 1 μg of Trypsin (Fisher Scientific). Samples were incubated at 37 °C for 16 h, before clean up using a 50 mg C18 Sep-Pak (Waters), following manufacturer’s instructions.

Samples were reconstituted in 0.1% formic acid in water and three technical replicates of 600 ng were loaded onto Evosep tips per sample, following manufacturer’s instructions. The samples were analysed using an Evosep One: EV-1000 (Evosep) with the Evosep 15SPD method. The 15SPD method has a cycle time of 88 min where sample selection was performed in a randomized order. Detection of the peptides was performed on a timsTOF HT (Bruker) in prm-PASEF mode, with a mass range of 100–1,700 Da, ion mobility range of 0.60–1.6 V s cm⁻², ramp rate of 9.42 Hz, and a ramp time of 100 ms. PRM targets were collected from 73 min to 83 min. The DLTYDPPTLLELAAR peptide from LRRC58 was triggered at an m/z of 844.4487 (+2 charge state) at an isolation width of 3 Da, between 0.90 V s cm⁻² and 1.3 V s cm⁻². Following acquisition, data were analysed in Compass Data Analysis (Bruker) using chromatogram integration at 844.4487 and 1.0995 V s cm⁻².

Constructs for biochemistry

For TR-FRET analysis: CDO1, LRRC58 and eGFP-LRRC58 were cloned into pAC-derived vectors⁷⁶. CDO1 was expressed with an N-terminal StrepII–Avi tag, while LRRC58 and eGFP-LRRC58 with an N-terminal Flag tag. For SEC and in vitro ubiquitylation assays: LRRC58, CUL5, ELOB, ELOC and RBX2 were cloned into pAC-derived vectors. LRRC58 was expressed with an N-terminal StrepII tag, while RBX2 with an N-terminal Flag tag. CUL5, ELOB and ELOC were untagged. CDO1 was cloned into a pNIC-Bio2 derived vector with an N-terminal 6× His tag.

Protein expression and purification

Baculovirus for protein expression of pAC-derived constructs were generated by transfection of expression plasmids into Sf9 cells at a density of 0.9 × 10⁶ cells per ml in ESF 921 medium (Expression Systems). Viral titre was increased by three rounds of infection in Sf9 cells. Strep–LRRC58, CUL5, ELOB, ELOC and Flag–RBX2 were co-expressed, while Strep-Avi-CDO1 was expressed separately. High Five cells were infected at a density of 1.8–2.0 × 10⁶ cells per ml in SF-4 Baculo Express ICM medium (BioConcept) and collected 40–48 h post-infection by centrifugation (1,500 rpm, 20 min). The collected cells were resuspended in lysis buffer (50 mM HEPES/NaOH pH 7.4, 200 mM NaCl, 0.1% (v/v) Triton X-100, 1 mM TCEP) supplemented with protease inhibitors and lysed by sonication, and the supernatant was treated with benzonase for 10 min. The cell lysate was clarified by ultracentrifugation (40,000 rpm, 1 h) and filtered. Filtered lysate was applied to Strep-Tactin XT 4Flow resin (IBA) equilibrated in lysis buffer. The resin was washed with wash buffer (50 mM HEPES/NaOH pH 7.4, 500 mM NaCl, 1 mM TCEP) and bound protein was eluted with elution buffer (50 mM HEPES/NaOH pH 7.4, 200 mM NaCl, 50 mM biotin, 1 mM TCEP). Further cleanup was done with anion exchange chromatography (buffer A: 50 mM HEPES/NaOH pH 7.4, 1 mM TCEP, buffer B: 50 mM HEPES/NaOH pH 7.4, 750 mM NaCl, 1 mM TCEP) followed by SEC (SEC buffer: 50 mM HEPES/NaOH pH 7.4, 150 mM NaCl, 1 mM TCEP). Flag–eGFP–LRRC58 and Flag–LRRC58 were each co-expressed with ELOB and ELOC in High Five cells and purified as described above with lysis and wash buffer instead containing 200 mM NaCl. Lysate was then applied to anti-Flag resin (Genscript) and eluted (50 mM HEPES/NaOH pH 7.4, 200 mM NaCl, 150 μg ml⁻¹ Flag peptide) prior to ion exchange and SEC.

His–CDO1 was expressed in a LOBSTR E.coli expression strain (Kerafast). After transformation, 2 l cultures in TB medium were grown at 37 °C to an OD₆₀₀ of ~0.6 and induced with 0.35 mM isopropyl β-d-1-thiogalactopyranoside (IPTG). Temperature was decreased to 18 °C, proteins were expressed overnight, and cultures were collected by centrifugation (3,300g, 20 min). Cell pellets were resuspended in His lysis buffer (50 mM HEPES/NaOH pH 7.4, 200 mM NaCl, 20 mM imidazole, 5% (v/v) glycerol, 1 mM TCEP) supplemented with protease inhibitors and lysed using sonication. After clearance by ultracentrifugation (39,000 rpm, 1 h), the supernatant was treated with benzonase for 10 min and incubated with high affinity Ni-charged resin (Genscript). Bound proteins were eluted with increasing imidazole concentrations (150–750 mM) and elution fractions were cleaned up by ion exchange chromatography (buffer A: 50 mM HEPES/NaOH pH 7.4, 2 mM TCEP, buffer B: 50 mM HEPES/NaOH pH 7.4, 750 mM NaCl, 2 mM TCEP) followed by SEC in ENL SEC buffer (30 mM HEPES/NaOH pH 7.4, 150 mM NaCl, 3 mM TCEP).

Size-exclusion chromatography

Purified CRL5–LRRC58 (1.71 μM) was mixed with 2.57 μM CDO1 (a 1:1.5 molar ratio) in a final volume of 500 μl in SEC buffer (50 mM HEPES/NaOH pH 7.4, 150 mM NaCl, 1 mM TCEP). The mixture was incubated on ice for 1 h and then separated by SEC. CRL5–LRRC58 alone was separated by SEC to assess differences in co-elution of CRL5–LRRC58 with CDO1.

TR-FRET assay

StrepII–Avi–CDO1 was biotinylated directly from elution fractions after affinity purification (which already contained 50 mM biotin) overnight in a 2 ml reaction composed of 5 mg of the eluted CDO1, 40 mM ATP, 10 MgCl₂, and 300 μg BirA. Biotinylated CDO1 was then purified by SEC (SEC buffer: 50 mM HEPES/NaOH pH 7.4, 150 mM NaCl, 1 mM TCEP). To measure the K_d value of complex formation, 200 nM biotinylated CDO1 and 2 nM of Tb-coupled streptavidin were mixed in an assay buffer with 25 mM HEPES, 150 mM NaCl, 1 mM fresh TCEP, 0.5% BSA, and 0.05% Tween-20 (mixture 1). Serial dilutions of eGFP–LRRC58–ELOB–ELOC at a final concentration ranging from 9.8 nM to 20 μM were prepared in the same buffer (mixture 2). A total of 7.5 μl each of mixtures 1 and 2 were added to each well in a 384-well microplate and incubated for 1 h at room temperature and read on a Pherastar FS (BMG) plate reader with a 337/490/520 nm filter set. Background fluorescence due to eGFP–LRRC58–ELOB–ELOC was calculated by repeating the assay without CDO1 and subtracting the signal for the corresponding eGFP–LRRC58–ELOB–ELOC concentration. The ratio of emissions at 520/490 from five cycles were averaged as the final TR-FRET reading. K_d values were calculated using one-site specific binding and plotted in GraphPad Prism 10.

Displacement assay

To confirm displacement of eGFP–LRRC58–ELOB–ELOC with unlabelled LRRC58–ELOB–ELOC, a final concentration of 200 nM biotinylated CDO1 and 2 nM of Tb-coupled streptavidin were mixed with 7 μM eGFP-LRRC58 in the assay buffer described above (mixture 3). Serial dilutions of unlabelled LRRC58–ELOB–ELOC at a final concentration ranging from 9.8 nM to 20 μM were prepared in the same buffer (mixture 4). A total of 7.5 μl each of mixtures 3 and 4 were added to each well in a 384-well microplate and incubated for 1 h at room temperature. Background fluorescence due to eGFP–LRRC58–ELOB–ELOC was calculated by repeating the assay without CDO1 and subtracting the signal for the corresponding eGFP–LRRC58–ELOB–ELOC concentration. The ratio of emissions at 520/490 from 5 cycles were averaged as the final TR-FRET reading. Half-maximal inhibitory concentration (IC₅₀) values were calculated using an inhibitor versus response variable slope (four-parameter) model and plotted in GraphPad Prism 10.

Ubiquitylation assay

To test ubiquitylation of CDO1 by CRL5–LRRC58, a 30 μl mixture of 1 μM CDO1, 0.5 μM CRL5–LRRC58, 0.2 μM of the E1 enzyme UBA1, 2 μM of the E2 enzyme UBE2D3, 5 mM ATP, 5 mM MgCl₂, 25 mM HEPES/NaOH pH 7.5, 100 mM NaCl, 1 mM TCEP, 60 μM ubiquitin, and dH₂O to bring the assay volume to 30 μl was prepared. A 6.5 μl aliquot of the reaction was taken and added to 13 μl of gel loading buffer as a time 0 timepoint prior to adding the ubiquitin to start the reaction. The reaction was incubated at 37 °C, and at time 2, 5, and 10 min, 7 μl of the reaction was taken and quenched in 13 μl gel loading buffer. Six microlitres from each timepoint was run on a 4–20% gel at 240 V for 22 min. For the control experiment, one component of the mixture was removed at a time and replaced with buffer (25 mM HEPES/NaOH pH 7.5, 150 mM NaCl, 1 mM TCEP) for a total of 6 reactions in a reaction volume of 10 μl, and reactions were quenched with 10 μl of gel loading buffer. For western blot analysis proteins were transferred onto PVDF membranes using an iBlot 2 dry blotting system (Thermo Fisher Scientific). A specific primary antibody was used to detect CDO1 (1:1,000 dilution anti-CDO1, Life Technologies 12589-1-AP). Blots were imaged using a LI-COR Odyssey CLx detecting an anti-rabbit secondary antibody (1:4,000 dilution Anti-Rabbit IgG, LI-COR, 92632211).

Data analysis

All data analyses were performed using R 4.2.0 or Python unless otherwise noted. (1) Assessment of TMT plex effect. For each tissue type, 13 out of 163 samples were randomly selected and measured three times. Two times measured in the same TMT plex, and one time measured across 11 different TMT plexes. Principal component analysis (PCA) was used to examine clustering of samples on the basis of sample ID and measurement batch. (2) Assessment of run-to-run variation of metabolomics. Metabolomics samples were not multiplexed, therefore hundreds of LC–MS runs were required to complete metabolomic analysis of the DO samples. In order to assess the effect of the long running sequence on quantification of metabolites, four metabolic pool samples (representing the average abundance of metabolites across the entire DO cohort, see metabolomics sample preparation section above), were designated as ‘mock’ samples. These samples were treated as biological samples and spiked into the running sequence every ~50 runs. Since these mock samples were the same as pools, the theoretical sample-to-pool value of every metabolite in these samples should be 1. Batch effects were assessed by the deviation between the measured median values and 1. (3) Assessment of protein contamination. Proteins measured in DO samples were visualized in a PCA plot, and PC1 loading was extracted. Proteins with the top-2.5% PC1 rotation that deviated from the majority of the protein population were examined for tissue contamination using a well-established tissue-specific protein expression dataset⁶⁰. Tissue-specific proteins in this dataset were identified using SILAC heavy-to-light ratio comparing the abundance of a protein in one tissue to its average abundance across all tissues. A total of 104 proteins that have muscle-specific expression, and with low expression in BAT were identified among proteins with top-2.5% PC1 loading as described above. Due to the nature of PCA analysis not allowing missing values, these proteins, along with proteins that had missing values and were highly correlated with these 104 proteins (Pearson’s r > 0.9) were removed from the BAT dataset, prior to downstream bioinformatic analysis. A total of 250 proteins were removed. (4) Analysis of protein abundance. Quantified proteins were mapped onto PaxDB⁷⁷, a database for absolute protein abundance. Proteins were stratified based on expression levels represented in part per million (ppm) in the proteome. In each category, proteins quantified in this work were compared with those measured in the literature^12,21,22,78. (5) Analysis of proteomic and metabolomic variation. Coefficient of variation (CV) was used as the measurement of proteomic and metabolomic variation, which is calculated by population standard deviation divided by population mean. CV of proteins and metabolites measured in this work were compared with previous works with deep omics coverage^27,29,63. Quantification data of proteins and metabolites in this work were presented as log₂ transformed values, and these values were transformed back before CV analysis. (6) Metabolite annotation and ancestry analysis. Every metabolite was first manually mapped on Chemical Entities of Biological Interest (ChEBI, https://www.ebi.ac.uk/chebi/), a database containing curated identifiers for chemical identities including metabolites. ChEBI IDs are used by major databases for metabolite identities, biochemical reactions and biological pathways. For every metabolite measured in MPCA, a list of ChEBI IDs were used to annotate the metabolite: the ChEBI IDs of the metabolite itself; the ChEBI IDs of its conjugate acids and/or bases; the ChEBI IDs of its salt adducts; the ChEBI IDs of the metabolite with different charge states; and ChEBI IDs of chemicals that have the same chemical formula and structure but named differently. Both primary and secondary ChEBI IDs were extracted. These IDs were then used for ancestry mapping using the ancestry mapping table provided by ChEBI. For instance, through ancestry mapping, ‘succinate’ was mapped onto ‘succinate^2-’ (CHEBI:30031) then onto ‘dicarboxylic acid dianion’ (CHEBI:28965). This was to prepare for mapping MPCA protein–metabolite edges onto edges from established biochemical reactions (Rhea), pathways (Reactome) and transporters (TCDB), details described below. Without ancestry analysis, established edges between proteins and metabolites that are not at the bottom of the ChEBI hierarchy would become false negatives of the analysis. An example is RHEA-15421: a long-chain fatty acid + ATP + CoA = a long-chain fatty acid acyl CoA + AMP + diphosphate, enzyme: long-chain-fatty acid-CoA ligase 5 (ACSL5). Myristic acid is a long-chain fatty acid and a substrate to this reaction. The ‘ACSL5-a long-chain fatty acid’ edge was recapitulated by the ‘ACSL5-myristic acid’ edge in MPCA, through first mapping myristic acid (CHEBI:30807) to its ancestor long-chain fatty acid anion (CHEBI:57560). The top six levels of ChEBI hierarchy contain IDs with broad child chemical entities, and therefore were removed from ancestry mapping to prevent false positives from ambiguous edge mapping. (7) Mapping MPCA edges onto established biochemical reactions, biological pathways and transporters of metabolites. Established protein–metabolite relationships were downloaded from Rhea³² (https://www.rhea-db.org), Reactome⁷⁹ (https://reactome.org) and TCDB³³ (https://tcdb.org). Only edges with known evidence in mouse and human were used for this analysis. One node of the edge must be a protein, and the other node of the edge must be a metabolite. For every biochemical reaction, edges were generated between enzymes and substrates/products in this reaction. For pathways, direct protein–metabolite physical interactions and upstream and downstream regulation were used to establish edges. For transporters, edges were generated by connecting every protein transporter to all metabolites it transports. For each analysis, the list of edges was filtered to only contain proteins and metabolites measured in MPCA, ruling out edges not recapitulated due to measurement biases. For each tissue, a Fisher’s exact test was performed to examine the statistical significance of MPCA correlations recapitulating established protein–metabolite functional edges that reflect biochemical reactions, biological pathways, and transporters of metabolites. (8) Visualization of protein–metabolite networks. Protein–metabolite edges were visualized with Cytoscape v.3.9.1⁸⁰. Every node was either a protein or a metabolite, and edges were only between a protein and a metabolite. To simplify the network, any protein–metabolite edge shared across multiple biochemical reactions were only counted once during network visualization. Edges from both BAT and liver were compiled in the same network. (9) Analysis of features of MPCA-derived significant protein–metabolite correlations. CVs for each metabolite were calculated using raw sample-to-bridge ratio without log₂ transformation. These CVs were then correlated with the number of significant correlations for each metabolite. Number of total Rhea edges were correlated with the number of significant Rhea edges (Rhea edges that have been recapitulated by significant MPCA correlations) for each metabolite. Enrichment of protein population/classes among significantly correlating protein–metabolite pairs were performed using one-sided (right-sided) Fisher’s exact test. Protein populations were obtained from these databases: mitochondrial proteins: MitoCarta 3.0, mouse⁸¹; metabolic enzymes: mammalian metabolic enzymes database⁸²; and kinases, Kinbase.com⁸³. Gene Ontology (GO) term enrichment for protein correlates of each individual metabolite was performed using one-sided (right-sided) Fisher’s exact tests. GO: Biological Process and GO: Cellular Component datasets were obtained from https://www.ebi.ac.uk/QuickGO/. (10) Linking accessory members to established metabolic pathways. Significant pairwise protein–protein, protein–metabolite and metabolite-metabolite correlations were used in this analysis. The 1,411 Reactome lowest level mouse pathways were downloaded, and the size of each pathway ranged from 1 to 532 members. A Fisher’s exact test, adapted from our previous works^10,84, was set up to calculate statistical enrichment of potential accessory proteins and metabolites associated with these established pathways. For a given established pathway, we tested its first-degree neighboring members for significant association with the pathway. For each individual test, we first counted the number of edges that linked the candidate member to the established pathway; then counted the number of edges the established pathway had to other proteins that were not the candidate member; then counted the number of edges the candidate member had to other members that were not a part of the established pathway; lastly counted edges that did not involve the established pathway nor the candidate member. These four numbers were used to set up the Fisher’s exact test. This test was looped through all established pathways. The resulting P values were subjected to multiple testing correction, separately in BAT and liver, using the Benjamini–Hochberg procedure⁸⁵, and any association with an adjusted P value < 0.05 was considered significant. (11) Enrichment analysis of MPCA pairwise correlations. A protein–metabolite interaction network was constructed with nodes representing proteins and metabolites and edges denoting experimentally supported physical interactions. Protein–metabolite interactions were obtained from Rhea and TCDB, while protein–protein interactions were obtained from CORUM and BioPlex. For each protein–metabolite pair measured in MPCA, we computed the shortest path between the protein and metabolite within this combined network, the number of edges traversed in the path representing the hop distance between the species. Pairs with hop distance of one correspond to known physical interactions. Pairs separated by two or more hops were considered indirect functional associations. To assess the extent to which MPCA correlations are enriched for known interactions at each functional distance, we mapped MPCA protein–metabolite pairs to the reference network using ChEBI ancestry mapping, as described above, to account for cases where MPCA identifies chemically distinct species interacting with a protein. These mapped associations were then ranked by statistical significance (P value). For a given threshold k, we computed the fraction of the top-k MPCA associations that were found in the reference network at each hop distance. Enrichment was calculated by dividing the observed number of MPCA-discovered edges at a given distance in the top k by the number expected under random selection, given the total number of evaluable protein–metabolite pairs at that distance. This analysis was repeated across a range of k values to generate enrichment curves for each hop distance. The same analysis was also performed to evaluate the fold enrichment of LASSO associations that reached global statistical significance, FDR q < 0.05. (12) Machine learning to discover protein regulators of metabolites. Linear regression using the LASSO method³⁴ was done using the R package glmnet. Relative protein abundance (log₂ sample/bridge ratio) was used as the matrix to predict metabolite abundance (log₂ sample/pool ratio). Data were not further scaled prior to analysis. A seed number of 100,000,000 was set to ensure reproducibility of the analysis. For all observations, the weight was set to 1. Data points were split to 90% and 10%, where for each iteration, 90% of the data were used for modelling, and 10% were used for validation. Cross-validation was performed using squared-error. Five values—0, 0.25, 0.5, 0.75 and 1—were set for mixing the relaxed fit with the regularized fit. To prevent overfitting, proteins with none-zero coefficients obtained at one standard error away from the minimum cross-validation error (lamda.1se, which gives the most regularized model) were used as the output. In addition, predicted metabolite abundance was correlated to the measured abundance to assess the modelling. (13) Determination of FDR across all LASSO predictions. To estimate statistical significance of LASSO-derived protein–metabolite associations, we performed ordinary least squares (OLS) regression for each protein using the metabolites with non-zero coefficients from that protein’s LASSO model. A two-sided t-test was then used to obtain P values. Resulting P values were subjected to Benjamini–Hochberg correction to control global FDR q < 0.05 was determined significant. (14) Analysis of extreme outliers. For each tissue, we rank-ordered the values of LASSO coefficients for all protein–metabolite LASSO predictions. Extreme outliers were determined based on IQR. Values lying beyond 3 times the IQR below Q1 or above Q3 were determined to be ‘extreme’ outliers (Q1 − 3 × IQR and Q3 + 3 × IQR). (15) Assigning validation scores to metabolites. Metabolites in each tissue were ranked based on the number of LASSO predictors with literature evidence in Rhea, Reactome and TCDB. To allow for comparison between tissues, a score ranged from 1–10 was given to each metabolite in each tissue by linearly scaling the number of its LASSO associations with literature evidence. The metabolite with the highest number of established LASSO edges was scored 10, while metabolites with non-zero predictors but no edge with literature evidence were scored 1. For every metabolite, its validation score in BAT and validation score in liver were then summed up for an overall validation score. (16) Functional annotation of LASSO predictions. LASSO hits for each metabolite were mapped onto CORUM and BioPlex to examine known protein interactors. If a newfound LASSO protein predictor of a metabolite physically interacts with a protein known to regulate this metabolite via a local metabolic network based on Rhea, TCDB and Reactome, then this LASSO hit was annotated as potentially regulating the metabolite through the known network. For instance, LRRC58 is found to physically interact with CDO1 based on BioPlex, and CDO1 and hypotaurine are known to be involved in R-MMU-1614558 and R-HSA-1614558, Degradation of cysteine and homocysteine (the hypotaurine-taurine pathway). Therefore, the newfound LRRC58-Hypotaurine edge is annotated as ‘May act through CDO1 and R-HSA-1614558 (Degradation of cysteine and homocysteine);R-MMU-1614558 (Degradation of cysteine and homocysteine)’. In addition, we annotated LASSO protein predictors of metabolites based on whether the corresponding protein is a known metabolic enzyme⁸², transporter based on TCDB³³, or associated with mitochondrial function⁸¹. (17) ROC and precision-recall analysis. All correlations between proteins and metabolites in both liver and BAT were combined, and all non-zero LASSO coefficients identified in both tissues were merged to generate ROC and precision-recall curves. Threshold values for the curves were based on adjusted P values from pairwise correlation analysis, and FDR values from LASSO analysis. Analyses were done in Python. The ROC curves and area under the curve (AUC) were computed using the roc_curve and auc functions from scikit-learn, while the precision-recall curves and average precision were computed using precision_recall_curve and average_precision_score from the same library. The plots were visualized using Matplotlib and Seaborn. (18) Re-analysis of BioPlex AP–MS data. Unfiltered HEK 293T data were downloaded from https://BioPlex.hms.harvard.edu/interactions.php. Experiments with CDO1 and CUL5 as the bait proteins were extracted. CompPASS plus score and NWD score were calculated based on protein spectral counts in a given experiment and the frequency of this protein to appear in different AP–MS experiments, as detailed in BioPlex^36,84. (19) AlphaFold-Multimer modelling. Three-dimensional structures of protein complexes were predicted using the ColabFold⁸⁶ implementation of the AlphaFold-Multimer algorithm⁸⁷. In each case, proteins were represented by their canonical sequences as reported by Uniprot. Five models were returned for each AlphaFold analysis, and the top model was selected based on ipTM Score. Each model was submitted multiple times, varying the order in which the constituent proteins were listed, and the highest-scoring model was retained. (20) Co-evolution analysis. This analysis was performed using CladeOScope (https://tabachlab.shinyapps.io/CladeOScope/), a web application for co-evolution analysis based on clades⁵², which represent unbroken lines of evolutionary descent. This analysis is typically used to uncover functional interactions between genes and proteins that have co-operative functions. The CladeOScope score of genes that co-evolved with Lrrc58 or Cdo1 were downloaded and −log₁₀-transformed for plotting. A −log₁₀-transformed CladeOScope score of 0 represents the top co-evolving gene partner. (21) Analysis of enrichment of biological processes and disease networks in DO mice stratified by hepatic LRRC58 protein abundance. Mice with liver LRRC58 abundance in the top-10% of our DO cohort were compared with those in the bottom 10%. Proteins with a fold change between these populations larger than 2 or smaller than 0.5 with two-tailed t-test P value smaller than 0.05 were defined as significantly upregulated or down regulated. Enriched biological processes were determined using the GOrilla GO analysis tool⁸⁸. Significantly upregulated or downregulated proteins were used as the foreground, and all measured proteins in the liver were used as the background. Enrichment of disease networks were analysed based on protein involvement in disease networks downloaded from DisGeNET⁸⁹. We filtered the list to only include liver diseases and network contained at least 3 members. For each disease network, enrichment was examined using Fisher’s exact test with upregulated or downregulated proteins as the foreground, and all measured proteins in the liver as the background and only enrichment but not depletion was considered. The resulting P values from all tests were then subjected to multiple test correction using the Benjamini–Hochberg procedure⁸⁵, and any association with an adjusted P value < 0.05 was considered significant. (22) Analysis of SNPs of Lrrc58 and Cdo1. SNPs were found by comparing across all eight founder strains of the DO cohort, and C57BL/6J was set as the reference genome⁶¹. Based on the loci of the SNPs, potential consequences were assigned, such as upstream gene variant, missense variant or intron variant. SNPs with high confidence were included in this analysis, and data were downloaded from the Jackson Laboratory (https://churchilllab.jax.org/foundersnps/search#).

MPCA web application

The MPCA web application (https://mpca-chouchani-lab.dfci.harvard.edu/) was hosted on the Shiny R server. The application was written and developed with the Shiny R package, and data visualizations were made possible with the following packages: shiny, tidyverse, ggpubr, visNetwork, png, dqshiny, DT, gsubfn, shinyjs, glue, shinydashboard and plotly.

Statistical analyses

Quantification and statistical analysis pipelines are described in the sections above. Data analysis was performed in Excel, R, and GraphPad Prism. Data were expressed as mean ± s.e.m. unless otherwise noted, and P values were calculated using two-tailed Student’s t-test for pairwise comparison of variables. For enrichment analysis, Fisher’s exact tests were used. For correlation and enrichment analysis, P value was adjusted using the Benjamini–Hochberg procedure⁸⁵, and adjusted P < 0.05 was considered significant. Sample sizes were determined based on prior reports using similar methods^10,12. Unless otherwise stated, all stated replicates are biological replicates. For cell experiments, each biological replicate was originated from a shared genetically validated stock, independently plated, cultured for at least 48 h, and independently replated prior to the experiment. For MS analyses, sample order was randomized.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.