The ribosome lowers the entropic penalty of protein folding

August 7, 2024


Protein expression and purification

DNA constructs of FLN5 were previously described^7,11. Coding sequences for titin I27 and HRAS were introduced into the pLDC-17 vector using standard procedures. Further mutations were introduced using site-directed mutagenesis; for ¹⁹F labelling, amber stop codons were introduced⁶ at position 32 in HRAS, and at residue 14, together with an additional K87H point mutation, in I27. FLN5 variants were expressed as His-tagged proteins and isotopically labelled in Escherichia coli BL21 DE3-Gold cells as previously described^6,7; an identical protocol was used to produce purified samples of I27 and HRAS. RNC constructs comprised an arrest-enhanced variant of the SecM stalling sequence, FSTPVWIWWWPRIRGPP, as previously described⁶. Purification of isolated FLN5 A₃A₃ was performed by affinity chromatography followed by size-exclusion chromatography in the presence of 6 M urea prior to buffer exchange into Tico buffer (10 mM HEPES, 30 mM NH₄Cl, 12 mM MgCl₂, 1 mM EDTA). The full protein sequence of FLN5 A₃A₃ is deposited together with its chemical shift assignment on the BMRB (entry 51023). For the RDC, pulse-field gradient NMR (PFG-NMR) and PRE-NMR experiments, the additional mutation C747V (referred to as FLN5 A₃A₃ V747) was introduced to yield a cysteine-less construct for site-specific spin labelling. The protein concentration was determined using the BCA assay according to the manufacturer’s instructions. RNCs were expressed, isotopically labelled uniformly with ¹⁵N or site-specifically with ¹⁹F, and purified as previously described^6,7. For samples for intermolecular PRE-NMR experiments involving ribosome labelling, we generated modified E. coli BL21 strains with cysteine mutations in uL23 and uL24 using CRISPR as previously described². RNC samples were prepared in Tico buffer for experiments. Western blot analyses were undertaken with an anti-hexahistidine horseradish peroxidase-linked antibody (Invitrogen, 1:5,000 dilution).

Fluorescent and PEG-maleimide labelling of 70S and RNC samples

Ribosomes and RNCs were first reduced using 2 mM TCEP overnight at 277 K, then buffer exchanged into labelling buffer. For fluorescein-5-maleimide and PEG-maleimide, labelling was performed in Tico at pH 7.5. ABD-MTS labelling was performed in labelling buffer (50 mM HEPES, 12 mM MgCl₂, 20 mM NH₄Cl, 1 mM EDTA, pH 8.0). Samples were labelled using a 10× molar excess of ABD-MTS or fluorescein-5-maleimide. Cysteine mass-tagging by PEGylation was performed as previously described with a 10,000-fold molar excess of PEG over sample⁷. ABD-MTS and PEGylation reactions were analysed using 12% Bis-Tris SDS–PAGE gels⁶¹. The fluorescein-labelled reactions were run on a 20% Tricine SDS–PAGE gel, modified from ref. ⁶².

NMR spectroscopy

All NMR experiments were recorded with Topspin 3.5pl2. NMR experiments of FLN5 A₃A₃ were performed in Tico buffer at pH 7.5 and 283 K. Chemical shifts were previously assigned⁷ and obtained from data recorded on Bruker Avance III spectrometers operating at 700 and 800 MHz, equipped with TCI cryoprobes. All samples contained 10% (v/v) D₂O and 0.001% (w/v) DSS as a reference. Data were processed and analysed using NMRPipe⁶³ (v11.7), CCPN⁶⁴ (v2.4) and MATLAB (R2017a, MathWorks).

Amide ¹H and ¹⁵N chemical shifts were obtained from two-dimensional ¹H–¹⁵N SOFAST-HMQC experiments⁶⁵ using an acquisition time of 50 ms in the direct dimension. The inter-scan delay was 50 ms. Cα chemical shifts were obtained from 3D BEST-HNCA experiments recorded at 800 MHz with acquisition times of ~50 ms and inter-scan delays of 150 ms. C′ chemical shifts were obtained from BEST HNCO experiments recorded at 700 MHz using acquisition times of ~50 ms and inter-scan delays of 200 ms. RNC samples were doped with 20 mM NiDO2A (Ni(II) 1,4,7,10-tetraazacyclododecane-1,7-bis(acetic acid)) to enhance sensitivity⁶⁶. Cosine-squared window functions were used in processing the spectra.

For PRE-NMR experiments, we used a cysteine-less construct with the C747V mutation and introduced six and eight labelling sites in the isolated and ribosome-bound protein, respectively. Samples were reduced overnight at 277 K in Tico supplemented with 2 mM TCEP. TCEP was then removed by buffer exchange into labelling buffer (50 mM HEPES, 12 mM MgCl₂, 20 mM NH₄Cl, 1 mM EDTA, pH 8.0) and the samples were subsequently labelled overnight at 277 K with a 10× molar excess of MTSL. Following labelling, excess MTSL was removed by buffer exchanging the sample back into Tico buffer for NMR. The same labelling protocol was used for isolated protein and RNC samples. To measure the PREs, we recorded the signal intensities with MTSL in the paramagnetic and diamagnetic states. Direct measurements of relaxation rates proved not feasible for RNC samples due to sensitivity limitations. 2D ¹H–¹⁵N SOFAST-HMQC experiments⁶⁵ were recorded at 800 MHz and 283 K using ~100 μM of protein or ~10 μM of RNC. Experiments were recorded with acquisition times of 100 ms and 35 ms in the direct and indirect dimensions, respectively. The inter-scan delay was 450 ms to allow for complete relaxation. To acquire the diamagnetic data, the sample was reduced with 2.5 mM (RNC) or 100× molar excess (isolated) sodium ascorbate. Following complete reduction, the same HMQC experiment was recorded. To extract the PREs, spectral peaks were first fitted to a Lorentzian shape in both the direct and indirect dimensions using NMRPipe⁶³. Errors were obtained from the spectral noise (RMSE). From the fitted peaks, intensity ratios of I_para/I_dia were calculated and converted to PRE rates for Bayesian ensemble reweighting by numerically solving equation S34 (see Supplementary Notes 3–4) for Γ₂. Sample integrity was monitored using interleaved ¹H,¹⁵N SORDID diffusion measurements as previously described⁷.

PFG-NMR experiments were used to measure the diffusion coefficients and the hydrodynamic radii (R_h) of FLN5 variants. 1D ¹H,¹⁵N-XSTE diffusion measurements were recorded at 700 MHz (FLN5, FLN5 Δ6) and 800 MHz (FLN5 A₃A₃). Eight to sixteen gradient strengths ranging linearly from 5% to 95% of the maximum gradient strength of 0.556 T m^−1 were used. Diffusion coefficients were obtained by fitting the signal intensity at each gradient strength to the Stejskal–Tanner equation⁶⁷ and were converted to R_h using the Stokes–Einstein equation.
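The two-step conversion described above (signal decay to diffusion coefficient, then diffusion coefficient to R_h) can be sketched as follows. This is a minimal illustration rather than the analysis code used for the paper: it assumes the simplest rectangular-pulse form of the Stejskal–Tanner equation, I = I₀ exp(−D γ² g² δ² (Δ − δ/3)), and the pulse timings, the diffusion coefficient and the viscosity of water at 283 K (about 1.307 mPa s) are hypothetical inputs chosen only to make the example run.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def fit_diffusion_coefficient(gradients, intensities, gamma, delta, big_delta):
    """Least-squares fit of ln(I) against the Stejskal-Tanner b-factor.

    Assumes rectangular gradient pulses: I = I0 * exp(-D * b) with
    b = (gamma * g * delta)**2 * (big_delta - delta / 3).
    """
    b = [(gamma * g * delta) ** 2 * (big_delta - delta / 3.0) for g in gradients]
    y = [math.log(i) for i in intensities]
    mean_b = sum(b) / len(b)
    mean_y = sum(y) / len(y)
    sxx = sum((x - mean_b) ** 2 for x in b)
    sxy = sum((x - mean_b) * (v - mean_y) for x, v in zip(b, y))
    return -sxy / sxx  # the slope of ln(I) versus b equals -D


def hydrodynamic_radius(diffusion, temperature, viscosity):
    """Stokes-Einstein relation, R_h = k_B T / (6 pi eta D), in SI units."""
    return K_B * temperature / (6.0 * math.pi * viscosity * diffusion)


# Hypothetical acquisition parameters and a synthetic, noise-free decay.
GAMMA_1H = 2.6752218744e8        # proton gyromagnetic ratio, rad s^-1 T^-1
DELTA, BIG_DELTA = 4.0e-3, 0.1   # gradient length and diffusion delay in s (illustrative)
G_MAX = 0.556                    # maximum gradient strength in T m^-1 (from the text)
gradients = [G_MAX * (0.05 + 0.90 * k / 15) for k in range(16)]  # 5% to 95%, linear
true_d = 1.0e-10                 # m^2 s^-1 (illustrative)
decay = [
    math.exp(-true_d * (GAMMA_1H * g * DELTA) ** 2 * (BIG_DELTA - DELTA / 3.0))
    for g in gradients
]

d_fit = fit_diffusion_coefficient(gradients, decay, GAMMA_1H, DELTA, BIG_DELTA)
r_h = hydrodynamic_radius(d_fit, temperature=283.0, viscosity=1.307e-3)
print(f"D = {d_fit:.3e} m^2/s, R_h = {r_h * 1e9:.2f} nm")
```

A log-linear fit is used here only because it has a closed form; with noisy data a weighted nonlinear fit of the exponential is usually preferable, and shaped gradient pulses require a modified b-factor.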

RDCs for isolated FLN5 A₃A₃ were measured in Tico buffer at 283 K and pH 7.5 in a PEG/octanol mixture⁶⁸. RDCs are reported as the isotropic splitting subtracted from the aligned splitting, corrected for the negative gyromagnetic ratio of ¹⁵N. RDCs were measured by preparing a solution containing 4.6% (w/w) pentaethylene glycol monooctyl ether (C₈E₅), 1-octanol (molar ratio 1-octanol:C₈E₅ = 0.94) and 110 μM of protein. Alignment was confirmed by measuring the D₂O deuterium splitting at 283 K (17.6 Hz). All RDC NMR experiments were acquired on a Bruker Avance III HD 800 MHz spectrometer equipped with a TCI cryoprobe. A set of four different RDCs (¹D_NH, ¹D_CαCO, ¹D_CαHα and ²D_HNCO) was measured per sample (isotropic and anisotropic) using the 3D BEST HNCO (JCOH and JCC) or BEST-HNCOCA (JCAHA) experiments^69,70,71. The one-bond ¹H–¹⁵N coupling was determined by recording two ¹⁵N-HSQC sub-spectra, in-phase (IP) and anti-phase (AP). For the measurement of the ¹H–¹³CO coupling constants, a BEST HNCO-JCOH experiment was used with an introduced DIPSAP filter. Such a J-mismatch-compensated DIPSAP spin-state filter offers an attractive approach for accurate measurement of small spin–spin coupling constants⁷². For this, three separate experiments were recorded with different filter lengths (2τ = 1/J) for each of the anisotropic and isotropic media, and the sub-spectra associated with the separated spin states (two in-phase and one anti-phase) were combined using the linear relation k(IP) + (k − 1)(IP) ± (AP) with k = 0.73, the theoretically optimized scaling factor. The spectra were recorded with 144 × 104 × 1,536 complex points in the ¹³C (t₁)/¹⁵N (t₂)/¹H (t₃) dimensions, respectively, and with the spectral widths set to 15,244 Hz (¹H), 2,070 Hz (¹⁵N) and 1,510 Hz (¹³C) for the HNCO-JCOH.
For the HNCO-JCC and HNCOCA-JCAHA experiments, 256 × 200 × 1,536 complex points were acquired in the ¹³C (t₁)/¹⁵N (t₂)/¹H (t₃) dimensions, with spectral widths of 15,244 Hz (¹H), 1,900 Hz (¹⁵N) and 1,214 Hz/5,050 Hz (¹³C). The recycle delay was set to 200 ms and the acquisition time to 100 ms, with 16 scans per increment, and the data were acquired in the non-uniform sampling format (2,246 points for the HNCO-JCOH and 7,680 for the HNCO-JCC/HNCOCA-JCAHA experiments, sampled using the schedule generator from the web portal nus@HMS; http://gwagner.med.harvard.edu/intranet/hmsIST/). The time domain data were converted into the NMRPipe⁶³ format and reconstructed using the sparse multidimensional iterative lineshape-enhanced method (SMILE)⁷³. Coupling constants were obtained from line splittings in the ¹³C or ¹⁵N dimension using CCPN analysis software⁶⁴.

¹⁹F NMR experiments were recorded on a 500 MHz Bruker Avance III spectrometer equipped with a TCI cryoprobe at 298 K (unless otherwise indicated) using a 350 ms acquisition time and 1.5–3 s recycle delay as previously described⁶. We used an amber-suppression strategy to incorporate the unnatural amino acid tfmF, as previously described⁶. Multiple experiments were recorded in succession to monitor sample integrity over time, also as previously described⁶. Data were processed using NMRPipe⁶³. Spectra were baseline corrected, peaks were fit to Lorentzian functions and errors of the linewidths and integrals (that is, populations) were estimated using bootstrapping (200 iterations, calculating the standard error of the mean), or from the spectral noise for states whose resonance was not detectable, in MATLAB⁶. ¹⁹F-translational diffusion experiments were performed as previously described⁶.

Thermodynamic parameters of folding (ΔH, ΔS and ΔC_p) were obtained from a nonlinear fit to a modified Gibbs–Helmholtz equation, assuming ΔC_p remains constant across the experimental temperature range:

$${\rm{ln}}\left({K}_{{\rm{eq}},T}\right)=-\left(\frac{\Delta {H}_{{T}_{0}}+\Delta {C}_{{\rm{p}}}(T-{T}_{0})}{R}\right)\left(\frac{1}{T}\right)+\left(\frac{\Delta {S}_{{T}_{0}}+\Delta {C}_{{\rm{p}}}{\rm{ln}}\left(\frac{T}{{T}_{0}}\right)}{R}\right)$$

(S1)

K_eq is the equilibrium constant, R is the gas constant, T is the temperature in Kelvin and T₀ is the standard temperature (298 K). We also fitted the data to the linear van ’t Hoff equation (assuming ΔC_p = 0).

$${\rm{ln}}\left({K}_{{\rm{eq}}}\right)=-\left(\frac{\Delta H}{R}\right)\left(\frac{1}{T}\right)+\frac{\Delta S}{R}$$

(S2)

The SciPy optimize.curve_fit function was used to perform the fits⁷⁴ and errors were estimated as one s.d., taken as the square root of the diagonal elements of the parameter covariance matrix. All parameters (ΔH, ΔS, ΔC_p) generally showed strong correlations with each other (r ≥ 0.8), and thus their uncertainties are also correlated. These parameter correlations are expected⁷⁵. The magnitudes of ΔH and −TΔS are also expected to correlate because we study the temperature dependence of folding in a range where ΔG of folding is close to 0.
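The two models in equations (S1) and (S2) can be written out as a short sketch. To keep the example free of dependencies, it swaps in a closed-form linear regression for the van ’t Hoff fit instead of the curve_fit routine named above, and it only evaluates (not fits) the Gibbs–Helmholtz form. The energy unit (kcal mol^−1) and the synthetic ΔH and ΔS values are assumptions made for illustration.

```python
import math

R = 1.987204e-3  # gas constant in kcal mol^-1 K^-1 (unit choice is an assumption)
T0 = 298.0       # reference temperature in K, as in the text


def ln_keq_gibbs_helmholtz(temperature, dh0, ds0, dcp):
    """Equation (S1): ln K_eq with a temperature-independent heat capacity change."""
    enthalpy = dh0 + dcp * (temperature - T0)
    entropy = ds0 + dcp * math.log(temperature / T0)
    return -enthalpy / (R * temperature) + entropy / R


def fit_vant_hoff(temperatures, ln_keq):
    """Equation (S2): linear least squares of ln K_eq against 1/T.

    Returns (dH, dS); the slope equals -dH/R and the intercept equals dS/R.
    """
    x = [1.0 / t for t in temperatures]
    mean_x = sum(x) / len(x)
    mean_y = sum(ln_keq) / len(ln_keq)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, ln_keq))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope * R, intercept * R


# Synthetic, noise-free data generated with dCp = 0 (illustrative values only).
temps = [278.0, 283.0, 288.0, 293.0, 298.0, 303.0]
synthetic = [ln_keq_gibbs_helmholtz(t, dh0=-20.0, ds0=-0.065, dcp=0.0) for t in temps]
dh_fit, ds_fit = fit_vant_hoff(temps, synthetic)
print(f"dH = {dh_fit:.3f} kcal/mol, dS = {ds_fit * 1000:.2f} cal/mol/K")
```

With ΔC_p set to zero, equation (S1) reduces exactly to equation (S2), which is why the linear fit recovers the input values here.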

Folding free energies were calculated from the experimental populations using ΔG = −RT ln(K). The folding free energy of the FLN5+67 wild-type RNC, where no unfolded state is observable, was estimated on the basis of two destabilizing mutants, FLN5(V664A/F665A) and FLN5(V707A). The stability of these mutants was measured using ¹⁹F NMR on and off the ribosome. The FLN5+67 wild-type folding energy (ΔG_N-U) was then calculated as the average from the V664A/F665A and V707A mutants using ΔG_WT,+67 = ΔG_mut,+67 − ΔΔG_mut-WT,iso, where ΔΔG_mut-WT,iso is the experimentally measured destabilization in isolation. Given that at FLN5+34 both mutants show a weaker destabilization than in isolation, we reasoned that this estimate of the FLN5+67 wild-type folding free energy is its lower bound (most negative).
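As a worked illustration of the arithmetic in this paragraph, the sketch below converts populations into a folding free energy and then applies the mutant-based estimate for the wild type. It assumes K is the folded-to-unfolded population ratio and that ΔΔG_mut-WT,iso equals ΔG_mut,iso − ΔG_WT,iso; the numerical values are hypothetical and are not taken from the paper.

```python
import math

R = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def folding_free_energy(p_folded, p_unfolded, temperature):
    """dG = -RT ln(K), with K taken as p_folded / p_unfolded (assumed convention)."""
    return -R * temperature * math.log(p_folded / p_unfolded)


def wild_type_estimate(dg_mut_rnc, ddg_mut_wt_iso):
    """dG_WT,+67 = dG_mut,+67 - ddG_mut-WT,iso for a single mutant."""
    return dg_mut_rnc - ddg_mut_wt_iso


def mean(values):
    return sum(values) / len(values)


# Hypothetical inputs: (dG of the mutant RNC, destabilization measured in isolation).
mutants = {
    "V664A/F665A": (-1.0, 3.0),
    "V707A": (-1.6, 2.2),
}
estimates = [wild_type_estimate(dg, ddg) for dg, ddg in mutants.values()]
dg_wt = mean(estimates)
print(f"Estimated wild-type dG = {dg_wt:.2f} kcal/mol")
```

Averaging the per-mutant estimates follows the description in the text; because the destabilization on the ribosome may be weaker than in isolation, the result is best read as a bound rather than a point estimate.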

¹⁹F transverse relaxation rate (R₂) measurements were recorded using a Hahn-echo sequence and acquired as pseudo-2D experiments with relaxation delays of 0.1 to 200 ms. Data were processed using NMRPipe and analysed using MATLAB. Data were fit to lineshapes and R₂ was obtained by fitting the integrals to single exponential functions. We also orthogonally determined R₂ from linewidth measurements of spectra acquired by 1D ¹⁹F pulse-acquire experiments, which showed excellent correlations. The lineshape-derived R₂ values also showed good correlation with previously determined rotational correlation times²⁵ (τ_C). We additionally determined the S²τ_C of FLN5 in 60% glycerol at 278 K by measurements of triple quantum build-up and single quantum relaxation as previously described^76,77. Thus, our R₂ values can be used to determine rotational correlation times (τ_C,exp). The obtained τ_C,exp was used to estimate the bound population as $\frac{{\tau }_{{\rm{C}},\exp }-{\tau }_{{\rm{C,iso}}}}{{\tau }_{{\rm{C,bound}}}-\,{\tau }_{{\rm{C,iso}}}}$, where τ_C,iso is the rotational correlation time of the isolated protein²⁵ (7.7 ns at 298 K) and τ_C,bound is the expected rotational correlation time of the bound state. τ_C,bound is taken as the rotational correlation time of the ribosome itself (~3,000 ns at 298 K) for a fully rigid bound state. From the bound populations (p_B), the resulting change in the folding free energies of the intermediates was calculated as ΔΔG_I-U,RNC-iso = RT ln(1 − p_B). We report the estimate for a fully rigid bound state in the main text (${S}_{{\rm{bound}}}^{2}=1.0$) but note that even one order of magnitude more flexibility in the bound state (${S}_{{\rm{bound}}}^{2}=0.1$) only accounts for up to 1.1 ± 0.6 and 0.4 ± 0.1 kcal mol^−1 of stabilization for I1 and I2 on the ribosome at FLN5+47, respectively. These estimates still cannot account for the >4 kcal mol^−1 of intermediate stabilization observed on the ribosome⁶.
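The bound-population estimate and the resulting free energy change can be sketched in a few lines. The correlation times of the isolated protein (7.7 ns) and the ribosome (~3,000 ns) come from the text; the experimental correlation time used below is a hypothetical value, and the unit choice for R is an assumption.

```python
import math

R = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def bound_population(tau_exp, tau_iso, tau_bound):
    """p_B = (tau_exp - tau_iso) / (tau_bound - tau_iso); all times in the same unit."""
    return (tau_exp - tau_iso) / (tau_bound - tau_iso)


def ddg_from_bound_population(p_bound, temperature):
    """ddG = RT ln(1 - p_B); negative values correspond to stabilization."""
    return R * temperature * math.log(1.0 - p_bound)


TAU_ISO_NS = 7.7       # isolated protein at 298 K (from the text)
TAU_BOUND_NS = 3000.0  # fully rigid bound state, taken as the ribosome (from the text)
tau_exp_ns = 50.0      # hypothetical measured value

p_b = bound_population(tau_exp_ns, TAU_ISO_NS, TAU_BOUND_NS)
ddg = ddg_from_bound_population(p_b, temperature=298.0)
print(f"p_B = {p_b:.4f}, ddG = {ddg:.4f} kcal/mol")
```

Because τ_C,bound is several hundred times larger than τ_C,iso, a modest rise in the measured correlation time maps onto a small bound population, and the logarithm makes the corresponding free energy change smaller still.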

All NMR experiments of RNCs are recorded and continuously interleaved with a series of 1D ¹H/¹⁹F spectra and ¹H,¹⁵N/¹⁹F diffusion measurements^6,7,61,78. These provide the most sensitive means to assess changes in the sample, and when alterations in signal intensities, linewidths (that is, transverse relaxation rates), chemical shifts or translational diffusion measurements of the nascent chain are observed, data acquisition is halted. Only data corresponding to intact RNCs are summed together and subjected to a final round of analysis. Where signal-to-noise remains low, datasets from multiple samples are compared to ensure identical spectra before summation into a single NMR spectrum. Biochemical assays provide an orthogonal means to assess nascent chain attachment to the ribosome. Identical samples incubated in parallel with NMR samples are analysed by SDS–PAGE (under low pH conditions⁶¹) and detected with nascent-chain-specific antibodies. Ribosome-bound species migrate with an additional ~17-kDa band-shift relative to released nascent chains due to the presence of the tRNA covalently linked to the nascent chain. Combined with time-resolved NMR measurements, these analyses confirm that the reported NMR resonances originate exclusively from intact RNCs.

Mass spectrometry

FLN5 A₃A₃ was buffer exchanged into 100 mM (NH₄)₂CO₃ at pH 6.8 (using formic acid for pH adjustment). Analyses were run on the Agilent 6510 QTOF LC–MS system at the UCL Chemistry Mass Spectrometry Facility. Samples contained ~10–20 μM of protein and 10 μl were injected onto a liquid chromatography column (PLRP-S, 1,000 Å, 8 μm, 150 mm × 2.1 mm, maintained at 60 °C). The liquid chromatography was run using water with 0.1% formic acid as mobile phase A and acetonitrile with 0.1% formic acid as phase B with a gradient elution and a flow rate of 0.3 ml min^−1. ESI mass spectra were continuously acquired. The data were processed to zero charge mass spectra with the MassHunter software, utilizing the maximum entropy deconvolution algorithm.

Small-angle X-ray scattering

We measured SAXS of an isolated FLN5 A₃A₃ C747V sample in Tico buffer supplemented with 1% (w/v) glycerol. Data collection was performed at the DIAMOND B21 beamline (UK)⁷⁹ with a beam wavelength of 0.9408 Å, a flux of 4 × 10¹² photons s^−1 and an EigerX 4M (Dectris) detector positioned 3.712 m from the sample. A capillary with a 1.5 mm diameter kept at 283 K was used for data acquisition. We acquired SAXS data at multiple protein concentrations (5.5, 2.75, 1.38, 0.69, 0.34 and 0.17 mg ml^−1) to assess whether the sample exhibited signs of aggregation or interparticle interference. At 5.5 mg ml^−1, we observed weak signs of interparticle interference in the low q region of the scattering profile, which is also reflected in the R_g obtained by Guinier analysis (using the autorg tool from ATSAS⁸⁰; Supplementary Table 1). Data were recorded as a series of frames; non-defective frames were averaged and buffer subtracted with PRIMUS⁸⁰. Size-exclusion chromatography–SAXS (SEC–SAXS) experiments were additionally performed in Tico buffer with 1% (w/v) glycerol using a KW402.5 (Shodex) column to confirm the monodispersity of the sample. We chose the 2.75 mg ml^−1 dataset as the final dataset to compare with our molecular dynamics simulations. This dataset exhibited the highest signal-to-noise ratio among those that did not show signs of interparticle interference, and accordingly, the R_g obtained from the 2.75 mg ml^−1 dataset is consistent with the values obtained from lower concentrations and the main SEC–SAXS peak (Supplementary Table 1).
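For readers unfamiliar with Guinier analysis, the underlying relation is ln I(q) = ln I(0) − q²R_g²/3 at low q. The sketch below fits that line to synthetic data. It is not a reimplementation of autorg, which selects the fitting range automatically; here the q range, the R_g and the I(0) are hypothetical values chosen so that q·R_g stays below about 1.

```python
import math


def guinier_fit(q, intensity):
    """Linear fit of ln I(q) versus q^2; returns (R_g, I0).

    Uses ln I = ln I0 - (R_g**2 / 3) * q**2, valid only in the low-q regime.
    """
    x = [qi * qi for qi in q]
    y = [math.log(i) for i in intensity]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.sqrt(-3.0 * slope), math.exp(intercept)


# Synthetic, noise-free profile (hypothetical R_g in Angstrom, arbitrary I0).
true_rg, true_i0 = 30.0, 1.0
q_values = [0.005 + 0.002 * k for k in range(15)]  # Angstrom^-1, q * R_g <= 0.99
profile = [true_i0 * math.exp(-((qi * true_rg) ** 2) / 3.0) for qi in q_values]

rg_fit, i0_fit = guinier_fit(q_values, profile)
print(f"R_g = {rg_fit:.2f} A, I(0) = {i0_fit:.3f}")
```

An upturn or downturn of ln I(q) at the lowest q values, relative to this straight line, is the usual signature of aggregation or interparticle repulsion, respectively.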

Circular dichroism spectroscopy

The circular dichroism (CD) spectrum of isolated FLN5 A₃A₃ V747 was acquired in 10 mM Na₂HPO₄ pH 7.5 at 283 K. A Chirascan-plus CD spectrometer (Applied Photophysics), a protein concentration of 44 μM and a cuvette with a 0.5 cm pathlength were used.

HRAS refolding experiments

HRAS refolding experiments were performed with the HRAS G-domain (residues 1–166). The protein was unfolded overnight at 298 K in Tico buffer with 2 mM β-mercaptoethanol and 8 M urea at a protein concentration of 15 μM. The protein was then refolded by rapid dilution into Tico buffer (supplemented with 2 mM β-mercaptoethanol and 50 μM GDP) to reach final urea and protein concentrations of 0.94 M and 1.76 μM, respectively, and allowed to incubate at 298 K for 24 h. For NMR analyses of refolded samples, we prepared 18 μM of refolded protein with the same urea concentrations and dilutions.
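The stated final concentrations follow from a single dilution factor, which can be checked with a short calculation. The sketch assumes that the refolding buffer contains no urea or protein before mixing; the variable names are illustrative.

```python
def dilution_factor(stock_concentration, final_concentration):
    """Fold dilution needed to go from the stock to the final concentration."""
    return stock_concentration / final_concentration


UREA_STOCK_M = 8.0        # urea in the unfolding reaction (from the text)
UREA_FINAL_M = 0.94       # urea after dilution (from the text)
PROTEIN_STOCK_UM = 15.0   # protein in the unfolding reaction (from the text)

factor = dilution_factor(UREA_STOCK_M, UREA_FINAL_M)
protein_final_um = PROTEIN_STOCK_UM / factor
print(f"dilution = {factor:.2f}-fold, final protein = {protein_final_um:.2f} uM")
```

The roughly 8.5-fold dilution that lowers urea from 8 M to 0.94 M also lowers the protein from 15 μM to about 1.76 μM, in agreement with the values quoted above.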

We assayed the functional/activity state of HRAS using a GDP/GTP nucleotide exchange (‘activity’) assay⁸¹ with fluorescently labelled GTP that exhibits higher fluorescence when bound to HRAS than when free in solution (BODIPY FL GTP, ThermoFisher). HRAS (0.4 μM), BODIPY GTP (0.01 μM) and SOS_cat (1 μM; the catalytic domain of Son of sevenless) were incubated at room temperature, and the maximum (plateau) fluorescence was recorded and normalized by the signal of the buffer (signal/noise ratio). SOS_cat was produced as previously described⁸². Fluorescence measurements were performed using the CLARIOstar microplate reader (BMG Labtech) with excitation and emission wavelengths set to 488 and 514 nm, respectively.

The proteolytic stability of HRAS was assayed with thermolysin at a concentration of 0.05 mg ml^−1 incubated with HRAS samples over the course of 5 h in vitro and 9 h in rabbit reticulocyte lysate (RRL, TNT coupled reticulocyte lysate, Promega). Reactions were quenched with 23 mM EDTA. Timepoints were analysed by western blot using a pan-RAS polyclonal antibody (ThermoFisher, 1:1,000 dilution) and an anti-rabbit IgG horseradish peroxidase-linked secondary antibody (Cell Signaling Technology, 1:1,000 dilution). Densitometry analyses were performed with ImageJ⁸³. For the RRL experiments, refolding reactions were performed in RRL for 24 h at 298 K at a final HRAS concentration of 1.6 μM, followed by pulse proteolysis. We quantified the relative band intensities (refolded/control) for each time point to account for increased background on the western blot during the proteolysis reaction.

Molecular dynamics simulations

We used the FLN5 A₃A₃ C747V sequence for all simulations. A reliability and reproducibility checklist is provided in Supplementary Table 8. GROMACS (version 2021)⁸⁴ was used for all all-atom molecular dynamics simulations in explicit solvent. We employed the Charmm36m force field in combination with the CHARMM TIP3P water model (C36m) and the CHARMM TIP3P water model with an increased water hydrogen LJ well-depth (denoted here as C36m+W)⁸⁵. We also used the a99sb-disp force field together with the a99sb-disp TIP4P-D water model⁸⁶. Default protonation states were used in all cases. Starting from a random extended conformation, for all force field combinations the system was solvated in a dodecahedron box with 151,135 water molecules and 12 mM MgCl₂ (resulting in 455,116 atoms and an initial box volume of 4,688 nm³). Systems were then energy minimized using the steepest-descent algorithm. For the following dynamics simulations, we used the LINCS algorithm⁸⁷ to constrain all bonds connected to hydrogen and a timestep of 2 fs using the leap-frog algorithm for integration. Nonbonded interactions were calculated with a cut-off at 1.2 nm (including a switching function at 1.0 nm for van der Waals interactions) and the particle mesh Ewald (PME) method⁸⁸ was used for long-range electrostatic calculations. We then equilibrated the systems in two phases. First, we performed a 500 ps equilibration simulation in the NVT ensemble with position restraints on all protein heavy atoms. The temperature was kept at 283 K using the velocity rescaling algorithm⁸⁹ and a time constant of 0.1 ps. Next, we further equilibrated the systems for 500 ps in the NPT ensemble at 283 K and a pressure of 1 bar with a compressibility of 4.5 × 10^−5 bar^−1 using the Berendsen barostat⁹⁰.
Following equilibration, we relaxed our initial structure for 100 ns at 283 K without any position restraints using the Parrinello–Rahman algorithm⁹¹ and then picked five structures from this simulation for production simulations. We ran 5 × 2 μs simulations (with different initial coordinates and velocities), yielding a total of 10 μs of sampling per force field. For the C36m+W combination, we ran an additional 5 × 2 μs starting from five new starting structures, yielding 20 μs in total.

We also generated a prior ensemble with a physics-based coarse-grained (C-alpha) model. We generated the C-alpha model template from the FLN5 crystal structure using SMOG 2.3⁹², where all bonded terms have a global energy minimum at the values taken in the crystal structure⁹³. Nonbonded van der Waals interactions were modelled using a 10–12 Lennard-Jones potential with the σ and λ parameters of the M1 parameter set determined by Tesei et al.⁹⁴ (equation (S3)). We used the arithmetic mean of the values for the two residues to determine σ and λ. Electrostatic interactions were modelled using the Debye–Hückel theory with parameters described previously⁷. Interactions between Cα beads separated by less than four residues were excluded. We ran initial simulations at a range of reduced temperatures to determine the effect on the average compactness and ran final simulations at a reduced temperature of 1.247 (150 K in GROMACS) as we did not observe a significant increase in average R_g beyond this temperature. Simulations were run for a total of 3 × 10⁹ steps with GROMACS (v2018.3).

$${u}_{{\rm{LJ}}}=\mathop{\sum }\limits_{i}^{N}\lambda \left[5{\left(\frac{\sigma }{r}\right)}^{12}-6{\left(\frac{\sigma }{r}\right)}^{10}\right]$$

(S3)
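A single pair term of equation (S3) and the two conventions mentioned above can be sketched as follows. The residue parameters are hypothetical placeholders (not M1 values), and reading the reduced temperature as k_BT in kJ mol^−1, which reproduces the quoted pairing of 1.247 with 150 K, is an assumption about the unit convention.

```python
KB_KJ_PER_MOL_K = 0.0083144626  # Boltzmann constant in kJ mol^-1 K^-1


def lj_10_12(r, sigma, lam):
    """One pair term of equation (S3): lam * [5 (sigma/r)^12 - 6 (sigma/r)^10]."""
    x = sigma / r
    return lam * (5.0 * x ** 12 - 6.0 * x ** 10)


def arithmetic_mean(a, b):
    """Combination rule used for both sigma and lambda of a residue pair."""
    return 0.5 * (a + b)


def reduced_temperature(temperature_kelvin):
    """k_B T in kJ/mol (assumed definition of the reduced temperature)."""
    return KB_KJ_PER_MOL_K * temperature_kelvin


# Hypothetical per-residue parameters: (sigma in nm, lambda, dimensionless).
residue_a = (0.60, 0.70)
residue_b = (0.50, 0.30)
sigma_ab = arithmetic_mean(residue_a[0], residue_b[0])
lam_ab = arithmetic_mean(residue_a[1], residue_b[1])

print(f"u(sigma) = {lj_10_12(sigma_ab, sigma_ab, lam_ab):.3f}")
print(f"T* at 150 K = {reduced_temperature(150.0):.3f}")
```

With this functional form the minimum lies at r = σ and has depth −λ, so λ directly sets the attraction strength of each residue pair.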

After simulations, the coarse-grained ensemble was backmapped to an all-atom structure using PULCHRA (v3.06)⁹⁵.

RNC simulations were parameterized using the C36m+W force field/water model combination^85,96. We modelled the ribosome using the structure PDB 4YBB⁹⁷ as a template, which we previously refined against a cryo-EM map containing an FLN5 RNC⁹⁸. As in our previous work, we only retained ribosome atoms around the nascent chain exit tunnel and accessible surface outside the vestibule⁷. The FLN6 linker and SecM sequence were initially modelled using a cryo-EM map of a FLN5+47 RNC (Mitropoulou et al., manuscript in preparation). The rest of the nascent chain (MHHHHHAS N-terminal tag and FLN5) was then built using PyMol version 2.3 (The PyMol Molecular Graphics System, Schrödinger) and we generated a random initial starting structure with a short simulation using a structure-based force field, SMOG 2.3⁹², without native contacts. The FLN5+31 A₃A₃ RNC (containing the C747V mutation) complex was then centred in a dodecahedral box, solvated using 1,030,527 water molecules and neutralized with 706 Mg²⁺ ions, resulting in a final system size of 3,163,127 atoms. The initial box volume was 32,117 nm³. The large box size was necessary to accommodate the highly expanded unfolded state. We then used the same cut-offs and simulation methods as for the isolated protein. We initially also ran a 500 ps equilibration simulation in the NVT ensemble using position restraints on all heavy atoms with a force constant of 1,000 kJ mol^−1 nm^−2 along the x, y and z axes. We used a temperature of 283 K, which was held constant using the velocity rescaling algorithm⁸⁹ and a time constant of 0.1 ps. Then, we ran a 500 ps equilibration simulation in the NPT ensemble at 283 K using the same position restraints. The pressure was kept at 1 bar with a compressibility of 4.5 × 10^−5 bar^−1 using the Berendsen barostat⁹⁰. The position restraints for all nascent chain atoms (except the terminal residue at the PTC in the ribosome) were then removed, while the ribosome atoms remained position restrained.
In this setup, we ran a 1 ns equilibration simulation at 283 K and 1 bar, using the Parrinello–Rahman algorithm⁹¹. All production simulations were performed using position restraints for the ribosome atoms and the C-terminal nascent chain residue at the PTC. Using the equilibrated configuration, we then ran two simulations of ~100 ns to pick ten starting structures for production simulations. Then, ten production simulations of 1.5 μs each (15 μs in total) were initiated from these different starting structures using random initial velocities. Before the production simulation, each structure was re-equilibrated at 283 K and 1 bar with a 500 ps NVT and a 500 ps NPT simulation.

Lastly, to compare our C36m+W simulations with a model that only considers steric exclusion as a nonbonded interaction, we also ran simulations of a simple all-atom model, based on a structure-based model template⁹². We used the FLN5 crystal structure to define the energy minima of all bond and dihedral angles and removed all native contacts. Simulations of isolated and ribosome-bound FLN5 A₃A₃ were run for 1 × 10⁹ steps and 100,000 frames were harvested for analysis. This ensemble was used to compare the expansion of the ensemble, ribosome interactions and conformational entropy with the C36m+W simulations.

Calculation of PREs

The transverse PRE rates of backbone amide groups, Γ₂, were back-calculated from the ensembles using the Solomon–Bloembergen equation^99,100

$${\varGamma }_{2}=\frac{1}{15}{\left(\frac{{\mu }_{0}}{4\pi }\right)}^{2}{\gamma }_{{\rm{H}}}^{2}\,{g}_{{\rm{e}}}^{2}{\mu }_{{\rm{B}}}^{2}S(S+1)[4J(0)+3J({\omega }_{{\rm{H}}})]$$

(S4)

where μ₀ is the permeability of free space, γ_H is the gyromagnetic ratio of the proton, g_e is the electron g-factor, μ_B is the Bohr magneton, S is the electron spin quantum number (1/2 for a nitroxide), ω_H is the proton Larmor frequency and J(ω) is the generalized spectral density function. For flexible spin labels attached via rotatable bonds the spectral density can be expressed as in equation (S5)¹⁰¹.

$$J({\omega }_{{\rm{H}}})=\langle {r}^{-6}\rangle \left[\frac{{S}^{2}{\tau }_{{\rm{c}}}}{1+{({\omega }_{{\rm{H}}}{\tau }_{{\rm{c}}})}^{2}}+\frac{(1-{S}^{2}){\tau }_{{\rm{t}}}}{1+{({\omega }_{{\rm{H}}}{\tau }_{{\rm{t}}})}^{2}}\right]$$

(S5)

where $\langle {r}^{-6}\rangle $ is the average of r^−6 over the electron–hydrogen distance (r) distribution, S² is the generalized order parameter for the electron–hydrogen interaction vector, and τ_C is the correlation time defined in terms of the rotational correlation time of the protein (τ_r) and the electron spin relaxation time (τ_s):

$${\tau }_{c}={\left({\tau }_{r}^{-1}+{\tau }_{s}^{-1}\right)}^{-1}$$

(S6)

τ_t is the total correlation time defined as:

$${\tau }_{{\rm{t}}}={\left({\tau }_{r}^{-1}+{\tau }_{s}^{-1}+{\tau }_{{\rm{i}}}^{-1}\right)}^{-1}$$

(S7)

τ_i is the internal correlation time of the spin label. Since for nitroxide labels electron spin relaxation occurs on a much slower timescale than rotational tumbling^101,102, τ_C can be approximated by τ_r, such that the expression for τ_t simplifies to

$${\tau }_{{\rm{t}}}={\left({\tau }_{{\rm{C}}}^{-1}+{\tau }_{i}^{-1}\right)}^{-1}$$

(S8)

Given that τ_C is not known a priori, we iteratively scanned τ_C values in the range of 1 to 15 ns to find a value for which optimal agreement with the experimental data is achieved (as judged by the reduced χ²)^94,103. The spin label correlation time¹⁰⁴, τ_i, was set to 500 ps, in agreement with molecular dynamics simulations¹⁰⁵ and electron spin resonance measurements¹⁰⁶.
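To make equations (S4), (S5) and (S8) concrete, the sketch below evaluates Γ₂ for a single electron–proton distance, so that ⟨r^−6⟩ reduces to r^−6. It uses the standard S(S + 1) spin factor with S = 1/2 and CODATA constants in SI units. The distance, order parameter and field strength are hypothetical inputs; τ_C = 3 ns and τ_i = 500 ps are taken from the text.

```python
import math

MU0_OVER_4PI = 1.0e-7      # T m A^-1
GAMMA_H = 2.6752218744e8   # proton gyromagnetic ratio, rad s^-1 T^-1
G_E = 2.00231930436        # electron g-factor (magnitude)
MU_B = 9.2740100783e-24    # Bohr magneton, J T^-1
S_ELECTRON = 0.5           # electron spin quantum number of a nitroxide


def spectral_density(omega, r, s2, tau_c, tau_i):
    """Equation (S5) for one distance r (metres); tau_t follows equation (S8)."""
    tau_t = 1.0 / (1.0 / tau_c + 1.0 / tau_i)
    slow = s2 * tau_c / (1.0 + (omega * tau_c) ** 2)
    fast = (1.0 - s2) * tau_t / (1.0 + (omega * tau_t) ** 2)
    return r ** -6 * (slow + fast)


def gamma2(r, s2, tau_c, tau_i, omega_h):
    """Equation (S4): transverse PRE rate in s^-1."""
    prefactor = (
        (1.0 / 15.0) * MU0_OVER_4PI ** 2 * GAMMA_H ** 2 * G_E ** 2 * MU_B ** 2
        * S_ELECTRON * (S_ELECTRON + 1.0)
    )
    j0 = spectral_density(0.0, r, s2, tau_c, tau_i)
    jw = spectral_density(omega_h, r, s2, tau_c, tau_i)
    return prefactor * (4.0 * j0 + 3.0 * jw)


OMEGA_H = 2.0 * math.pi * 800.0e6  # proton Larmor frequency at 800 MHz, rad s^-1
rate = gamma2(r=1.5e-9, s2=0.8, tau_c=3.0e-9, tau_i=500.0e-12, omega_h=OMEGA_H)
print(f"Gamma2 at 15 A = {rate:.1f} s^-1")
```

In an ensemble calculation the single-distance term would be replaced by a weighted average over rotamers and conformers, and τ_C would be scanned as described above rather than fixed.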

The generalized order parameter S² for the electron–hydrogen interaction vector can be decomposed into its radial and angular components¹⁰⁷:

$${S}_{{\rm{PRE}}}^{2}\approx {S}_{{\rm{PRE}},{\rm{angular}}}^{2}{S}_{{\rm{PRE}},{\rm{radial}}}^{2}$$

(S9)

where the individual components are defined as

$${S}_{{\rm{PRE,angular}}}^{2}=\frac{4\pi }{5}\mathop{\sum }\limits_{m=-2}^{2}{\left|\langle {Y}_{2}^{m}({\varOmega }^{{\rm{mol}}})\rangle \right|}^{2}$$

(S10)

$${S}_{{\rm{PRE}},{\rm{radial}}}^{2}=\langle {r}^{-6}{\rangle }^{-1}\langle {r}^{-3}{\rangle }^{2}$$

(S11)

and ${Y}_{2}^{m}$ are the second order spherical harmonics and Ω^mol are the Euler angles in the molecular frame. A weighted ensemble average of S² can be calculated by taking a weighted ensemble average of the individual radial and angular components.
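Equations (S9) to (S11) can be evaluated without computing spherical harmonics explicitly. The sketch below uses the spherical harmonic addition theorem, by which (4π/5) Σ_m |⟨Y₂^m⟩|² equals the weighted double average of P₂(cos θ_ij) over all pairs of interaction vectors. The function names and the example vectors are illustrative, not part of the published workflow.

```python
import math


def _normalised_weights(n, weights):
    """Uniform weights by default; otherwise rescale so the weights sum to 1."""
    if weights is None:
        return [1.0 / n] * n
    total = sum(weights)
    return [w / total for w in weights]


def radial_order_parameter(vectors, weights=None):
    """Equation (S11): <r^-3>^2 / <r^-6> over electron-proton vectors."""
    w = _normalised_weights(len(vectors), weights)
    lengths = [math.sqrt(sum(c * c for c in v)) for v in vectors]
    mean_r3 = sum(wi * r ** -3 for wi, r in zip(w, lengths))
    mean_r6 = sum(wi * r ** -6 for wi, r in zip(w, lengths))
    return mean_r3 ** 2 / mean_r6


def angular_order_parameter(vectors, weights=None):
    """Equation (S10) via the addition theorem: sum_ij w_i w_j P2(u_i . u_j)."""
    w = _normalised_weights(len(vectors), weights)
    units = []
    for v in vectors:
        length = math.sqrt(sum(c * c for c in v))
        units.append([c / length for c in v])
    total = 0.0
    for wi, ui in zip(w, units):
        for wj, uj in zip(w, units):
            cos_ij = sum(a * b for a, b in zip(ui, uj))
            total += wi * wj * (1.5 * cos_ij * cos_ij - 0.5)
    return total


def pre_order_parameter(vectors, weights=None):
    """Equation (S9): product of the angular and radial components."""
    return angular_order_parameter(vectors, weights) * radial_order_parameter(vectors, weights)


# Illustrative electron-proton vectors in nm for three hypothetical rotamers.
example = [(1.5, 0.0, 0.0), (1.4, 0.3, 0.0), (1.6, -0.2, 0.1)]
print(f"S2_PRE = {pre_order_parameter(example):.3f}")
```

The pairwise sum scales quadratically with the number of vectors, which is inexpensive for a rotamer library of a few hundred members.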

A previously published rotamer library containing 216 MTSL rotamers¹⁰⁸ was used to explicitly model the flexibility of the spin label, similar to other existing methods^109,110. The rotamer library was aligned to all employed labelling sites for each conformer using the backbone atoms of the labelling site and the Cys-MTSL moiety. Clashing rotamers were discarded, where a steric clash between the rotamer and the protein was defined using a 2.5 Å cut-off distance. Only backbone and Cβ atoms were considered for the protein, assuming sidechains can rearrange to accommodate the MTSL rotamer¹¹¹. For MTSL, only the sidechain was included (heavy atoms beyond the Cβ atom). Protein frames for which at least one labelling position cannot sterically allow any MTSL rotamers were discarded. The rotamer library was used to calculate a weighted ensemble-averaged Γ₂ over the rotamer ensemble for each protein conformer in the protein ensemble using equations (S4)–(S11). The protein ensemble average can then be calculated by averaging Γ₂ over the ensemble.

PRE intensity ratios were then calculated from the ensemble-averaged PRE rate, $\langle {\varGamma }_{2}\rangle $, using

$$\frac{{{\rm{I}}}_{{\rm{p}}{\rm{a}}{\rm{r}}{\rm{a}}}}{{{\rm{I}}}_{{\rm{d}}{\rm{i}}{\rm{a}}}}=\frac{{R}_{2}{{\rm{e}}}^{-2\Delta \langle {\varGamma }_{2}\rangle }}{{R}_{2}+\langle {\varGamma }_{2}\rangle }\times \frac{{R}_{2,{\rm{M}}{\rm{Q}}}}{{R}_{2,{\rm{M}}{\rm{Q}}}+\langle {\varGamma }_{2}\rangle }$$

(S12)

where R₂ is the linewidth in the proton dimension (residue-specific), R_2,MQ is the linewidth in the nitrogen dimension (multiple-quantum term) and Δ is the delay time in the HMQC experiment (5.43 ms). See Supplementary Note 3 for additional details.
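Equation (S12) gives the intensity ratio for a known PRE rate; going the other way, from a measured ratio to Γ₂, requires a numerical solution. The sketch below uses simple bisection, which works because the ratio decreases monotonically with Γ₂. It treats R₂ and R_2,MQ as rates in s^−1, and the relaxation rates in the example are hypothetical; only the 5.43 ms delay comes from the text.

```python
import math


def intensity_ratio(gamma2, r2_h, r2_mq, delay):
    """Equation (S12): I_para / I_dia for a PRE rate gamma2 (all rates in s^-1)."""
    proton_term = r2_h * math.exp(-2.0 * delay * gamma2) / (r2_h + gamma2)
    mq_term = r2_mq / (r2_mq + gamma2)
    return proton_term * mq_term


def solve_gamma2(ratio, r2_h, r2_mq, delay, upper=1.0e4, tol=1.0e-9):
    """Invert equation (S12) by bisection on the interval [0, upper]."""
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must lie in (0, 1]")
    lower = 0.0
    while upper - lower > tol:
        mid = 0.5 * (lower + upper)
        if intensity_ratio(mid, r2_h, r2_mq, delay) > ratio:
            lower = mid  # ratio still too high, so the PRE rate must be larger
        else:
            upper = mid
    return 0.5 * (lower + upper)


DELAY = 5.43e-3            # HMQC delay in s (from the text)
R2_H, R2_MQ = 30.0, 20.0   # hypothetical relaxation rates in s^-1

observed = intensity_ratio(25.0, R2_H, R2_MQ, DELAY)
recovered = solve_gamma2(observed, R2_H, R2_MQ, DELAY)
print(f"ratio = {observed:.3f}, recovered Gamma2 = {recovered:.3f} s^-1")
```

Ratios close to 0 or 1 map onto poorly determined rates, so in practice the experimental noise should be propagated through the same inversion.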

For RNCs, we considered that ribosome tethering may increase the correlation time of the electron–amide interaction vector due to restricted molecular tumbling near the exit tunnel. We therefore calculated an order parameter, ${S}_{{\rm{NC}}}^{2}$, which quantifies the motion of the electron–amide interaction vector over the entire nascent chain conformer ensemble (${S}_{{\rm{NC}}}^{2}$ is distinct from the order parameter S² that quantifies the motion of the MTSL rotamer library attached to a labelling site for a specific protein conformer; equation (S9)). ${S}_{{\rm{NC}}}^{2}$ is given by

$${S}_{{\rm{NC}}}^{2}\approx {S}_{{\rm{NC,angular}}}^{2}{S}_{{\rm{NC,radial}}}^{2}$$

(S13)

where ${S}_{{\rm{NC}},{\rm{angular}}}^{2}$ and ${S}_{{\rm{NC}},{\rm{radial}}}^{2}$ are given by

$${S}_{{\rm{NC}},{\rm{angular}}}^{2}=\frac{4\pi }{5}\mathop{\sum }\limits_{m=-2}^{2}{| \langle {Y}_{2}^{m}({\varOmega }^{{\rm{mol}}})\rangle | }^{2}$$

(S14)

$${S}_{{\rm{NC}},{\rm{radial}}}^{2}=\langle {r}^{-6}{\rangle }^{-1}\langle {r}^{-3}{\rangle }^{2}$$

(S15)

and ${Y}_{2}^{m}$ are the second-order spherical harmonics and Ω^mol are the Euler angles in the frame. We approximated the position of the free electron with the Cα atom of the labelling site in this case. An ${S}_{{\rm{NC}}}^{2}$ value of 0 indicates that the vector tumbles completely independently of the ribosome and that the correlation time of the electron–amide vector is the same as for the isolated protein, τ_C,iso. An ${S}_{{\rm{NC}}}^{2}$ value of 1 means that the vector tumbles with the same rotational correlation time as the ribosome (τ_r,70S = 3.3 μs per cP, as determined by fluorescence depolarization¹¹², and τ_r,70S = 4.3 μs at 283 K in H₂O). The effective correlation time, τ_C,eff, of each amide–electron vector is given by

$${\tau }_{{\rm{C,eff}}}\,={S}_{{\rm{NC}}}^{2}{\tau }_{{\rm{r,70S}}}+(1-{S}_{{\rm{NC}}}^{2}){\tau }_{{\rm{C,iso}}}$$

(S16)

We used a value of 3 ns for τ_C,iso, which was the optimal value determined for isolated FLN5 A₃A₃. Generally, τ_C (equation (S6)) is approximated as τ_C ≈ τ_r because electron spin relaxation, characterized by τ_s, occurs on a much slower timescale. Indeed, the spin relaxation time of nitroxides has been measured to be on a timescale from hundreds of nanoseconds to several microseconds^113,114,115. The calculated values of τ_C,eff are predominantly below 100 ns except for labelling sites C744, uL23 G90C and uL24 N53C, where values of up to ~250 ns are observed (Supplementary Tables 5 and 6). Thus, we still expect τ_C to be dominated by τ_r and make use of the τ_C ≈ τ_r approximation.
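The order parameter and effective correlation time of equations (S13)–(S16) can be sketched as follows. This is a minimal illustration, not the authors' implementation: function names are hypothetical, and the angular term is evaluated through the spherical-harmonic addition theorem (a weighted double average of P₂ over pairs of unit vectors) instead of evaluating the Y₂^m explicitly, which gives the same quantity for vectors expressed in a common frame.

```python
import math


def radial_order_parameter(distances, weights):
    """<r^-6>^-1 <r^-3>^2 over weighted conformers (radial term)."""
    r6 = sum(w * r ** -6 for r, w in zip(distances, weights))
    r3 = sum(w * r ** -3 for r, w in zip(distances, weights))
    return r3 ** 2 / r6


def angular_order_parameter(vectors, weights):
    """(4*pi/5) * sum_m |<Y_2^m>|^2, computed as <<P2(u_i . u_j)>>.

    Uses the addition theorem, so the cost is O(N^2) in the number
    of conformers; weights are assumed to sum to 1.
    """
    units = []
    for x, y, z in vectors:
        norm = math.sqrt(x * x + y * y + z * z)
        units.append((x / norm, y / norm, z / norm))
    total = 0.0
    for ui, wi in zip(units, weights):
        for uj, wj in zip(units, weights):
            cos_ij = ui[0] * uj[0] + ui[1] * uj[1] + ui[2] * uj[2]
            total += wi * wj * 0.5 * (3.0 * cos_ij ** 2 - 1.0)
    return total


def effective_correlation_time(s2_nc, tau_r_70s, tau_c_iso=3e-9):
    """Equation (S16): interpolate between ribosome and isolated tumbling."""
    return s2_nc * tau_r_70s + (1.0 - s2_nc) * tau_c_iso
```

A rigid vector gives an order parameter of 1 and therefore τ_C,eff = τ_r,70S, whereas an isotropically distributed vector gives an angular term of 0 and τ_C,eff = τ_C,iso.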

Finally, reference PRE profiles for a fully extended peptide were calculated from a linear polyalanine chain using a τ_C of 5 ns and R_2,H/R_2,MQ of 100 Hz.

Bayesian inference reweighting

We performed ensemble refinement by reweighting the molecular dynamics-derived ensembles against the experimentally deduced Γ₂ rates using the Bayesian Inference of Ensembles (BioEn) software and method described in the corresponding paper^116,117. These calculations were performed using in-house scripts of the software, modified to incorporate upper and lower bound restraints in addition to regular restraints with Gaussian errors. To this end, these inequality restraints were treated as normal Gaussian restraints but subjected to a conditional statement. Lower bound restraints (Γ₂ > 64.5 s^−1 for isolated FLN5 A₃A₃; Γ₂ > 96.0 s^−1 for the RNCs) were applied only if the back-calculated Γ₂ was below the lower bound value. Similarly, upper bound restraints (Γ₂ < 2.2 s^−1 for isolated FLN5 A₃A₃; Γ₂ < 3.7 s^−1 for the RNCs) were applied only if the back-calculated average was above the upper bound. This effectively allows the back-calculated value to vary freely above the lower bound and below the upper bound but imposes a penalty if the inequality condition is not met. The errors of the lower and upper bound values were taken as the combined relative error of that datapoint (that is, the intensity ratio).
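The conditional treatment of the inequality restraints described above can be sketched as follows. This is an illustrative reading of the text rather than the modified BioEn code; the function names and the `kind` argument are hypothetical.

```python
def restraint_residual(calc, target, kind="equal"):
    """Deviation entering the Gaussian penalty for one restraint.

    kind = "lower": penalize only if calc falls below target.
    kind = "upper": penalize only if calc rises above target.
    kind = "equal": ordinary Gaussian restraint.
    """
    if kind == "lower" and calc >= target:
        return 0.0
    if kind == "upper" and calc <= target:
        return 0.0
    return calc - target


def restraint_penalty(calc, target, sigma, kind="equal"):
    """Gaussian penalty term (calc - target)^2 / (2 sigma^2), or 0."""
    deviation = restraint_residual(calc, target, kind)
    return deviation ** 2 / (2.0 * sigma ** 2)
```

For example, a back-calculated Γ₂ of 100 s^−1 incurs no penalty against the lower bound of 64.5 s^−1, whereas a value of 50 s^−1 is penalized like a regular Gaussian restraint centred on the bound.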

As described by Köfinger et al.¹¹⁷, the reweighting optimization problem can be efficiently solved by minimizing the negative log-posterior function (L).

$$L=\theta {S}_{{\rm{KL}}}+\mathop{\sum }\limits_{i=1}^{M}\frac{{\left({\sum }_{\alpha =1}^{N}{w}_{\alpha }{y}_{i}^{\alpha }-{Y}_{i}\right)}^{2}}{2{\sigma }_{i}^{2}}$$

(S17)

θ expresses the confidence in the initial ensemble, N is the ensemble size, M is the number of experimental restraints, w_α is the vector of weights for the conformers in the ensemble, ${y}_{i}^{\alpha }$ is the back-calculated experimental value i for conformer α, Y_i is the experimental restraint i, σ_i is the uncertainty of experimental restraint i, and S_KL is the Kullback–Leibler divergence defined as

$${S}_{{\rm{KL}}}=\mathop{\sum }\limits_{\alpha =1}^{N}{w}_{\alpha }{\rm{ln}}\frac{{w}_{\alpha }}{{w}_{\alpha }^{0}}$$

(S18)

${w}_{\alpha }^{0}$ is the vector of initial weights (which were uniform). We used the log-weights method to minimize the negative log-posterior¹¹⁷ and performed reweighting calculations for a range of θ values, as the optimal value of θ cannot be known a priori. We therefore conducted L-curve analysis^117,118 by plotting S_KL (entropy) on the x axis and the goodness of fit, quantified by the reduced χ² value, on the y axis. The reduced χ² was calculated against the experimental intensity ratios (I_para/I_dia). This is an effective method to prevent overfitting and to introduce only a minimal amount of bias into the prior ensemble^117,119. After reweighting, we also calculated the effective fraction of frames contributing to the ensemble average¹¹⁹ as an indication of the extent of fitting.

$${N}_{{\rm{eff}}}=\exp (-{S}_{{\rm{KL}}})$$

(S19)
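Equations (S17)–(S19) can be evaluated for a given set of weights as in the following sketch. It only evaluates the objective and does not perform the log-weights minimization; function names and the nested-list data layout are assumptions made for illustration.

```python
import math


def kl_divergence(weights, prior):
    """Equation (S18); terms with zero weight contribute nothing."""
    return sum(w * math.log(w / w0)
               for w, w0 in zip(weights, prior) if w > 0.0)


def neg_log_posterior(weights, prior, calc, exp, sigma, theta):
    """Equation (S17).

    calc[a][i] is observable i back-calculated for conformer a;
    exp[i] and sigma[i] are the experimental value and its error.
    """
    misfit = 0.0
    for i in range(len(exp)):
        average = sum(weights[a] * calc[a][i] for a in range(len(weights)))
        misfit += (average - exp[i]) ** 2 / (2.0 * sigma[i] ** 2)
    return theta * kl_divergence(weights, prior) + misfit


def effective_fraction(weights, prior):
    """Equation (S19): N_eff = exp(-S_KL)."""
    return math.exp(-kl_divergence(weights, prior))
```

With unchanged uniform weights, S_KL = 0 and N_eff = 1; if all weight collapses onto one of two conformers, N_eff = 0.5.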

For RNCs, we used the same approach with an additional modification. Since the PRE depends on τ_C,eff and ${S}_{{\rm{NC}}}^{2}$, which are functions of the weights of individual structures in the ensemble, reweighting changes τ_C,eff and ${S}_{{\rm{NC}}}^{2}$. Therefore, the conformer-specific PRE values used as input are no longer valid after reweighting. To account for this, we performed 20 iterative rounds of reweighting, where each additional round receives input weights and τ_C,eff from the previous round. We found that this leads to convergence of the weights and conformer-specific PREs.

We found that for the ribosomal labelling sites, uL23 G90C and uL24 N53C, the reweighting results are sensitive to the specific ribosome structure to which the MTSL rotamer library is fitted, since small variations in the local structure of the labelling site can lead to different rotamer distributions. We tested two different rotamer distributions for the ribosomal labelling sites (Extended Data Fig. 5a), finding that one of them (referred to as R2) gives better agreement with the intermolecular PRE data after reweighting and fits better into the expected density of MTSL rotamers when rotamers are fitted to ten high-resolution ribosome structures (Extended Data Fig. 5a). The R2 rotamer distribution is more representative of the expected variation from structural changes in the labelling sites and was therefore used for our final reweighting calculations.

Calculation of RDCs, R_h and chemical shifts from molecular dynamics simulations

To back-calculate the R_h from static structures we used an approximate relationship between R_g and R_h values¹²⁰, the latter being calculated from the programme HYDROPRO¹²¹. Thus, we calculated the R_g from Cα atoms using MDAnalysis¹²² and then converted it to R_h using

$${R}_{{\rm{h}}}=\frac{{R}_{{\rm{g}}}}{\frac{{\alpha }_{1}\left({R}_{{\rm{g}}}-{\alpha }_{2}{N}^{0.33}\right)}{{N}^{0.60}-{N}^{0.33}}+{\alpha }_{3}}$$

(S20)

N is the number of amino acids, α₁ takes a value of 0.216 Å^−1, α₂ takes a value of 4.06 Å, and α₃ has a value of 0.821. The estimated value of R_h (relative to the HYDROPRO calculation) has an average relative uncertainty¹²⁰ of 3%. HYDROPRO itself has a relative uncertainty of ±4% with respect to experimental values¹²¹. Therefore, we treat the back-calculated ensemble-average R_h with a total relative uncertainty of ±5%. The ensemble average was calculated as previously described by Ahmed et al. for back-calculation of PFG-NMR derived values¹²³ of R_h

$$\langle {R}_{{\rm{h}}}\rangle ={\rm{ln}}{\left(\langle \exp \left(-{R}_{{\rm{h}}}^{-1}\right)\rangle \right)}^{-1}.$$

(S21)
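The per-conformer conversion of equation (S20) can be sketched as below, using the parameter values quoted in the text. The function name is hypothetical, R_g is assumed to be given in Å, and the ensemble averaging of equation (S21) is not included.

```python
def rg_to_rh(rg, n_residues, alpha1=0.216, alpha2=4.06, alpha3=0.821):
    """Equation (S20): convert R_g (in Å) of one conformer to R_h (in Å).

    alpha1 is in Å^-1, alpha2 in Å and alpha3 is dimensionless,
    as quoted in the text.
    """
    compact_term = alpha2 * n_residues ** 0.33
    length_range = n_residues ** 0.60 - n_residues ** 0.33
    ratio = alpha1 * (rg - compact_term) / length_range + alpha3
    return rg / ratio
```

When R_g equals α₂N^0.33 the first term in the denominator vanishes and R_g/R_h reduces to α₃; more expanded conformers give larger R_g/R_h ratios.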

Chemical shifts were calculated using the SHIFTX2 software¹²⁴ and RDCs were calculated using the global alignment prediction method implemented in PALES¹²⁵. We then scaled the magnitude (that is, the extent of alignment) of the calculated RDCs by a global factor to optimize the Q-factor for each ensemble.

Calculation of SAXS profiles from molecular dynamics simulations

We used Pepsi-SAXS¹²⁶ to compute the theoretical scattering profiles of each conformer in the molecular dynamics ensembles. We treated the contrast of the hydration layer (δ_p) and the effective atomic radius (r₀) as global parameters and used values of 3.34 e^− nm^−3 and 1.025 × r_m (r_m = average atomic radius of the protein), in line with previous work that showed these parameters to be well suited for flexible proteins¹²⁷. The constant background and scale factor were also fitted globally using least-squares regression^103,127. The goodness of fit was assessed using the reduced χ² metric, where n is the number of datapoints, q is the scattering angle, ${I}_{q}^{{\rm{calc}}}$ and ${I}_{{q}}^{\text{exp}}$ are the calculated and experimental scattering intensities, respectively, and σ_q is the experimental error:

$${\chi }_{r}^{2}=\frac{1}{n}\mathop{\sum }\limits_{{q}}^{n}\frac{{({I}_{{q}}^{\text{calc}}-{I}_{{q}}^{\text{exp}})}^{2}}{{\sigma }_{{q}}^{2}}$$

(S22)
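A minimal sketch of the scale and background fit followed by the reduced χ² of equation (S22) is shown below. It assumes an error-weighted linear least-squares fit of a single averaged profile, which is one plausible reading of the text; the function names are hypothetical.

```python
def fit_scale_background(i_calc, i_exp, sigma):
    """Weighted least-squares fit of i_exp ~ scale * i_calc + background."""
    w = [1.0 / s ** 2 for s in sigma]
    sw = sum(w)
    sx = sum(wi * c for wi, c in zip(w, i_calc))
    sy = sum(wi * e for wi, e in zip(w, i_exp))
    sxx = sum(wi * c * c for wi, c in zip(w, i_calc))
    sxy = sum(wi * c * e for wi, c, e in zip(w, i_calc, i_exp))
    # Closed-form solution of the 2x2 normal equations.
    det = sw * sxx - sx * sx
    scale = (sw * sxy - sx * sy) / det
    background = (sy - scale * sx) / sw
    return scale, background


def reduced_chi2(i_calc, i_exp, sigma):
    """Equation (S22): mean squared error-normalized residual."""
    n = len(i_exp)
    return sum(((c - e) / s) ** 2
               for c, e, s in zip(i_calc, i_exp, sigma)) / n
```

In use, the fitted scale and background would be applied to the calculated profile before passing it to `reduced_chi2`.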

Structural analysis

The Python packages MDAnalysis¹²² and MDTraj¹²⁸ were used for general analysis of the ensembles involving atomic coordinates. For native contact analysis, we calculated the fraction of native contacts (relative to the native FLN5 crystal structure) as¹²⁹

$$Q(X)=\frac{1}{N}\sum _{ij}\frac{1}{1+{{\rm{e}}}^{\left(\beta \left({r}_{i,j}-\lambda {r}_{i,j}^{0}\right)\right)}}$$

(S23)

where r_i,j and ${r}_{i,j}^{0}$ are the distances between atoms i and j in frame X and the template structure, respectively, β modulates the smoothness of the switching function (default value 5 Å^−1 used) and λ is a factor allowing for fluctuations of the contact distance (default value 1.8 used).
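Equation (S23) can be sketched for a single frame as follows, assuming the native contact pairs have already been selected and their distances extracted (in Å). The function name and the flat-list inputs are illustrative choices.

```python
import math


def fraction_native_contacts(r, r0, beta=5.0, lam=1.8):
    """Equation (S23) for one frame.

    r    : distances (Å) of the native contact pairs in the frame
    r0   : distances (Å) of the same pairs in the template structure
    beta : smoothness of the switching function (Å^-1)
    lam  : tolerance factor on the native contact distance
    """
    terms = [1.0 / (1.0 + math.exp(beta * (rij - lam * r0ij)))
             for rij, r0ij in zip(r, r0)]
    return sum(terms) / len(terms)
```

A pair at its native distance contributes close to 1, a pair at exactly λ times its native distance contributes 0.5, and a fully separated pair contributes close to 0.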

Asphericity was calculated using MDAnalysis¹²² as defined by Dima and Thirumalai¹³⁰:

$$\Delta =\frac{3}{2}\frac{{\sum }_{i=1}^{3}{\left({\lambda }_{i}-\bar{\lambda }\right)}^{2}}{{\left({\rm{tr}}\,T\right)}^{2}}$$

(S24)

where λ_i are the eigenvalues of the inertia tensor T and $\bar{\lambda }$ represents their mean, $\bar{\lambda }=\,\frac{{\rm{tr}}\,T}{3}$.
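The asphericity can be sketched in plain Python as below. Two assumptions are made: T is taken to be the unweighted gyration tensor of the supplied coordinates, and the denominator is read as (tr T)², following Dima and Thirumalai. The sum over eigenvalues is evaluated through the identity Σ(λ_i − λ̄)² = tr(T²) − (tr T)²/3, which avoids an explicit diagonalization. Function names are hypothetical.

```python
def gyration_tensor(coords):
    """Unweighted 3x3 gyration tensor of a list of (x, y, z) points."""
    n = len(coords)
    centre = [sum(c[k] for c in coords) / n for k in range(3)]
    tensor = [[0.0] * 3 for _ in range(3)]
    for c in coords:
        d = [c[k] - centre[k] for k in range(3)]
        for a in range(3):
            for b in range(3):
                tensor[a][b] += d[a] * d[b] / n
    return tensor


def asphericity(coords):
    """Equation (S24), assuming the denominator is (tr T)^2."""
    t = gyration_tensor(coords)
    trace = t[0][0] + t[1][1] + t[2][2]
    # tr(T^2) equals the sum of squared eigenvalues of the symmetric T.
    trace_sq = sum(t[a][b] * t[b][a] for a in range(3) for b in range(3))
    return 1.5 * (trace_sq - trace ** 2 / 3.0) / trace ** 2
```

A perfectly linear arrangement of points gives Δ = 1 and a symmetric (octahedral) arrangement gives Δ = 0.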

For the intrachain contact analysis, we defined contacts as Cα–Cα distances of less than 10 Å. The contact features were qualitatively unchanged when using lower cut-off values or when calculating contacts between all heavy atoms. Secondary structure populations were calculated using DSSP¹³¹ implemented in MDTraj. The SASA was calculated using GROMACS⁸⁴. Clustering was also performed in GROMACS using the GROMOS algorithm¹³² and Cα RMSD cut-offs in the range of 1.2–1.8 nm.

Error analysis from ensembles

Errors from the molecular dynamics ensembles were estimated using a block analysis of the full concatenated ensembles (composed of multiple statistically independent simulations). We performed block analysis for the concatenated ensembles to verify that the estimate of the standard error of the mean (s.e.m.) plateaus/fluctuates at block sizes larger than blocks corresponding to the individual trajectories. The final block size was chosen either in the plateau region of block analysis plots or corresponding to the blocks of the statistically independent simulation (10 independent simulations were run for the isolated and RNC systems with the C36m+W force field, and thus 10 blocks were chosen for block analysis and error estimation). The error after reweighting with PRE-NMR data was calculated the same way using a weighted standard error, where blocks are weighted according to the weights obtained from reweighting with PRE-NMR data. Exemplar block analysis plots are shown in Supplementary Fig. 8.
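The block estimate of the s.e.m. described above can be sketched as follows for an unweighted observable; the weighted variant used after reweighting is not shown. The function name is hypothetical and trailing frames that do not fill a complete block are ignored.

```python
import math


def block_sem(values, n_blocks):
    """Standard error of the mean from n_blocks contiguous blocks."""
    size = len(values) // n_blocks
    means = [sum(values[b * size:(b + 1) * size]) / size
             for b in range(n_blocks)]
    grand_mean = sum(means) / n_blocks
    # Sample variance of the block means (n_blocks - 1 in the denominator).
    variance = sum((m - grand_mean) ** 2 for m in means) / (n_blocks - 1)
    return math.sqrt(variance / n_blocks)
```

Repeating the calculation for a range of `n_blocks` and plotting the result against block size gives the kind of plateau plot referred to in the text.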

Energetic analyses from structural ensembles

The conformational entropy was calculated as defined by Baxa et al.¹³³. Proline, glycine and alanine entropies were calculated from the backbone probability distribution P_i(Φ,Ψ). For residues with a maximum of two sidechain torsion angles, X_n, the entropy was calculated from the probability distribution P_i(Φ,Ψ,X₁,X₂), while for residues with more sidechain torsion angles it was calculated from the sum of entropies obtained using P_i(Φ,Ψ,X₁) and P_i(X_n), after subtraction of the entropy obtained from P_i(X₁). Entropies were calculated from probability distributions using $S=-{k}_{{\rm{B}}}{\sum }_{i=1}^{n}{P}_{i}{\rm{ln}}({P}_{i})$. We used a block analysis of the pooled ensembles (that is, all individual trajectories concatenated together) to check that the entropy difference between on and off the ribosome is robust with respect to sampling by calculating entropy changes with increasing amounts of total sampling (from the 15 and 20 μs of concatenated sampling for the RNC and isolated protein, respectively). The errors were then also estimated from the same sampling/block sizes up to 7.5 μs of molecular dynamics sampling. This is because the estimated entropy difference continues to increase up to total sampling times of 7.5 μs (Extended Data Fig. 6g).
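The entropy expression S = −k_B Σ P_i ln(P_i) can be illustrated with a simple histogram estimate over torsion angles. The bin width of 30° is an arbitrary choice for this sketch and is not taken from the text; the function name and the use of molar units (kcal mol^−1 K^−1) are likewise assumptions.

```python
import math
from collections import Counter

K_B = 1.987204e-3  # gas constant in kcal mol^-1 K^-1 (molar units)


def binned_entropy(angle_tuples, bin_width=30.0):
    """Entropy of a joint torsion-angle distribution from a histogram.

    angle_tuples : one tuple of angles (degrees) per frame,
                   e.g. (phi, psi) or (phi, psi, chi1)
    bin_width    : histogram bin width in degrees (hypothetical value)
    """
    counts = Counter(
        tuple(int((angle % 360.0) // bin_width) for angle in angles)
        for angles in angle_tuples
    )
    n_frames = sum(counts.values())
    return -K_B * sum((c / n_frames) * math.log(c / n_frames)
                      for c in counts.values())
```

A distribution confined to one bin has zero entropy, and two equally populated bins give k_B ln 2.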

The energetic contributions due to changes in solvation were estimated based on empirical relationships between changes in the polar and apolar accessible surface area^75,134 (ΔASA_polar and ΔASA_apolar). The apolar and polar surface area of the protein were defined based on the atomic partial charges in the C36m force field⁸⁵. Atoms with an absolute charge of less than or equal to 0.3 were defined as apolar. The change in heat capacity of hydration is related to these quantities by

$$\Delta C=\Delta {C}_{{\rm{apolar}}}+\Delta {C}_{{\rm{polar}}}=\alpha \times \Delta {{\rm{ASA}}}_{{\rm{apolar}}}+\beta \times \Delta {{\rm{ASA}}}_{{\rm{polar}}}$$

(S25)

where α and β are 0.34 ± 0.11 and −0.12 ± 0.12 cal mol^−1 K^−1 Å^−2, respectively. We obtained these values as an average and standard deviation of parameters previously reported in the literature as summarized in ref. ¹³⁵ to account for the uncertainty of the parameters in addition to the uncertainty coming from conformational sampling in our simulations. The enthalpy change due to solvation is then obtained from⁷⁵

$$\Delta {H}_{{\rm{solv}}}\left(333\,{\rm{K}}\right)=\gamma \times \Delta {{\rm{ASA}}}_{{\rm{apolar}}}+\delta \times \Delta {{\rm{ASA}}}_{{\rm{polar}}}$$

(S26)

$$\Delta {H}_{{\rm{solv}}}\left(T\right)=\Delta {H}_{{\rm{solv}}}\left(333\,{\rm{K}}\right)+\Delta C(T-333\,{\rm{K}})$$

(S27)

T is the temperature and γ and δ are constants taking values of −8.44 and 31.4 cal mol^−1 Å^−2, respectively. While we are not aware of alternative parameter sets for the solvation enthalpy (equation (S26)) in the literature, we treated these parameters with a relative uncertainty of 50% to show that even with such high levels of uncertainty our conclusions are not affected. Finally, the solvation entropy and change in free energy are calculated using

$$\Delta {S}_{333{\rm{K}},{\rm{solv}}}=\Delta {C}_{{\rm{apolar}}}\,{\rm{ln}}\left(\frac{T}{{T}_{{\rm{apolar}}}}\right)-\Delta {C}_{{\rm{polar}}}\,{\rm{ln}}\left(\frac{T}{{T}_{{\rm{polar}}}}\right)$$

(S28)

$$\Delta {G}_{{\rm{solv}}}=\Delta {H}_{{\rm{solv}}}-T\Delta {S}_{{\rm{solv}}}$$

(S29)

where T_apolar and T_polar are the temperatures at which ΔS_solv,apolar and ΔS_solv,polar are 0 (385 K and 335 K, respectively). Because our previous work indicated that changes in ribosome solvation during coTF are not a major factor in coTF thermodynamics (see Supplementary Note 9), we estimated the above quantities using surface areas calculated excluding the ribosome. We regard these absolute quantities as an estimated upper bound for ΔG_solv because it is likely that folding intermediates and the native state also interact with the ribosome⁶, thus effectively cancelling out any reduction in SASA of the unfolded state due to ribosome interactions. However, in the following section we describe an alternative, more direct approach for the solvation entropy that does not rely on this assumption.
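Equations (S25)–(S29) can be chained as in the sketch below, using the central parameter values quoted in the text and ignoring their uncertainties. The function name is hypothetical, surface areas are assumed to be in Å² and energies in cal mol^−1, and the entropy expression is transcribed with the signs exactly as printed in equation (S28).

```python
import math


def solvation_terms(d_asa_apolar, d_asa_polar, temp,
                    alpha=0.34, beta=-0.12, gamma=-8.44, delta=31.4,
                    t_apolar=385.0, t_polar=335.0, t_ref=333.0):
    """Return (dH_solv, dS_solv, dG_solv) at temperature temp (K)."""
    # Equation (S25): heat capacity change split into apolar/polar parts.
    dc_apolar = alpha * d_asa_apolar
    dc_polar = beta * d_asa_polar
    # Equations (S26) and (S27): enthalpy at 333 K shifted to temp.
    dh_ref = gamma * d_asa_apolar + delta * d_asa_polar
    dh = dh_ref + (dc_apolar + dc_polar) * (temp - t_ref)
    # Equation (S28), signs as printed in the text.
    ds = (dc_apolar * math.log(temp / t_apolar)
          - dc_polar * math.log(temp / t_polar))
    # Equation (S29).
    dg = dh - temp * ds
    return dh, ds, dg
```

The ΔASA inputs would be ensemble-averaged differences between the two states being compared, for example the RNC and the isolated protein.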

Calculation of solvation entropy changes using the 2PT method

The water and solvation entropy changes were also assessed more directly from molecular dynamics simulations using the two-phase thermodynamic (2PT) method¹³⁶ implemented in the DoSPT code (https://dospt.org/index.php/DoSPT)¹³⁷. For these calculations, we chose five snapshots from our isolated FLN5 A₃A₃ V747 simulations detailed above (that is, with different initial protein conformations and solvent configurations) and used these to initiate short molecular dynamics simulations for entropy calculations. We first re-equilibrated the boxes for 10 ns at the target temperature in the NPT ensemble at 1 bar using the velocity rescaling algorithm⁸⁹ and the Parrinello–Rahman algorithm⁹¹ as detailed above and the velocity Verlet integration algorithm (md-vv in GROMACS⁸⁴). Production simulations were then run in the NVT ensemble at 283 K and 298 K (to assess the effect of temperature on the water entropy calculations) for 20 ps using the md-vv integrator and saving coordinates and velocities for analysis every 4 fs. Control simulations of pure TIP3P (CHARMM TIP3P) water were run in a cubic box with a box vector length of 5 nm, resulting in 4,055 water molecules. Five independent simulations were performed by first energy minimizing the system using the steepest-descent algorithm. Then, using a 2 fs timestep, the same thermostat/barostat settings as for the protein and the md-vv integrator, we equilibrated the water box first in the NVT ensemble for 1 ns, followed by 1 ns in the NPT ensemble using the Berendsen barostat⁹⁰. The water box was then further equilibrated in the NVT ensemble for 1 ns prior to the production simulation in the NVT ensemble for 20 ps, saving coordinates and velocities every 4 fs. These production simulations were also performed at 283 K and 298 K and then used to calculate the molar entropies of pure water at these temperatures with DoSPT.

For water entropy calculations in the protein system, we first analysed the radial distribution function of water surrounding the protein molecule using our 15 μs and 20 μs molecular dynamics ensembles of the RNC and isolated protein, respectively, and the GROMACS rdf functionality⁸⁴ to identify the region of the first two hydration shells that show significantly reduced water dynamics. Using this analysis, we chose a distance cut-off of 3.5 Å between the protein and water centre of mass to define the hydration layer around the protein. With this criterion we then calculated the probability distribution and average number of water molecules in the hydration layer to assess the difference in solvation on and off the ribosome. Water molecules that remain within a defined distance range from the protein during the entire 20 ps production simulation were then selected to calculate the average molar entropy per molecule of water in different environments with DoSPT. The accessible volume for this subsystem was estimated by using the average volume occupied per water molecule in a pure water box under identical conditions multiplied by the number of molecules. To obtain the change in solvation entropy (difference between the RNC and isolated system, ΔS_solv,RNC-iso), we used

$$\Delta {S}_{{\rm{solv,RNC}}-{\rm{iso}}}={N}_{{\rm{diff}}}\Delta {S}_{{\rm{solv,water}}}$$

(S30)

where N_diff is the average difference in the number of water molecules in the hydration layer (RNC-iso) and ΔS_solv,water is the entropy difference between water molecules in the hydration layer (0–3.5 Å from the protein) and water molecules in bulk solution (defined here as 36–46 Å from the protein).

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
