Thursday, December 11, 2025

Gut micro-organisms associated with health, nutrition and dietary interventions

ZOE PREDICT cohorts definition

The ZOE PREDICT programme comprises several distinct studies that together constitute one of the largest multi-omic health initiatives, linking diet, person-specific metabolic responses to foods, and the gut microbiome. In this work, we considered and harmonized five ZOE PREDICT cohorts: PREDICT 1, PREDICT 2, PREDICT 3 US21, PREDICT 3 US22A, and PREDICT 3 UK22A. The PREDICT 1 cohort (NCT03479866) was described previously9,51. In brief, PREDICT 1 enrolled 1,098 participants (n = 1,001 from the UK and n = 97 from the USA) who underwent a clinical visit to collect anthropometric information and blood samples, followed by an at-home phase during which postprandial responses to both standardized tests and ad libitum meals were recorded. Stool samples were collected at home before the in-person clinical visit. The PREDICT 2 study (NCT03983733) had a similar collection protocol to PREDICT 1 but was conducted entirely remotely and included data from 975 people from 48 US states (including the federal District of Columbia and without participants from North Dakota and Hawaii). The PREDICT 3 cohorts (US21, US22A and UK22A) are research cohorts (NCT04735835) embedded within the ZOE commercial product. Participants provided informed written consent for their data to be used for scientific research purposes. In total, 32,621 samples (n = 11,798 for US21, n = 8,470 for US22A and n = 12,353 for UK22A) were collected and retrieved. The studies were fully remote: participants completed health and food questionnaires at baseline, and self-collected and shipped stool samples. Cardiometabolic markers were collected as described below. Furthermore, we considered and analysed two registered clinical nutritional intervention studies, namely METHOD36 (NCT05273268) and BIOME37 (NCT06231706), focusing on the microbiome changes and their links with the two derived SGB-level rankings (ZOE MB health-ranks and diet-ranks).
All study protocols are registered and available on clinicaltrials.gov through the clinical trials number and link affiliated with each trial.

Sample collection, DNA extraction and sequencing

For the PREDICT 1 cohort, sample collection, DNA extraction and sequencing were described previously9. The PREDICT 2 samples were collected in Zymo buffer, DNA extraction was performed at QIAGEN Genomic Services using DNeasy 96 PowerSoil Pro, and sequencing was performed on the Illumina NovaSeq 6000 platform using the S4 flow cell and targeting 7.5 Gb per sample. The PREDICT 3 samples were self-collected into tubes containing the DNA-Shield Zymo buffer. Sample processing was performed by Zymo and Prebiomics. In brief, DNA extraction by Zymo used the ZymoBIOMICS-96 MagBead DNA kit, whereas Prebiomics used the DNeasy 96 PowerSoil Pro kits. Sequencing libraries were prepared using the Illumina DNA Prep Tagmentation kit, following the manufacturer’s guidelines. Whole-genome shotgun metagenomic sequencing was performed on the Illumina NovaSeq 6000 platform using the S4 flow cell and targeting 3.75 Gb per sample.

All raw sequencing data were quality controlled using the preprocessing pipeline available at https://github.com/SegataLab/preprocessing, which comprises three steps: (1) removal of reads that are of low quality (Q < 20), too short (length under 75 nt) or contain more than two ambiguous bases; (2) removal of contaminant DNA (Illumina’s spike-in phiX 174 and the human genome, hg19); and (3) synchronization of paired-end and unpaired reads.
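
The three criteria of step (1) can be illustrated with a minimal Python sketch. It is not the code of the SegataLab preprocessing pipeline: the function name passes_read_filter is hypothetical, and applying the Q < 20 threshold to the mean Phred score of the read is an assumption (the pipeline may instead trim or filter per base).

```python
def passes_read_filter(sequence, phred_scores,
                       min_quality=20, min_length=75, max_ambiguous=2):
    """Return True if a read survives the three step-1 criteria.

    Assumption: the quality threshold is applied to the mean Phred score.
    """
    # Too short: length under 75 nt (this also rejects empty reads).
    if len(sequence) < min_length:
        return False
    # More than two ambiguous bases (counted here as 'N').
    if sequence.upper().count("N") > max_ambiguous:
        return False
    # Low quality: mean Phred score below 20.
    mean_quality = sum(phred_scores) / len(phred_scores)
    return mean_quality >= min_quality


# Example: an 80-nt read with two ambiguous bases and uniform Q30 is kept.
read = "A" * 78 + "NN"
print(passes_read_filter(read, [30] * len(read)))
```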

Dietary data processing

In the PREDICT cohorts, we assessed long-term food intakes using FFQs, which were largely consistent across cohorts. Specifically, for PREDICT 1 participants (UK), we used a modified 131-item European Prospective Investigation into Cancer and Nutrition (EPIC) FFQ52. Participants in PREDICT 2 (USA) were surveyed using a similarly validated Diet History Questionnaire-III FFQ, including 135 items about food and beverages, as well as 26 questions about dietary supplements53. In PREDICT 3 UK22A and US22A, we developed and used a 264-item FFQ adapted from the EPIC-Norfolk Study FFQ and the Diet History Questionnaire-III. Consequently, there is a large overlap between the food items collected across the FFQs; for example, 90% of questions in the EPIC FFQ are included in the PREDICT 3 FFQ. This FFQ also includes additional food items to accurately capture modern eating habits—a limitation of older FFQ versions54. In the PREDICT 3 US21 cohort, FFQs were not collected, and only short-term logged dietary data collected using the ZOE mobile phone app were used instead.

Starting from both long- and short-term dietary data, we computed three versions of the PDI55, namely, the overall PDI, the healthful PDI (measuring the adherence to a healthier plant-based foods diet) and the unhealthy PDI (measuring the intake of unhealthful plant-based foods), as well as the healthy eating index23 (measuring how consumed foods align with dietary guidelines), the alternative Mediterranean diet score (measuring the adherence to a Mediterranean diet)56 and the Healthy Food Diversity (HFD) index (measuring the number, distribution and health value of consumed foods)57. Specifically, to calculate PDIs and the healthy eating index, food items were first assembled into food groups by mapping them onto a ‘food tree’ consisting of a database of nutrient information arranged according to a hierarchical tree structure: level 1 (9 food groups), level 2 (52 food groups) and level 3 (195 food groups). UK foods were mapped onto the Composition of Foods Integrated Dataset (CoFID)58 using food categories or sub-group codes, whereas US foods were similarly mapped onto the US Department of Agriculture Food and Nutrient Database for Dietary Studies database. Level 3 foods were aggregated and harmonized by nutrition scientists to allow for comparisons across cohorts. The Mediterranean diet and HFD scores were calculated as described previously9.

Host health and anthropometric marker collection

In PREDICT 1, sex and age were self-reported, whereas height, weight and blood pressure were measured at a clinic visit (day 0). At the clinic visit, participants were also fitted with wearable continuous glucose monitor (CGM) devices (Abbott Freestyle Libre Pro (FSL)), visceral fat mass was measured using dual-energy X-ray absorptiometry scans following standard manufacturer’s recommendations (DXA; Hologic QDR 4500 plus) and fasting GlycA was measured using a high-throughput NMR metabolomics (Nightingale Health) 2016 panel. Fasting and postprandial venous blood samples were also collected at the clinic; plasma glucose and serum total cholesterol, HDL-C and triglycerides were measured using Affinity 1.0, and whole blood HbA1c% was measured using Viapath. The ten-year ASCVD risk was calculated as per the 2019 American College of Cardiology (ACC) and American Heart Association (AHA) clinical guidelines59. Additional data were collected over the subsequent 13-day period at home; postprandial responses to eight standardized meals (seven in duplicate) of differing macronutrient (fat, carbohydrate, protein and fibre) content were measured using CGMs and dried-blood-spot analysis as described previously13. T2D and hyperlipidemia were self-reported via health questionnaires. The PREDICT 2 and PREDICT 3 studies were fully remote. Sex, age, height, weight and blood pressure were self-reported, and fasting and postprandial responses for total cholesterol, HDL-C, triglycerides and HbA1c were assessed using whole blood finger-prick samples collected at home using dried-blood-spot analysis by commercial laboratories (CRL, Eurofins Biomnis). CGMs were fitted at home by participants. A selection of standardized meals smaller than in PREDICT 1 was tested in PREDICT 2 and PREDICT 3 (a metabolic challenge meal, and medium-fat and carbohydrate breakfast and lunch meals).
Some of the considered markers represent the same metabolic function over time and showed positive correlations between their fasting and postprandial measurements, whereas others represent opposite types of the same biomolecular pathway and showed negative correlations among them (Supplementary Table 4).

Public human microbiome datasets

We leveraged 27,011 public metagenomic samples from 107 cohorts available through the curated MetagenomicData 3 (cMD3) resource60,61 to define the cohorts used for the meta-analyses on BMI and healthy–diseased comparison (‘Statistical and meta-analyses’). For the meta-analysis on BMI, we selected cohorts with stool microbiome samples from healthy participants (self-assessed, not reporting a diagnosis), aged at least 16 years, with BMI ≥ 18.5 and sex information available. Cohorts with fewer than 30 people were excluded. Furthermore, the ThomasAM_2018_c and LeChatelier_2013 cohorts were excluded because their samples are duplicated in the YachidaS_2019 and NielsenHB_2014 cohorts, respectively. Overall, 6,182 samples from 34 different cohorts and 20 countries were retrieved. Participants were classified into three categories: healthy weight (BMI ≥ 18.5 and <25), overweight (BMI ≥ 25 and <30) and obese (BMI ≥ 30). Then, each combination of country, dataset and two BMI categories was tested if at least 15 samples were retained. This led to analysing a total of 5,348 samples from 27 cohorts (2,837 healthy weight, 1,562 overweight and 949 obese participants; Supplementary Table 9). For the health–diseased meta-analyses, we selected from cMD3 participants aged at least 16 years, with BMI ≥ 18.5 and sex information available, who were part of a case–control study of one of the following diseases: CRC, IBD (including ulcerative colitis and Crohn’s disease), T2D, IGT and ASCVD. Studies with fewer than 30 people were excluded. In total, we considered ten datasets of CRC (650 cases and 645 controls), two datasets of IGT (273 cases and 492 controls), five datasets of T2D (775 cases and 900 controls), three datasets of IBD (103 controls, 59 of which were used in two different comparisons, 60 individuals with Crohn’s disease and 68 individuals with ulcerative colitis) and three datasets of CVD (283 cases and 508 controls).
Notably, German and French participants of the MetaCardis cohort were separated, and this led to a set of 449 controls used in both the T2D and the IGT analyses, whereas only the 176 controls from France were used in the CVD analysis. Overall, the total number of samples analysed was N = 4,816 (2,707 controls and 2,109 cases) from 25 cohorts and 10 countries (Supplementary Table 20). The cohort selection for the two analyses used the script https://github.com/waldronlab/curatedMetagenomicDataAnalyses/blob/main/python_tools/meta_analysis_data.py available in cMD3.
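
As an illustration of the participant-level criteria and BMI cut-offs stated above, the following Python sketch encodes them directly. The function names are hypothetical and this is not the meta_analysis_data.py script; cohort-level rules (minimum cohort size, duplicate cohorts, the 15-sample threshold) are not covered.

```python
def is_eligible(age, bmi, sex):
    """Participant-level criteria: aged at least 16 years, BMI >= 18.5, sex available."""
    return age >= 16 and bmi >= 18.5 and sex is not None


def bmi_category(bmi):
    """Map a BMI value onto the three categories used in the BMI meta-analysis.

    Returns None for BMI < 18.5, which falls outside the inclusion criteria.
    """
    if bmi < 18.5:
        return None
    if bmi < 25:
        return "healthy weight"
    if bmi < 30:
        return "overweight"
    return "obese"


# Example with made-up participants (age, BMI, sex).
participants = [(34, 22.1, "female"), (51, 27.8, "male"), (15, 21.0, "male")]
kept = [p for p in participants if is_eligible(*p)]
print([bmi_category(bmi) for _, bmi, _ in kept])
```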

Microbiome taxonomic profiling

All microbiome samples from the PREDICT cohorts were profiled using MetaPhlAn 4 (v.4.beta.2, database vJan21_CHOCOPhlAnSGB_202103), without performing read subsampling, as the benefit of occasionally detecting a few additional low-abundance species in samples with a higher number of reads outweighs the potential noise from uneven sequencing depth. Samples retrieved from cMD3 (described in ‘Public human microbiome datasets’) were profiled with MetaPhlAn 4 (v.4.beta.1, database vJan21_CHOCOPhlAnSGB_202103). Default parameters were used in both cases; among these, stat_q, which defines the quantiles for the robust average coverage calculation, is set to 0.2. Because the default parameters are tailored for the taxonomic profiling of human microbiome samples19, no additional prevalence filters were necessary. MetaPhlAn 4 is a publicly available taxonomic profiler for metagenomic samples (GitHub repository: https://github.com/biobakery/MetaPhlAn) that leverages medium- and high-quality genomes from isolates and metagenome-assembled genomes (MAGs). Isolate genomes and MAGs are clustered at 95% average nucleotide identity to define SGBs, as described previously20. If an SGB cluster contains an isolate genome, then it is referred to by that isolate’s taxonomic label. If an SGB contains only MAGs, then it represents an unknown species cluster and is assigned the taxonomic label of a genus, family or phylum, according to which is the genomically closest to a taxonomic label from isolate genomes. As the taxonomic classification of MetaPhlAn depends on species-specific marker genes, sometimes there are several SGBs of very closely related genomes for which the identification of SGB-specific markers is not feasible. In this case, more than one SGB can be considered together, and the label ‘_group’ is appended to the representative SGB ID. In this way, MetaPhlAn 4 improves the resolution of the taxonomic profiling task62.

Rankings definition

We first identified a subset of prevalent SGBs to ensure a minimum number of non-zero relative abundance values. In each PREDICT cohort, we selected markers that are intermediary measures of host health or diet health, and they were organized into four categories: personal, dietary, fasting and postprandial (Supplementary Table 2). Second, we calculated the partial Spearman’s correlation between each SGB and health markers, adjusting for sex, age and BMI, using the ‘pingouin’ Python package (v.0.5.4, https://github.com/raphaelvallat/pingouin) (Extended Data Figs. 3 and 5). The relative abundance values of SGBs (including zeros) were used as input for the correlations. Third, the SGB-marker partial correlations were sorted ascending if the marker was considered as positive with respect to health, or descending if the marker was considered as negative. These sorted partial correlations were ranked and normalized according to cohort sample sizes into percentiles ranging from 0 to 1 (function pandas.DataFrame.rank with param pct=True from pandas v.2.1.3) (Fig. 2b,c and Extended Data Figs. 2 and 4). Fourth, for each category of markers, we computed the average percentiles across markers (Fig. 2a and Extended Data Fig. 4). SGBs were retained in the overall rankings if they were ranked in at least two different cohorts, leading to a final ranking of 661 SGBs. Finally, the ZOE MB health-rank 2025 was defined by first averaging the personal, fasting and postprandial category percentiles within each cohort, and then averaging these cohort-specific averages. The ZOE MB diet-rank 2025 instead was defined by averaging the dietary percentiles across all cohorts (Fig. 2a, Extended Data Fig. 4 and Supplementary Table 5). The ZOE Microbiome Rankings are also available at https://zoe.com/our-science/microbiome-ranking.
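
The percentile and averaging steps can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' code: the function names and the nested-dictionary layout are hypothetical, percentile_ranks mimics pandas.DataFrame.rank(pct=True) with averaged ties, and the mapping of sort direction to marker type follows the wording above literally. The category percentiles passed to combine_cohorts are assumed to be already averaged across the markers of each category.

```python
from statistics import mean


def percentile_ranks(correlations, ascending=True):
    """Rank {sgb: partial correlation} into percentiles (rank / n, ties averaged)."""
    ordered = sorted(correlations, key=correlations.get, reverse=not ascending)
    n = len(ordered)
    percentiles = {}
    i = 0
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j + 1 < n and correlations[ordered[j + 1]] == correlations[ordered[i]]:
            j += 1
        average_rank = ((i + 1) + (j + 1)) / 2  # 1-based ranks, averaged over ties
        for k in range(i, j + 1):
            percentiles[ordered[k]] = average_rank / n
        i = j + 1
    return percentiles


def combine_cohorts(per_cohort, min_cohorts=2):
    """Average category percentiles within each cohort, then across cohorts.

    per_cohort: {cohort: {sgb: {category: percentile}}} (hypothetical layout).
    SGBs ranked in fewer than `min_cohorts` cohorts are dropped.
    """
    cohort_means = {}
    for sgbs in per_cohort.values():
        for sgb, categories in sgbs.items():
            cohort_means.setdefault(sgb, []).append(mean(categories.values()))
    return {sgb: mean(values) for sgb, values in cohort_means.items()
            if len(values) >= min_cohorts}


# Toy example with made-up correlations and percentiles.
print(percentile_ranks({"SGB_a": 0.30, "SGB_b": -0.10, "SGB_c": 0.10}))
toy = {
    "cohort_1": {"SGB_a": {"personal": 0.2, "fasting": 0.4, "postprandial": 0.6},
                 "SGB_b": {"personal": 0.9, "fasting": 0.8, "postprandial": 0.7}},
    "cohort_2": {"SGB_a": {"personal": 0.4, "fasting": 0.4, "postprandial": 0.4}},
}
print(combine_cohorts(toy))  # SGB_b is dropped: ranked in one cohort only
```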

Machine learning

To assess the link to the human gut microbiome composition, we developed and used a machine learning framework based on random forest classification and regression algorithms from the scikit-learn (v.1.3.2) Python package (as implemented in the RandomForestClassifier and RandomForestRegressor functions, respectively), both with ‘n_estimators=1000’ and ‘max_features=sqrt’ parameters63. We trained random forest classifiers and regressors on MetaPhlAn 4-estimated SGB-level relative abundances (arcsine square-root transformed) to assess the extent to which the outcome variable was predictable from the microbiome as a proxy of the strength of the microbiome–variable association. This framework was used and described originally in ref. 9 and accounts for the presence of twin pairs in the data, which avoids biases due to identical values in twins. In brief, the framework uses a cross-validation approach, splitting the dataset randomly into training and testing folds with an 80:20 ratio, respectively, and repeated 100 times (as implemented in the StratifiedShuffleSplit function). Folds are also constructed to maintain a similar ratio of the two classes to predict as they appear in the full data. For target variables with continuous values, classification was performed by contrasting the first against the fourth quartile, the first three against the fourth quartile and the first against the last three quartiles. Performances were evaluated using the AUC for the classification task, whereas Spearman’s correlation between the real and predicted values was used for the regression task22.
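
Two preprocessing steps mentioned above, the arcsine square-root transform and the quartile-based contrasts for continuous targets, can be sketched with the Python standard library. This is an illustration rather than the framework from ref. 9: the function names are hypothetical, abundances are assumed to be proportions in [0, 1] (MetaPhlAn percentages would first be divided by 100), and the handling of values lying exactly on a quartile boundary is an assumption.

```python
import math
from statistics import quantiles


def arcsine_sqrt(proportion):
    """Arcsine square-root transform of a relative abundance given in [0, 1]."""
    return math.asin(math.sqrt(proportion))


def quartile_contrasts(values):
    """Build binary labels for the three quartile contrasts of a continuous target.

    Label 0 / 1 marks the two classes; None marks samples left out of a contrast.
    """
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")

    def quartile(v):
        # Boundary values are assigned to the lower quartile (assumption).
        if v <= q1:
            return 1
        if v <= q2:
            return 2
        if v <= q3:
            return 3
        return 4

    qs = [quartile(v) for v in values]
    return {
        "Q1_vs_Q4": [0 if q == 1 else 1 if q == 4 else None for q in qs],
        "Q1-3_vs_Q4": [1 if q == 4 else 0 for q in qs],
        "Q1_vs_Q2-4": [0 if q == 1 else 1 for q in qs],
    }


# Toy example with eight made-up target values.
contrasts = quartile_contrasts([1, 2, 3, 4, 5, 6, 7, 8])
print(contrasts["Q1_vs_Q4"])
```

The resulting labels (with None entries removed) would then be the classification targets passed to the random forest classifier.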

Statistical and meta-analyses

We performed a meta-analysis to determine the possible links between BMI (categorized into ‘healthy weight’, ‘overweight’ and ‘obese’) and our ranked SGBs across various publicly available studies comprising a total of 5,348 people who were not diagnosed with any specific disease. We first evaluated the ZOE MB health- and diet-ranks by assessing the cumulative relative abundance and richness of the 50 most favourable and the 50 least favourable SGBs in each dataset in each BMI category: healthy weight, overweight and obese (see ‘Public human microbiome datasets’ for the specific cut-offs). Specifically, we assessed the number of intra-dataset, between-BMI groups pairwise comparisons in which the group median abundance or the group median count was higher in the lower BMI group (when considering most favourable SGBs from both ranks) or higher in the higher BMI group (when looking at least favourable SGBs). Next, we fit linear models for each dataset and pair of BMI categories: healthy weight versus overweight, healthy weight versus obese, and overweight versus obese. In the first model, we looked at the count of the 50 most favourable and unfavourable ZOE MB health- and diet-ranked SGBs. A second model was fitted on the cumulative relative abundance (arcsine square-root transformed) of the 50 most favourable and unfavourable SGBs in the two rankings. All models were adjusted by sex and age. Cohen’s d was used to estimate the effect size of the normalized difference between unfavourable and favourable ranked SGBs when considering cumulative abundances. This quantifies the difference between the means of two groups in terms of standard deviations. Specifically, as originally defined, a ‘small’ effect size corresponds to d = 0.2, a ‘medium’ effect size to d = 0.5 and a ‘large’ effect size to d = 0.8 (ref. 64). 
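
For reference, the classical two-group form of Cohen’s d (difference in means divided by the pooled standard deviation) can be written as below. This is a sketch of the textbook definition only; in the analysis described here the effect sizes are derived from sex- and age-adjusted linear models, which this unadjusted version does not reproduce. The function name is hypothetical.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d of group_b relative to group_a, using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_b) - mean(group_a)) / math.sqrt(pooled_variance)


# Toy example: the two groups differ by exactly one pooled standard deviation.
print(cohens_d([1, 2, 3], [2, 3, 4]))
```
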
In these models, the lower BMI category of each comparison was used as the negative control, so negative coefficients reflect a higher count of SGBs in the lower BMI category, whereas positive coefficients reflect a higher count of SGBs in the higher BMI category. Effect sizes were summarized through meta-analysis, computed as a random-effect model using the Paule–Mandel heterogeneity on adjusted mean differences from the linear regression models (standardized for cumulative abundances). We assessed the presence of the 50 most favourable and most unfavourable SGBs from both the ZOE MB health- and diet-ranks among the countries considered in these analyses (18 in total) and when considering only people of healthy weight (n = 2,837). To link the ranked SGBs with the country, we fit a linear model on the count and cumulative relative abundance of the SGBs, and the models were adjusted by the sequencing depth of the study. We used ordinary least squares adjusted by sequencing depth when comparing two datasets from different countries, and linear mixed model blocked by dataset ID and adjusted by sequencing depth when comparing pairs of countries in which at least one country was represented by more than one dataset (country- and sequencing depth-adjusted P values are presented in Supplementary Table 11).
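
A random-effects summary with the Paule–Mandel heterogeneity estimate can be sketched as follows. The estimator chooses the between-study variance tau² so that the generalized Q statistic equals its expected value k − 1 (k being the number of studies), truncated at zero. This standard-library version, solved by bisection, is an illustration only; the function names are hypothetical, within-study variances must be positive, and the original analysis may have used a different implementation (statsmodels is listed among the libraries used).

```python
def paule_mandel_tau2(effects, variances, tol=1e-10, max_iter=200):
    """Paule-Mandel estimate of the between-study variance tau^2."""
    k = len(effects)

    def q_stat(tau2):
        weights = [1.0 / (v + tau2) for v in variances]
        pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
        return sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))

    target = k - 1
    if q_stat(0.0) <= target:
        return 0.0  # no excess heterogeneity: truncate at zero
    low, high = 0.0, 1.0
    while q_stat(high) > target:  # Q decreases as tau^2 grows
        high *= 2.0
    for _ in range(max_iter):
        mid = (low + high) / 2.0
        if q_stat(mid) > target:
            low = mid
        else:
            high = mid
        if high - low < tol:
            break
    return (low + high) / 2.0


def random_effects_summary(effects, variances):
    """Return (pooled effect, standard error, tau^2) under a random-effects model."""
    tau2 = paule_mandel_tau2(effects, variances)
    weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    standard_error = (1.0 / sum(weights)) ** 0.5
    return pooled, standard_error, tau2


# Toy example with two made-up study effects and their variances.
print(random_effects_summary([0.0, 1.0], [0.01, 0.01]))
```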

A second meta-analysis tested the associations between our ZOE MB health- and diet-ranked SGBs and five gut-associated diseases (CVD, T2D, IBD, CRC and IGT) across studies, for a total of 4,816 samples (‘Public human microbiome datasets’). Linear models were used to predict the binary disease outcome (healthy versus diseased) for each disease, using the cumulative abundances (arcsine square-root transformed) of the 50 most favourable or unfavourable SGBs, adjusting by sex, age and BMI. The betas of the linear models were converted into SMDs as described previously65. We also defined models to predict healthy versus diseased using the sum of the SGB ranks normalized between −1 and 1, considering all 661 SGBs for the ZOE MB health- and diet-ranks, once using the direct sum of the SGB ranks and once weighting ranks by the relative abundance of each SGB in each sample (transformed using the arcsine and square-root function to avoid overestimating the ranks of highly abundant species due to compositionality). SMDs were calculated similarly to those in the previous case. In all meta-analytical models, the set of cohorts considered comprised studies encompassing several diseases with a shared control group that we analysed separately. To account for the overlaps in the studies considered, we computed weights based on the inverse effect sizes variance-covariance matrix, as suggested previously66,67. Thus, five meta-analyses were performed, one for each disease: CVD (three datasets), T2D (six datasets), IBD (three datasets), CRC (ten datasets) and IGT (two datasets). Of note, in the comparisons of controls versus T2D, IGT and CVD, the MetaCardis French and German sub-cohorts were considered as different datasets, and their controls were meta-analysed as different cohorts. In particular, only French control samples were used in the CVD analysis, which included only French cases. Finally, meta-analysis summaries were computed using the same technique. 
Analyses were carried out with Python (v.3.12.0), also using the following libraries: numpy (v.1.26.2), scipy (v.1.11.4) and statsmodels (v.0.14.0), with matplotlib (v.3.8.2) and seaborn (v.0.11.2) for visualization.

Ethical compliance

All study protocols are registered on clinicaltrials.gov and procedures are compliant with all relevant ethical regulations. Ethical approval for the PREDICT 1 study was obtained in the UK from the King’s College London Research Ethics Committee (REC) and Integrated Research Application System (IRAS 236407), and in the USA from the institutional review board (Partners Healthcare Institutional Review Board (IRB) 2018P002078). Ethical approval for the PREDICT 2 study (Pro00033432) was obtained from Advarra IRB. Ethical approval for the PREDICT 3 study (Pro00044316, HR/DP-21/22-28300 and HR/DP-24/25-45829) was obtained from Advarra IRB and King’s College London REC. Ethical approval for the METHOD study (Pro00044316; protocol no. 00044316) was obtained from Advarra IRB. Ethical approval for the BIOME study (HR/DP-23/24-39673) was obtained through King’s College London REC. All participants provided written informed consent and all studies were carried out in accordance with the Declaration of Helsinki and Good Clinical Practice.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
