Human post-mortem sample preparation
Anonymized human samples were obtained from The Edinburgh Brain and Tissue Bank, MRC London Brain Bank for Neurodegenerative Diseases, Cambridge Brain Bank, South West Dementia Brain Bank, Parkinson’s UK Brain Bank and University of Leipzig Medical Centre Institute of Anatomy, in line with each bank’s Research Ethics Committee approval. Subjects were approached in life for written consent for brain banking, and all tissue donations were collected and stored following legal and ethical guidelines. Donor details for snRNA-seq, spatial transcriptomics and smFISH are given in Supplementary Table 1.
For snRNA-seq, frozen blocks of post-mortem hypothalamus were sourced from adult donors with BMI ranging from 18 to 28 kg m⁻² and no significant neuropathology. Dissections were performed following delineation of relevant anatomy in cresyl-violet-stained sections from the anterior and posterior surfaces of each sample by a consultant pathologist. Samples from the relevant region were then acquired using a punch biopsy or macrodissected from 100-μm-thick frozen cryostat sections spanning the whole specimen.
For spatial transcriptomics, post-mortem formaldehyde-fixed, paraffin-embedded (FFPE) human brain samples covering the hypothalamus were obtained from the MRC Brain Bank Network. Selection of samples and areas to include in spatial transcriptomics analyses was based on anatomical landmarks using Luxol fast blue/haematoxylin-eosin staining of myelinated fibres and cell bodies; n = 9 samples from n = 7 different donors (2 male, 5 female). BMI ranged from 16 to 41 kg m⁻² at the time of death.
Nucleus dissociation and RNA sequencing
Nuclei were isolated by Dounce homogenization and purified using a protocol modified from ref. 14. Briefly, chopped samples were transferred to a 15-ml Dounce homogenizer with 5 ml homogenization buffer (100 μM dithiothreitol (Sigma–Aldrich), 0.1% Triton X-100 (Sigma–Aldrich), 2× EDTA Protease Inhibitor (Roche), 0.4 U μl⁻¹ RNasin RNase inhibitor (Promega; 10,000 U, 40 U μl⁻¹) and 0.2 U μl⁻¹ Superase.In RNase Inhibitor (Ambion; 10,000 U, 20 U μl⁻¹) in nuclei isolation medium (250 mM sucrose, 25 mM KCl (Ambion), 5 mM MgCl₂ (Ambion) and 10 mM Tris buffer, pH 7.0 (Ambion) in nuclease-free water (Ambion)) with 1 μl ml⁻¹ DRAQ5 (Biostatus)), and dissociated mechanically using 10 strokes with pestle A and 20 strokes with pestle B. Homogenates were filtered through a 100-μm filter and centrifuged at 600g for 5 min in a precooled centrifuge. The supernatant was discarded and the pellet was resuspended in 27% Optiprep solution diluted in homogenization buffer and centrifuged at 13,600g for 20 min at 4 °C. The nuclear pellet was collected, resuspended in wash buffer (1% BSA, 0.4 U μl⁻¹ RNasin and 0.2 U μl⁻¹ Superase.In in PBS (Sigma–Aldrich)) and centrifuged at 700g for 5 min at 4 °C. This wash was repeated twice before the suspension was passed through a 40-μm cell strainer, and this final sample was used to create sequencing libraries. For two donors, single-nucleus suspensions were sorted using fluorescence-activated nuclei sorting (FANS) on a BD FACSMelody instrument. The gating was set according to forward scatter, side scatter and fluorescence at 647/670 nm to detect DRAQ5 nuclear staining, and 567 nm to detect NeuN-PE staining. NeuN+ events were sorted into a collection tube to enrich for neuronal nuclei.
Sequencing libraries were generated using 10x Genomics Chromium Single-Cell 3′ Reagent kits (v.3.1) according to the standardized protocol. cDNA was amplified for 19 cycles. Paired-end sequencing was performed using an Illumina NovaSeq 6000.
Sequence alignment, cell calling and quality control
Raw sequence reads were mapped to the human GRCh38 genome and genes were counted based on the Ensembl 98 gene model, both using 10x Genomics CellRanger v.4-5 (https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/what-is-cell-ranger) with the parameter --include-introns. CellBender v.2.0 (ref. 51) was used to recalibrate unique molecular identifier (UMI) counts and for cell calling.
After removal of flagged nuclei, our snRNA-seq dataset included 571,091 nuclei from 58 samples, which contributed between 748 and 45,771 nuclei each. We used scran’s quickCluster function52 to obtain an initial set of clusters that were used as input cluster assignments to scDblFinder, which was run with multiSampleMode set to ‘split’53. We additionally ran an initial Seurat-based processing of the whole dataset, including detection of highly variable features, scaling of data, principal component analysis and preliminary clustering54. All nuclei that were detected by scDblFinder as doublets, or that were part of Seurat clusters comprising more than 75% doublets, were removed. We further filtered the data using the sample-based thresholds and additionally set global thresholds of a maximum of 10% mitochondrial RNA and a minimum of 800 UMIs per nucleus. After filtering for doublets and low-quality nuclei, the dataset comprised 353,678 nuclei from the 58 samples, which contributed between 609 and 20,424 nuclei each.
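The filtering logic described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function name `filter_nuclei` and the record fields (`is_doublet`, `cluster`, `pct_mt`, `n_umi`) are hypothetical, and the sample-based thresholds are not reproduced.

```python
from collections import defaultdict

def filter_nuclei(nuclei, max_cluster_doublet_frac=0.75, max_pct_mt=10.0, min_umi=800):
    """Return nuclei that pass the doublet and global quality thresholds.

    Each nucleus is a dict with hypothetical keys:
    'is_doublet' (bool), 'cluster' (str), 'pct_mt' (float), 'n_umi' (int).
    """
    # Count doublet calls per preliminary cluster.
    total = defaultdict(int)
    doublets = defaultdict(int)
    for n in nuclei:
        total[n["cluster"]] += 1
        doublets[n["cluster"]] += int(n["is_doublet"])
    # Clusters made up of more than 75% doublets are removed as a whole.
    bad_clusters = {c for c in total if doublets[c] / total[c] > max_cluster_doublet_frac}

    kept = []
    for n in nuclei:
        if n["is_doublet"] or n["cluster"] in bad_clusters:
            continue
        if n["pct_mt"] > max_pct_mt or n["n_umi"] < min_umi:
            continue
        kept.append(n)
    return kept
```

Removing whole doublet-dominated clusters in addition to individual calls is intended to catch doublets that the per-nucleus classifier misses.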
The processed snRNA-seq data of all hypothalamus samples (ROIGroupCoarse = ‘Hypothalamus’) were extracted from the loom file published by Siletti et al.12. This included a total of 134,471 nuclei that we merged with data from our own study.
snRNA-seq integration
Our combined human dataset includes 82 10x samples from 11 different donors and two independent studies with a total of 488,149 cells after merging and initial quality control. To integrate all cells and make the data comparable we used scvi-tools (v.0.19.0)55, which we have shown previously to be a powerful integration tool that preserves cell-type purity while removing batch differences14; scvi always models the library size (nUMI) and we used the sample ID as the covariate (‘batch_key’) to allow future use with scArches. Similar to our previous study we optimized the main hyperparameters of scvi by running a grid search over pre-defined parameter ranges using our published pipeline (https://github.com/lsteuernagel/scIntegration). scIntegration evaluates different scvi model outputs for mixing of samples (using the entropy of the sample distribution in each cell’s nearest neighbours), the purity of cells (cell-type distribution in each cell’s nearest neighbours) and the average silhouette width for cluster separation. We defined a set of ground truth cell types using signatures for mouse glial cell types from our mouse HypoMap14 and additionally added a set of manually curated neuron signatures (Supplementary Table 3). We then visualized the hyperparameters of all runs by the evaluation metrics to choose a final set of optimal parameters. Overall, all models integrated the data well and we mostly found small improvements (Supplementary Table 4). The final scvi model was trained for 100 epochs with a dropout rate of 0.1. The model had two layers and 256 nodes per layer (n hidden) and the latent space had 80 dimensions. All other parameters were set to default.
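As an illustration of the neighbourhood-based evaluation metrics, the following Python sketch computes a mixing entropy and a purity score from precomputed nearest-neighbour lists. It is a simplified stand-in and not the scIntegration implementation: the function names are hypothetical, and purity is assumed here to be the fraction of neighbours sharing a cell's own ground-truth label.

```python
import math
from collections import Counter

def neighbour_entropy(sample_labels, neighbours):
    """Shannon entropy of the sample distribution among each cell's neighbours.

    sample_labels: list of sample IDs, one per cell.
    neighbours: list of neighbour-index lists, one per cell.
    Higher entropy indicates better mixing of samples.
    """
    entropies = []
    for idx_list in neighbours:
        counts = Counter(sample_labels[i] for i in idx_list)
        n = sum(counts.values())
        entropies.append(-sum((c / n) * math.log(c / n) for c in counts.values()))
    return entropies

def neighbour_purity(cell_types, neighbours):
    """Fraction of each cell's neighbours that share its cell-type label (assumed definition)."""
    return [
        sum(cell_types[j] == cell_types[i] for j in nbrs) / len(nbrs)
        for i, nbrs in enumerate(neighbours)
    ]
```

A good integration should score high on both metrics: samples well mixed within neighbourhoods, while neighbourhoods remain dominated by a single cell type.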
snRNA-seq clustering and annotation
The integrated embedding from the final scvi model was used for downstream analysis. We adapted our previous dataset harmonization pipeline14 for many of the following steps but changed it where necessary. We started with an initial round of clustering and annotated these clusters using marker gene signatures for principal cell types, including some non-hypothalamic ones. We found several clusters of cells that probably reside outside the hypothalamus (for example, SLC17A7+ neurons or thalamic SHOX2+ neurons). After annotating all cells, we removed the likely non-hypothalamic clusters and a few clusters representing low-quality cells, leaving us with a final dataset of 433,369 cells. Due to the imbalance of main cell-type distribution (for example, 40.4% of all cells are oligodendrocytes), we split the data into four main subsets for clustering and tree building: neurons, Oligo, AstroEpen and other non-neuronal cells. We ran Leiden clustering 100 times at each of several resolutions and combined the runs into a single consensus clustering per resolution using hybrid bipartite graph formulation56 to improve robustness. For each subset, several flat consensus clusterings were combined into a consensus hierarchical tree using mrtree57. Marker genes of each cluster versus all others, as well as versus only its sibling nodes in the subtree, were calculated using a batch-stratified Wilcoxon rank sum test58 and corrected for multiple testing using Bonferroni correction. The subtrees were pruned by merging nodes with insufficient differences (fewer than five strong marker genes, fewer than 50 cells or more than 90% of cells originating from a single donor) with their closest sibling node based on Euclidean distance in the integrated embedding. We repeated this pruning five times and used the final hierarchical tree in the following step.
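The pruning criteria can be sketched in Python as follows. This is an illustrative outline only: the node representation (keys `n_markers`, `n_cells`, `donor_counts`, `centroid`) is hypothetical, and representing each node by a centroid in the integrated embedding is an assumption about how the Euclidean distance between siblings is measured.

```python
import math

def needs_merge(node, min_markers=5, min_cells=50, max_donor_frac=0.9):
    """Return True if a node fails any of the three pruning criteria."""
    top_donor_frac = max(node["donor_counts"].values()) / node["n_cells"]
    return (
        node["n_markers"] < min_markers       # fewer than five strong marker genes
        or node["n_cells"] < min_cells        # fewer than 50 cells
        or top_donor_frac > max_donor_frac    # more than 90% of cells from one donor
    )

def closest_sibling(node, siblings):
    """Return the sibling whose centroid is nearest in Euclidean distance."""
    return min(siblings, key=lambda s: math.dist(node["centroid"], s["centroid"]))
```

In an iterative scheme, nodes flagged by `needs_merge` would be merged into the result of `closest_sibling`, marker genes would be recomputed and the procedure repeated.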
We then merged all four subtrees into the final clustering tree, which spans five distinct levels (C0–C4) with 4–452 distinct clusters; however, for non-neuronal cell types only up to four levels exist14. We manually labelled the first levels of the tree (C0, C1) based on cell type (broad class) and general location for neurons. For glial cells, we additionally annotated clusters with common names on levels C2 and C3 where applicable. For neurons, on level C2, we used neurotransmitter identity and consecutive numbers to label clusters. On levels C3 and C4 we used up to two marker genes to label clusters. Marker genes with high specificity both versus all other clusters and versus sibling clusters were prioritized. For four clusters of AgRP, NPW, HDC and PMCH neurons, we manually overwrote the label since the key neurotransmitter genes were not the top-scoring gene. When analysing genes of interest, we used the 99th (POMC, AGRP) or 95th (receptors) percentile of expression percentage as cutoff to select a subset of clusters for detailed examination.
Cross-species comparison
The cross-species integration with the mouse HypoMap dataset14 was conducted using only the neurons from both species. An overview of the pipeline can be found in Extended Data Fig. 5a. Homologous genes were identified using Ensembl v.101 (ref. 59), corresponding to Gencode v.35 used by Siletti et al.12. To reduce 1:N gene relationships, only the gene with the highest sequence homology was retained. The remaining 18,279 homologous genes were used to subset the expression matrices for both species. Highly variable genes (HVGs) were selected for each species individually, by identifying HVGs per sample (human) or batch (mouse) and ranking by occurrence. A total of 2,500 HVGs were selected per species and the overlap of 1,404 genes was used as input to an scvi model to obtain an integrated embedding including both species. The parameters for scvi were adapted from the HYPOMAP scvi model described above. To achieve more aggressive mixing and move cells from the two species closer together, the number of training rounds (epochs) was increased to 600.
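The selection of highly variable genes by occurrence can be outlined in Python as below. This is a sketch under stated assumptions: the function names are hypothetical, ties are broken alphabetically for reproducibility (the tie-breaking rule is not specified above), and the per-sample HVG lists are taken as given.

```python
from collections import Counter

def rank_hvgs_by_occurrence(per_sample_hvgs, n_top):
    """Select the genes that are called as HVGs in the largest number of samples.

    per_sample_hvgs: list of gene lists, one per sample (human) or batch (mouse).
    """
    counts = Counter(g for hvgs in per_sample_hvgs for g in set(hvgs))
    # Sort by decreasing occurrence, then alphabetically (assumed tie-breaker).
    ranked = sorted(counts, key=lambda g: (-counts[g], g))
    return ranked[:n_top]

def shared_hvgs(human_hvgs, mouse_hvgs):
    """Genes selected in both species; gene names are assumed to be already mapped to homologues."""
    return sorted(set(human_hvgs) & set(mouse_hvgs))
```

With `n_top=2500` per species, the intersection returned by `shared_hvgs` would correspond to the gene set used as input to the cross-species model.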
Cluster averages of the scvi embedding were calculated for clusters C4 in human and C465 in mouse. The Pearson correlation coefficients of cluster averages between species were used to identify corresponding (‘matched’) clusters between species. To remove M:N relationships, the correlations were adjusted and filtered: first, we grouped by either human or mouse cluster and obtained the maximum correlation value for each cluster (human and mouse). Then, for all correlation values of each cluster, the difference between the actual values and the maximum correlation was subtracted from the actual correlation values to obtain an adjusted value. Next, a graph was constructed with clusters as nodes and edges between all clusters across species with an adjusted correlation greater than 0.7. To remove all remaining M:N relationships the graph was pruned so that, for any node, all 1:N edges were kept if the neighbouring clusters had no edges to other nodes. If neighbouring nodes had several edges, only the edge with maximum adjusted correlation was retained.
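One possible reading of the adjustment and pruning steps is sketched below in Python. It is not the code used in this study and rests on two assumptions: the maximum used for the adjustment is the larger of the best correlation of the human cluster and that of the mouse cluster, and remaining M:N relationships are resolved by repeatedly removing the weakest edge whose two endpoints both have other edges. Function names are hypothetical.

```python
from collections import Counter, defaultdict

def adjust_correlations(corr):
    """corr maps (human_cluster, mouse_cluster) -> Pearson r.

    Adjusted value = r - (max - r), so a pair is penalized by its gap to the
    best match of either of its two clusters (assumed choice of maximum).
    """
    best_h = defaultdict(lambda: float("-inf"))
    best_m = defaultdict(lambda: float("-inf"))
    for (h, m), r in corr.items():
        best_h[h] = max(best_h[h], r)
        best_m[m] = max(best_m[m], r)
    return {(h, m): r - (max(best_h[h], best_m[m]) - r) for (h, m), r in corr.items()}

def prune_matches(adjusted, threshold=0.7):
    """Keep edges above the threshold, then remove M:N relationships.

    After pruning, every connected component is a 1:1 or 1:N match.
    """
    edges = {e: v for e, v in adjusted.items() if v > threshold}
    while True:
        deg = Counter()
        for h, m in edges:
            deg[("h", h)] += 1
            deg[("m", m)] += 1
        # An edge is in conflict when both of its endpoints have other edges.
        conflicts = [e for e in edges if deg[("h", e[0])] > 1 and deg[("m", e[1])] > 1]
        if not conflicts:
            return edges
        del edges[min(conflicts, key=lambda e: edges[e])]
```

Because an edge is removed only when both endpoints have alternatives, no cluster that passed the threshold is left unmatched by the pruning itself.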
Uniprot60 was queried using the REST API to obtain a list of reviewed GPCRs for both species, which was merged and used to select the most specific receptors in clusters of interest. For AGTR1 we included only mouse Agtr1a in the figure because Agtr1b was not expressed in mouse. We also excluded Npy2r, which was nearly absent in the human snRNA-seq data but detected robustly in the spatial transcriptomic data of the hypothalamus.
10x Genomics Visium CytAssist spatial transcriptomics
FFPE sections (5 μm) were prepared using a microtome (Leica) in an RNase-free environment and mounted onto positively charged slides. The sections were then stored at room temperature until use. Slides were processed for spatial transcriptomics according to 10x Genomics Visium CytAssist v.2 protocols. Briefly, samples were deparaffinized in xylene and a series of concentrations of ethanol solutions (100% to 70%) and immersed in water before haematoxylin and eosin staining. Once stained, samples were cover-slipped using a glycerol mountant and imaged using a VS200 slide scanner (Olympus Life Science) at ×20 magnification (air objective, 0.8 numerical aperture). Coverslips were removed and samples underwent destaining and decrosslinking, and were incubated overnight with 10x Genomics Visium Human WT Probes v.2 (Pleasanton). Following this, slides were loaded at the appropriate orientation, along with the Visium 11 × 11-mm gene expression slide, onto a CytAssist (10x Genomics), where hybridized probes were released from the tissue and ligated to spatially barcoded oligonucleotides on the Visium Gene expression slide. A tissue image was taken on the CytAssist at ×10 magnification for downstream alignment of library to the tissue section. Barcoded ligation products were then amplified to create a cDNA library for sequencing.
Libraries from the nine samples were pooled and sequenced on a NovaSeq 6000 sequencing platform (Illumina), using a NovaSeq 6000 S2 Reagent Kit v.1.5 (Illumina) according to the manufacturer’s instructions. Subsequently, fastq files were generated for each sample, reads were aligned to their corresponding probe-sequences (Visium human transcriptome probe set v.2, based on GRCh38 2020-A), mapped back to the Visium spot where a given probe was originally captured and finally aligned to the original HE-stained image of the tissue section using SpaceRanger v.2.0.0 (10x Genomics).
Atlas location of each spatial transcriptomics section was determined by consulting the Atlas of the Human Brain (4th edn)61 (Supplementary Table 10).
Spatial transcriptomics data analysis
Across the nine samples, the median number of counts per Visium spot was 7,105, and the median number of detected genes per spot was 3,560. The average sequencing saturation was 0.68. Furthermore, for each individual sample, graphs with (1) sequencing saturation and (2) detected number of genes plotted as a function of median number of reads per spot revealed the plateau phase was either obtained or clearly approached, that is, very little benefit would be gained from even deeper sequencing.
Spatial transcriptomics data pre-processing
The number of genes per spot and counts per spot was inspected for each tissue section individually using the Loupe browser to identify whether there were areas of the sample that had unusually low/high counts that are probably artefacts from the experimental procedures. These spots were identified and removed from downstream analysis.
For visualization of gene expression in the spatial transcriptomics data, data were analysed using Seurat (v.4.3.0)62. Raw count matrices along with spatial barcode coordinates for each sample were loaded, and data were log-normalized for visualization of transcript expression.
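For readers unfamiliar with log-normalization, the transformation can be written in a few lines of Python. The sketch below mirrors the behaviour of Seurat's LogNormalize method (counts divided by the spot total, multiplied by a scale factor, then natural-log transformed with log1p); the scale factor of 10,000 is Seurat's default and is assumed here, as it is not stated above. The function name is hypothetical.

```python
import math

def log_normalize(counts, scale_factor=10_000):
    """Log-normalize the raw counts of one spot.

    counts: dict mapping gene -> raw count for a single Visium spot.
    Returns a dict mapping gene -> log1p(count / total * scale_factor).
    """
    total = sum(counts.values())
    if total == 0:
        return {gene: 0.0 for gene in counts}
    return {gene: math.log1p(c / total * scale_factor) for gene, c in counts.items()}
```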
Integration of snRNA-seq and spatial transcriptomic data: cell2location
We used cell2location (v.0.1.2)18 to predict the locations of snRNA-seq cell populations in the spatial transcriptomics data. We utilized the entire snRNA-seq dataset as a reference, and estimated reference cell-type signatures for clustering levels C1–C4. We included genes that were expressed in at least 8% of cells, and genes expressed in at least 0.05% of cells if the non-zero mean was greater than 1.4. We estimated reference signatures using the negative binomial regression model, accounting for the effects of donor, sex, batch and dataset.
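The two-tier gene inclusion rule can be expressed as a short Python sketch. It illustrates the logic only and does not call cell2location; the function name and input dictionaries are hypothetical.

```python
def select_genes(frac_expressing, nonzero_mean,
                 high_frac=0.08, low_frac=0.0005, mean_cutoff=1.4):
    """Return genes that pass either inclusion rule.

    frac_expressing: dict gene -> fraction of cells with non-zero counts.
    nonzero_mean: dict gene -> mean count across cells with non-zero counts.
    Rule 1: expressed in at least 8% of cells.
    Rule 2: expressed in at least 0.05% of cells with a non-zero mean above 1.4.
    """
    selected = []
    for gene, frac in frac_expressing.items():
        if frac >= high_frac or (frac >= low_frac and nonzero_mean[gene] > mean_cutoff):
            selected.append(gene)
    return selected
```

The second rule retains genes that are rare across the dataset but strongly expressed where present, which is typical of markers for small neuronal populations.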
For each cluster level, we trained the cell2location model with a detection α of 20 and three cells per location as hyperparameters, and trained for 30,000 epochs, with the final gene list including genes expressed in both the snRNA-seq and spatial transcriptomics dataset. Results were visualized using scanpy and Seurat. The plots represent the estimated abundance of cell types at each location.
To cluster the spatial transcriptomics spots, we used k-nearest neighbours and Leiden clustering on a matrix of cell abundance scores for each C3 neuronal snRNA-seq cluster and C2 non-neuronal snRNA-seq cluster. We used the C3 neuronal and C2 non-neuronal abundance mappings as these levels provided a greater number of clusters mapping confidently to regions in the spatial transcriptomics dataset. We annotated each cluster based on the hypothalamic region in which most spots were present, and by the top marker genes for each cluster. If several spatial transcriptomics clusters originated from the same hypothalamic region, then these were grouped together for regional annotation of the spatial transcriptomics dataset.
Assigning regional annotations to snRNA-seq clusters
To assign snRNA-seq clusters to spatial transcriptomics regional clusters, we identified the (ungrouped) region in which the adjusted mean abundance score (the mean abundance score for a snRNA-seq cluster in a region minus the median regional abundance) for each C3 neuronal cluster and C2 non-neuronal cluster was the highest. We then calculated the median absolute deviation (MAD) for each cluster in each spatial region (ungrouped) and normalized the adjusted abundance for each snRNA-seq cluster in each region by dividing it by the MAD (we call this ‘mad_x’). If the region with the highest adjusted mean abundance score for a particular cluster also had a mad_x > 10, then this region was assigned to this cluster. A mad_x < 10 indicated low-confidence mapping to any region and these snRNA-seq clusters were not assigned to a regional cluster. The regional annotations for some clusters were adjusted manually if the regional assignment did not match biology (for example, some clusters mapping to the LTN were generally thought to be anterior or pre-hypothalamus and so were manually assigned ‘NA’), or if mad_x < 10 but the cluster showed good abundance in the appropriate region. Overall, we found the C3 neuronal and C2 non-neuronal abundance estimates to be very robust and therefore assigned C4 snRNA-seq clusters to regional clusters by using their C3 parent’s assignment. We used C3-propagated assignments to label all C4 clusters in general, but showed C4 abundances in some specific cases. An overview of the region assignments can be found in Supplementary Table 12. The mean cell abundance scores for C3 and C4 clusters can be found in Supplementary Tables 11 and 13, respectively.
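The mad_x rule can be illustrated with the following Python sketch for a single snRNA-seq cluster. It is a simplified interpretation, not the analysis code: the median and MAD are assumed to be taken across the regional mean abundances of that cluster, and the function name and the handling of a zero MAD are hypothetical choices.

```python
from statistics import median

def assign_region(mean_abundance, min_mad_x=10.0):
    """Assign one snRNA-seq cluster to a region using the mad_x rule.

    mean_abundance: dict region -> mean abundance score of the cluster.
    Returns (region or None, mad_x of the top-scoring region).
    """
    values = list(mean_abundance.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    top_region = max(mean_abundance, key=mean_abundance.get)
    adjusted = mean_abundance[top_region] - med   # adjusted mean abundance
    if mad > 0:
        mad_x = adjusted / mad
    else:
        # Degenerate case: no spread across regions (assumed handling).
        mad_x = float("inf") if adjusted > 0 else 0.0
    return (top_region if mad_x > min_mad_x else None), mad_x
```

A cluster concentrated in one region yields a large mad_x and is assigned; a cluster spread evenly across regions yields a small mad_x and is left unassigned.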
Software and packages used for snRNA-seq and spatial transcriptomics analysis
The following R and Python packages were used for the analysis and plotting of snRNA-seq and spatial transcriptomics datasets: Python v.3.10.8–v.3.10.12, scvi v.0.19.0, scanpy v.1.9.8, pandas v.1.4.4, numpy v.1.26.4, cell2location v.0.1.2, cellbender v.0.1–v.0.2, cellex v.1.2.2, CELLECT v.1.3.0, R v.4.3.1, future.apply v.1.11.1-9001, future v.1.33.1-9009, pbapply v.1.7-2, Matrix v.1.6-1.1, scUtils v.0.0.1, magrittr v.2.0.3, igraph v.1.5.1, treeio v.1.26.0, ggh4x v.0.2.6, scales v.1.2.1, edgeR v.4.0.16, limma v.3.58.1, ggtree v.3.10.1, lubridate v.1.9.3, forcats v.1.0.0, stringr v.1.5.0, dplyr v.1.1.3, purrr v.1.0.2, readr v.2.1.4, tidyr v.1.3.0, tibble v.3.2.1, ggplot2 v.3.4.4, tidyverse v.2.0.0, SeuratObject v.4.1.4, Seurat v.4.4.0, RcppAnnoy v.0.0.22, cellranger v.4-5, spaceranger v.2 and bolt-lmm v.2.3.6.
Single-molecule fluorescence in situ hybridization
FFPE sections (5 μm) from the same tissue blocks used for spatial transcriptomics (see Supplementary Table 1 for donor information) were cut and mounted onto positively charged slides. Multiplex fluorescence RNAScope (ACDBio) was performed using a Bond RX fully automated research stainer (Leica), the RNAScope LS multiplex fluorescent reagent kit (Advanced Cell Diagnostics (ACD), Bio-Techne) and probes specific for GLP1R (catalogue no. 519828), GIPR (catalogue no. 471348), SST (catalogue no. 310598), POMC (catalogue no. 429908) and AVP (catalogue no. 401368; Advanced Cell Diagnostics, Bio-Techne). Slides were baked and deparaffinized before heat-induced epitope retrieval at 95 °C for 30 min using Bond ER Solution 2. Next, ACD enzyme (ACDBio) was added, and slides were incubated at 40 °C for 15 min. Samples were hybridized, amplified and detected according to the ACD Multiplex Protocol P1. Final detection was achieved with the Opal 570 and Opal 690 fluorophore reagent packs (Akoya BioSciences, Inc., diluted 1:1,000), and samples were counterstained with 4′,6-diamidino-2-phenylindole (ACD) to mark cell nuclei and cover-slipped with ProLong Diamond antifade mountant (ThermoFisher Scientific) before being imaged using the VS200 slide scanner (Olympus Life Science) at ×20 magnification (air objective, 0.8 numerical aperture).
Three independent human samples (see Supplementary Table 1 for donor information) were used to assess ependymal and tanycyte expression markers. Fresh post-mortem human hypothalamus 2 × 3 × 1-cm blocks (less than 24 h post-mortem) were incubated for 16 h in 10% neutral buffered formalin and then further fixed for 48–72 h in 4% paraformaldehyde. Brain blocks were dehydrated in a series of ethanol treatments (70% (16 h, 2 × 4 h), 80% (16 h, 2 × 4 h), 96% (16 h, 2 × 4 h) and 100% (16 h, 1 × 4 h)). The blocks were then incubated for 3.5 days in xylol, followed by two incubations in fresh paraffin (5 h, 16 h) before placing the blocks into forms. Brain blocks were sliced (5 µm) and mounted on Superfrost (ThermoFisher) glass slides and stored at room temperature.
We performed smFISH on human hypothalamic slices as recommended for FFPE tissue by the manufacturer (RNAScope Multiplex Fluorescent Reagent Kit v.2 Assay, catalogue no. 323100-USM, ACD). Briefly, slides were incubated for 1 h at 60 °C, followed by two 5-min incubations in xylene at room temperature and two 2-min incubations in 100% ethanol. Slides were air-dried and subjected to target retrieval for 15 min. Protease Plus (ACD) was applied for 25 min at 40 °C. After the pre-treatment, the standard protocol was continued. The following RNAScope probes were used: DIO2 (catalogue no. 562211), FZD5 (catalogue no. 414051), STOML3 (custom made) and LPAR3 (catalogue no. 428811). For controls, 3-plex positive (catalogue no. 320861) and 3-plex negative (catalogue no. 320871) probes were used. Probes were detected with Opal fluorophores from Perkin Elmer: Opal 690 (catalogue no. FP1497001KT), Opal 620 (catalogue no. FP1495001KT) and Opal 570 (catalogue no. FP1488001KT), each at a dilution of 1:1,000. Images were captured using a Leica TCS confocal microscope, equipped with ×20/0.75 liquid immersion and ×40/1.30 oil objectives, and LasX software. Images were captured in the hypothalamus and median eminence areas, from the anterior to the posterior hypothalamus.
Cell-type enrichment and BMI associations
Cell-type specificity matrices were generated using CELLEX software v.1.2.2 (ref. 28). Due to memory limits, we performed bootstrapping by sampling the HYPOMAP dataset randomly into ten smaller datasets, each containing 100,000 cells. CELLEX was then performed on each of the subsets, and the mean values were taken forward for the subsequent enrichment analysis.
Using the resulting cell-type specificity matrices, we ran CELLECT28 with MAGMA29, alongside GWAS data from the GIANT BMI meta-analysis (Nmax = 806,834)27, to prioritize hypothalamic cell types that showed enrichment in the BMI GWAS. CELLECT-MAGMA (v.1.3.0) was run with default parameters across the 452 tested hypothalamic cell types, with the multiple-test corrected significance threshold set at P < 0.05/452. This was followed up with CELLECT-GENES, run with default parameters except that the percentile cutoff was set to 95. CELLECT-MAGMA was also run on reference signature values from cell2location and the above-mentioned subsets as a sensitivity analysis (Extended Data Fig. 9).
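The Bonferroni threshold used for the cell-type enrichment works out to approximately 1.1 × 10⁻⁴. A minimal Python sketch of the calculation and its application is given below; the function names and the example P values are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests

def significant_cell_types(p_values, alpha=0.05):
    """Return cell types whose P value is below alpha divided by the number of tests.

    p_values: dict cell_type -> P value, one entry per tested cell type.
    """
    threshold = bonferroni_threshold(alpha, len(p_values))
    return sorted(ct for ct, p in p_values.items() if p < threshold)

# Threshold for the 452 tested hypothalamic cell types.
cell_type_threshold = bonferroni_threshold(0.05, 452)
```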
We analysed exome-sequencing-based rare variant burden, as described in Gardner et al.63, using data from up to 454,787 individuals from the UK Biobank study31 through the UK Biobank Research Access Platform (https://ukbiobank.dnanexus.com). Variants were annotated with the ENSEMBL Variant Effect Predictor (VEP)64 v.104 with the ‘everything’ flag and the LOFTEE plugin65, and a single MANE v.0.97 or VEP canonical ENSEMBL transcript and the most damaging consequence, as defined by VEP defaults, were prioritized for each variant. To define PTVs, we grouped high-confidence (as defined by LOFTEE) stop gained, splice donor/acceptor and frameshift consequences. All variants were subsequently annotated using CADD (v.1.6)66. BMI for all participants was obtained from the UK Biobank data showcase (field 21001). After excluding people with missing data, 419,692 people with BMI measures remained for downstream analysis. To assess the association between rare variant burden and BMI, we implemented BOLT-LMM (v.2.3.5)67, using a set of dummy genotypes representing the per-gene carrier status. For the latter, we collapsed variants with a MAF < 0.1% across each gene and defined carriers as those with a qualifying high-confidence PTV (HC–PTV), as defined by VEP and LOFTEE, or with a ‘damaging’ variant (DMG), a category that included missense variants with a CADD score greater than or equal to 25 and the aforementioned HC–PTVs. Genes with fewer than ten carriers were excluded. BOLT-LMM was run with default settings and the ‘lmmInfOnly’ flag, and all analyses were controlled for sex, age, age², WES batch and the first ten genetic ancestral principal components as calculated previously31. Gene-level BOLT association summary statistics were then extracted for the 426 identified effector genes, setting the multiple-test corrected threshold at P < 0.05/426.
Finally, to identify which GWAS signals were proximal to the identified effector genes, we also performed signal selection on the GIANT BMI GWAS meta-analysis. GWAS summary statistics were filtered to retain variants with a MAF > 0.1% that were present in at least half the contributing studies. Quasi-independent genome-wide significant (P < 5 × 10⁻⁸) signals were initially selected in 1-Mb windows and secondary signals within these loci were further selected by conditional analysis in GCTA68, using a linkage disequilibrium reference derived from the UK Biobank study. Primary signals were then supplemented with unlinked (R² < 5%) secondary signals whose association statistics did not overtly change in the conditional models. Signals were mapped to proximal effector genes within 500-kb windows. For genes within 500 kb of multiple GWAS signals, the most significant signal is shown in Supplementary Table 19.
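The collapsing of rare variants into per-gene carrier status can be outlined in Python as follows. This sketch only illustrates the masking logic and is not the pipeline of Gardner et al.: the function name and the variant record fields (`gene`, `maf`, `is_hc_ptv`, `is_missense`, `cadd`, `carriers`) are hypothetical.

```python
from collections import defaultdict

def gene_carriers(variants, max_maf=0.001, min_cadd=25.0, min_carriers=10):
    """Collapse qualifying rare variants into per-gene carrier sets.

    variants: iterable of dicts with hypothetical keys 'gene', 'maf',
    'is_hc_ptv' (bool), 'is_missense' (bool), 'cadd' (float) and
    'carriers' (set of participant IDs).
    Returns {'HC_PTV': {gene: carriers}, 'DMG': {gene: carriers}}.
    """
    masks = {"HC_PTV": defaultdict(set), "DMG": defaultdict(set)}
    for v in variants:
        if v["maf"] >= max_maf:          # keep only variants with MAF < 0.1%
            continue
        if v["is_hc_ptv"]:
            # HC-PTVs count towards both masks.
            masks["HC_PTV"][v["gene"]] |= v["carriers"]
            masks["DMG"][v["gene"]] |= v["carriers"]
        elif v["is_missense"] and v["cadd"] >= min_cadd:
            masks["DMG"][v["gene"]] |= v["carriers"]
    # Drop genes with fewer than the minimum number of carriers.
    return {
        name: {g: c for g, c in genes.items() if len(c) >= min_carriers}
        for name, genes in masks.items()
    }
```

Each resulting carrier set would then be encoded as a dummy genotype (carrier versus non-carrier) for the association model.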
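The final mapping step, from GWAS signals to proximal effector genes, can be sketched in Python as below. The function name and data layout are hypothetical, and the 500-kb window is assumed to extend from the gene boundaries (the reference point for the window is not specified above).

```python
def most_significant_signals(genes, signals, window=500_000):
    """For each gene, return the most significant GWAS signal within the window.

    genes: dict name -> (chromosome, start, end).
    signals: list of (chromosome, position, p_value) tuples.
    Genes with no signal in range are omitted from the result.
    """
    result = {}
    for name, (chrom, start, end) in genes.items():
        in_range = [
            s for s in signals
            if s[0] == chrom and start - window <= s[1] <= end + window
        ]
        if in_range:
            result[name] = min(in_range, key=lambda s: s[2])
    return result
```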
Results from CELLECT and exome associations were visualized using ggplot2 (v.3.4.2) in R (v.4.2.1).
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.