Repeated head trauma causes neuron loss and inflammation in young athletes

September 17, 2025

72

Neuropathological Diagnosis

Brain tissue was obtained from the CTE and the National Center for PTSD Brain Banks. Identical intake and tissue processing procedures occur with both brain banks. Four controls included in Nissl quantification were provided by the Iowa Neuropathology Resource Laboratory. Neuropathological examination was performed by board certified neuropathologists as described previously^10,44. Diagnosis of CTE was determined using published consensus criteria^10,44. Demographics such as athletic history, military history, traumatic brain injury history, and RHI history were queried during telephone interview with next of kin as detailed previously^10,44. Institutional review board approval for brain donation and informed consent for research was obtained through the Boston University Alzheimer’s Disease and CTE Center, Human Subjects Institutional Review Board of the Boston University School of Medicine, and VA Boston Healthcare System (Boston, MA). Individuals were included in the snRNA-seq and single-molecule fluorescence in situ hybridization (smFISH) experiments based on frozen tissue availability, quality (RNA integrity number (RIN) > 4) and diagnosis. Those used for immunohistochemistry were included based on the same criteria except frozen tissue availability and RIN. Exclusion criteria included neuropathological diagnosis other than CTE, moderate to severe traumatic brain injury directly prior to death, age of death greater than 51 or less than 25. Control cases did not have exposure to any RHI, were negative for any neurodegenerative disease, and did not carry any diagnosis of a neuropsychological disorder.

snRNA-seq

Fresh frozen brain tissue was collected from the dorsolateral frontal cortex of each donor at the depth of the cortical sulcus. Visual delineation of grey and white matter was used to collect 50 mg of grey matter tissue. Tissue was processed and cleaned of white matter prior to homogenization at two levels. First, when removing samples from frozen coronal slabs, the unbiased technician visually inspected and avoided white matter that could be adjacent to target grey matter. Second, immediately before tissue homogenization, a second technician inspects the tissue and removes any remaining white matter. This preparation allows for a highly specific grey matter enrichment. Nuclei isolation and sorting were performed on two donor samples per day, randomizing for diagnosis and age. Tissue was kept on ice throughout nuclei isolation. Tissue was homogenized and lysed in NST Buffer with DAPI (146 mM NaCl, 10 mM Tris, 1 mM CaCl₂, 21 mM MgCl₂, 0.1% BSA, 0.1% NP-40, 40 U ml⁻¹ Protector RNase Inhibitor and DAPI) and snipped with scissors on ice for 10 min. Debris was removed using a 70-μm filter. Cells were spun down and resuspended in nuclei storage buffer (2% BSA, 400 U ml⁻¹ Protector RNase Inhibitor) to reach a concentration of 500–1,000 nuclei per μl. Nuclei were purified for DAPI-positive cells with a FACS Aria flow cytometer to remove debris and processed using the Chromium Next GEM Single Cell 3′ Reagents Kit V2 (10x Genomics) to create cDNA libraries. Samples were pooled in two batches sequenced with Azenta to a read depth of 30,000 reads per cell on an Illumina NovaSeq.

Processing, quality control and clustering of snRNA-seq data

CellRanger v.6.0.1 was used to align reads to the GRCH38 reference and generate filtered count matrices containing 233,555 cells across all samples. The runCellQC function in the singleCellTK R package was used to generate quality control metrics and doublet calls^45,46. Contamination from ambient RNA was identified using decontx using the full raw matrix as the ‘background’ for each sample⁴⁷. Nuclei were removed if they had ambient RNA contamination fraction greater than 0.3, mitochondrial or ribosomal percentage greater than 5%, total counts less than 750, or genes detected less than 500. The data were not down sampled to maximize capture of rare populations. The Seurat workflow within the singleCellTK package was used for clustering starting with the decontaminated counts from decontx⁴⁸. In brief, the data were normalized and scaled using runSeuratNormalizeData and runSeuratScaleData. Highly variable genes were identified using runSeuratFindHVG with the method vst. Principal components were determined using runSeuratPCA. UMAP dimensionality reduction was calculated using runSeuratUMAP. Clusters across all cell types were identified using the runSeuratFindClusters function at a resolution of 0.3. After initial clustering all the cells, clusters that were predominantly doublets (>50%) were removed and produced the final dataset of 170,717 nuclei (Extended Data Fig. 1h–k). Associations with post-mortem interval (PMI), age at death and sequencing batch were performed using Pearson’s correlation analysis in R (Supplementary Fig. 4). Age at death was associated with only excitatory neuron L5_FEZF2_PCP4_RPRM and inhibitory neuron PVALB_SCUBE_PTPRK proportions. Therefore, age was not included in regressions performed with sequencing data. PMI correlated with only one microglial subtype (RHIM1), perivascular macrophages, an excitatory neuron subtype (L2_4CUX2_COL5A2) and several oligodendrocyte subtypes. Sequencing batch was associated with one cluster of OPCs and was therefore not included in analyses.

All GO analysis was performed using MetaScape default settings⁴⁹. DEG lists for all comparisons available in Supplementary Tables 6–16.

Cell-type identification

Cell-type markers verified by previous human snRNA-seq studies were used to identify clusters that belonged to individual cell types (Extended Data Fig. 1m,n). Cell types were subsetted out using subsetSCEColData and reclustered by the same Seurat method described above with the addition of running Harmony to account for sample-to-sample variability⁵⁰. Clusters expressing high levels of >1 cell-type marker were removed. Excitatory and inhibitory neurons identified from the full dataset were clustered together to determine neuronal subtypes. Four clusters (1, 2, 19 and 21) were found to express low levels of neuronal genes and astrocytic genes (SLC1A2 and SLC1A3), and were single-batch enriched (80–90%) therefore these clusters were not included in downstream analysis (Extended Data Fig. 8a–d).

Module analysis

Celda

Gene co-expression modules were identified using Celda⁵¹. Celda utilized Bayesian hierarchical linear mixed effects models to identify modules of genes that are expressed together. A workflow overview can be found in Extended Data Fig. 4. Celda was run on cellular subtypes to determine module scores on a cell-wise basis and plotted across cellular subtypes. Statistical analysis of module enrichment was performed using a linear mixed effects model using sample ID as a covariate. For microglia, cell subtypes were compared to homeostatic microglia as a baseline, for endothelial cells Cap1 was used, for astrocytes Astro1 (homeostatic astrocytes) were used as a baseline. Celda module analysis was plotted as Violin plots as these types of plots demonstrate statistical differences and also allow for visualization of the variance within the data (Supplementary Figs. 1, 7, 8). Additionally, to help further validate findings, radar plots for each Celda module were also provided to help visualize module distribution among all groups (Supplementary Figs. 3, 5 and 10).

hdWGCNA

hdWGCNA (v.0.4.5) was also run to validate gene co-expression modules in astrocytes, microglia, and endothelial cells. The hdWGCNA workflow was run with default parameters except min_cells was set to 25 and k was set to 10 for the metacells analysis performed by the MetacellsByGroups function. Additionally, minModuleSize was set to 25 in the ConstructNetwork function for astrocytes and microglia. Radar plots were provided to demonstrate cell-type distribution. Metascape⁴⁹ was used to generate GO analyses for Fig. 2f. Statistics for GO were generated with GSEA and single-tailed hypergeometric test with Benjamini–Hochberg multiple hypothesis correction.

hdWGCNA and Celda Modules were compared against each other for further validation. All major modules of interest could be observed in both module analyses (Supplementary Figs. 2c, 6d and 9d). The discrepancy between module numbers with hdWGCNA and Celda was the result of how each technique process data. Celda clusters every gene into a module, in contrast to hdWGCNA that does not. Celda also captures modules that are broadly expressed across many clusters rather than modules only expressed in small numbers of clusters. Biological function of each module was assessed with the EnrichR package to validate functional significance. Finally, in order to efficiently run hdWCGNA on single cell data, a prior step must be performed that reduces the cells to ‘metacells’. According the hdWGCNA tutorial, “metacell aggregation approach does not yield good results for extremely underrepresented cell types”, which probably also contributes to the reduced module number. Although module numbers may differ, important modules of interest were preserved through both datasets.

All module genes and statistical analysis can be viewed in Supplementary Tables 17–19, analysis code is available on GitHub at www.github.com/morganebutler/singleCellScripts.

External dataset comparison

The Sun et al. dataset²³ was downloaded from https://compbio.mit.edu/microglia_states/. Another Sun et al.²⁵ dataset was downloaded at http://compbio.mit.edu/scADbbb/. For the microglia, bootstrapping was performed by randomly sampling 80% of the Sun dataset with replacement for 50 iterations. For each iteration, FindTransferAnchors from the Seurat package was used to project the current microglia dataset onto the Sun UMAP space, and MapQuery to predict microglia labels. Label calls were recorded for each iteration and the label consistency was reported as the percentage of iterations the same label was called in Extended Data Fig. 5d,e.

For Visium projection of astrocyte subtype genes, publicly available Visium data from human cortex (Adult Human Brain 1) were downloaded from the 10x Genomics website. The Seurat function AddModuleScore was used to create a per-spot score for astrocyte subtype gene expression (significantly upregulated genes in each subtype). Plots were created with SpatialFeaturePlot and displayed in Extended Data Fig. 6i.

MultiNicheNet

Ligand–receptor pair analysis was performed using MultiNicheNet, an adaptation of nichenet that allows for comparison across more than two condition groups. In brief, this method uses known datasets of ligand–receptor pairs and their downstream targets to identify potentially upregulated cell signalling pathways across cell types accounting for differential expression of genes across groups. MultiNicheNet also uses prioritization of top ligand–receptor pairs to help identify signalling pathways of interest. Contrasts for differential gene expression were set as RHI versus control, and CTE versus RHI to determine RHI and CTE-specific signalling pathways. Finalized cell-type objects were combined and run through the MultiNicheNet pipeline with the exclusion of T cells due to low cell numbers. Analysis was performed without alteration to publicly available code, save for the contrasts used.

Histological tissue processing

Formalin-fixed, paraffin-embedded tissue was sectioned and labelled as described⁵². In brief, 10-μm sections were allowed to dry, baked, dewaxed and rehydrated prior to antibody labelling. For immunofluorescent staining, epitope retrieval was performed using a pH 6 or pH 9 buffer and boiling for 15 min in the microwave. Sections were blocked for 30 min at room temperature with 3% donkey serum and primary antibodies (Supplementary Table 4) were conjugated for 1 h at room temperature. Secondary antibodies were conjugated for 30 min, and Opal TSA dyes were incubated for 10 min. Slides were coverslipped with ProLong Gold Antifade mounting medium (Invitrogen) and imaged at 20× or 40× on a Vectra Polaris whole-slide scanner with the appropriate filters. Images were spectrally unmixed using inForm software prior to image analysis. For Nissl staining, sections were hydrated and stained in 0.01% thionin for 20–40 s and dehydrated back to xylene before coverslipping in Permount mounting media and imaging on an Aperio GT450 scanner at 40×. As formalin-fixed histologic tissue was more readily available than frozen samples, more samples could be utilized for immunohistochemistry and in situ hybridization experiments. A full list of samples that were included in each immunohistochemistry experiment is shown in Supplementary Tables 2 and 3.

smFISH and immunohistochemistry codetection

Tissue was embedded in Optimal Cutting Temperature medium (Sakura Tissue-Tek) and was brought to cryostat temperature (−20 °C) before cutting. Chuck temperature was raised to −12°/−10 °C for optimal cutting conditions. Tissue was sectioned at 16 µm thickness onto Fisher SuperFrost slides. Direction of tissue orientation relative to the depth of the cortical sulcus was randomized across samples. Sections were fixed in cold 4 °C 10% neutral buffered formalin for 60 min and dehydrated in 50%, 70%, 100% and 100% ethanol for 5 min each at room temperature. Fluorescent in situ hybridization was performed using RNAScope kits (Advanced Cell Diagnostics) (Supplementary Table 5) optimized on the Leica BOND Rx automated slide staining system. Slides were pretreated with protease for 15 min. Opal TSA dyes were used for visualization at a concentration of 1:300–1:500. A positive and negative control probe was run for each block before staining with targeted probes. For immunohistochemical codetection of p-tau and GLUT1, sections were run through the RNAScope protocol as described and then manually stained with the AT8 or GLUT1 antibody (Supplementary Table 4) with the immunohistochemical protocol described in ‘Histological analysis’ without the antigen retrieval. Samples included in each smFISH experiment are listed in Supplementary Table 2. Not all samples were used across every smFISH experiment due to exhaustion of sample blocks.

Image analysis

Analysis of fluorescent RNAScope fluorescence in situ hybridization (FISH) was performed in Indica Labs HALO using the FISH v.3.2.3 algorithm or the FISH-IF v.2.2.5 algorithm. Thresholds for FISH probe positivity for was set manually for each probe (HIF1A, SPP1, P2RY12, ITGAV, TGFB1, TGFBR2, LAMP5 and CUX2) and kept consistent across samples. It should be noted that SPP1 is not exclusively expressed by microglia, and DEG analysis demonstrated that only oligodendrocytes showed elevated expression of SPP1 in our dataset (Supplementary Table 6b). However, colocalization with microglia markers allows for a microglia-specific count of SPP1 activity. Gene expression was determined by the ‘probe cell intensity’ in HALO because this measure is agnostic to manual single copy intensity settings. The background on GLUT1 staining in FISH sections was variable due to protease treatment from RNAScope and thresholds were manually adjusted to remove background staining. Vessel proximity analysis was performed by evaluating TGFB1+P2RY12+ cells and GLUT1⁺ITGAV+TGFBR2+ cells and using the ‘proximity analysis’ algorithm in the HALO spatial analysis settings. The number of unique marker-positive microglia/vessel pairs within 25 µm were evaluated. Density heat maps for CUX2+LAMP5+ staining were created using the ‘density heatmap’ function within HALO spatial analysis. Depiction of how the sulcus and crest were annotated can be found in Extended Data Fig. 10d. To validate consistency between image analyses methods and snRNA-seq results, seven samples that were included in both RNAScope and snRNA-seq methods were compared and cellular proportions of CUX2+LAMP5+ neurons significantly correlated (P = 0.02; Extended Data Fig. 10c).

Analysis of immunohistochemistry protein staining was performed using the HALO Object Colocalization v.2.1.4 and HighPlex v.4.3.2 algorithm. Microglial P2RY12 was assessed by DAPI⁺IBA1⁺ nuclei and P2RY12^hi^/low thresholds were set manually. High P2RY12 was defined as having at least 70% of the nucleus stained, low P2RY12 was defined as less than 70% of the nucleus stained as demonstrated visually in Fig. 2n. Only 5.4% of all IBA1⁺ or P2RY12⁺ cells were P2RY12⁺IBA1⁻, suggesting that 94.6% of labelled microglia were assessed. IBA1⁺P2RY12⁻ cells may have been captured in our P2RY12^low categorization, however previous studies have shown that these cells are low in abundance and likely represent infiltrating macrophages which have been shown to be present mainly at lesioned vessels in CTE which are also sparse in our cohort^53,54.

Analysis of Nissl staining was performed using the HALO Nuclei Segmentation AI algorithm. Neurons were selected for training based on previously published criteria⁵⁵. In brief, the classifier was given examples of brain parenchyma annotated for neurons which were considered cells with a Nissl-positive cytoplasm and a visible nucleus (Extended Data Fig. 9h). Nissl+ densities across batches were not significantly different and statistical tests of Nissl densities were corrected for staining batch. For FISH and Nissl sections, the depth of the cortical sulcus was defined and annotated as the bottom third of a gyral crest and sulcus pair. Layer 2/3 and layers 4–6 were annotated using layer-specific FISH markers or for Nissl by an expert observer.

Software and code

The following code and software was used for the analyses: CellRanger v.6.0.1 was used to align reads to the GRCH38 reference and generate filtered count matrices. All other analyses were performed in R v.4.2.1 and Python v.3.10.12 using standard functions unless otherwise stated. Specific versions of packages used are listed in available GitHub code. The following packages were used: CellRanger v.6.0.1, singleCellTK v.2.8.0, Seurat v.4.3.0, scater v.1.24.0, harmony v.0.1.1, RColorBrewer v.1.1.3, ComplexHeatmap v.2.14.0, ArchR v.1.0.2, muscat v.1.12.1, readr v.2.1.4, ggplot2 v.3.4.2, ggsignif v.0.6.4, ggpubr v.0.6.0, magrittr v.2.0.3, scCoda v.0.1.9 Python package, celda v.1.19.1 and hdWGCNA v.0.4.5.

HALO v.3.6.4134.193, HALO AI v.3.6.4134, HALO Object Colocalization v.2.1.4 algorithm and FISH v.3.2.3 algorithm were used to analyse the histological and Nissl images. InForm v.2.5.1 was used to spectrally unmix fluorescent in situ hybridization images.

Inclusion and ethics statement

The research has included local researchers through the research process and is locally relevant with collaborators. All roles and responsibilities were agreed amongst collaborators ahead of the research. The research was not severely restricted in the setting of researchers. The study was approved by the Institutional review board through the Boston University Alzheimer’s Disease and CTE Center, Human Subjects Institutional Review Board of the Boston University School of Medicine, VA Bedford Healthcare System, VA Boston Healthcare System, and Iowa Neuropathology Resource Laboratory. The research did not result in stigmatization, incrimination, discrimination, or risk to donors or research staff. No materials have been transferred out of the country. Local and regional research relevant to the study has been included in the citations.

Statistics and reproducibility

Analyses were performed using GraphPad Prism 10, SPSS v.29 and R (v.4.2.1) packages ggsignif, muscat, scater, and the Python (v.3.10.12) package scCoda. Dirichlet multinomial regression was used to test for cell type and excitatory neuron cell-type enrichment using the scCoda v.0.1.9 Python package¹³. Celda module expression was evaluated using linear mixed effects modelling, accounting for individual sample differences. Comparisons of cell-type proportions across the three pathological groups were performed using ANOVA with Bonferroni correction, Brown Forsyth with Dunnett post-hoc test, or chi-squared test as indicated in figure legends. Comparison across control and RHI-exposed groups was performed with a t-test with Welch correction or Mann–Whitney U-test, as indicated in the figure legends. Bar plots denote error with s.e.m. Scatter plots denote error with a grey outline of the 95% confidence interval. Evaluation of in situ hybridization analysis was performed using linear regression. P-tau burden was normalized using log₁₀ transformation of positive area density. Nissl⁺ neuron count comparisons to years of exposure were assessed using linear regression and correcting for age at death and staining batch. Jaccard similarity scoring was performed using the GeneOverlap package by comparing lists of DEGs. All DEGs were filtered by a log₂-transformed fold change of 0.15 and false discovery rate (FDR) of <0.05. Chi-squared tests for cellular abundance were performed using the base R chisq.tst function. GO analysis P values were acquired through MetaScape analysis. GO statistics were calculated with GSEA and single-tailed hypergeometric test with Benjamini–Hochberg multiple hypothesis correction. Years of football play was used as a variable for exposure throughout the text instead of total years of play (which includes exposure from all sports) played because it was a more consistent predictor of cellular changes.

snRNA-seq tissue isolation was performed once per each individual. Reproducibility was assessed through comparison to other published datasets^23,25. As detailed in Extended Data Figs. 3, 5 and 7, there was significant overlap between our subtypes and other previously published subtypes, highlighting that our results are highly reproducible. For all histological antibody, Nissl and in situ hybridization staining, individual cases were stained and analysed once per each experiment. Histologic methods were validated and optimized prior to the start of the experiment to ensure proper labelling and accurate downstream analyses as discussed in the previous sections.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Repeated head trauma causes neuron loss and inflammation in young athletes

Neuropathological Diagnosis

snRNA-seq

Processing, quality control and clustering of snRNA-seq data

Cell-type identification

Module analysis

Celda

hdWGCNA

External dataset comparison

MultiNicheNet

Histological tissue processing

smFISH and immunohistochemistry codetection

Image analysis

Software and code

Inclusion and ethics statement

Statistics and reproducibility

Reporting summary

Elusive ‘nuclear clocks’ tick closer to reality — after decades in the making

Static electricity is still a mystery — here’s what we know

I paused my PhD for 11 years to help save Madagascar’s seas

Most Popular

Amid legal turmoil, Kalshi is temporarily banned in Nevada

What happened at Nvidia GTC: NemoClaw, Robot Olaf, and a $1 trillion bet

How gig apps like Kled AI, Silencio, Neon Mobile, and Luel AI pay users for data that AI companies can use to train models,...

Why Wall Street wasn’t won over by Nvidia’s big conference

Recent Comments

ABOUT US

POPULAR POSTS

Amid legal turmoil, Kalshi is temporarily banned in Nevada

What happened at Nvidia GTC: NemoClaw, Robot Olaf, and a $1 trillion bet

How gig apps like Kled AI, Silencio, Neon Mobile, and Luel AI pay users for data that AI companies can use to train models,...

POPULAR CATEGORY