Data
The PF data were extracted from the FORCIS4,21 and ForCenS23 databases and covered the post-industrial and pre-industrial time periods, respectively. Seasons for both databases were distinguished between the Northern and Southern Hemispheres and followed the meteorological seasons. For the Northern Hemisphere, autumn was defined as September, October and November, winter as December, January and February, spring as March, April and May, and summer as June, July and August. For the Southern Hemisphere, spring was defined as September, October and November, summer as December, January and February, autumn as March, April and May, and winter as June, July and August.
FORCIS database
More than 188,000 subsamples (each comprising one single plankton aliquot collected within a certain depth range, time interval and size-fraction range, at a single location), taken from around 163,000 samples collected from different oceanographic environments using plankton nets (approximately 22,000 subsamples from about 6,000 samples), continuous plankton recorders (CPRs) (approximately 157,000 subsamples), sediment traps (approximately 9,000 subsamples) and plankton pumps (approximately 400 subsamples) from the global ocean (Extended Data Fig. 1), were extracted from the FORCIS database4,21. These data included published and unpublished post-industrial data from the literature between 1910 and 2018. The total abundances of the PF, and species-specific counts for these samples, were extracted from FORCIS.
ForCenS database
Around 5,000 samples, covering the pre-industrial period, form the basis of the ForCenS database23 of PF census counts from marine surface-sediment samples. This database only includes PF tests larger than 150âµm in diameter. In our study, the species abundances were normalized to 0 and 1 to represent species absence and presence, respectively.
Data and taxonomic harmonization
To perform data harmonization and address methodological and taxonomic biases, we converted the abundance data extracted from FORCIS into abundance (individuals per m3). Specifically, where coarse fractions were sampled using a mesh size greater than 100âμm, we employed the approach described in Chaabane et al.60 for standardization. This approach involves converting the abundance data extracted from different sampling devices, such as plankton nets and pumps, into a common unit of individuals per cubic metre (individuals per m3). To achieve this, we used size-normalized catch model equations, obtained from the sampling depths, on the total PF for cytoplasm-filled and empty tests. We were able to accurately quantify the abundances in the coarse fractions down to 100âμm, by applying the following equations:
$${C}_{{\rm{sz}}\_{\rm{norm}}}^{\infty }=\,{C}_{{\rm{sz}}\_\inf }^{{\rm{sz}}\_\sup }\,\frac{{f}_{\max }-{f}_{{\rm{sz}}\_{\rm{norm}}}}{{f}_{{\rm{sz}}\_\sup }-{f}_{{\rm{sz}}\_\inf }}$$
where sz_inf and sz_sup denote the lower and upper size limits of the measured size fraction, respectively, sz_norm represents the normalization size, and \({f}_{{\rm{Sz}}}\) is the multiplication factor associated with sz, calculated as follows:
$${f}_{Sz}=1+({f}_{\max }-1)\frac{({S}_{\sup ,k}-{S}_{\sup ,1})}{({S}_{\sup ,k}-{S}_{\sup ,1})+({S}_{{\rm{half}}}-{S}_{\sup ,1})}$$
with \({S}_{\sup ,1}\) and \({S}_{\sup ,{k}}\) being the upper size limits of size classes 1 and k, respectively, while Shalf and fmax are reported in ref. 60.
This facilitated computation of the assemblages as they would have appeared if all the material had been sampled using a 100âμm net, thus ensuring consistency across different sampling devices and conditions. Predicting the abundances of very small and rare species is still challenging, and so those data are not interpreted here.
The FORCIS dataset has been assessed and corrected for taxonomic discrepancies over the past century4, and although some species might have been misidentified, this appears relatively unlikely because, overall, mostly large PF, whose taxonomic determination is easier, were collected in the early part of the record (that is, due to large net mesh sizes in the 1960s). All species counts were generated for both the âlumpedâ and âvalidatedâ taxonomies proposed by the FORCIS group4 and based on the analytical reasons. For this study, several morphospecies were grouped (lumped) together for data analysis, such as G. ruber (Globigerinoides ruber albusâ+âGlobigerinoides elongatus), Globorotalia truncatulinoides (G. truncatulinoides leftâ+âG. truncatulinoides right) and T. sacculifer (T. sacculifer no sacâ+âT. sacculifer with sac). However, G. ruber ruber (pink) was analysed separately from G. ruber (white). Based on their species-specific habitat preferences, the PF assemblages were grouped into six provinces (Supplementary Table 1), that is, tropical, subtropical, temperate, subpolar, polar and global34. Additionally, they were split into three groups based on their food preference regime, that is, symbiont-bearing, symbiont-barren and facultatively symbiont-bearing, following the work of Takagi et al.61 and Hemleben et al.62.
Data analyses
To account for sampling biases, the following corrections were applied: (1) some analyses were focused on latitudinal bands to dilute the effect of under-sampled regions through time. The latitudinal bands were selected based on the assemblagesâ provinces and on Bé and Tolderlund63 and Schiebel and Hemleben34; (2) the sample depth selection was limited for most of the analyses to the depth where most of the individuals were sampled alive so as to not confuse between depth habitat and post-mortem assemblages; (3) most of the analyses were limited to the spring and summer blooms in the North Atlantic Ocean, where the sampling coverage and number of samples were highest (Extended Data Fig. 1); (4) the abundances were standardized from 0 to 1 to correct for the number of samples and to enhance their visualization in heatmaps; and (5) to analyse the first and second halves of the dataset, distinct cut-off points were established. To separate the dataset in time, into two fractions of equal size, the cut-off date was set at 1990. In Fig. 2, the cut-off year is set at 1997 because this figure is based solely on multinet data, which have only been available since 1989. This allowed the comparison of two datasets with similar amounts of information.
Environmental data
Temperature and Ωcalcite estimates associated with each PF from the North Atlantic samples (Fig. 4 and Extended Data Fig. 9) were taken from the Institut Pierre-Simon Laplace global climate model IPSL-CM6A-LR (ref. 64), in its Coupled Model Intercomparison Project Phase 6 (CMIP6) historical simulation, version r22i1p1f1, with a monthly resolution. The model output was converted to the World Ocean Atlas spatial grid, with a 1âÃâ1° resolution. The Ωcalcite was computed by dividing the modelled in situ carbonate-ion concentration (denoted âco3â in the CMIP6 nomenclature) by the carbonate-ion concentration at saturation with respect to calcite (denoted âco3satcalcâ). The temperature and Ωcalcite were then associated with each sample in the FORCIS database by linearly interpolating the model output temperature to the latitude, longitude, water depth and time of sampling. For the multinet, CPR and pump data, we used the averaged sampled depth. For the time series (Fig. 4aâf and Extended Data Fig. 9), we averaged the data annually, and over the latitudinal ranges in the Atlantic Ocean, but only considering those values that corresponded to locations where a PF sample was present. We extracted the locations and water depths corresponding to each PF sample from the IPSL-CM6A-LR SSP2-4.5 simulation, which corresponds to future predictions in response to the atmospheric CO2 growth trajectories of the SSP2-4.5 middle-of-the-road scenario, to calculate how temperature and Ωcalcite would change throughout the 21st century.
Biodiversity change
The species richness was calculated for each 3° latitudeâÃâ6° longitude box and for time for each data point from the plankton net, pump and CPR profile and sediment trap sample in FORCIS (Fig. 1a and Supplementary Figs. 1A and 2). To address the potential disparity between the sediment trap and plankton net data, the sediment trap samples, which were collected over an average period of 15 days, were compared to the plankton net profiles that consisted of multiple samples gathered from the same location and within the same time interval. The sediment trap collection periods captured a representative range of species comparable to those found in the plankton net profiles taken from similar locations (Supplementary Fig. 3). In addition, the latitudinal diversity gradient for the ForCenS data was determined by calculating the number of species present in each sample and at each latitude, and comparing these to those obtained from the FORCIS plankton net, pump and CPR profiles and sediment trap samples (Fig. 1b and Supplementary Fig. 1B). Then, for each 10° latitudinal bin, the 95th percentile of species richness was selected, assuming discrete sampling using nets, CPRs and pumps. These selected data points were then fitted with a generalized additive model to capture the variability in species richness along the latitudinal gradient. These analyses focused on 26 species that were present in both the ForCenS and FORCIS databases, using the same taxonomic criteria. Species that were absent from ForCenS, such as Bolivina variabilis, Tenuitellita fleisheri and Tenuitellita parkerae, were not considered because comparison between the datasets would not be possible. In a second step, our assessment aimed to determine any species loss or gain from the pre-industrial to the post-industrial time periods. For this analysis, the diversities in FORCIS and ForCenS were evaluated for each 4.5° latitudeâÃâ9° longitude box, considering data from the plankton net, pump and CPR profiles and sediment trap samples in FORCIS and each sample in ForCenS. The species richness from FORCIS was subtracted from that from ForCenS within each grid to identify any changes (losses or gains) in species richness. This analysis focused on the major species (26 species) on a presence/absence basis (Extended Data Fig. 3). To contrast the distribution patterns between the post-industrial (FORCIS) and pre-industrial (ForCenS) samples, the species counts were transformed into presence and absence data, ensuring uniform taxonomy across both databases. The data were visualized on a grid map, with each grid cell representing 2.8° of latitude and 5.6° of longitude. From the 26 species under consideration, nine were specifically chosen based on their main environmental niches (including polar and subpolar, tropical to subtropical, globally distributed and deep-sea species), significant relevance to palaeoceanography and the introduction of new insights (for example, G. uvula) (Extended Data Fig. 4).
Spatial and vertical migration of planktonic foraminifera
The speciesâ poleward migration was first assessed from the FORCIS database. Thus, the loss and gain of species in the northernmost limit of the North Atlantic and Arctic Oceans was assessed on the PF living at depth ranges of between 0 and 100âm during spring and summer before and after the cut-off year of 1990. The maximum latitude of the northernmost 5% of samples was calculated for all species before and after 1990. The difference between the maximum latitude before and after 1990 was evaluated to assess the direction and magnitude of the latitudinal migration of the different species (Extended Data Fig. 7). The selection of various cut-offs for spatial migration was influenced by the quantity of data available, both before and after applying specific filters, which were sometimes very few between 1910 and 1970.
The analyses were complemented by assessing the vertical occurrences of the PF through the water column. The vertical ranges of the PF were assessed for two latitudinal bandsâ0 to 30°âN and 30 to 50°âNâin the North Atlantic Ocean (Fig. 2). Only multinet data sampled across the upper 200âm and profiles with four or more samples and the same sampling resolution (for example, depth separations) during both spring and summer were selected. These data cover the period between 1980 and 2018. We focused on spring and summer due to the increased biological productivity at this time, the warmer sea-surface temperatures, and the greater data availability from the more-frequent research cruises occurring during these seasons. The depth of maximum abundance was calculated for each profile and for each species, and then the results before and after the cut-off year (1997) were compared (Fig. 2). To assess whether there were significant changes in depth associated with the maximum abundances of the different species in each latitudinal band and through time, an analysis of variance (ANOVA) was conducted (Fig. 2 and Supplementary Table 2).
Abundance changes
Changes through time
To determine the temporal variation in PF abundance, we analysed the PF per cubic metre (no. per m³) collected using plankton nets and pumps across different depths (0 to 200âm) and geographical regions (from the North Atlantic to the Arctic Oceans). These data were aggregated into three distinct latitudinal bands: 0 to 30°âN, 30 to 50°âN and 50 to 90°âN (Fig. 3). For each decade since 1940, we normalized the total abundance of PF for each species within these bands. Normalization from 0 to 1 was performed to facilitate comparison across species and regions by standardizing the data, removing the effects of differing scales of abundance. This approach allowed us to observe relative changes in abundance over time, making it easier to identify trends and patterns. To investigate whether there were significant changes in the abundances of different species in each latitudinal band, an ANOVA was conducted using the anova() function on the fitted regression models. The ANOVA helped to determine whether the observed variations in species abundance across time bins were statistically significant (Fig. 3 and Supplementary Table 3). The ANOVA calculated a P value that indicated the probability of observing the differences in abundance between species by chance alone. A low P value (below 0.05) suggested that the observed differences were unlikely to be due to random variation and more likely to represent real change. Then, similarly to the previous analysis, we assessed the abundances of the different species collected using plankton nets and pumps only, and over a larger depth range of between 0 and 300âm, from the North Atlantic to Arctic Oceans, and in each latitudinal band with a resolution from 20 to 40°âN and for each species (Extended Data Fig. 8).
Changes with latitude
For the data presented in Extended Data Fig. 2, only those studies that provided species counts for the Atlantic, Antarctic and/or Arctic Oceans, using plankton net data sourced from the FORCIS database, were considered because these presented the best temporal and spatial data coverage for statistically significant analysis. The selected studies covered the time period from 1980 to 2018 (Extended Data Fig. 2). The depth profiles of these data ranged from 0 to more than 100âm water depth. To examine the latitudinal abundance trend of PF species over time, the surface data (down to 100âm) and deep-water samples (below 100âm) from 1980 onwards were plotted against latitudes grouped in 10° intervals. This analysis specifically focused on the spring and summer seasons, which had the best-documented data.
Thermal habitat changes
To study the thermal preferences of the PF species in the global ocean, the abundances obtained from the plankton net, pump and CPR samples from above 100âm water depth were normalized from 0 to 1, and each species was assessed in each 1â°C binned and extracted in situ temperature from the Reanalysis Data Hadley EN 4.2.1 analyses g10 at 1âÃâ1° resolution, and assessed for the different time periods, before and after 1990 (Extended Data Fig. 5).
To test the effect of high temperatures on the PF assemblages at low latitudes (30°âS to 30°âN), the normalized abundances of the FORCIS samples, collected from data from net-tows during spring and summer over a depth range of 0 to 100âm and before and after the cut-off date (1990) (nâ=â1,000) (Extended Data Fig. 6), were plotted against their binned (at each 1â°C) in situ temperatures, between 15 and 32â°C, and sourced from the Hadley EN 4.2.1 analyses g10 at a 1âÃâ1° resolution. The mean values of the normalized abundances and standard errors between samples were assessed and plotted against the binned temperature to assess their evolution with increasing temperature. The number of profiles was assessed for each 1â°C temperature bin and per species (Supplementary Table 4). Cluster analyses were applied to the data, based on principal coordinate analyses before the Euclidean distance computation between the species, to cluster the species assemblages that had the same response. The Euclidean distance matrix was then computed from the scores of the two first principal components, with those consistently explaining more than 86% of the total variance (dim1â=â79.7%, dim2â=â6.9% before cut-off; dim1â=â66.6%, dim2â=â21.8%) (Extended Data Fig. 6).
R packages
The computational analysis conducted in this study utilized a variety of open-source tools in the R environment v.4.1.2 (ref. 65). In the comprehensive analysis of PF abundance variations, multiple high-performance R packages were deployed. For handling string manipulations and pattern matching, stringr was utilized66. dplyr allowed for robust data transformation and filtering67, while vegan was used to conduct the ecological multivariate data analyses68. Reading and writing the Excel files was made seamless using openxlsx69, whereas the phylogenetic and evolutionary analysis was facilitated by ape70. The package pheatmap allowed for the creation of heatmaps71. Clustering and partitioning of the data to identify patterns were executed using cluster72, while the multivariate data analysis results were extracted and visualized using factoextra73. The viridis package supplied colourblind-friendly colour palettes74, and tidyr enabled easier data cleaning and wrangling75. ggplot2 and ggpubr were used to create high-quality graphics76,77, with reshape2 and reshape facilitating the reshaping of the data structures78. The plotrix package provided additional plotting functionalities79. Performance and risk calculations were executed using PerformanceAnalytics80, while the correlation matrices were visualized using ggcorrplot81. For visualization scaling, scales was applied82. Spatial data visualization was carried out using ggspatial and ggmap83,84, and the geographical maps were drawn using maps85. The ggExtra package enriched the ggplot2 graphics86. Lastly, the gridExtra package enabled the arrangement of multiple grid-based plots87. This extensive usage of high-performance R packages significantly contributed to the robust, reproducible and efficient data analysis in this work.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.