Study areas
Twenty-one large forest dynamic plots of areas between 20 and 50 ha with similar numbers of tropical, subtropical and temperate forests were used in the current study (Extended Data Table 1 and Supplementary Table 1). The forest plots are part of the ForestGEO network6 and are located in Asia and the Americas ranging in latitudes from 6° 40′ N to 48° 08′ N. Tree species richness among these plots ranges from 36 to 468. All free-standing individuals with diameter at breast height (d.b.h.) ≥ 1 cm were mapped, their size measured and identified. We focused our analysis here on individuals with d.b.h. ≥ 10 cm (resulting in 313,434 individuals) and tree species with more than 50 individuals (resulting in initially 737 species). The 10-cm size threshold excluded most of the saplings and enabled comparisons with previous spatial analyses. Shrub species were excluded. We also excluded 15 species with low aggregation (that is, kff* < kfh*; Box 1), which would lead to negative growth rates at small abundance values: ten of them from BCI, two from MST, two from NBH and one from FS (for definitions of plot acronyms, see Extended Data Table 1). These (generally less abundant) species are probable relicts of an earlier successional episode when they were more abundant19,51. We also excluded the two species Picea mariana and Thuja occidentalis of the Wabikon forest that are restricted to a patch of successional forest that was logged approximately 40 years ago52.
Most forest plots (18 out of our 21 plots, those with more than 1 census) enabled the estimation of the average mortality risk of individuals with d.b.h. ≥ 10 cm within one census period. We estimated mortality across all species and obtained for each forest plot one average mortality rate for trees with d.b.h. ≥ 10 cm (Extended Data Table 1). We also determined for all species used in our analyses the mycorrhizal association types based on available global datasets53,54,55 and website sources (https://www.mycorrhizas.info/). To determine whether a species is mainly dispersed by animals (zoochory), we used the Seed Information Database (https://ser-sid.org/) of the Society for Ecological Restoration and the Royal Botanic Gardens Kew and available literature56. Species without descriptions of mycorrhizal associations and dispersal modes were assigned according to their congeneric species. The proportion of focal species with zoochory, with AM association and with both are shown in Extended Data Table 1.
Proxy for pairwise competition strength between species
Some of our analyses required the ratio βfj/βff (Box 1) that describes the relative competitive effect of individuals of species i on an individual of the focal species f at the neighbourhood scale36,57. In general, it is challenging to derive estimates for the pairwise competition coefficients37 because this would require unfeasibly large datasets to obtain a sufficient number of neighboured f–j species pairs for less abundant species. We therefore compared two scenarios. In scenario 1, we assumed that conspecific and heterospecific individuals compete equally, thus βfj/βff = 1. In scenario 2, we assumed that individuals that are close relatives compete more strongly or share more natural competitors or pathogens than distant relatives58 (that is, βfj/βff < 1). As proxy for this effect, we used phylogenetic distances59, given in millions of years (Myr), as a surrogate for the relative competition strength because they are available for the species in our plots based on molecular data or the Phylomatic informatics tool60. To obtain consistent measurements for the ratio βfj/βff among forest plots, phylogenetic similarities βfj/βff were scaled between 0 and 1, with conspecifics set to 1 and a similarity of 0 assumed for a phylogenetic distance of 1,200 Myr, which was larger than the maximal observed distance (1,059 Myr). This was necessary to avoid discounting crowding effects from the most distantly related neighbours58.
For plots without molecular data, we used the V.PhyloMaker2 package (v.0.1.0)61 to generate a phylogenetic tree for each plot using GBOTB.extented.WP.tre updated from the dated megaphylogeny GBOTB62 as a backbone. For the other eight plots with molecular data, we followed a previously reported method63 to build the phylogenetic tree based on DNA barcode data. We then used the cophenetic function in the picante package (v.1.8.2)64 to calculate phylogenetic distance for each plot. For this, we assumed that functional traits are phylogenetically conserved36,58,65. The analyses to generate phylogenetic trees and to calculate phylogenetic distances were performed using R (v.4.3.2)66.
Crowding indices describing competition of individual trees at the neighbourhood scale
We assumed in our example model that survival of a focal tree k is reduced in areas of high local density of conspecifics and heterospecifics (that is, neighbourhood crowding), for example, through competition for space, light or nutrients, or predators or pathogens17,18,39,67, whereas reproduction is density-independent with per capita rate rf. However, our approach was also able to deal with alternative assumptions on the processes driven by neighbourhood crowding. For example, analogous models were derived for crowding effects on the reproductive rate and/or the establishment of offspring (Supplementary Text).
We describe the neighbourhood crowding around a given tree o of a focal species f by commonly used neighbourhood crowding indices36,37,58,68,69,70 (Box 1), but used separate indices for conspecific and heterospecific trees. The conspecific crowding index Cof of a given individual o of a given focal species f counts the number nf of conspecific neighbours j that have a distances doj smaller than a given neighbourhood radius r, but weights each neighbour o by its inverse distance 1/doj, assuming that farther away neighbours compete less (equation (7a)). The heterospecific crowding index Hof does the same with all heterospecifics (equation (7b)), and the heterospecific interaction crowding index Iof weights heterospecifics additionally by their relative competitive strength βfj/βff (equation (7c); see above ‘Proxy for pairwise competition strength between species’). Thus, we estimated for each individual o three crowding indices:
conspecific crowding:
$${C}_{of}=\mathop{\sum }\limits_{j=1}^{{n}_{f}}\frac{1}{{d}_{oj}}$$
(7a)
heterospecific crowding:
$${H}_{{of}}=\sum _{i\ne f}\mathop{\sum }\limits_{j=1}^{{n}_{i}}\frac{1}{{d}_{{oj}}}$$
(7b)
with niche differences:
$${I}_{of}=\sum _{i\ne f}\mathop{\sum }\limits_{j=1}^{{n}_{i}}\frac{{\beta }_{fi}}{{\beta }_{ff}}\frac{1}{{d}_{oj}}$$
(7c)
where ni is the number of neighbours of species i within distance r of the focal individual, doj is the distance between the focal individual o and its jth neighbour of species i, and βfi/βff is the competitive effect of one individual of species i relative to that of the focal species f (refs. 36,37,68,69,70).
Survival probability of individual trees
To link the survival of an individual o to its crowding indices Cof and Iof, we followed earlier work on individual neighbourhood models36,37,58,69,70 and assumed that the survival probability sof of a tree o of species f is given by
$${s}_{{of}}={s}_{f}\,\exp (\,-\,{\beta }_{{ff}}({C}_{{of}}+{I}_{{of}})),$$
(8)
where sf is a density-independent background survival rate of species f and βff is the neighbourhood-scale conspecific competition coefficients of species f (ref. 36). Statistical analyses with neighbourhood crowding indices have shown that the growth and survival of trees depend on their neighbours mostly within distances r of up to 10 or 15 m (ref. 70). We therefore estimated all measures of spatial neighbourhood patterns with a neighbourhood radius of r = 15 m.
Average survival rate of species
We used scale transition theory42 and spatial point process theory25 to transfer the individual-based microscale information on the number and distance of conspecific and heterospecific neighbours of focal individuals, which are provided by the ForestGEO census maps, into macroscale models of community dynamics8. To this end, we averaged the survival probabilities sof of all individuals o of the focal species f (equation (8)), to obtain the average population-level survival rate \({\bar{s}}_{f}\), for which we derived a closed-form expression for gamma-distributed crowding indices8:
$${\bar{s}}_{f}={s}_{f}\,\exp (\,-\,{\beta }_{{ff}}({\gamma }_{{fC}}\bar{{C}_{f}}+{\gamma }_{{fI}}\bar{{I}_{f}}))$$
(9)
where \(\bar{{C}_{f}}\) and \(\bar{{I}_{f}}\) are the average crowding indices, and γfC = ln(1 + DfC βff)/(DfC βff) and γfI = ln(1 + DfI βff)/(DfI βff) arise through the averaging step because of the nonlinearity in equation (8)8, and are driven by the dispersion (that is, the variance-to-mean ratio) DfC and DfI of the distribution of the crowding indices \(\bar{{C}_{f}}\) and \(\bar{{I}_{f}}\), respectively. In our case of high survival, where DfC βff and DfI βff are both small, γfC and γfI are near one and can be neglected (that is, γfC ≈ 1 and γfI ≈ 1).
Link between average crowding indices and spatial patterns
To incorporate the population-level survival rate (equation (9)) into our population model, we decomposed the average crowding indices into species abundance and measures of spatial patterns (Box 1). In brief, we did this by expressing the crowding indices in terms of the pair correlation function, a basic summary function of spatial statistics25, and the mean density λf = Nf/A of the species f across the whole plot of area A (see equations (S1)–(S8) in the Supplementary Text). The resulting measures kff and kfh of spatial patterns quantify the increase or decrease in average conspecific and heterospecific neighbourhood crowding, respectively, relative to the reference case without spatial patterns. For conspecifics, we expect under a random distribution of the focal species a mean crowding index of \(\bar{{C}_{f}}={c}{N}_{f}\) (where c = 2π r/A is a scaling factor, with A being the area of the plot and r the radius of the neighbourhood; see equation (S7) in the Supplementary Text), and for heterospecifics, we expect under independent placement (of the focal species with respect to the heterospecifics) a mean crowding index of \(\bar{{H}_{f}}=c\sum _{i\ne f}{N}_{i}\) (ref. 25). We therefore obtained
$$\,\bar{{C}_{f}}={k}_{{ff}}({c}{N}_{f})$$
(10a)
$$\bar{{H}_{f}}={k}_{{fh}}(c\sum _{i\ne f}{N}_{i})$$
(10b)
$$\bar{{I}_{f}}=\mathop{\underbrace{\frac{\bar{{I}_{f}}}{\bar{{H}_{f}}}}}\limits_{{B}_{f}}\bar{{H}_{f}}\,=\,{B}_{f}\,\mathop{\underbrace{{k}_{{fh}}\,(c{\sum }_{i\ne f}{N}_{i})}}\limits_{\bar{{H}_{f}}},$$
(10c)
where we define Bf in equation (10c) as \({B}_{f}=\bar{{I}_{f}}/\bar{{H}_{f}}\) to be the average competitive strength of one heterospecific neighbour relative to that of one conspecific. In a subsequent step, we assumed that Bf is approximately constant in time (that is, our mean-field approximation).
The quantity kff in equation (10a) measures spatial patterns in conspecific crowding of species f (kff > 1 indicates aggregation, and kff < 1 regularity), and the quantity kfh in equation (10b) measures patterns of heterospecific association around the focal species f (kfh < 1 indicates segregation, and kfh > 1 attraction). Note that our measure of conspecific aggregation, which weights neighbours by distance, is highly correlated to Condit’s omega measure of aggregation20 that counts the number of neighbours without weighting by distance (Extended Data Fig. 1). We also found that the strength of the latitudinal gradient in the exponent of the aggregation–abundance relationship (expressed as the R2) was for a radius of, for example, r > 10 m basically independent of the neighbourhood area over which conspecific aggregation was measured (Extended Data Fig. 2c,f). This was expected because of the distance-weighting (Box 1), whereby distant neighbours contribute little to total neighbourhood crowding.
Mean-field assumption
A crucial insight used in our approach8 is that crowding competition of individual trees, as described by equation (8), leads to diffuse competition at the population level in species-rich communities. That is, when taking a mean-field approximation43,71, the species-specific competition strengths of heterospecifics can be replaced in the macroscale model by a temporally constant average heterospecific competition strength Bf (ref. 8) (equation (10c)), which summarizes the emerging effects of the pairwise neighbourhood-scale competition coefficients βfi/βff at the population level. For species-rich forests at or near a stationary state, Bf is a good approximation of a species-specific constant (see the supplementary text in ref. 8). As we will see, a constant Bf simplifies the matrix of the community-level competition coefficients (equation (11d)) and enables analytical expressions of the invasion condition and the equilibria Nf* of our multispecies model (equations (11a) and (11b)) for the case that aggregation is independent of abundance.
Zero-sum assumption
Local density dependence on survival as assumed here (equation (8)) controls local tree densities and causes approximate zero-sum dynamics2, in which the total number J of individuals remains approximately a constant J*. For example, zero-sum dynamics emerged in our individual-based simulations of the extended multispecies model (blue lines in Extended Data Fig. 5f–j). The number of heterospecifics is therefore given in good approximation by \(\sum _{i\ne f}{N}_{i}={J}^{* }-{N}_{f,t}\). Using the zero-sum approximation together with the mean-field approximation (equation (10c)) in equation (11f) decouples the multispecies dynamics and enabled us to investigate the dynamics of individual species in good approximation (equation (13)).
The corrected aggregation–abundance relationship
In Fig. 2 and Extended Data Fig. 3, we fitted a phenomenological power-law for the species of a given forest plot, where the x value was the logarithm of abundance Nf and the y value the corresponding logarithm of kff (refs. 21,27,28). However, this led to values of kff close to zero for large abundance values, which would indicate strongly regular patterns25 not found in the data. Instead, in the extreme case without an aggregation mechanism (that is, random placement of offspring), crowding competition will lead to the repulsion of conspecifics comparable to heterospecific association kfh of heterospecifics. Thus, to avoid a bias, we used the quantity kff – kfh as the y value in our fit (Extended Data Fig. 9). This required the assumption that kff > kff, which was satisfied for 98% of our species (722 out of a total of 737 species; see the section ‘Study areas’). We leave investigation of the specific cases where kff < kfh to future studies.
Basic multispecies model M1
Our basic multispecies model (M1) for the per capita growth rate of species f is given by
$$\frac{{N}_{f,t+1}-{N}_{f,t}}{\Delta t}\frac{1}{{N}_{f,t}}={\widetilde{\lambda }}_{f}({N}_{f,t})=({r}_{f}-1)+{s}_{f}\exp (-{\beta }_{ff}{W}_{f}({N}_{f,t})),$$
(11a)
where Nf,t is the abundance of species f at time step t, sf is a density-independent per capita background survival rate, rf is the per capita recruitment rate and βff is the neighbourhood-scale conspecific competition coefficients of species f. The biological information on neighbourhood crowding competition was incorporated into the fitness factor42 Wf, which results from combining equations (9) and (10):
$${W}_{f}({N}_{f})=c\{{k}_{{ff}}{N}_{f}+{B}_{f}{k}_{{fh}}\sum _{i\ne f}{N}_{i}\}.$$
(11b)
The community-level competition coefficients αfi of this model are therefore given by
$${\alpha }_{{ff}}\,=\,c\,{\beta }_{{ff}}\,{k}_{{ff}}$$
(11c)
$${\alpha }_{{fi}}\,=\,c\,{\beta }_{{ff}}\,({B}_{f}\,{k}_{{fh}})\,{\rm{for}}\,i\ne f.$$
(11d)
Thus, even if the conspecific neighbourhood-scale competition coefficients βff were constant, the corresponding community-level coefficients αff were not necessarily constant. Instead, they depended on abundance if aggregation kff depended on abundance8, as observed in many of our forest dynamics plots (Fig. 2). Not considering the effect that crowding can have on the competition coefficients αff and αfi is a common (implicit) assumption of models used in coexistence theory4,15,42,47,49. Note that equation (11a) together with the fitness factor of equation (11b) enabled us to construct a reference model for the constants kff and kfh, for which the equilibria Nf* and the conditions for feasibility and invasiveness can be analytically derived8,72, and the corresponding non-spatial model (that is, \({k}_{{ff}}^{* }={k}_{{fh}}^{* }=1\)) without niche differences (that is, Bf = 1) led to a per capita population growth rate of zero for all abundances.
Extended multispecies model M2
However, we wanted to extend our model M1 (equations (11a) and (11b)) to include a dependence of aggregation on abundance as new aspect. To introduce this new model, we first defined \({k}_{{ff}}^{* }\) and \({k}_{{fh}}^{* }\) as the observed value of aggregation and heterospecific association, respectively, and kff as aggregation that depended on abundance. We then assumed that heterospecific association \({k}_{{fh}}^{* }\) is independent of abundance (as suggested by Extended Data Fig. 4) and that the quantity (kff – kff) follows a power law with respect to abundance (Extended Data Fig. 9 and the section ‘The corrected aggregation–abundance relationship’). To formulate our extended multispecies model (M2), we therefore rewrote the fitness factor of equation (11b) by adding and subtracting the term \({k}_{{fh}}^{* }{{N}}_{f,t}\):
$${W}_{f}({N}_{f,t})=c\{({k}_{{ff}}^{* }-{k}_{{fh}}^{* }){N}_{f,t}+{k}_{{fh}}^{* }{N}_{f,t}+{B}_{f}{k}_{{fh}}^{* }\sum _{i\ne f}{N}_{i,t}\}.$$
(11e)
We obtained our extended fitness factor
$${W}_{f}({N}_{f,t})=c\left\{({k}_{ff}^{\ast }-{k}_{fh}^{\ast }){\left(\frac{{N}_{f,t}}{{N}_{f}^{\ast }}\right)}^{{b}_{f}}{N}_{f,t}+{k}_{fh}^{\ast }{N}_{f,t}+{B}_{f}{k}_{fh}^{\ast }\sum _{i\ne f}{N}_{i,t}\right\}$$
(11f)
by replacing the quantity (\({k}_{{ff}}^{* }-{k}_{{fh}}^{* }\)) in equation (11e) with a new function L(Nf,t) depending on abundance:
$$L({N}_{f,t})=({k}_{{ff}}^{* }-{k}_{{fh}}^{* }){\left(\frac{{N}_{f,t}}{{N}_{f}^{* }}\right)}^{{b}_{f}},$$
(12)
where Nf* is the observed species abundance. Equation (12) simplifies for the observed abundance (that is, Nf,t = Nf*) to \(L({N}_{f}^{* })=({k}_{{ff}}^{* }-{k}_{{fh}}^{* })\), but otherwise represents the desired power-law with respect to abundance.
Decoupled multispecies models M3
The overall objective of our extended model was to study the effect that a possible dependence of aggregation on abundance (Fig. 2) has on the ability of a newly invading (or an almost extinct) species to increase its abundance. Therefore, we decoupled our extended multispecies model M2 into multiple single-species models (M3) by assuming approximate zero-sum dynamics2 (see the section ‘Zero-sum assumption’ above), in which the number of heterospecifics is given by \(\sum _{i\ne f}{{N}}_{i}={J}^{* }-{N}_{f,t}\). We obtained from equation (11f) the fitness function
$${W}_{f}({N}_{f,t})=c\left\{({k}_{{ff}}^{* }-{k}_{{fh}}^{* }){\left(\frac{{N}_{f,t}}{{N}_{f}^{* }}\right)}^{{b}_{f}}{N}_{f,t}+(1-{B}_{f}){k}_{{fh}}^{* }{N}_{f,t}+{B}_{f}{k}_{{fh}}^{* }\,{J}^{* }\right\}$$
(13)
that results together with equation (11a) in a closed-form expression for the per capita population growth rate \({\widetilde{\lambda }}_{f}({N}_{f})\) of the focal species f at low abundance, which nevertheless includes the key information on crowding competition with heterospecifics through the parameters Bf and \({k}_{{fh}}^{* }\), and total community size J*.
Model parameterization
Our extended multispecies model M2 (equations (11a) and (11f)) approximated an underlying individual-based model in the tradition of earlier spatially explicit work7,35,44,46,73 (Extended Data Fig. 5), but used the empirically observed spatial patterns instead of explicitly modelling their dynamics8. In the most general case, the models M2 and M3 have therefore seven parameters per species: three demographic parameters (rf, sf and βff); three parameters that quantify the spatial patterns (that is, \({k}_{{ff}}^{* },{k}_{{fh}}^{* },{B}_{f}\)); and the exponent bf of the aggregation–abundance relationship. The parameterization of the models depended on the objective of the model application and the available data. In the example application of our theory, we wanted to feature the effects of spatial patterns (Fig. 2) on coexistence, and we derived a specific parameterization adapted to our data and objective.
We estimated the species-specific measures \({k}_{{ff}}^{* }\) and \({k}_{{fh}}^{* }\) of the observed spatial patterns directly from the ForestGEO plot data based on equations (7) and (10). The exponent bf of the corrected (power law) aggregation–abundance relationship (equation (12)) was estimated by linear regression, where the x value was ln(Nf) and the y value the corresponding ln(kff − kfh) (Extended Data Fig. 9). The parameter bf captured for a given forest plot the average species response of aggregation to abundance (Extended Data Fig. 9), and we used this value as a parameter for all focal species. This approach is based on a species-for-time substitution, which assumes that the power-law exponent bf derived for multiple species of one census would be the same as the bf derived for the same species but at multiple points in time. The results of our individual-based simulations (Extended Data Fig. 5a–f) supported this assumption. Note that effects of habitat association or details of dispersal will influence the values of kff* (and kfh*) and contribute to the observed departures from the power law aggregation–abundance relationship, particularly for tropical forests (compare Extended Data Figs. 3 and 5).
We assumed that all mortality was driven by neighbourhood crowding and therefore set the background survival rate to sf = 1. For estimation of the parameter Bf, we used the matrix of βfi/βff (see the section ‘Proxy for pairwise competition strength between species’ above) and derived the crowding indices Hof and Iof for each individual o (equations (7b) and (7c)). The value of Bf was then given by \({B}_{f}=\bar{{I}_{f}}/\bar{{H}_{f}}\), where \(\bar{{I}_{f}}\) and \(\bar{{H}_{f}}\) are the population-level averages of the individual crowding indices Hof and Iof.
To determine the unknown value of βff, the neighbourhood-scale conspecific competition coefficients of species f that determines the strength of crowding competition, we assumed that the focal species is close to equilibrium (that is, \({\widetilde{\lambda }}_{f}({N}_{f}^{* })\) = 0) and obtained by rewriting equation (11a):
$${\beta }_{{ff}}=-\,\mathrm{ln}((1-{r}_{f})/{s}_{f})/{W}_{f}({N}_{f}^{* })$$
(14)
At equilibrium, we found with equation (11b) that \({W}_{f}({N}_{f}^{* })\,=\)\(c{k}_{ff}{N}_{f}+{\rm{constant}}\), thus all else equal, βff is negatively related to the observed abundance: species with lower observed abundance experienced stronger negative impacts on conspecifics than more common species, a pattern frequently observed in plant communities22,67,73,74,75,76,77. The effect of departures from the equilibrium assumption on the invasion criterion can be assessed by using equation (S17) in the Supplementary Text.
Inserting equation (14) into equation (11a) led to our final equation for the per capita population growth rate, for which the fitness function is given by equation (13):
$$\frac{{N}_{f,t+1}-{N}_{f,t}}{\Delta t}\frac{1}{{N}_{f,t}}={\widetilde{\lambda }}_{f}({N}_{f,t})=({r}_{f}-1)+{s}_{f}{\left(\frac{1-{r}_{f}}{{s}_{f}}\right)}^{\frac{{W}_{f}({N}_{f,t})}{{W}_{f}({N}_{f}^{* })}}$$
(15a)
For small per capita recruitment rates rf and large background survival (sf = 1), we obtained
$$\frac{{N}_{f,t+1}-{N}_{f,t}}{\Delta t}\frac{1}{{N}_{f,t}}={\widetilde{\lambda }}_{f}({N}_{f,t})\approx {r}_{f}\left(1-\frac{{W}_{f}({N}_{f,t})}{{W}_{f}({N}_{f}^{* })}\right),$$
(15b)
which suggests the use of the scaled per capita population growth rate \({\widetilde{\lambda }}_{f}({N}_{f,t})/{r}_{f}\) to remove the deterministic effects of the recruitment rate rf. Note that the effect of individual variation in model parameters on the invasion criterion can be directly investigated by using equations (13) and (15).
Adding a small immigration rate
We reformulated our mathematical model (equation (15a)) in terms of the change in the number of individuals during one time step and added a small constant immigration rate rf vf (ref. 41) (rf is the reproduction rate), thus:
$$\frac{{N}_{f,t+1}-{N}_{f,t}}{\Delta t}={\widetilde{\lambda }}_{f}({N}_{f,t}){N}_{f,t}+{v}_{f}\,{r}_{f},$$
(16)
which is equivalent to adding the term vf rf/Nf,t to the per capita population growth rate (equation (15a)). This approach differs from the way immigration is usually modelled in neutral theory2,26,78. Note that the term vf rf/Nf,t decreases rapidly with increasing abundance Nf,t and does therefore influence the dynamics only for small abundance values, particularly if the values of vf is small as assumed here (Extended Data Fig. 7a–d).
The invasion criterion
Stable coexistence requires that the abundance of a newly invading (or an almost extinct) species increases1,5,8,9 (that is, a rare species advantage), and we wanted to investigate the effect of spatial patterns on the invasion criterion. On the basis of our closed-form expression for the per capita population growth rate \({\widetilde{\lambda }}_{f}({N}_{f})\) (equations (13) and (15a)), invasion analysis reduces to the task of identifying the conditions under which \({\widetilde{\lambda }}_{f}({N}_{s})\) maintains a sufficiently high value for low abundance Ns. We therefore demanded that \({\widetilde{\lambda }}_{f}({N}_{s})\) should be slightly larger than zero (\({\widetilde{\lambda }}_{\min }\)) so that species can escape the effects of demographic stochasticity; that is, our invasion criterion is given by \({\widetilde{\lambda }}_{f}({N}_{s}) > {\widetilde{\lambda }}_{\min }\).
From equation (15a) we found that the invasion condition \({\widetilde{\lambda }}_{f}({N}_{s}) > {\widetilde{\lambda }}_{\min }\) translated into the condition
$${W}_{f}({N}_{s})/{W}_{f}({N}_{f}^{* }) < 1-\delta ,$$
(17a)
with \(\delta =1-{\rm{ln}}(({\widetilde{\lambda }}_{\min }-{r}_{f}+1)/{s}_{f})/{\rm{ln}}((1-{r}_{s})/{s}_{f})\). For our case of small reproduction rates rf and high background survival (that is, sf = 1), we obtained \(\delta \approx {\widetilde{\lambda }}_{\min }/{r}_{f}\). We therefore investigated the ratio \({W}_{f}({N}_{s})/{W}_{f}({N}_{f}^{* })\) in more detail, which can be expressed as (equation (S14) in the Supplementary Text):
$$\frac{{W}_{f}({N}_{s})}{{W}_{f}({N}_{f}^{* })}=\frac{{\left(\frac{{N}_{s}}{{N}_{f}^{* }}\right)}^{{b}_{f}+1}+{\kappa }_{f}\,(\frac{{N}_{s}}{{N}_{f}^{* }})+{\rho }_{f}}{1+{\kappa }_{f}+{\rho }_{f}}$$
(17b)
with
$${\kappa }_{f}=(1-{B}_{f})\frac{{k}_{{fh}}^{* }}{{k}_{{ff}}^{* }-{k}_{{fh}}^{* }}$$
(17c)
$${\rho }_{f}={B}_{f}\frac{{k}_{{fh}}^{* }}{{k}_{{ff}}^{* }-{k}_{{fh}}^{* }}\frac{{J}^{* }}{{N}_{f}^{* }}$$
(17d)
The condition in equation (17a) led together with equation (17b) to the invasion criterion
$$1-\frac{{W}_{f}({N}_{s})}{{W}_{f}({N}_{f}^{* })}=\frac{1-{\left(\frac{{N}_{s}}{{N}_{f}^{* }}\right)}^{{b}_{f}+1}+{\kappa }_{f}(1-\frac{{N}_{s}}{{N}_{f}^{* }})}{1+{\rho }_{f}+{\kappa }_{f}} > \delta .$$
(18)
For the case without niche differences (that is, κf = 0), equation (18) suggests that two main mechanisms allow species to increase their rare species advantage: either a smaller negative value of the exponent bf of the aggregation–abundance relationship or a smaller value of ρf (Fig. 4c). We therefore name ρf ‘risk factor’, because larger values of ρf lead to smaller values of the per capita growth rate. The maximal value of the risk factor that satisfies the invasion criterion is given by
$${\rho }_{f,\max }=\frac{1}{\delta }\left\{(1-\delta )-{\left(\frac{{N}_{s}}{{N}_{f}^{* }}\right)}^{{b}_{f}+1}+{\kappa }_{f}\left[(1-\delta )-\left(\frac{{N}_{s}}{{N}_{f}^{* }}\right)\right]\right\}$$
(19)
Thus, for each combination of \({N}_{s}/{N}_{f}^{* }\), bf, κf and δ, we obtained one critical value of ρf,max in which the species can invade if ρf is smaller than this critical value (Fig. 4c).
It is instructive to investigate the biological mechanism that determines the values of the three factors ρf, \({({N}_{s}/{N}_{f}^{* })}^{{b}_{f}+1}\) and κf that determine the invasion criterion (equations (18) and (19)). First, the risk factor ρf (equation (17d)) becomes smaller if the observed relative species abundance Nf*/J* of the focal species in the community becomes larger (this favours forests at higher latitudes; Extended Data Fig. 8a), if aggregation increases (that is, \({k}_{{fh}}^{* }/({k}_{{ff}}^{* }-{k}_{{fh}}^{* })\) decreases; Extended Data Fig. 8d), and if niche differences become larger (that is, Bf decreases). Because of the overwhelming effect of the relative abundance, the risk factor ρf decreases strongly with latitude for our dataset (Extended Data Fig. 8c) and for species with lower abundance, aggregation \({k}_{{ff}}^{* }\) would increase and reduce ρf. Second, a smaller negative value of the exponent bf of the aggregation–abundance relationship leads to a smaller value of \({({N}_{s}/{N}_{f}^{* })}^{{b}_{f}+1}\) (this favours forests at lower latitudes; Extended Data Fig. 8b).
Finally, as expected, it is more likely that the invasion criterion is fulfilled if niche differences become larger (that is, Bf becomes smaller). In this case, ρf becomes smaller (equation (17d)) and κf becomes larger (equation (17c)), and because \({N}_{s}/{N}_{f}^{* }\) << (1 – δ), the inequality of equation (19) is easier to satisfy. The invasion criterion (equation (19)) can be satisfied for niche differences even if bf < −1. Indeed, some species of the CBS plot, which showed an exponent of bf = −1.077, fulfil the invasion criterion, mostly owing to weak aggregation that led to large values of kfh/(kff – kfh) and larger κf (Extended Data Fig. 7e,f). Thus, as expected, the stabilizing mechanism of niche differences (Bf < 1) has a key role if spatial patterns alone provide only weak stabilization.
Scenarios investigated
We considered four scenarios to investigate the effects of the spatial mechanism of neighbourhood crowding and immigration on the ability of the 720 study species to increase when having low abundance. Scenario 1 assumed that conspecifics and heterospecifics compete equally (that is, no niche differences; βfi = βff, Bf = 1), and scenario 2 considers niche differences between species approximated by phylogenetic dissimilarity (see the section ‘Proxy for pairwise competition strength between species’ above). Scenarios 3 and 4 are the same as scenarios 1 and 2, but also assume a small constant immigration with parameter vf = 0.1. For the mean reproduction rate of rf = 0.1 per time step across all plots, this results in an immigration rate of rf vf = 0.01, or 1 immigrant every 100 time steps.
Spatially explicit simulation model
Model description
This model description was adapted from a previous publication8. We used the individual-based simulations to verify that the observed patterns (that is, the power law of the abundance–aggregation relationship) can emerge in principle from the minimal mechanisms included in the spatial multispecies model (equations (8) and (11a)).
The individual-based model considered only reproductive (adult) trees, but no size differences. During a given 5-year time step, the model simulated first stochastic recruitment of reproductive trees and placement of recruits, and second, stochastic survival of adults as given by equation (8), depending on their neighbourhood crowding indices (equations (7a)–(7c)) estimated from the community of adult trees. In the next time step, the recruits counted as reproductive adults and were subject to mortality. No immigration from a metacommunity was considered (that is, vf = 0; equations (1a)). To avoid edge effects, torus geometry was assumed.
Each individual produced rf recruits on average, and their locations were determined by a type of Thomas process25 to obtain a clustered distribution of recruits. In our model, the spatial position of the recruits was determined using two independent mechanisms. First, a proportion 1 – pd of recruits was placed stochastically around randomly selected conspecific adults (parents) by using a two-dimensional kernel function (here a Gaussian with variance σ2). This is the most common way in most spatially explicit models to generate species clustering. Technically, we first selected for each of these recruits randomly one parent among the conspecific adults and then determined the position of the recruit by sampling from the kernel. Second, the remaining proportion pd of recruits was distributed in the same way, but around randomly placed cluster centres that are located independently of conspecific adults.
Parameterization of the simulation model
The simulation model used here is described in detail in a previous study8. However, we used here distance-weighting for the estimation of the crowding indices. Thus, in the source code (the supplementary information in ref. 8) we used DistanceWeighting = 1 instead of DistanceWeighting = 0.
The simulations of the individual-based forest model were conducted in 5-year time steps over 25,000 years (equivalent to 5,000 census periods) in an area of A = 200 ha, and comprised approximately 86,000 trees with initially 80 species. The model parameters were the same for all species, and all species followed exactly the same model rules. We selected βfi = βff to obtain no differences in conspecific and heterospecific interactions and sf = 1 (no background mortality), a standard deviation of σ = 10 m of the kernel function, and we adjusted the parameters βff = 0.02 and rf = 0.1 to produce tree densities (430 per ha) and an overall 5-year mortality rate (10%) similar to that of trees with d.b.h. ≥ 10 cm of the BCI plot. The radius of the neighbourhood used to estimate the crowding indices was r = 20 m, and the number of randomly assigned cluster centres was 16.
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.