A critical initialization for biological neural networks

May 21, 2026

5

All experimental procedures were conducted according to the Institutional Animal Care and Use Committee (IACUC) at Howard Hughes Medical Institute (HHMI) Janelia. Data analysis and simulations were performed in Python using pytorch and numpy, and figures were made using matplotlib and jupyter-notebooks^{61,62,63,64,65}.

Data acquisition

Animals

All experimental procedures were conducted according to IACUC ethics approval received from the IACUC board at HHMI Janelia Research Campus. We performed 18 recordings in cortex in: (1) 12 mice bred to express jGCaMP8s⁶⁶ in excitatory neurons: TetO-jGCaMP8s × Camk2a-tTA mice (available as JAX 037717 and JAX 007004); (2) 3 mice bred to express jGCaMP8s⁶⁶ in the somas of excitatory neurons: riboL1–jGCaMP8s × Slc17a7-Cre (similar to JAX 039267 without IRES; and JAX 037512) and (3) 3 mice bred to express tdTomato in inter-neurons VGAT-CRE × Ai14 (JAX 016962; JAX 007914) with injections of a dual virus Thy1s:TTA (AAV9, 1.64 × 10¹³ vector genomes ml⁻¹) and TRE3G:RiboL1–jGCaMP8s (AAV9, 2.45 × 10¹³ vector genomes ml⁻¹) as in ref. ⁶⁷, see also ref. ⁶⁸. We also performed eight recordings in hippocampal CA1 in six mice bred to express GCaMP6f in excitatory neurons: Thy1-GCaMP6f GP5.17 mice (JAX 025393)⁶⁹, as well as in two wild-type C57 mice (JAX 000664) with injections of the same RiboL1-jGCaMP8s virus combination described above. These mice were male and female, and ranged from 2 to 12 months of age. Mice were housed in reverse light cycle, and were pair-housed with their siblings before and after surgery. Holding rooms are set to a temperature of 70 °F ± 2 °F, and humidity of 50% relative humidity ± 20%.

Surgical procedures

Surgeries were performed in adult mice (P35–P125) following procedures outlined in refs. ^70,71. In brief, mice were anaesthetized with isoflurane while a craniotomy was performed. Marcaine (no more than 8 mg kg⁻¹) was injected subcutaneously beneath the incision area, and warmed fluids plus 5% dextrose and buprenorphine 0.1 mg kg⁻¹ (systemic analgesic) were administered subcutaneously along with dexamethasone 7 mg kg⁻¹ by the intramuscular route. In the canula implants, the same total dexamethasone dose was administered tapered over 3 days: 4 mg kg⁻¹ on the first day, 2 mg kg⁻¹ on the second day and 1 mg kg⁻¹ on the third day.

For the visual cortical windows (which included the posterior parietal cortex), measurements were taken to determine the bregma–lambda distance and location of a 4-mm circular window over the V1 cortex, as far lateral and caudal as possible without compromising the stability of the implant. For the sensorimotor windows, the craniotomy was centred at −0.75 mm anteroposterior (AP) and 2.2 mm mediolateral (ML) from bregma. A 4- and 5-mm double window was placed into the craniotomy so that the 4-mm window replaced the previously removed bone piece and the 5-mm window lay over the edge of the bone. For the hippocampal windows, the craniotomy was centred at 1.8 mm AP and 2.0 mm ML from bregma. Cortex was aspirated and a 3-mm glass coverslip attached to a stainless-steel was implanted over the dorsal CA1 region. CA1 surgeries were similar to those described in ref. ⁷¹.

After surgery, ketoprofen (5 mg kg⁻¹) was administered subcutaneously and the animal allowed to recover on heat. The mice were monitored for pain or distress and ketoprofen 5 mg kg⁻¹ was administered for 2 days following surgery.

Imaging acquisition

We used a custom-built 2p mesoscope⁷² to record neural activity, and ScanImage⁷³ for data acquisition. We used a custom online Z-correction module (now in ScanImage), to correct for Z and XY drift online during the recording. As described in ref. ⁷⁰, for the visual area and hippocampal recordings, we used an upgrade of the mesoscope that allowed us to approximately double the number of recorded neurons using temporal multiplexing⁷⁴.

The mice were free to run on a styrofoam cylinder. Mice were acclimatized to running on the ball for several sessions before imaging, and one mouse was trained on a virtual reality task for 2 weeks before the recording. The field of view was selected such that large numbers of neurons could be observed, with clear calcium transients. Recordings were performed for 100–150 min at a rate of 22 Hz. We performed one recording session in each of the 26 mice, and did not perform a sample size analysis. Blinding and randomization were not used. Recordings from refs. ^43,75 were acquired at a rate of 3 Hz. Recordings from refs. ^2,76 were acquired at a rate of 2.5–3 Hz, on a Thorlabs Bergamo microscope.

Processing of calcium imaging data

Calcium imaging data were processed using the Suite2p toolbox (v.0.9.3)⁷⁷, available at www.github.com/MouseLand/suite2p. Suite2p performs motion correction, ROI detection, neuropil correction and spike deconvolution as described elsewhere². We used a neuropil subtraction coefficient of 1.0. For the 22-Hz recordings, we used all ROIs output by Suite2p above a signal-to-noise ratio (SNR) threshold of 0.3, which included dendritic processes, to increase the number of units recorded. The SNR for the activity trace x for each ROI was defined as

$$\,\rmS\rmN\rmR=1-\frac\rmV\rma\rmr[x_t-x_t-1]2\,\rmV\rma\rmr[x_t]$$

(similar to ref. ⁷⁸). 61 ± 16% (mean ± s.d.) ROIs had an SNR greater than 0.3, resulting in a range of 3,981–10,595 ROIs across recordings.

We improved the spike deconvolution here by using the latest version of Suite2p, which will be described in an upcoming manuscript. Our approach was similar to that in ref. ⁷⁸, where a neural network is used to predict the ground-truth spikes from the noisy convolved traces. Unlike ref. ⁷⁸ we trained the model on a large number of simulations single-neuron spiking activity convolved with GCaMP-like dynamics and we used a Unet predictive model⁷⁹ with a style-vector to capture temporal context independently for each deconvolved trace⁸⁰. We verified that the deconvolved traces better capture neural activity on real data with less noise by evaluating the responses to visual stimuli presented at known times. To estimate the effect of binning in the GCaMP8s recordings (Extended Data Fig. 6e), we binned the fluorescence traces by a factor of 7 and then performed deconvolution.

Neuropixel recordings and processing

As described in ref. ², eight-neuropixel electrode arrays were used to record simultaneously from up to 3,000 neurons across the brain in three mice⁶⁰. On the day of recording, mice were anesthetized briefly with isoflurane while eight small craniotomies were made with a dental drill. After several hours of recovery, mice were head-fixed in the International Brain Laboratory (IBL) task setup: seated in a plastic tube with their forepaws on a wheel, surrounded by three computer screens in a light-isolated enclosure^55,81. The electrodes were advanced slowly (approximately 10 μm s⁻¹) to their final depth (4 mm or 5 mm deep), and allowed to settle for around 15 min before recording. During the spontaneous part of the recording, the computer screens were black. Data were pre-processed by re-referencing to the common median across all channels⁸². The probe locations were determined using https://github.com/cortex-lab/allenCCF, and the brain mesh in Fig. 2c was plotted using this tool, based on Allen Common Coordinate Framework data^83,84.

These recordings were re-processed with Kilosort4 (v.4.0.14), with default settings⁸⁵. The Kilosort4 spike sorter found 1,756, 2,837 and 2,962 neurons defined as ‘good’, with a refractory violation rate of less than 0.2 (default), from the three recordings. We excluded neurons with a firing rate of less than 0.01 Hz during the spontaneous recording period, resulting in 1,716, 2,787 and 2,914 neurons in total in each recording. The detected neurons were located across cortex, the hippocampal formation, the striatum and other subcortical areas. For the grouping in Extended Data Fig. 5b,c, the ‘visual cortex’ group consisted of primary and higher-order visual cortices; the ‘sensorimotor cortex’ group consisted of motor cortex and somatosensory cortex; the ‘striatum’ group consisted of the caudate putamen and lateral septal cortex and the ‘subcortical areas’ consisted of superior colliculus, thalamus and midbrain. Mice 1 and 2 had detected neurons in all five groups; mouse 3 had detected neurons in all groups except visual cortex. The spikes were binned at a rate of 22 Hz—the same acquisition rate as the calcium imaging. We also analysed bin sizes from 5 ms to 100 ms in Extended Data Fig. 5d,e. The period of spontaneous activity in each recording was 22–42 min long (only this period was used for analyses).

Data analysis

We normalized the neural activity to avoid fitting single-neuron statistics. We z-scored the activity of each neuron so that the mean activity of each neuron is 0 and its s.d. is 1. We ran Rastermap on the recordings with 100 clusters and 128 PCs, and visualized the sorted activity using 20 neurons per bin³⁸. The behavioural running state was estimated by interpolating the running trace to the recording frames, smoothing with a 25 frame Gaussian kernel, and setting a threshold of running/not running at 1/100 of the maximum smoothed running speed. We included sessions for analysis in which the mouse was running at least 10% of the time, and for the DMD analysis in which the mouse was running for at least 30 min (Extended Data Fig. 7).

Eigenspectrum estimation

We estimated the eigenvalues using the covariance between two halves of neurons from the recording. This is to avoid contaminating the eigenspectrum with single-neuron noise produced from the recording methods and from Poisson variability. We divided the recordings in half spatially, using a checkerboard of size 50 μm for the calcium imaging recordings, and using sections of 40 μm (eight contacts) on each Neuropixels probe in the electrophysiological recordings. We did not split the data into training and testing timepoints, as we found this inflated the power-law exponent estimates² (Extended Data Fig. 4).

We estimated the power-law exponent of the eigenspectrum decay using a weighted linear regression in log space from rank 10 to 500, with weights as the inverse of the log of the rank⁸⁶. The eigenvalue spectra are normalized by the value of the linear regression fit at rank 1. If the length of the spectrum was less than 500 due to limited numbers of neurons, then we estimated the spectrum using 10 to half the length of the spectrum.

We subsampled the number of neurons randomly in Extended Data Fig. 6f–h, and subsampled by brain region in the ephys data in Extended Data Fig. 5b,c. We subsampled timepoints in the recordings using chunks of length 50 to 4,000, spaced in time by 4,000 time points.

Estimating PC timescales from data

To estimate the timescales of the PCs, we must take into account the noise. A naive estimation of the ACGs would find that later PCs have smaller time-lagged correlations, but that could be due simply to these PCs having lower SNR overall, and thus all their time-lagged correlations would be lower. Instead we take a similar approach to that from SVCA: we split the data into two random subsets of neurons using a checkerboard grid with size 50 μm, and we calculate the components with a singular value decomposition (SVD) on the covariance between the two subsets of neurons. The resulting left and right singular vectors were used to project test data, and the correlograms were computed between the projections of one set of neurons and the other. The resulting estimates of the PC ACGs were then normalized to 1 at a time lag 0.

Estimating rotational components from data

When the dynamics matrix A is not symmetric, its eigenvalues are complex and the covariance of the multi-dimensional data is no longer related directly to these eigenvalues. Thus, to evaluate the complexity or rotational aspect of the dynamics directly, we cannot rely on the PCs alone. Instead, we fit linear predictive models that predict the neural population vector at time t + dt from the population vector at time t. Such models are referred to typically as DMD, with the small modification that we use a time lag dt = 0.23 s instead of the more common dt = 1 time sample. The reason to use a time lag is to make estimation of the rotational modes more robust and, in particular, to avoid the potential influence of short timescale artefacts arising from the deconvolution of the calcium imaging data.

To estimate DMD, we first reduced each dataset to 1,000 dimensions using PC analysis (PCA). We then used ridge regression with a penalty of 0.1 for the ephys and 0.01 for the 2p calcium imaging to predict X_t+dt from X_t in the reduced PCA space. Thus X_t+dt ≈ BX_t, with B a square matrix mapping PCs to PCs. As usual, the neural activity was z-scored on a per neuron basis before applying PCA, but it was not re-normalized afterwards. Since PCA is an orthonormal projection, the eigenvalues of B are the same as would be expected in the full neuronal space, other than estimation errors. In the model, the matrix exponential describes the relation between X_t+dt and X_t, regardless of whether A is symmetric or not:

$$X_t+dt=\exp ^(A-I)dt/\tau X_t.$$

Thus, the DMD matrix B we obtained from data is an estimate of $\exp ^(A-I)dt/\tau $, at least in the simulations. Looking at the complexity of the eigenvalues of B can thus indicate whether the dynamics are rotational or not. Note that the eigenvalues $\lambda ^\prime $ of $\exp ^(A-I)dt/\tau $ are related to the eigenvalues λ of A by $\lambda ^\prime =\exp ^(\lambda -1)dt/\tau $. Thus the higher the complex part of λ, the higher the complex part of $\lambda ^\prime $. We can also now see why taking a larger dt is beneficial: when dt is very small relative to the timescale of the dynamics, the eigenvalues $\lambda ^\prime $ approach 1, making it difficult to estimate their rotational component. The relation $\lambda ^\prime =\exp ^(\lambda -1)dt/\tau $ could be inverted to obtain estimates of λ. However, this multi-step process is likely to contain a lot of estimation error, and we preferred instead to directly compare the estimated distributions of $\lambda ^\prime $ from data with those from appropriately matched simulations. In particular, we compute the number n_rot10 of rotations per tenfold attenuation of the complex eigenvector $\lambda ^\prime $:

$$\beginarrayc\beginarraycn_\mathrmrot10=k\cdot \mathrmangle\,(\lambda ^\prime )\\ \,|\lambda ^\prime ^k=0.1\endarray\endarray.$$

Thus:

$$n_\mathrmrot10=\frac\log 0.1\log \cdot \mathrmangle\,(\lambda ^\prime ).$$

We used τ = 20 ms in the simulations with symmetric matrices to match approximately the timescales of the data. Longer or shorter τ in the simulations would simply contract all estimated eigenvalues towards 0, but otherwise leave the number of rotations per tenfold attenuation unchanged.

We also estimated eigenvalues of dynamics using DMD on various recordings in which rodents and monkeys were performing tasks (Extended Data Fig. 8). In terms of rodent experiments, we analysed one session from refs. ^87,88, in which rats ran down a linear track, and 137 neurons from CA1 were recorded using silicon probes. We analysed one session from refs. ^89,90 in which mice were detecting a reward location in two different virtual reality corridors, and neural activity was recorded from CA1 using 2p calcium imaging at a rate of 15 Hz. We analysed one session from refs. ^91,92 in which mice were performing a visual discrimination task in virtual reality, and neural activity was recorded from visual cortex using 2p calcium imaging at a rate of 3 Hz. For DMD analysis in each of these recordings, we used a ridge penalty of 0.1, and a time delay of 2 s. We also analysed ephys data collected from monkeys, compiled by the Neural Latents Benchmark challenge⁹³. We binned the spikes in each of these recordings at a rate of 50 Hz, and ran DMD on the peri-stimulus time histograms (PSTHs) computed from each of the recordings, aligned to movement onset. In refs. ^94,95, the monkey completed a maze task with one to many targets, with 108 different configurations—in the figure we plotted example PSTHs from single target trials. In refs. ^96,97, the monkey performed reaches between random elements of a grid—we binned the reaches based on movement direction angle into 15 bins. In refs. ^98,99, the monkey performed a centre-out reaching task in eight different directions (active trials), and on a subset of trials the joystick was moved (passive trials); recordings were performed in somatosensory Area 2. In each of these recordings, we performed DMD with a ridge penalty of 0.1 and a time delay of 200 ms.

Local correlation structure

In Fig. 4, we computed the correlations using the recording sampling rate (3 Hz). For the simulations, we used the correlation matrix derived from the eigenvectors. For all the analyses we excluded neuron pairs within 20 μm of each other. The pairwise correlations were binned in 200-μm bins (Fig. 4d).

We defined the top 1% of correlations per neuron as the ‘strong pairs’ for each neuron. We then computed the probability distribution of the strong pairs across spatial bins of 200 μm. This distribution was normalized by the distribution of all other correlations across bins, producing the strong pair odds versus chance shown in Fig. 4f. The strong pair odds, near versus far, was the ratio of the first bin of this curve (within 200 μm) to the average of the last four bins of this curve (2,200–3,000 μm).

Dynamical systems analysis

We assume that neural activity is governed by linear dynamics with independent stochastic inputs. In the case of normally distributed inputs, this model becomes the familiar Ornstein–Uhlenbeck process, with connectivity A³³:

$$\tau \fracd\bfxdt=-\bfx+A\bfx+\epsilon _\bft,$$

(1)

where the noise is a Wiener process.

The stationary distribution of the neural covariance matrix Σ is defined by the Lyapunov equation³⁷:

$$(A-I)\Sigma +\Sigma (A-I)^\rm\top =-I.$$

When A is symmetric, the solution is given by

$$\Sigma =\frac12(I-A)^-1.$$

(2)

For the non-symmetric case, the solution must be calculated numerically¹⁰⁰ (see also ref. ¹⁰¹ for an example of solving for A).

For the symmetric case, we can derive the decay of the eigenvalue spectrum directly, under the assumption that A is a random symmetric matrix with an eigenspectrum distribution of a semicircle from −1 to 1. We assume here that the number of units is large enough to treat the eigenvalue distribution as an exact semicircle distribution and ignore finite size effects. The rank n of eigenvalue $\lambda _A^n$ of A is defined by the integral of the semicircle distribution density p(λ_A) from $\lambda _A^n$ to 1. We scale p(λ_A) to a maximum of 1 for convenience of the calculations. If we define θ as the angle subtended by $p(\lambda _A^n)$ on the semicircle, we can use geometric arguments to show that:

$$\beginarrayc\beginarraycn=\pi \left(\frac\theta 2\pi \right)-\frac12\cos \theta \sin \theta \\ =\frac12(\theta -\cos \theta \sin \theta )\endarray\endarray.$$

The eigenvalues of the covariance Σ, denoted by λ, are related to the eigenvalues λ^A of the connectivity matrix A:

$$\lambda =\frac12(1-\lambda _A)\Rightarrow \,\lambda _A=1-\frac12\lambda .$$

Thus, we have

$$\theta =\cos ^-1(\lambda _A)=\cos ^-1\left(1-\frac12\lambda \right).$$

Plugging this into the equation for rank n and using $\sin (\theta )=\sqrt1-\cos ^2(\theta )$ we have:

$$\beginarrayl2n=\cos ^-1\left(1-\frac12\lambda \right)-\left(1-\frac12\lambda \right)\sqrt1-\left(1-\frac12\lambda \right)^2\\ \,=\,\cos ^-1\left(1-\frac12\lambda \right)-\left(1-\frac12\lambda \right)\sqrt\frac12\lambda \sqrt2-\frac12\lambda .\endarray$$

We next used Wolfram Alpha¹⁰² to obtain the Taylor expansion for $\sqrt2-x$ and Puiseux expansion for $\cos ^-1(1-x)$ where x is small. Keeping only terms in which x is raised to a power less than 2, gives

$$\beginarrayl2n\approx \sqrt\frac1\lambda +\frac124\lambda ^3/2-\sqrt\frac1\lambda +\frac58\lambda ^3/2+\mathcalO\left(\frac1\lambda ^5/2\right)\\ \,\approx \frac23\lambda ^3/2\,\Rightarrow \lambda \propto \frac1n^2/3.\endarray$$

Thus, the eigenvalues decay approximately as a power-law with exponent 2/3, which is very close to the value we found in simulations.

As mentioned in the text, this is different from the value of 4/3 found in ref. ¹⁰; the discrepancy is due to estimating the ‘long time window covariance’, which assumes that the data are binned in infinitely long windows. The authors of ref. ¹⁰ argue that the formula converges for short windows (greater than 50 ms), which seems to use the single-neuron timescales as a reference. However, when the dynamical systems are close to critical, the emergent timescales are much longer, and thus a much longer window is needed to reach the stable state. Thus, the derivation in ref. ¹⁰, although interesting, can capture only the infraslow timescales of neural activity and predicts a 4/3 power-law on a rank plot in this case.

We note that the 4/3 exponent, while not explicitly calculated there, can be derived easily from Eq. (16) in ref. ¹⁰:

$$p(x)=\frac\sqrt2\pi x^-\frac74,$$

which integrated gives:

$$\int _^Xp(x)dx\propto x^-\frac34,$$

which results in a power-law exponent of approximately 4/3 in a rank plot¹⁰.

Relating timescales to eigenvalues

In addition to estimating the decay of variances of the PCs, we also want to evaluate dynamical, temporal properties of the data and relate them to the model. In the model, the timescales of the system are related to the eigenvalues of A and therefore to the eigenvalues of Σ, following from the matrix exponential solution to the Lyapunov equation:

$$X(t)=e^(A-I)t/\tau X(0)+\int _^te^(I-A)(t-t^\prime )/\tau \sigma dW_t^\prime .$$

The second term on the right is a noise term that is independent of X(0). Using the SVD decomposition of A = UΛU^T, with Λ having the eigenvalues $\lambda _A^i=1-1/(2\lambda _i)$ on the diagonal, the multi-dimensional system can be divided into independent scalar equations:

$$U_i^TX(t)=e^(\lambda _A^i-1)t/\tau U_i^TX(0)+\rmnoise.$$

Thus, the PC component projections $U_i^TX$ have an auto-correlation that decays as

$$e^(\lambda _A^i-1)t/\tau =e^-\fract2\lambda _i\tau ,$$

and, thus, the timescales of the PCs are monotonic with the amplitude of the eigenvalue λ_i. For the purposes of the analysis in Fig. 3, we need only observe the cross-correlation at time lag t, although its exact decay with λ_i cannot be predicted owing to the unknown single-unit timescale τ.

Simulations of dynamical systems

We simulated 10,000 neurons governed by the dynamics in Eq. (1). For the dynamics simulations, we performed integration using the Euler–Maruyama method and used a step size of 2 ms and a timescale τ for each neuron of 20 ms using pytorch^62,103. The random noise was drawn from a Gaussian with a mean of zero and s.d. of one for each neuron at each time step. We ran 80 simulations on a graphics processing unit in parallel, with random initial conditions drawn from a Gaussian with mean 0 and s.d. = 1, each consisting of 60,000 time steps, and discarded the first 4,000 time steps. To replicate the sampling rate in the data, we binned every 23 timepoints (approximately 22 Hz). We also z-scored the unit activities, as in the data. When we visualized the rastermaps of simulations, we ran 40 simulations each consisting of 100,000 time steps, to obtain longer continuous segments of dynamics.

For testing the eigenspectrum estimation methods (Extended Data Fig. 4), we normalized the activity of each neuron by its s.d., applied a relu and then used the activity of the neuron as the mean of a Poisson process, scaled by different values to represent different levels of noise. We multiplied the activity by 0.7, 0.5 and 0.3, representing ‘low’, ‘medium’ and ‘high’ noise levels respectively. To simulate 2p recordings, we used the ‘medium’ noise Poisson activity traces and convolved each trace with an exponential filter with a decay time constant of 0.25 s. All the activity traces were then scaled by a factor of eight and 400 was added to each trace, and then each trace was scaled individually by a random number drawn from an exponential distribution with 0.001 added to it, to approximate a distribution of SNR values across neurons, with a minimum SNR close to 0.3. We applied shot noise to each trace, by using the trace as the mean of a Poisson process, and then applied deconvolution as in the real neural data. The exponential distribution had mean values of 0.5, 0.2 and 0.08, representing ‘low’, ‘medium’ and ‘high’ shot-noise levels, respectively. The ‘medium’ shot-noise level had a mean SNR across recordings of approximately 0.56, similar to our neural recordings.

Dense connectivity

We drew the excitatory connections between neurons from a uniform random distribution from zero to two. We subtracted off the mean connectivity (one). We set the diagonal values of the matrix to zero. When this matrix is symmetric, its eigenspectrum distribution follows the semicircle law; for the non-symmetric case, it follows the circular law¹⁰⁴. We divided the matrix by a scalar so that the largest real value of the eigenvalues of the matrix was 0.998, setting A so that it is critically normalized.

Sparse/varied connectivity

We created the sparse symmetric random matrices with random zero or one connections drawn from a Bernoulli with a mean varying from 2.4 × 10⁻⁴ to 0.25 (Fig. 4a). The mean of the Bernoulli was subtracted globally from the entire matrix, resulting in a matrix with zero mean connectivity. The diagonal of the matrix was set to zero. Random sparse symmetric matrices and graphs also follow the semicircle law, when the sparsity is not too high^{42,105,106,107}.

We created the clustered symmetric random matrices by setting the Bernoulli mean to 0.5 for within-cluster (local) connectivity, and the mean out-of-cluster (global) connectivity in a range from 2.4 × 10⁻⁴ to 0.5 (Fig. 4b). This results in a ratio of the probability of local connection to the probability of global connection (P(local)/P(global)) ranging from 1 to 2,048. Each cluster consisted of 500 neurons. The mean of the Bernoulli distribution for each entry was subtracted (0.5 within-cluster, smaller outside), resulting in mean zero connectivity across the matrix. The diagonal of the matrix was set to zero.

We created the locally connected symmetric random matrices using as the Bernoulli mean an exponential decay function of Euclidean distances d_ij in micrometres between neurons in the simulation (Fig. 4c). The exponential function was defined as $0.5e^-d_ij/250$. The neurons were placed randomly on a torus of size 8,000 × 8,000 μm. The minimum value of the mean of the Bernoulli was set to a value from 2.4 × 10⁻⁴ to 0.5, resulting in a range of P(local)/P(global) from 1 to 2,048. Again, the mean of the Bernoulli was subtracted from each entry of the matrix, and the diagonal of the matrix was set to zero. To quantify the recovery of true connectivity (Fig. 4e), we compute the fraction of true positive connections with P% of top pairwise correlations per neuron, and then normalize by the average probability of positive connection in the simulation (chance).

As in the dense case, we divided each matrix by a scalar so that the largest eigenvalue of the matrix was 0.998. Because these connectivity matrices were symmetric, we computed the covariance matrix and the eigenspectrum directly from A using Eq. (2).

Simulations with inputs

For Fig. 5 we added external inputs to the simulations described above. All simulation parameters were kept the same, and the input magnitude of m = 2.5 was chosen to be comparable with the amplitude of the ongoing, internally generated activity. Inputs were kept on for 50 ms ending at simulation time 0 s and the readout was averaged over 100 ms of neural activity ending at the delay time on the x axis of Fig. 5d–f. For time-independent decoding, the 100-ms readout ended at a randomized time in the range between 120 ms at a minimum and the time on the x axis of Fig. 5d–f at a maximum. Inputs were assumed to have 100 features, which were projected to the 10,000 neurons in the simulations either with completely random weights (Fig. 5d,e), or with weights corresponding to the top 100 eigenvectors of spontaneous activity (Fig. 5f). Inputs were presented in pseudo-random order at fixed intervals of 3 s, except for the analyses in Fig. 5f where the intervals were 9 s. Mathematically:

$$\beginarrayc\tau \fracd\bfxdt=-\bfx+A\bfx+d\bfW+\\ \,+\,m\cdot \sum _j\bfv_\rm\sigma (j)\delta _t\in (t_j-0.05s,t_j)\\ \,\bfv_i=B\bfu_i,\,\bfu_i\in \bfR^100\\ \bfy_j(t^\prime )=\int _t_j+t^\prime -0.1s^t_j+t^\prime \bfx(t)dt,\endarray$$

where dW is the Wiener process corresponding to the random noise simulation, v_σj is the random input added at time t_j and corresponding to a random mapping σ(j) to the range of inputs considered in each simulation. A is the critically normalized dynamics matrix as before and B is a 10,000 × 100 matrix of either random values or composed of the eigenvectors of the spontaneous activity generated in the absence of inputs. The readout $\bfy_j(t^\prime )$ for input j is calculated as the 100 ms integral of activity x(t) corresponding to a time offset of $t^\prime $ from the input time t_j. For the echo-state network version¹², we simply added a ReLu⁶² rectification on x at each Euler step of the same dynamical system. We scaled A to reach the edge of criticality, which in this case required an additional scaling factor of 1.4 to A.

The Rastermap visualizations in Fig. 5b,c were from 1,500 s of simulations with three inputs. There were two randomly drawn 100-dimensional inputs in Fig. 5d, with 1,000 repeats of each stimulus, half of which were used for training the decoder and half for testing. A one-nearest-neighbour decoder was used for this binary classification. There were 2,000 different, randomly drawn 100-dimensional inputs in Fig. 5e,f, with 1,000 inputs used for training and 1,000 used for testing. To decode the input, we fit a regression problem from the output of the network to the inputs, using ridge regression with L2 regularizer of 10. In this case, a classification was considered correct if the predicted input features were closest to the ground-truth input features, as opposed to any of the other 999 inputs used in the testing set.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

A critical initialization for biological neural networks

Data acquisition

Animals

Surgical procedures

Imaging acquisition

Processing of calcium imaging data

Neuropixel recordings and processing

Data analysis

Eigenspectrum estimation

Estimating PC timescales from data

Estimating rotational components from data

Local correlation structure

Dynamical systems analysis

Relating timescales to eigenvalues

Simulations of dynamical systems

Dense connectivity

Sparse/varied connectivity

Simulations with inputs

Reporting summary

Neural representation of action symbols in primate frontal cortex

Astrocyte glucocorticoid receptor signalling restricts neuronal plasticity

Feature-specific threat coding in lateral septum guides defensive action

Most Popular

Sam Altman makes ‘mic drop’ offer to every Y Combinator startup

Neural representation of action symbols in primate frontal cortex

Spencer Krug: Same Fangs Album Review

Anthropic will pay xAI $1.25B per month for compute

Recent Comments

ABOUT US

POPULAR POSTS

Sam Altman makes ‘mic drop’ offer to every Y Combinator startup

Neural representation of action symbols in primate frontal cortex

Spencer Krug: Same Fangs Album Review

POPULAR CATEGORY