Understanding the neural code of stress to control anhedonia

Mice

All procedures were conducted in accordance with the National Institutes of Health’s Guide for the Care and Use of Laboratory Animals and the institutional guidelines of the University of California, San Francisco’s Institutional Animal Care and Use Committee. Adult (8–12 weeks old) male and female C57BL/6J mice were supplied by The Jackson Laboratory. Adult (5–6 months old) CD1 retired male breeder mice were supplied by Charles River. All mice were kept on a 12-h light/dark cycle, and all experiments were conducted during the light phase. We performed recordings in 60 mice for the original dataset, including 45 CSDS mice (30 males, 15 females) and 15 control mice (10 males, 5 females). The results shown are combined data using both males and females, as we did not observe significant differences between males and females. Mice were randomly assigned to control or CSDS groups before CSDS exposure. A separate cohort of 41 CSDS mice underwent chemogenetic manipulation experiments. Twenty-three of the mice received AAV-DIO-hM3Dq viral micro-infusion and 18 mice received AAV-DIO-mCherry infusion. Mice were randomly assigned to hM3Dq or mCherry groups at the time of surgery. From the hM3Dq and mCherry groups, we performed recordings in seven of the susceptible mice in each group. Experimenters were blind to the condition and group assignments of mice.

Surgery

Head bar and craniotomy surgery

One week before lick training, head bar surgeries were conducted on all mice (8–9 weeks old). According to a previously described protocol³¹, mice were anaesthetized with 1.5% isoflurane with an O₂ flow rate of 1 l min^−1, and head-fixed in a stereotaxic frame. A custom-made titanium head bar was then attached to the skull using Metabond adhesive cement (Parkell). Possible recording sites (see the section entitled Neuropixels recording and data preprocessing) were stereotaxically marked using a permanent marker on the skull surface, and the skull was covered using silicone (Smooth-On). At 3 days before Neuropixels recording, craniotomy surgery was performed, in which, under anaesthesia, craniotomies were made at the previously marked coordinates. The skull surface was covered with Kwik-Sil (World Precision Instruments).

Viral micro-infusion surgery

For mice that underwent chemogenetic manipulations, adult mice (8–9 weeks old) received viral micro-infusion in the same surgery as head bar attachment, as in a previously described protocol³¹. Specifically, AAV8-hSyn-DIO-hM3D(Gq)-mCherry (Addgene, 44361-AAV8, 2.9 × 10¹³ viral genomes (vg) per millilitre) or AAV8-hSyn-DIO-mCherry (Addgene, 50459-AAV8, 1.0 × 10¹³ vg ml^−1) was micro-infused into vCA1 bilaterally (500 nl per hemisphere, −3.52 mm anterior–posterior (AP), ±3.1 medial–lateral (ML), −4.2 (150 nl), −4.1 (200 nl) and −4.0 (150 nl) dorsal–ventral (DV), from bregma according to ref. ⁶⁰), and AAV2retro-CAG-Cre (UNC Vector Core, Ed Boyden’s stock, 4.1 × 10¹² vg ml^−1) was micro-infused into the BLA bilaterally (500 nl per hemisphere, −1.80 mm AP, ±3.1 ML, −5.0 (150 nl), −4.8 (200 nl) and −4.6 (150 nl) DV). Viral vectors were delivered using Nanoject 3 (Drummond Scientific). The needle was held in place for >5 min after infusion at each DV site, and for 10 min after the last DV site. Following viral micro-infusion, a head bar was attached to the skull as described above.

Behaviour

CSDS

The CSDS procedure was conducted according to a previously established protocol⁴. Briefly, CD1 male mice were singly housed following arrival for >1 week and were then pre-screened for aggression over 3 consecutive days. Each day, a CD1 mouse was placed in a cage with a new screener BL/6 mouse for 3 min. An aggressive CD1 mouse is defined as one that attacked the BL/6 mouse within the first minute over a minimum of 2 consecutive days. Only aggressive CD1 mice were used in defeats and social interaction tests. Defeats occurred over 10 days, for which, each day, a BL/6 mouse was introduced to a new CD1 mouse’s cage for 10 min. Defeats were terminated early if severe injuries on BL/6 mice were observed. After 10 min, a clear plastic divider with perforations was placed in the middle of the defeat cage for 24 h, to physically separate the BL/6 and CD1 mice while allowing visual and odour cues to transmit and reinforce the defeat experience during co-housing. After the tenth day of defeat, BL/6 mice were singly housed in new cages (without CD1 mice) for 24 h before the social interaction test. For female defeats, female BL/6 mice were first coated with urine from other aggressive CD1 male mice (not used in defeats) before being introduced to the defeat CD1 mouse cage⁶¹, to minimize mounting behaviour and maximize defeats. Female defeats were terminated early if mounting was observed. For the control group, a BL/6 mouse was co-housed with another conspecific across a divider for 10 days without any physical interaction or defeats. On each day, a new BL/6 mouse pairing was introduced.

Social interaction test

The social interaction test took place 1 day after termination of CSDS (or the control procedure). BL/6 mice were habituated to the social interaction test room for 1 h before the test. The test was performed under red light (10 lx) in a test arena (custom made, 42 cm (w) × 42 cm (d) × 42 cm (h)) in a sound attenuation chamber. During the first phase of the test, the BL/6 mouse was introduced to the test arena with an empty enclosure (10 cm (w) × 6.5 cm (d) × 42 cm (h)) at one end for 2.5 min, and its activity patterns were tracked using Ethovision (Noldus Information Technology). At the end of 2.5 min, the mouse was placed back in its home cage, and the empty enclosure was replaced with a second enclosure containing a new aggressive CD1 that had not been used in defeats. The BL/6 mouse was put back in the test arena for another 2.5 min. The social interaction ratio, as a measure of social avoidance, was calculated as the time spent in the interaction zone (14 cm × 24 cm) with the aggressor present relative to the time spent there with the aggressor absent. The lower the social interaction ratio, the more socially avoidant the animal was. The same test protocol was used for all experiments, except for when chemogenetic manipulations were performed during the social interaction test.

For chemogenetic manipulation during the social interaction test, the same surgery and social defeat procedures were used as before, and then we performed 2 days of social interaction tests. On day 1, mice were injected with saline (intraperitoneally (i.p.)) 20 min before the social interaction test. On day 2, mice were injected with CNO (i.p.) 20 min before the social interaction test. We performed the social interaction tests on two separate days to prevent habituation to the social interaction test chamber.

Elevated plus maze

The elevated plus maze assay was performed an hour after the end of the social interaction test using an established protocol¹¹. Briefly, mice were placed in a standard maze (height from the floor, 13.5 in; length of each arm type, 25 in; arm width, 2 in; closed arm height, 7 in; height and width of ledges on the open arms, 0.5 in; light over the open arms, 650 lx). Mice were positioned in the central region of the maze and allowed to explore for 15 min. Their behaviour was tracked and analysed using Ethovision (Noldus Information Technology). Open arm time, as a measure for anxiety-related behaviour, was calculated as the percentage of time spent in the open arms of the maze.

Head-fixed SPT

Following recovery from head bar surgery, mice were habituated to the experimenter and the head-fixed set-up for 15 min a day for a week. After habituation, mice were water-restricted to about 85–90% of their ad libitum body weight and were trained for 3 days to lick on the custom-designed dual-spout head-fixed reward delivery apparatus. On day 1, mice were introduced to 1 lick spout, from which sucrose rewards (10% sucrose, about 3.5 ml each) were intermittently delivered following licking (that is, rewards were lick contingent) with an 8 s inter-trial interval (ITI), with a maximum of 150 rewards per session. Sucrose rewards were delivered using a solenoid-gated gravity feed. Licks were detected using a piezo element (SparkFun). Stimulus delivery and sensor reading were controlled using a custom Arduino MEGA board and recorded using CoolTerm software. On days 2 and 3, mice were introduced to 2 lick spouts, one on each side of the mouse, separated by about 50°. Sucrose rewards were delivered in both spouts following licking with 8 s ITI. The goal was to teach mice that rewards were delivered from both spouts. Thus, if a mouse showed preference for the spout on one side, that spout was temporarily removed so the mouse could learn to lick from the other spout. Once the animal showed similar preference for both spouts, lick training was completed and pre-defeat SPT was initiated on the following day. SPT occurred over the course of 2 consecutive days, during which one spout delivered water and the other delivered sucrose. Rewards were delivered following licking with 8 s ITI and a maximum of 150 rewards in total per day. The spout designation was randomized across mice on day 1 and counterbalanced on day 2. Sucrose preference was calculated as the averaged percentage of sucrose rewards obtained across 2 days. On completion of day 2 of pre-defeat SPT, mice were taken off water restriction and housed in a social defeat room for 3 days before CSDS began.
Post-defeat SPT was performed using the same protocol, with the addition of Neuropixels recording. Post-defeat SPT was used for all analysis shown.

To control for the possibility that reward choice and intention signals were driven by differences in direction or action sequence (left versus right) coding, a separate cohort of mice were recorded using the same-reward SPT, in which, instead of delivering water or sucrose in the two lick spouts (different-reward SPT), both spouts delivered sucrose rewards. The rest of the experiments were the same as described above.

For chemogenetic manipulation during the SPT, 3 weeks after viral micro-infusion (see the section entitled Viral micro-infusion surgery), CSDS and control mice went through the same CSDS or control procedure and social interaction test. On SPT days, saline (i.p.) was injected 20 min before the first half of the SPT (maximum 75 trials). Then, CNO (i.p.) was injected 20 min before the second half of the SPT (maximum 75 trials). The design allowed for within-animal within-session comparisons of behaviour and neural activity patterns before and after CNO injection.

Neuropixels recording and data preprocessing

Recording

Mice were head-fixed to the SPT apparatus without lick spouts present. Kwik-Sil was removed from the skull surface. Before insertion, Neuropixels 1.0 probes (IMEC) were first coated with DiI, DiO or DiD dyes (ThermoFisher Scientific) and allowed to dry. Probes were inserted at about 1 mm min^−1 to the target coordinate using Sensapex manipulators. Probe targets and their coordinates are as follows: amygdala (−1.71 mm AP, −0.28 mm ML, −6.5 mm DV, at 31.3° ML) and ventral hippocampus (−3.9 mm AP, −2 mm ML, −4.5 mm DV, at 25.8° ML). One or two probes were inserted per session per mouse. Simultaneously recorded probes were coated in the same colour of dye but spaced at least several hundred micrometres apart to allow for unambiguous identification. Different colours of dyes were used across days to help differentiate probe tracks. After a probe reached the targeted DV site, it was left in place for 10 min before the start of recording, which included 10 min of pre-task (no task stimulus) and SPT. Neuropixels action potential signals were recorded using the Neuropixels acquisition system and SpikeGLX software (https://billkarsh.github.io/SpikeGLX/), at 30,000 Hz with a gain of 500. Behavioural signals were recorded using a separate data acquisition board (National Instruments), along with a synchronization signal that was also recorded by Neuropixels to help synchronize clocks between different data streams. After each session of SPT, probes were slowly removed from the brain and the skull was covered with Kwik-Sil. Probes were cleaned using Tergazyme solution (1%, Alconox) overnight and rinsed using deionized water before reuse or storage.

Histology and probe track registration

At the end of the experiments, mice were transcardially perfused with 1× PBS followed by 4% paraformaldehyde solution. Brains were fixed overnight at 4 °C, and then transferred to 30% sucrose solution for 48 h. Brains were sectioned coronally using a microtome (Leica SM2000) at 50 μm thickness and mounted on glass slides with Fluoromount G with DAPI (Southern Biotech). Images were obtained using a confocal microscope (Nikon Ti2-E Crest LFOV Spinning Disk/C2 Confocal) with a 20× objective. Probe tracks were traced using the AllenCCF toolbox (https://github.com/cortex-lab/allenCCF).

Spike-sorting

Neuropixels action potential signals were preprocessed and spike-sorted offline using Kilosort 2 (ref. ⁶²) or Kilosort 4 (ref. ⁶³), and after sorting, the clusters were manually validated using Phy⁶⁴. Only well-isolated clusters (putative single units that are classified as ‘Good’ using Phy) were analysed. All other clusters, including multi-unit activity and noise, were not analysed.

Data analysis

Animals were allowed to freely choose reward types after 8 s ITI had passed between trials, by licking at the spout of their choice. Reward deliveries were lick contingent. Trial types were defined as a ±4 s time window around the time of reward delivery. For all analyses, only sessions with at least five neurons in the region of interest were used. For analysis during the pre-task period, we used min 2–8 of the 10 min pre-task recording period. For analysis during the task period, we used time windows specified in each figure. All data analyses were performed using custom code in MATLAB and Python.

Behavioural data analysis

Behavioural classification of mice

The relationship between sucrose preference and social interaction ratio was assessed using a Pearson correlation. To classify CSDS mice into subtypes, we applied unsupervised K-means clustering using both behavioural metrics, sucrose preference and social interaction ratio. The optimal number of clusters was determined by evaluating cluster numbers from 2 to 10 and maximizing the silhouette score.
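The cluster-number selection described above can be illustrated with a minimal sketch. It assumes scikit-learn is used, that the two metrics are z-scored before clustering (not stated in the text), and that the data are synthetic; the function name `choose_k_by_silhouette` is hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler


def choose_k_by_silhouette(features, k_min=2, k_max=10, seed=0):
    """Return (best_k, labels, scores) for the k that maximizes the silhouette score.

    features: (n_mice, 2) array of [sucrose preference, social interaction ratio].
    """
    # Assumption: z-score both metrics so that they contribute comparably.
    X = StandardScaler().fit_transform(features)
    k_max = min(k_max, len(X) - 1)  # the silhouette score needs k <= n_samples - 1
    scores, labelings = {}, {}
    for k in range(k_min, k_max + 1):
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X)
        scores[k] = silhouette_score(X, labels)
        labelings[k] = labels
    best_k = max(scores, key=scores.get)
    return best_k, labelings[best_k], scores


# Synthetic example: two well-separated behavioural subtypes (not real data).
rng = np.random.default_rng(0)
group_a = rng.normal([0.8, 1.5], 0.05, size=(20, 2))
group_b = rng.normal([0.5, 0.5], 0.05, size=(20, 2))
demo = np.vstack([group_a, group_b])
best_k, labels, scores = choose_k_by_silhouette(demo)
```

In this toy case the silhouette score peaks at two clusters; with real data the curve of `scores` across k would be inspected in the same way.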

Lick analysis

Lick rasters were generated by binning licks using 0.02-s bin size. Lick rates were calculated using 0.1-s bin size and averaged across trials per mouse for each trial type as specified in the figures. As mice tend to sample from both lick spouts in a trial (with them ultimately choosing and obtaining a reward from one), we computed the lick rate DI to assess their preference for licking at each spout. We first quantified the difference between lick rates on sucrose versus water lick spouts for sucrose choice trials (lick rate sucrose spout − lick rate water spout), and separately, the difference between lick rates on sucrose versus water lick spouts for water choice trials. The two values were then averaged to obtain the DI for that session. A DI value greater than 0 suggests a greater lick rate on the sucrose spout in comparison to the water spout, and vice versa for a DI value less than 0.
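As a worked illustration of the DI calculation, the following sketch assumes that per-trial lick rates on each spout have already been computed; the function name `lick_rate_di` and the example numbers are hypothetical.

```python
import numpy as np


def lick_rate_di(sucrose_rate, water_rate, choice):
    """Lick rate DI for one session.

    sucrose_rate, water_rate: per-trial lick rates (Hz) on each spout.
    choice: per-trial chosen reward, 'S' (sucrose) or 'W' (water).
    """
    diff = np.asarray(sucrose_rate, float) - np.asarray(water_rate, float)
    choice = np.asarray(choice)
    di_sucrose_trials = diff[choice == "S"].mean()  # sucrose choice trials
    di_water_trials = diff[choice == "W"].mean()    # water choice trials
    # The two per-trial-type differences are averaged to give the session DI.
    return 0.5 * (di_sucrose_trials + di_water_trials)


# Hypothetical session with two sucrose and two water choice trials.
di = lick_rate_di(
    sucrose_rate=[6.0, 5.0, 2.0, 1.0],
    water_rate=[1.0, 2.0, 3.0, 2.0],
    choice=["S", "S", "W", "W"],
)
```

Here the mean difference is 4 Hz on sucrose choice trials and −1 Hz on water choice trials, giving a DI of 1.5, that is, more licking on the sucrose spout overall.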

To take into account reward history and assess how it affects current behaviour, we further divided sucrose and water trials into sucrose–sucrose (SS), water–sucrose (WS), water–water (WW) and sucrose–water (SW) trials (previous–current reward). The first trial of each session was discarded as it had no prior trial. To assess the probability of each trial type irrespective of the animal’s overall sucrose preference, we normalized the number of trials to the total number of previous trials of a specific type. For example, we defined the overall transition probability from a water trial to a sucrose trial as P(WS) = P(WS)/(P(WW) + P(WS)), and from a sucrose trial to a sucrose trial as P(SS) = P(SS)/(P(SW) + P(SS)), in which P(XY) is the transition probability from reward X to reward Y. We normalized the transition probabilities such that P(WW) + P(WS) = 1, and P(SS) + P(SW) = 1. Using this normalization, if P(SW) is not significantly different from P(WW), this would suggest that the current water reward choice is independent of the previous reward, because the probability of switching from sucrose to water and that of staying on water are the same; otherwise, the current reward choice is dependent on the previous reward (that is, reward choices could be modelled as a first-order Markovian process).
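The normalization can be made concrete with a short sketch that counts previous–current reward pairs and divides by the number of trials sharing the same previous reward. The function name `transition_probabilities` and the example sequence are hypothetical.

```python
from collections import Counter


def transition_probabilities(rewards):
    """Normalized transition probabilities for one session.

    rewards: sequence of 'S'/'W' in trial order. The first trial only serves
    as the 'previous' reward of the second trial.
    Returns a dict with keys 'SS', 'SW', 'WS', 'WW' (previous, current).
    """
    pairs = Counter(zip(rewards[:-1], rewards[1:]))
    probs = {}
    for prev in "SW":
        total = pairs[(prev, "S")] + pairs[(prev, "W")]
        for cur in "SW":
            # Normalize by the number of trials with the same previous reward,
            # so that P(prev->S) + P(prev->W) = 1.
            probs[prev + cur] = pairs[(prev, cur)] / total if total else float("nan")
    return probs


# Hypothetical reward sequence.
probs = transition_probabilities(list("SSWSWWSS"))
```

For this sequence there are four transitions out of sucrose (two SS, two SW) and three out of water (two WS, one WW), so P(SS) = P(SW) = 0.5, P(WS) = 2/3 and P(WW) = 1/3.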

We also computed the proportion of each of the four trial types normalized to the total number of trials per session, to assess how much each trial type contributes to the overall session. In this case, the percentages of trials of each of the four trial types were computed per session and averaged across sessions for each mouse. As the number of trials may be influenced by each animal’s innate preference for different rewards, we computed the chance probability of the occurrence of each trial type by calculating the joint probability of the previous and current trial. For example, chance P(SW) = P(S) × P(W). The chance level was then subtracted from the number of trials in each mouse (‘number of trials chance removed’). Sucrose–sucrose and water–water trials were combined when analysing stay trials, and sucrose–water and water–sucrose trials were combined when analysing switch trials. The preference between stay versus switch trials in each mouse was calculated as: percentage of stay trials − percentage of switch trials. To quantify the number of consecutive trials, we first obtained the average number of consecutive trials per trial type (sucrose or water) per session and then averaged across sessions for each mouse.

Decoding group identity using behavioural features

To examine whether group identity could be decoded using behavioural data, we defined a Mahalanobis-like binary decoder. Specifically, for each mouse, we considered four behavioural features: lick rate DI during pre-reward and post-reward, elevated plus maze open arm time (CSDS mice showed increased anxiety-like behaviour⁵; Extended Data Fig. 1c), sucrose preference, and social interaction ratio. Considering two groups at a time, we defined and constructed a Mahalanobis binary decoder to assign a single testing mouse to one of the two groups in the behavioural feature space. The input to the binary classifier consisted of an N × F training matrix and a 1 × F testing matrix, in which N represents the total number of training mice between the two classes, and F = 4 represents the total number of features. In each cross-validation, we first balanced the number of mice in each group by randomly subsampling the minimum number of mice between the groups. Next, we randomly selected one mouse as the testing sample and used the remaining mice as the training set, for a total of 1,000 cross-validations. We defined a Mahalanobis-like distance in the feature space as the Euclidean distance between the testing mouse and the centroid of the training groups, divided by the variance along the distance direction. The testing sample was assigned to the group identity with the minimum Mahalanobis-like distance. The performance of the decoder was evaluated by calculating the fraction of correct classifications out of the total 1,000 cross-validations, and the entire procedure was repeated for all possible pairs of the three groups (that is, control, susceptible and resilient mice).
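The decoder can be sketched as follows. This is one possible reading of the description rather than the original implementation: it assumes that the variance is computed within each training group along the direction from that group's centroid to the testing mouse, and it uses synthetic data. The function names are hypothetical.

```python
import numpy as np


def mahalanobis_like_distance(x, group):
    """Euclidean distance from x to the group centroid, divided by the group's
    variance along the centroid-to-x direction (assumed interpretation)."""
    centroid = group.mean(axis=0)
    delta = x - centroid
    dist = np.linalg.norm(delta)
    if dist == 0.0:
        return 0.0
    direction = delta / dist
    variance = np.var(group @ direction, ddof=1)  # variance of projections
    return dist / variance


def decode_pair(group_a, group_b, n_iter=1000, seed=0):
    """Fraction of correct leave-one-out classifications over n_iter resamples."""
    rng = np.random.default_rng(seed)
    n = min(len(group_a), len(group_b))
    n_correct = 0
    for _ in range(n_iter):
        # Balance the groups by subsampling to the smaller group size.
        groups = [g[rng.choice(len(g), size=n, replace=False)]
                  for g in (group_a, group_b)]
        # Hold out one randomly chosen mouse as the testing sample.
        true_label = int(rng.integers(2))
        test_idx = int(rng.integers(n))
        x = groups[true_label][test_idx]
        groups[true_label] = np.delete(groups[true_label], test_idx, axis=0)
        distances = [mahalanobis_like_distance(x, g) for g in groups]
        n_correct += int(np.argmin(distances) == true_label)
    return n_correct / n_iter


# Synthetic, well-separated groups with F = 4 features (not real data).
rng = np.random.default_rng(1)
control = rng.normal(0.0, 1.0, size=(15, 4))
susceptible = rng.normal(10.0, 1.0, size=(12, 4))
accuracy = decode_pair(control, susceptible, n_iter=200)
```

Dividing by the variance follows the wording of the text; a standard Mahalanobis distance would instead scale by the standard deviation, which would change the distances but not the overall structure of the procedure.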

Pre-task spontaneous facial and limb feature analysis

For a subset of mice, we recorded spontaneous facial and limb movements during the pre-task period using the Alvium 1800 U-158 camera (Allied Vision) with the 16 mm C VIS-NIR Fixed Focal Length Lens (Edmund Optics), at a frame rate of 114 frames per second, using the MATLAB Image Acquisition Toolbox. We tracked 12 keypoints using DeepLabCut⁶⁵. These include eye top, eye bottom, eye front, eye back, snout top, snout tip, snout bottom, whisker 1, whisker 2, mouth, left hand and right hand.

To quantify facial and limb movements, we calculated the following features from keypoints⁶⁶: eye opening ratio, snout angle, mouth position, whisker position, left limb X and Y coordinates. Eye opening ratio is defined as the ratio between the vertical and horizontal Euclidean distance of the eye (that is, (eye top − eye bottom)/(eye front − eye back)). An eye opening ratio of 1 represents a perfectly spherical opened eye. Snout angle is calculated as the angle formed by the vector of snout tip to snout top, and the vector of snout tip to snout bottom. A smaller angle represents a more pointed snout. The mouth position is calculated as the Euclidean distance between the mouth and the eye front. The whisker position is calculated as the Euclidean distance between whisker 1 and the eye front.
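The geometric definitions above translate directly into code. The sketch below assumes keypoints are given as (x, y) arrays, for example one row per video frame; the function names and coordinates are hypothetical.

```python
import numpy as np


def eye_opening_ratio(eye_top, eye_bottom, eye_front, eye_back):
    """Vertical over horizontal Euclidean extent of the eye."""
    vertical = np.linalg.norm(eye_top - eye_bottom, axis=-1)
    horizontal = np.linalg.norm(eye_front - eye_back, axis=-1)
    return vertical / horizontal


def snout_angle(snout_tip, snout_top, snout_bottom):
    """Angle (degrees) between the tip-to-top and tip-to-bottom vectors."""
    v1 = snout_top - snout_tip
    v2 = snout_bottom - snout_tip
    cos = np.sum(v1 * v2, axis=-1) / (
        np.linalg.norm(v1, axis=-1) * np.linalg.norm(v2, axis=-1)
    )
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))


def keypoint_distance(a, b):
    """Euclidean distance, as used for the mouth and whisker positions."""
    return np.linalg.norm(a - b, axis=-1)


# Hypothetical single-frame keypoints in pixel coordinates.
ratio = eye_opening_ratio(
    np.array([0.0, 1.0]), np.array([0.0, -1.0]),
    np.array([-1.0, 0.0]), np.array([1.0, 0.0]),
)
angle = snout_angle(np.array([0.0, 0.0]), np.array([1.0, 1.0]), np.array([1.0, -1.0]))
mouth_position = keypoint_distance(np.array([3.0, 4.0]), np.array([0.0, 0.0]))
```

In this example the eye is as tall as it is wide (ratio of 1), the snout vectors are perpendicular (90°) and the mouth lies 5 pixels from the eye front.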

Analysis of embedding dimensionality of face and limb features

We used PCA to assess the embedding dimensionality of facial and limb features over time for each mouse. We examined the facial and limb features in a 250-ms bin during the 6-min window (min 2–8) within the 10 min pre-task recording period. We define the feature space as a six-dimensional space in which each axis is the value of one facial and limb feature. The PCA analysis allowed us to identify how much variance of these features in the feature space is accounted for by each principal component (PC). We applied PCA to the K × T matrix, for which K = 6 is the number of facial and limb features, and T is the number of bins, and we determined the cumulative curve of the variance explained by each PC. We subsequently used the cumulative variance values for the first three PCs as features to decode the group identity.
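A minimal sketch of the cumulative variance curve is given below, using an eigendecomposition of the feature covariance matrix. It assumes each feature is z-scored over time before PCA (not stated in the text for this analysis) and uses synthetic data; the function name is hypothetical.

```python
import numpy as np


def cumulative_variance_explained(features):
    """Cumulative fraction of variance explained by successive PCs.

    features: K x T array (K features, T time bins).
    """
    X = np.asarray(features, dtype=float)
    # Assumption: z-score each feature over time so that units are comparable.
    X = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)
    # Eigenvalues of the K x K covariance matrix are the PC variances.
    eigvals = np.linalg.eigvalsh(np.cov(X))[::-1]  # sort in descending order
    eigvals = np.clip(eigvals, 0.0, None)          # guard against rounding error
    return np.cumsum(eigvals) / eigvals.sum()


# Synthetic example: K = 6 features driven by one shared latent signal,
# T = 1,440 bins (6 min at 250 ms per bin).
rng = np.random.default_rng(0)
latent = rng.normal(size=1440)
weights = np.array([1.0, -1.0, 0.5, 2.0, -0.5, 1.5])
features = np.outer(weights, latent) + 0.1 * rng.normal(size=(6, 1440))
cum_var = cumulative_variance_explained(features)
decoding_features = cum_var[:3]  # cumulative variance for the first three PCs
```

Because the synthetic features share a single latent signal, the first PC captures most of the variance, which corresponds to a low embedding dimensionality.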

HMM for face and limb features

We fitted HMMs to facial and limb features recorded in a 250-ms bin during 6 min (min 2–8) of pre-task recording. The HMM identifies patterns of behaviour along time, with each pattern corresponding to a specific behavioural state, defined by the combination of the six facial and limb features, that is not directly measurable. We fitted an HMM separately for each mouse using the same software framework, developed by the Linderman Lab (https://github.com/lindermanlab/ssm), that we used to analyse neural data. The input data for the HMM consisted of a K × T matrix, for which K = 6 represents the total number of facial and limb features in the session, and T represents the total number of time bins, and we assumed a Gaussian model as the observation model. For each time series, we fitted 5 models with a maximum of 100 iterations for each value of the total number of states ranging from 2 up to 100, using randomized initial conditions. The model with the smallest Akaike information criterion score was retained as the best model for further analyses.

Agglomerative clustering analysis for HMM behavioural states

To better characterize the spatial structure of the HMM states in the facial and limb features space, we examined the pairwise correlation between the states. For state 1 defined by X = (x₁, x₂, …, x_K), in which x_i is the value of the feature i, and state 2 defined by Y = (y₁, y₂, …, y_K), we computed the Pearson correlation coefficient ρ(X,Y) to assess the distance between the states in the facial and limb feature space. We calculated the correlation coefficients for all pairs of total N states and stored them in an N × N correlation matrix J. Subsequently, we performed agglomerative clustering on the correlation matrix. Specifically, we defined a new distance matrix D as 1 − J, in which 1 is an N × N matrix of ones. This matrix served as the input to the agglomerative clustering algorithm, which iteratively combines states to define new clusters according to the pairwise distance. The algorithm initialized each state as a separate cluster with minimum distance (maximum correlation) and iteratively merged two clusters v and u with the smallest distance into a new cluster. The new distance d assigned to the agglomerated clusters was defined as d(u,v) = max(dist(u[p], v[q])), in which p and q represent all of the points in the merged clusters u and v, also known as the farthest point algorithm (sklearn.cluster.AgglomerativeClustering, built-in class in scikit-learn in Python⁶⁷). Agglomerative clustering has the advantage of producing a hierarchical structure of clusters, and this hierarchical representation allowed us to examine the relationships and similarities between states, specifically how behavioural states may be nested differently within large clusters in different groups (for clustering analysis on neural data, see also the section entitled Agglomerative clustering analysis). Agglomerative clustering does not require any assumption regarding the total number of clusters. It iteratively merges the closest states and clusters until all states are merged into one final cluster. We performed the clustering analysis separately for each mouse. After examining the clusters, we counted the total number of clusters at different levels of distance, or thresholds, for which the higher the levels of distance, the lower the number of clusters, until reaching only one cluster at the highest distance. We assessed the number of total clusters and the proportion of total clusters retained relative to the total number of states as a function of thresholds. A higher number of states at the same threshold value indicates a greater degree of dissimilarity among the inferred hidden states. We retained the proportion of total clusters along these curves from a threshold of 0.01 up to 0.05, because of the high facial and limb feature correlation values between inferred states, resulting in a total of five features that were subsequently used in the decoding of group identity.
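The threshold curve can be sketched as below. For brevity this sketch swaps in SciPy's hierarchical clustering with complete (farthest point) linkage in place of sklearn.cluster.AgglomerativeClustering; it assumes each HMM state is summarized by its K feature values, and the function name and example states are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def cluster_fraction_curve(state_means, thresholds):
    """Proportion of clusters retained (clusters / states) at each threshold.

    state_means: N x K array, one row of K feature values per HMM state.
    """
    J = np.corrcoef(state_means)        # N x N Pearson correlation matrix
    D = np.clip(1.0 - J, 0.0, 2.0)      # distance matrix D = 1 - J
    D = (D + D.T) / 2.0                 # enforce exact symmetry
    np.fill_diagonal(D, 0.0)
    # Complete linkage: d(u, v) = max over all pairs of points in u and v.
    Z = linkage(squareform(D, checks=False), method="complete")
    n_states = len(state_means)
    counts = [len(np.unique(fcluster(Z, t=t, criterion="distance")))
              for t in thresholds]
    return np.array(counts) / n_states


# Hypothetical example: four states forming two nearly identical pairs.
a = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
jitter = 0.01 * np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0])
states = np.vstack([a, a + jitter, a[::-1], a[::-1] + jitter])
thresholds = [0.01, 0.02, 0.03, 0.04, 0.05]
fractions = cluster_fraction_curve(states, thresholds)
fraction_at_large_threshold = cluster_fraction_curve(states, [2.5])[0]
```

In this example each pair merges well below a distance of 0.01, so two of four clusters remain at every threshold from 0.01 to 0.05, and all states merge into a single cluster only at a much larger distance.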

Decoding group identity using facial and limb features

This analysis aimed to decode the group identity (that is, control, susceptible or resilient) on a single-mouse basis by analysing the facial and limb features recorded during 6 min of the pre-task period. For each mouse, we assessed the embedding dimensionality using PCA (see the section entitled Analysis of embedding dimensionality of face and limb features), and we considered the cumulative variance explained by the first three PCs as features for decoding. Following the inference of hidden states and the clustering analysis, we calculated the proportion of clusters retained at different thresholds and extracted the values at five distinct thresholds (see the section entitled Agglomerative clustering analysis for HMM behavioural states). Additionally, we computed the mean and standard deviation of the facial and limb features as the last two features. Overall, we assessed a total of F = 10 features for each mouse. We used the Mahalanobis binary decoder procedure, in which the input to the binary classifier consisted of an N × F training matrix and a 1 × F testing matrix, in which N represents the total number of training mice between the two classes, and F = 10 represents the total number of features (see the section entitled Decoding group identity using behavioural features). The decoder was trained and tested for 1,000 iterations, with a new random testing subject selected and removed from the training set for each of them.

Neural feature decoding when facial and limb feature decoding is at chance

We compared neural to facial and limb feature decoding accuracy during only the time bins when facial and limb decoding accuracy is at chance to assess how well a decoder using neural features (see the section entitled Decoding group identity using neural features) performs even when facial and limb feature decoding is at chance level. Specifically, we trained an SVM with a linear kernel on the six facial and limb features to differentiate between control versus susceptible mice in a subset of 10 randomly selected 1-s bins of training mice and tested on the pre-task time window of 6 min (min 2–8, 360 time bins total) of one held-out testing mouse. We then selected the test time bins when facial and limb decoding accuracy is within 1 or 2 s.d. of the chance level (0.5), obtained from the distribution of accuracies of 100 null models after shuffling the labels. In these same time bins for which the classification based on facial and limb features is at chance, we performed decoding of control versus susceptible mice using neural activity of BLA.

Contribution of facial and limb movements to neural activity in the BLA

We investigated whether the facial and limb features contributed to BLA neural activity during the pre-task period. We fitted facial and limb features to neural activity (firing rate) using linear regression in each mouse separately. We binned neural and facial and limb feature data using a 1 s time window (total of T = 360 bins) and defined our model as Y = AX^T + β, in which Y is an N × T matrix with the firing rate of N recorded neurons, A is an N × K matrix with the regression coefficients of K = 6 facial and limb features, and X is a T × K matrix with the K facial and limb feature values. β is the intercept (a constant). Before fitting, the data were centred to zero. We used the linear least-squares error as a loss function and added an L2-norm regularization term to prevent overfitting. We tried a range of values for the L2-norm regularization term, ranging from 0 (equivalent to ordinary least squares) to 10³, with no significant difference in the final coefficient of determination (R²) estimate.
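The regression model can be illustrated with a closed-form ridge fit. The sketch below assumes R² is evaluated on held-out time bins (the evaluation scheme is not specified in the text) and uses synthetic data; the function names are hypothetical.

```python
import numpy as np


def fit_ridge(X, Y, lam):
    """Fit Y = A X^T + beta with an L2 penalty lam on A.

    X: T x K feature matrix; Y: N x T firing-rate matrix.
    Returns A (N x K) and beta (length N).
    """
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=1)
    Xc, Yc = X - x_mean, Y - y_mean[:, None]        # centre the data to zero
    gram = Xc.T @ Xc + lam * np.eye(X.shape[1])
    A = np.linalg.solve(gram, Xc.T @ Yc.T).T        # lam = 0 gives ordinary least squares
    beta = y_mean - A @ x_mean
    return A, beta


def r_squared(X, Y, A, beta):
    """Coefficient of determination pooled over neurons and time bins."""
    pred = A @ X.T + beta[:, None]
    ss_res = np.sum((Y - pred) ** 2)
    ss_tot = np.sum((Y - Y.mean(axis=1, keepdims=True)) ** 2)
    return 1.0 - ss_res / ss_tot


def held_out_r2(X, Y, lam, n_train=180):
    """Assumption: fit on the first n_train bins, evaluate on the rest."""
    A, beta = fit_ridge(X[:n_train], Y[:, :n_train], lam)
    return r_squared(X[n_train:], Y[:, n_train:], A, beta)


# Synthetic data: T = 360 bins, K = 6 features, N = 10 neurons.
rng = np.random.default_rng(0)
X = rng.normal(size=(360, 6))
Y_noise = rng.normal(size=(10, 360))                 # unrelated to X
Y_linear = rng.normal(size=(10, 6)) @ X.T + 3.0      # exactly linear in X
r2_noise = held_out_r2(X, Y_noise, lam=1.0)
r2_linear = held_out_r2(X, Y_linear, lam=0.0)
```

When the firing rates are unrelated to the features, the held-out R² stays near or below zero, whereas a truly linear relationship yields an R² close to 1.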

We did not find a positive R² from any of the linear models, suggesting that using facial and limb features that we recorded, we could not predict BLA neural activity better than chance. In other words, these facial and limb features are unlikely to contribute significantly to BLA neural activity, and consequently, any group differences that we observed.

Single-neuron analysis

Firing rate

For the task period, spike trains were aligned at the time of reward delivery (time 0) and neurons within the same region were pooled across animals of the same group to construct pseudo-populations. Only neurons with at least ten trials per trial type (sucrose and water) were included. For peristimulus time histograms, spikes were binned at 10-ms resolution, z-scored to pre-reward (−1 to 0 s), and smoothed with a 50-ms moving average filter. For analysis of raw firing rates, spikes were binned at 500-ms resolution.

Reward-choice-selective neurons

Analysis was performed using pseudo-population and only neurons with at least ten trials per trial type (sucrose and water) were included. Mice with fewer than five neurons in regions of interest were excluded. Reward-choice-selective cells were identified⁶⁸,⁶⁹, and the magnitude of the selectivity was quantified, using the auROC method, which compares single-neuron firing rates between trial types, across levels of response thresholds for each time bin. Spikes were binned at 500-ms resolution. Shuffled distributions were computed for each time bin by randomly shuffling trial type ten times per neuron. A neuron is deemed reward choice selective if its auROC is >2 s.d. of the shuffled distribution for that neuron. The fraction of selective neurons in a region was calculated as: number of selective neurons/total number of neurons. Differences in the fraction of selective neurons across groups were assessed using Fisher’s exact tests.
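A minimal sketch of the auROC selectivity test for one neuron and one time bin is shown below. It assumes that selectivity is assessed as a deviation of more than 2 s.d. from the mean of the shuffled distribution in either direction (the text does not specify the sidedness), and it uses synthetic firing rates; the function names are hypothetical.

```python
import numpy as np


def auroc(a, b):
    """Area under the ROC curve for discriminating samples a from samples b.

    Equals the probability that a random value from a exceeds a random value
    from b, counting ties as one half.
    """
    a = np.asarray(a, dtype=float)[:, None]
    b = np.asarray(b, dtype=float)[None, :]
    return ((a > b).sum() + 0.5 * (a == b).sum()) / (a.size * b.size)


def is_selective(rates, is_sucrose, n_shuffles=10, n_sd=2.0, seed=0):
    """Compare the observed auROC with a trial-label-shuffled distribution."""
    rng = np.random.default_rng(seed)
    rates = np.asarray(rates, dtype=float)
    is_sucrose = np.asarray(is_sucrose, dtype=bool)
    observed = auroc(rates[is_sucrose], rates[~is_sucrose])
    null = np.empty(n_shuffles)
    for i in range(n_shuffles):
        shuffled = rng.permutation(is_sucrose)   # shuffle trial types
        null[i] = auroc(rates[shuffled], rates[~shuffled])
    # Assumption: two-sided criterion relative to the shuffle mean.
    selective = abs(observed - null.mean()) > n_sd * null.std(ddof=1)
    return observed, bool(selective)


# Synthetic neuron that fires more on sucrose than on water trials.
rates = np.concatenate([np.arange(20) + 20.0, np.arange(20.0)])
is_sucrose = np.array([True] * 20 + [False] * 20)
observed, selective = is_selective(rates, is_sucrose)
```

With only ten shuffles the estimate of the null s.d. is coarse, so in practice the stability of the selective fraction across random seeds may be worth checking.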

Intention-modulated neurons

Analysis was performed using pseudo-population and only neurons with at least ten trials per trial type (switch and stay) were included. Intention-modulated neurons were identified using a similar method as reward-modulated neurons. Mice with fewer than five neurons in regions of interest were excluded. In this case, a cell is deemed intention-modulated if the distribution of firing rates during the 4 s pre-reward period (−4 to 0 s) in switch trials is significantly different from stay trials, as identified using Wilcoxon rank-sum test followed by false discovery rate correction across all neurons in that group (P < 0.05). As the fraction of neurons was small and did not meet the criteria for using Chi-squared test, Fisher’s exact tests were used to perform statistical comparisons between percentages of intention-modulated neurons across groups.
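The identification step can be sketched with a rank-sum test per neuron followed by false discovery rate correction. The sketch assumes the Benjamini–Hochberg procedure (the specific correction is not named in the text) and uses synthetic firing rates; the function names are hypothetical.

```python
import numpy as np
from scipy.stats import ranksums


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted P values (assumed FDR procedure)."""
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)
    adjusted = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity from the largest P value downwards.
    adjusted = np.minimum.accumulate(adjusted[::-1])[::-1]
    out = np.empty(n)
    out[order] = np.clip(adjusted, 0.0, 1.0)
    return out


def intention_modulated(switch_rates, stay_rates, alpha=0.05):
    """Boolean mask of neurons whose pre-reward rates differ by trial type.

    switch_rates, stay_rates: lists with one array per neuron containing
    per-trial firing rates in the pre-reward window (-4 to 0 s).
    """
    pvals = [ranksums(sw, st).pvalue for sw, st in zip(switch_rates, stay_rates)]
    return benjamini_hochberg(pvals) < alpha


# Synthetic example: neuron 0 differs between switch and stay trials,
# neuron 1 has identical rate distributions.
switch = [np.arange(20) + 30.0, np.arange(20.0)]
stay = [np.arange(20.0), np.arange(20.0)[::-1]]
mask = intention_modulated(switch, stay)
adjusted_example = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
```

Correcting across all neurons in a group, as described, controls the expected proportion of false positives among the neurons labelled as intention-modulated.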

Population analysis

Analysis of embedding dimensionality

PCA was used to evaluate the embedding dimensionality of population activity of simultaneously recorded neurons over time. The method aims to identify how much variance of the population representation in the firing rate space is accounted for by each PC. We chose this method because the pre-task period lacks behavioural labels. PCA has the advantage of allowing us to compare neural data between animals because the method is invariant to rotations and global stretching, transformations normally needed to align a neural representation of one subject into another. We examined the activity of each neuron in 1-s bins during the 6 min time window (min 2–8) within the 10 min pre-task recording period, resulting in 360 bins. The ensemble activity across these bins can be represented as a geometrical object in the firing space, with each axis representing the firing rate of a neuron and each point representing the ensemble’s activity in a time bin. We calculated the embedding dimensionality of this geometrical object for each mouse. We included only mice with at least five simultaneously recorded neurons in the region of interest during the pre-task recording. We randomly selected five neurons for each mouse and calculated the z-scored firing rate matrix N × T, in which N is the number of neurons, and T is the number of time bins. We applied PCA to this matrix and determined the cumulative curve of the variance explained by each PC. We repeated this procedure 1,000 times and averaged the results across the subsamples for each mouse. Our goal was to compare cumulative variance curves across groups and determine whether a group had a higher cumulative value at M PCs (M ≤ 5), indicating a lower dimensionality of the geometrical object. We subsequently used the cumulative variance values for the first three PCs as features to decode the group identity.
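The subsampling procedure can be sketched as below. This is an illustration under stated assumptions, not the study code: the function name is hypothetical, PCA is computed from the eigenvalues of the covariance of the z-scored matrix, and silent neurons (zero variance) are left at zero rather than excluded.

```python
import numpy as np

def cumulative_variance(rates, n_neurons=5, n_repeats=1000, seed=0):
    """Mean cumulative explained-variance curve over random neuron subsamples.

    rates: array of shape (N, T), firing rate of N neurons in T time bins.
    """
    rng = np.random.default_rng(seed)
    curves = np.zeros((n_repeats, n_neurons))
    for r in range(n_repeats):
        pick = rng.choice(rates.shape[0], size=n_neurons, replace=False)
        x = rates[pick]
        sd = x.std(axis=1, keepdims=True)
        # z-score each neuron; assumption: zero-variance neurons stay at zero
        z = (x - x.mean(axis=1, keepdims=True)) / np.where(sd > 0, sd, 1.0)
        eigvals = np.linalg.eigvalsh(np.cov(z))[::-1]  # descending eigenvalues
        eigvals = np.clip(eigvals, 0.0, None)          # remove negative round-off
        curves[r] = np.cumsum(eigvals) / eigvals.sum()
    return curves.mean(axis=0)
```

The first three entries of the returned curve correspond to the three cumulative-variance features described in the text.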

We also assessed the participation ratio (PR), which is a normalized measure of dimensionality based on the full distribution of PCA eigenvalues (that is, how much variance is explained by each PC), and it is defined as:

$$\mathrm{PR}=\frac{{\left(\sum_{i=1}^{N}\lambda_{i}\right)}^{2}}{\sum_{i=1}^{N}\lambda_{i}^{2}}$$

in which λ_i are the eigenvalues of the covariance matrix of the neural activity, and N = 5. If only one eigenvalue explains all of the variance (λ_i ≠ 0 for i = 1 and λ_i = 0 for all i ≥ 2), then PR = 1. On the other hand, if all eigenvalues are equal, the dimensionality is maximum, PR = N (refs. ^70,71).
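The two limiting cases of the participation ratio can be checked numerically with a short sketch (the function name is hypothetical):

```python
def participation_ratio(eigenvalues):
    """PR = (sum of eigenvalues)^2 / (sum of squared eigenvalues)."""
    total = sum(eigenvalues)
    return total ** 2 / sum(v ** 2 for v in eigenvalues)

# One dominant eigenvalue gives PR = 1; five equal eigenvalues give PR = N = 5.
single = participation_ratio([3.0, 0.0, 0.0, 0.0, 0.0])
uniform = participation_ratio([2.0, 2.0, 2.0, 2.0, 2.0])
```

Intermediate spectra give values between 1 and N, so PR behaves as a continuous count of the dimensions that carry variance.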

During the task period, the same analysis was repeated during the 1 s of pre-reward and post-reward periods, using a z-scored firing rate with 0.2-s bins (5 bins for each period).

HMM

We used HMMs to identify patterns of population activity in the time series, with each pattern corresponding to a specific neural state that is not directly measurable^38,40,72. We fitted an HMM separately for each mouse for the pre-task and task period. For the DREADD dataset, HMMs were fitted for saline and CNO periods of each mouse separately. To perform model fitting, we used the software framework developed by the Linderman Lab (https://github.com/lindermanlab/ssm).

To prepare the data for the HMM analysis, we binned the 6-min pre-task recordings of each session into 1-s bins, resulting in 360 bins. We computed the spike count of each neuron in each bin. The input data for the HMM consisted of an N × T matrix, in which N represents the total number of simultaneously recorded neurons in the session, and T represents the total number of time bins.

For the analysis during the task in the pre-reward and post-reward periods, we computed the spike count in 0.2-s time bins. We fitted separate HMMs for the pre-reward and post-reward periods for sucrose and water trials. To accomplish this, we concatenated the M trials within a single session and arranged the input data in an N × T × M matrix, for which T = 5. We chose the bin size of 0.2 s, because this bin size balanced the inference of maximum possible transition states and total spike count used to fit HMMs.

For decoding of switch versus stay using HMM states, we focused on the 4 s pre-reward period. Spike counts were binned using a 1-s bin size, and concatenated across the 4-s window of all trial types. This resulted in an N × T × M input matrix, for which T = 4, and M represents the total number of recorded trials in the session. Consistent with previous analyses, in our analysis, we retained only sessions with at least five simultaneously recorded neurons.

Given the recorded (observed) spike count over time, we modelled the neuronal activity as a Poisson process, with the mean value dependent on the current neural state. We represented the probability of observing the spike count vector n_t of N neurons at time bin t, given the hidden neural state S_t = j, as being distributed as a multivariate Poisson process: P(n_t | S_t = j) ~ Poisson(Λ; n_t), where ~ denotes ‘distributed as’. Here, Λ = {λ₁, λ₂, … λ_N}, and λ_i represents the estimated mean activity for the ith neuron in state j. The vector Λ corresponds to the jth column of the N × K ‘emission matrix’ E, which provides the firing rates or activation probabilities of observing a specific neuronal pattern when the population activity is in a particular state.

We assumed the dynamics of the neural states to evolve according to a first-order Markovian process, for which the probability of transitioning from one state to another depends only on the current state. This process is summarized by the K × K ‘transition probability’ matrix T. Additionally, we incorporated an initialization vector A, which provides the probability of starting in each state. The HMM was fully described by the set of parameters {E, T, A}, which were inferred by fitting the model to the recorded neuronal spike counts⁷³. We used the Baum–Welch expectation-maximization algorithm to update the model parameters and maximize the likelihood of the observed data. For each time series, we fitted 5 models with a maximum of 100 iterations for each value of the total number of states ranging from 2 up to 50, using randomized initial conditions. The model with the smallest Akaike information criterion score was retained as the best model for further analyses³⁸. Subsequently, we used the Viterbi algorithm to estimate the most likely sequence of states over time.
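To make the decoding step concrete, the sketch below implements Viterbi decoding for a Poisson HMM with already-fitted parameters, together with the Akaike information criterion in its standard form. It is a self-contained illustration and does not use or reproduce the ssm package; the function names are hypothetical, independence of neurons given the state is assumed, and the parameter count passed to `aic` is left to the caller.

```python
import math

def poisson_logpmf(count, rate):
    """Log probability of a spike count under a Poisson with the given mean (> 0)."""
    return count * math.log(rate) - rate - math.lgamma(count + 1)

def safe_log(p):
    return math.log(p) if p > 0 else float("-inf")

def viterbi_poisson(counts, rates, trans, init):
    """Most likely state sequence for a Poisson HMM.

    counts: T lists of N spike counts; rates: K lists of N mean counts;
    trans: K x K transition probabilities; init: K initial probabilities.
    """
    K, T = len(rates), len(counts)

    def emit(t, j):
        # Assumption: neurons are conditionally independent given the state.
        return sum(poisson_logpmf(c, lam) for c, lam in zip(counts[t], rates[j]))

    score = [[safe_log(init[j]) + emit(0, j) for j in range(K)]]
    back = []
    for t in range(1, T):
        row, ptr = [], []
        for j in range(K):
            best = max(range(K), key=lambda i: score[-1][i] + safe_log(trans[i][j]))
            row.append(score[-1][best] + safe_log(trans[best][j]) + emit(t, j))
            ptr.append(best)
        score.append(row)
        back.append(ptr)
    path = [max(range(K), key=lambda j: score[-1][j])]
    for ptr in reversed(back):                    # trace back-pointers to the start
        path.append(ptr[path[-1]])
    return path[::-1]

def aic(log_likelihood, n_params):
    """Akaike information criterion; the model with the smallest value is kept."""
    return 2 * n_params - 2 * log_likelihood
```

In a model-selection loop, `aic` would be evaluated for each fitted model across the candidate numbers of states, and the state sequence would then be decoded from the retained model.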

Agglomerative clustering analysis

To better characterize the spatial structure of the hidden states, we examined the pairwise correlation between the inferred activity of the states. For state 1 with an activity vector X = (x₁, x₂, …, x_N), in which x_i represents the activity of neuron i, and state 2 with an activity vector Y = (y₁, y₂, …, y_N), we computed the Pearson correlation coefficient ρ(X,Y) to assess the distance between the states in the neuronal activity space. We calculated the correlation coefficients for all pairs of states and stored them in an N × N correlation matrix K. Subsequently, we performed agglomerative clustering on the correlation matrix.

Specifically, we defined a new distance matrix D as 1 − K, in which 1 is an N × N matrix of ones. This matrix served as the input to the agglomerative clustering algorithm, which iteratively combines states to define new clusters according to the pairwise distance. The algorithm initialized each state as a separate cluster with minimum distance (maximum correlation) and iteratively merged two clusters v and u with the smallest distance into a new cluster. The new distance d assigned to the agglomerated clusters was defined as d(u,v) = max(dist(u[p], v[q])), in which p and q represent all of the points in the merged clusters u and v, also known as the farthest point algorithm. Agglomerative clustering has the advantage of producing a hierarchical structure of clusters, which we represented as a dendrogram. This hierarchical representation allowed us to examine the relationships and similarities between states, specifically how neural states may be nested differently within large clusters in different groups. Agglomerative clustering does not require any assumption regarding the total number of clusters. It iteratively merges the closest states and clusters until all states are merged into one final cluster. We performed the clustering analysis separately for each mouse, visualizing the results with a dendrogram that summarizes the merging of clusters at different levels of distance, ranging from 0 (original states) to 1 (a single cluster).

After examining the clusters, we counted the total number of clusters at different levels of distance, or thresholds, for which the higher the levels of distance, the lower the number of clusters, until reaching only one cluster at the highest distance. We assessed the curves of the number of total clusters and the proportion of total clusters retained relative to the total number of states as a function of thresholds. Comparing these curves between two groups, a higher number of states at the same threshold value indicates a greater degree of dissimilarity among the inferred states. We retained the proportion of total clusters along these curves from a threshold of 0.1 up to 0.5, resulting in a total of five features that were subsequently used in the decoding of group identity.
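The farthest-point (complete-linkage) merging and the threshold-based cluster counts can be sketched in a few lines. This is an illustrative implementation only: the function names are hypothetical, the input is assumed to be the precomputed distance matrix (1 minus correlation), and counting a merge when its height is less than or equal to the threshold is an assumption.

```python
def complete_linkage_merges(dist):
    """Merge heights of complete-linkage clustering for a symmetric distance matrix."""
    clusters = [[i] for i in range(len(dist))]
    heights = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # farthest-point distance between clusters a and b
                d = max(dist[p][q] for p in clusters[a] for q in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        heights.append(d)
        clusters[a] = clusters[a] + clusters[b]   # merge the closest pair
        del clusters[b]
    return heights

def fraction_of_clusters(dist, thresholds=(0.1, 0.2, 0.3, 0.4, 0.5)):
    """Proportion of clusters retained (relative to the number of states) per threshold."""
    n = len(dist)
    heights = complete_linkage_merges(dist)
    # Complete-linkage heights never decrease, so each merge at or below the
    # threshold removes exactly one cluster.
    return [(n - sum(h <= t for h in heights)) / n for t in thresholds]
```

For three states in which two are nearly identical (distance 0.05) and the third is distant, two of three clusters remain at every threshold from 0.1 to 0.5, giving five identical features of 2/3.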

We applied the clustering analysis to the pre-task activity using the previously inferred states described in the section entitled HMM, as well as to the pre-reward and post-reward task periods for water and sucrose trials separately.

Correlation of population activity across time

To examine how variable population activity was across time during the pre-task period, we performed Pearson correlation on population vectors of neuron firing rates across all time bins (1-s bins). The correlation values were then averaged to assess differences between groups.

Decoding group identity using neural features (dimensionality, hidden states, firing rates)

This analysis aimed to decode the group identity (that is, control, susceptible or resilient groups, or saline versus CNO groups for the DREADD experiment) on a single-mouse basis by analysing the pre-task activity, for which no behavioural labels were available. As described in the section entitled Analysis of embedding dimensionality, the pre-task activity can be represented as a geometrical object in the firing space, with each axis representing the firing rate of a neuron and each point in the space representing the activity of the neuronal ensemble in each time bin. We sought features that characterized the representational object and were invariant to rotations and scaling transformations, or a subset of these transformations, ensuring shape invariance of the object.

We included only mice with at least five neurons simultaneously recorded during the pre-task period. For each mouse, we computed the cumulative variance explained across the PCs (for more details, see the section entitled Analysis of embedding dimensionality). We considered the cumulative values of the first three PCs as features for decoding. Following the inference of hidden states and the clustering analysis, we calculated the proportion of clusters retained at different thresholds and extracted the values at five distinct thresholds (see the section entitled Agglomerative clustering analysis). Additionally, we computed the mean and standard deviation of the spike count as the last two features. All of the neural features were computed using 1-s bins to optimize the final decoding performance. Overall, we assessed a total of 10 neural features for each mouse.

We used the same Mahalanobis binary decoder procedure as previously described in the section entitled Decoding group identity using behavioural features. Specifically in this case, the input to the binary classifier consisted of an N × F training matrix and a 1 × F testing matrix, in which N represents the total number of training mice between the two classes, and F = 10 represents the total number of features. Before running the classification algorithm, we preprocessed the input matrices by applying a minimum–maximum scaler to the mean and standard deviation of the spike count, ensuring that all features were scaled between 0 and 1 (because the PC cumulative variance and fraction of HMM clusters are defined between 0 and 1 by construction). The decoder was trained and tested for 1,000 iterations, with a new random testing subject selected and removed from the training set for each of them.

The same decoder procedure was also applied during the pre-reward and post-reward periods of the task. For the decoding using vCA1 activity, the training set was defined as 20% of the total number of mice owing to the initial larger sample size.

Neural population decoding

As in a previously described method³¹, a linear SVM classifier was trained to classify patterns of activity into two discrete categories. Results are reported as the generalized performance of the decoder using cross-validation with an 80:20 training/testing split. Patterns of activity are defined as the mean firing rate during 0.5-s non-overlapping time bins. Pseudo-population recordings were generated by combining all neurons within the same region and the same group. As it is well known that neural activity in previous trials could strongly influence activity in current trials⁷⁴, for all pseudo-population decoding analyses, we balanced the number of trials of each trial type by taking into account both the previous and current trial types. In other words, we have equal numbers of water–water, sucrose–sucrose, water–sucrose and sucrose–water trials (previous–current trials, respectively). Only neurons with at least eight trials per each of the four trial types were included.

To decode current reward, we combined equal numbers of water–water and sucrose–water trials for water trials, and similarly, equal numbers of sucrose–sucrose and water–sucrose trials for sucrose trials. To decode previous reward, we combined equal numbers of water–water and water–sucrose trials for water trials, and similarly, equal numbers of sucrose–water and sucrose–sucrose trials for sucrose trials. To decode intention (stay versus switch), we combined equal numbers of sucrose–sucrose and water–water trials for stay trials, and similarly, equal numbers of sucrose–water and water–sucrose trials for switch trials. We balanced the previous and current reward values when defining switch and stay trials to rule out the confound of reward choices on intention. In other words, the intention signal that we define here is an intention to switch away or stay on the same reward as the previous trial, irrespective of the specific reward value.
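The trial-balancing logic can be illustrated with a short sketch. The function names and the string labels are hypothetical, and drawing a fixed number of trials per pair at random is an assumption about how the balancing could be implemented.

```python
import random
from collections import defaultdict

def balanced_pairs(rewards, n_per_pair, seed=0):
    """Group trials by (previous, current) reward and draw n_per_pair from each group.

    rewards: sequence of 'sucrose' / 'water' labels in trial order.
    Returns {(previous, current): sorted list of current-trial indices}.
    """
    groups = defaultdict(list)
    for i in range(1, len(rewards)):              # the first trial has no predecessor
        groups[(rewards[i - 1], rewards[i])].append(i)
    rng = random.Random(seed)
    return {pair: sorted(rng.sample(idx, n_per_pair)) for pair, idx in groups.items()}

def intention_label(pair):
    """Stay if the current reward repeats the previous one, otherwise switch."""
    previous, current = pair
    return "stay" if previous == current else "switch"
```

Because each of the four previous–current pairs contributes the same number of trials, the stay and switch classes contain equal numbers of sucrose and water trials, which is what removes reward value as a confound.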

To control for the possibility that differences in direction or action sequence (left versus right) coding contributed to reward choice or intention coding, we performed decoding in mice that were given the same value reward in the two lick spouts (same-reward SPT; for more details, see the section entitled Head-fixed SPT). All decoding procedures are the same.

As each group may have different numbers of cells and trials, we used subsampling procedures to randomly subsample cells (60 neurons for both BLA and vCA1), and within those cells, randomly subsample trials equal to the group with the smallest number of trials. The resulting dataset was used to train the SVM and obtain cross-validated decoding accuracies. For each set of subsampled cells, decoding accuracies across random subsampling of trials (repeated ten times) were averaged to obtain a single sample of decoding accuracy. We repeated the whole procedure ten times to obtain statistical comparisons across groups and against shuffled distribution.

For within-time-bin decoding, SVMs were trained using data from one time bin and tested using held-out data from the same time bin. For cross-time-bin decoding, SVMs were trained using data from one time bin and tested using data from the other time bins.

To control for the possibility that differences in lick rates contributed to differences in decoding accuracy for reward choice, we performed additional analysis in which we equalized the lick rates by using only trials with the same lick rates between groups for decoding. Specifically, for susceptible and resilient mice, we analysed only those trials with lick rates within 3–14 Hz in both groups, whereas for saline and CNO mice, we analysed those trials with lick rates within 3–10 Hz in both groups.

For statistical comparisons, decoding accuracy during pre-reward (−4 to −3 s) and post-reward (0 to 1 s) periods was averaged. If the mean decoding accuracy in a group was significantly higher than 2 s.d. of its respective mean shuffled distribution, we then performed additional between-group comparisons (two-way comparison: Mann–Whitney test; three-way comparison: Kruskal–Wallis test followed by Dunn’s multiple comparisons test).

Decoding switch versus stay using HMM states

In addition to using recorded firing rates during the 4 s pre-reward window to decode switch versus stay, we also trained separate decoders using the smoothed activity of the hidden states inferred by the HMMs. This approach uniquely allowed us to identify population hidden states within this time window, and specifically those states that may be intention selective, which can then be artificially manipulated to assess their necessity in decoding. It is important to note that the training of the HMM was performed on concatenated trials, which includes the four 1-s bins pre-reward across all trial types. We then rearranged the sequence of hidden states in each trial type a posteriori.

Once the parameters of the HMMs were inferred, the models could smooth the observed data by computing the mean observed activity under the posterior distribution of hidden states³⁹. For instance, given the observed activity vector X during a time bin of a trial pre-reward, the HMM inferred a 0.2 probability of being in state S = 1 and a 0.8 probability of staying in state S = 2. More precisely, P(S = 1 | X) = 0.2, and P(S = 2 | X) = 0.8. The smoothed observations used to train and test the linear decoder were calculated as Y = 0.2μ₁ + 0.8μ₂, in which μ_j represents the inferred mean for the observations in state j. Figure 3h is an example spike raster of 15 simultaneously recorded neurons in two switch and two stay trials during 4 s pre-reward from one representative mouse. The different colour-shaded areas are different HMM hidden states, with coloured lines showing the posterior probability for each state.
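The smoothing step amounts to a posterior-weighted average of the state means. A minimal sketch with made-up numbers for a two-neuron, two-state example follows (the function name and the values of the state means are hypothetical):

```python
def smoothed_observation(posterior, state_means):
    """Posterior-weighted mean activity for one time bin.

    posterior: K state probabilities for the bin (summing to 1);
    state_means: K lists, each with the N inferred mean rates of one state.
    """
    n_neurons = len(state_means[0])
    return [sum(p * mu[i] for p, mu in zip(posterior, state_means))
            for i in range(n_neurons)]

# P(S = 1 | X) = 0.2 and P(S = 2 | X) = 0.8, with illustrative state means
y = smoothed_observation([0.2, 0.8], [[1.0, 0.0], [6.0, 10.0]])
```

The resulting vector, one value per neuron, is what would be passed to the linear decoder in place of the raw spike counts.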

To ensure robustness, we randomly sampled 60 neurons from each mouse for 10 neuronal subsamples. We generated 1,000 pseudo-trials for each of the 4 trial types, resulting in a total of 4,000 pseudo-trials for the training and testing sets, separately. The input data to train and test the decoder consisted of the smoothed activity assigned to each time bin. We trained and tested an SVM classifier with a linear kernel, similar to the approach used in the population decoding using original firing rates, to decode switch versus stay. In each cross-validation iteration, we randomly selected 100 pseudo-trials as the training set and 20 pseudo-trials as the testing set, for a total of 100 cross-validations. The final decoder accuracy was computed as the average across neuronal subsamples and cross-validations.

To assess the significance of the decoding signal, we compared it to a chance level, defined as 2 s.d. around the theoretical mean of the distribution of accuracies obtained after 100 shuffles of the labels.

Defining intention-selective states

We conducted a detailed analysis of the distribution of hidden states across trial types to identify intention-selective states. For each mouse, we computed the fraction of occurrence of each hidden state within the 4-s bins pre-reward across all trials. This distribution was then normalized to the total number of trials multiplied by the number of bins. We assessed this normalized distribution separately for each trial type.

Consistent with the decoding results, we observed that certain states appeared exclusively in either the stay or switch trials, with no occurrences in the other trial types. To quantify the amount of information each state held for the intention value (that is, stay or switch), we computed the Shannon entropy⁷⁵. Specifically, for a given state, we normalized its occurrence frequency in each trial type to the total number of trials. The entropy of each state for the intention value was calculated using the following formula:

$$H_{\mathrm{state}}=-\left[P_{\mathrm{switch}}\times \log (P_{\mathrm{switch}})+P_{\mathrm{stay}}\times \log (P_{\mathrm{stay}})\right]$$

in which P_switch is the occurrence frequency of the state in switch trials (water–sucrose, sucrose–water) and P_stay = 1 − P_switch. An entropy value of 0 indicates that the state provides highly informative signals for the intention to switch or stay. Therefore, we defined an intention-selective state as one with an entropy value of 0 for the intention value.
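The entropy criterion can be sketched as follows. The function name is hypothetical; the natural logarithm and the convention 0 × log(0) = 0 are assumptions, as the text does not state the base or how zero frequencies are handled. Neither choice affects which states have zero entropy.

```python
import math

def state_entropy(n_switch, n_stay):
    """Shannon entropy of a state's occurrence across switch and stay trials."""
    total = n_switch + n_stay
    h = 0.0
    for n in (n_switch, n_stay):
        p = n / total
        if p > 0:                                 # convention: 0 * log(0) = 0
            h -= p * math.log(p)
    return h

def is_intention_selective(n_switch, n_stay):
    """A state is intention-selective when it occurs in only one trial class."""
    return state_entropy(n_switch, n_stay) == 0.0
```

A state seen only in switch trials has zero entropy and is selective, whereas a state split evenly between switch and stay trials has the maximum entropy, log(2).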

To decode the intention of switch/stay using hidden states, we first examined the distribution of the fraction of intention-selective states at different clustering thresholds for each mouse, and selected a threshold that yielded the highest number of intention-selective states. We then used the inferred firing rates from these identified intention-selective states to train a linear decoder for classifying the intention of mice to switch or stay.

To compare the fraction of intention-selective states across groups, we calculated the fraction of the intention-selective states out of the total number of hidden states using the first four clustering thresholds (ranging from 0.1 up to 0.4, stepped by 0.1), and compared the resulting distribution.

To examine the necessity and sufficiency of intention-selective states, we first excluded trials that contained intention-selective states in at least three time bins pre-reward. In the opposite approach, we enhanced the presence of intention-selective states in the decoding procedure by considering only those trials that included intention states in at least three time bins before the reward delivery.

Generalization of susceptible versus resilient decoder to saline versus CNO

We trained an SVM with a linear kernel to classify whether an animal is susceptible or resilient in the feature space defined by the three behavioural features (sucrose preference, and lick rate DI in the pre- and post-reward) and the four neural features (reward decoding accuracy in the pre- and post-reward, the intention decoding accuracy pre-reward using raw firing rates, and the intention accuracy pre-reward using HMM states). We used one held-out mouse as a testing sample, and the remaining ones as the training set, after balancing the number of training samples per each class. We repeated this procedure for a total of 1,000 cross-validations. We subsequently tested the generalization performance of the decoder in classifying new susceptible mice, not used for training, before and after the treatment of CNO. We assessed the significance of the average decoding performance across the 1,000 cross-validations with respect to a chance interval defined as 2 s.d. around the chance level of 0.5 of the distribution accuracies obtained from 100 shuffles of the labels.

Decoding group identity of susceptible versus resilient mice using behavioural and neural features

We trained an SVM with a linear kernel to classify the group identity (control, susceptible and resilient) using neural signatures of the task phase, specifically reward choice decoding performance during the pre- and post-reward period, and the fraction of intention-selective states (see the section entitled Defining intention-selective states). We used one held-out mouse as a testing sample, and the remaining ones as the training set, after balancing the number of training samples per group. We repeated this procedure for 1,000 cross-validations. We compared the average decoding performance across the 1,000 cross-validations to a chance interval defined as 2 s.d. around the chance level of 0.5 of the distribution accuracies obtained from 100 shuffles of the labels.

MDS

To visualize the geometric structure of the data, we used multi-dimensional scaling (MDS) transformation to obtain a low-dimensional representation of the data. For pre-task data, we started with the N × F matrix used for the Mahalanobis decoder, in which N represents the total number of subjects across all three groups, and F denotes the number of features used for decoding the group identity. Before the dimensionality reduction analysis, we normalized each group’s data by its variance to reduce noise and enhance the clarity of the final visualization. Next, we performed a diagonalization of the dissimilarity matrix N × N, which contained the Euclidean distances between each pair of subjects in the feature space. We used the same procedure for the task period. In these cases, the input matrix was a T × N matrix, in which T represents the total number of pseudo-trials, and N denotes the number of neurons. In the example MDS plots, each point is the average firing rate across neurons during the specified time window for the specified trial type, with n = 1 subsampling, 60 neurons, 1,000 pseudo-trials per condition.

Inter-regional connectivity

vCA1–BLA connectivity during pre-task period

We computed the firing rates of each recorded neuron in 1-s bins within the same region (BLA and vCA1) during a 6-min window (min 2–8) of pre-task recording. We set a minimum of five neurons simultaneously recorded in both BLA and vCA1. Given the matrix N × T, in which N denotes the number of neurons, and T denotes the number of time bins, we computed the PCs of this matrix for each area, separately. We subsequently aligned the neural dynamics of the first PC between the two simultaneously recorded signals by using canonical correlation analysis (CCA). CCA is a linear transformation used to find common patterns between two signals defined in two different spaces, with the goal of maximizing their correlation. Given two matrices X ∈ ℝ^(N×T) and Y ∈ ℝ^(M×T), in which N and M are the numbers of variables and T is the number of time bins, CCA finds linear combinations U and V of the features in X and Y such that,

$$U=a^{\mathrm{T}}X,\quad V=b^{\mathrm{T}}Y$$

for which the coefficients a ∈ ℝ^(N×1) and b ∈ ℝ^(M×1) are chosen to maximize the correlation between U and V. We refer to U and V as canonical components for BLA and vCA1, respectively. We subsequently analysed the cross-correlogram between the first canonical component of BLA and vCA1 with time lags from −50 to +50 s and its corresponding power spectral density. We computed the power spectral density from the squared magnitude of the fast Fourier transform coefficients⁷⁶ divided by the length of the input signal. We used the frequency at which the power spectral density peaked as an estimate of the dominant frequency of the oscillations between BLA and vCA1. The highest frequency we could access is determined by the Nyquist frequency f = f_s/2 = 0.5 Hz, in which f_s is the sampling frequency that in our case is 1 Hz. We tested smaller time bin sizes and chose 1-s bins (hence 1 Hz sampling frequency) owing to low firing rates during the pre-task period, which would otherwise result in many bins with 0 spikes per second.
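The power spectral density estimate and the Nyquist limit can be illustrated with a short sketch. The function name is hypothetical, and subtracting the mean before the transform (so that the zero-frequency term does not dominate the peak search) is an assumption not stated in the text.

```python
import numpy as np

def dominant_frequency(signal, fs=1.0):
    """Peak frequency of the power spectral density of a real-valued signal.

    PSD is the squared magnitude of the FFT coefficients divided by the signal length.
    """
    x = np.asarray(signal, float)
    x = x - x.mean()                              # assumption: remove the DC component
    coeffs = np.fft.rfft(x)
    psd = np.abs(coeffs) ** 2 / x.size
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)   # from 0 up to the Nyquist frequency fs/2
    return freqs[np.argmax(psd)], freqs, psd

# Example: a 0.1-Hz oscillation sampled at 1 Hz for 360 s
t = np.arange(360)
peak, freqs, psd = dominant_frequency(np.sin(2 * np.pi * 0.1 * t), fs=1.0)
```

With 1-s bins the frequency axis ends at 0.5 Hz, so any oscillation faster than one cycle per 2 s cannot be resolved.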

vCA1–BLA correlation during pre-reward period

Given the shorter time window during the task period, we could not use the same CCA analysis. Therefore, to analyse the vCA1–BLA interaction before reward, we computed the correlation of regional average firing rates between simultaneously recorded neurons in the two regions. Specifically, firing rates (10-ms bins) were averaged across all simultaneously recorded neurons in each mouse within the same region (BLA and vCA1). Then Pearson correlation was computed across simultaneously recorded regions within each 1-s time window. The correlation was performed for each trial type (sucrose, water, switch, stay) separately, and Pearson correlation r was transformed to Fisher z to make it normally distributed. To assess how different the inter-regional correlation is in sucrose versus water trials for each animal, we calculated the change in correlation (corr_sucrose − corr_water).
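The correlation and Fisher transformation steps can be sketched as follows (function names are hypothetical). Note that the transform is undefined for |r| = 1, which the sketch does not handle.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fisher_z(r):
    """Fisher z-transform, z = atanh(r) = 0.5 * ln((1 + r) / (1 - r))."""
    return math.atanh(r)

def correlation_change(z_sucrose, z_water):
    """Difference in transformed inter-regional correlation, sucrose minus water."""
    return z_sucrose - z_water
```

Here `x` and `y` would be the regional average firing rates of BLA and vCA1 in the 10-ms bins of one 1-s window.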

vCA1–BLA correlation during intention-selective states

We subsequently studied the functional connectivity between BLA and vCA1 in susceptible mice during the presence of intention states in the 4 s before reward delivery. We started by selecting time bins (1 s) for which the intention states were detected in BLA (‘intention-selective’) (see the section entitled Defining intention-selective states), and those bins without intention states (‘non-intention-selective’). We then analysed the neural activity of simultaneously recorded BLA and vCA1 neurons during these inferred states, comparing the correlation between the two regions during intention-selective versus non-intention-selective states. We randomly sampled five neurons from each state in BLA and vCA1 and defined the activity matrices X_(area) ∈ ℝ^(N×K) and Y_(area) ∈ ℝ^(N×L), for which N = 5 is the number of simultaneously recorded neurons, area is BLA or vCA1, and K and L are the number of intention and no-intention bins, respectively, for a total of four activity matrices. We computed the PCs of each of the four matrices as a denoising procedure and subsequently assessed the Pearson correlation between BLA and vCA1 in each of the first five PCs for each mouse, for intention-selective and non-intention-selective bins separately. We repeated the above procedure 1,000 times, each iteration with different neuron sampling from each brain area, and we computed the average correlation across different sampling.

Statistical analysis

No statistical tests were used to predetermine sample size, but the sample sizes used are similar to those generally used within the field⁵. All tests were two-tailed. Data were analysed using parametric one- or two-way repeated measures ANOVA, or paired t-test. In cases in which it was appropriate, ANOVA was followed by post hoc pairwise comparisons with corrections for multiple comparisons. If data were significantly non-normal (with α = 0.05), non-parametric tests were used, including the Kruskal–Wallis test or the Mann–Whitney test (between-group comparisons) and Wilcoxon signed-rank test (within-group comparisons), and if appropriate, followed by post hoc comparisons with corrections for multiple comparisons. Categorical data were assessed using chi-squared, or Fisher’s exact test if sample size was <5. When comparing to chance, data were considered significant if they were outside 2 s.d. of chance distribution centred around the theoretical chance level (marked by hash symbols on figures). Statistical comparisons between groups were performed for groups that were significantly different from respective chance distribution. Statistical analyses were performed using GraphPad Prism V10.

Statistics and reproducibility

All experiments were repeated across a minimum of two independent cohorts and showed similar results.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.