Thursday, June 25, 2026

An ECG biomarker for sudden cardiac death discovered with deep learning

Study cohort and outcomes: Sweden

We obtained all 441,614 ECGs done from 2010 to 2016 in Region Halland, a public regional health system in Sweden. (Twelve patients in the region opted out of participation in research, so we did not include their ECGs.) We linked these to death certificates and patient electronic health records, which capture all interactions between patients and the national health-care system that oversees all care in Sweden [51]. ECGs were sampled at 500 Hz and retrieved in XML format from a Philips IntelliSpace system. This research was approved by the ethical review board of Lund University (protocol 2016/517 and amendment 2024-02316-02).

Before performing any analysis, we created strict random splits in our dataset to safeguard against overfitting (see Supplementary Information section VII.A for a CONSORT-style diagram). We first created a data lockbox by randomly sampling 40% of patients and all of their ECGs. The lockbox remained untouched from model development through peer review, until provisional acceptance of the manuscript. The remaining 60% was split in half at the patient level, one half for training and the other half for validation, and used for initial submission and the usual peer-review process. On provisional acceptance of the manuscript, we retrained the model on the 60% of data we had accessed, 262,554 ECGs from 75,157 patients, and applied the resulting model with no modification to generate predictions in the 40% lockbox: 179,060 ECGs from 51,481 patients. Those results are shown here. Supplementary Information section VII.A records changes between the initial submitted version and the present version. Model performance improved, consistent with a larger training set size (for example, AUC went from 0.837 to 0.872), providing additional reassurance with regard to overfitting.
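The patient-level split described above can be illustrated with a minimal Python sketch. This is not the study code: the function name, the seed and the toy patient identifiers are assumptions made for illustration. The point it shows is that splitting happens on patients, so every ECG from a given patient lands in the same partition.

```python
import random

def split_patients(patient_ids, seed=0):
    """Assign each patient to 'lockbox' (40%), 'train' (30%) or 'validation' (30%)."""
    ids = sorted(set(patient_ids))        # one entry per patient, stable order
    rng = random.Random(seed)             # hypothetical seed, for reproducibility
    rng.shuffle(ids)
    n_lockbox = round(0.4 * len(ids))
    n_train = (len(ids) - n_lockbox) // 2  # remaining 60% split in half
    assignment = {}
    for i, pid in enumerate(ids):
        if i < n_lockbox:
            assignment[pid] = "lockbox"
        elif i < n_lockbox + n_train:
            assignment[pid] = "train"
        else:
            assignment[pid] = "validation"
    return assignment

# Toy data: 1,000 hypothetical patients with three ECGs each.
ecgs = [(f"p{i}", f"ecg{i}_{j}") for i in range(1000) for j in range(3)]
assignment = split_patients(pid for pid, _ in ecgs)

# Each ECG inherits the partition of its patient.
ecg_split = {ecg_id: assignment[pid] for pid, ecg_id in ecgs}
```

Because the partition is looked up by patient, no patient can contribute ECGs to both the training data and the lockbox.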

We perform all analyses at the ECG level, to account for risk variation over time within a patient, and account for within-patient correlation by clustering standard errors by patient. We view this as preferable to selecting one ECG per patient, which reduces sample size and can introduce bias (for example, the most recent ECG selects on those who survived past initial ECGs). All statistical tests are two-sided.
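To make the idea of clustering standard errors by patient concrete, the following sketch computes a cluster-robust standard error for a simple mean (for example, an event rate across ECGs). It is an illustration under stated assumptions, not the estimator used in the study: the function name is hypothetical, and the G/(G-1) small-sample factor is one common convention among several.

```python
import math
from collections import defaultdict

def clustered_se_of_mean(values, cluster_ids):
    """Mean of ECG-level values and its standard error clustered by patient."""
    n = len(values)
    mean = sum(values) / n
    # Sum residuals within each cluster (patient); correlated ECGs from the
    # same patient then count as one block rather than independent draws.
    resid_sums = defaultdict(float)
    for v, c in zip(values, cluster_ids):
        resid_sums[c] += v - mean
    g = len(resid_sums)                     # number of clusters
    meat = sum(s * s for s in resid_sums.values())
    variance = (g / (g - 1)) * meat / (n * n)  # assumed small-sample factor
    return mean, math.sqrt(variance)

# Toy example: six ECGs from four hypothetical patients.
outcomes = [1, 1, 0, 0, 1, 0]
patients = ["a", "a", "b", "b", "c", "d"]
mean, se = clustered_se_of_mean(outcomes, patients)
```

When every ECG belongs to a different patient, this reduces to the usual standard error of a mean; repeated ECGs from the same patient with similar outcomes widen it.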

Our primary outcome, sudden cardiac death in the year after ECGs, was censored for ECGs in the final year of our dataset (2016). Although we do not have death certificates after 2016, we do have access to full electronic health records from 2017 onwards, which indicate whether a clinical encounter occurred. If we observe such an encounter in the year after the ECG, we label the outcome as absent, not missing. The result is that only 12,969 out of 247,286 ECG records in the training set and 6,446 out of 125,987 ECGs in the lockbox are censored, and are thus excluded from outcome evaluation metrics (they are used selectively in training, as detailed in Supplementary Information section IX).
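The labelling rule above can be written as a small function. This is a simplified sketch, not the study's implementation: the function name and arguments are hypothetical, the end date of death-certificate coverage is assumed to be 31 December 2016, and the sketch assumes the relevant encounter must fall after that date and within a year of the ECG.

```python
from datetime import date, timedelta

# Assumption: death certificates are complete through the end of 2016.
DEATH_DATA_END = date(2016, 12, 31)

def label_one_year_scd(ecg_date, scd_date=None, encounter_dates=()):
    """Return 1 (sudden cardiac death within a year), 0 (absent) or None (censored)."""
    window_end = ecg_date + timedelta(days=365)
    if scd_date is not None and ecg_date <= scd_date <= window_end:
        return 1
    if window_end <= DEATH_DATA_END:
        return 0  # the full follow-up year is covered by death certificates
    # Follow-up extends past death-certificate coverage: a later clinical
    # encounter inside the window is taken as evidence the outcome is absent.
    if any(DEATH_DATA_END < d <= window_end for d in encounter_dates):
        return 0
    return None   # censored: excluded from outcome evaluation metrics
```

For example, an ECG from June 2016 with no recorded death and a clinic visit in February 2017 would be labelled 0, whereas the same ECG with no later encounter would be censored.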

Our primary definition of sudden cardiac death is based on death certificates, using standard epidemiological criteria [24]: deaths (i) from cardiac or ill-defined causes, and (ii) occurring outside the hospital or in the first 24 h of hospital stays. Details are in Supplementary Information section VII.B. There are many approaches for measuring sudden cardiac death, each with trade-offs. An idealized definition is ‘arrhythmic death’: death preceded by an arrhythmia that can be terminated by defibrillation (ventricular fibrillation or ventricular tachycardia: VF/VT). Of course, measuring this would require continuous premortem ECG monitoring, which is rare. Most studies thus rely on other data: diagnosis codes from death certificates, medical chart review or autopsy. Detailed chart review and autopsy might provide more certainty about arrhythmic causes of death, but exist only for small samples; death-certificate data achieve larger scale, at the expense of detail.

A large body of research has investigated how well our primary definition agrees with more detailed investigations of arrhythmic deaths (for example, in-depth case review, autopsy). Some studies find close agreement [52], whereas others find that death certificates are more sensitive than specific for arrhythmic deaths [24,25,26,27,28]. Low specificity would mean that our definition, and thus our predictions, might capture a mix of arrhythmic and non-arrhythmic deaths. Assessing model performance on the basis of death certificates only could thus be misleading: the model would seem to perform well, but some fraction of deaths in the high-risk group would be non-arrhythmic deaths, and thus not preventable with defibrillators.

The limitations of any one definition of arrhythmic death make it crucial to use multiple sources of data to validate model predictions. The experiments described above use three such data sources. First, diagnosed ventricular arrhythmias, the mechanism for sudden cardiac death, as documented in health records from both Sweden and an independent US cohort. Second, detailed investigation into the cause of individual cardiac arrests, in our hospital-based registry from Taiwan. Third, direct estimation of potential mortality reductions from defibrillators, comparing patients with and without defibrillators, as a measure of preventability of deaths in high-risk patients. All of this means that we do not rely on death certificates alone to validate predictions, but also incorporate a range of other information across several independent datasets, to isolate preventable arrhythmic deaths.

Our analysis focuses on younger, healthier high-risk patients who could be good candidates for defibrillators, because our ultimate goal is the prevention of arrhythmic deaths. There is no formal age restriction for defibrillator placement [8], but benefit probably decreases with age. Older patients have more complications from the surgical implantation procedure and are more likely to die of competing causes. Physicians think that benefit diminishes for those over 80 years old [53], and empirically, only 10% of US defibrillators are implanted in this age group [54]. We thus focus on ECGs done in patients under 80 years old in the main results, which account for 74.6% of all ECGs and 25.7% of all sudden cardiac deaths. Supplementary Information section VIII replicates all main analyses in the entire cohort, and finds that performance is comparable overall when those over 80 are included.

Summary statistics for the lockbox sample are in Table 1. The median follow-up period was 2,010 days. The overall rate of sudden cardiac death in the year after ECGs was 0.6%; 43.9% of these deaths had left ventricular ejection fraction (LVEF) measured premortem, and 36.4% of measured LVEFs were low (LVEF ≤ 35%). About two in five sudden cardiac deaths (41.2%) had no obvious risk factors at the time of their ECG: no coronary artery disease or myocardial infarction, heart failure or prior ventricular arrhythmias. In addition, 10.0% had a recent myocardial infarction (within 40 days before the ECG, versus 3.3% base rate), and 7.7% had defibrillators implanted (versus 5.4% base rate) but nonetheless experienced sudden cardiac death.

Study cohort and outcomes: USA and Taiwan

We rely on two external datasets to validate model predictions in diverse populations outside of its training context in Sweden. The first is a US cohort of ECGs provided by Dandelion Health, a company that aggregates deidentified medical imaging and health outcomes data from a consortium of health systems across the USA. We obtain all 251,858 ECGs performed in 2021 and 2022, from 139,613 patients under the age of 80 years, sampled at either 500 Hz (83.3%) or 250 Hz (16.7%) and drawn from a GE MUSE storage system. Of note, this is a different manufacturer and format to that of the Philips system from which the Swedish training data were obtained. All 250-Hz ECGs were linearly interpolated to 500 Hz. Supplementary Information section VI.C contains additional details on the population.
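As an illustration of the resampling step, the sketch below doubles the sampling rate of a single lead by linear interpolation, inserting the midpoint between each pair of neighbouring samples. It is a minimal example, not the study code: the function name is hypothetical, and repeating the final sample so that the output has exactly twice the input length is an assumption about edge handling.

```python
def upsample_2x_linear(samples):
    """Linearly interpolate one ECG lead from 250 Hz to 500 Hz."""
    if not samples:
        return []
    out = []
    for a, b in zip(samples, samples[1:]):
        out.append(a)            # original sample
        out.append((a + b) / 2)  # interpolated midpoint
    out.append(samples[-1])
    out.append(samples[-1])      # assumption: hold the last value to reach 2x length
    return out

# A 10-s lead at 250 Hz (2,500 samples) becomes 5,000 samples at 500 Hz.
lead_250hz = [float(i % 50) for i in range(2500)]  # toy waveform
lead_500hz = upsample_2x_linear(lead_250hz)
```

The original samples are preserved at the even positions of the output, so the upsampled signal passes through every measured point.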

The second external dataset is a Taiwanese hospital registry available through the Nightingale Open Science platform [55], and described in more detail elsewhere [56]. Patients were enrolled after being brought to the hospital emergency department (ED) in cardiac arrest (that is, without measurable cardiac output), or after experiencing arrest in the ED, from 2011 to 2019. Data were entered using an Utstein-style reporting template, and linked to all patient ECGs in hospital records, including those before the arrest. The dataset also contains ECGs for a random sample of control patients who visited the same ED (without arrest). We identify all ECGs before the ED visit, and exclude those done in the two days before visits, which might have been done in the context of the same acute event that precipitated arrest, making them less useful for prevention. The final sample includes 4,268 patients: 257 arrests and 4,011 controls. ECGs were sampled at 500 Hz and retrieved from a GE MUSE ECG storage system. Supplementary Information section VI.D contains further details on the population.
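The two-day exclusion window can be expressed as a simple filter on ECG timestamps. The sketch below is illustrative only: the function name and the toy timestamps are hypothetical, and treating an ECG taken exactly 48 h before the visit as excluded is an assumption about the boundary.

```python
from datetime import datetime, timedelta

def eligible_ecgs(ecg_times, visit_time, exclusion=timedelta(days=2)):
    """Keep ECGs recorded more than `exclusion` before the ED visit."""
    cutoff = visit_time - exclusion
    return [t for t in ecg_times if t < cutoff]

# Toy example for one hypothetical patient.
visit = datetime(2015, 5, 10, 12, 0)
ecg_times = [
    datetime(2015, 5, 9, 8, 0),   # inside the two-day window: excluded
    datetime(2015, 5, 8, 12, 0),  # exactly on the boundary: excluded (assumption)
    datetime(2015, 5, 1, 9, 0),   # kept
    datetime(2014, 1, 1, 0, 0),   # kept
]
kept = eligible_ecgs(ecg_times, visit)
```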

Predictive model training

Before model development, we surveyed the literature and online competition platforms (for example, PhysioNet) to review deep-learning architectures used for ECG waveforms. After experimenting with several models (convolutional and residual neural networks, transformers and long short-term memory (LSTM) models) and hyperparameter choices (number of layers, dropout probability, learning rate and so on), we settled on a 64-layer ResNet model consisting of 32 residual blocks, with each block made up of 2 convolutional layers with 128 filters and a kernel of size 16. We verified that this model was able to achieve performance matching or exceeding published benchmarks, both for human-visible ECG features (for example, QRS and corrected QT (QTc) intervals, atrial fibrillation and flutter, and so on [57]) and for less obvious patient characteristics (for example, age, sex [58] and cardiovascular outcomes [23,59,60]).

The model’s primary training objective was to predict the probability of sudden cardiac death, on the basis of death certificates, in the year after the ECG. To do so, we developed a multitask learning set-up with three main components, each developed on a different subset of training data: (i) in the entire cohort, we predicted sudden cardiac death (over several time frames, as well as other outcomes; see Supplementary Information section IX); (ii) in patients who died within one year of the ECG, we predicted sudden cardiac death versus other causes of mortality; and (iii) in patients with measured LVEF, we predicted reduced LVEF (≤35%). We then used logit to calibrate predictions (formed in patients of all ages) using only patients aged under 80 years old, all in the Swedish training set, to generate final predictions on one-year sudden cardiac death. Further details are provided in Supplementary Information section IX. In both the Swedish lockbox and external validation samples, model predictions are not modified or fine-tuned in any way, to measure ‘zero-shot’ performance in a new dataset (and, in the USA and Taiwan, on a new label).
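The calibration step can be illustrated with a logistic recalibration of raw model scores. The sketch below fits P(y = 1) = sigmoid(a + b · logit(raw score)) by Newton's method using only the standard library. It is a generic example under stated assumptions rather than the procedure in Supplementary Information section IX: the function names, the toy scores and the choice of a two-parameter form are all hypothetical.

```python
import math

def logit(p):
    return math.log(p / (1 - p))

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def fit_logit_calibration(raw_probs, labels, n_iter=25):
    """Fit intercept a and slope b of a logistic model on logit(raw_prob)."""
    x = [logit(p) for p in raw_probs]
    a, b = 0.0, 1.0                       # start from the identity mapping
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, labels):
            p = sigmoid(a + b * xi)
            w = p * (1 - p)
            g0 += yi - p                  # gradient w.r.t. a
            g1 += (yi - p) * xi           # gradient w.r.t. b
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det == 0:
            break
        a += (h11 * g0 - h01 * g1) / det  # Newton update: solve H * step = g
        b += (h00 * g1 - h01 * g0) / det
    return a, b

def calibrate(raw_prob, a, b):
    return sigmoid(a + b * logit(raw_prob))

# Toy calibration set (hypothetical; in the text, only patients under 80 are
# used at this stage). Raw scores are overconfident in both directions.
raw = [0.1] * 10 + [0.9] * 10
y = [1] * 3 + [0] * 7 + [1] * 7 + [0] * 3
a, b = fit_logit_calibration(raw, y)
```

In this toy set, a raw score of 0.1 corresponds to an observed event rate of 30% and a raw score of 0.9 to 70%, so the fitted mapping pulls both scores towards those observed rates.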

Generative model and morphing procedure

Both the generative model and the predictive model that guides it are trained on a dataset of individual beats, viewed across all 12 leads, that are segmented from the full sample of 10-s ECGs using standard methods [38]. Visualizing individual beats rather than 10-s ECGs makes it easier to understand the focused changes in beat morphology identified by the predictive model. To retrain the predictive model, we use the same network architecture and training procedure as those used for the model trained on 10-s ECGs. To train the generative model, we implement a variational auto-encoder (VAE) with a 512-dimensional latent space that encodes these individual beats. Of note, all results from the generative model and morphing procedure are drawn from the initial dataset (30% training, 30% validation) rather than the lockbox: the process was computationally intensive, and subsequent results involved several rounds of human review (waveforms, blinded interpretation of linked MRIs) that were difficult to perform again in the lockbox. We see this as acceptable because the emphasis of these results is hypothesis generation, rather than predictive performance.

The morphing procedure begins by randomly sampling 56 patient beats for the VAE to encode (the number was chosen given computational constraints, because the full pipeline here took hours to execute for each beat). These beats serve as our point of entry for exploration of the model’s latent space: we identify the gradient of predicted risk around each beat, and perturb its latent vector to follow the gradient. This produces a higher-risk vector, which we then pass through the decoder to reconstruct, resulting in a counterfactual, higher-risk ECG waveform. The new vector is the starting point for another round of perturbation and reconstruction, which we repeat 2,000 times, or until the risk of the generated synthetic beat reaches the 90th risk percentile. Further details on this procedure can be found in Supplementary Information section VII.B, and the accompanying codebase is available at https://github.com/alexmschubert/ECG-SCD.
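The iterative loop described above (follow the risk gradient in latent space, stop after 2,000 steps or once a target risk is reached) can be sketched with a toy stand-in for the real models. This is not the code in the linked repository: the function names, the step size, the three-dimensional latent vector and the logistic toy risk function are all assumptions, and the VAE decoder is omitted, so the sketch tracks latent vectors only.

```python
import math

def morph_latent(z, risk_fn, grad_fn, target_risk, step_size=0.5, max_steps=2000):
    """Gradient ascent on predicted risk; returns the path of latent vectors."""
    z = list(z)
    path = [z]
    for _ in range(max_steps):
        if risk_fn(z) >= target_risk:   # e.g. the 90th-percentile risk value
            break
        g = grad_fn(z)
        z = [zi + step_size * gi for zi, gi in zip(z, g)]  # step along the gradient
        path.append(z)                  # in the real pipeline, each z would be decoded
    return path

# Toy risk model (hypothetical): logistic function of a linear score.
W = [0.8, -0.5, 0.3]
BIAS = -3.0

def toy_risk(z):
    score = sum(w * zi for w, zi in zip(W, z)) + BIAS
    return 1 / (1 + math.exp(-score))

def toy_risk_grad(z):
    r = toy_risk(z)
    return [r * (1 - r) * w for w in W]  # analytic gradient of the logistic risk

path = morph_latent([0.0, 0.0, 0.0], toy_risk, toy_risk_grad, target_risk=0.5)
risks = [toy_risk(z) for z in path]
```

Each entry in `path` is a progressively higher-risk latent vector; in a full pipeline, passing these through a decoder would yield the sequence of counterfactual waveforms.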

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
