Training of physical neural networks

  • Samborska, V. Scaling up: how increasing inputs has made artificial intelligence more capable. Our World in Data https://ourworldindata.org/scaling-up-ai (2025).

  • Sebastian, A., Le Gallo, M., Khaddam-Aljameh, R. & Eleftheriou, E. Memory devices and applications for in-memory computing. Nat. Nanotechnol. 15, 529–544 (2020).

  • Wetzstein, G. et al. Inference in artificial intelligence with deep optics and photonics. Nature 588, 39–47 (2020).

  • Wright, L. G. et al. Deep physical neural networks trained with backpropagation. Nature 601, 549–555 (2022).


  • Tanaka, G. et al. Recent advances in physical reservoir computing: a review. Neural Netw. 115, 100–123 (2019).

  • Hughes, T. W., Williamson, I. A., Minkov, M. & Fan, S. Wave physics as an analog recurrent neural network. Sci. Adv. 5, eaay6946 (2019).

  • Onodera, T. et al. Scaling on-chip photonic neural processors using arbitrarily programmable wave propagation. Preprint at https://arxiv.org/abs/2402.17750 (2024).

  • Momeni, A., Rahmani, B., Malléjac, M., del Hougne, P. & Fleury, R. Backpropagation-free training of deep physical neural networks. Science 382, 1297–1303 (2023).

  • Xu, Z. et al. Large-scale photonic chiplet Taichi empowers 160-TOPS/W artificial general intelligence. Science 384, 202–209 (2024).

  • Rumelhart, D. E., Hinton, G. E. & Williams, R. J. Learning representations by back-propagating errors. Nature 323, 533–536 (1986).

  • Lin, X. et al. All-optical machine learning using diffractive deep neural networks. Science 361, 1004–1008 (2018).

  • Le Gallo, M. et al. A 64-core mixed-signal in-memory compute chip based on phase-change memory for deep neural network inference. Nat. Electron. 6, 680–693 (2023).

  • Chen, Z. et al. Deep learning with coherent VCSEL neural networks. Nat. Photon. 17, 723–730 (2023).

  • Mengu, D. et al. Misalignment resilient diffractive optical networks. Nanophotonics 9, 4207–4219 (2020).

  • Matsushima, K. & Shimobaba, T. Band-limited angular spectrum method for numerical simulation of free-space propagation in far and near fields. Opt. Express 17, 19662–19673 (2009).

  • Launay, J., Poli, I., Boniface, F. & Krzakala, F. Direct feedback alignment scales to modern deep learning tasks and architectures. Adv. Neural Inf. Process. Syst. 33, 9346–9360 (2020).


  • Cramer, B. et al. Surrogate gradients for analog neuromorphic computing. Proc. Natl Acad. Sci. USA 119, e2109194119 (2022).

  • Spall, J., Guo, X. & Lvovsky, A. I. Hybrid training of optical neural networks. Optica 9, 803–811 (2022).

  • Lillicrap, T. P., Cownden, D., Tweed, D. B. & Akerman, C. J. Random synaptic feedback weights support error backpropagation for deep learning. Nat. Commun. 7, 13276 (2016).

  • Brunton, S. L. & Kutz, J. N. Data-Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control (Cambridge Univ. Press, 2022).

  • Hinton, G. The forward-forward algorithm: some preliminary investigations. Preprint at https://arxiv.org/abs/2212.13345 (2022).

  • Laydevant, J., Lott, A., Venturelli, D. & McMahon, P. L. The benefits of self-supervised learning for training physical neural networks. In Proc. First Workshop on Machine Learning with New Compute Paradigms at NeurIPS 2023 (MLNCP 2023) https://openreview.net/forum?id=Fik4cO7FXd (OpenReview, 2023).

  • Refinetti, M., d’Ascoli, S., Ohana, R. & Goldt, S. Align, then memorise: the dynamics of learning with feedback alignment. In Proc. 38th International Conference on Machine Learning, 8925–8935 (MLR Press, 2021).

  • Lillicrap, T. P., Cownden, D., Tweed, D. B. & Akerman, C. J. Random feedback weights support learning in deep neural networks. Preprint at https://arxiv.org/abs/1411.0247 (2014).

  • Launay, J. et al. Hardware beyond backpropagation: a photonic co-processor for direct feedback alignment. Preprint at https://arxiv.org/abs/2012.06373 (2020).

  • Nakajima, M. et al. Physical deep learning with biologically inspired training method: gradient-free approach for physical hardware. Nat. Commun. 13, 7847 (2022).

  • Hinton, G. E., Dayan, P., Frey, B. J. & Neal, R. M. The “wake-sleep” algorithm for unsupervised neural networks. Science 268, 1158–1161 (1995).

  • Löwe, S., O’Connor, P. & Veeling, B. Putting an end to end-to-end: gradient-isolated learning of representations. In Proc. Advances in Neural Information Processing Systems 32 (NeurIPS 2019), 3039–3051 (ACM, 2019).

  • Nøkland, A. & Eidnes, L. H. Training neural networks with local error signals. In Proc. 36th International Conference on Machine Learning, 4839–4850 (MLR Press, 2019).

  • Siddiqui, S. A., Krueger, D., LeCun, Y. & Deny, S. Blockwise self-supervised learning at scale. Preprint at https://arxiv.org/abs/2302.01647v1 (2023).

  • Oguz, I. et al. Forward–forward training of an optical neural network. Opt. Lett. 48, 5249–5252 (2023).

  • Xue, Z. et al. Fully forward mode training for optical neural networks. Nature 632, 280–286 (2024).

  • Spall, J. C. Multivariate stochastic approximation using a simultaneous perturbation gradient approximation. IEEE Trans. Autom. Control 37, 332–341 (1992).

  • McCaughan, A. N. et al. Multiplexed gradient descent: fast online training of modern datasets on hardware neural networks without backpropagation. APL Mach. Learn. 1, 026118 (2023).

  • Bandyopadhyay, S. et al. Single-chip photonic deep neural network with forward-only training. Nat. Photon. 18, 1335–1343 (2024).

  • Oguz, I. et al. Programming nonlinear propagation for efficient optical learning machines. Adv. Photonics 6, 016002 (2024).

  • Skalli, A. et al. Annealing-inspired training of an optical neural network with ternary weights. Commun. Phys. 8, 68 (2025).

  • Bueno, J. et al. Reinforcement learning in a large-scale photonic recurrent neural network. Optica 5, 756–760 (2018).

  • Kanno, K., Naruse, M. & Uchida, A. Adaptive model selection in photonic reservoir computing by reinforcement learning. Sci. Rep. 10, 10062 (2020).

  • Hermans, M., Burm, M., Van Vaerenbergh, T., Dambre, J. & Bienstman, P. Trainable hardware for dynamical computing using error backpropagation through physical media. Nat. Commun. 6, 6729 (2015).

  • Burr, G. W. et al. Neuromorphic computing using non-volatile memory. Adv. Phys. X 2, 89–124 (2017).


  • Pai, S. et al. Experimentally realized in situ backpropagation for deep learning in photonic neural networks. Science 380, 398–404 (2023).

  • Morichetti, F. et al. Non-invasive on-chip light observation by contactless waveguide conductivity monitoring. IEEE J. Sel. Top. Quantum Electron. 20, 292–301 (2014).

  • Zhou, T. et al. In situ optical backpropagation training of diffractive optical neural networks. Photonics Res. 8, 940–953 (2020).

  • Guo, X., Barrett, T. D., Wang, Z. M. & Lvovsky, A. Backpropagation through nonlinear units for the all-optical training of neural networks. Photonics Res. 9, B71–B80 (2021).

  • Wanjura, C. C. & Marquardt, F. Fully nonlinear neuromorphic computing with linear wave scattering. Nat. Phys. 20, 1434–1440 (2024).

  • Yildirim, M., Dinc, N. U., Oguz, I., Psaltis, D. & Moser, C. Nonlinear processing with linear optics. Nat. Photon. 18, 1076–1082 (2024).

  • Xia, F. et al. Nonlinear optical encoding enabled by recurrent linear scattering. Nat. Photon. 18, 1067–1075 (2024).

  • Scellier, B. & Bengio, Y. Equilibrium propagation: bridging the gap between energy-based models and backpropagation. Front. Comput. Neurosci. 11, 24 (2017).

  • Ackley, D. H., Hinton, G. E. & Sejnowski, T. J. A learning algorithm for Boltzmann machines. Cogn. Sci. 9, 147–169 (1985).


  • Stern, M., Hexner, D., Rocks, J. W. & Liu, A. J. Supervised learning in physical networks: from machine learning to learning machines. Phys. Rev. X 11, 021045 (2021).

  • Scellier, B., Ernoult, M., Kendall, J. & Kumar, S. Energy-based learning algorithms for analog computing: a comparative study. In Proc. 37th International Conference on Neural Information Processing Systems (NIPS ’23), 52705–52731 (ACM, 2023).

  • Kendall, J., Pantone, R., Manickavasagam, K., Bengio, Y. & Scellier, B. Training end-to-end analog neural networks with equilibrium propagation. Preprint at https://arxiv.org/abs/2006.01981 (2020).

  • Wang, Q., Wanjura, C. C. & Marquardt, F. Training coupled phase oscillators as a neuromorphic platform using equilibrium propagation. Neuromorph. Comput. Eng. 4, 034014 (2024).

  • Yi, S.-i, Kendall, J. D., Williams, R. S. & Kumar, S. Activity-difference training of deep neural networks using memristor crossbars. Nat. Electron. 6, 45–51 (2023).


  • Laydevant, J., Marković, D. & Grollier, J. Training an Ising machine with equilibrium propagation. Nat. Commun. 15, 3671 (2024).

  • Altman, L. E., Stern, M., Liu, A. J. & Durian, D. J. Experimental demonstration of coupled learning in elastic networks. Phys. Rev. Appl. 22, 024053 (2024).

  • Dillavou, S., Stern, M., Liu, A. J. & Durian, D. J. Demonstration of decentralized physics-driven learning. Phys. Rev. Appl. 18, 014040 (2022).

  • Dillavou, S. et al. Machine learning without a processor: emergent learning in a nonlinear analog network. Proc. Natl Acad. Sci. USA 121, e2319718121 (2024).

  • Stern, M., Dillavou, S., Jayaraman, D., Durian, D. J. & Liu, A. J. Training self-learning circuits for power-efficient solutions. APL Mach. Learn. 2, 016114 (2024).

  • Anisetti, V. R., Kandala, A., Scellier, B. & Schwarz, J. Frequency propagation: multimechanism learning in nonlinear physical networks. Neural Comput. 36, 596–620 (2024).

  • Murugan, A., Strupp, A., Scellier, B. & Falk, M. Contrastive learning through non-equilibrium memory. In APS March Meeting Abstracts 2023, F02.005 (APS, 2023).

  • Laborieux, A. & Zenke, F. Holomorphic equilibrium propagation computes exact gradients through finite size oscillations. In Proc. 36th International Conference on Neural Information Processing Systems (NIPS ’22), 12950–12963 (ACM, 2022).

  • Scellier, B., Mishra, S., Bengio, Y. & Ollivier, Y. Agnostic physics-driven deep learning. Preprint at https://arxiv.org/abs/2205.15021 (2022).

  • Lopez-Pastor, V. & Marquardt, F. Self-learning machines based on Hamiltonian echo backpropagation. Phys. Rev. X 13, 031020 (2023).

  • Touvron, H. et al. LLaMA: open and efficient foundation language models. Preprint at https://arxiv.org/abs/2302.13971 (2023).

  • Chowdhery, A. et al. PaLM: scaling language modeling with pathways. J. Mach. Learn. Res. 24, 1–113 (2023).


  • Achiam, J. et al. GPT-4 technical report. Preprint at https://arxiv.org/abs/2303.08774v1 (2023).

  • Gemini Team. Gemini: a family of highly capable multimodal models. Preprint at https://arxiv.org/abs/2312.11805v1 (2024).

  • Radford, A. et al. Learning transferable visual models from natural language supervision. In Proc. 38th International Conference on Machine Learning, 8748–8763 (MLR Press, 2021).

  • Liu, H., Li, C., Wu, Q. & Lee, Y. J. Visual instruction tuning. In Proc. 37th Conference on Neural Information Processing Systems (NeurIPS 2023) https://openreview.net/forum?id=w0H2xGHlkw (OpenReview, 2023).

  • Radford, A. et al. Language models are unsupervised multitask learners. OpenAI Blog 1, 9 (2019).


  • Katharopoulos, A., Vyas, A., Pappas, N. & Fleuret, F. Transformers are RNNs: fast autoregressive transformers with linear attention. In Proc. 37th International Conference on Machine Learning, 5156–5165 (MLR Press, 2020).

  • Gu, A. & Dao, T. Mamba: linear-time sequence modeling with selective state spaces. Preprint at https://arxiv.org/abs/2312.00752v1 (2023).

  • Wang, H. et al. BitNet: scaling 1-bit transformers for large language models. Preprint at https://arxiv.org/abs/2310.11453 (2023).

  • Hu, E. J. et al. LoRA: low-rank adaptation of large language models. Preprint at https://arxiv.org/abs/2106.09685 (2021).

  • Dao, T., Fu, D., Ermon, S., Rudra, A. & Ré, C. FlashAttention: fast and memory-efficient exact attention with IO-awareness. In Proc. 36th Conference on Neural Information Processing Systems (NeurIPS 2022) 35, 16344–16359 (ACM, 2022).

  • Juravsky, J. et al. Hydragen: high-throughput LLM inference with shared prefixes. Preprint at https://arxiv.org/abs/2402.05099 (2024).

  • Anderson, M. G., Ma, S.-Y., Wang, T., Wright, L. G. & McMahon, P. L. Optical transformers. Preprint at https://arxiv.org/abs/2302.10360 (2023).

  • Shen, Y. et al. Deep learning with coherent nanophotonic circuits. Nat. Photon. 11, 441–446 (2017).

  • Hamerly, R., Bernstein, L., Sludds, A., Soljačić, M. & Englund, D. Large-scale optical neural networks based on photoelectric multiplication. Phys. Rev. X 9, 021032 (2019).

  • Tait, A. N. Quantifying power in silicon photonic neural networks. Phys. Rev. Appl. 17, 054029 (2022).

  • Laydevant, J., Wright, L. G., Wang, T. & McMahon, P. L. The hardware is the software. Neuron 112, 180–183 (2024).

  • Hooker, S. The hardware lottery. Commun. ACM 64, 58–65 (2021).

  • Stroev, N. & Berloff, N. G. Analog photonics computing for information processing, inference, and optimization. Adv. Quantum Technol. 6, 2300055 (2023).

  • Cerezo, M., Verdon, G., Huang, H.-Y., Cincio, L. & Coles, P. J. Challenges and opportunities in quantum machine learning. Nat. Comput. Sci. 2, 567–576 (2022).

  • Kashif, M. & Shafique, M. Hqnet: harnessing quantum noise for effective training of quantum neural networks in NISQ era. Preprint at https://arxiv.org/abs/2402.08475v1 (2024).

  • Zhou, M.-G. et al. Quantum neural network for quantum neural computing. Research 6, 0134 (2023).

  • Tian, J. et al. Recent advances for quantum neural networks in generative learning. IEEE Trans. Pattern. Anal. Mach. Intell. 45, 12321–12340 (2023).

  • Cerezo, M. et al. Variational quantum algorithms. Nat. Rev. Phys. 3, 625–644 (2021).

  • Niazi, S. et al. Training deep Boltzmann networks with sparse Ising machines. Nat. Electron. 7, 610–619 (2024).

  • Ma, S.-Y., Wang, T., Laydevant, J., Wright, L. G. & McMahon, P. L. Quantum-limited stochastic optical neural networks operating at a few quanta per activation. Nat. Commun. 16, 359 (2025).

  • Pierangeli, D., Marcucci, G., Brunner, D. & Conti, C. Noise-enhanced spatial-photonic Ising machine. Nanophotonics 9, 4109–4116 (2020).

  • McMahon, P. L. The physics of optical computing. Nat. Rev. Phys. 5, 717–734 (2023).

  • Keeling, J. & Berloff, N. G. Exciton–polariton condensation. Contemp. Phys. 52, 131–151 (2011).

  • Berloff, N. G. et al. Realizing the classical XY Hamiltonian in polariton simulators. Nat. Mater. 16, 1120–1126 (2017).

  • Johnston, A. & Berloff, N. G. Macroscopic noise amplification by asymmetric dyads in non-Hermitian optical systems for generative diffusion models. Phys. Rev. Lett. 132, 096901 (2024).

  • Wang, T. et al. Image sensing with multilayer nonlinear optical neural networks. Nat. Photon. 17, 408–415 (2023).

  • Zhou, F. & Chai, Y. Near-sensor and in-sensor computing. Nat. Electron. 3, 664–671 (2020).

  • del Hougne, P., Imani, M. F., Diebold, A. V., Horstmeyer, R. & Smith, D. R. Learned integrated sensing pipeline: reconfigurable metasurface transceivers as trainable physical layer in an artificial neural network. Adv. Sci. 7, 1901913 (2020).

  • Vaswani, A. et al. Attention is all you need. In Proc. 31st International Conference on Neural Information Processing Systems (NIPS ’17), 6000–6010 (ACM, 2017).

  • Wu, C. et al. Harnessing optoelectronic noises in a photonic generative network. Sci. Adv. 8, eabm2956 (2022).

  • Bonnet, D. et al. Bringing uncertainty quantification to the extreme-edge with memristor-based Bayesian neural networks. Nat. Commun. 14, 7530 (2023).

  • Olin-Ammentorp, W., Beckmann, K., Schuman, C. D., Plank, J. S. & Cady, N. C. Stochasticity and robustness in spiking neural networks. Neurocomputing 419, 23–36 (2021).
