A mixed-precision memristor and SRAM compute-in-memory AI processor

  • Prabhu, K. et al. CHIMERA: a 0.92-TOPS, 2.2-TOPS/W edge AI accelerator with 2-Mbyte on-chip foundry resistive RAM for efficient training and inference. IEEE J. Solid-State Circuits 57, 1013–1026 (2022).

  • Jain, V. et al. TinyVers: A 0.8-17 TOPS/W, 1.7 μW-20 mW, tiny versatile system-on-chip with state-retentive eMRAM for machine learning inference at the extreme edge. In Proc. 2022 IEEE Symposium on VLSI Technology and Circuits 20–21 (IEEE, 2022).

  • Rossi, D. et al. 4.4 A 1.3TOPS/W @ 32GOPS fully integrated 10-core SoC for IoT end-nodes with 1.7μW cognitive wake-up from MRAM-based state-retentive sleep mode. In Proc. 2021 IEEE International Solid-State Circuits Conference (ISSCC) 60–62 (IEEE, 2021).

  • Ueyoshi, K. et al. DIANA: an end-to-end energy-efficient digital and analog hybrid neural network SoC. In Proc. 2022 IEEE International Solid-State Circuits Conference (ISSCC) 1–3 (IEEE, 2022).

  • Yue, J. et al. A 28nm 16.9-300TOPS/W computing-in-memory processor supporting floating-point NN inference/training with intensive-CIM sparse-digital architecture. In Proc. 2023 IEEE International Solid-State Circuits Conference (ISSCC) 1–3 (IEEE, 2023).

  • Yue, J. et al. 15.2 A 2.75-to-75.9TOPS/W computing-in-memory NN processor supporting set-associate block-wise zero skipping and ping-pong CIM with simultaneous computation and weight updating. In Proc. 2021 IEEE International Solid-State Circuits Conference (ISSCC) 238–240 (IEEE, 2021).

  • Liu, S. et al. 16.2 A 28nm 53.8TOPS/W 8b sparse transformer accelerator with in-memory butterfly zero skipper for unstructured-pruned NN and CIM-based local-attention-reusable engine. In Proc. 2023 IEEE International Solid-State Circuits Conference (ISSCC) 250–252 (IEEE, 2023).

  • Tu, F. et al. 16.1 MulTCIM: a 28nm 2.24μJ/token attention-token-bit hybrid sparse digital CIM-based accelerator for multimodal transformers. In Proc. 2023 IEEE International Solid-State Circuits Conference (ISSCC) 248–250 (IEEE, 2023).

  • Huang, W.-H. et al. A nonvolatile AI-edge processor with 4MB SLC-MLC hybrid-mode ReRAM compute-in-memory macro and 51.4-251 TOPS/W. In Proc. 2023 IEEE International Solid-State Circuits Conference (ISSCC) 15–17 (IEEE, 2023).

  • Wen, T.-H. et al. A 28nm nonvolatile AI edge processor using 4Mb analog-based near-memory-compute ReRAM with 27.2 TOPS/W for tiny AI edge devices. In Proc. 2023 IEEE Symposium on VLSI Technology and Circuits 1–2 (IEEE, 2023).

  • Wen, T.-H. et al. Fusion of memristor and digital compute-in-memory processing for energy-efficient edge computing. Science 384, 325–332 (2024).

  • Lele, A. S. et al. A heterogeneous RRAM in-memory and SRAM near-memory SoC for fused frame and event-based target identification and tracking. IEEE J. Solid-State Circuits 59, 52–64 (2024).

  • Ambrogio, S. et al. Equivalent-accuracy accelerated neural-network training using analogue memory. Nature 558, 60–67 (2018).

  • Khwa, W.-S. et al. A 40-nm, 2M-cell, 8b-precision, hybrid SLC-MLC PCM computing-in-memory macro with 20.5-65.0TOPS/W for tiny-AI edge devices. In Proc. 2022 IEEE International Solid-State Circuits Conference (ISSCC) 1–3 (IEEE, 2022).

  • Yao, P. et al. Fully hardware-implemented memristor convolutional neural network. Nature 577, 641–646 (2020).

  • Wu, P.-C. et al. A 22nm 832Kb hybrid-domain floating-point SRAM in-memory-compute macro with 16.2-70.2TFLOPS/W for high-accuracy AI-edge devices. In Proc. 2023 IEEE International Solid-State Circuits Conference (ISSCC) 126–128 (IEEE, 2023).

  • Guo, A. et al. A 28-nm 64-kb 31.6-TFLOPS/W digital-domain floating-point-computing-unit and double-bit 6T-SRAM computing-in-memory macro for floating-point CNNs. IEEE J. Solid-State Circuits 59, 3032–3044 (2024).

  • Sinangil, M. E. et al. A 7-nm compute-in-memory SRAM macro supporting multi-bit input, weight and output and achieving 351 TOPS/W and 372.4 GOPS. IEEE J. Solid-State Circuits 56, 188–198 (2021).

  • Mori, H. et al. A 4nm 6163-TOPS/W/b 4790−TOPS/mm2/b SRAM based digital-computing-in-memory macro supporting bit-width flexibility and simultaneous MAC and weight update. In Proc. 2023 IEEE International Solid-State Circuits Conference (ISSCC) 132–134 (IEEE, 2023).

  • Chih, Y.-D. et al. 16.4 An 89TOPS/W and 16.3TOPS/mm2 all-digital SRAM-based full-precision compute-in memory macro in 22nm for machine-learning edge applications. In Proc. 2021 IEEE International Solid-State Circuits Conference (ISSCC) 252–254 (IEEE, 2021).

  • Fujiwara, H. et al. A 5-nm 254-TOPS/W 221-TOPS/mm2 fully-digital computing-in-memory macro supporting wide-range dynamic-voltage-frequency scaling and simultaneous MAC and write operations. In Proc. 2022 IEEE International Solid-State Circuits Conference (ISSCC) 1–3 (IEEE, 2022).

  • Lee, C.-F. et al. A 12nm 121-TOPS/W 41.6-TOPS/mm2 all digital full precision SRAM-based compute-in-memory with configurable bit-width for AI edge applications. In Proc. 2022 IEEE Symposium on VLSI Technology and Circuits 24–25 (IEEE, 2022).

  • Chiu, Y.-C. et al. A CMOS-integrated spintronic compute-in-memory macro for secure AI edge devices. Nat. Electron. 6, 534–543 (2023).

  • Jung, S. et al. A crossbar array of magnetoresistive memory devices for in-memory computing. Nature 601, 211–216 (2022).

  • Wen, T.-H. et al. 34.8 A 22nm 16Mb floating-point ReRAM compute-in-memory macro with 31.2TFLOPS/W for AI edge devices. In Proc. 2024 IEEE International Solid-State Circuits Conference (ISSCC) 580–582 (IEEE, 2024).

  • Chang, M. et al. A 40nm 60.64TOPS/W ECC-capable compute-in-memory/digital 2.25MB/768KB RRAM/SRAM system with embedded cortex M3 microprocessor for edge recommendation systems. In Proc. 2022 IEEE International Solid-State Circuits Conference (ISSCC) 1–3 (IEEE, 2022).

  • Wan, W. et al. A compute-in-memory chip based on resistive random-access memory. Nature 608, 504–512 (2022).

  • Hung, J. M. et al. A four-megabit compute-in-memory macro with eight-bit precision based on CMOS and resistive random-access memory for AI edge devices. Nat. Electron. 4, 921–930 (2021).

  • Hung, J.-M. et al. 8-b precision 8-Mb ReRAM compute-in-memory macro using direct-current-free time-domain readout scheme for AI edge devices. IEEE J. Solid-State Circuits 58, 303–315 (2022).

  • Chiu, Y.-C. et al. A 22nm 4Mb STT-MRAM data-encrypted near-memory computation macro with a 192GB/s read-and-decryption bandwidth and 25.1-55.1TOPS/W 8b MAC for AI operations. In Proc. 2022 IEEE International Solid-State Circuits Conference (ISSCC) 178–180 (IEEE, 2022).

  • Spetalnick, S. D. et al. A 40nm 64kb 26.56TOPS/W 2.37Mb/mm2 RRAM binary/compute-in-memory macro with 4.23x improvement in density and >75% use of sensing dynamic range. In Proc. 2022 IEEE International Solid-State Circuits Conference (ISSCC) 1–3 (IEEE, 2022).

  • Yoon, J.-H. et al. A 40nm 100Kb 118.44TOPS/W ternary-weight compute-in-memory RRAM macro with voltage-sensing read and write verification for reliable multi-bit RRAM operation. In Proc. 2021 IEEE Custom Integrated Circuits Conference (CICC) 1–2 (IEEE, 2021).

  • Khwa, W.-S. et al. MLC PCM techniques to improve neural network inference retention time by 105X and reduce accuracy degradation by 10.8X. In Proc. 2021 Symposium on VLSI Technology 1–2 (IEEE, 2021).

  • Xue, C.-X. et al. 15.4 A 22nm 2Mb ReRAM compute-in-memory macro with 121-28TOPS/W for multibit MAC computing for tiny AI edge devices. In Proc. 2020 IEEE International Solid-State Circuits Conference (ISSCC) 244–246 (IEEE, 2020).

  • Xue, C.-X. et al. 24.1 A 1Mb multibit ReRAM computing-in-memory macro with 14.6ns parallel MAC computing time for CNN based AI edge processors. In Proc. 2019 IEEE International Solid-State Circuits Conference (ISSCC) 388–390 (IEEE, 2019).

  • Xue, C.-X. et al. A CMOS-integrated compute-in-memory macro based on resistive random-access memory for AI edge devices. Nat. Electron. 4, 81–90 (2021).

  • Chen, W.-H. et al. A 65nm 1Mb nonvolatile computing-in-memory ReRAM macro with sub-16ns multiply-and-accumulate for binary DNN AI edge processors. In Proc. 2018 IEEE International Solid-State Circuits Conference (ISSCC) 494–496 (IEEE, 2018).

  • Chen, W.-H. et al. CMOS-integrated memristive non-volatile computing-in-memory for AI edge processors. Nat. Electron. 2, 420–428 (2019).

  • Mochida, R. et al. A 4M synapses integrated analog ReRAM based 66.5 TOPS/W neural-network processor with cell current controlled writing and flexible network architecture. In Proc. 2018 IEEE Symposium on VLSI Technology 175–176 (IEEE, 2018).

  • Wan, W. et al. 33.1 A 74 TMACS/W CMOS-RRAM neurosynaptic core with dynamically reconfigurable dataflow and in-situ transposable weights for probabilistic graphical models. In Proc. 2020 IEEE International Solid-State Circuits Conference (ISSCC) 498–500 (IEEE, 2020).

  • Liu, Q. et al. 33.2 A fully integrated analog ReRAM based 78.4TOPS/W compute-in-memory chip with fully parallel MAC computing. In Proc. 2020 IEEE International Solid-State Circuits Conference (ISSCC) 500–502 (IEEE, 2020).

  • Cai, F. et al. A fully integrated reprogrammable memristor–CMOS system for efficient multiply–accumulate operations. Nat. Electron. 2, 290–299 (2019).

  • Li, C. et al. Analogue signal and image processing with large memristor crossbars. Nat. Electron. 1, 52–59 (2018).

  • Ielmini, D. & Wong, H. S. P. In-memory computing with resistive switching devices. Nat. Electron. 1, 333–343 (2018).

  • Chou, C.-C. et al. A 22nm 96KX144 RRAM macro with a self-tracking reference and a low ripple charge pump to achieve a configurable read window and a wide operating voltage range. In Proc. 2020 IEEE Symposium on VLSI Circuits 1–2 (IEEE, 2020).

  • Boybat, I. et al. Neuromorphic computing with multi-memristive synapses. Nat. Commun. 9, 2514 (2018).

  • Le Gallo, M. et al. Mixed-precision in-memory computing. Nat. Electron. 1, 246–253 (2018).

  • Qu, Z., Zhou, Z., Cheng, Y. & Thiele, L. Adaptive loss-aware quantization for multi-bit networks. In Proc. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 7985–7994 (Computer Vision Foundation, 2020).

  • Mishra, A. & Marr, D. Apprentice: using knowledge distillation techniques to improve low-precision network accuracy. In Proc. Sixth International Conference on Learning Representations (ICLR, 2018).

  • Chu, T. et al. Mixed-precision quantized neural networks with progressively decreasing bitwidth. Pattern Recognit. 111, 107647 (2021).

  • LeCun, Y. et al. Gradient-based learning applied to document recognition. Proc. IEEE 86, 2278–2324 (1998).

  • Krizhevsky, A. Learning Multiple Layers of Features from Tiny Images. Master’s thesis, Univ. Toronto (2009).

  • Deng, J. et al. ImageNet: a large-scale hierarchical image database. In Proc. 2009 IEEE Conference on Computer Vision and Pattern Recognition 248–255 (IEEE, 2009).

  • Houshmand, P. et al. Opportunities and limitations of emerging analog in-memory compute DNN architectures. In Proc. 2020 IEEE International Electron Devices Meeting (IEDM) 29.1.1–29.1.4 (IEEE, 2020).

  • Shamsoshoara, A. et al. The FLAME dataset: Aerial Imagery Pile burn detection using drones (UAVs). IEEE DataPort (2020).

  • Warden, P. Speech commands: a dataset for limited-vocabulary speech recognition. Preprint at https://arxiv.org/abs/1804.03209 (2018).

  • Chowdhery, A. et al. Visual wake words dataset. Preprint at https://arxiv.org/abs/1906.05721 (2019).

  • Nagel, M. et al. A white paper on neural network quantization. Preprint at https://arxiv.org/abs/2106.08295 (2021).

  • Jacob, B. et al. Quantization and training of neural networks for efficient integer-arithmetic-only inference. In Proc. Conference on Computer Vision and Pattern Recognition (CVPR) 2704–2713 (Computer Vision Foundation, 2018).

  • Sun, S., Bai, J., Shi, Z., Zhao, W. & Kang, W. CIM2PQ: an arraywise and hardware-friendly mixed precision quantization method for analog computing-in-memory. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 43, 2084–2097 (2024).

  • Chen, Y.-W. et al. SUN: dynamic hybrid-precision SRAM-based CIM accelerator with high macro utilization using structured pruning mixed-precision networks. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 43, 2163–2176 (2024).

  • Agrawal, A. et al. A 7nm 4-core AI chip with 25.6TFLOPS hybrid FP8 training, 102.4TOPS INT4 inference and workload-aware throttling. In Proc. 2021 IEEE International Solid-State Circuits Conference (ISSCC) 144–145 (IEEE, 2021).

  • Wang, J. et al. A compute SRAM with bit-serial integer/floating-point operations for programmable in-memory vector acceleration. In Proc. 2019 IEEE International Solid-State Circuits Conference (ISSCC) 224–225 (IEEE, 2019).
