Flash chip fabrication
The CMOS circuitry was fabricated in a standard CMOS foundry using a 0.13-μm process. The received 8-inch wafer had a passivation layer approximately 800 nm thick, with pre-reserved vias at the I/O port pads (TGV1 region) and at the WL/BL/SL buffers (TGV2 region). The wafer was cut into individual dies, each measuring 5 mm × 5 mm and containing four sets of identical circuits. Polymer-mediated delamination treatments were performed on the CMOS substrate before integrating the 2D flash. The CMOS substrate was cleaned by soaking in acetone for 12 h, followed by spin-coating with photoresist (S1818) and removal of the photoresist by soaking in N-methyl-2-pyrrolidone (NMP) for 12 h.
Direct-write lithography was used to expose windows at the TGV2 region, and e-beam evaporation (EBE) was used to fill the vias with 5/500 nm Cr/Au. WLs were defined using direct-write lithography, followed by the deposition of 5/100/5 nm Cr/Au/Pt. An O2 plasma treatment (50 W, 20 s) was used to further clean and activate the surface for dielectric deposition. A 13-nm HfO2 blocking layer was deposited by thermal atomic layer deposition at 150 °C, using tetrakis(ethylmethylamino)hafnium and water as the precursors. The floating gate pattern was defined by direct-write lithography, and 3-nm Pt was deposited by EBE. The O2 plasma treatment was then repeated. Subsequently, a 7-nm HfO2 tunnelling layer was deposited using the same atomic layer deposition system. Vias through the HfO2/Pt/HfO2 memory stack were defined by direct-write lithography and etched using reactive ion etching (Ar + CHF3, 175 W, 255 s), and EBE was then used to deposit a 5/50 nm Cr/Au layer to fill the vias.
Monolayer MoS2 grown by chemical vapour deposition (purchased from Sixcarbon Technology) was transferred onto the memory stack using a gradual-release transfer process. The minimum approach speed between the MoS2 and the substrate was controlled to be as low as 500 nm per step using custom-made transfer equipment. Polystyrene was used as the supporting layer because its large Young’s modulus helps to avoid wrinkling. The polystyrene supporting layer was removed by soaking in toluene for 12 h. The MoS2 channels were patterned by direct-write lithography and etched by O2 plasma (30 W, 20 s). The sample was soaked in NMP for 12 h to remove the photoresist. To release stress and eliminate air gaps in the MoS2, annealing in an N2 atmosphere (200 °C, 3 h) was performed multiple times on both the large-area films and the patterned strips. These annealing steps can also enhance the adhesion between the MoS2 and the substrate.
BLs and SLs were defined by direct-write lithography, followed by the deposition of 5/100 nm Cr/Au using EBE. For the fabrication of the 2D flash on a SiO2/Si substrate, the process involving the vias mentioned above is not required.
To passivate the 2D flash module, a layer of S1818 photoresist was spin-coated onto the sample. The TGV1 region of the I/O module was exposed by direct-write lithography for wire bonding. The chip was packaged using a ceramic dual-in-line package (DIP 24).
Inverter chain design of the buffer module
According to logical effort theory, the total logical effort, determined by the ratio of the load capacitance (10 pF in our case, considering design margin) to the inherent input capacitance of the first-stage CMOS inverter (2 fF, set by the selected CMOS technology), should be distributed across a chosen number of inverter stages to optimize the propagation delay time. The propagation delay time of the inverter chain in the buffer can be calculated by
$${t}_{{\rm{p}}}={t}_{{\rm{p}}0}\mathop{\sum }\limits_{j=1}^{N}\left(1+\frac{{C}_{{\rm{g}},j+1}}{\gamma {C}_{{\rm{g}},j}}\right)$$
(1)
where N is the number of stages in the inverter chain, Cg,j is the gate capacitance of the jth inverter, Cg,N+1 is defined as the load capacitance (here, the parasitic capacitance of the 2D memory array), tp0 is the intrinsic delay of the inverter and γ is a process-dependent parameter, usually close to 1.
For an optimized design, the gate capacitance (and the inverter size) should be the geometric mean of the adjacent inverters, such that
$${C}_{{\rm{g}},j}=\sqrt{{C}_{{\rm{g}},j-1}{C}_{{\rm{g}},j+1}},{\rm{where}}\;j=2,\ldots ,N$$
(2)
and the optimized propagation delay time can be written as
$${t}_{{\rm{p}}}=N{t}_{{\rm{p}}0}\left(1+\sqrt[N]{\frac{{C}_{{\rm{g}},N+1}}{{C}_{{\rm{g}},1}}}/\gamma \right)$$
(3)
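The step from equation (1) to equations (2) and (3) can be sketched as follows. This is the standard textbook argument, written here with the same symbols; the per-stage ratio f is a symbol introduced only for this sketch. Each internal capacitance Cg,j (j = 2, …, N) appears in exactly two terms of the sum in equation (1), so setting the derivative of tp with respect to Cg,j to zero gives
$$\frac{\partial {t}_{{\rm{p}}}}{\partial {C}_{{\rm{g}},j}}=\frac{{t}_{{\rm{p}}0}}{\gamma }\left(\frac{1}{{C}_{{\rm{g}},j-1}}-\frac{{C}_{{\rm{g}},j+1}}{{C}_{{\rm{g}},j}^{2}}\right)=0\;\Rightarrow \;{C}_{{\rm{g}},j}^{2}={C}_{{\rm{g}},j-1}{C}_{{\rm{g}},j+1}$$
which is equation (2). All stages then share the same ratio f between successive capacitances, and the product of the N ratios equals the overall capacitance ratio:
$$f=\frac{{C}_{{\rm{g}},j+1}}{{C}_{{\rm{g}},j}},\qquad {f}^{N}=\frac{{C}_{{\rm{g}},N+1}}{{C}_{{\rm{g}},1}}\;\Rightarrow \;{t}_{{\rm{p}}}={t}_{{\rm{p}}0}\mathop{\sum }\limits_{j=1}^{N}\left(1+\frac{f}{\gamma }\right)=N{t}_{{\rm{p}}0}\left(1+\frac{f}{\gamma }\right)$$
Substituting the Nth root of the capacitance ratio for f recovers equation (3).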
Usually, Cg,1 is the minimum inverter gate capacitance for a given process (2 fF in our work), and Cg,N+1 is 10 pF. The optimized N for the inverter chain is therefore 6, with a propagation delay of about 27.3tp0, whereas N = 4 is sufficient, with a delay of around 30.7tp0, and offers benefits related to buffer size. For each inverter stage, the driver ratio is \(\sqrt[N]{\frac{{C}_{{\rm{g}},N+1}}{{C}_{{\rm{g}},1}}}\approx 8\), and the optimized driver chain is designed as shown in Fig. 3c.
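The trade-off between stage count and delay can be explored numerically. The following is a minimal Python sketch of equations (1) to (3), not the authors' design script; the function names, the default γ = 1 and the search limit `max_stages` are assumptions made for illustration. Delays are returned in units of tp0. As a caution when comparing with the figures quoted above: with γ = 1, a capacitance ratio of about 2,000 reproduces the delays of 27.3tp0 (N = 6) and 30.7tp0 (N = 4), whereas the ratio 10 pF/2 fF = 5,000 gives a per-stage ratio near 8 for N = 4. The capacitance ratio and γ are therefore left as inputs rather than fixed.

```python
def chain_delay(n_stages, c_load, c_in, gamma=1.0):
    """Equation (3): delay (in units of tp0) of an optimally sized N-stage chain."""
    stage_ratio = (c_load / c_in) ** (1.0 / n_stages)
    return n_stages * (1.0 + stage_ratio / gamma)


def stage_capacitances(n_stages, c_load, c_in):
    """Equation (2): geometric sizing. Returns Cg,1 ... Cg,N followed by the load."""
    stage_ratio = (c_load / c_in) ** (1.0 / n_stages)
    return [c_in * stage_ratio ** j for j in range(n_stages + 1)]


def delay_from_capacitances(caps, gamma=1.0):
    """Equation (1): delay (in units of tp0) for an arbitrary list of capacitances,
    where caps[-1] is the load and the other entries are the gate capacitances."""
    return sum(1.0 + caps[j + 1] / (gamma * caps[j]) for j in range(len(caps) - 1))


def best_stage_count(c_load, c_in, gamma=1.0, max_stages=12):
    """Stage count in 1..max_stages that minimizes equation (3)."""
    return min(range(1, max_stages + 1),
               key=lambda n: chain_delay(n, c_load, c_in, gamma))


if __name__ == "__main__":
    c_in = 2e-15     # first-stage input capacitance from the text (2 fF)
    c_load = 10e-12  # load capacitance from the text (10 pF, with design margin)
    for n in range(2, 9):
        ratio = (c_load / c_in) ** (1.0 / n)
        print(f"N = {n}: stage ratio = {ratio:.2f}, "
              f"delay = {chain_delay(n, c_load, c_in):.2f} tp0")
```

Passing the output of `stage_capacitances` to `delay_from_capacitances` gives the same value as `chain_delay`, which is a quick check that equation (3) is equation (1) evaluated at the sizing of equation (2).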
Material characterization
The TEM-ready samples were prepared using the in situ FIB lift-out technique on an FEI Strata G4 HX dual-beam FIB scanning electron microscope. The samples were capped with sputtered electron-beam Pt and ion-beam Pt before milling. STEM and TEM images were captured with the Thermo Scientific Tecnai Z aberration-corrected transmission electron microscope at an accelerating voltage of 200 kV. Energy-dispersive spectra were obtained in STEM mode using a Super X FEI system. The AFM images of the devices were measured by an MFP-3D Origin+ (Asylum Research, Oxford Instruments) system. Optical images were captured by an optical microscope (OLYMPUS BX53M) and an extended-DOF microscope (KEYENCE VHX-6000).
Electrical measurements
The electrical characterization of the standalone 2D flash devices and the 4 × 32 array was carried out at room temperature and under atmospheric conditions (except the retention test) in a probe station (Cascade Summit 11000 type). The retention test was conducted in a customized vacuum probe station. The voltage pulses were generated using a semiconductor parameter analyser (B1500, Keysight). The waveform was captured using an oscilloscope (DPO 5204, Tektronix).
The electrical characterization of the 2D flash chip was performed with a dedicated chip test system. An arbitrary waveform generator (33120A, Agilent) provides the clock signals, which are monitored by an oscilloscope (DSOX1204A, Keysight). A d.c. power supply (E36312A, Keysight) provides the d.c. signals required for testing the chip, including −1 V, −5 V, 2 V, 3 V, 5 V and 9 V. The host computer provides a software interface and loads the test program onto the FPGA. The FPGA transmits commands from the host computer to the I/O ports of the 2D flash chip. The packaged 2D flash chip was placed into a test socket compatible with the DIP package before testing.