DS2 - Handout.pdf


ISSCC 2024 DEMONSTRATION SESSION

DS1: Monday, February 19, 2024, 5:00-7:00 PM
DS2: Tuesday, February 20, 2024, 5:00-7:00 PM

ISSCC 2024 DEMONSTRATION SESSION WELCOME

What is an ISSCC Demonstration Session?

Demonstration sessions are designed to augment the experience of all attendees by providing an opportunity to interact directly with the authors of selected papers and to view some of their concrete results. At their demonstration, the authors will illustrate their research results face-to-face, providing attendees with a more hands-on experience. Overall, these sessions will:

  Demonstrate chip operation.
  Provide opportunity for in-depth discussion with the chip creators.

Demonstration Session 1 (DS1) will be held on Monday, February 19, from 5:00 to 7:00 pm PST, and Demonstration Session 2 (DS2) will be held on Tuesday, February 20, from 5:00 to 7:00 pm PST.

Acknowledgements:

In the preparation of these demonstration sessions, I wish to first acknowledge the authors of the participating papers. Their work has been organized and structured under the Chairmanship of Patrick Mercier (University of California, San Diego), and the Demonstration Session Committee, consisting of:

Analog Subcommittee: Minkyu Je, KAIST, Daejeon, Korea; Shon-Hang Wen, MediaTek, Hsinchu, Taiwan
Data Converters Subcommittee: Ying-Zu Lin, MediaTek, Hsinchu, Taiwan; Shiyu Su, University of Waterloo, Los Angeles, CA
Digital Architectures & Systems Subcommittee: Ji-Hoon Kim, Ewha Womans University, Seoul, Korea; Mark Anders, Intel, Hillsboro, OR
Digital Circuits Subcommittee: Eric Fang, MediaTek, Hsinchu, Taiwan; Akihide Sai, Toshiba, Kawasaki, Japan
IMMD Subcommittee: Taekwang Jang, ETH Zurich, Zurich, Switzerland; Sanshiro Shishido, Panasonic, Osaka, Japan
Memory Subcommittee: Seung-Jae Lee, Samsung, Hwaseong, Korea; Juang-Ying Chueh, Etron, Taipei, Taiwan
Power Management Subcommittee: Xugang Ke, Zhejiang University, Hangzhou, China; Gael Pillonnet, CEA-Leti, Grenoble, France
RF Subcommittee: Yves Baeyens, Nokia-Bell Labs, Murray Hill, NJ; Jeff Walling, Virginia Tech, Blacksburg, VA
Security Subcommittee: Yong-Ki Lee, Samsung, Suwon, Korea
Technology Directions Subcommittee: Guy Torfs, Ghent University, Gent, Belgium; Denis Daly, Apple, Wellesley, MA
Wireless Subcommittee: Negar Reiskarimian, Massachusetts Institute of Technology, Cambridge, MA; Yun Yin, Fudan University, Shanghai, China
Wireline Subcommittee: Tamer Ali, MediaTek, Irvine, CA; Ben Rhew, Samsung, Hwaseong, Korea

Further, I wish to recognize Brad Phillips (MiraSMART Conferencing) and Steve Bonney (S3 iPublishing) for the structuring and formatting of this handout, and the tablet version (available for download from ISSCC 2024). Finally, I would like to acknowledge the vision and encouragement of the past ISSCC Conference Chair, Anantha Chandrakasan (MIT), for his leadership in the realization of the demonstration-session idea.

Enjoy!
Laura Chizuko Fujino
ISSCC Director of Publications & Presentations
February 2024

ISSCC 2024 DEMONSTRATION PAPERS: DS2

17.1 Omnidirectional Magnetoelectric Power Transfer for Miniaturized Biomedical Implants via Active Echo
  Wei Wang, Zhanghao Yu, Yiwei Zou, Joshua Woods, Prahalad Chari, Jacob T. Robinson, Kaiyuan Yang
  Rice University, Houston, TX

17.3 A Fully Wireless, Miniaturized, Multicolor Fluorescence Image Sensor Implant for Real-Time Monitoring in Cancer Therapy
  Rozhan Rabbani*1, Micah Roschelle*1, Surin Gweon1, Rohan Kumar1, Alec Vercruysse1, Nam Woo Cho2, Matthew H. Spitzer2, Ali M. Niknejad1, Vladimir M. Stojanovic1, Mekhail Anwar1,2
  1 University of California, Berkeley, CA; 2 University of California, San Francisco, CA; *Equally Credited Authors (ECAs)

17.4 Environmentally-Friendly Disposable Circuit and Battery System for Reducing Impact of E-Wastes
  Naoki Miura*1, Hiroaki Taguchi*1, Kazuyoshi Watanabe2, Masaya Nohara1, Tatsuyuki Makita3, Masahiro Tanabe3, Takahiro Wakimoto3, Shohei Kumagai2, Hideyuki Nosaka1, Atsushi Aratake1, Toshihiro Okamoto2, Shun Watanabe2, Jun Takeya2, Takeshi Komatsu1
  1 Nippon Telegraph and Telephone, Atsugi, Japan; 2 University of Tokyo, Kashiwa, Japan; 3 PI-CRYSTAL Incorporation, Kashiwa, Japan; *Equally Credited Authors (ECAs)

17.7 Droplet Microfluidics Co-Designed with Real-Time CMOS Luminescence Sensing and Impedance Spectroscopy of 4nL Droplets at a 67mm/s Velocity
  Qijun Liu, Diana Arguijo Mendoza, Alperen Yasar, Dilara Caygara, Aya Kassem, Douglas Densmore, Rabia Tugce Yazicigil
  Boston University, Boston, MA

20.3 A 23.9TOPS/W @ 0.8V, 130TOPS AI Accelerator with 16× Performance-Accelerable Pruning in 14nm Heterogeneous Embedded MPU for Real-Time Robot Applications
  Koichi Nose, Taro Fujii, Katsumi Togawa, Shunsuke Okumura, Kentaro Mikami, Daichi Hayashi, Teruhito Tanaka, Takao Toi
  Renesas Electronics, Tokyo, Japan

20.5 C-Transformer: A 2.6-18.1μJ/Token Homogeneous DNN-Transformer/Spiking-Transformer Processor with Big-Little Network and Implicit Weight Generation for Large Language Models
  Sangyeob Kim, Sangjin Kim, Wooyoung Jo, Soyeon Kim, Seongyon Hong, Hoi-Jun Yoo
  Korea Advanced Institute of Science and Technology, Daejeon, Korea

20.6 LSPU: A Fully Integrated Real-Time LiDAR-SLAM SoC with Point-Neural-Network Segmentation and Multi-Level kNN Acceleration
  Jueun Jung1, Seungbin Kim1, Bokyoung Seo1, Wuyoung Jang1, Sangho Lee1, Jeongmin Shin1, Donghyeon Han2, Kyuho Jason Lee1
  1 Ulsan National Institute of Science and Technology, Ulsan, Korea; 2 Massachusetts Institute of Technology, Cambridge, MA

20.7 NeuGPU: A 18.5mJ/Iter Neural-Graphics Processing Unit for Instant-Modeling and Real-Time Rendering with Segmented-Hashing Architecture
  Junha Ryu1, Hankyul Kwon1, Wonhoon Park1, Zhiyong Li1, Beomseok Kwon1, Donghyeon Han2, Dongseok Im1, Sangyeob Kim1, Hyungnam Joo1, Hoi-Jun Yoo1
  1 Korea Advanced Institute of Science and Technology, Daejeon, Korea; 2 Massachusetts Institute of Technology, Cambridge, MA

20.8 Space-Mate: A 303.5mW Real-Time Sparse Mixture-of-Experts-Based NeRF-SLAM Processor for Mobile Spatial Computing
  Gwangtae Park1, Seokchan Song1, Haoyang Sang1, Dongseok Im1, Donghyeon Han2, Sangyeob Kim1, Hongseok Lee1, Hoi-Jun Yoo1
  1 Korea Advanced Institute of Science and Technology, Daejeon, Korea; 2 Massachusetts Institute of Technology, Cambridge, MA

23.2 A 1mm² Software-Defined Dual-Mode Bluetooth Transceiver with 10dBm Maximum TX Power and -98.2dBm Sensitivity 2.96mW RX Power at 1Mb/s
  Nicola Scolari, Franz X. Pengg, Konstantinos Manetakis, Camilo A. Salazar, Alexandre Vouilloz, Ernesto Pérez Serna, Anjana Dissanayake, Pascal Persechini, Vladimir Kopta, Erwan Le Roux, Francesco Chicco, Stefano Cillo, Nicola Gerber, Cédric Barbelenet, Fabio Epifano, Paulo A. Dal Fabbro, Nicolas Raemy
  CSEM, Neuchâtel, Switzerland

23.5 A 7.6mW IR-UWB Receiver Achieving -13dBm Blocker Resilience with a Linear RF Front-End
  Anoop Narayan Bhat, Paul Mateman, Zule Xu, Peter Vis, Paul Detterer, Gururaja Kasanadi Ramachandra, Yunus Baykal, Mario Konijnenburg, Yao-Hong Liu, Christian Bachmann, Peng Zhang
  imec, Eindhoven, The Netherlands

24.1 A 90-to-180GHz APD-Integrated Transmitter Achieving 18dBm Psat in 28nm CMOS
  Dawei Tang1, Xiaoyue Xia1, Zheng Yan1, Peigen Zhou1, Zekun Li1, Chun Yang1, Rui Zhang1, Zhe Chen1, Jixin Chen1, Hao Gao1,2, Wei Hong1
  1 Southeast University, Nanjing, China; 2 Eindhoven University of Technology, Eindhoven, The Netherlands

26.5 A 977μW Capacitive Touch Sensor with Noise-Immune Excitation Source and Direct Lock-In ADC Achieving 25.2pJ/step Energy Efficiency
  Xiangdong Feng1, Zhiyu Wang1, Yekan Chen1, Tianyi Cai1, Yangfan Xuan1, Changgui Yang1, Weixiao Wang1, Yunshan Zhang2, Zhong Tang3, Yuxuan Luo1, Bo Zhao1
  1 Zhejiang University, Hangzhou, China; 2 Microaiot, Hangzhou, China; 3 Vango Technologies, Hangzhou, China

28.6 An 87% Efficient 2V-Input, 200A Voltage Regulator Chiplet Enabling Vertical Power Delivery in Multi-kW Systems-on-Package
  Rinkle Jain1, Shunjiang Xu2, Rajiv Kaushal1, Carlos Mariscal1, Humberto Caballero3, Tamir Salus4, Christopher Schaef1, Anup Deka5, Aruna Payala5, Keng Chen6, Huong Do7, Jonathan Douglas7
  1 Intel, Hillsboro, OR; 2 Intel, Santa Clara, CA; 3 Intel, Guadalajara, Mexico; 4 Intel, Haifa, Israel; 5 Intel, Bangalore, India; 6 Intel, Hudson, MA; 7 Intel, Chandler, AZ

29.5 A Portable 14GHz Dual-Mode Pulse and Continuous-Wave Electron Paramagnetic Resonance Spectrometer Using a Subharmonic Direct Conversion Receiver
  Jui-Hung Sun, Michella Rustom, Thanh Dat Nguyen, Jaideep Singh, Peter Qin, Constantine Sideris
  University of Southern California, Los Angeles, CA

30.1 A 40nm VLIW Edge Accelerator with 5MB of 0.256pJ/b RRAM and a Localization Solver for Bristle Robot Surveillance
  Samuel D. Spetalnick*1, Ashwin Sanjay Lele*1, Brian Crafton1, Muya Chang1, Sigang Ryu1, Jong-Hyeok Yoon2, Zhijian Hao1, Azadeh Ansari1, Win-San Khwa3, Yu-Der Chih4, Meng-Fan Chang3, Arijit Raychowdhury1
  1 Georgia Institute of Technology, Atlanta, GA; 2 Daegu Gyeongbuk Institute of Science and Technology, Daegu, Korea; 3 TSMC Corporate Research, Hsinchu, Taiwan; 4 TSMC Design Technology, Hsinchu, Taiwan; *Equally Credited Authors (ECAs)

30.6 Vecim: A 289.13GOPS/W RISC-V Vector Co-Processor with Compute-in-Memory Vector Register File for Efficient High-Performance Computing
  Yipeng Wang, Mengtian Yang, Chieh-pu Lo, Jaydeep P. Kulkarni
  University of Texas, Austin, TX

31.5 A 750mW, 37% Peak Efficiency Isolated DC-DC Converter with 54/18Mb/s Full-Duplex Communication Using a Single Pair of Transformers
  Tingxu Hu1, Mo Huang1, Rui Paulo Martins1,2, Yan Lu1
  1 University of Macau, Macau, China; 2 University of Lisboa, Lisbon, Portugal

32.1 A 47GHz 4-way Doherty PA with 23.7dBm P1dB and 21.7%/13.1% PAE at 6/12dB Back-off Supporting 2000MHz 5G NR 64-QAM OFDM
  Xiaohan Zhang*, Hao Guo*, Taiyun Chi
  Rice University, Houston, TX; *Equally Credited Authors (ECAs)

32.5 E-band (71-to-86GHz) GaN Power Amplifier with 4.37W Output Power and 18.5% PAE for 5G Backhaul
  Bharath Cimbili1,2,3, Mingquan Bao2, Christian Friesicke1, Rüdiger Quay1,3
  1 Fraunhofer IAF, Freiburg, Germany; 2 Ericsson, Gothenburg, Sweden; 3 University of Freiburg, Freiburg, Germany

33.1 A High-Accuracy and Energy-Efficient Zero-Shot-Retraining Seizure-Detection Processor with Hybrid-Feature-Driven Adaptive Processing and Learning-Based Adaptive Channel Selection
  Jiahao Liu1, Xiao Liu1, Xu Wang1, Ziyi Xie1, Zirui Zhong1, Jiajing Fan1, Hui Qiu1, Yiming Xu1, Huajing Qin1, Yu Long1, Yuhong Zhou2, Zixuan Shen3, Liang Zhou1, Liang Chang1, Shanshan Liu1, Shuisheng Lin1, Chao Wang3, Jun Zhou1
  1 University of Electronic Science and Technology of China, Chengdu, China; 2 West China Hospital of Sichuan University, Chengdu, China; 3 Huazhong University of Science and Technology, Wuhan, China

33.2 A Sub-1μJ/class Headset-Integrated Mind Imagery and Control SoC for VR/MR Applications with Teacher-Student CNN and General-Purpose Instruction Set Architecture
  Zhiwei Zhong*, Yijie Wei*, Lance Christopher Go, Jie Gu
  Northwestern University, Evanston, IL; *Equally Credited Authors (ECAs)

33.7 An Adhesive Interposer-Based Reconfigurable Multi-Sensor Patch Interface with On-Chip Application Tunable Time-Domain Feature Extraction
  Jeonghoon Cho*, You Jang Pyeon*, Junyeong Yeom*, Hyunjoong Kim*, Sanghyeon Cho, Yonggi Kim, Taejung Kim, Jong-Hyun Kwak, Geonjun Choi, Yoonsik Lee, Heungjoo Shin, Hoon Eui Jeong, JaeJoon Kim
  Ulsan National Institute of Science and Technology, Ulsan, Korea; *Equally Credited Authors (ECAs)

33.9 A Miniature Neural Interface Implant with a 95% Charging Efficiency Optical Stimulator and an 81.9dB SNDR ΔΣM-Based Recording Frontend
  Linran Zhao1, Wei Shi2, Yan Gong3, Xiang Liu3, Wen Li3, Yaoyao Jia1
  1 University of Texas, Austin, TX; 2 Meta, Santa Clara, CA; 3 Michigan State University, East Lansing, MI

33.10 A 2.7ps-ToF-Resolution and 12.5mW Frequency-domain NIRS Readout IC with Dynamic Light Sensing Frontend and Cross-Coupling-Free Inter-Stabilized Data Converter
  Zhouchen Ma1, Yuxiang Lin1, Cheng Chen1, Xiangao Qi1, Yongfu Li1, Kea-Tiong Tang2, Fa Wang3, Tianhong Zhang4, Guoxing Wang1, Jian Zhao1
  1 Shanghai Jiao Tong University, Shanghai, China; 2 National Tsing Hua University, Hsinchu, Taiwan; 3 Shanghai United Imaging Microelectronics Technology, Shanghai, China; 4 Shanghai Mental Health Center, Shanghai, China

34.6 A 28nm 72.12TFLOPS/W Hybrid-Domain Outer-Product Based Floating-Point SRAM Computing-in-Memory Macro with Logarithm Bit-Width Residual ADC
  Yiyang Yuan1,2, Yiming Yang3, Xinghua Wang3, Xiaoran Li3, Cailian Ma1,2, Qirui Chen3, Meini Tang3, Xi Wei3,
  Zhixian Hou3, Jialiang Zhu1,2, Hao Wu1,2, Qirui Ren1,2, Guozhong Xing1, Pui-In Mak4, Feng Zhang1
  1 Institute of Microelectronics of the Chinese Academy of Sciences, Beijing, China; 2 University of Chinese Academy of Sciences, Beijing, China; 3 Beijing Institute of Technology, Beijing, China; 4 University of Macau, Macau, China

Paper No.: 17.1
Omnidirectional Magnetoelectric Power Transfer for Miniaturized Biomedical Implants via Active Echo
Wei Wang, Zhanghao Yu, Yiwei Zou, Joshua Woods, Prahalad Chari, Jacob T. Robinson, Kaiyuan Yang
Rice University, Houston, TX
Poster sections: Motivation; Key Idea and System Architecture; ASIC Implementation and Setup; Measurement Results and Comparison

[Figure: system architecture. TX side: ME drivers 1 to 3 (CH1 to CH3) with gate drivers and a TX coil array, an RX chip with LNA, peak detector, single-slope ADC, and polarity detector per channel, and a LUT-based controller. Implant chip: ME film, rectifier, LDOs, V-ref, DC-DC, power management, Cstore, clock and data recovery, controller, PUF, FSM, stimulator, and a pulse TX driving the active-echo coil. Labels also cover received power regulation (VRX, Vrect) and power and downlink data transmission.]

Optimization goal: $I_1 : I_2 : I_3 = A_{RX1}\cos\theta_1 : A_{RX2}\cos\theta_2 : A_{RX3}\cos\theta_3 = k_{R1} : k_{R2} : k_{R3}$

[Figure: measurement setup. Motion and rotation system, power supply, signal generator for the rotation motor, ME driver controller, LDO array, RX board, NI DAQ.]

[Figure: measured waveforms (VAC1, STIM1, VRX1, VRX2, downlink). C: charging phase; TS: task sequence; 10 ms and 20 ms intervals. Stimulation: amplitude 3 V, pulse width 0.4 ms. A VAC notch defines the active-echo transmission window; 1.35 MHz; initial 16 cycles. Example with different phases: $\theta_1 = 0$, $\theta_2 = \pi$.]

Comparison table (columns: This Work | Z. Yu, RFIC22 [2] | Z. Yu, JSSC22 [3] | J. Lee, Nat. Elec. 21 [10] | J. Tang, ISSCC21 [1] | D. Piech, Nat. BME20 [9])
  Technology (nm): 180; 65
  Bio-application: Stimulation | Stimulation | Stimulation | Recording | N/A | Stimulation
  WPT mechanism: ME (340kHz) | ME (340kHz) | ME (330kHz) | Inductive (900MHz) | Inductive (6.78MHz) | Ultrasound (1.85MHz)
  TX type: Planar coil array | Single coil | Single coil | TX coil + relay coil | Single coil | Single transducer
  Power transducer size (mm): 5×2×0.18 | 4×2×0.1 | 4×2×0.1 | 0.5×0.5 (on-chip) | 25×25 (PCB)* | 1×0.8×0.8
  Global regulation: Yes, active echo | Yes, LSK backscatter | No | No | Yes, LSK backscatter | No
  Omnidirectional: Yes | No | No (50 rotation) | No | No | No

[Figure: in-vitro test with porcine tissue.]

Ref: Z. Yu, RFIC22; J. Tang, ISSCC21; H. Qiu, JSSC22; J. Chen, Nat. Biomed. Eng22

[Figure: magnetoelectric WPT principle (TX coil; ME film made of PZT and Metglas; applied magnetic field; DC biasing field) next to multi-coil WPT with misaligned fields (H1, H2, Hsum). Multi-coil WPT is marked as omnidirectional and power efficient, but it needs a k sensor, which is difficult in the ME case.]
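The optimization goal on the 17.1 poster sets the three TX coil currents in proportion to the echo amplitude received on each channel, weighted by the cosine of a per-channel angle. A minimal sketch of that ratio calculation follows. It assumes the symbol stripped from the extracted equation is the echo phase of each channel (so that a phase of pi flips the drive polarity); the function name and the sample readings are hypothetical and are not taken from the paper.

```python
import math


def tx_current_ratios(echo_amplitudes, echo_phases):
    """Return per-coil drive weights proportional to A_RX,i * cos(theta_i).

    The weights are scaled so that the largest magnitude equals 1.
    A negative weight means the coil is driven with inverted polarity.
    """
    raw = [a * math.cos(p) for a, p in zip(echo_amplitudes, echo_phases)]
    peak = max(abs(w) for w in raw)
    if peak == 0:
        raise ValueError("no echo detected on any coil")
    return [w / peak for w in raw]


if __name__ == "__main__":
    # Hypothetical echo readings for three coils: amplitudes in volts, phases in radians.
    print(tx_current_ratios([0.8, 0.4, 0.2], [0.0, math.pi, 0.0]))
```

With these sample readings the second coil receives half the current of the first with opposite sign, which is how a polarity detector output could be folded into the drive ratio.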

[Figure: prior-art trade-offs. Multi-coil WPT: multi-TX (MTX) issue. ME: high efficiency at mm scale, lower operating frequency, better misalignment tolerance, but not omnidirectional.]

[Figure: implant use cases (vagus nerve stimulation, cardiac pacemaker, endovascular implant; systole and diastole). Misalignment is due to body motion, respiration, heart beating, and hard-to-control orientation during implantation.]

[Figure: prior art. Normalized voltage versus rotation angle (degrees) for ME, inductive, and ultrasound links.]

[Figure: fully integrated implant with biasing magnets, electrode, and ME film; 12 mm scale.]

Maximum PTE & Omnidirectional Power Condition: for a multi-TX, single-RX system, the poster derives the wireless transfer (wt) efficiency. If all the TXs have identical impedance, a bound on the efficiency follows; the equality holds, and PTE reaches its maximum value, only under the stated condition. [Equations not recoverable from the extracted text.]

Stimulator Setting
  Amplitude (4 bits): 1-3.5V
  Pulse width (3 bits): 0.05-1.5ms
  Shape: bi-phase, monophasic
  Frequency: 0-200Hz, continuous change
  Delay (4 bits): 0-1ms

RX Single Channel Summary
  Technology: 180
  Power: total 18.65mW, 400MHz
  Area: 0.147mm²
  Gain range: 10-60dB with 1dB/step
  IRN: -149dBm/Hz
  LPF BW: 1M-8M
  ADC bits: 7

[Figure: die photos with dimension labels 1.68mm, 0.8mm, 1.95mm, and 0.83mm. Implant chip blocks: pulse TX, central controller, PMU, stimulator, CLK, V&I, and capacitors for VLDO_TX, VLDO, DC-DC, and VRECT. RX chip blocks: LDO, LNA, PGA, BPF, LDO capacitor, channels 2 and 3, ramp generator, phase detector, scan chain, test buffer, peak detector, ADC, controller.]

[Figure: packaged system. Implant board: 14.2mm³; 5×2×0.18mm ME film; 250nH AE coil; chip.]

Paper No.: 17.3
A Fully Wireless, Miniaturized, Multicolor Fluorescence Image Sensor Implant for Real-Time Monitoring in Cancer Therapy
R. Rabbani*1, M. Roschelle*1, S. Gweon1, R. Kumar1, A. Vercruysse1, N. Cho2, M. Spitzer2, A. Niknejad1, V. Stojanovic1, M. Anwar1,2
*Equally-Credited Authors (ECAs); 1 University of California, Berkeley, CA; 2 University of California, San Francisco, CA
Poster sections: Motivation; Architecture; System Implementation; Verification
Panels: Die Photo; Imaging Response to Immunotherapy (Ex Vivo); Performance Comparison; Wireless Fluorescence Imaging Setup

A 0.09cm³ Implantable Fluorescence Imager
  Ultrasound wireless power harvesting and communication: long-term implantation deep in the body.
  Lens-less imaging: chip-scale form factor.
  Multicolor fluorescence imaging: track multiple different cell types in vivo in real time.

Problem: Limited access to real-time microscopic information in tissue for time-sensitive treatments for cancer therapy. Current imaging methods (MRI and PET) look at macroscale changes developing over months. Current image sensors lack wireless operation, precluding untethered chronic implantation. A miniaturized wireless fluorescence microscope enables high-resolution real-time imaging of multiple cell types deep within tissue.

Paper No.: 17.4
Environmentally-Friendly Disposable Circuit and Battery System for Reducing Impact of E-Wastes
N. Miura, H. Taguchi, K. Watanabe, M. Nohara, T. Makita, M. Tanabe, T. Wakimoto, S. Kumagai, H. Nosaka, A. Aratake, T. Okamoto, S. Watanabe, J. Takeya, T. Komatsu
Nippon Telegraph and Telephone (NTT), Atsugi, Japan; University of Tokyo, Kashiwa, Japan; PI-CRYSTAL Inc., Kashiwa, Japan
Poster sections: Motivation; Materials and Devices; System Implementation; Verification

Problem of E-Wastes (*): electrical devices everywhere. Goal: no harmful substances nor scarce elements in disposable electric devices.
(*) Electronic and Electrical Wastes

[Figure: conventional system versus this concept; application (moisture sensor with sound wave) and its state after use (e-wastes).]

[Figure: organic semiconductor with carbon electrode. Device cross-section: polycycloolefin, AlOx (60 nm), C (30 nm), n- and p-type semiconductor, source, drain, gate; 30 μm scale. n-type and p-type transistor photos, 100 μm scale. Transfer curves: drain current (A) and square root of drain current versus gate voltage (V), in saturation at VD = -10 V and VD = 10 V.]

[Figure: battery made only with soil-fertilizer elements. Electrodes of C and Mg, separator with (CH3COO)2Mg electrolyte, cellulose, O2 (air); 4 mm scale. Discharge curve: voltage (V) versus elapsed time (hour) at 100 μA.]

[Figure: impact on plant growth. Cultivation test with crushed battery in soil (610 g), three weeks after seeding, comparing our battery, Mn dry battery, Li coin battery, and blank, at battery amounts of 0.25 g, 1.25 g, and 5.00 g.]

[Figure: demonstration system (74 mm board) with oscilloscope, battery, and speaker. System diagram: power-on, moisture sensor circuit, 3-bit ID, carrier oscillator (fc), clock oscillator (fclk), P/S, counter, selector, modulator, output.]

[Table: performance. Rows: carrier frequency (Hz), clock frequency (Hz), voltage (V), current (A), data rate (bps), number of IDs, modulation (ASK). The extracted values (140; 6.7; 14.8-15.6; 3-20; 3; 6.7) cannot be reliably assigned to rows.]

Achievements and Future Work

[Table: materials, this work versus a conventional example. Circuit rows: electrode/wire, semiconductor, substrate, insulator. Battery rows: cathode, anode, electrolyte. This work: organic semiconductor (C, H, O, S, N); C; AlOx, polycycloolefin; polyimide, PLA; C; Mg; (CH3COO)2Mg. Conventional (ex.): Si, As, Ga, etc.; Al, Cu, Ru, W, Pt, etc.; Si; Mn, O, etc.; Li, etc.; Li, Cl, O, etc.]

[Table: achievement levels for circuit and battery, marked as achieved or checking. Harmful substances; scarce elements. Resource: rare-earth-element free, minor-metal-element(*) free, precious-metal-element free, metal-element free. Environment: burnable, harmless (to plant), biodegradable. (*) differs from countries.]

[Table: chip results. Results of 6 chips and results of the 1-chip version with 3-bit ID. Number of IDs: 2; number of transistors: 82; size (mm x mm): 18 x 30.]

Paper No.: 17.7
Droplet Microfluidics Co-Designed with Real-Time CMOS Luminescence Sensing and Impedance Spectroscopy of 4nL Droplets at a 67mm/s Velocity
Qijun Liu1, Diana Arguijo Mendoza1, Alperen Yasar1, Dilara Caygara1, Aya Kassem1, Douglas Densmore1, and Rabia Tugce Yazicigil1
1 Boston University, Boston, MA
Poster sections: Motivation; Platform Implementation and Architecture; Demo System Implementation; Verification and Measurement
Panels: Measurement Setup; Measurement Flow Diagram; Impedance Spectroscopy Electrical Performance; Impedance Spectroscopy Real-Time Droplet Detection; Luminescence Detection Optical Performance with Droplet Detection

Acknowledgments: This work was supported by NSF SemiSynBio-II (grant no. 2027045), Catalyst Foundation, NIH T32 (grant no. 5T32GM130546-04), and DoD (grant no. HQ00342110008).

Low-noise and high-resolution impedance spectroscopy and high-resolution bioluminescence detection are achieved by:
  A modular and low-cost droplet microfluidic device with conductive-ink electrodes.
  A high-gain TIA using a temperature- and process-compensated pseudo-resistor-based feedback network.
  A ring-oscillator-driven counter for the quantification of light-pulse amplitude.

Luminescence detection: resolution of 6.7nA/count. Detects 38.2 to 118.9nL bioluminescent droplets at velocities from 0.8 to 24.3mm/s.

Impedance spectroscopy: measurements demonstrate 1dB compression at 36nA and input-referred noise of 2.4pArms at 1kHz. Resolution of 45pA. Detects 4 to 47.9nL droplets at velocities from 1 to 67mm/s.

Paper No.: 20.3
A 23.9TOPS/W @ 0.8V, 130TOPS AI Accelerator with 16× Performance-Accelerable Pruning in 14nm Heterogeneous Embedded MPU for Real-Time Robot Applications
Koichi Nose, Taro Fujii, Katsumi Togawa, Shunsuke Okumura, Kentaro Mikami, Daichi Hayashi, Teruhito Tanaka, Takao Toi
Renesas Electronics, Tokyo, Japan
Poster sections: Motivation; Architecture; System Implementation; Verification

[Figure: demo tasks, starting with pose estimation.]

[Figure: object detection demo. Sparse models give faster operation while keeping recognition accuracy; panels compare dense with sparse (90%) and dense with sparse (70%). S/W: sparse operation demo with multiple tasks.]

[Figure: demo system setup with thermal camera, DRP-AI board, monitor, and power monitor. The comparison demo is shown in video.]

[Table: example of AI application evaluation results for DRP-AI (INT8/dense and INT8/sparse 70) and a GPU. Metrics: inference time (note 1) and inference power (active minus standby). Extracted values: 11.7msec, 0.70W, 3.0W, 16.0msec.]
Note 1: Only CNN operation time (does not include pre/post processing).
Note 2: The values are sample application results (measured in 2023) and may be different at demos.

Target and functions: human-robot interaction systems; embedded systems with advanced environmental awareness and real-time judgement and control.

Panels: Requirements and solution; AI accelerator comparison (video demo); AI processing with DRP-AI (live demo); Comparison with prior works; AI power efficiency with actual models; Flexible N:M pruning method; Performance acceleration method with pruned models
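The 20.3 poster names a flexible N:M pruning method without the extracted text saying how N and M are chosen. As general background, N:M structured sparsity is usually defined as keeping at most N nonzero weights in every group of M consecutive weights. The sketch below illustrates that general definition with a simple magnitude-based rule. The function name, the keep-largest-magnitude criterion, and the 2:4 example are assumptions for illustration and are not the accelerator's actual method.

```python
def prune_n_of_m(weights, n, m):
    """Zero all but the n largest-magnitude weights in each group of m weights."""
    if not 0 <= n <= m:
        raise ValueError("need 0 <= n <= m")
    if len(weights) % m != 0:
        raise ValueError("weight count must be a multiple of m")
    pruned = []
    for start in range(0, len(weights), m):
        group = weights[start:start + m]
        # Indices of the n entries with the largest absolute value in this group.
        ranked = sorted(range(m), key=lambda i: abs(group[i]), reverse=True)
        keep = set(ranked[:n])
        pruned.extend(w if i in keep else 0.0 for i, w in enumerate(group))
    return pruned


if __name__ == "__main__":
    # Hypothetical weights, pruned with a 2:4 pattern (50% sparsity).
    print(prune_n_of_m([0.1, -0.9, 0.3, 0.05, 0.7, 0.2, -0.6, 0.0], n=2, m=4))
```

Lowering N for a fixed M raises the pruning ratio, which is the knob a "flexible" scheme would presumably vary per layer or per model.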

[Figure: processing flow split across DRP, MAC, and CPU: pre/post processing, convolution, and pooling and other intermediate operations.]

Requirements:
  Fan/fin-less operation: 10W for fan/fin-less to embed in robot.
  100TOPS (peak) for multi-task, multi-input robot applications (360).
  Real-time cooperation with AI and non-AI applications.

Solution:
  1) AI performance acceleration with flexible pruning rate control technology
  2) Heterogeneous architecture with reconfigurable processor (DRP), MAC unit, and CPU

[Figure: ImageNet top-1 accuracy (%) versus pruning ratio (0.5 to 1) for unstructured pruning (ideal) and this work. The annotation beginning "degrade is" is truncated in the source.]

[Fragment of a later poster; the content matches paper 20.7, NeuGPU.]
10m modeling & 1-FPS rendering on edge GPU

Key Building Blocks
  1) Spatial Management Unit (SMU): -66% off-chip memory access supporting SHSP
  2) Attention-based Hybrid Interpolation Unit (AHIU): 1.51× throughput by skipping far vertices; -56.4% power by hybrid interpolators (PIU & SIU)
  3) Similarity-Sparsity Skipping Core (S3Core): 1.61× efficiency & 1.41× throughput by coarse-grained similarity/sparsity skipping

[Figure: hash embedding access characteristics. Coarse levels: parallel processing, stored in a small cache with parallel MACs. Fine levels: serial processing, stored in SRAM with a serial MAC. Axes: hit ratio (%) and number of accessed vertices for levels 1 to 16.]

[Figure: 3D modeling performance and visual results for text-to-3D ("a corgi taking a selfie"), NeRF, and photogrammetry. Demonstration system: 3D capture of real-world objects (*reconlabs).]

Paper No.: 20.8
Space-Mate: A 303.5mW Real-Time Sparse Mixture-of-Experts-Based NeRF-SLAM Processor for Mobile Spatial Computing
Gwangtae Park, Seokchan Song, Haoyang Sang, Dongseok Im, Donghyeon Han, Sangyeob Kim, Hongseok Lee, and Hoi-Jun Yoo
Poster sections: Motivation; Architecture; System Implementation; Verification

[Figure: demo scenario on a custom environment (172cm, 144cm, 40cm). 1: start at a fixed position; 2: user customizes the map; 3: user drives the robot. Overlays show the current position and the camera trajectory.]

[Figure: chip layout. Hierarchical sampling core, GWM and interconnect, IOMEM, SSCore 0 to 3, SMoE clusters 1 to 3, I/F0, I/F1, DSCore 0 and 1.]

Chip Summary: Specifications
  Technology: Samsung 28 nm
  Die area: 4.5 mm × 4.5 mm
  SRAM: 1348 KB
  Supply voltage: 0.7-0.9V
  Max. frequency: 200 MHz
  Data type: FP16 (DNN), FP8 (3D embedding)
  Sensor type: VGA, RGB-D
  Frame rate (FPS): 13.1-52.6

Tracking/Mapping Performance
  Operating condition: 0.7V, 50MHz | 0.79V, 100MHz | 0.9V, 200MHz
  Power (mW): 101.7/118.6 | 226/263.1 | 570.2/655.7
  Energy/frame (mJ/frame): 2.44/6.34 | 2.71/7.03 | 3.42/8.76

[Figure: demonstration system. System board: Space-Mate chip, flash, DDR3, FPGA, USB controller (FX3). Robot: ROS program on Raspberry Pi 4 with an RGB-D camera (USB 3.0). Images go to the board over USB, pose and map come back, and a remote PC sends commands over WiFi.]

Challenge of SMoE-based NeRF-SLAM
[Figure: conventional DNN with heavy operations versus SMoE with a decision layer and an expert layer processed per batch (batch0 to batch4). Small channel count: 6.9 OPs. Weight reuse: 9697. Cycle breakdown (0 to 10K) of computation and memory for decision, expert, activation, and parameter access; 94.9% of the memory cycles come from redundant fetch.]

Overview of Out-of-order SMoE Dataflow
  Advantage: expert-stationary parameter access.
  Disadvantage: bank conflict and workload imbalance.
  Expert-stationary mapping of 8 experts: SSCore0 holds experts 0 and 1; SSCore1 holds experts 2 and 3; SSCore2 holds experts 4 and 5; SSCore3 holds experts 6 and 7.

Bank-conflict Free Expert-wise Routing
  1. Decision layer: in-order access. Input activations are written to the shared IOMEM with bank id = mod 8 (batch idx), so bank 0 holds IA0 and IA8, bank 1 holds IA1 and IA9, up to bank 7 with IA7 and IA15.
  2. Expert layer: out-of-order access by the DMAP. In alternating time slots (T0 to T3), SSCore0 and SSCore2 access banks 0 to 3 while SSCore1 and SSCore3 access banks 4 to 7.

[Figure: expert index and address generation with the GWM (320KB) and the CTT (11B), which records core ID, expert index, and batch count. Allocating experts to cores pairs Exp2 with Exp7, Exp4 with Exp6, Exp0 with Exp1, and Exp3 with Exp5; 8 expert parameter sets are unicast to the SSCore weight buffers (WBUF).]

Complementary Sorting based Task Allocation
[Figure: number of batches per expert (0 to 7). A descending sort and an ascending sort combine the 4 largest and 4 smallest workloads, giving mixed workloads with a similar number of batches on SSCore0 to SSCore3. Task allocation for high PE utilization.]
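The Space-Mate poster describes two routing ideas that are easy to state in code: a batch is stored in the bank given by its index modulo 8, and experts are assigned to cores by pairing the heaviest remaining workload with the lightest. The sketch below is an illustration under assumptions: the function names are hypothetical, the batch counts are invented, and pairing exactly two experts per core is inferred from the 8-expert, 4-core layout shown on the poster.

```python
NUM_BANKS = 8  # the poster maps batches to 8 IOMEM banks


def bank_id(batch_index):
    """Bank holding a batch: bank id = batch index modulo 8."""
    return batch_index % NUM_BANKS


def complementary_allocation(batches_per_expert, num_cores=4):
    """Pair the i-th largest workload with the i-th smallest, one pair per core.

    Returns a list of (heavy_expert, light_expert) tuples, one per core.
    """
    if len(batches_per_expert) != 2 * num_cores:
        raise ValueError("expected exactly two experts per core")
    # Expert indices ordered from most to fewest batches.
    order = sorted(range(len(batches_per_expert)),
                   key=lambda e: batches_per_expert[e], reverse=True)
    return [(order[i], order[-1 - i]) for i in range(num_cores)]


if __name__ == "__main__":
    counts = [4, 5, 8, 3, 6, 2, 7, 1]  # hypothetical batches routed to experts 0..7
    pairs = complementary_allocation(counts)
    print(pairs)
    print([counts[a] + counts[b] for a, b in pairs])  # per-core batch totals
```

For these invented counts every core ends up with the same total number of batches, which is the balancing effect the "mixed workloads, similar number of batches" label points at.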

[Figure: conflict-free gather. Nonzero (NZ) flags of a batch tile are scanned over cycles cyc0 to cyc2; zero elimination pushes each nonzero batch index (BIDX) into one of 8 FIFOs with FIFO ID = bank ID = modulo-8 (BIDX), for example B17 goes to bank index 1. A batch counter (BC) tracks entries, and the stage pops 4 nonempty FIFOs at a time together with valid bits to form the gathered index. The batch allocation table (BAT#0, 0.28 KB) and the DMAP buffer run in parallel in 8 EBRUs; the CTT (11B) is updated when reordering finishes.]

Paper No.: 23.2
A 1mm² Software-Defined Dual-Mode Bluetooth Transceiver with 10dBm Maximum TX Power and -98.2dBm Sensitivity 2.96mW RX Power at 1Mb/s
N. Scolari, F. X. Pengg, K. Manetakis, C. A. Salazar, A. Vouilloz, E. Pérez Serna, A. Dissanayake, P. Persechini, V. Kopta, E. Le Roux, F. Chicco, S. Cillo, N. Gerber, C. Barbelenet, F. Epifano, P. A. Dal Fabbro, N. Raemy, R. Berguerand, J. Beysens, G. Haenning, P. Jokic, Y. Piguet
CSEM, Neuchâtel, Switzerland
Poster sections: Motivation; Architecture; System Implementation; Verification

[Figure: software-defined dual-mode Bluetooth transceiver for IoT, digital baseband. RX side: BQF offset, RX gain, peak detectors, AGC, reconfigurable RX data path fed by the ADCs. TX side: reconfigurable TX data path, programmable TX packet handler, TX FIFO, AES CCM, PA linearization and interpolation, ADPLL interface (magnitude, frequency) over a dedicated bus with a fast clock. Control: crystal logic, protocol timer, sequencer.]

[Figure: digital baseband, continued. Programmable RX packet handler, RSSI, RX FIFOs, registers, IRQs, SPI, APB, GPIOs, SoC interface; bitstream, commands, and IQ samples.]

[Figure: reconfigurable data path. A dispatcher and crossbar connect blocks 1 to n between a data path source (TX: packet handler; RX: ADCs) and a data path sink (TX: ADPLL + PA; RX: packet handler). Blocks exchange data and data_type with a req/ack handshake; the timing diagram shows Ck, Data, Data_type, Req, and Ack with new data and a new data_type every n cycles. TX example: packet handler, CRC, data whitening, pulse shaper, interpolator (×N), multiplier (×M), PA ramp up/down, ADPLL + PA.]

Need for an optimized Bluetooth Dual-Mode (BT-DM) transceiver (TRX) integrating:
  The low-power and long-range benefits of BLE (e.g., IoT)
  The high performance of BT-BR/EDR (e.g., audio, legacy)

[Figure: low-power and high-performance RF/analog front end in 22nm FD-SOI CMOS. LNA, n-path mixers, active-RC low-pass filter (LPF), 9b ADCs, RSSI/AGC (LNA, LPF), 25% duty IQ and 50% duty dividers (/2), dynamic frequency dividers, digitally controlled delay line, dual-mode digital PA (+10, +2 dBm), TX power sense, on-chip match, loop-back, RC oscillator for calibration, temperature sensor, 48 MHz XTAL/XO, PTAT/band-gap reference, LDOs (VDDA, VDDPA) with on-chip decoupling, ESD protection circuitry, GPIO/SPI, TX/RX calibration, digital backend. 4.8 GHz ADPLL: divider, loop filter, TDC/DTC, amplitude regulation, digital FM predistortion.]

Panels: Performance comparison; System evaluation results

System evaluation results
  1) Direction finding detection: max. distance = 10m; angular resolution = 3 (5cm, 1m); algorithm processing = 100ms; 1 CTE every 5ms
  2) Visage: ULP RX: 2.96mW; TX Pout,max = 10.0dBm; ULP imager: 90μW/frame

[Fragment of a later poster; the content matches paper 28.6, the voltage regulator chiplet. The fragment begins mid-sentence in the source.]
close-to-ideal PDNs & operation near Vmin

High Current Architecture
  Basic circuit diagram with 52 phases enabling 200A
  Active phase current balancing for load distribution across phases
  Efficient low-profile VR with formidable current capability

[Figure: circuit diagram. Phases 0 and 1 with phase control and PWM between Vccin, VXBR, and VSS; phase current balancing; gang/AVP control with vidcode; compensator; current sense and control (Isense); package inductor array; Cload. Other panel labels: Inductor Design; Byte-Interleaved Register Placement.]

[Figure: representative 1kW system on package. VR chiplets N and S beside the load dies in a multi-chip package, high-current input supply, XDP interface for VR chiplet and load dies. Construction options for use of the VR chiplet: base die, top die, VR chiplet array or die-side VR chiplet, cold plate, SoP package with coax magnetic inductors, motherboard. Representative net power needs of a 500W SoC.]

Validation setup: fully automated test setup; package with two ganged VR chiplets and a transient load. The inductor array is fabricated in the package, and the inductor footprint is contained within the die shadow (distributed magnetic inductors).

Results:
  Measured power conversion efficiency of 81-87% across various load and output voltage conditions.
  High-bandwidth regulation reduces the need for low-frequency decoupling with capacitors.
  Single chiplet operation at 1.75V input: closed-loop gain and phase.
  Single die operation: an on-die step stimulus realizes an effective voltage step of 14.7mV on Vref for small-signal characterization; reference tracking is shown (trigger, clock, Vout).

[Figure: two-die ganged operation. Efficiency (55% to 85%) versus total load current (0 to 100 A), and load current distribution (A): I_U1 and I_U2 against I_U1_ideal and I_U2_ideal.]

Paper No.: 29.5
A Portable 14GHz Dual-Mode Pulse and Continuous-Wave Electron Paramagnetic Resonance Spectrometer Using a Subharmonic Direct Conversion Receiver
J.-H. Sun, M. Rustom, T.-D. Nguyen, J. Singh, P. Z. Qin, C. Sideris
University of Southern California, Los Angeles, USA
Poster sections: Motivation; Architecture; System Implementation; Verification

Electron Paramagnetic Resonance (EPR)
Potential of portable multi-mode EPR:
  Portable point-of-care diagnostics devices
  Wearables
  Miniaturizing quantum information systems

[Figure: portable spectroscopy system with FPGA, permanent magnet, EPR IC, and data acquisition PCB.]

Performance Summary and Comparison Table (columns: This Work | ISSCC 2018 [1] | MWSCAS 2022 [2] | ISSCC 2023 [3] | MWCL 2021 [4] | SENSORS 2019 [6])
  Process: 65nm CMOS | 0.13μm CMOS | 40nm CMOS | 0.13μm BiCMOS | 28nm CMOS | 0.13μm CMOS
  Freq. range (GHz): 12.8-14.9 (CW) | 11.8-14.2 | 12.2-14.9 | 261.3-263.6 | 11.2-14.5 | 11.5-12.8
  Sensitivity (spins/√Hz): 2.9×10^9 (CW) | 8×10^9 | 6×10^10 | Not reported | 6.1×10^8 | 3.4×10^9 (note 1)
  IC power (mW, total): 98 (Pulse), 39 (CW) | 120 (8 elements) | 2404300 [two entries fused] | 1.12 (note 2) | 255
  EPR mode: CW, Pulse | CW | CW (+RS) | CW | CW | CW
  Note 1: from DPPH spectrum.

107、,linewidth 2 G 2Requires external RF sourceEPR on ChipNominal B00.5 TPulse length resolution0.6 nsNominal B(pulse)3.9 G(140m sensor)Minimum pulse length0.6 nsNo.Sensors2(independent)Delay resolution2.3 nsFreq.range(pulse)12.815 GHzSensitivity(pulse,1000 averages)4.6109spinsFrom specialized laborator

108、y technique to widely-available analytical method EPR ExperimentsInversion Recovery(2-Pulse)Free induction decay(FID)CW SpectraSensor VCOPulse On140m200mPulse OffFast startupReduced phase noiseMinimal deadtimeStandaloneVCOInject+Short(2x)Spectrometer Architecture0.8431.1430.013300.0020.0040.0060.008

A 40nm VLIW Edge Accelerator with 5MB of 0.256pJ/b RRAM and a Localization Solver for Bristle Robot Surveillance
Samuel D Spetalnick*1, Ashwin Sanjay Lele*1, Brian Crafton1, Muya Chang1, Sigang Ryu1, Jong-Hyeok Yoon2, Zhijian Hao1, Azadeh Ansari1, Win-San Khwa3, Yu-Der Chih4, Meng-Fan Chang3, Arijit Raychowdhury1
1 Georgia Institute of Technology, Atlanta, USA; 2 Daegu Gyeongbuk Institute of Science and Technology, Daegu, Korea; 3 TSMC Corporate Research, Hsinchu, Taiwan; 4 TSMC Design Technology, Hsinchu, Taiwan
Paper No.: 30.1
Poster sections: Motivation, Architecture, System Implementation, Verification

Compute Mounted on Tiny Robot
Task: autonomous steering with object avoidance. Platform: bristle robot (Hao et al., ICRA 2020).
[Figure: processing pipeline. Perception front-end: input image, encoder, decoders, max depth, depth, segmentation, relevant objects, annotated objects. Localization back-end: space encoding (A = A + W), odometry, instantaneous motion, loop closure, trajectory top-view (X, Y), actuation.]
Workload | Requirement | Challenge
Perception | Sub-mW standby | Power-down
Perception | MB-range NVM storage | Data density
Localization | Frequent state updates | In-place addition
Perception: on-chip inference with dense RRAM. Localization: in-place updates with 10T cells.

RRAM for On-Chip Inference (Perception Task)
Large SRAM vs large RRAM, 40nm (sim., 256Kb macro, 0.9V): power-down power and density; annotations 5x, 2.9x, 2.1x.
[Plot: normalized power (HD SRAM, RRAM) and normalized area (cell, 256Kb macro; SRAM, RRAM).]

Nonvolatile Matrix Unit (NMU)
[Figure: NMU block diagram. Labels: RRAM 32KB sub-banks 0 to 17; 256b/2 cycles NVM + ECC; 128b I+O ring interface; 128b/384b matrix re-mapping; 20 x 384b tensor RF (TRF); SPI (configuration & test); 8b tensor datapath; 32 x 32b core RF; 2P output RF; 8T 4KB (+ re-mapping); 4x4 MatMul; ReLU; quantization; add/scale; SRAM instruction memory 8KB; SRAM tensor memory 64KB; VLIW core; parsing & control.]

Custom RRAM Macro Floorplan
[Figure: macro floorplan. Labels: 256x256 RRAM sub-arrays; write MUX and switches; read MUX, RREF, SA; read SL VSS ties; write WL driver; read WL driver; timing gen., VCLAMP DAC, decoding, routing; IO/routing/level shift; 256x32b RRAM; 16:1 WR MUX; 4:1 final WR MUX + polarity; 32:4 RD MUX; 8:1 RD MUX; RD RREF DAC; RD SA & DOUT driver.]

Chip with 10 NMUs
[Figure: chip block diagram. NVM Matrix Units 0 to 9, each with 0.5MB RRAM, VLIW controller, 68KB SRAM and an 8b 4x4 matrix unit; 128b core data ring; localization controller with modules 0 to M; 256b register R/W; SPI port & top logistics; test board, 256b SPI, 0-230MHz CLK.]

Test Chip Summary
Chip area labels, in order of appearance: IOs and sealring; 10 x NMUs; localization unit; ring routing; SPI, power, etc. Values, in order of appearance: 11.2%, 68.7%, 2.0%, 2.7%, 15.3%.
NMU area labels, in order of appearance: RRAM bank; 6T tensor SRAM; 8T ring SRAM; 6T inst. SRAM; logic, RFs, etc. Values, in order of appearance: 57.1%, 13.4%, 2.1%, 2.5%, 24.9%.

Localization Task Overview
Odometry; loop closure; space encoding of image t and image t-1; re-visiting a previous spot.
In-Place Updates with 4x 10T Cells
[Figure: 4x bitcell (b0 to b3, Vbp-bitcell, Vbn-bitcell, VDD, VSS). Voltage pulses for arithmetic: retain data, shift left (+1), shift right (-4).]

Measured Power and Performance
Local access costs (pJ/bit):
Operating point | RRAM | SRAM
800mV, 80MHz | 0.256 | 0.171
850mV, 100MHz | 0.272 | 0.193
1100mV, 210MHz | 0.342 | 0.272
Other panels: Power across Configurations; Efficiency & Compute Density (TOPS/mm2 and TOPS/W versus VDD, 80MHz to 210MHz; only using RF (transient); reading from RRAM/SRAM with 3x weight reuse); End-to-End Shmoo; Pre-ECC RRAM Bit-Error Rate; System and NVM Macro Functionality.

Summary Information
Die micro-photograph; prototype mockup (flex PCB back, flex PCB front, piezoelectric bristle robot, 24 mm, mounted on robot).

Demonstration System Summary
Integrated portable testing board; microcontroller/USB-based detailed testing board.
[Figure: test setup. Test PC (high-level Python USB interface) connects over USB (data and power) to the testing board. On the board: Pi Pico microcontroller for test management (SPI, SPI/I2C); DACs/clock; power pre-filtering; HV rails for RRAM write (DACs & voltage buffers); DUT core VDD (LDO, trimming & power measurement); PCB CLK source & switch; I/O VDD. Robot-mounted board: DUT, MCU, VR, driver stages 1 and 2, camera.]

Ex. Python Test Script
    # (Setup code omitted)
    # Write & Check
    for core_num in core_nums:
        if do_write:
            if is_otp:
                core.wr_otp_rram(s, core_num, wdata[:36*lines])
            else:
                if needs_form:
                    # The poster passes a bare `lines` after `start=0`, which is not
                    # valid Python; a keyword named `lines` is assumed here.
                    core.form_rram(s, core_num, start=0, lines=lines)
                if not form_only:
                    core.wr_rram(s, core_num, wdata[:36*lines])
        # Indentation is lost in the extraction; the check is assumed to run once per core.
        rw.check_rram(s, core_num, wdata, lines, start=0)

[Plot: end-to-end shmoo, frequency (MHz) versus VDD (mV). Legend: All Pass; Pass (RRAM OTP Mode); RRAM Fail; Logic Fail.]

Technology: 40nm with foundry RRAM
Chip size: 4.5mm x 4.5mm
Voltages: 0.8-1.1V VDD, 1.5-4.0V write
Modules: 10 x NVM Matrix Units, 1 x Localization
Retentive sleep: 110µW at 0.5V
On-chip RRAM/SRAM: 5MB (post-ECC) / 760KB
RRAM ECC: 72-bit-word SECDED
RRAM access energy at VMIN: 256fJ/bit
MAC compute efficiency at VMIN: 0.843 TOPS/W

[Figure: die micro-photograph, 4.5mm x 4.5mm. Labels: NVM Matrix Units 0 to 9, Localization & Other, SPI & Other, core-to-core ring.]

[Plot: Pre-ECC Bit Errors. Pre-ECC BER (ppm.) versus RDAC wiper setting (0.15); 4,718,592 freshly written cells at 27C. Series: 1.1V, 200MHz; 0.85V, 100MHz; 0.80V, 100MHz (OTP); Pre-ECC Tgt. (1ppb.).]
[Plot: Measured Power (mW). Categories: Clk Gated, RRAM On, Clock On, Peak Power. Operating points: (OTP) 0.8V, 80MHz; 0.85V, 100MHz; 1.1V, 210MHz. Values in order of appearance: 0.40, 0.55, 1.10, 3.32, 4.02, 6.63, 35.69, 49.46, 156.90, 132.0, 181.1, 603.9.]
[Chart title: NMU Area.]

Vecim: A 289.13GOPS/W RISC-V Vector Co-processor with Compute-in-memory Vector Register File for Efficient High-performance Computing
Yipeng Wang, Mengtian Yang, Chieh-pu Lo, Jaydeep P. Kulkarni
University of Texas at Austin, TX
Paper No.: 30.6
Poster sections: Motivation, Architecture, System Implementation, Verification

Mitigate the efficiency trajectory of HPC.
5-10X more efficiency scaling with compute-in-memory (CIM) (AMD, ISSCC 2023).
First CIM system based on a RISC-V general-purpose platform.
Architectural improvement for vector processor.
Reduced data movement and memory BW.
[Figure: die and test setup. Die labels: CPU, vector processor, CIM VRF, others. Setup labels: oscilloscope, power, JTAG, UART.]
Demonstration: live throughput comparison of this work and a baseline vector processor on matrix multiplication and AI workflows.
[Figure: efficiency trend toward zettascale; caption: Efficiency increasingly important. Remaining labels are not recoverable from the extraction.]

A High Accuracy and Energy-Efficient Zero-Shot-Retraining Seizure Detection Processor with Hybrid-Feature-Driven Adaptive Processing and Learning-Based Adaptive Channel Selection
J. Liu1, X. Liu1, X. Wang1, Z. Xie1, Z. Zhong1, J. Fan1, H. Qiu1, Y. Xu1, H. Qin1, Y. Long1, Y. Zhou2, Z. Shen3, L. Zhou1, L. Chang1, S. Liu1, S. Lin1, C. Wang3, J. Zhou1
1 University of Electronic Science and Technology
of China, Chengdu, China; 2 West China Hospital of Sichuan University, Chengdu, China; 3 Huazhong University of Science and Technology, Wuhan, China
Paper No.: 33.1
Poster sections: Motivation, Architecture, System Implementation, Verification

Challenge 1: Patient-specific training requires time-consuming and costly hospitalization.
Challenge 2: Limited detection accuracy.
Challenge 3: Large energy consumption.
Our solution: a zero-shot-retraining seizure processor with hybrid-feature-driven adaptive processing (high accuracy, high energy efficiency).
[Figure: EEG electrodes (Fp1, F3, F7, T3, T5, O1, P3, C3, O2, P4, Cz, Pz, Fz, F4, Fp2, C4, F8, T4, T6) and multi-channel EEG (F7-Fp1, T5-T3, T4-T6, T6-O2) feeding the seizure detection processor, which outputs classification results (normal, seizure onset).]
The seizure detection processor detects the seizure onset of patients for alert or stimulation.
[Figure: patient-specific training. Non-seizure data (easy to collect) and seizure data (difficult to collect) form the labeled data used to train the seizure detection classifier.]
[Figure: Challenge #2, limited accuracy. Labels: false alarm, miss detection, significant latency, seizure onset, seizure termination, normal, ictal; axes: time (s), amplitude (mV).]
[Figure: multi-channel EEG input through convolution layers and fully-connected layers to a seizure/normal decision.]

Hybrid-feature-driven adaptive processing architecture with on-chip learning, requiring no seizure data from the patient.
Learning-based adaptive channel selection technique to further reduce the energy.
Achieved the highest sensitivity (98.8%) and the lowest energy consumption (0.072µJ) with comparable specificity among SOTA designs.

[Figure: die photo, 1.4mm x 0.7mm. Labels: HAPE, MFEE, ROLE, Others.]
MFEE: Manual-Feature Extraction Engine
HAPE: Hybrid-Feature-Driven Adaptive Processing Engine
ROLE: Reconfigurable On-Chip Learning Engine
LCSM: Learning-based Channel Selection Module

Overall Architecture of the Proposed Seizure Detection Processor
[Figure: block diagram. Multi-channel EEG enters over SPI through the data interface (NN instructions & weights); data buffer & data segment; LCSM with channel weights and Ch_Sel[15:0]; channel multiplexer; MFEE with frequency-domain and time-domain feature extraction for Ch#0 to Ch#15 and a manual-feature buffer; HAPE with feature-driven adaptive processing controller, NN feature extraction controller, feature fusion controller, hybrid-feature buffer and hybrid-feature classification controller; shared NN processing elements (activation buffer, weight buffer, multi-function MAC unit); ROLE; weight, bias and data memories; 8-bit and 8/16-bit datapaths; seizure detection results.]

Technology: 55nm
Core area: 0.98mm2
Supply voltage: 0.68V
Frequency: 2.5MHz
SRAM size: 8KB
Data precision: INT16/INT8
Power consumption: 25.8µW
Energy/classification: 0.072µJ

Demo1: Seizure Detection Using Public EEG Dataset
[Figure: laptop GUI (send test EEG, display seizure detection results) connected over USB to the test board (3cm x 3cm) carrying our chip and flash; chip power measurement.]
Demo2: Wearable Brain-Computer-Interface System
[Figure: wearable EEG recording device, Bluetooth link, motor imagery, wireless transmit, move control, robot control (move forward, spin, move backward).]

Demonstration Flow
Demo1: The laptop sends EEG signals from the CHB-MIT dataset to the test board and captures seizure detection results from the test chip. A current meter measures the power consumption of the test chip.
Demo2: Real-time user EEG signals collected from a wearable BCI device are sent to the test board via Bluetooth, and our chip is reconfigured to identify the imagined motor commands and control the robot's movement.

A Sub-1µJ/class Headset-Integrated Mind Imagery and Control SoC for VR/MR Applications with Teacher-Student CNN and General-Purpose Instruction Set Architecture
Zhiwei Zhong*, Yijie Wei*, Lance Go, Jie Gu (*Equally Credited Authors)
Northwestern University, Evanston, IL, USA
Paper No.: 33.2
Poster sections: Motivation, Architecture, System Implementation, Verification

Fully-integrated System-on-Chip Block Diagram
[Figure: analog core with 16-channel AFE (LNA with pseudo-resistor and switch caps, PGA, LPF, SAR ADC), reference generator, clock gen., LDO x 3 (three instances); digital core with IIR band-pass filter, teacher CNN, student CNN, model control, PE array, data shifter, sparsity control, clock gating, input memory, weight memory, instruction/weight memory, output memory, output vector, results; SPI, scan IO, OOK.]

Flexible Data Flow
[Figure: ADC, IIR filter, DFT, Conv, FC, input memory, output memory. Tasks: motor imagery, mental imagery, affect detection, SSVEPs.]

Instruction Set Architecture (ISA) for General-purpose EEG Classification
Conv instruction fields: image size; kernel size; #kernels; addr of input; addr of bias; addr of weight; activation func.; pooling sel.; # of filters; sparsity thresh.
Type [2:0] | Instruction bits [127:3]
Conv | Image size, kernel size, #kernel, sparsity threshold
Data Mov | Memory selection, data length, initial and target address
FC | #Input neuron, #output neuron, sparsity threshold
WTA | #Class, branch target address (BTA) of instruction
DFT | Input length, frequency, channel selection, zero padding
IIR filter | Sampling rate, channel selection, coefficients address

Detailed Neural Processor Architecture and Configuration
[Figure: instruction memory (IIR, DFT, Conv, FC, Data Mov, WTA instructions) and control unit; input SRAM; PE columns 1 to 8, each with weight SRAM, clock gating, activation, max pooling and output SRAM banks 1 and 2; sparsity control (comparison, clock gating); data transfer & management for end-to-end operations; x8, x10.]

Teacher-Student CNN Control Flow
EEG Controlled Pop-up Menu
Yes or No selection by mental imagery. User practice: focus to pop up the menu; imagine a surfing scene; imagine a rainy scene.
VR game speed control (faster/slower) based on arousal detection
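
The teacher-student scheme named on this poster can be illustrated with a small sketch. The poster does not spell out its gating rule, so everything below is an assumption for illustration: a cheap student model runs on every window, and a larger teacher model runs only when the student's top class probability falls below a hypothetical `threshold`. The energy helper uses the per-class figures that appear in the poster's energy chart (read here as teacher 1.97 uJ, student 0.6 uJ, combined 0.89 uJ) and assumes a simple additive model in which the student always runs.

```python
from typing import Callable, Sequence, Tuple

Model = Callable[[object], Sequence[float]]  # returns class probabilities


def argmax(probs: Sequence[float]) -> int:
    """Index of the largest probability."""
    return max(range(len(probs)), key=lambda i: probs[i])


def cascade_classify(x, student: Model, teacher: Model,
                     threshold: float = 0.8) -> Tuple[int, bool]:
    """Return (label, teacher_used). `threshold` is a hypothetical knob."""
    probs = student(x)
    label = argmax(probs)
    if probs[label] >= threshold:
        return label, False          # student is confident: stop here
    probs = teacher(x)               # low confidence: wake the teacher
    return argmax(probs), True


def expected_energy(e_student: float, e_teacher: float, p_teacher: float) -> float:
    """Assumed additive model: student always runs, teacher runs with probability p."""
    return e_student + p_teacher * e_teacher


def implied_teacher_rate(e_student: float, e_teacher: float, e_combined: float) -> float:
    """Teacher activation rate that would explain a combined energy figure."""
    return (e_combined - e_student) / e_teacher


if __name__ == "__main__":
    rate = implied_teacher_rate(0.6, 1.97, 0.89)
    print(f"implied teacher activation rate: {rate:.3f}")
```

Under that additive assumption, the three energy figures would correspond to the teacher running on roughly 15% of classifications; a different accounting (for example, the student being skipped when the teacher runs) would give a different rate.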

Confusion Matrix Guided Teacher-Student CNN Scheme
Real-time Affect Tracking and Mental Imagery Classification

Comparison table (footnote markers as extracted: * Digital backend only; * Analog frontend + digital backend)
Columns: ISSCC'21 [2] | ISSCC'23 [3] | JSSC'22 [4] | VLSI'21 [5] | ISSCC'22 [6] | CICC'20 [7] | This Work
Process: 65nm | 40nm | 28nm | 40nm | 65 | 180nm | 65nm
SoC: NO | YES | NO | YES | YES | YES | YES
Supply voltage (V): 0.75 | 1 (AFE) / 0.9 (DBE) | 0.5 | 1.1 (AFE) / 0.7 (DBE) | 1.2 | 1 | 1.0 (AFE) / 0.8 (DBE)
#Channel: cells not separable in the extraction: "222816256416"
Area (mm2): cells not separable in the extraction: "1.746411.28167.5"
Application space & task: Biomedical: ECG/Seizure/EMG | Biomedical: EEG Seizure | Biomedical: EEG Seizure | Biomedical: EEG Seizure | Biomedical: EEG Seizure + Stimulation | Biomedical: Autism spectrum disorder | VR/MR, EEG affect tracking / imagery control / SSVEP
Classification methods: CNN/MLP | SciCNN | LR | GTCA SVM | NeuralTree | DNN | CNN, IIR, FC, DFT
Energy (uJ/class): 4.36/2.06/5.25* | 28.33* | 1.5nJ* | 0.97* | 0.227* | 10.13* | 0.89 (teacher-student)*
Verification (#subjects): Bonn (N/A) | CHB-MIT (24) | CHB-MIT (24) | CHB-MIT (24) | CHB-MIT (24) | iEEG (6) | DEAP (32), SEED (15), THU-SSVEP (35), Motor Imagery (4)
In-situ demonstration: NO | YES | N/A | N/A | YES | N/A | Affect tracking & imagery control
AFE gain (dB): cells not assignable to columns in the extraction: 36-54, 40.7-57.9, 50-64, 45-72
AFE power/ch: cells not assignable to columns in the extraction: N/A, 1.51uW, 1.63uW, 8uW, N/A, NO, NO

[Figure: demo setup. VR headset with SoC board (33 mm x 66 mm), SoC demo module, Bluetooth module board, Li-ion battery, IO to Bluetooth module (9 mm); EEG inputs T3, T5, O1, PZ, CZ, T4, T6, O2, GND; scenes: imagine rainy scene, imagine surfing scene.]
Public Dataset Results
[Plot: overall accuracy (%) on the SSVEP dataset [8] (35 subjects) and the motor imagery dataset [9] (4 subjects); values shown: 86%, 80%.]
[Figure: die photo, 3 mm x 2.5 mm. Labels: 16-channel AFE + ADC, weight SRAM, LDOs, input SRAM, output SRAM, digital core.]

Motivation
Applications: gaming, office, movie/shopping.
Existing setup (VR headset plus EEG cap): cumbersome wearing; lack of on-line low-latency computing support; lack of real-time mind imagery feedback control; high power for AI.
Requirements: seamless integration of VR/MR systems and brain-machine interfaces (BMI); flexible support for multiple mind imagery tasks with low power and latency.
[Figure: the 10-20 system for EEG. Labels: left ear, right ear, front; SSVEP-sensitive channels; channels for affect monitoring & motor imagery; channels for mental imagery.]
1) SoC for BMI for mind imagery control of VR/MR
2) Integration of VR headset and EEG channel placement
3) ISA and reconfigurable architecture for general-purpose BMI tasks
4) Teacher-student CNN and sparsity enhancement for power saving

Core SoC Power (clk freq = 2MHz)
Average power in affect tracking, fs = 128Hz, 1 class / 2 sec (depends on application). Labels and values in order of appearance: 0.85µW, digital core, 25.6µW, leakage, 8-ch AFE, 64µW; (relaxed state), (intense state).

Confusion matrices
Panel headings: Student Confusion Matrix; Teacher Confusion Matrix. Rows and columns are Aff.1 to Aff.4. The extraction does not show reliably which heading belongs to which matrix, so both are given in order of appearance.
First matrix:
Aff.1: 140 | 0 | 4 | 0
Aff.2: 2 | 142 | 0 | 0
Aff.3: 2 | 0 | 141 | 1
Aff.4: 0 | 21 | 0 | 123
Second matrix:
Aff.1: 142 | 2 | 0 | 0
Aff.2: 2 | 139 | 0 | 3
Aff.3: 4 | 0 | 134 | 6
Aff.4: 0 | 96 | 0 | 48
[Figure: control-flow timeline over time (s) for CH1 and CH2. The student CNN runs on most windows and the teacher CNN is activated on low confidence; labels: Aff.1 to Aff.4, low confidence, high confidence, 3X latency.]

Student CNN Model
EEG input 2x256x1; Conv8 1x9x1; 2x248x8; max pool 1x8; 2x31x8; ReLU + dense; 4 outputs.
Teacher CNN Model
EEG input 2x256x1; Conv16 1x9x1; 2x248x16; max pool 1x8, ReLU; 2x31x16; Conv16 1x4x16; 2x28x16; max pool 1x2; 2x14x16; ReLU + dense; 4 outputs.

Teacher-Student Energy Efficiency
[Plot: energy/class (uJ). Values shown: 1.97, 0.6, 0.89; bar labels: Teacher, Student, Teacher-Student.]

An Adhesive Interposer-Based Reconfigurable Multi-Sensor Patch Interface with On-Chip Application-Tunable Time-Domain Feature Extraction
Jeonghoon Cho*, You Jang Pyeon*, Junyeong Yeom*, Hyunjoong Kim*, Sanghyeon Cho, Yonggi Kim, Taejung Kim, Jong-Hyun Kwak, Geonjun Choi, Yoonsik Lee, Heungjoo Shin, Hoon Eui Jeong, Jae Joon Kim (*Equally Credited Authors (ECAs))
Ulsan National Institute of Science and Technology, Ulsan
Paper No.: 33.7
Poster sections: Motivation, IC Architecture, System Implementation, Verification

Reconfigurable Environmental & Healthcare Multi-Sensor Patch Interface
[Figure: adhesive interposer patch. Labels: PPG & chemo-R & electro-chemical sensor; MCU, BLE; patch electrodes for ExG; IC; battery.]
Multimodal, multi-channel, adaptive system.
Implemented System & ICs Microphotograph
Application-tunable on-chip AI alert system supporting multi-signal monitoring.
Accessible hardware-level reconfiguration.
Challenges for Multi-Sensor Interface (items as listed on the poster):
Analog frontends
Calibration of gain and offset
Analog min-max normalization

Sensor variation matching
Adaptive size control w/ 1-b quantization
Signal window size optimization
QTD-CNN & one-shot BNN
Memory leakage minimization

[Figure: IC microphotographs. Env. + bio front-end + analog classifier; PMIC; dimensions 4.22mm, 1.35mm, 4mm.]
[Figure: gas monitoring configuration. Gas chamber (NO2), 4x gas sensor.]
Gas/Arrhythmia Detection Result

QTD-CNN
[Figure: 4-channel shift register (Ch.1 to Ch.4) with 1-bit quantization and time window size control; samples Dn, Dn-1, Dn-2, Dn-3 clocked by ClkW; reference voltage DAC; time window control (CLKW = 10ms to 10000s).]
Min-Max Normalization
[Figure: 4ch MUX, analog frontends (x 4ch), R-2R DAC, parameter memory, PGA, VINP. Annotations garbled in the extraction: "VDAC=(Vmax Vmin)-Vmin" and "GainPGA=" followed by a term in (Vmax Vmin).]
One-Shot BNN
[Figure: 7-bit capacitor array (C) for the past samples (x 4 ch, D from Ch1 n-3 onward) and 7-bit capacitor array (2C) for the present samples (x 4 ch); VDD.]

Analog Frontends (Environmental, Biomedical)
Sensor types: Chemo-R, E-Chem, BioZ, PPG, ExG.
Features, in order of appearance: 8-bit resistive, current DAC; drift auto-cancel.; sensitivity calibration; current-mirror-based potentiostat; DC current optimiz.; programmable dynamic range; current generator; phase synchronization; baseline cancel.; DC light cancel.; ambient light cancel.; input impedance boosting; electrode DC offset cancel.

Performance Summary & Comparison Table
Motivation keywords: multimodality; signal variations; high reliability; compact system; on-chip AI; edge patch device.

Setup & Output of Environmental/Biomedical
[Figure: BLE & MCU module, IC, gas sensor with TSV, adhesive interposer patch, adhesive interposer patch electrode (GND, VINP, VINN), multi-sensor PCB, wall; signals: BioZ, ExG, PPG, NO2 gas.]
Arrhythmia: MIT-BIH DB patient 220, ventricular premature contraction (V). Classifier input: past 12 quantized + n-3:n RR intervals.
[Plot: ECG (mV) and RR interval (normalized) versus time (s), with prediction and label traces.]
Gas input: NO2. Event: gas concentration rising.
[Plot: sensor output (normalized) versus time, with prediction and label traces.]
Hit rates shown, in order of appearance: 90.63%, 92.41%.

A Miniature Neural Interface Implant with a 95% Charging Efficiency Optical Stimulator and an 81.9dB SNDR ΔΣM-Based Recording Frontend
L. Zhao1, W. Shi2, Y. Gong3, X. Liu3, W. Li3, Y. Jia1
1 University of Texas, Austin, TX; 2 Meta, Santa Clara, CA; 3 Michigan State University, East Lansing, MI
Paper No.: 33.9
Poster sections: Motivation, Architecture, System Implementation, Verification

Neural interface implants revolutionize neuroscience research: brain mapping, stroke research, brain-machine interface.
Device miniaturization: reduce invasiveness.
Challenges: small power Rx; limited power budget; single modality in state-of-the-art designs.
This work:
A miniature, dual-modal neural interface implant.
Wireless power and data transmission.
A linear-charging switched-capacitor stimulation: high ISTIM, low supply voltage, requires only a small off-chip capacitor.
A CT ΔΣM-based neural recording front-end: 81.9dB SNDR, 172.2dB FoMSNDR.

A 2.7ps-ToF-Resolution and 12.5mW Frequency-domain NIRS Readout IC with Dynamic Light Sensing Frontend and Cross-Coupling-Free Inter-Stabilized Data Converter
Zhouchen Ma1, Yuxiang Lin1, Cheng Chen1, Xiangao Qi1, Yongfu Li1, Kea-Tiong Tang2, Fa Wang3, Tianhong Zhang4, Guoxing Wang1, Jian Zhao1
1 Shanghai Jiao Tong University, Shanghai, China; 2 National Tsing Hua University, Hsinchu, Taiwan; 3 United Imaging Microelectronics Technology, Shanghai, China; 4 Shanghai Mental Health Center, Shanghai, China
Paper No.: 33.10
Poster sections: Motivation, Architecture, System Implementation, Verification

Tumor self-examination employing FD-NIRS: wearable, non-invasive.
Principle of Frequency-domain NIRS
[Figure: emitted light through tissue (metabolic A, metabolic B); measured quantities: light density, and light density + light flight time. Slow metabolism: fBW 0.01 to 10Hz. Fast light speed: fc 10 to 300MHz.]
Challenge of current FD-NIRS readout ICs
Low power: long-term ToF measurement.
High accuracy: reduce intensity/phase crosstalk.
Low-power & high-accuracy FD-NIRS IC
Low-power dynamic light sensing frontend.
High-accuracy inter-stabilized IPDC.
[Figure: laser diode driver; FIA-based TIA + PGA.]
Demonstration System
[Figure: DC supply, FPGA, power board, PCB, PC, chip, LD, PD.]
Performance Comparison
On-site Interaction
Please select the "tumor" location

Image through a 1D line scan (left, right)
Tell the audience the location of the "tumor"
[Images: absorption and reduced scattering images of the "tumor" on the left; scale 1cm x 1cm.]
In-vivo Test
[Figure: 1D line scan platform with fiber; high-accuracy inter-stabilized intensity/phase-to-digital converter; axis: frequency.]

A 28nm 72.12TFLOPS/W Hybrid-Domain Outer-Product Based Floating-Point SRAM Computing-in-Memory Macro with Logarithm Bit-Width Residual ADC
Y. Yuan, Y. Yang, X. Wang, X. Li, C. Ma, Q. Chen, M. Tang, X. Wei, Z. Hou, J. Zhu, H. Wu, Q. Ren, G. Xing, P. Mak, F. Zhang
Institute of Microelectronics of the Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China; Beijing Institute of Technology, Beijing, China; University of Macau, Macau, China
Paper No.: 34.6
Poster sections: Motivation, Architecture, System Implementation, Verification

Hybrid-Domain Outer-Product Based Floating-Point SRAM Computing-in-Memory Macro
The increasing bit-width demand for floating-point SRAM CIM poses some new challenges.

Challenge 1. Combine the advantages of digital and analog CIM
Analog CIM: transistor-bit multiply, less in-array consumption; heavy ADC-bit shift & multi-bit addition, larger out-array consumption.
Digital CIM: logic-gate-bit multiply, larger in-array consumption; adder-tree-bit shift & multi-bit addition, less out-array consumption.
[Figure: analog CIM (WL driver, BL driver, 6T cells, A/D converters, shift & add) and digital CIM (WL driver, BL driver, 6T cells, adder tree, shift & add); in-array bit multiplication and out-array multi-bit accumulation; bar chart of normalized energy efficiency for analog CIM, digital CIM and our work.]

Challenge 2. Trade-off of precision & throughput & ADC overhead
A high-precision weight requires a quite heavy high-precision ADC.
Flash: high throughput & high cost (decode logic). SAR: low throughput & low cost (SAR logic & decode logic, DAC cap array).
[Figure: N-bit ADC with bit-serial input (I0 to IM-1, weights W0 to WN-1) versus M+N-bit ADC with M-bit DAC (input IM-1:0).]

Challenge 3. Inner-product based adder tree burden
Inner-product based CIM: CIM array with big fan-in adder tree, great amount of multi-bit adders, element-wise summation.
Outer-product based CIM: accumulator (D-Q registers), much less adder consumption, vector-wise accumulation.
Equations on the poster: c_{i,j} = \sum_{k=1}^{n} a_{i,k} b_{k,j}; a second summation over j = 1..n of a product of b and a terms is not recoverable from the extraction.

Macro Overall Architecture
[Figure: activation buffer, CTRL, BL driver, WL driver, CIM blocks with CIM cells and SRAM sub-array (SRAM 16b col), multiplier, 8-bit hybrid-domain macro, exponent compute circuit, sign circuit, accumulator, INT/FP adder, 16-bit registers, multi-bit accumulator, analog bit mul & bit add, residual stages, sparsity control circuit, CLK.]
Residual ADC Architecture
[Figure: 3-bit residual ADC with 2 residual stages (Stage 1, Stage 2), residual compare stage (ref, input, bit output, gain G = 2, 1:1 mirrors), decode and output logic (Bit1 MSB to Bit3 LSB), 3+1-bit adder built from half adders; analog input, EN, Iref, bias.]
Hybrid-Domain Scheme
[Figure: weight 6T cells 0 to 15 on BL/BLB pairs generate unit currents Iunit for each product WiAj (W0A0 to W7A7); Iout relative to Iref; outputs O0 to O14 feed ADC 1 and ADC 2.]
O_0 (analog) = W_0 A_0
O_1 = W_0 A_1 + W_1 A_0
O_2 = W_0 A_2 + W_1 A_1 + W_2 A_0
O_7 = W_0 A_7 + ... + W_7 A_0
O_8 = W_1 A_7 + ... + W_7 A_1
O_14 = W_7 A_7
PSUM = \sum_{i=0}^{14} 2^i O_i = \sum_{i=1}^{7} 2^i (O_i + 2^7 O_{i+7}) + 2^0 O_0
MAC result timing: 8 cycles for 8b A x 8b W (CLK; O1 to O14 and PSUM produced across cycles).

Comparison Table
Columns: ISSCC'23 [1] P.-C. Wu | ISSCC'23 [2] A. Guo | ISSCC'22 [3] P.-C. Wu | ISSCC'21 [4] J.-W. Su | This Work
Technology: 22nm | 28nm | 28nm | 28nm | 28nm
MAC operation: Hybrid domain | Digital domain | Time domain | Analog domain | Hybrid domain
Minimum access time: 6.4ns at 0.8V | 6.8ns at 0.9V | 6.6ns at 0.9V | 6.8ns at 0.95V | 4.8ns at 0.95V
Input precision: BF16 | BF16/INT8 | INT8/INT4 | INT8/INT4 | BF16/INT8
Weight precision: BF16 | BF16/INT8 | INT8/INT4 | INT8/INT4 | BF16/INT8
Throughput: 1.24-1.28 TFLOPS | - | 1.05-1.24 TOPS | - | 1.98-4.28 TFLOPS, 2.89-5.31 TOPS
Area efficiency: cells not fully assignable to columns in the extraction: 0.088 TFLOPS/mm2; 2.05 TFLOPS/mm2, 2.57 TOPS/mm2; -; 2.21 TFLOPS/mm2, 2.74 TOPS/mm2
Energy efficiency: 16.22-17.59 TFLOPS/W | 14.04-31.6 TFLOPS/W, 19.5-44 TOPS/W | 21.10-27.75 TOPS/W | 15.02-22.75 TOPS/W | 16.55-32.78 TFLOPS/W, 22.78-50.53 TOPS/W
Notes: [1], [3], [4] and our work all take average energy efficiency; [2] takes 50% input sparsity. Throughput and energy efficiency are taken for 8b input & 8b weight. All take maximum area efficiency; [1] is estimated from reported throughput and area (90% shrink).

Shmoo Plot & FoM Comparison & Chip Summary
Chip Summary
Technology (nm): 28
Macro area (mm2): 1.94
SRAM capacity (Kb): 192
Supply voltage (V): 0.7-0.95
Number of input channels: 256
Input precision: BF16, INT8
Weight precision: BF16, INT8
Throughput: 1.98-4.28 TFLOPS (BF16), 2.89-5.31 TOPS (INT8)
Energy efficiency (average performance): 16.55-32.78 TFLOPS/W, 22.78-50.53 TOPS/W
Energy efficiency (peak performance): 36.41-72.12 TFLOPS/W, 50.12-111.17 TOPS/W
Accuracy loss (Cifar100): -0.05%
FoM = IN-precision x W-precision x OUT-ratio x Energy Efficiency
[Plot: normalized FoM for ISSCC 2021 [4], ISSCC 2022 [3], ISSCC 2023 [1] and this work; values shown: 5.76, 3.55, 1.86.]
[Plot: shmoo, VDD (V) from 0.60 to 0.95 versus access time (ns) from 4.0 to 8.5; FAIL/PASS.]

Evaluation Pictures
Demonstration System
[Figure: control PC, ST-Link, evaluation board with MCU and CIM chip.]
Possible applications: image recognition, voice recognition, abnormity diagnosis.
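
The outer-product formulation and the O_0 to O_14 partial outputs listed on the CIM poster above can be checked with a short sketch. It is not the macro's implementation: it assumes W_i and A_j are the individual bits of two unsigned 8-bit operands, so that each O_k is an anti-diagonal sum of the bitwise outer product, and it reads the poster's partial-sum expression as pairing O_i with O_{i+7}. The function names are made up for illustration.

```python
def to_bits(value, width=8):
    """Little-endian bit list of an unsigned integer."""
    return [(value >> i) & 1 for i in range(width)]


def diagonal_sums(w, a, width=8):
    """O_k = sum over i + j = k of W_i * A_j (anti-diagonals of the outer product)."""
    wb, ab = to_bits(w, width), to_bits(a, width)
    sums = [0] * (2 * width - 1)
    for i, w_i in enumerate(wb):
        for j, a_j in enumerate(ab):
            sums[i + j] += w_i * a_j
    return sums


def psum_direct(sums):
    """PSUM = sum_k 2^k * O_k."""
    return sum(o << k for k, o in enumerate(sums))


def psum_two_groups(sums, width=8):
    """PSUM = O_0 + sum_{i=1}^{width-1} 2^i * (O_i + 2^(width-1) * O_{i+width-1})."""
    low = sums[1:width]       # O_1 .. O_7
    high = sums[width:]       # O_8 .. O_14
    paired = sum((lo + (hi << (width - 1))) << i
                 for i, (lo, hi) in enumerate(zip(low, high), start=1))
    return sums[0] + paired


def matmul_outer(a, b):
    """Matrix product accumulated as a sum of rank-1 outer products."""
    rows, inner, cols = len(a), len(b), len(b[0])
    c = [[0] * cols for _ in range(rows)]
    for k in range(inner):            # one outer product per k
        for i in range(rows):
            for j in range(cols):
                c[i][j] += a[i][k] * b[k][j]
    return c


if __name__ == "__main__":
    sums = diagonal_sums(173, 94)
    print(psum_direct(sums) == 173 * 94, psum_two_groups(sums) == 173 * 94)
```

The point of `matmul_outer` is only the loop order: every step of the outer `k` loop adds one rank-1 update into a running accumulator, which is why an outer-product dataflow needs accumulators where an inner-product dataflow needs a wide adder tree.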
