BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Department of Electrical and Computer Engineering (HKUECE) 電機與計算機工程系 - ECPv6.15.20//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-WR-CALNAME:Department of Electrical and Computer Engineering (HKUECE) 電機與計算機工程系
X-ORIGINAL-URL:https://ece.hku.hk
X-WR-CALDESC:Events for Department of Electrical and Computer Engineering (HKUECE) 電機與計算機工程系
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:Asia/Hong_Kong
BEGIN:STANDARD
TZOFFSETFROM:+0800
TZOFFSETTO:+0800
TZNAME:HKT
DTSTART:20240101T000000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=Asia/Hong_Kong:20250522T143000
DTEND;TZID=Asia/Hong_Kong:20250522T160000
DTSTAMP:20260509T183749Z
CREATED:20250603T025446Z
LAST-MODIFIED:20250603T025451Z
UID:111502-1747924200-1747929600@ece.hku.hk
SUMMARY:Reimagining Edge AI and LLM Inference with Compute Memory Architectures
DESCRIPTION:Abstract\nRecent advances in artificial intelligence (AI)\, especially in large language models (LLMs)\, have dramatically increased model sizes and computational demands\, significantly straining computing system capabilities. This issue is particularly acute in resource-constrained edge AI scenarios\, where efficient hardware acceleration of compute-intensive tasks and optimization of data reuse to minimize costly data transfers are essential. Addressing these challenges\, this talk will explore various options for designing compute memory subsystems through innovative circuit-level and system-level approaches to enhance the efficiency of edge AI applications. Firstly\, we will introduce MAXWELL\, a near-SRAM co-design computing architecture specifically tailored for edge AI. MAXWELL optimizes performance and energy efficiency by leveraging the regular structure of memory arrays\, achieving high parallelization for both convolutional and fully connected layers\, while supporting fine-grained quantization for real-time image and video processing\, autonomous vehicles\, and Internet of Things (IoT) devices. Secondly\, we will delve into SLIM\, a complementary approach to MAXWELL that provides a cost-effective solution for sparse LLM inference. SLIM leverages in-memory processing in DRAM to significantly reduce latency for large caching datasets in edge AI and utilizes near-storage processing in Solid State Drives (SSDs) for large LLM models. Potential applications of SLIM include big data analytics\, scientific computing\, and financial modeling. This talk will demonstrate that these complementary approaches for compute memory subsystems can achieve up to 10x speed-ups compared to state-of-the-art edge AI accelerators that require data transfers at the boundaries of ML layers.
 Furthermore\, the proposed co-design approach can accelerate performance by up to 250x compared to pure software optimizations on the X-HEEP edge AI open-source platform\, which integrates MAXWELL and SLIM logic with a 32-bit RISC-V core. Notably\, these accelerator-specific components of computing memories account for less than 12% of the total memory area of X-HEEP.\nSpeaker\nProf. David ATIENZA\nFull Professor of Electrical and Computer Engineering\,\nHead of the Embedded Systems Laboratory (ESL)\,\nAssociate Vice President for Research Centers and Technology Platforms\,\nÉcole polytechnique fédérale de Lausanne (EPFL)\, Switzerland\nSpeaker's Biography\nDavid Atienza is a Full Professor of Electrical and Computer Engineering at EPFL\, Switzerland\, where he heads the Embedded Systems Laboratory (ESL) and serves as the Associate Vice President for Research Centers and Technology Platforms. His research focuses on system-level design methodologies for energy-efficient multi-processor system-on-chip (MPSoC) architectures\, targeting next-generation computing servers\, data centers\, and edge AI embedded systems\, particularly smart wearables and medical devices in the Internet of Things (IoT) era. Prof. Atienza has co-authored over 450 publications\, holds 14 patents\, and has received numerous best paper awards at top conferences. He is the Editor-in-Chief of IEEE TCAD and ACM CSUR\, and has served as the Technical Program Chair of DATE 2015 and General Chair of DATE 2017. Among other awards\, he has received the 2024 Test-of-Time Best Paper Award at the IEEE/ACM CODES+ISSS Conference\, the IEEE/ACM ICCAD 10-Year Retrospective Most Influential Paper Award in 2020\, the DAC Under-40 Innovators Award in 2018\, and an ERC Consolidator Grant in 2016. He has also served as President of IEEE CEDA (2018-2019) and Chair of the EDAA (2022-2024). He is a Fellow of IEEE and ACM.\nOrganiser\nProf.
 Kaibin HUANG\nDepartment of Electrical and Electronic Engineering\,\nThe University of Hong Kong\nAll are welcome!
URL:https://ece.hku.hk/events/20250522-1/
LOCATION:Room CB-603\, 6/F\, Chow Yei Ching Building\, The University of Hong Kong
CATEGORIES:Highlights,Seminar
ATTACH;FMTTYPE=image/jpeg:https://ece.hku.hk/wp-content/uploads/2025/06/1280-4.jpg
END:VEVENT
END:VCALENDAR