RPG Seminar – From Understanding to Intervention: Interpretability-Guided Methods for Improving Large Language Models

Abstract

Large language models have achieved impressive performance, but improving them efficiently and reliably requires more than scaling alone. In this talk, I present a series of works that explore how internal understanding of LLMs can be translated into practical interventions for better capability, efficiency, and controllability. I begin with actionable mechanistic interpretability, introducing a unified “Locate, Steer, and Improve” perspective that turns model analysis into a framework for intervention. I then show how this perspective supports several concrete advances: data-free mixed-precision quantization guided by numerical and structural sensitivity, multilingual capability enhancement through representation shifting and contrastive alignment, personalized multi-teacher distillation that routes each prompt to its most suitable teacher, and coarse-to-fine selective fine-tuning for mitigating catastrophic forgetting while preserving general versatility. Together, these works reflect a common theme: interpretability is not only a tool for explaining LLMs, but also a principled basis for designing more efficient training, compression, and adaptation methods.

Speaker

Mr Hengyuan ZHANG
Department of Electrical and Computer Engineering
The University of Hong Kong

Biography of the Speaker

Hengyuan Zhang is a Ph.D. candidate at the University of Hong Kong, supervised by Prof. Ngai Wong and Prof. Hayden Kwok-Hay So. His research focuses on improving the efficiency and interpretability of large language models. He aims to uncover and characterize the internal processes that govern model behavior, with the goal of improving model specialization, interpretability, and reliability in real-world deployments. He has published multiple papers in leading venues such as ACL, EMNLP, TKDD, and NeurIPS.

Organiser

Prof. Ngai WONG

Department of Electrical and Computer Engineering, The University of Hong Kong

All are welcome.
