RPG Seminar – From Understanding to Intervention: Interpretability-Guided Methods for Improving Large Language Models
Abstract: Large language models have achieved impressive performance, but improving them efficiently and reliably requires more than scaling alone. In this talk, I present a series of works that […]
