Zoom Link
https://hku.zoom.us/j/96029601263
Abstract
This seminar presents novel solutions to bridge the gap between creative intent and automated visual synthesis. We first address the challenge of compositional alignment in text-to-image generation. This includes T2I-CompBench(++) for systematic evaluation of multi-attribute prompts, and GenMAC, a multi-agent collaborative framework that enables precise semantic orchestration. Second, we target the lack of cinematic artistry in video generation via Filmaster, a system that injects professional camera language and rhythm into the synthesis process using real-film references. Finally, we tackle spatio-temporal inconsistency in dynamic sequences using CineScene, which leverages implicit 3D-aware representations to preserve scene consistency across camera trajectories. Together, these contributions pave the way for next-generation controllable tools in digital cinematography and professional content creation.
Speaker
Ms Kaiyi HUANG
Department of Electrical and Computer Engineering
The University of Hong Kong
Biography of the Speaker
Kaiyi Huang received her BSc and MSc degrees in Electrical Engineering from Shanghai Jiao Tong University, and a double MSc degree from Ecole Centrale Paris (joint program). She is currently pursuing a PhD in the Department of Electrical and Computer Engineering at the University of Hong Kong. Her research focuses on image/video generation, film generation, and world models.
Organiser
Prof Xihui LIU
Department of Electrical and Computer Engineering, The University of Hong Kong
All are welcome.

