RPG Seminar – Bridging Visual Generation and Understanding in Native MLLMs with a Unified Visual Tokenizer
Zoom Link: https://hku.zoom.us/j/98499142544?pwd=zvVs3BqWzIzCA071Dqq2rYW7vIAqj7.1

Abstract

The advent of GPT-4o highlights the immense potential of Multimodal Large Language Models (MLLMs) with native visual generation capabilities. These unified models offer precise control in […]
