Multi-Modal Large Language Models

Collections:

Our Work

*Equal Contribution and Corresponding Author
Currently none.

Paper Reading

  • 2025.10.16: VILA-U: A Unified Foundation Model Integrating Visual Understanding and Generation. paper | github #Tokenizer #Unified-Model