Bagel

October 29, 2025

by Leonardo

1. Scalable Generative Cognitive Model (BAGEL)

BAGEL adopts a Mixture-of-Transformer-Experts (MoT) architecture comprising two transformer experts: one dedicated to multimodal understanding and the other to multimodal generation.
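To make the two-expert split concrete, here is a minimal toy sketch in Python. It is not BAGEL's implementation. It assumes each token carries a hard flag that says which expert processes it, and it replaces each transformer expert with a trivial per-token affine map. The names `Expert`, `mot_layer`, and `is_generation` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence

Vector = List[float]


@dataclass
class Expert:
    """Toy stand-in for one transformer expert (assumption: a per-token affine map)."""
    name: str
    scale: float
    shift: float

    def __call__(self, token: Vector) -> Vector:
        return [self.scale * x + self.shift for x in token]


def mot_layer(
    tokens: Sequence[Vector],
    is_generation: Sequence[bool],
    understanding: Expert,
    generation: Expert,
) -> List[Vector]:
    """Send each token to exactly one of the two experts.

    Assumption: routing is a fixed per-token flag, not a learned gate.
    """
    if len(tokens) != len(is_generation):
        raise ValueError("tokens and is_generation must have the same length")
    return [
        generation(tok) if gen else understanding(tok)
        for tok, gen in zip(tokens, is_generation)
    ]


# Hypothetical experts with different parameters, so the routing is visible.
understanding_expert = Expert("understanding", scale=2.0, shift=0.0)
generation_expert = Expert("generation", scale=1.0, shift=-1.0)

tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
is_generation = [False, False, True]  # the last token goes to the generation expert

routed = mot_layer(tokens, is_generation, understanding_expert, generation_expert)
print(routed)
```

The point of the sketch is only that the two experts hold separate parameters while processing tokens from one sequence. How the real model assigns tokens to experts and how the experts interact is described in the BAGEL paper listed under References.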

Figure 2: Causal mask in BAGEL during training.

2. LightBagel

Figure 3: Illustration of Shallow Fusion and Deep Fusion.

References

Emerging Properties in Unified Multimodal Pretraining
LightBagel: A Light-weighted, Double Fusion Framework for Unified Multimodal Understanding and Generation
