credkellar-boop/Mona
Mona is an open-source framework for long-form video generation, built on Spatio-Temporal Diffusion Transformers. Where standard models are limited to short clips, Mona is engineered to keep a narrative coherent for up to 30 minutes. It uses streaming VAEs and memory-augmented transformers to stay within VRAM limits while producing unified, cinematic stories.
GitHub repository with 5 stars and 1 fork.
Language: Python
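
The description mentions streaming VAEs as the way long videos fit in limited VRAM. The sketch below illustrates the general idea of streaming decoding only: latents are decoded one temporal window at a time, so peak memory depends on the window size rather than on the video length. It is not taken from Mona's code. The names `iter_chunks`, `stream_decode`, and `decode_fn`, and the overlap scheme, are hypothetical and chosen for illustration.

```python
from typing import Any, Callable, Iterator, List, Sequence, Tuple


def iter_chunks(num_frames: int, chunk_size: int, overlap: int) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) windows that cover num_frames, sharing `overlap` frames."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    start = 0
    while start < num_frames:
        end = min(start + chunk_size, num_frames)
        yield start, end
        if end == num_frames:
            break
        start += step


def stream_decode(
    latents: Sequence[Any],
    decode_fn: Callable[[Sequence[Any]], List[Any]],  # hypothetical per-window decoder
    chunk_size: int = 16,
    overlap: int = 4,
) -> Iterator[Any]:
    """Decode one window at a time and yield every frame exactly once."""
    emitted = 0  # number of frames already yielded
    for start, end in iter_chunks(len(latents), chunk_size, overlap):
        frames = decode_fn(latents[start:end])
        # Frames before `emitted` came from the previous window; here they
        # only serve as temporal context, so they are skipped.
        for frame in frames[emitted - start:]:
            yield frame
        emitted = end


if __name__ == "__main__":
    # Stand-in decoder: doubles each "latent" so the output is easy to check.
    fake_decode = lambda window: [x * 2 for x in window]
    for frame in stream_decode(list(range(10)), fake_decode, chunk_size=4, overlap=1):
        print(frame)
```

Because `stream_decode` is a generator, a caller can write frames to disk as they arrive instead of holding the whole video in memory. A real decoder would likely blend the overlapping frames rather than discard them.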