Black Forest Labs’ new Self-Flow technique makes training multimodal AI models 2.8x more efficient

via bfl.ai

Short excerpt below. Read at the original source.

To create coherent images or videos, generative AI diffusion models like Stable Diffusion or FLUX have typically relied on external “teachers”—frozen encoders like CLIP or DINOv2—to provide the semantic understanding they couldn’t learn on their own. But this reliance has come at a cost: a “bottleneck” where scaling up the model no longer yields better […]

Read at Source