Adept introduces Fuyu-Heavy, a multimodal model designed for digital agents that the company describes as the world’s third-most-capable. It excels at multimodal reasoning while maintaining strong performance on traditional benchmarks. Adept, whose stated aim is to build Useful General Intelligence, scaled up the Fuyu architecture to produce it, overcoming challenges associated with image modeling along the way. According to Adept, Fuyu-Heavy outperforms other models on various benchmarks covering both language modeling and multimodal tasks, and it is set to power Adept’s enterprise product. The model also handles long-form conversations and complex calculations.