Microsoft DragNUWA raises the bar in AI video with trajectory-based generation

Key Points:

  • Microsoft AI has introduced DragNUWA, a model that combines text, images, and trajectory for enhanced control over video production.
  • DragNUWA’s latest version builds on Stability AI’s Stable Video Diffusion model, a pairing that could significantly advance video generation and editing technology.
  • The AI community is eagerly anticipating the real-world performance of DragNUWA and its potential to advance controllable video generation in creative applications.


Several AI companies, including Stability AI and Pika Labs, have been making waves in the video generation space. Microsoft AI has entered the arena with DragNUWA – a model that introduces trajectory-based generation to enhance control over video production. By open-sourcing the model, Microsoft is inviting the community to explore and experiment with it.


Combining text, images, and trajectories in video generation has long been a challenge; models that rely on only one or two of these signals offer limited control over the output. Addressing this limitation, Microsoft’s DragNUWA provides more granular control by combining all three. Users define text, image, and trajectory inputs to precisely control camera movements, object motion, and more in the generated video.
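To make the three control signals concrete, here is a minimal sketch of how a trajectory-conditioned request might be assembled: a user-drawn drag path is expanded into one position per frame and bundled with a text prompt and a conditioning image. All names and the file path are illustrative assumptions, not DragNUWA’s actual API.

```python
# Hypothetical sketch of trajectory-conditioned input; names are
# illustrative only, not DragNUWA's actual interface.

def interpolate_trajectory(anchors, num_frames):
    """Expand sparse (x, y) anchor points into one position per frame
    via piecewise-linear interpolation. Coordinates are normalized 0-1."""
    if num_frames < 2 or len(anchors) < 2:
        raise ValueError("need at least 2 frames and 2 anchor points")
    segments = len(anchors) - 1
    points = []
    for i in range(num_frames):
        t = i / (num_frames - 1) * segments   # position along the path
        seg = min(int(t), segments - 1)       # which segment t falls in
        frac = t - seg                        # fraction within that segment
        (x0, y0), (x1, y1) = anchors[seg], anchors[seg + 1]
        points.append((x0 + frac * (x1 - x0), y0 + frac * (y1 - y0)))
    return points

# A generation request would then bundle all three control signals:
request = {
    "text": "a sailboat drifting across a calm bay",   # text prompt
    "image": "bay.png",                                # conditioning image (hypothetical path)
    "trajectory": interpolate_trajectory(
        [(0.1, 0.5), (0.5, 0.4), (0.9, 0.5)],          # user-drawn drag anchors
        num_frames=16,
    ),
}
```

The per-frame trajectory tells the model where the dragged object should sit in each frame, which is what distinguishes this style of conditioning from text- or image-only generation.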


In the latest development, Microsoft has released the 1.5 version of DragNUWA, showcasing its collaboration with Stability AI’s Stable Video Diffusion model. This technology promises to simplify video generation and editing, potentially revolutionizing the creative AI landscape.


©2024 The Horizon