Hear your imagination: ElevenLabs to launch model for AI sound effects

Key Points:

  • ElevenLabs expanding portfolio with new text-to-sound model
  • Model will generate sound effects from text descriptions
  • Early access signup available for potential users


ElevenLabs, a two-year-old AI startup founded by former Google and Palantir employees, has unveiled plans to introduce a new text-to-sound model. The model will let creators generate sound effects by describing them in words, giving them another AI tool for content creation.


Although the model is not yet publicly available, ElevenLabs has teased its capabilities by adding AI-generated sounds to videos produced by OpenAI’s Sora. The company has opened a waitlist for users interested in early access to the model.


Since its establishment in 2022, ElevenLabs has focused on making audio and video content more accessible across languages and regions. The startup has previously launched text-to-speech and speech-to-speech models supporting multiple languages, catering to diverse content creation needs.


As more creators adopt AI tools, there is a growing trend towards entirely AI-generated content. Existing platforms like Runway and Pika generate realistic videos from text prompts, but those videos come without audio by default. ElevenLabs’ upcoming text-to-sound model aims to address this gap by allowing users to add background audio, ranging from nature sounds to human activities, to their content.


With its latest offering, ElevenLabs aims to complement AI-generated visuals with realistic sound effects, enhancing the overall quality of AI-created content. Those who sign up for early access can also submit sample prompts to help refine the model’s sound generation capabilities.


While the official launch date of the text-to-sound model remains undisclosed, ElevenLabs’ initiative aligns with the projected growth of the global AI voice generator market. Market research suggests the market will reach an estimated value of nearly $5 billion by 2032, highlighting the potential for continued innovation and competition in this sector.


As ElevenLabs prepares to introduce its text-to-sound technology, the landscape of AI speech applications is poised for further evolution, with established players and emerging startups vying for a share of the expanding market.






©2024 The Horizon