VibeVoice-Large

Create podcasts up to 90 minutes long with up to 4 distinct speakers.

VibeVoice is a novel framework designed for generating expressive, long-form, multi-speaker conversational audio, such as podcasts, from text. It addresses significant challenges in traditional Text-to-Speech (TTS) systems, particularly in scalability, speaker consistency, and natural turn-taking.
