- Microsoft has demonstrated its latest AI research with a model called VALL-E.
- VALL-E can simulate a person’s voice from just a three-second audio sample.
- The generated speech can match not only the speaker’s timbre, but also the emotional tone and acoustic environment of the sample.
- VALL-E could be used for customized or high-end text-to-speech applications, though it carries risks of misuse.
Microsoft’s VALL-E AI mimics voices from short audio samples
