ByteDance Suspends Seedance 2 Feature That Turns Facial Photos Into Personal Voices Over Potential Risks
hackingbear writes: China's ByteDance has released Seedance 2.0, an AI video generator that handles up to four types of input at once: images, videos, audio, and text. Users can combine up to nine images, three videos, and three audio files, up to a total of twelve files. Generated videos run between 4 and 15 [or 60] seconds long and automatically come with sound effects or music.
Its performance is unfortunately so good that it has forced the firm to block its facial-to-voice feature after the model reportedly demonstrated the ability to generate highly accurate personal voice characteristics using only facial images, even without user authorization.
In a recent test, Pan Tianhong, founder of tech media outlet MediaStorm, discovered that uploading a personal facial photo caused the model to produce audio nearly identical to his real voice -- without using any voice samples or authorized data. [...]
1 comment
Re:Are there any examples? (Score: 5, Interesting)
by mattr ( 78516 ) on Wednesday February 11, 2026 @02:01AM (#65981690)
Not an expert in this area. But apparently it is a thing. Funnily enough the feature they are worried about is actually a security attack... ha ha. Welp, this cat is out of the bag unfortunately, so now just the criminals will have it.
1. Foice - Generate voice based on an image as an attack on voiceprint systems
https://www.usenix.org/system/... [usenix.org]
2. Speech2Face - the reverse process
https://speech2face.github.io/ [github.io]
3. Predict physical attributes from voice with ML
https://www.researchgate.net/p... [researchgate.net]