Transhumanism & Artificial Intelligence
Microsoft AI Releases Misleading Deepfake Tech
Microsoft Research Asia is forging on with a new transhumanist program called VASA that creates “lifelike talking faces of virtual characters with appealing visual affective skills (VAS), given a single static image and a speech audio clip.”
The artificial intelligence (AI) division of Microsoft in Asia has been working on the program by compiling real single images of people, real audio, and in many cases various control signals such as the movements of people’s faces as they talk. Using all this data, Microsoft Research Asia is generating moving images of fake people that could someday replace actual newscasters and podcasters – at least those with so little personality and soul that robots could basically do their job.
“Our premiere model, VASA-1, is capable of not only producing lip movements that are exquisitely synchronized with the audio but also capturing a large spectrum of facial nuances and natural head motions that contribute to the perception of authenticity and liveliness,” the research team wrote in a paper about these latest developments.
“The core innovations include a holistic facial dynamics and head movement generation model that works in a face latent space, and the development of such an expressive and disentangled face latent space using videos. Through extensive experiments including evaluation on a set of new metrics, we show that our method significantly outperforms previous methods along various dimensions comprehensively.”
High-quality deepfakes
The methods used by Microsoft Research Asia to develop these sort-of human-like deepfakes produce high-quality video coupled with realistic facial and head dynamics. Such video can be generated online at 512×512 with up to 40 frames per second (FPS) and negligible starting latency.
In layman’s terms, the technology is so believable that many people would probably fall for it and think these are real people on their screens. Only the most discerning will be able to tell that something is not quite right with what they are seeing.
“It paves the way for real-time engagements with lifelike avatars that emulate human conversational behaviors,” Microsoft Research Asia proudly claims.
If you are interested in seeing a few examples of these creepy AI moving and speaking images, you can do so at Microsoft.com.
“Our method is capable of not only producing precise lip-audio synchronization, but also generating a large spectrum of expressive facial nuances and natural head motions,” the company says.
“It can handle arbitrary-length [sic] audio and stably output seamless talking face videos.”
The purpose of the research is to unleash an entire society or army of virtual AI avatars, Microsoft says, but don’t worry: it’s all “aiming for positive applications,” the company insists.
“It is not intended to create content that is used to mislead or deceive,” reads a disclaimer on the site. “However, like other related content generation techniques, it could still potentially be misused for impersonating humans.”
“We are opposed to any behavior to create misleading or harmful contents of real persons, and are interested in applying our technique for advancing forgery detection. Currently, the videos generated by this method still contain identifiable artifacts, and the numerical analysis shows that there’s still a gap to achieve the authenticity of real videos.”
The alleged positive use cases for such technology read like a parody, with Microsoft claiming that it can create “educational equity” while “improving accessibility for individuals with communication challenges, offering companionship or therapeutic support to those in need …”
The powers that be are trying to make humans obsolete by unleashing human-impersonating AI and other tech-based abominations. Learn more at Transhumanism.news.
This article was first published in Transhumanism News on April 26, 2024, under the title “Microsoft AI releases scary new deepfake technology that could make many newscasters, podcasters obsolete.”
Read more articles by Ethan Huff here
A slew of speakers, all of them fakes, spells the end of trust in what we watch in movies and on television.
Posted May 1, 2024