So basically the old open-source Live Portrait hooked up to audio output. It was very glitchy and low-res on my side. BTW: Wondering if it's legal to use characters you don't have rights to. (how do you justify possible IP infringement)
One way this differs is in the model architecture. Our approach relies on a single pass of a diffusion transformer (DiT), whereas Live Portrait relies on intermediate representations and multiple distinct modules. Getting a DiT to be real-time was a big part of our work. Quoting the Live Portrait paper: "Diffusion-based portrait animation methods [...] are usually [too] computationally expensive." As you hinted at, we had to compromise on resolution to get there (this demo is 256x256), but we think that will improve over time.
Not relying on facial keypoints means we can animate a wide range of non-humanoid characters. My favorite is talking to the Doge meme.
> Wondering if it's legal to use characters you don't have rights to. (how do you justify possible IP infringement)
IP law tends to be "richer party wins". There's going to be a bunch of huge fights over this, as both individual artists and content megacorps are furious about this copyright infringement, but OpenAI and friends will get the "we're a hundred-billion-dollar company, we can buy our own legislation" treatment.
e.g. https://www.theguardian.com/technology/2024/dec/17/uk-propos... (a straightforward nationalisation of all UK IP so that it can be instantly given away for free to US megacorps).