How Cubic Motion is achieving new levels of photorealism with digital humans like Siren
With each passing year, the uncanny valley grows shallower. Continual technological advancement and innovation have produced the first digital humans that can convincingly stand in for the real thing without obvious tells or deficiencies. And this is only the beginning: with further refinement, these hyper-realistic, real-time avatars will only become more human.
Cubic Motion is helping to lead that charge. The Manchester-based company has spent years producing standout facial animation for video games and other media, built on its computer vision technology—a pairing of software and algorithms designed by a talented team led by experienced Ph.D. scientists.
In collaboration with fellow industry leaders Epic Games, Tencent, 3Lateral, and Vicon, Cubic Motion has helped develop Siren, the highest-fidelity digital human created to date. Driven in real time by a live actress, Siren looks, sounds, and acts just like the real thing, and the potential applications are seemingly endless—from live characters in video games to broadcasts and more. Here is a look at the technology behind this fascinatingly lifelike creation.
Meeting the challenge
Siren began as a challenge from Epic Games and Tencent to showcase the immense capabilities of Epic’s Unreal Engine 4, the latest edition of the popular game engine that has powered hundreds of major games, from Gears of War to Fortnite. As the technology has become much more flexible and sophisticated, Epic Games has seen opportunities to expand its usage beyond video games and into other applications and media.
Epic sought to create a proof-of-concept digital human, to demonstrate just how far Unreal Engine 4 can be pushed today to render lifelike characters in real time. To do so, the company needed to work with some of the top specialists in capture technology and animation.
Motion capture camera company Vicon provided its Shōgun cameras and technology to stream the live performance from actress Alexa Lee into the game engine, while 3Lateral’s detailed scans and real-time facial solver Rig Logic helped bring Siren’s face to life.
Cubic Motion’s role, then, was to provide the missing link between those two elements via its computer vision technology, which tracks more than 200 facial features in real time at more than 60 frames per second, transferring even the actress’s subtlest facial movements to the digital character without any perceptible delay.
It’s what transforms impressive-looking scans and a complex rig into a truly believable digital human. Cubic Motion’s ability to transfer the performance of human actors to digital assets with an unrivaled combination of speed, accuracy, and efficiency delivers a new level of realism in high-quality animated faces.
Computer vision at work
Translating a live performance into a real-time rendered character requires capturing the performer, tracking his or her facial features in detail to measure what the face is doing, solving that data onto the CG character’s controls, and finally streaming the results to be rendered in the game engine.
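The capture-track-solve-stream loop described above can be sketched in a few lines of code. Everything here is illustrative: the function names, the placeholder landmark values, and the frame-budget check are assumptions for the sake of the example, not Cubic Motion's actual API.

```python
# A hypothetical sketch of the per-frame capture -> track -> solve -> stream
# loop. All names and values are illustrative placeholders.
import time
from dataclasses import dataclass

FRAME_BUDGET_S = 1.0 / 60  # ~16.7 ms per frame to sustain 60 fps


@dataclass
class Frame:
    timestamp: float
    pixels: bytes  # raw head-mounted-camera image data


def track(frame: Frame) -> dict:
    """Locate facial-feature measurements in the image (placeholder)."""
    return {"jaw_open": 0.2, "lip_corner_l": 0.7}


def solve(landmarks: dict) -> dict:
    """Map tracked measurements onto CG rig controls (placeholder clamp)."""
    return {name: min(1.0, max(0.0, v)) for name, v in landmarks.items()}


def stream(controls: dict) -> int:
    """Send rig-control values on to the game engine (placeholder)."""
    return len(controls)


def process_frame(frame: Frame):
    """Run one full track/solve/stream cycle and report if it met budget."""
    start = time.perf_counter()
    controls = solve(track(frame))
    stream(controls)
    elapsed = time.perf_counter() - start
    return controls, elapsed <= FRAME_BUDGET_S
```

The key constraint the sketch makes explicit is the frame budget: at 60 frames per second, the whole chain has roughly 16.7 milliseconds per frame before delay becomes perceptible.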
That’s the simplified, top-level version, of course—completing this process is hardly simple or straightforward, and requires extensive planning, work, and iteration to ensure a successful result.
It’s essential to have a tracking model that runs at 60 frames per second and remains robust in a live environment. We also must calibrate the camera system so that we can align the cameras and obtain 3D information. 3Lateral provided the brilliant CG asset for Siren, but we then needed to create the solver to drive the character in real time.
Cubic Motion’s machine-learning algorithms are based on years of experience and extensive work in the field of computer vision, allowing us to track facial features precisely even without physical markers. This requires us to train the tracker to capture the right information: separating parts of the face, digitally marking facial elements, and accounting for the dynamic flexibility of a human face.
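To make the "solver" idea concrete, here is a toy version of the step that maps tracked measurements onto rig controls. A simple linear least-squares model is assumed purely for illustration; the landmark basis, control names, and numbers are all hypothetical, and a production solver is far richer than this.

```python
# Toy solver sketch: fit rig-control weights to tracked landmark positions.
# The linear model and all values here are hypothetical illustrations.
import numpy as np

neutral = np.zeros(4)            # landmark coordinates of the face at rest
basis = np.array([               # per-control landmark deltas (hypothetical)
    [1.0, 0.0, 0.5, 0.0],        # control 0: e.g. "jaw_open"
    [0.0, 1.0, 0.0, 0.5],        # control 1: e.g. "smile_l"
]).T                             # shape: (4 landmarks, 2 controls)


def solve_controls(measured: np.ndarray) -> np.ndarray:
    """Least-squares fit of control weights, clamped to the rig's [0, 1] range."""
    w, *_ = np.linalg.lstsq(basis, measured - neutral, rcond=None)
    return np.clip(w, 0.0, 1.0)


observed = neutral + 0.8 * basis[:, 0]   # a pure "jaw_open" pose
print(solve_controls(observed))          # recovers roughly [0.8, 0.0]
```

The clamp at the end reflects a practical constraint: rig controls typically have a bounded valid range, so the solver must keep its output inside what the character rig can actually express.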
We also track from a side view to supplement the forward camera, which presents its own challenges. It is a critical component, however, as the side view provides very good information about the rolling in and out of the lips, dimples, and jaw movement.
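The reason two calibrated views yield 3D information is classic stereo geometry: once each camera's projection is known, a feature seen in both views can be triangulated. The sketch below shows the standard linear (DLT) triangulation method with made-up projection matrices; it is a textbook illustration of the principle, not Cubic Motion's implementation.

```python
# Linear (DLT) triangulation of one point from two calibrated views.
# Projection matrices and the test point are made-up illustrations.
import numpy as np


def triangulate(P1, P2, x1, x2):
    """Recover a 3D point from its projections in two calibrated cameras."""
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]                    # null vector = homogeneous 3D point
    return X[:3] / X[3]


# Two toy cameras: one at the origin, one shifted along the x-axis.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

X_true = np.array([0.5, 0.2, 4.0])
h = np.append(X_true, 1.0)
x1 = (P1 @ h)[:2] / (P1 @ h)[2]   # projection into camera 1
x2 = (P2 @ h)[:2] / (P2 @ h)[2]   # projection into camera 2
print(triangulate(P1, P2, x1, x2))  # recovers roughly [0.5, 0.2, 4.0]
```

This is why calibration matters so much in the pipeline: triangulation is only as accurate as the known camera alignment, and depth cues like lip roll and jaw movement depend on it.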
Just like the real thing
There are plenty of additional details that make a significant impact on the overall look and feel of Siren, from scanning teeth casts from the actress to adding peach fuzz to her face, along with ever-more-realistic skin shaders from the Unreal Engine team. They’ve also enhanced the backscatter algorithms and introduced dual specular lobes, as well as creating lifelike eyes—one of the most crucial details in selling the performance of a digital human.
Siren’s impact was immediately felt following its debut at GDC and FMX 2018, as the impressive performance showed what is possible with real-time rendered characters and the potential for live performances streamed into game engines.
By collaborating with leading industry partners and drawing on its own considerable computer vision experience, Cubic Motion is helping to define the future of digital humans—and it will continue to enhance and innovate until the uncanny valley ceases to exist.