October 17, 2019
Meet Vincent: a real-time digital human created in-house by a team of just five
Harnessing the power of Unreal Engine, pioneering teams have proven that creating photorealistic digital humans using real-time rendering, motion capture, and facial capture is achievable, as evidenced by Hellblade: Senua's Sacrifice (2016), Meet Mike (2017), Osiris Black (performed by Andy Serkis, 2018), the demo presentation for Siren (2018), and the awe-inspiring DigiDoug (2018).
The success of the likes of 3Lateral and Cubic Motion motivated Korea-based creative studio Giantstep to conceive of their own digital human: Project Vincent. After much deliberation and research into potential joint partnerships, the studio’s R&D arm GxLab took on the challenge of developing the technology in-house. The team proceeded with just five artists drawn from their existing talent pool, including VFX artist Daisuke Sakamoto, and quickly identified three key challenges they’d need to overcome.
Skin and hair: the visual technology challenge
Sungku Kang, Director of Research and Development at Giantstep, explains that the first technical issue the team faced lay in ensuring that they had access to the necessary shading technology—and that this is one of the reasons they chose Unreal Engine: “Unreal Engine's material editor and powerful skin shading features such as Transmission and Dual Lobe Specularity in the skin lighting model played a major role in boosting Vincent's skin quality up to offline rendering levels without any additional development,” he says. “An exorbitant amount of additional time, manpower, and cost would have been required to develop these features if they had not been supported.”
The team made good use of Unreal Engine’s online learning courses to get to grips with the technology’s development processes and properties. Its developers also took the initiative to access the source code when necessary to understand the fundamentals of the technology, identifying exactly what the features do, what formulas are being used, and what the properties are.
“By leveraging all the available information, we were able to accurately understand how changing different parameters would affect the outcome, and use that information with more precise intent, rather than entering random numbers and leaving the results to chance,” says Kang. “Also, using material instancing to make immediate parameter changes and preview the results was very useful. This dramatically reduced the time spent in the final fine-tuning stage.”
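As a rough illustration of how material instances can be driven programmatically for this kind of fine-tuning, the sketch below uses Unreal Engine's editor Python API to change scalar parameters on a skin material instance and refresh the preview. The asset path and parameter names (MI_Vincent_Skin, SecondLobeRoughness, TransmissionStrength) are placeholders for illustration, not Giantstep's actual setup.

```python
import unreal

# Placeholder asset path and parameter names; Vincent's real skin material
# and its exposed parameters are not public.
SKIN_MI_PATH = "/Game/Vincent/Materials/MI_Vincent_Skin"

skin_mi = unreal.EditorAssetLibrary.load_asset(SKIN_MI_PATH)

# Tweak scalar parameters exposed on the parent skin material, for example
# the roughness of the second specular lobe or the transmission strength.
unreal.MaterialEditingLibrary.set_material_instance_scalar_parameter_value(
    skin_mi, "SecondLobeRoughness", 0.35)
unreal.MaterialEditingLibrary.set_material_instance_scalar_parameter_value(
    skin_mi, "TransmissionStrength", 0.6)

# Recompile the instance so the viewport preview reflects the new values.
unreal.MaterialEditingLibrary.update_material_instance(skin_mi)
unreal.EditorAssetLibrary.save_asset(SKIN_MI_PATH)
```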
The Meet Mike free asset sample provided by Epic Games also gave the team hints about the technologies they’d require to create the visuals for Vincent. Exploring the asset, they gathered important information including how to express fine facial hairs and what kind of data is necessary for the final hair shape.
They learned that additional data was required on top of the basic data from DCC tools such as Maya or 3ds Max. This provided clues on how to effectively structure the data in advance. “By using all this information, our developer was able to reduce development time by clearly setting development objectives for Maya plugins or scripts,” says Kang.
Using the data from Meet Mike as a template gave the team confidence that the proprietary tools they developed for their own project would work. A case in point is the Maya hair export tool they created. Initially developed based on the basic data in Meet Mike, the team systematically expanded the tool to include features unique to Vincent. With a solid and proven template to work off, the team was able to avoid the time-consuming R&D process into all the possible outcomes that would have been required if they were starting from scratch. As Kang points out, “even after the R&D stage, there would have been constant doubts as to whether the selected method was correct leading up to completion, resulting in a far less stable and much slower development process.”
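As a hedged sketch of what such a hair export script might look like, the Maya Python snippet below collects world-space CV positions for every curve shape in a named set and writes them to JSON. The set name, output path, and file layout are assumptions for illustration; Giantstep's actual tool and data format are not public.

```python
# Run inside Maya (script editor or mayapy).
import json
import maya.cmds as cmds

def export_hair_curves(out_path, curve_set="vincent_hair_curves"):
    """Write world-space CV positions for every curve shape in a named set."""
    shapes = cmds.sets(curve_set, query=True) or []
    strands = []
    for shape in shapes:
        # Expand 'curveShape.cv[*]' into individual CVs, then sample positions.
        cvs = cmds.ls("{}.cv[*]".format(shape), flatten=True)
        strands.append([cmds.pointPosition(cv, world=True) for cv in cvs])
    with open(out_path, "w") as f:
        json.dump({"strands": strands}, f)
    return len(strands)

print(export_hair_curves("D:/vincent/export/hair_strands.json"))
```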
Similarly, on top of the Transmission and Dual Lobe Specularity, Meet Mike’s skin shaders offered useful hints on how to effectively divide the areas of the face and blend them. Using this information, the team was able to create a tool that sets and exports the facial area information in Maya.
“Because we were confident in the minimum features required for this tool and how the facial area's hierarchy should be structured, we were able to save a lot of time in designing and developing the Maya facial tool,” says Kang. “Leveraging Unreal Engine’s features and samples, the developer laid down the foundation for Vincent and then created the final look by developing extra features that met our additional needs.”
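To make the idea concrete, here is a minimal Maya Python sketch of exporting facial area information as per-region vertex indices gathered from selection sets. The region set names and the JSON layout are hypothetical; the actual facial-area hierarchy and data the team used are not described in detail.

```python
import json
import maya.cmds as cmds

# Hypothetical region sets; the real facial-area hierarchy used for Vincent
# is not public.
FACE_REGION_SETS = ["face_forehead", "face_eyeSocket_L", "face_eyeSocket_R",
                    "face_cheek_L", "face_cheek_R", "face_mouth", "face_chin"]

def export_face_regions(out_path):
    """Write, per region, the mesh vertex indices assigned to that region."""
    regions = {}
    for set_name in FACE_REGION_SETS:
        if not cmds.objExists(set_name):
            continue
        members = cmds.sets(set_name, query=True) or []
        # Convert any faces/edges in the set to vertices and flatten the list.
        verts = cmds.ls(cmds.polyListComponentConversion(members, toVertex=True),
                        flatten=True)
        regions[set_name] = sorted(int(v.split("[")[1].rstrip("]")) for v in verts)
    with open(out_path, "w") as f:
        json.dump(regions, f, indent=2)

export_face_regions("D:/vincent/export/face_regions.json")
```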
Creating lifelike facial expressions in real time
The second technical issue the team faced related to choosing and implementing technologies for effective facial expressions. After evaluating many solutions, they found most were developed for offline rendering by default, delivered only low, video-game-level quality, or relied on the facial capture feature offered on the iPhone, which limited the level of detail and the ability to customize. This steered their direction toward researching the technology that was the nearest fit, and then developing their own solution in-house based on that.
Their first task was to assess which candidate offered accurate three-dimensional location data while providing a high degree of freedom in the data format. They settled on Vicon Cara as the facial animation capture system that best met their needs. Vicon Cara is a head-mounted camera apparatus that uses four cameras with global shutters in the front to capture marker locations. Using this device, marker locations can be set with great flexibility, and they can be translated into three-dimensional data with extremely high accuracy.
“Most solutions at that time only scanned two-dimensional location data of facial landmarks; the fact that Cara was able to capture three-dimensional data made it a good choice,” says Kang. “However, Cara is designed for offline workflows, making real-time data transfer impossible out of the box. To resolve this, we decided to create a neural network that uses deep learning to infer 3D marker locations from 2D images.”
First, the team added one additional camera to Cara. When capturing the actor's facial movements, footage from this extra camera was saved separately and used as training data, with each frame paired against the corresponding calculated three-dimensional marker locations.
From this, the developers created artificial intelligence that can infer the 3D marker locations with high accuracy from a 2D image input. Machine learning was further leveraged in as many areas as possible, including emphasizing facial expressions and setting blend shape weights. “Giantstep gained lots of experience in machine learning through this process. We were encouraged by the fact that the effective use of machine learning can help a small team overcome its limitations,” says Kang.
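The network itself is not described in detail, but the core idea can be sketched as a straightforward image-to-coordinates regression. The toy model below, written with PyTorch and using an assumed marker count, image size, and architecture, pairs a head-camera frame with the 3D marker solve produced by Cara and minimizes the mean squared error between prediction and solve; it illustrates the approach, not Giantstep's actual model.

```python
import torch
import torch.nn as nn

NUM_MARKERS = 80  # hypothetical marker count, chosen for illustration

class MarkerRegressor(nn.Module):
    """Regress 3D marker positions from a single grayscale head-cam frame."""
    def __init__(self, num_markers=NUM_MARKERS):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(128, num_markers * 3)

    def forward(self, frame):                 # frame: (B, 1, H, W)
        x = self.features(frame).flatten(1)   # (B, 128)
        return self.head(x).view(-1, NUM_MARKERS, 3)  # (B, markers, xyz)

# One training step: each 2D frame is paired with the 3D marker solve
# computed offline by Cara, exactly as the correspondence described above.
model = MarkerRegressor()
loss_fn = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

frames = torch.rand(8, 1, 256, 256)       # stand-in batch of camera frames
targets = torch.rand(8, NUM_MARKERS, 3)   # stand-in 3D marker positions

optimizer.zero_grad()
loss = loss_fn(model(frames), targets)
loss.backward()
optimizer.step()
```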
Optimizing the digital human pipeline
The last major technical hurdle the team faced was in securing technology for optimizing the pipeline. Given the small size of the team, they knew right away that efficiency would be key. The most important initiative was to reduce the burden of manual iteration and to facilitate automation as much as possible. “This is another major reason why Unreal Engine was the best choice,” says Kang. “Unreal Engine's support for Python and the convenience that brings to creating plugins allowed us to easily resolve iteration issues for designers in many areas, as well as to easily develop relevant tools.”
A case in point is the work the team did on photorealistic facial expressions, which involved trying many combinations of the shape and number of areas into which the face is divided, then previewing the results as quickly as possible. Because changing the facial areas required textures to be newly combined, assets had to be re-imported, and material composition details had to be changed according to the configurations. “If the whole process were to be handled manually, an artist would have to spend a full day on simple iterations in order to preview the results,” says Kang.
Instead, using Python scripting in Unreal Engine, the team was able to automate laborious tasks such as importing assets. The material composition configuration plugin enabled texture combination, generation, and material configuration to run automatically, without modifying any of the facial material nodes, whenever the facial area information changed, further saving valuable artist time.
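As a hedged sketch of this kind of automation, the snippet below uses Unreal's editor Python API to batch-import combined face textures without any manual dialogs. The file paths and destination folder are illustrative assumptions, not the studio's actual pipeline.

```python
import unreal

def reimport_textures(source_files, destination="/Game/Vincent/FaceTextures"):
    """Batch-import (or reimport) combined face textures without manual clicks."""
    tasks = []
    for path in source_files:
        task = unreal.AssetImportTask()
        task.filename = path
        task.destination_path = destination
        task.automated = True         # suppress import dialogs
        task.replace_existing = True  # overwrite when the facial areas change
        task.save = True
        tasks.append(task)
    unreal.AssetToolsHelpers.get_asset_tools().import_asset_tasks(tasks)

reimport_textures(["D:/vincent/maps/face_regions_combined.png"])
```

Chaining a script like this to the facial-area export step keeps textures, assets, and material parameters in sync from a single command, which is the kind of iteration saving the team describes.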
“Tasks which previously required a full day whenever the data changed were automatically completed in just a few minutes,” continues Kang. “By maximizing automation through various Maya plugins and scripts, and developing Unreal Engine plugins, Project Vincent was completed in a short time frame, despite the small size of the team.”
Beyond movies: the future of real-time digital humans
When Project Vincent began, the family of existing photorealistic real-time digital humans was limited to a few key members, most of which had been co-developed in partnership with Epic. Vincent is part of the second generation of real-time digital humans, having been developed fully in-house by a small team. Giantstep’s success led to an invitation to the Epic Games booth at SIGGRAPH 2019, where they presented a session showcasing their work.
As others continue to build on the foundations laid down by the likes of Digital Domain, Giantstep, 3Lateral, and Cubic Motion, real-time digital human development technology is expected to change the production of movies and other media and entertainment that traditionally relied on offline rendering. It is even predicted to move into other industries. Synced with AI speakers or assistants, the technology could enable users to experience more intuitive AI services. We could start to see high-quality, hyperreal character creation used for cutting-edge marketing and promotional activities. Giantstep intends to be a key part of this story, and is committed to advancing its technology, with plans to reveal its next development results as soon as possible.
Interested in creating your own digital human? Download Unreal Engine and take a look at the free Meet Mike asset sample to get started.