Image Courtesy of Metaverse Entertainment

Meet MAVE: the virtual K-pop stars created with Unreal Engine and MetaHuman

Jinyoung Choi
Created by combining the technology of Netmarble F&C and the sensibility of Kakao Entertainment, Metaverse Entertainment is a multimedia content production company with the capabilities and infrastructure to produce movies and dramas, create a wide range of content from VFX to virtual humans, and expand those IPs into various other forms of content.
Virtual K-pop band MAVE: released their first music video at the end of January 2023. After the release, they made their debut on Show! Music Core, one of South Korea's leading music programs, creating a new wave in the genre. MAVE: has attracted tremendous attention for their realistic characters and convincing animation, and above all, their catchy music. At the time of writing this spotlight, MAVE:'s music video had recorded 21 million views and their debut live stage performance video 3 million. MAVE: is communicating with fans in various ways, including TV shows and social media.

We spoke with Director Sungkoo Kang, CTO of Metaverse Entertainment, to find out how Metaverse Entertainment utilized Unreal Engine and MetaHuman to create authentic digital humans, and how they were able to create content for multiple platforms within a short timeframe.
 

Q: I'm assuming the first step of MAVE: was to create the group members. What were your goals when you created the digital characters?

Our goal when creating the four-person virtual group MAVE: was to design appealing characters, each with a completely new appearance that didn't exist anywhere else in the world. An attractive character can't just be about appearance; it also needs a wide range of facial expressions for different situations. That's why we focused on building and developing a pipeline and technology to achieve this.

Q: I heard that you used MetaHuman to create your characters. Can you explain why?

As I mentioned, as well as having an attractive appearance, it's very important for a compelling character to have a range of detailed facial expressions for different situations. However, creating and modifying such facial expressions is a time-consuming and expensive task, because it always involves rigging and modeling and requires iterative revisions and verification. That's why Epic’s MetaHuman technology, which has been developed with the help of decades of experience in creating digital humans, was the perfect choice. It was a crucial part of building the pipeline for our characters.

With the MetaHuman facial rig, we were able to easily create the facial expressions we wanted and share animations between characters. We were also able to focus on R&D (e.g. improving rig control) by referring to the Rig Logic: Runtime Evaluation of MetaHuman Face Rigs white paper released by Epic Games. In addition, because MetaHuman characters share the same underlying mesh topology, UVs, joint structure, and controls, the high level of compatibility with external tools such as NVIDIA's Audio2Face, the Live Link Face app for iPhone, Faceware, and FACEGOOD allowed us to apply MetaHuman animation and drastically reduce the actual production time.

Q: Why did you choose Unreal Engine along with MetaHuman?

When we were planning MAVE:, we put a lot of thought into how the project should be positioned and the sort of activities we'd want the virtual band to take part in. The productivity of our content was the most important consideration: a lot of activities means a lot of content production, and that requires production efficiency; otherwise, we would have had to compromise on visual quality. So we chose Unreal Engine not only for efficiency, but also for its real-time rendering quality. We used Unreal Engine to extend the boundary of MAVE:'s activities into various areas, including producing a transmedia music video within a short time, social media activities, and upcoming TV shows and commercials.

Social media is an important channel for engaging with fans and building a bond with them, and that requires a large volume of high-quality content in many different forms. This is why we chose Unreal Engine over other tools. With Unreal Engine, we were able to create many kinds of content, including photorealistic images and videos, to engage with fans across multiple social platforms.

Q: What kind of pipeline was used to create each character of MAVE:?

The MAVE: creation team is made up of talented individuals from a variety of backgrounds, such as the gaming and film industries, which means the team members have all used different DCC tools depending on their specialty. For example, team members from the gaming industry have a good understanding of real-time rendering, while those from the M&E industry have expertise in video production, so we built a special pipeline to maximize the synergy between them.

The pipeline consists of character planning and character creation. Character creation is divided into detailed steps such as modeling, facial expression creation and rigging, hair creation, and body calibration.

Character planning is the stage where each character's appearance is designed. This process was conducted in close collaboration with experts from Kakao Entertainment, who have a great deal of experience in planning successful K-pop bands. In a traditional K-pop band, the members are selected from an existing trainee pool and their look is completed with make-up and styling. For a virtual band, however, we have to create each member as a completely new and attractive person, not only in appearance, but also in detailed facial expressions, movements, speech patterns, and so on.

To bridge this gap and provide a working environment as close as possible to the planning team's usual one, the production team built a pipeline that uses a GAN to automatically generate target face images and lets the team manually modify or combine eigenvectors. This enabled the planning team to select an existing candidate and adjust its parameters to fit the plan, instead of having to create a character's appearance from scratch. The planning team contributed by sharing the insights into the formula for a successful K-pop band that they've built up over the years.
 
Image Courtesy of Metaverse Entertainment
Image composite using GAN network
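
The studio's tool isn't described in detail, but the general idea of blending GAN latents and nudging them along eigen directions can be sketched in a few lines of Python. Everything below (the latent dimension, the stubbed-out generator, the blend weights) is an assumption for illustration, not Metaverse Entertainment's actual pipeline.

```python
# Minimal sketch of latent-space face blending with a StyleGAN-style generator.
# The generator is stubbed out; a real pipeline would load a pretrained network.
import numpy as np

LATENT_DIM = 512
rng = np.random.default_rng(7)

def generate_image(latent):
    """Placeholder for a pretrained GAN generator: latent vector -> face image."""
    # Hypothetical stand-in so the sketch runs end to end.
    return np.tanh(latent[:3])

# 1) Sample candidate latents and let the planning team pick the ones they like.
candidates = rng.standard_normal((20, LATENT_DIM))

# 2) Blend two chosen candidates to combine their features.
blend = 0.6 * candidates[3] + 0.4 * candidates[11]

# 3) Nudge the result along a semantic direction. Such directions are often found
#    as eigenvectors (principal components) of many sampled latents, which matches
#    the "modify or combine eigenvectors" idea described above.
eigen_directions = np.linalg.svd(candidates - candidates.mean(0), full_matrices=False)[2]
edited = blend + 1.5 * eigen_directions[0]   # edit strength chosen by the artist

preview = generate_image(edited)
print(preview)
```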

Since a facial model is directly affected by the character's styling, we worked with professional stylists experienced in K-pop outfits and hair to settle on the styling before proceeding with modeling in the face definition step. If we had scanned a real person, we would have been able to create a realistic appearance much faster, but that approach has problems, such as the difficulty of finding a person who looks exactly how we want and issues with portrait rights. So we created the faces of MAVE: with modeling tools instead.
 
Image Courtesy of Metaverse Entertainment
3D modeling of MAVE:

At the facial expression creation and modification steps, we used our own tool, which analyzes the model and automatically generates around 800 facial expressions using information about the location and size of each facial area, muscle flow, and so on. The process is similar to features that automatically generate facial expressions from a base mesh, like the Mesh to MetaHuman plugin. We developed our own tool because the Mesh to MetaHuman plugin hadn't been released yet at the time, and having our own tool helped us a lot, since we could modify the algorithm as needed and build an automated pipeline around it.
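
As a rough illustration of region-driven expression generation (not the studio's actual tool), the sketch below offsets a neutral mesh's vertices along per-region "muscle" directions weighted by region masks. The vertex count, regions, and deltas are all invented for the example.

```python
# Hypothetical illustration of region-weighted expression generation.
import numpy as np

rng = np.random.default_rng(0)
NUM_VERTS = 5000
neutral = rng.standard_normal((NUM_VERTS, 3))     # neutral face mesh vertices

# Per-region vertex weights in [0, 1], e.g. derived from the model-analysis step.
regions = {
    "brow_l": rng.random(NUM_VERTS),
    "brow_r": rng.random(NUM_VERTS),
    "mouth":  rng.random(NUM_VERTS),
}

# A per-region "muscle direction" delta applied to the vertices in that region.
muscle_delta = {
    "brow_l": np.array([0.0, 0.01, 0.0]),          # raise left brow
    "brow_r": np.array([0.0, 0.01, 0.0]),
    "mouth":  np.array([0.0, -0.005, 0.002]),
}

def make_expression(active, strength=1.0):
    """Build one expression shape by offsetting region vertices along their deltas."""
    verts = neutral.copy()
    for name in active:
        verts += strength * regions[name][:, None] * muscle_delta[name]
    return verts

# Generating many expressions is then just enumerating region combinations.
expressions = {
    "brow_raise": make_expression(["brow_l", "brow_r"]),
    "half_smile": make_expression(["mouth"], strength=0.5),
}
print({k: v.shape for k, v in expressions.items()})
```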

Also, we built a feature to create custom facial expressions that reflect each character's personality, in addition to the standard facial expressions. These new facial expressions required appropriate rigging, so we automatically generated Control Rigs in Unreal Engine and set them up for each character.
Image Courtesy of Metaverse Entertainment
Process of removing wrinkles created when the eyebrows are raised, eyes are closed, and pupils are lowered.
The base hair was created using Maya’s XGen toolset. Unreal Engine's groom-based hair rendering is real time yet incredibly high quality, which saved us a lot of time. However, sometimes we couldn't use grooms because we needed even higher performance; in those cases, we used a tool we created to convert groom-based hair into hair cards. We also optimized the workflow through automation to eliminate manual tasks when modifying and applying hair, such as creating a binding asset if one doesn't exist during the hair-swap process.
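
For the "create a binding asset if it doesn't exist" automation, a hedged Unreal Editor Python sketch might look like the following. The factory class and property names are assumptions that can vary by engine version; the asset-existence check, load, save, and asset-tools calls are standard editor scripting. This is not Metaverse Entertainment's actual tool.

```python
import unreal

def ensure_groom_binding(groom_path, skel_mesh_path, binding_path):
    """Return an existing groom binding asset, creating one if it is missing."""
    if unreal.EditorAssetLibrary.does_asset_exist(binding_path):
        return unreal.load_asset(binding_path)

    package_path, asset_name = binding_path.rsplit("/", 1)
    factory = unreal.GroomBindingFactory()          # assumed factory class name
    binding = unreal.AssetToolsHelpers.get_asset_tools().create_asset(
        asset_name, package_path, unreal.GroomBindingAsset, factory)

    # Point the new binding at the groom and the character's skeletal mesh
    # (property names are assumptions and may differ by engine version).
    binding.set_editor_property("groom", unreal.load_asset(groom_path))
    binding.set_editor_property("target_skeletal_mesh", unreal.load_asset(skel_mesh_path))
    unreal.EditorAssetLibrary.save_asset(binding_path)
    return binding

# Hypothetical usage during a hair swap:
# ensure_groom_binding("/Game/Hair/Long_Groom",
#                      "/Game/Characters/MemberA/Head_SKM",
#                      "/Game/Hair/Long_Groom_Binding")
```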

We also applied automation to the body calibration step, using dozens of calibration shapes to correct the body's shape based on the pose. We developed a new solving algorithm to avoid the problems that can occur with Maya's Radial Basis Function (RBF) solver, including the inability to apply hierarchy to the interpolation and the increased likelihood of unintended body shapes when detailed settings are applied.
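
Their custom solver isn't public, but the underlying technique (interpolating corrective-shape weights from example poses with radial basis functions) can be sketched with a small Gaussian RBF solve. The pose features and shape weights below are made up purely for illustration.

```python
# Minimal Gaussian RBF interpolation of pose-driven corrective-shape weights,
# i.e. the kind of solve Maya's RBF solver performs. The studio's in-house
# solver adds hierarchy and other refinements not reproduced here.
import numpy as np

def rbf_weights(sample_poses, sample_values, query_pose, sigma=0.5):
    """Interpolate per-shape weights for query_pose from example poses."""
    def kernel(a, b):
        d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
        return np.exp(-(d / sigma) ** 2)

    K = kernel(sample_poses, sample_poses)        # N x N kernel matrix
    coeffs = np.linalg.solve(K, sample_values)    # fit RBF coefficients
    k_query = kernel(query_pose[None, :], sample_poses)
    return k_query @ coeffs                       # 1 x num_shapes

# Example: an elbow angle (1D "pose feature") driving two corrective shapes.
poses  = np.array([[0.0], [0.5], [1.0]])          # neutral, half bent, fully bent
values = np.array([[0.0, 0.0],                    # weights of shapes A and B
                   [0.7, 0.1],
                   [1.0, 0.6]])
print(rbf_weights(poses, values, np.array([0.75])))
```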

In addition, we used Unreal Engine's Physics, Cloth Simulation, and AnimDynamics nodes, as well as a variety of other solutions, to create natural motion for the clothing and accessories. The team also used Unreal Engine's DMX support to create a spectacular stage.
 
Image Courtesy of Metaverse Entertainment
(Left) Before applying the calibration shape to interpolate the shape of the hand (Right) After applying the calibration shape

Q: It must have been difficult to organize and create a realistic stage that incorporates the emotion of K-pop with a virtual band. What was it like?

We worked with a director who has directed actual K-pop music videos, a director of photography, a grip team, jib operators, and an actual K-pop dance team to create a music video that captures the sensibility of K-pop. We also tried to recreate a stage environment identical to the ones used in traditional shoots so that the K-pop production team could work to their full potential. As part of this effort, we built a large (20 m x 20 m x 8 m) VFX center so we could perform motion capture in a space as large as an actual music video set, and set it up to capture not only the actors' performances, but also the movement of the actual filming equipment, so that the dazzling camerawork of the music video could be recreated later on.

The actors performed the choreography in mocap suits, and we filmed them as though it were a real-world live music performance. The camera data, tracked using the Mo-Sys StarTracker, was used directly in the final virtual performance in Unreal Engine, giving it a very convincing feel. The camera angles and the actors' motion were previewed in Unreal Engine so the results could be checked immediately on the spot, and then recorded simultaneously in Vicon Shogun and Unreal Engine.

The recording was also done the same way as a real K-pop music video, capturing all four members performing at the same time. We first edited the cuts using the motion capture data and the camera footage from the set, and then cleaned up the motion capture based on the edited cuts. This let us focus on the parts that would actually be used in the final version, and because the work was based on the movement of real performers, we were able to achieve more natural motion.
 

Q: Please tell me about MAVE:’s upcoming content and the future of Metaverse Entertainment.

As a virtual celebrity powered by Unreal Engine, MAVE: is preparing a whole new level of content that will set them apart from other K-pop groups. We're working really hard, so keep an eye out for us!

We plan to further expand our business by using our IPs in movies, dramas, and games, and we are also going to build on our specialties, like virtual humans and the metaverse. Along the way, Unreal Engine will provide a strong foundation for a variety of content, including real-time fandom content, interactive content, and new media.

Check out the official website and social media channels for updates on MAVE:.
