November 4, 2019
Pushing the boundaries of mobile VR with stunning environments
In this tech blog, we explain how we heavily optimize our static environments so that we can spend the expensive pixels where they add the most: on the interactive, up-close objects in the scene. This may sound trivial, but in practice it often conflicts with the creative and iterative workflow of our studio and, we assume, of many other studios. We explain our ideas behind optimization and how we build upon the tools in UE4 to create highly optimized environment content without restricting the workflow required to create awesome experiences. We highlight Time Stall as the example in this post, but many of these optimizations have also been implemented in our other titles, including National Geographic: Explore VR (Oculus Quest), Coaster Combat, and Pet Lab (Oculus Go).

Our studio mission is to push the boundaries of what is deemed possible in VR. We strive for high-quality visuals on a solid technical foundation. To meet these ambitious goals, we defined an internal optimization philosophy built on the following principles:
- Optimization work needs to be automated as much as possible.
- Optimization work needs to be non-destructive and allow for artistic freedom and iterations.
- Optimization is part of our workflow. We strive to be performant early in the project, proven by regular automated tests, and aim to keep it that way.
It isn't news that VR optimization revolves around the following focal points. They are the same for mobile VR; it's just that the budgets are a tiny bit tighter.
- Shader Complexity: We draw a lot of pixels, sometimes multiple times, so make them cheap whenever possible.
- Draw call count: Changing state is expensive on both GPU and CPU. Limit the changes (We aim for around 150 on Quest).
- Triangle count: The raw vertex data is, especially on mobile VR, a limiting factor. We aim at roughly 150k.
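To make these budgets concrete, here is a minimal Python sketch of the kind of check an automated performance test could run against captured frame statistics. It is an illustration only, not the tooling used in production: `FrameStats` and `budget_violations` are hypothetical names, and the only values taken from this post are the two budget numbers.

```python
from __future__ import annotations

from dataclasses import dataclass

# Budgets quoted in the list above (Quest targets).
MAX_DRAW_CALLS = 150
MAX_TRIANGLES = 150_000


@dataclass
class FrameStats:
    """Hypothetical per-frame counters captured by a profiling run."""
    draw_calls: int
    triangles: int


def budget_violations(stats: FrameStats) -> list[str]:
    """Return one message for every budget the frame exceeds."""
    problems = []
    if stats.draw_calls > MAX_DRAW_CALLS:
        problems.append(f"draw calls {stats.draw_calls} > {MAX_DRAW_CALLS}")
    if stats.triangles > MAX_TRIANGLES:
        problems.append(f"triangles {stats.triangles} > {MAX_TRIANGLES}")
    return problems
```

A test run would fail the build whenever `budget_violations` returns a non-empty list for any captured frame, which keeps regressions visible early in the project.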
During the development of Landfall, we started to look for ways to optimize the static scene. Quickly, we stumbled upon the Hierarchical Level of Detail (HLOD) system available within Unreal Engine. It can batch several objects together to reduce draw calls, and it can also bake down Materials into a new single Material. This sounded like the perfect match for what we wanted to achieve with our optimization workflow improvements.
Here is an example of the HLOD clusters in Time Stall to show how much performance was gained.
The HLODing process generally consists of two phases:
- Generating clusters
  - The process of grouping actors in the scene to form a Proxy Mesh.
  - It can be done manually or automatically.
  - We've changed quite a few things in the automatic cluster generation, which we'll highlight below.
- Building proxy meshes
  - The process of merging the source meshes (and Materials) to be used for rendering the Proxy Mesh.
  - The merged mesh is optimized in various ways:
    - Separate meshes are merged into one, reducing the number of draw calls.
    - Material Merging. Reduces the number of Materials and simplifies them.
An example of the proxy meshes and Material swapping in action.
Orange: HLOD level 2. Number of Actors: 1
Blue: HLOD Level 1. Number of Actors: 2
Green: HLOD Level 0. Number of Actors: 8
The options offered by the HLOD system are a good start for anyone who wants to use it for optimization. However, during the development of our titles, we continually improved and extended the HLOD system so that it does all of the above faster and at higher quality. In the sections below, we explain what we changed in the process and why.
Improved Automated Cluster Generation
Cluster generation is the first step of the HLOD process. Here, static mesh actors are selected to be clustered into a proxy mesh. Initially, our art team did all the clustering by hand, for three reasons. First, the automated clustering delivered unexpected or undesired results. Second, we found that generating the clusters automatically was rather slow. Finally, the number of parameters available to control the clustering was too limited. For example, by default you can't distinguish between the different shader models of Materials, which caused transparent and masked Materials to be baked down as an opaque Material.

Over time, we made the following improvements to the clustering system.
- Speed improvement of cluster generation by implementing an alternative clustering algorithm (for the curious, we settled on hierarchical agglomerative clustering).
- Added cluster constraints to gain more control over the type of Materials that are clustered together.
  - Cluster constraints could be Material type, and even static switch values of Materials.
- Added more cluster generation options.
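As a rough illustration of how hierarchical agglomerative clustering and a Material-type constraint fit together, here is a minimal Python sketch. It is not the engine code: the `Actor` type, the `material_type` strings, the centroid distance metric, and the `max_distance` cut-off are all assumptions made for the example, and the naive pair search is far slower than a production implementation would be.

```python
from __future__ import annotations

import math
from dataclasses import dataclass


@dataclass
class Actor:
    """Hypothetical stand-in for a static mesh actor."""
    name: str
    position: tuple[float, float, float]
    material_type: str  # assumed constraint key, e.g. "opaque" or "masked"


def centroid(cluster: list[Actor]) -> tuple[float, ...]:
    """Average position of all actors in a cluster."""
    n = len(cluster)
    return tuple(sum(a.position[i] for a in cluster) / n for i in range(3))


def cluster_actors(actors: list[Actor], max_distance: float) -> list[list[Actor]]:
    """Bottom-up clustering: repeatedly merge the two closest clusters.

    Two clusters may only merge when they share the same material type
    (the constraint) and their centroids are within max_distance.
    """
    clusters = [[a] for a in actors]  # every actor starts as its own cluster
    while True:
        best = None  # (distance, i, j) of the closest valid pair so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if clusters[i][0].material_type != clusters[j][0].material_type:
                    continue  # constraint: never mix material types
                d = math.dist(centroid(clusters[i]), centroid(clusters[j]))
                if d <= max_distance and (best is None or d < best[0]):
                    best = (d, i, j)
        if best is None:
            return clusters  # no valid pair left to merge
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i stays valid
```

Because the constraint is checked before the distance, a translucent actor sitting in the middle of a group of opaque actors stays in its own cluster instead of being baked down with them.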
Added Visibility Culling
One of the bigger improvements we made was the ability to remove mesh triangles that would never be visible to the player. In our games, we have a defined play-space that the player is constrained to. The play-space is represented by one or more volumes, or even a spline through a level. During the creation of HLOD assets, the tool removes all the triangles that are never visible from the play-space.
Top: Diner Scene from Time Stall before triangle culling.
Bottom: The same scene with invisible triangles culled.
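The simplest part of this idea can be sketched in a few lines of Python. The example below only removes triangles that face away from every sampled viewpoint inside the play-space; it deliberately ignores occlusion by other geometry, which a full visibility test would also need. The sampled `viewpoints`, the counter-clockwise winding convention, and the function names are assumptions for illustration.

```python
from __future__ import annotations


def sub(a, b):
    """Component-wise a - b for 3D vectors."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a, b):
    """Cross product of two 3D vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dot(a, b):
    """Dot product of two 3D vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def faces_any_viewpoint(triangle, viewpoints) -> bool:
    """True if at least one viewpoint lies on the front side of the triangle.

    Assumes counter-clockwise winding marks the front face.
    """
    v0, v1, v2 = triangle
    normal = cross(sub(v1, v0), sub(v2, v0))
    return any(dot(normal, sub(p, v0)) > 0.0 for p in viewpoints)


def cull_triangles(triangles, viewpoints):
    """Keep only triangles that could face the play-space (no occlusion test)."""
    return [t for t in triangles if faces_any_viewpoint(t, viewpoints)]
```

Sampling more viewpoints inside the play-space volumes makes the result more conservative: a triangle is only dropped when it faces away from all of them.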
Improved Atlas Methods
During Material merging, textures are put into a texture atlas. By default, the HLOD system uses a grid-based solution. It is possible to get a different resolution per packed asset, but this is based on the texture size of the asset. We noticed that some background objects were allocated a lot of texture space simply because their Material happened to contain a big texture of some kind. A big texture map can have several causes: multiple assets combined in one big texture map, or a prop designed to be close to the play-space that was also used in the background. In any of these cases, the asset would get a lot of space in the combined atlas.

The volumes we used for Triangle Culling offered us the ability to pack the atlas in a better way. Here's a list of improvements we made on the atlasing side.
- Atlas is created after Triangle Culling
- UVs are regenerated to gain optimal packing
- Objects closer to the play-space get more resolution than objects further away
- Lightmap resolution is also taken into consideration when defining the size of an object in the packed texture
On the left, Unreal's default Grid Layout. Each mesh is allocated a square region in the atlas texture.
On the right, our Weighted UV atlasing layout.
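To illustrate what a weighted layout means in practice, here is a small Python sketch that assigns each mesh a share of the atlas area. The weighting formula is an assumption made for the example (visible surface area times lightmap resolution, divided by the squared distance to the play-space); the post does not state the exact formula, and `MeshInfo`, `atlas_weight`, and `atlas_shares` are hypothetical names.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class MeshInfo:
    """Hypothetical per-mesh inputs to the atlas packer."""
    name: str
    surface_area: float           # visible area left after triangle culling
    distance_to_playspace: float  # distance from the mesh to the play-space
    lightmap_resolution: int      # lightmap resolution set on the mesh


def atlas_weight(mesh: MeshInfo, min_distance: float = 1.0) -> float:
    """Assumed weighting: closer, larger, higher-resolution meshes weigh more."""
    d = max(mesh.distance_to_playspace, min_distance)  # avoid division by ~0
    return mesh.surface_area * mesh.lightmap_resolution / (d * d)


def atlas_shares(meshes: list[MeshInfo]) -> dict[str, float]:
    """Fraction of the atlas area each mesh should receive (sums to 1)."""
    weights = {m.name: atlas_weight(m) for m in meshes}
    total = sum(weights.values())
    if total == 0.0:
        return {name: 0.0 for name in weights}
    return {name: w / total for name, w in weights.items()}
```

A rectangle packer would then size each mesh's UV island from its share instead of from the size of its source textures, which is what keeps a distant prop with a large texture from dominating the atlas.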
Material Baking extensions
Another change we made to the Material merging process was to provide more options for baking textures to the atlas. The system offered the ability to bake down and combine textures, but it was limited to the PBR workflow. Baking down the final output into one texture was not possible. We implemented a system that allows the artists to control what got baked back into the final texture.
- Bake Base Color
  - The default UE4 baking method
  - Only combine the base color input into a new generated texture
  - Keep the PBR shading
  - A combined Normal map, and packed Metallic, Roughness, and Ambient Occlusion map were usually also baked down with this option
  - Mostly used for objects close to the play-space that still required dynamic shading
- Bake Diffuse
  - Bake the lightmap and the Base Color into a single texture map
- Bake Diffuse and Specular
  - Bake the Lightmap, Base Color, and all shading into a single texture map. This is very useful for objects in the distance.
  - The location where the specular shading is baked from is defined by an actor in the scene
  - The result is one single texture map that can be put in an unlit Material. This is the cheapest way of rendering
- Bake Diffuse, Specular, and Fog
  - The same as the previous option, but also includes the exponential height fog
On the left, HLOD level 0. It uses the original PBR Materials. The individual meshes are combined into a single new mesh where invisible triangles are culled. This LOD level is only shown when the player stands close to it.
Middle: HLOD Level 1. This mesh has a very simple Material, where all the lighting and shading has been baked into the texture.
Right: HLOD Level 2. Like HLOD level 1, this has a simple Material with only one texture map input. All the lighting and shading is baked in.
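The four baking options differ only in how much of the final shading ends up in the texture. The Python sketch below shows that layering for a single texel under a heavily simplified lighting model: diffuse is base color multiplied by the lightmap, specular is added on top, and fog is a linear blend. The `BakeMode` names, the `bake_texel` signature, and the precomputed `specular` and `fog_amount` inputs are assumptions for illustration and do not reflect UE4's actual shading code.

```python
from __future__ import annotations

from enum import Enum, auto


class BakeMode(Enum):
    """Hypothetical names for the four baking options described above."""
    BASE_COLOR = auto()
    DIFFUSE = auto()
    DIFFUSE_SPECULAR = auto()
    DIFFUSE_SPECULAR_FOG = auto()


def bake_texel(mode, base_color, lightmap, specular, fog_color, fog_amount):
    """Return the RGB value written to the atlas for one texel.

    All colors are (r, g, b) tuples; fog_amount is in the range 0..1.
    """
    if mode is BakeMode.BASE_COLOR:
        return tuple(base_color)  # lighting stays dynamic at runtime

    # Diffuse: static lighting from the lightmap is multiplied in.
    color = tuple(b * l for b, l in zip(base_color, lightmap))
    if mode is BakeMode.DIFFUSE:
        return color

    # Specular: evaluated once from a fixed bake location, then added.
    color = tuple(c + s for c, s in zip(color, specular))
    if mode is BakeMode.DIFFUSE_SPECULAR:
        return color

    # Fog: blend toward the fog color by the precomputed fog amount.
    return tuple(c * (1.0 - fog_amount) + f * fog_amount
                 for c, f in zip(color, fog_color))
```

Every mode after `BASE_COLOR` produces a texture that can be sampled directly by an unlit Material, which is why the distant HLOD levels are so cheap to render.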
It's been over two years since we began extending the HLOD system to our specific needs. This has given us enough time to evaluate the pros and cons of the results, and the impact on our development process.
On the positive side, it hits almost all our initial goals:
- The process is non-destructive as it can (re)generate new content based on the original setup. Making adjustments usually just requires running the HLOD process again.
- Having this process work inside the Unreal codebase, e.g. the editor and commandlets, has a lot of benefits. It integrates well with all kinds of properties set within the editor. Additionally, we can repurpose existing code and use and extend existing code paths.
- It is almost a fully automated process, which is pretty sweet.
There are some points that require attention in the future:
- The application footprint increases, sometimes significantly. Baking "all the things" back into a set of HLODs does wonders for performance, as all we have to render is a diffuse texture. However, all these assets get their own uniquely baked texture atlases, so keep an eye on your package size and loading times.
- Currently, it still requires an initial manual setup; it’s mainly a per-project configuration. It also needs some initial assets for this setup to make sense.
- The final and perhaps biggest drawback is the number of engine changes we have to maintain. Fortnite brought a lot of improvements to the HLOD system, which at times conflict with our own optimizations. Because HLOD is a core system in UE4, it changes often upstream, and we have to merge those changes with ours at almost every engine upgrade.
Unreal is an amazing engine to work with, and thanks to full access to the engine source code, we're able to make adjustments to the engine in order to bend it to our needs!
We’re also always looking to hire talent that can help us build even better UE4 experiences. If you’d like to work in Amsterdam on challenging VR and AR titles with us, have a look at our job openings.
Here's some reference material for developers who want to optimize for the Oculus Quest:
- UE4 Performance and Profiling | Unreal Dev Day Montreal 2017 [October 9, 2017]
- Optimizing Oculus Go for Performance [March 22, 2018]
- One Mesh To Rule Them All [April 3, 2018]
- Tech Note: Profiling & Optimizing on Mobile Devices [May 30, 2018]
- OC5 Talk - Porting Your App to Oculus Quest [September 26, 2018]
- Oculus Connect 5 | Reinforcing Mobile Performance with RenderDoc [September 26, 2018]
- Understanding Gameplay Latency for Oculus Quest, Oculus Go and Gear VR [April 2019]
- Developer Perspective: UE4 Logging and Console Commands for Mobile VR [July 2, 2019]
- Developer Perspective: Improving Memory Usage and Load Times in UE4 [July 25, 2019]
- Developer Insights: How to Develop with Vulkan for Mobile VR Rendering [August 2, 2019]
- Common Rendering Mistakes: How to Find Them and How to Fix Them [August 22, 2019]
- Official Oculus documentation on Rendering in Unreal Engine 4 [September 4, 2019]
- Official Unreal Engine 4 documentation on HLOD [September 4, 2019]
- OC6 Talk - Using Vulkan for Mobile VR [September 26, 2019]