Understanding Gaussian Splatting for XR Development
Gaussian splatting is a rasterization technique gaining significant attention in the extended reality landscape. It enables real-time 3D reconstruction and rendering in virtual environments, and it's often described as an alternative to NeRF (we'll look at how the two compare in a moment).
The concept of Gaussian splatting has only recently begun to emerge in the XR space, following some interesting demonstrations released in 2023. Now, however, countless companies and developers are beginning to embrace this model, including Varjo (with Varjo Teleport), Unity, and Unreal.
Here, we’ll give you a beginner-friendly rundown of what Gaussian splatting is, how it works, and why it’s so beneficial to the future of extended reality development.
What is Gaussian Splatting?
Gaussian splatting is a rasterization approach for 3D rendering and reconstruction. It allows systems to render ultra-realistic scenes from numerous photographs or scans of an object or environment. Content created with this method can be viewed from any angle and explored in real time, making it a valuable solution for companies creating “digital twins.”
There are plenty of things that set Gaussian splatting apart from other rasterization strategies in XR, from its ultra-high rendering speed to the fact that the original technique doesn't rely on neural networks. The models can also be trained quickly to represent 3D scenes for spatial computing, metaverse, and extended reality applications.
Though the resulting files can be quite large, Gaussian splats generally produce higher-quality visuals.
The term “Gaussian splatting” comes from Carl Friedrich Gauss, after whom the Gaussian (normal) probability distribution is named. The technology is unique in that it represents content using blurry, cloud-like blobs rather than well-defined triangles and meshes.
Though the concept has been around for a while, a popular research paper published at SIGGRAPH 2023 renewed interest in the methodology, so we’re seeing many instances of it in the XR space today.
A Quick History of Gaussian Splatting
I won't bore you with too many details about the history of Gaussian splatting. However, the core idea actually emerged quite a while ago, in a 1991 thesis written by Lee Alan Westover. He coined “splatting” by comparing each rendered primitive to a snowball hitting a brick wall and spreading snow across the surface.
He did develop a few algorithms to demonstrate the idea, but the hardware of the time couldn't run them efficiently. That's why the graphics industry turned to other methods for producing 3D scenes, like meshes, voxels, and point clouds.
Still, researchers kept experimenting with new 3D reconstruction methods, such as photogrammetry, which builds models from overlapping images. In 2006, researchers found new ways to add detail to these projects using SfM (Structure from Motion), which estimates camera positions and a sparse set of 3D points from overlapping photos and now contributes to the Gaussian splatting pipeline.
In 2020, researchers introduced the NeRFs (neural radiance fields) used by many developers today, which fill the gaps left by sparse SfM reconstructions to create higher-quality images. These models use neural networks to learn how light radiates from a scene in every direction. Unfortunately, NeRFs are slow to train and render, which limits free navigation in virtual worlds.
Finally, in 2023, a team of French and German researchers built on the innovations of NeRFs but changed how the scene data is stored, representing it as a set of 3D Gaussians. They then trained the model using gradient descent, introducing a new era for 3D content development.
Since then, development in the field has continued. Researchers and vendors are exploring new ways to capture and render motion, shrink file sizes, and create ever-more realistic 3D spaces. Adobe, Apple, Google, and Meta are incorporating the technique into their enterprise apps.
Developers are also rolling out plugins for popular platforms like Unreal Engine, Unity, and Nvidia Omniverse. Even consumer apps from Poly and Luma AI enable this method.
How Does Gaussian Splatting Work?
Understanding the mechanisms of Gaussian splatting can be a little tricky, so I'll break it down into simple terms. First, developers using the Gaussian splatting technique take images or videos of a scene from various angles, the same way you would if you were creating a digital twin.
Those images are fed into a Gaussian splatting application, which uses Structure from Motion (SfM) to build a sparse point cloud from them. The system essentially works out where each photo was taken and where matching features sit in 3D space.
Next, each point in the cloud is converted into an overlapping Gaussian splat, producing something that looks like a collection of blurry blobs. At first, these initial Gaussians only represent the color and position taken from the SfM data, which is why the next step is a training process.
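To make that representation concrete, here's a minimal sketch in Python (with NumPy) of the kind of per-Gaussian parameters a splat file stores, naively initialized from an SfM point cloud. The function and field names are hypothetical, and the default scales and opacities are placeholder values chosen purely for illustration:

```python
import numpy as np

def init_gaussians_from_sfm(points_xyz: np.ndarray, colors_rgb: np.ndarray) -> dict:
    """Initialize a naive set of 3D Gaussians from an SfM point cloud.

    points_xyz: (N, 3) array of 3D point positions from Structure from Motion.
    colors_rgb: (N, 3) array of RGB colors in [0, 1] sampled from the source images.
    """
    n = points_xyz.shape[0]
    return {
        "position": points_xyz.copy(),                 # (N, 3) Gaussian centers
        "color": colors_rgb.copy(),                    # (N, 3) base color per Gaussian
        "opacity": np.full((n, 1), 0.5),               # (N, 1) transparency, refined during training
        "scale": np.full((n, 3), 0.01),                # (N, 3) ellipsoid radii along each axis
        "rotation": np.tile([1.0, 0, 0, 0], (n, 1)),   # (N, 4) unit quaternions (identity to start)
    }

# Toy usage: three SfM points become three coarse Gaussians awaiting optimization.
pts = np.array([[0.0, 0.0, 1.0], [0.2, 0.1, 1.2], [-0.1, 0.3, 0.9]])
cols = np.array([[0.8, 0.2, 0.2], [0.2, 0.8, 0.2], [0.2, 0.2, 0.8]])
gaussians = init_gaussians_from_sfm(pts, cols)
print(gaussians["position"].shape, gaussians["opacity"].shape)
```

The scale, rotation, opacity, and color fields in this sketch are exactly the kind of values the training stage described next goes on to refine.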
The follow-on training stage cycles through thousands of iterations of gradient descent to fill in the details of how each Gaussian should look and sit in space, so that rendered views match the source photos. Gradient descent is the same optimization method used to train neural networks; here, it produces a file containing millions of particles, each storing color, position, transparency, and covariance (how the blob is scaled and oriented).
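For a rough intuition of what that optimization does, here's a toy sketch using PyTorch, assuming a tiny 2D scene rather than a real capture: a handful of isotropic 2D Gaussians are fitted to a 32x32 target image with plain gradient descent. All names, sizes, and learning rates are made up for illustration; real pipelines work in 3D, with a differentiable rasterizer, image-space losses, and periodic densification and pruning.

```python
import torch

# Toy target: a 32x32 grayscale image containing one soft blob.
H = W = 32
ys, xs = torch.meshgrid(torch.arange(H, dtype=torch.float32),
                        torch.arange(W, dtype=torch.float32), indexing="ij")
target = torch.exp(-((xs - 20.0) ** 2 + (ys - 12.0) ** 2) / (2 * 4.0 ** 2))

# Learnable parameters for a handful of isotropic 2D Gaussians:
# center position, log of the radius, and intensity (a stand-in for color).
n = 8
pos = (torch.rand(n, 2) * 32).requires_grad_()
log_sigma = torch.zeros(n).requires_grad_()
intensity = torch.full((n,), 0.1).requires_grad_()

def render(pos, log_sigma, intensity):
    """Render by summing each Gaussian's footprint over the pixel grid."""
    sigma = log_sigma.exp()
    dx = xs[None] - pos[:, 0, None, None]   # (n, H, W) per-Gaussian x offsets
    dy = ys[None] - pos[:, 1, None, None]   # (n, H, W) per-Gaussian y offsets
    footprint = torch.exp(-(dx ** 2 + dy ** 2) / (2 * sigma[:, None, None] ** 2))
    return (intensity[:, None, None] * footprint).sum(dim=0)

# Plain gradient descent (Adam) on the rendering loss, mirroring the training loop.
optimizer = torch.optim.Adam([pos, log_sigma, intensity], lr=0.05)
for step in range(500):
    optimizer.zero_grad()
    loss = ((render(pos, log_sigma, intensity) - target) ** 2).mean()
    loss.backward()
    optimizer.step()

print(f"final L2 loss: {loss.item():.5f}")
```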
When a view needs to be rendered, Gaussian rasterization projects each particle onto the screen and blends it into appropriately colored pixels for that viewpoint, much like the rasterization step that turns 3D triangles into pixels in conventional pipelines. During training, the optimizer also adjusts the Gaussian parameters to reduce the loss and applies automated densification, splitting or cloning Gaussians where detail is missing, to enhance the finished image.
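As a sketch of that blending step, the snippet below (Python with NumPy, toy data and hypothetical field names) composites already-projected splats into a single pixel front to back: nearer Gaussians contribute first, and each one attenuates how much the splats behind it can add. Production rasterizers run this per tile on the GPU, but the idea is the same.

```python
import numpy as np

def composite_pixel(splats, px, py):
    """Front-to-back alpha compositing of projected Gaussians at one pixel.

    splats: list of dicts with screen-space 'center' (x, y), 'radius', 'depth',
            'color' (RGB in [0, 1]) and 'opacity' in [0, 1]. All values are toy data.
    """
    color = np.zeros(3)
    transmittance = 1.0  # how much light still passes through to farther splats
    for s in sorted(splats, key=lambda s: s["depth"]):          # nearest first
        d2 = (px - s["center"][0]) ** 2 + (py - s["center"][1]) ** 2
        weight = np.exp(-d2 / (2 * s["radius"] ** 2))           # Gaussian falloff
        alpha = s["opacity"] * weight
        color += transmittance * alpha * np.asarray(s["color"])
        transmittance *= (1.0 - alpha)
        if transmittance < 1e-4:                                # early exit: pixel is opaque
            break
    return color

# Toy usage: two overlapping splats, the red one closer to the camera.
splats = [
    {"center": (10, 10), "radius": 3.0, "depth": 1.0, "color": (1, 0, 0), "opacity": 0.8},
    {"center": (11, 10), "radius": 4.0, "depth": 2.0, "color": (0, 0, 1), "opacity": 0.8},
]
print(composite_pixel(splats, 10, 10))  # mostly red, with a faint blue contribution
```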
Comparing Gaussian Splatting to Other Models
As mentioned above, Gaussian splatting builds on existing work in the 3D development space, from NeRF techniques to SfM and radiance fields. However, there are some major differences worth noting. NeRF, for instance, stores a scene in the weights of a neural network that captures a radiance field, sometimes accelerated with hash grids, voxels, or point-based structures.
Gaussian splatting, on the other hand, stores files as a collection of Gaussian points trained with gradient descent. While gradient descent is a machine learning technique frequently used for training neural networks, the first instances of Gaussian splatting used this methodology alone, without the need for neural networks.
Since then, other researchers have built on the approach with neural networks and other machine-learning methods to help compress the data.
Additionally, NeRF relies on ray marching (sampling along a camera ray for every pixel) for image generation. Gaussian splatting instead uses a new rasterization approach that creates pixels based on each splat's position, color, transparency, and how it's scaled or stretched. On top of all that, it's almost 50 times faster to train Gaussian splatting solutions than it is to train NeRFs for the same purpose.
Gaussian splatting tools can also render images at over 135 FPS, compared to just 0.1 to 8 FPS for NeRFs. That doesn't mean NeRFs don't have value, however.
NeRFs are better at managing storage and memory. Some of the earlier Gaussian splatting files were up to 100 times larger than those created by NeRFs, and they required far more video memory (VRAM) and processing power. This is why users have relied so heavily on the cloud so far.
However, researchers are looking for ways to eliminate these limitations. Teams have begun to discover ways to use neural networks to shrink splats to a tenth of their original size.
Why is Gaussian Splatting Important for XR?
So, why does all of this matter to extended reality development? Gaussian splatting essentially streamlines and enhances the process of capturing and rendering high-definition 3D scenes, which has been one of the biggest challenges facing metaverse creators up until now.
For instance, even in apps like Google Street View, you still need to jump 15 feet from one place to the next to see a new scene. With Gaussian splatting, you could walk around a 3D-captured space naturally, just like you were moving through a real environment.
Gaussian splatting eliminates many challenges common with photogrammetry and lidar for 3D asset creation. For instance, photogrammetry relies on processing pipelines that are slow and struggle with fine details. Lidar, meanwhile, can capture high-fidelity geometry, but it doesn't record any color information on its own.
On a broad scale, it will help companies capture and create amazing 3D models. Plus, it will allow them to update those models over time, enhancing the user experience. If Gaussian splatting is integrated with AI and machine learning tools, we'll see ever-more advanced digital twins and metaverse environments.
Of course, there’s a potential downside, too. With any new metaverse technology, there are always privacy concerns to consider. This could mean regulatory risks could arise for vendors in the years ahead.
Real-World Use Cases: Examples and Ideas
Notably, Gaussian splatting is still in its early stages of development. But that doesn't mean innovators aren't already finding ways to use the technique. Across the extended reality landscape, Gaussian splatting will be incredible at creating highly realistic VR environments and augmented or mixed reality objects.
It will open the door to more immersive digital twin experiences, enabling precise camera tracking and high-fidelity reconstruction in various scenarios. For instance, SplaTAM, an RGB-D SLAM system, already uses Gaussian splatting for these purposes.
Plus, Meta’s experiments with avatars show that this technology could transform the realism of avatars for telepresence and immersive collaboration. Splatting will pave the way for better lighting and textures for our avatars.
On a more precise scale, there are plenty of examples of how Gaussian splatting could enhance various XR experiences in different industries, for instance:
- Real estate: Gaussian splatting could transform virtual property tours. It could give potential buyers a realistic experience of walking through a property and examining different details.
- Urban planning: Gaussian splatting allows us to create realistic digital twins of entire cities and locations. This could enable incredible urban development strategies.
- Ecommerce: Gaussian splatting could help to revolutionize online shopping experiences. It would create more immersive product images and 3D representations of items.
A New Era for 3D Development
Ultimately, compared to other 3D development techniques, Gaussian splatting offers faster rendering, quicker training, and incredibly photorealistic scenes. It may still have issues with high VRAM usage, but researchers are working on addressing this problem.
At the same time, countless innovators are making Gaussian splatting more accessible to today’s creators. We’ve already mentioned Unity and Unreal, for instance. Unity offers a Gaussian splatting package on its asset store, which allows users to instantly start experimenting with new rasterization techniques. Unreal Engine has also created a dedicated plugin.
Plus, the 3D platform Spline recently showcased its 3D Gaussian splatting support with a demo on its website. Other organizations are also getting involved. Companies like Luma AI, Poly.cam, and Kiri offer solutions for Android and iOS devices.
Even solutions like Varjo Teleport are giving everyday people the opportunity to experiment with the benefits of Gaussian splatting. As researchers find ways to overcome common concerns with this development method, adoption will only continue to grow.
It seems safe to say that we can expect more immersive experiences powered by a new generation of 3D development methods.