Ray-aligned Occupancy Map Array for Fast Approximate Ray Tracing

Zheng Zeng, Zilin Xu, Lu Wang, Lifan Wu, Ling-Qi Yan

Eurographics Symposium on Rendering 2023 (CGF track)

ROMA wasn't built in a day, but in 0.x milliseconds!

Abstract

We present a new software ray tracing solution that efficiently computes visibilities in dynamic scenes. We first introduce a novel scene representation: ray-aligned occupancy map array (ROMA) that is generated by rasterizing the dynamic scene once per frame. Our key contribution is a fast and low-divergence tracing method computing visibilities in constant time, without constructing and traversing the traditional intersection acceleration data structures such as BVH. To further improve accuracy and reduce aliasing, we use a spatiotemporal scheme to stochastically distribute the candidate ray samples. We demonstrate the practicality of our method by integrating it into a modern real-time renderer and showing better performance compared to existing techniques based on distance fields (DFs). Our method is free of the typical artifacts caused by incomplete scene information, and is about 2.5x–10x faster than generating and tracing DFs at the same resolution and equal storage.

What is ROMA?

ROMA (Ray-aligned Occupancy Map Array) is a ray tracing alternative that is fast to build and fast to trace: building ROMA requires only one rasterization pass, and tracing a ray against ROMA takes only O(1) time, without any hierarchical traversal (as opposed to Hardware Ray Tracing) and without iterations (as opposed to Distance Fields).

Why do we need ROMA?

Hardware Ray Tracing builds fast and traces fast, but requires specific hardware. Distance Fields trace fast, but build prohibitively slowly (3.31ms at 128^3 resolution); this is why Distance Fields are limited to static objects.

What is the idea of ROMA?

3D scene geometries can be approximately represented by voxel bit bricks and compactly stored in a 2D occupancy map (OM).

Ray tracing in an OM can be fast: a group of binary voxels along the z-axis can be checked at once with one texture fetch and a few bit operations. The fastest case is when the ray is traced along the z-axis (one iteration in total). Therefore, we do some preparation, making multiple copies of the OM with different rotations, so that every ray can be traced along the z-axis.
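As a rough illustration of the "one fetch and a few bit operations" idea, here is a minimal Python sketch. It assumes a column of voxels along the z-axis is packed into a single integer, with bit k marking the voxel at depth k as occupied. The function names and the packing layout are hypothetical and are not taken from the paper.

```python
def z_range_mask(z0: int, z1: int) -> int:
    """Bit mask with every bit between z0 and z1 (inclusive) set."""
    lo, hi = min(z0, z1), max(z0, z1)
    return ((1 << (hi - lo + 1)) - 1) << lo


def first_hit_along_z(column: int, z0: int, z1: int):
    """Nearest occupied voxel when travelling from depth z0 to depth z1.

    `column` stands in for the value fetched from one occupancy map texel
    (assumed layout: bit k = voxel at depth k). Returns None on a miss.
    """
    hits = column & z_range_mask(z0, z1)  # keep only voxels on the segment
    if hits == 0:
        return None
    if z0 <= z1:
        # Travelling towards +z: the lowest set bit is the first hit.
        return (hits & -hits).bit_length() - 1
    # Travelling towards -z: the highest set bit is the first hit.
    return hits.bit_length() - 1
```

For a pure visibility query only the `hits == 0` test is needed. On a GPU the two bit scans would typically map to intrinsics such as find-lowest-bit and find-highest-bit.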

How does ROMA work?

Step 1: Build a BOM (Base Occupancy Map). This is a standard OM that can be quickly generated using rasterization.

Step 2: Copy the BOM and rotate the copies towards different directions. The best part of this step is that it does not require any further rasterization; it is performed entirely within a compute shader.

Step 3: Given any ray, “snap” it to its closest rotation direction in Step 2, and perform 1D ray tracing in O(1) time by bit operations.
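Steps 2 and 3 can be sketched on the CPU as follows. This is only an illustration under stated assumptions, not the paper's compute shader. It assumes the BOM is stored as `bom[y][x]` with bit z marking voxel (x, y, z), that each rotated copy is produced by a nearest-neighbour gather from the BOM, and that snapping picks the stored direction with the largest dot product. `rotate_om`, `snap_direction`, and the data layout are hypothetical names introduced here.

```python
import math


def rotate_om(bom, n, rot):
    """Resample a base occupancy map into a rotated copy (gather style).

    bom[y][x] is an integer whose bit z marks voxel (x, y, z) as occupied.
    rot is a 3x3 matrix taking rotated-frame coordinates to base-frame
    coordinates, so its third column is the base-frame direction that the
    rotated copy's z-axis points along.
    """
    c = (n - 1) / 2.0  # rotate about the grid centre
    out = [[0] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            column = 0
            for z in range(n):
                p = (x - c, y - c, z - c)
                # Nearest-neighbour lookup of the matching base voxel.
                bx, by, bz = (
                    math.floor(sum(rot[r][k] * p[k] for k in range(3)) + c + 0.5)
                    for r in range(3)
                )
                inside = 0 <= bx < n and 0 <= by < n and 0 <= bz < n
                if inside and (bom[by][bx] >> bz) & 1:
                    column |= 1 << z
            out[y][x] = column
    return out


def snap_direction(ray_dir, directions):
    """Index of the stored unit direction closest in angle to ray_dir."""
    return max(
        range(len(directions)),
        key=lambda i: sum(r * d for r, d in zip(ray_dir, directions[i])),
    )
```

With one rotated copy per stored direction, a ray would be snapped with `snap_direction`, its origin transformed into that copy's frame, and the segment tested against a single z-aligned column using bit operations. The snapping step is where the approximation comes from: a finer set of directions gives a smaller angular error.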

Any other notes?

ROMA is scalable between performance and quality, by tuning the resolution of BOM (spatial resolution) and the number of rotated OMs (angular resolution).
ROMA is suitable for spatiotemporal rendering, by using differently randomized directions in Step 2 over time.
ROMA, as a ray tracing alternative, is not itself a light transport method such as ReSTIR for direct illumination or DDGI for indirect illumination. One should expect to use ROMA in combination with these methods wherever ray tracing is needed.

Results

The following videos compare ROMA with Distance Field (DF) and Hardware Ray Tracing (Ref.). More specifically:

We want to show soft shadows from direct illumination and color bleeding from indirect illumination.
Direct illumination is sampled from an area (disk) light source on the roof. All methods sample this light and trace towards it.
Indirect illumination is queried from a Reflective Shadow Map (RSM). All methods first trace to get the secondary shading point, and then use it to query the RSM.
ROMA has a spatial resolution of 128^2 and an angular resolution of 8^2.
DF has a resolution of 128^3.
The videos themselves do not reflect the performance; they are just 60 FPS playback.

All experiments and timings are conducted on a desktop with a 3.70 GHz Intel i9-10900K and an NVIDIA GeForce RTX 3080 Ti. The table below reports the average performance on the Morphing Spot Scene, Morphing Spikes Scene, and BrainStem Scene (similar scene complexity):

Method	Generation	Tracing	Notes
DF	~2.86 ms (3.2x)	~0.90 ms (2.0x)	The generation of DF is slow. This is mainly due to the time complexity of the 3D Jump Flooding Algorithm.
ROMA	~0.89 ms	~0.45 ms	Compared with DF, ROMA is consistently faster in both generation and tracing. ROMA also achieves faster tracing than HWRT (~0.50 ms) even without hardware acceleration!