posted on 2024-11-03, 12:26 authored by Jesse Archer, Geoffrey Leach
Transparency requires geometry to be blended in depth-sorted order. Order-independent transparency (OIT) allows geometry to be rendered in any order, with exact OIT capturing all fragment data during rasterization before sorting and blending. The sorting stage is the only super-linear operation, so it becomes more dominant as scene depth complexity increases, and it remains costly for deep scenes despite many improvements. The current fastest approach for OIT uses an insertion sort network of fast registers, sorting fragment data in blocks before writing them to local memory and performing a k-way merge. We show that sort network performance is improved by modularising parts of the network and tuning loop unrolling, which reduces total sort code size for better cache behaviour. This further improves sort performance by up to 1.8× and total frametime by up to 1.2×, compounding with previous sorting improvements.
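The block-sort-then-merge structure described above can be illustrated with a minimal CPU sketch. This is not the authors' GPU implementation: an ordinary insertion sort over a Python list stands in for the register-based sort network, `heapq.merge` stands in for the k-way merge over local memory, and the names `BLOCK_SIZE`, `insertion_sort_block` and `sort_fragments` are hypothetical, chosen only for illustration. Fragments are assumed to be `(depth, payload)` tuples.

```python
import heapq

# Hypothetical block size; on a GPU this would be limited by available registers.
BLOCK_SIZE = 4


def insertion_sort_block(block):
    """Sort one small block of (depth, payload) fragments by depth, in place."""
    for i in range(1, len(block)):
        frag = block[i]
        j = i - 1
        # Shift fragments with greater depth one slot to the right.
        while j >= 0 and block[j][0] > frag[0]:
            block[j + 1] = block[j]
            j -= 1
        block[j + 1] = frag
    return block


def sort_fragments(fragments, block_size=BLOCK_SIZE):
    """Sort each fixed-size block, then k-way merge the sorted blocks by depth."""
    blocks = [
        insertion_sort_block(list(fragments[i:i + block_size]))
        for i in range(0, len(fragments), block_size)
    ]
    # heapq.merge lazily merges any number of already-sorted sequences.
    return list(heapq.merge(*blocks, key=lambda frag: frag[0]))
```

Calling `sort_fragments` on a per-pixel fragment list returns the fragments in ascending depth order, ready for blending. The sketch shows only the data flow; the gains reported above come from how the sort code is structured and unrolled on the GPU, which a CPU example like this cannot reproduce.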