Min/Max Buffer Precision Improvement

This is a simple trick I came up with years ago. I’ve finally decided to create a blog and share a little with the community. Wish me luck!

In computer graphics you sometimes end up storing min/max data pairs. An obvious example is in deferred lighting, where a game engine will compute and store the min and max of all depth values in a screen tile, say 16×16 pixels large. Lights are then culled against this depth range (or multiple ranges, in the case of clustered schemes).
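As a rough sketch of that culling test (the names are mine, and it assumes the light's depth extent has already been expressed in the same linear depth range as the tile):

// Hypothetical tile test: a light whose depth extent is [lightMinZ, lightMaxZ]
// can only affect the tile if that interval overlaps [tileMinZ, tileMaxZ].
bool LightOverlapsTile(float lightMinZ, float lightMaxZ, float tileMinZ, float tileMaxZ)
{
    return lightMaxZ >= tileMinZ && lightMinZ <= tileMaxZ;
}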

Of course, graphics engineers are conscious of memory use, and more so of the implications for bandwidth and cache pressure. Therefore it’s common to quantize data to the smallest type we can get away with. So for example we might choose to store min/max linear depth values as 16-bit UNORMs, e.g. using a 2D texture with the format DXGI_FORMAT_R16G16_UNORM. In my case that means converting from the reverse non-linear Z used during rasterization to forward linear Z for the deferred passes.
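As a minimal sketch of that conversion, assuming a standard reverse-Z perspective projection and camera constants I’m calling gNear and gFar:

// Convert a reverse-Z hardware depth sample into forward linear depth in 0..1.
// With reverse-Z, deviceZ is 1.0 at the near plane and 0.0 at the far plane.
float LinearizeReverseZ(float deviceZ, float gNear, float gFar)
{
    float viewZ = (gNear * gFar) / (gNear + deviceZ * (gFar - gNear));
    return viewZ / gFar; // quantised to 16 bits when written to R16G16_UNORM
}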

The min/max texture for a terrain scene looks like this:

The red channel stores the minimum depth value, and the green channel the maximum, in each screen tile. RG = (1, 0) (an illegal value, since min cannot exceed max) is used to denote a clear tile, i.e. sky. Where min and max depth are similar, we get some shade of yellow based on distance. Where there is a large depth range, the colour tends towards green, as the green channel is storing a max value substantially larger than the min; intuitively this occurs on silhouettes. Such storage is common, but wasteful of precision, since both the min and max channels can store anything in the range 0..1.

Depth was originally a 32-bit float, and in converting to 16 bits we lost a lot of information.

Fortunately we have an exploitable constraint in our data: min <= max. The trick is to make one of these values relative to the other. In the following I choose to gift min more precision, but it’s just as easy to do the same for max instead.

Using the same texture format, instead of storing min and max directly, I store max and a normalised delta that interpolates between 0 and max. So as long as max is less than 1.0, min gains precision. This is trivial to code:

// encode for texture write; minZ/maxZ avoid shadowing the HLSL min()/max() intrinsics,
// and the divide is guarded against a zero max
encodedRG = float2(maxZ > 0.0f ? minZ / maxZ : 0.0f, maxZ);
// decode after texture read
minZ = encodedRG.x * encodedRG.y;
maxZ = encodedRG.y;
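In the lighting pass the decode sits right after the tile fetch. A hedged sketch, where gMinMaxDepth and tileCoord are placeholder names, and a zero max is taken to mean a cleared (sky) tile as described above:

// Hypothetical fetch in the deferred lighting pass.
float2 encodedRG = gMinMaxDepth[tileCoord];
if (encodedRG.y == 0.0f)
    return; // a green (max) value of 0 only occurs for the cleared sky value
float tileMaxZ = encodedRG.y;
float tileMinZ = encodedRG.x * encodedRG.y; // delta interpolates 0..max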

In this scheme the green channel of the texture looks exactly as it did before; the red channel, however, is drastically altered.

When there is very little difference between min and max, the encoded delta value is close to 1. Only when a large depth discontinuity exists do we see smaller values in the red channel.

If the max value is 0.25, for example, as it is for the far mountains in this scene, the minimum value benefits from effectively four times the precision, since the same 16 bits are now used to store values in the range 0 to 0.25 instead of 0 to 1.
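Putting numbers on that, under the assumption of ideal 16-bit UNORM quantisation:

// A 16-bit UNORM has 65535 steps, so storing min directly over 0..1 gives a
// depth resolution of ~1.5e-5. With the delta encoding and max = 0.25, the
// same 65535 steps only have to cover 0..0.25.
float stepDirect = 1.0f  / 65535.0f; // ~1.5e-5
float stepDelta  = 0.25f / 65535.0f; // ~3.8e-6, i.e. 4x finer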

(Note I have modified the histogram of the images slightly to make the colours stand out more)

This results in a ~0.4% speed-up in my deferred lighting, due to fewer pixels being processed with zero attenuation. Not bad for such a small change, but not earth-shattering either. YMMV, and of course improvements in precision are sometimes about quality rather than optimisation.

A future extension would be to stop using a two-channel texture and instead pack the delta and max into a single DXGI_FORMAT_R32_UINT. The potential benefit here would be to gift a different number of bits to each of the delta and max, say 17 bits for max and 15 bits for the delta. This of course requires the shader to perform more operations in packing and unpacking.
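A hedged sketch of what that packing might look like, assuming a 17/15 split and simple round-to-nearest (the helper names are mine, not something I’ve measured or shipped):

// Pack max into the low 17 bits and the min/max delta into the high 15 bits
// of a single uint, for a DXGI_FORMAT_R32_UINT target.
uint PackMinMaxDepth(float minZ, float maxZ)
{
    float delta    = maxZ > 0.0f ? minZ / maxZ : 0.0f;
    uint maxBits   = (uint)(maxZ  * 131071.0f + 0.5f); // 17 bits: 0..131071
    uint deltaBits = (uint)(delta * 32767.0f  + 0.5f); // 15 bits: 0..32767
    return maxBits | (deltaBits << 17);
}

void UnpackMinMaxDepth(uint encoded, out float minZ, out float maxZ)
{
    maxZ = float(encoded & 0x1FFFF) / 131071.0f;
    minZ = float(encoded >> 17)     / 32767.0f * maxZ;
}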
