Since the OGL pipeline was added in Tiger, we have required the availability of
a stencil buffer for use in complex (non-rectangular shape) clipping cases.
Basically, we "render" the clip shape into the stencil buffer, and then we enable
the stencil test in order to discard fragments that don't fall within that clip
region. This procedure allows for hardware-accelerated complex clipping, and it has
worked great for us. However, it is also possible to use the depth buffer in much
the same fashion to achieve the same effect, and I think we should probably use that
technique (instead of the current stencil technique) for the following reasons:
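To make the comparison concrete, the depth-based variant of the trick might look roughly like the following. This is a minimal sketch of the general idea, not our actual pipeline code; it assumes an active OpenGL context whose drawable has a depth buffer:

```c
/* Step 1: render the clip shape into the depth buffer only. */
glColorMask(GL_FALSE, GL_FALSE, GL_FALSE, GL_FALSE); /* leave color untouched */
glEnable(GL_DEPTH_TEST);
glDepthFunc(GL_ALWAYS);   /* unconditionally store our depth value */
glDepthMask(GL_TRUE);
glClearDepth(1.0);        /* 1.0 acts as the "outside the clip" marker */
glClear(GL_DEPTH_BUFFER_BIT);
/* ... render the clip shape (e.g. as spans/quads) at z = 0.0 ... */

/* Step 2: render normally; fragments outside the clip fail the depth test. */
glColorMask(GL_TRUE, GL_TRUE, GL_TRUE, GL_TRUE);
glDepthFunc(GL_EQUAL);    /* pass only where the clip shape stored z = 0.0 */
glDepthMask(GL_FALSE);    /* don't disturb the stored clip while drawing */
/* ... render primitives at z = 0.0; anything outside the clip is discarded ... */
```

This mirrors the stencil approach one-for-one: the clear value plays the role of the "reject" stencil value, and the depth func plays the role of the stencil test.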
1. FBO support: We just putback support for the FBO extension (see 6255507), which
in theory allows you to attach a stencil "renderbuffer" to an FBO. Unfortunately,
recent drivers from both Nvidia and ATI that support the FBO extension do not allow
you to create an FBO with stencil attachment. This means that currently if you
turn on FBO support for our OGL pipeline, you will see that complex clipping is
hosed. This is partially due to a bug in 6255507 where we do not correctly detect
that a stencil attachment is not allowed, and we create the FBO anyway, without
stencil support. Both Nvidia and ATI have been working with the ARB superbuffers working
group to add support for FBO with stencil, but that is still a ways off, so we
need to find a better solution in the near term. Current drivers from both Nvidia
and ATI do support FBO with depth attachment, so this will work for our complex
clipping needs.
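For illustration, attaching a depth renderbuffer under EXT_framebuffer_object looks roughly like this. This is a sketch only; it assumes the EXT entry points have been resolved, and that `colorTex`, `width`, and `height` are pre-existing (hypothetical) values:

```c
GLuint fbo, depthrb;
GLenum status;

glGenFramebuffersEXT(1, &fbo);
glBindFramebufferEXT(GL_FRAMEBUFFER_EXT, fbo);

/* color attachment: assumed pre-created texture */
glFramebufferTexture2DEXT(GL_FRAMEBUFFER_EXT, GL_COLOR_ATTACHMENT0_EXT,
                          GL_TEXTURE_2D, colorTex, 0);

/* depth attachment: supported by current Nvidia/ATI FBO drivers,
 * unlike a stencil renderbuffer attachment */
glGenRenderbuffersEXT(1, &depthrb);
glBindRenderbufferEXT(GL_RENDERBUFFER_EXT, depthrb);
glRenderbufferStorageEXT(GL_RENDERBUFFER_EXT, GL_DEPTH_COMPONENT16,
                         width, height);
glFramebufferRenderbufferEXT(GL_FRAMEBUFFER_EXT, GL_DEPTH_ATTACHMENT_EXT,
                             GL_RENDERBUFFER_EXT, depthrb);

/* always verify completeness; this is where the stencil case
 * falls over on today's drivers */
status = glCheckFramebufferStatusEXT(GL_FRAMEBUFFER_EXT);
if (status != GL_FRAMEBUFFER_COMPLETE_EXT) {
    /* fall back to the non-FBO codepath */
}
```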
2. Footprint savings: When we search for a GLXFBConfig (or PixelFormat on Windows),
we currently request a config with at least 1-bit stencil buffer. In theory, all
we need is 1-bit per pixel of stencil and no depth buffer capabilities. But in
practice, most drivers enumerate configs such that if there is stencil support,
it is at least 8 bpp for the stencil buffer, and there will almost always be
at least 16 bpp for the depth buffer. In other words, it is rare that you will ever
see a config that supports only stencil and not depth. So currently, since we
request stencil capabilities, we will usually end up with at least 8+16=24 bpp
for the combined stencil+depth buffer. We currently do not use the depth buffer
for any purpose, and we only use 1-bit of the stencil buffer in cases where a complex
clip is involved. So if we can just use the depth buffer to achieve the same effect
we were getting with the stencil buffer, why not just request a depth buffer (without
a stencil buffer)? By requesting only the depth buffer, we can save at least 8 bpp.
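As a rough illustration, the GLXFBConfig request could drop the stencil requirement like so (a hedged sketch of the attribute list only, not our actual config-selection code; `dpy` and `screen` are assumed to exist):

```c
int attrs[] = {
    GLX_DRAWABLE_TYPE, GLX_WINDOW_BIT | GLX_PBUFFER_BIT,
    GLX_RENDER_TYPE,   GLX_RGBA_BIT,
    GLX_DEPTH_SIZE,    16, /* 1 bit would suffice, but 16 is the common minimum */
    GLX_STENCIL_SIZE,  0,  /* prefer configs with no stencil planes at all */
    None
};
int n;
GLXFBConfig *configs = glXChooseFBConfig(dpy, screen, attrs, &n);
```

Note that for GLX_STENCIL_SIZE the match criterion prefers the smallest available value, so passing 0 steers us toward configs without stencil planes, which is exactly the footprint savings described above.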
3. More common codepath for drivers: There are some known bugs in certain driver/
hardware combinations (e.g. Nvidia GF 2-based boards on Windows) where stencil-based
clipping does not work properly. If we use depth-based clipping instead, we will
hit the more well-tested parts of the drivers, and in turn hit fewer bugs related
to stencil test.
4. Potentially support more drivers: Some older drivers/hardware (e.g. Intel graphics)
do not have any stencil buffer capabilities, only depth buffer. Therefore, our
OGL-based pipeline currently does not work on those boards. If we move to using
depth buffer instead, we can potentially run on these older boards.
5. Consistency with our D3D pipeline: When we beefed up our D3D-based pipeline
in Mustang, we found that stencil support in DX7 for most drivers/hardware was
flaky at best, so we ended up using depth testing techniques instead, an approach
that works much better on the D3D side.