Archive for March, 2009

Mixed Resolution Rendering

Thursday, March 26th, 2009

Here are the slides for my presentation ‘Mixed Resolution Rendering’ at GDC’09.

Here are a few relevant previous blog entries:

Keepin’ it Low Res

Let’s Have a Min/Max Party

Imperfect Shadow Maps

Geometry-Aware Framebuffer Level of Detail

GDC’09, ShaderX7

Tuesday, March 24th, 2009

I’m out at GDC now. The first two days have been pretty low-key. I spent yesterday in the Advanced D3D tutorial and today in the Insomniac PS3 programming tutorial. AD3D had some interesting tidbits; I think ATI and NVIDIA will have slides up shortly. Unfortunately, for some reason I can’t access the slides that GDC has online for attendees. The Insomniac session had a really good introduction to Cell programming, how they do gameplay updates on SPUs, a talk on debugging the SPUs that went a bit over my weary head, and a talk on some of their PS3 graphics work for Resistance and Ratchet & Clank.

They’re using a sort of halfway lighting pre-pass technique where they render out the normals and specular power, then compute deferred lighting into a buffer, then do a forward pass where they just grab the direct lighting from the previous pass’s results. It wasn’t exactly clear to me how they were doing this with MSAA. All the deferred lighting is computed once per pixel, but their forward pass is MSAA. So inevitably their MSAA samples are going to be grabbing incorrect lighting information from the lighting buffer, or so it seems to me.

Overall it looks like attendance is way down this year. This is a shame because I think this year has the best content out of the three years I’ve gone. The lineup for the next three days is solid: Killzone 2 rendering, Gears of War 2 rendering, Larrabee talks, terrain rendering in Halo Wars, a few talks on PS3 programming, etc.

I got my copy of ShaderX7 in the mail the other day. There are lots of neat little articles packed within the monstrous 800 page book. Unfortunately, my co-authors and I were excluded from the bios and authors list due to some error, but the article found its way in (under the title “Deferred Occlusion from Analytic Surfaces”).

Mixed Resolution Rendering talk @ GDC 2009

Tuesday, March 3rd, 2009

Organizers finally scheduled my talk at GDC 2009. It will be happening Friday, March 27th, from 10:30am to 10:50am in Room 130, North Hall. The description of the talk is here (mirrored link). I’ll be showing one or more short demos as time allows. Come out and say hi.

How Did I Forget About Humus?

Sunday, March 1st, 2009

Somehow I haven’t checked Humus’ website for about six months. Not sure how that happened, as that guy always has great little demos and tricks to share. Of particular interest to me is his trick for detecting where to perform multisampled deferred shading in his Deferred Shading 2 app. By passing SV_Position to the pixel shader with centroid interpolation, you can detect whether you are at an edge by checking whether the centroid-sampled position has moved away from the pixel center. Awesome! Saves you an edge detection pass.
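To make the centroid test concrete, here is a rough CPU-side sketch in C++ of the comparison a pixel shader could make. This is not Humus’ code: Float2 and IsEdgePixel are hypothetical names, and the sketch assumes the hardware evaluates a fully covered pixel exactly at its center (x + 0.5, y + 0.5) and moves the evaluation point whenever the pixel is only partially covered.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the xy components of a centroid-interpolated
// SV_Position as seen by the pixel shader.
struct Float2 {
    float x;
    float y;
};

// Assumption: a fully covered pixel is evaluated at its center, so the
// fractional part of the position is exactly 0.5 in both axes. If the pixel
// is only partially covered, centroid interpolation moves the evaluation
// point inside the covered area and the fractional part changes.
inline bool IsEdgePixel(Float2 centroidPos) {
    assert(centroidPos.x >= 0.0f && centroidPos.y >= 0.0f);
    const float fracX = centroidPos.x - std::floor(centroidPos.x);
    const float fracY = centroidPos.y - std::floor(centroidPos.y);
    return fracX != 0.5f || fracY != 0.5f;
}
```

In a deferred shading pass, pixels where this returns true would be the candidates for per-sample lighting, and everything else could be lit once per pixel.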

Also of particular interest to me are Shader Programming Tips #1, 2, and 3. It’s great to see someone talking about GPU Shader Analyzer. The program is super-handy and I use it almost every day. I can’t imagine optimizing without it.

His little experiment with Alpha to Coverage here is pretty cool too. I’ve always thought A2C was lousy. It never occurred to me that this might be partly because the HW implementation was not so good.

Thanks Humus for adding an RSS feed so that this will never happen again!

DX11 is Swell

Sunday, March 1st, 2009

Microsoft released the DX11 tech preview in their Nov 2008 SDK and I haven’t heard too much in the way of public developer reaction. Tessellation and Compute Shaders are cool and everything, but I like a lot of the small improvements. Append and Consume buffers are a good example of this.

In DX11, shaders can now effectively ‘stream out’ variable amounts of data to special ‘append’ buffers, whereas previously you could only get this type of behavior from geometry shaders. The problem with geometry shaders is that the API puts restrictions on the streamed out data. For example, the output has to maintain the ordering of the input. If the ordering of the output doesn’t matter to you, and in most of the cases I’ve run into it doesn’t, you still pay a performance penalty for the overhead of enforcing these restrictions.

The append and consume buffers can be RAW (byte address based) or structured (define arbitrary element structure). So effectively what you get is a two-pass Producer/Consumer model. You make an append view of your buffer, output to it, then use a consume view of your buffer to subsequently access that data. This is how we would have done our scene management in the Froblins demo rather than using successive stream out passes from the Geometry Shader, because the ordering does not matter. So if you’re simply trying to do variable feedback, append buffers are for you.
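As a rough mental model of the two-pass producer/consumer pattern, here is a CPU analogy in C++. It is only a sketch: AppendConsumeBuffer and SumOfEvens are hypothetical names and none of this is the D3D11 API. The point is that a slot is reserved by bumping a shared counter, so the output size is data dependent and no ordering is promised to the caller.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CPU analogy of an append/consume buffer (not a D3D11 type).
template <typename T>
class AppendConsumeBuffer {
public:
    explicit AppendConsumeBuffer(std::size_t capacity)
        : data_(capacity), count_(0) {}

    // "Append view": reserve the next free slot by incrementing the counter.
    // Returns false (and drops the value) when the buffer is full.
    bool Append(const T& value) {
        std::size_t slot = count_.load();
        do {
            if (slot >= data_.size()) return false;
        } while (!count_.compare_exchange_weak(slot, slot + 1));
        data_[slot] = value;
        return true;
    }

    // "Consume view": decrement the counter and hand back that element.
    // Callers should not rely on the order in which elements come out.
    bool Consume(T& out) {
        std::size_t n = count_.load();
        do {
            if (n == 0) return false;
        } while (!count_.compare_exchange_weak(n, n - 1));
        out = data_[n - 1];
        return true;
    }

    std::size_t Count() const { return count_.load(); }

private:
    std::vector<T> data_;
    std::atomic<std::size_t> count_;
};

// Pass 1 appends only the even inputs (a variable amount of output).
// Pass 2 drains the buffer; the sum does not depend on element order.
inline long long SumOfEvens(const std::vector<int>& input) {
    AppendConsumeBuffer<int> buffer(input.size());
    for (int v : input) {
        if (v % 2 == 0) buffer.Append(v);
    }
    long long sum = 0;
    int value = 0;
    while (buffer.Consume(value)) sum += value;
    return sum;
}
```

The two passes are kept strictly separate here, mirroring the idea of writing through an append view first and reading through a consume view afterwards.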

Other awesome little features in DX11 are:

  • Read-only depth buffers: You can sample your depth buffer while you’re using it for depth culling. Now your depth-based splatting techniques can use 3D proxies and have them culled against the depth buffer
  • Conservative depth: You can output depth from a pixel shader without trashing Early-Z. Basically you provide the limit on what depth you’ll write out and Early-Z will use that information to do early depth culling. This will be great for relief texture mapping and the like
  • DrawIndirect: You can write the amount of data streamed out from a shader and use that to invoke an instanced draw call. Previously you had to issue a stream out statistics query to know the instance count. Queries == bad. This is another improvement we could have used in Froblins
  • Coverage as PS input: You can get the coverage mask, woo hoo! Now you can figure out where to do per-sample shading in your deferred shading pipeline without a separate edge detection pass (amongst other uses, for sure)
  • Gather4 improvements: Specify which channel of a multi-channel texture to fetch from. Can also use programmable offsets
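For the coverage input in particular, the per-pixel decision is just a bit test. Here is a small C++ sketch of it, assuming the mask has one bit per MSAA sample (as with SV_Coverage in HLSL); NeedsPerSampleShading is a hypothetical helper name, not part of any API.

```cpp
#include <cassert>
#include <cstdint>

// 'coverage' stands in for the coverage mask received by the pixel shader,
// one bit per MSAA sample. 'sampleCount' is the number of samples per pixel.
// A pixel needs per-sample shading only if the primitive covers some,
// but not all, of its samples.
inline bool NeedsPerSampleShading(std::uint32_t coverage, unsigned sampleCount) {
    assert(sampleCount >= 1 && sampleCount <= 32);
    const std::uint32_t fullMask =
        (sampleCount == 32) ? 0xFFFFFFFFu : ((1u << sampleCount) - 1u);
    const std::uint32_t covered = coverage & fullMask;
    return covered != 0 && covered != fullMask;
}
```

With 4x MSAA, a mask of 0xF means the pixel is fully covered and can be shaded once, while something like 0x5 marks a partially covered pixel.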

I don’t want to give the wrong impression here: I am quite enthused about compute shaders too. There was more than one talk at I3D that complained of high interop costs between CUDA and their graphics API. Having your compute capability as a part of your graphics API will be nice.

While I’m on the topic, I am also into the Feature Level support in DX11. What this does is allow the DX11 API to be used for non-DX11 hardware platforms. It can go all of the way back to DX9, SM2.0. This is going to be great, and a big reason why a lot of PC developers are going to go straight to DX11 engine development.

One thing that is still missing from DX: OR blending. Come on! OGL has had this forever.