DX11 is Swell

Sunday, March 1st, 2009

Microsoft released the DX11 tech preview in their November 2008 SDK and I haven’t heard much in the way of public developer reaction. Tessellation and Compute Shaders are cool and everything, but I like a lot of the small improvements. Append and Consume buffers are a good example of this.

In DX11, shaders can now effectively ‘stream out’ variable amounts of data to special ‘append’ buffers, whereas previously you could only get this type of behavior from geometry shaders. The problem with geometry shaders is that the API puts restrictions on the streamed out data. For example, the output has to maintain the ordering of the input. If the ordering of the output doesn’t matter to you, and in the majority of cases I’ve run into it doesn’t, you still pay a performance penalty for the overhead of enforcing these restrictions.

The append and consume buffers can be RAW (byte address based) or structured (define arbitrary element structure). So effectively what you get is a two-pass Producer/Consumer model. You make an append view of your buffer, output to it, then use a consume view of your buffer to subsequently access that data. This is how we would have done our scene management in the Froblins demo rather than using successive stream out passes from the Geometry Shader, because the ordering does not matter. So if you’re simply trying to do variable feedback, append buffers are for you.
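To make the two-pass Producer/Consumer idea concrete, here is a minimal CPU-side sketch in C++. It is not Direct3D code: `Agent`, `AppendConsumeBuffer` and `cullThenDraw` are hypothetical names, and the buffer is modelled as fixed storage plus an atomic element counter, which is only my assumption about how such a buffer behaves. The point it illustrates is that the consumer gets every appended element but in no promised order.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-character record used for scene management.
struct Agent {
    std::uint32_t id;
    float distance;
};

// CPU model of an append/consume buffer: fixed storage plus a hidden counter.
class AppendConsumeBuffer {
public:
    explicit AppendConsumeBuffer(std::size_t capacity)
        : storage_(capacity), count_(0) {}

    // "Append view": atomically reserve the next slot, then write into it.
    bool append(const Agent& agent) {
        const std::uint32_t slot = count_.fetch_add(1);
        if (slot >= storage_.size()) {
            count_.fetch_sub(1);  // buffer full: undo the reservation
            return false;
        }
        storage_[slot] = agent;
        return true;
    }

    // "Consume view": atomically release the last slot and read it back.
    bool consume(Agent& out) {
        std::uint32_t current = count_.load();
        while (current > 0) {
            if (count_.compare_exchange_weak(current, current - 1)) {
                out = storage_[current - 1];
                return true;
            }
        }
        return false;  // nothing left to consume
    }

    std::uint32_t count() const { return count_.load(); }

private:
    std::vector<Agent> storage_;
    std::atomic<std::uint32_t> count_;
};

// Pass 1 appends the agents that survive a distance cull.
// Pass 2 consumes them; the caller must not rely on the order it gets back.
std::vector<std::uint32_t> cullThenDraw(const std::vector<Agent>& agents,
                                        float maxDistance) {
    AppendConsumeBuffer visible(agents.size());
    for (const Agent& a : agents) {
        if (a.distance <= maxDistance) {
            const bool stored = visible.append(a);
            assert(stored);  // capacity equals the input size, so this holds
            (void)stored;
        }
    }
    std::vector<std::uint32_t> drawn;
    Agent a{};
    while (visible.consume(a)) {
        drawn.push_back(a.id);
    }
    return drawn;
}
```

In this model the consume pass happens to hand elements back in reverse of the order they were appended, which is exactly the kind of detail the calling code should not depend on.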

Other awesome little features in DX11 are:

  • Read-only depth buffers: You can sample your depth buffer while you’re using it for depth culling. Now your depth-based splatting techniques can use 3D proxies and have them culled against the depth buffer
  • Conservative depth: You can output depth from a pixel shader without trashing Early-Z. Basically you provide the limit on what depth you’ll write out and Early-Z will use that information to do early depth culling. This will be great for relief texture mapping and the like
  • DrawIndirect: You can write the amount of data streamed out from a shader and use that to invoke an instanced draw call. Previously you had to issue a stream out statistics query to know the instance count. Queries == bad. This is another improvement we could have used in Froblins
  • Coverage as PS input: You can get the coverage mask, woo hoo! Now you can figure out where to do per-sample shading in your deferred shading pipeline without a separate edge detection pass (amongst other uses, for sure)
  • Gather4 improvements: Specify which channel of a multi-channel texture to fetch from. Can also use programmable offsets
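The DrawIndirect item above can be sketched on the CPU as well. The idea is that the draw arguments live in a buffer, and the element count produced by an earlier pass is copied straight into the instance-count field, with no query read back by the application. The snippet below is a rough C++ model of that flow, not API code: `DrawInstancedArgs`, `makeArgBuffer`, `copyCountToArgs` and `readArgs` are hypothetical helpers, and the four-field layout is my assumption about how instanced draw arguments are packed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Assumed layout of the arguments for a non-indexed instanced draw:
// four tightly packed 32-bit unsigned integers.
struct DrawInstancedArgs {
    std::uint32_t vertexCountPerInstance;
    std::uint32_t instanceCount;
    std::uint32_t startVertexLocation;
    std::uint32_t startInstanceLocation;
};
static_assert(sizeof(DrawInstancedArgs) == 16, "arguments must be tightly packed");

// Byte offset of the field that the previous pass's count gets written to.
const std::size_t kInstanceCountOffset = offsetof(DrawInstancedArgs, instanceCount);

// Create the argument buffer once, with a zero instance count as a placeholder.
std::vector<std::uint8_t> makeArgBuffer(std::uint32_t vertexCountPerInstance) {
    const DrawInstancedArgs args{vertexCountPerInstance, 0, 0, 0};
    std::vector<std::uint8_t> buffer(sizeof(args));
    std::memcpy(buffer.data(), &args, sizeof(args));
    return buffer;
}

// Stand-in for the GPU-side copy of an append buffer's element count into
// the argument buffer; the application never reads the count itself.
void copyCountToArgs(std::vector<std::uint8_t>& argBuffer,
                     std::size_t byteOffset,
                     std::uint32_t appendedCount) {
    assert(byteOffset % sizeof(std::uint32_t) == 0);
    assert(byteOffset + sizeof(appendedCount) <= argBuffer.size());
    std::memcpy(argBuffer.data() + byteOffset, &appendedCount, sizeof(appendedCount));
}

// Decode the buffer the way the draw call would interpret it.
DrawInstancedArgs readArgs(const std::vector<std::uint8_t>& argBuffer) {
    assert(argBuffer.size() >= sizeof(DrawInstancedArgs));
    DrawInstancedArgs args{};
    std::memcpy(&args, argBuffer.data(), sizeof(args));
    return args;
}
```

A frame would then look like: fill the append buffer, call something like `copyCountToArgs(argBuffer, kInstanceCountOffset, count)` on the GPU timeline, and issue the indirect draw against `argBuffer`.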

I don’t want to give the wrong impression here: I am quite enthused about compute shaders too. There was more than one talk at I3D that complained of high interop costs between CUDA and their graphics API. Having your compute capability as a part of your graphics API will be nice.

While I’m on the topic, I am also into the Feature Level support in DX11. What this does is allow the DX11 API to be used on non-DX11 hardware. It can go all the way back to DX9-class, SM2.0 hardware. This is going to be great, and a big reason why a lot of PC developers are going to go straight to DX11 engine development.
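Here is a small sketch of how an engine might use feature levels to pick a render path from a single code base. It is plain C++ with no Direct3D headers: the `FeatureLevel` enum, `pickFeatureLevel` and `selectRenderPath` are hypothetical, the enum values are only chosen so that higher levels compare greater, and the capability cutoffs are my assumptions rather than anything taken from the SDK.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical feature levels, ordered so that newer hardware compares greater.
enum class FeatureLevel : std::uint32_t {
    Level9_1  = 0x9100,  // DX9-class, SM2.0 hardware
    Level9_3  = 0x9300,
    Level10_0 = 0xa000,
    Level10_1 = 0xa100,
    Level11_0 = 0xb000
};

// Which optional techniques the engine enables for a given level.
struct RenderPath {
    bool useGeometryShaders;
    bool useTessellation;
    bool useComputeShaders;
};

// Walk the requested levels (highest first) and take the first one the
// hardware can satisfy. Returns false if none of them is supported.
bool pickFeatureLevel(const std::vector<FeatureLevel>& requested,
                      FeatureLevel hardwareMax,
                      FeatureLevel& chosen) {
    for (FeatureLevel level : requested) {
        if (level <= hardwareMax) {
            chosen = level;
            return true;
        }
    }
    return false;
}

// Assumed cutoffs: geometry shaders from 10_0 up, tessellation and
// compute shaders from 11_0 up.
RenderPath selectRenderPath(FeatureLevel level) {
    RenderPath path{};
    path.useGeometryShaders = level >= FeatureLevel::Level10_0;
    path.useTessellation    = level >= FeatureLevel::Level11_0;
    path.useComputeShaders  = level >= FeatureLevel::Level11_0;
    return path;
}
```

The appeal is that the fallback logic sits in one place, and the rest of the engine only asks the `RenderPath` what it is allowed to do.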

One thing that is still missing from DX: OR blending. Come on! OGL has had this forever.