Cluster Culling Shading
This shader type has an execution environment similar to that of a compute shader, where a collection of shader invocations forms a workgroup and cooperates to perform coarse-level geometry culling and LOD selection. A shader invocation can emit a set of built-in output variables via a new built-in function. The cluster culling shader organizes these emitted variables into a drawing command used by the subsequent rendering pipeline.
Cluster Culling Shader Input
The only inputs available to the cluster culling shader are variables identifying the specific workgroup and invocation.
Cluster Culling Shader Output
If a cluster survives culling in a cluster culling shader invocation, that invocation should emit a drawing command for the cluster so that it can be processed by the subsequent rendering pipeline. There are two types of drawing command: indexed mode and non-indexed mode. Both types consist of a set of built-in output variables whose definitions are similar to the members of VkDrawIndexedIndirectCommand and VkDrawIndirectCommand.
Cluster culling shaders have the following built-in output variables:
- built-in variable IndexCountHUAWEI is the number of vertices to draw.
- built-in variable VertexCountHUAWEI is the number of vertices to draw.
- built-in variable InstanceCountHUAWEI is the number of instances to draw.
- built-in variable FirstIndexHUAWEI is the base index within the index buffer.
- built-in variable FirstVertexHUAWEI is the index of the first vertex to draw.
- built-in variable VertexOffsetHUAWEI is the value added to the vertex index before indexing into the vertex buffer.
- built-in variable FirstInstanceHUAWEI is the instance ID of the first instance to draw.
- built-in variable ClusterIDHUAWEI is the index of the cluster being rendered by this drawing command. When the cluster culling shader is enabled, ClusterIDHUAWEI replaces gl_DrawID as the value passed to the vertex shader.
- built-in variable ClusterShadingRate is the shading rate of the cluster being rendered by this drawing command.
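To make the relationship to VkDrawIndexedIndirectCommand and VkDrawIndirectCommand concrete, the following is a minimal CPU-side sketch, not part of the extension. The struct and function names (ClusterIndexedCommand, ClusterNonIndexedCommand, cluster_indexed_total_vertices, cluster_resolve_vertex) are hypothetical, and the field types are assumed to match the corresponding Vulkan indirect command members.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of an indexed-mode drawing command. Field order and
   types are assumed to follow VkDrawIndexedIndirectCommand, with one extra
   field standing in for ClusterIDHUAWEI. */
typedef struct {
    uint32_t index_count;     /* IndexCountHUAWEI */
    uint32_t instance_count;  /* InstanceCountHUAWEI */
    uint32_t first_index;     /* FirstIndexHUAWEI */
    int32_t  vertex_offset;   /* VertexOffsetHUAWEI */
    uint32_t first_instance;  /* FirstInstanceHUAWEI */
    uint32_t cluster_id;      /* ClusterIDHUAWEI */
} ClusterIndexedCommand;

/* Hypothetical mirror of a non-indexed-mode drawing command, assumed to
   follow VkDrawIndirectCommand plus the cluster index. */
typedef struct {
    uint32_t vertex_count;    /* VertexCountHUAWEI */
    uint32_t instance_count;  /* InstanceCountHUAWEI */
    uint32_t first_vertex;    /* FirstVertexHUAWEI */
    uint32_t first_instance;  /* FirstInstanceHUAWEI */
    uint32_t cluster_id;      /* ClusterIDHUAWEI */
} ClusterNonIndexedCommand;

/* Total number of vertices drawn by an indexed command across all instances. */
uint64_t cluster_indexed_total_vertices(const ClusterIndexedCommand *cmd)
{
    assert(cmd != 0);
    return (uint64_t)cmd->index_count * cmd->instance_count;
}

/* Vertex buffer index obtained from a value read out of the index buffer:
   the vertex offset is added before the vertex buffer is indexed. */
int64_t cluster_resolve_vertex(const ClusterIndexedCommand *cmd, uint32_t index_value)
{
    assert(cmd != 0);
    return (int64_t)index_value + cmd->vertex_offset;
}
```

In this model, an indexed command uses IndexCountHUAWEI, FirstIndexHUAWEI and VertexOffsetHUAWEI, while a non-indexed command uses VertexCountHUAWEI and FirstVertexHUAWEI; the instance fields and the cluster index are common to both.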
Cluster Culling Shader Cluster Ordering
- When a cluster culling shader is used, all output clusters generated by DispatchClusterHUAWEI() in a given workgroup are passed to the subsequent pipeline stage before any cluster generated by a subsequent workgroup.
- In a workgroup, the order of output clusters generated by DispatchClusterHUAWEI() is specified by the local invocation id, from lower to higher values.
- If any cluster culling invocation in the workgroup does not call DispatchClusterHUAWEI(), no cluster will be sent to the subsequent rendering pipeline.
- Any cluster culling shader invocation may also call DispatchClusterHUAWEI() multiple times, as shown below:
// Cluster Culling Shader sample code:
......
DispatchClusterHUAWEI(); // dispatch 0
......
DispatchClusterHUAWEI(); // dispatch 1
......
DispatchClusterHUAWEI(); // dispatch 2
......
In this case, the output sequence of clusters in a workgroup is as shown below (for a workgroup of 32 shader invocations):
1. shader invocation0.dispatch0
2. shader invocation1.dispatch0
..........
32. shader invocation31.dispatch0
33. shader invocation0.dispatch1
34. shader invocation1.dispatch1
..........
64. shader invocation31.dispatch1
65. shader invocation0.dispatch2
66. shader invocation1.dispatch2
..........
96. shader invocation31.dispatch2
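The numbered sequence above follows a simple pattern: clusters are grouped by dispatch number first, then ordered by invocation within each dispatch. The sketch below computes the 1-based position in that sequence. The function name cluster_output_position is hypothetical, and the formula assumes that every invocation in the workgroup issues the same number of dispatches, as in the sample.

```c
#include <assert.h>
#include <stdint.h>

/* 1-based position of a cluster in the workgroup's output sequence.
   workgroup_size: number of shader invocations in the workgroup
   invocation:     local invocation id (0-based)
   dispatch:       which DispatchClusterHUAWEI() call of that invocation (0-based)
   Assumes every invocation issues every dispatch. */
uint32_t cluster_output_position(uint32_t workgroup_size,
                                 uint32_t invocation,
                                 uint32_t dispatch)
{
    assert(invocation < workgroup_size);
    return dispatch * workgroup_size + invocation + 1u;
}
```

For a workgroup of 32 invocations, cluster_output_position(32, 0, 1) yields 33 and cluster_output_position(32, 31, 2) yields 96, matching entries 33 and 96 in the list.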
Cluster Culling Shader Primitive Ordering
The following limited guarantees are provided for the relative ordering of primitives produced by a cluster culling shader, as they pertain to primitive order.
- The order of primitives in a given cluster is specified by the content of:
  - DispatchClusterHUAWEI() with indexed output built-in variables: vertices are sourced from lower index buffer addresses to higher addresses.
  - DispatchClusterHUAWEI() with non-indexed output built-in variables: vertices are sourced from a lower numbered vertexIndex to a higher numbered vertexIndex.
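As an illustration of the ordering within a single cluster, the sketch below shows where primitive k starts in each mode. It assumes a triangle list topology, which the text above does not state, and the function names are hypothetical. In indexed mode the result is a position in the index buffer; in non-indexed mode it is a vertexIndex.

```c
#include <assert.h>
#include <stdint.h>

/* Indexed mode: index buffer position of the first index of triangle k.
   Triangles are consumed from lower index buffer addresses to higher ones.
   Assumes a triangle list (3 indices per primitive). */
uint32_t indexed_triangle_first_slot(uint32_t first_index,
                                     uint32_t index_count,
                                     uint32_t k)
{
    assert(3u * k + 2u < index_count);  /* triangle k must lie inside the cluster */
    return first_index + 3u * k;
}

/* Non-indexed mode: vertexIndex of the first vertex of triangle k.
   Triangles are consumed from lower vertexIndex values to higher ones.
   Assumes a triangle list (3 vertices per primitive). */
uint32_t nonindexed_triangle_first_vertex(uint32_t first_vertex,
                                          uint32_t vertex_count,
                                          uint32_t k)
{
    assert(3u * k + 2u < vertex_count);  /* triangle k must lie inside the cluster */
    return first_vertex + 3u * k;
}
```

In both cases a larger k maps to a larger starting position, so primitive k is always ordered before primitive k + 1 within the cluster.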