Ray Tracing
Ray tracing uses a separate rendering pipeline from both the graphics and compute pipelines (see Ray Tracing Pipeline).
Within the ray tracing pipeline, a pipeline
trace ray instruction can be called to perform a ray
traversal that invokes the various ray tracing shader stages during its
execution.
The relationship between the ray tracing pipeline object and the geometries
present in the acceleration structure traversed is passed into the ray
tracing command in a VkBuffer object known as a shader binding
table.
OpExecuteCallableKHR can also be used in ray tracing pipelines to invoke a
callable shader.
During execution, control alternates between scheduling and other operations. The scheduling functionality is implementation-specific and is responsible for workload execution. The shader stages are programmable. Traversal, which refers to the process of traversing acceleration structures to find potential intersections of rays with geometry, is fixed function.
The programmable portions of the pipeline are exposed in a single-ray
programming model, with each invocation handling one ray at a time.
Memory operations can be synchronized using standard memory barriers.
The Workgroup scope and variables with a storage class of Workgroup must not
be used in the ray tracing pipeline.
Shader Call Instructions
A shader call is an instruction which may cause execution to continue elsewhere by creating one or more invocations that execute a different shader stage.
The following table lists all shader call instructions and which stages each one can directly call.
Instruction | Intersection | Any-Hit | Closest Hit | Miss | Callable |
---|---|---|---|---|---|
X | X | X | X | ||
X | X | X | X | ||
X | |||||
X | |||||
X | X | ||||
X | X | ||||
X | X |
Pipeline trace ray instructions can be used recursively; invoked shaders
can themselves execute pipeline trace ray instructions, to a maximum depth
defined by the maxRecursionDepth or maxRayRecursionDepth limit.
Shaders directly invoked from the API always have a recursion depth of 0;
each shader executed by a pipeline trace ray instruction has a recursion
depth one higher than the recursion depth of the shader which invoked it.
Applications must not invoke a shader with a recursion depth greater than
the value of maxRecursionDepth or maxPipelineRayRecursionDepth specified in
the pipeline.
There is no explicit recursion limit for other shader call instructions
which may recurse (e.g. OpExecuteCallableKHR), but there is an upper
bound determined by the stack size.
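The depth rule above can be sketched as a pair of small helpers. This is a minimal illustration, not part of any API: the function names and the `max_pipeline_depth` parameter (standing in for the depth value specified in the pipeline) are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Recursion depth of a shader executed by a pipeline trace ray instruction
 * that was issued from a shader running at caller_depth.
 * Shaders invoked directly from the API run at depth 0. */
static uint32_t invoked_depth(uint32_t caller_depth)
{
    return caller_depth + 1u;
}

/* True if a shader at caller_depth may execute a pipeline trace ray
 * instruction without invoking a shader deeper than max_pipeline_depth
 * (a hypothetical stand-in for the pipeline's recursion depth value). */
static bool may_trace(uint32_t caller_depth, uint32_t max_pipeline_depth)
{
    return caller_depth < max_pipeline_depth;
}
```

Under these assumptions, a pipeline created with a depth of 1 lets the ray generation shader trace rays, but the hit and miss shaders it reaches may not trace further.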
An invocation repack instruction is a ray tracing instruction where the
implementation may change the set of invocations that are executing.
When a repack instruction is encountered, the invocation is suspended and a
new invocation begins and executes the instruction.
After executing the repack instruction (which may result in other ray
tracing shader stages executing) the new invocation ends and the original
invocation is resumed, but it may be resumed in a different subgroup or at
a different SubgroupLocalInvocationId within the same subgroup.
When a subset of invocations in a subgroup execute the invocation repack
instruction, those that do not execute it remain in the same subgroup at the
same SubgroupLocalInvocationId.
The OpTraceRayKHR, OpTraceRayMotionNV, OpReorderThreadWithHintNV,
OpReorderThreadWithHitObjectNV, OpReportIntersectionKHR, and
OpExecuteCallableKHR instructions are invocation repack instructions.
When a ray tracing shader executes a dynamic instance of an invocation repack instruction which results in another ray tracing shader being invoked, their instructions are related by shader-call-order.
For ray tracing invocations that are shader-call-related:
- memory operations on StorageBuffer, Image, and ShaderRecordBufferKHR storage classes can be synchronized using the ShaderCallKHR scope.
- the CallableDataKHR, IncomingCallableDataKHR, RayPayloadKHR, HitAttributeKHR, and IncomingRayPayloadKHR storage classes are system-synchronized and no application availability and visibility operations are required.
- memory operations within a single invocation before and after the shader call instruction are ordered by program-order and do not require explicit synchronization.
Ray Tracing Commands
Ray tracing commands provoke work in the ray tracing pipeline. Ray tracing commands are recorded into a command buffer and when executed by a queue will produce work that executes according to the currently bound ray tracing pipeline. A ray tracing pipeline must be bound to a command buffer before any ray tracing commands are recorded in that command buffer.
Shader Binding Table
A shader binding table is a resource which establishes the relationship between the ray tracing pipeline and the acceleration structures that were built for the ray tracing pipeline. It indicates the shaders that operate on each geometry in an acceleration structure. In addition, it contains the resources accessed by each shader, including indices of textures, buffer device addresses, and constants. The application allocates and manages shader binding tables as VkBuffer objects.
Each entry in the shader binding table consists of shaderGroupHandleSize
bytes of data, either as queried by vkGetRayTracingShaderGroupHandlesKHR to
refer to those specified shaders, or all zeros to refer to a zero shader
group.
A zero shader group behaves as though it is a shader group consisting
entirely of VK_SHADER_UNUSED_KHR.
The remainder of the data specified by the stride is application-visible
data that can be referenced by a ShaderRecordBufferKHR block in the shader.
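The record layout described above (handle bytes first, application data in the remainder of the stride) can be sketched on the host side as follows. This is a minimal illustration under stated assumptions: `write_sbt_record` and all of its parameters are hypothetical, the handle bytes are assumed to have been queried already, and any alignment requirements an implementation places on strides or offsets are not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill one shader binding table record of `stride` bytes.
 * - handle:   handle_size bytes of shader group handle, or NULL to leave
 *             the handle bytes zero (a zero shader group).
 * - app_data: optional application data placed right after the handle;
 *             this is the part a ShaderRecordBufferKHR block would read.
 * Unused trailing bytes are zeroed. All names here are hypothetical. */
static void write_sbt_record(uint8_t *record, uint32_t stride,
                             const uint8_t *handle, uint32_t handle_size,
                             const void *app_data, uint32_t app_data_size)
{
    /* The handle plus the application data must fit in one record. */
    assert((uint64_t)handle_size + app_data_size <= stride);

    memset(record, 0, stride);
    if (handle != NULL) {
        memcpy(record, handle, handle_size);
    }
    if (app_data != NULL && app_data_size > 0u) {
        memcpy(record + handle_size, app_data, app_data_size);
    }
}
```

Passing `NULL` for the handle produces an all-zero handle, which corresponds to the zero shader group case described above.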
The shader binding tables to use in a ray tracing pipeline are passed to the vkCmdTraceRaysNV, vkCmdTraceRaysKHR, or vkCmdTraceRaysIndirectKHR commands. Shader binding tables are read-only in shaders that are executing on the ray tracing pipeline.
Shader variables identified with the ShaderRecordBufferKHR storage class
are used to access the provided shader binding table.
Such variables must be:
- typed as OpTypeStruct, or an array of this type,
- identified with a Block decoration, and
- laid out explicitly using the Offset, ArrayStride, and MatrixStride decorations as specified in Offset and Stride Assignment.
The Offset decoration for any member of a Block-decorated variable in the
ShaderRecordBufferKHR storage class must not cause the space required for
that variable to extend outside the range [0, maxStorageBufferRange).
Accesses to the shader binding table from ray tracing pipelines must be
synchronized with the VK_PIPELINE_STAGE_RAY_TRACING_SHADER_BIT_KHR pipeline
stage and an access type of VK_ACCESS_SHADER_READ_BIT.
Because different shader record buffers can be associated with the same
shader, a shader variable with ShaderRecordBufferKHR storage class will
not be dynamically uniform if different invocations of the same shader can
reference different data in the shader record buffer, such as if the same
shader occurs twice in the shader binding table with a different shader
record buffer.
In this case, when indexing resources based on values in the
ShaderRecordBufferKHR storage class, the index should be decorated as
NonUniform.
Indexing Rules
In order to execute the correct shaders and access the correct resources during a ray tracing dispatch, the implementation must be able to locate shader binding table entries at various stages of execution. This is accomplished by defining a set of indexing rules that compute shader binding table record positions relative to the buffer’s base address in memory. The application must organize the contents of the shader binding table’s memory so that applying the indexing rules leads to the correct records.
Ray Generation Shaders
Only one ray generation shader is executed per ray tracing dispatch.
For vkCmdTraceRaysKHR, the location of the ray generation shader is
specified by the pRaygenShaderBindingTable→deviceAddress parameter; there
is no indexing.
All data accessed must be less than pRaygenShaderBindingTable→size bytes
from deviceAddress.
pRaygenShaderBindingTable→stride is unused, and must be equal to
pRaygenShaderBindingTable→size.
For vkCmdTraceRaysNV, the location of the ray generation shader is
specified by the raygenShaderBindingTableBuffer and
raygenShaderBindingOffset parameters; there is no indexing.
Hit Shaders
The base for the computation of intersection, any-hit, and closest hit
shader locations is the instanceShaderBindingTableRecordOffset value
stored with each instance of a top-level acceleration structure
(VkAccelerationStructureInstanceKHR).
This value determines the beginning of the shader binding table records for
a given instance.
In the following rule, geometryIndex refers to the geometry index of the
intersected geometry within the instance.
The sbtRecordOffset and sbtRecordStride values are passed in as parameters
to traceNV() or traceRayEXT() calls made in the shaders.
See Section 8.19 (Ray Tracing Functions) of the OpenGL Shading Language
Specification for more details.
In SPIR-V, these correspond to the SBTOffset and SBTStride parameters to
the pipeline trace ray instructions.
The result of this computation is then added to
pHitShaderBindingTable→deviceAddress, a device address passed to
vkCmdTraceRaysKHR, or hitShaderBindingOffset, a base offset passed to
vkCmdTraceRaysNV.
For vkCmdTraceRaysKHR, the complete rule to compute a hit shader
binding table record address in the pHitShaderBindingTable is:
pHitShaderBindingTable→deviceAddress + pHitShaderBindingTable→stride × (instanceShaderBindingTableRecordOffset + geometryIndex × sbtRecordStride + sbtRecordOffset)
All data accessed must be less than pHitShaderBindingTable→size bytes from
the base address.
For vkCmdTraceRaysNV, the offset and stride come from direct
parameters, so the full rule to compute a hit shader binding table record
address in the hitShaderBindingTableBuffer is:
hitShaderBindingOffset + hitShaderBindingStride × (instanceShaderBindingTableRecordOffset + geometryIndex × sbtRecordStride + sbtRecordOffset)
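As a worked illustration of the hit shader rule, the arithmetic can be written out in plain C. This is only a sketch: `hit_record_index` and `hit_record_address` are hypothetical helper names, and the `base` and `stride` arguments stand in for either the device address and stride of the hit table region or the NV-style base offset and stride.

```c
#include <stdint.h>

/* Record index selected by the hit shader rule:
 * instanceShaderBindingTableRecordOffset
 *   + geometryIndex * sbtRecordStride + sbtRecordOffset.
 * Computed in 64 bits so the intermediate product cannot wrap. */
static uint64_t hit_record_index(uint32_t instance_record_offset,
                                 uint32_t geometry_index,
                                 uint32_t sbt_record_stride,
                                 uint32_t sbt_record_offset)
{
    return (uint64_t)instance_record_offset
         + (uint64_t)geometry_index * sbt_record_stride
         + sbt_record_offset;
}

/* Address (or buffer offset) of the selected record:
 * base + stride * record index. */
static uint64_t hit_record_address(uint64_t base, uint64_t stride,
                                   uint64_t record_index)
{
    return base + stride * record_index;
}
```

For example, with a base of 0x1000, a stride of 64 bytes, an instance record offset of 4, a geometry index of 2, an sbtRecordStride of 2, and an sbtRecordOffset of 1, the record index is 4 + 2 × 2 + 1 = 9 and the record starts at 0x1000 + 64 × 9 = 0x1240.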
Miss Shaders
A miss shader is executed whenever a ray query fails to find an intersection for the given scene geometry. Multiple miss shaders may be executed throughout a ray tracing dispatch.
The base for the computation of miss shader locations is
pMissShaderBindingTable→deviceAddress, a device address passed into
vkCmdTraceRaysKHR, or missShaderBindingOffset, a base offset passed into
vkCmdTraceRaysNV.
The missIndex value is passed in as a parameter to traceNV() or
traceRayEXT() calls made in the shaders.
See Section 8.19 (Ray Tracing Functions) of the OpenGL Shading Language
Specification for more details.
In SPIR-V, this corresponds to the MissIndex parameter to the pipeline
trace ray instructions.
For vkCmdTraceRaysKHR, the complete rule to compute a miss shader
binding table record address in the pMissShaderBindingTable is:
pMissShaderBindingTable→deviceAddress + pMissShaderBindingTable→stride × missIndex
All data accessed must be less than pMissShaderBindingTable→size bytes from
the base address.
For vkCmdTraceRaysNV, the offset and stride come from direct
parameters, so the full rule to compute a miss shader binding table record
address in the missShaderBindingTableBuffer is:
missShaderBindingOffset + missShaderBindingStride × missIndex
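The miss shader rule, together with the requirement that accessed data stay within the table's size, can be sketched as a host-side sanity check. The helper names below are hypothetical, and `record_bytes` is an assumed application-chosen figure for how many bytes of a record a shader may read.

```c
#include <stdbool.h>
#include <stdint.h>

/* Byte offset of a miss record from the table base: stride * missIndex. */
static uint64_t miss_record_offset(uint64_t stride, uint32_t miss_index)
{
    return stride * (uint64_t)miss_index;
}

/* True if the record_bytes starting at record_offset lie entirely within
 * table_size bytes of the base. Written to avoid overflow in the sum. */
static bool record_in_range(uint64_t record_offset, uint64_t record_bytes,
                            uint64_t table_size)
{
    return record_bytes <= table_size
        && record_offset <= table_size - record_bytes;
}
```

With a 32-byte stride and a 128-byte table, a missIndex of 3 selects offset 96, which is the last record that fits; a missIndex of 4 would fall outside the table.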
Callable Shaders
A callable shader is executed when requested by a ray tracing shader. Multiple callable shaders may be executed throughout a ray tracing dispatch.
The base for the computation of callable shader locations is
pCallableShaderBindingTable→deviceAddress, a device address passed into
vkCmdTraceRaysKHR, or callableShaderBindingOffset, a base offset passed
into vkCmdTraceRaysNV.
The sbtRecordIndex value is passed in as a parameter to
executeCallableNV() or executeCallableEXT() calls made in the shaders.
See Section 8.19 (Ray Tracing Functions) of the OpenGL Shading Language
Specification for more details.
In SPIR-V, this corresponds to the SBTIndex parameter to the
OpExecuteCallableNV or OpExecuteCallableKHR instruction.
For vkCmdTraceRaysKHR, the complete rule to compute a callable shader
binding table record address in the pCallableShaderBindingTable is:
pCallableShaderBindingTable→deviceAddress + pCallableShaderBindingTable→stride × sbtRecordIndex
All data accessed must be less than pCallableShaderBindingTable→size bytes
from the base address.
For vkCmdTraceRaysNV, the offset and stride come from direct
parameters, so the full rule to compute a callable shader binding table
record address in the callableShaderBindingTableBuffer is:
callableShaderBindingOffset + callableShaderBindingStride × sbtRecordIndex
Ray Tracing Pipeline Stack
Ray tracing pipelines have a potentially large set of shaders which may be invoked in various call chain combinations to perform ray tracing. To store parameters for a given shader execution, an implementation may use a stack of data in memory. This stack must be sized to the sum of the stack sizes of all shaders in any call chain executed by the application.
If the stack size is not set explicitly, the stack size for a pipeline is:
- rayGenStackMax + min(1, maxPipelineRayRecursionDepth) × max(closestHitStackMax, missStackMax, intersectionStackMax + anyHitStackMax) + max(0, maxPipelineRayRecursionDepth - 1) × max(closestHitStackMax, missStackMax) + 2 × callableStackMax
where rayGenStackMax, closestHitStackMax, missStackMax, anyHitStackMax, intersectionStackMax, and callableStackMax are the maximum stack values queried by the respective shader stages for any shaders in any shader groups defined by the pipeline.
This stack size is potentially significant, so an application may want to provide a more accurate stack size after pipeline compilation. The value that the application provides is the maximum value of the sum of all shaders in a call chain across all possible call chains, taking into account any application specific knowledge about the properties of the call chains.
For example, if an application has two types of closest hit and miss shaders that it can use but the first level of rays will only use the first kind (possibly reflection) and the second level will only use the second kind (occlusion or shadow ray, for example) then the application can compute the stack size by something similar to:
rayGenStack + max(closestHit1Stack, miss1Stack) + max(closestHit2Stack, miss2Stack)
This is guaranteed to be no larger than the default stack size computation which assumes that both call levels may be the larger of the two.
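Both stack size computations can be written out as plain arithmetic. This is a sketch under stated assumptions: the `StackMaxima` struct and the function names are hypothetical, and the per-stage values are assumed to have been queried from the pipeline beforehand.

```c
#include <stdint.h>

/* Hypothetical container for the per-stage maximum stack sizes. */
typedef struct StackMaxima {
    uint64_t ray_gen, closest_hit, miss, any_hit, intersection, callable;
} StackMaxima;

static uint64_t max_u64(uint64_t a, uint64_t b) { return a > b ? a : b; }

/* Default stack size formula for a given maxPipelineRayRecursionDepth. */
static uint64_t default_stack_size(const StackMaxima *s, uint32_t max_depth)
{
    uint64_t hit_or_miss = max_u64(s->closest_hit, s->miss);
    /* The first trace level may also run intersection + any-hit shaders. */
    uint64_t first_level = max_u64(hit_or_miss, s->intersection + s->any_hit);
    uint64_t first_count = max_depth > 0u ? 1u : 0u;              /* min(1, depth)   */
    uint64_t deeper_count = max_depth > 0u ? max_depth - 1u : 0u; /* max(0, depth-1) */

    return s->ray_gen
         + first_count * first_level
         + deeper_count * hit_or_miss
         + 2u * s->callable;
}

/* Tighter estimate for the two-level example: level one uses only the
 * first kind of closest hit/miss shader, level two only the second. */
static uint64_t two_level_stack_size(uint64_t ray_gen,
                                     uint64_t closest_hit1, uint64_t miss1,
                                     uint64_t closest_hit2, uint64_t miss2)
{
    return ray_gen + max_u64(closest_hit1, miss1) + max_u64(closest_hit2, miss2);
}
```

As an illustration with made-up sizes (ray generation 64, closest hit 128, miss 32, any-hit 16, intersection 48, callable 8, depth 2), the default formula gives 64 + 128 + 128 + 16 = 336 bytes, while a two-level estimate with second-level shaders of 96 and 16 bytes gives 64 + 128 + 96 = 288 bytes.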
Ray Tracing Capture Replay
In a similar way to bufferDeviceAddressCaptureReplay, the
rayTracingPipelineShaderGroupHandleCaptureReplay feature allows the
querying of opaque data which can be used in a future replay.
During the capture phase, capture/replay tools are expected to query opaque data for shader group handle replay using vkGetRayTracingCaptureReplayShaderGroupHandlesKHR.
Providing the opaque data during replay, using
VkRayTracingShaderGroupCreateInfoKHR::pShaderGroupCaptureReplayHandle
at pipeline creation time, causes the implementation to generate identical
shader group handles to those in the capture phase, allowing capture/replay
tools to reuse previously recorded shader binding table buffer contents or
to obtain the same handles by calling
vkGetRayTracingCaptureReplayShaderGroupHandlesKHR again.
Ray Tracing Validation
Ray tracing validation can help root-cause application issues and improve performance. Unlike existing validation layers, ray tracing validation performs checks at an implementation level, which helps identify potential problems that may not be caught by those layers.
By enabling the ray tracing validation feature, warnings and errors can be delivered straight from a ray tracing implementation to the application through a messenger callback registered with the implementation, where they can be processed through existing application-side debugging or logging systems.