Execution Graphs

Execution graphs provide a way for applications to dispatch multiple operations dynamically from a single initial command on the host. To achieve this, an execution graph pipeline links together multiple shaders or pipelines, each describing one or more operations that can be dispatched within the execution graph. Each linked pipeline or shader describes an execution node within the graph, which can be dispatched dynamically from another shader within the same graph. This allows applications to describe much richer execution topologies, at a finer granularity, than would typically be possible with API commands alone.

Pipeline Creation

vkCreateExecutionGraphPipelinesAMDX: Creates a new execution graph pipeline object
VkExecutionGraphPipelineCreateInfoAMDX: Structure specifying parameters of a newly created execution graph pipeline
VK_SHADER_INDEX_UNUSED_AMDX: Sentinel for an unused shader index
VkPipelineShaderStageNodeCreateInfoAMDX: Structure specifying the shader name and index with an execution graph
vkGetExecutionGraphPipelineNodeIndexAMDX: Query internal id of a node in an execution graph

Initializing Scratch Memory

Implementations may need scratch memory to manage dispatch queues or similar state when executing an execution graph pipeline. This memory is explicitly managed by the application.

vkGetExecutionGraphPipelineScratchSizeAMDX: Query scratch space required to dispatch an execution graph
VkExecutionGraphPipelineScratchSizeAMDX: Structure describing the scratch space required to dispatch an execution graph
vkCmdInitializeGraphScratchMemoryAMDX: Initialize scratch memory for an execution graph

Dispatching a Graph

The initial dispatch of an execution graph is issued from the host in the same way as any other command. It can be used in a similar way to compute dispatch commands, and indirect variants are available.

vkCmdDispatchGraphAMDX: Dispatch an execution graph
vkCmdDispatchGraphIndirectAMDX: Dispatch an execution graph with node and payload parameters read on the device
vkCmdDispatchGraphIndirectCountAMDX: Dispatch an execution graph with all parameters read on the device
VkDeviceOrHostAddressConstAMDX: Union specifying a const device or host address
VkDispatchGraphCountInfoAMDX: Structure specifying count parameters for execution graph dispatch
VkDispatchGraphInfoAMDX: Structure specifying node parameters for execution graph dispatch

Shader Enqueue

Compute shaders in an execution graph can use the OpInitializeNodePayloadsAMDX instruction to initialize nodes for dispatch. Any node payload initialized in this way will be enqueued for dispatch once the shader is done writing to the payload. As compilers may be conservative when making this determination, shaders can further call OpFinalizeNodePayloadsAMDX to guarantee that the payload is no longer being written.

The Node Name operand of the PayloadNodeNameAMDX decoration on a payload identifies the shader name of the node to be enqueued, and the Shader Index operand of OpInitializeNodePayloadsAMDX identifies the shader index. A node identified in this way is dispatched as described in the following sections.
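To make the two-part node identity concrete, the following is a minimal host-side sketch in C, not the implementation's actual mechanism. It assumes a hypothetical table of nodes, each keyed by a shader name and a shader index, and resolves a (name, index) pair to a position in that table. The NodeKey type and the find_node helper are illustrative names only; in a real application the internal node index would be queried with vkGetExecutionGraphPipelineNodeIndexAMDX.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a node's identity: shader name plus shader index. */
typedef struct NodeKey {
    const char *name;   /* shader name of the node */
    uint32_t    index;  /* shader index of the node */
} NodeKey;

/* Hypothetical helper: returns the table position of the node whose name
 * and index both match, or -1 if no such node exists. */
static int find_node(const NodeKey *nodes, size_t count,
                     const char *name, uint32_t index)
{
    for (size_t i = 0; i < count; ++i) {
        if (nodes[i].index == index && strcmp(nodes[i].name, name) == 0) {
            return (int)i;
        }
    }
    return -1;
}
```

In this model, two nodes may share a shader name as long as their shader indices differ, so a lookup has to match on both fields.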

Compute Nodes

Compute shaders added as nodes to an execution graph are executed differently based on the presence or absence of the StaticNumWorkgroupsAMDX or CoalescingAMDX execution modes.

Dispatching a compute shader node that does not declare either the StaticNumWorkgroupsAMDX or CoalescingAMDX execution mode will execute a number of workgroups in each dimension specified by the first 12 bytes of the payload, interpreted as a VkDispatchIndirectCommand. The same payload will be broadcast to each workgroup in the same dispatch. Additional values in the payload have no effect on execution.
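The payload layout described above can be illustrated with a small C sketch. It assumes a local DispatchSize struct that mirrors the three 32-bit fields (x, y, z) of VkDispatchIndirectCommand, so the example compiles without the Vulkan headers. The helpers read_dispatch_size and total_workgroups are hypothetical names for illustration; they model how the first 12 bytes are interpreted and are not part of the API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of VkDispatchIndirectCommand: three 32-bit workgroup counts. */
typedef struct DispatchSize {
    uint32_t x;
    uint32_t y;
    uint32_t z;
} DispatchSize;

_Static_assert(sizeof(DispatchSize) == 12, "dispatch size must occupy 12 bytes");

/* Hypothetical helper: copies the dispatch size out of the first 12 bytes
 * of a payload. Returns 0 on success, -1 if the payload is too short. */
static int read_dispatch_size(const void *payload, size_t payload_size,
                              DispatchSize *out)
{
    if (payload == NULL || out == NULL || payload_size < sizeof(DispatchSize)) {
        return -1;
    }
    memcpy(out, payload, sizeof(DispatchSize));
    return 0;
}

/* Total number of workgroups launched for one such payload. */
static uint64_t total_workgroups(const DispatchSize *size)
{
    return (uint64_t)size->x * size->y * size->z;
}
```

A payload whose first three 32-bit values are 4, 2, and 3 would therefore launch 24 workgroups under this model, and any bytes after the first 12 are ignored when sizing the dispatch.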

Dispatching a compute shader node with the StaticNumWorkgroupsAMDX execution mode will execute workgroups in each dimension according to the x, y, and z size operands to the StaticNumWorkgroupsAMDX execution mode. The same payload will be broadcast to each workgroup in the same dispatch. Any values in the payload have no effect on execution.

Dispatching a compute shader node with the CoalescingAMDX execution mode will enqueue a single invocation for execution. Implementations may combine multiple such dispatches into the same workgroup, up to the size of the workgroup. The number of invocations coalesced into a given workgroup in this way can be queried via the CoalescedInputCountAMDX built-in. Any values in the payload have no effect on execution.
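Because coalescing is optional and bounded by the workgroup size, the number of workgroups launched for a batch of coalescing dispatches can only be bracketed, not predicted. The C sketch below computes those bounds under a simplified model: every dispatch contributes one invocation, and the implementation packs anywhere from one invocation up to a full workgroup's worth into each workgroup. The WorkgroupBounds type and coalesced_workgroup_bounds function are hypothetical names, not part of the API.

```c
#include <stdint.h>

/* Hypothetical result type: fewest and most workgroups that may be launched. */
typedef struct WorkgroupBounds {
    uint32_t min_workgroups;  /* every workgroup filled to capacity */
    uint32_t max_workgroups;  /* no coalescing: one invocation per workgroup */
} WorkgroupBounds;

/* Bounds on the workgroup count for `enqueued` single-invocation dispatches
 * to a coalescing node whose workgroup holds `workgroup_size` invocations. */
static WorkgroupBounds coalesced_workgroup_bounds(uint32_t enqueued,
                                                  uint32_t workgroup_size)
{
    WorkgroupBounds bounds = { 0, 0 };
    if (enqueued == 0 || workgroup_size == 0) {
        return bounds;  /* nothing to launch, or an invalid workgroup size */
    }
    /* Ceiling division, written to avoid overflow on the addition. */
    bounds.min_workgroups = (enqueued - 1) / workgroup_size + 1;
    bounds.max_workgroups = enqueued;
    return bounds;
}
```

For 100 enqueued invocations and a workgroup size of 64, this model gives between 2 and 100 workgroups. A shader would read CoalescedInputCountAMDX in each workgroup to learn how many invocations it actually received.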