VK_EXT_device_generated_commands.proposal
This document details API design for indirect execution of device generated commands, improving performance by eliminating unnecessary host and device work.
Problem Statement
Device-driven rendering is increasingly used to manage large and complex environments. Scene management in particular is well suited for execution on device and is responsible for:
- Traversing the scene
- Managing LoD
- Performing various culling algorithms
- Generating work to render the visible result
Graphics APIs differ in their expressiveness but most have limitations for device-driven scene management that result in:
- Unnecessary state changes
- Wasted memory for worst-case allocations for intermediate results
- Round-tripping through memory instead of staying on chip
This proposal focuses on reducing unnecessary state changes, for example by enabling a device to use compute shaders to launch other compute shaders rather than requiring explicit dispatch commands to be recorded.
Solution Space
There are several approaches to reduce unnecessary state changes. Here are some potential solutions:
- No API changes
  - Work is done on the host to determine the set of shaders that are potentially visible. This might be a simpler problem than object/triangle culling.
  - Potentially duplicates work done by the device.
- Work Graphs
  - D3D12 supports Work Graphs, which are a more powerful method of moving work generation to the GPU.
  - Standardization and cross-vendor support of this advanced functionality takes a long time to achieve ecosystem adoption.
- Predicated/Conditional Rendering
  - Commands are optionally executed depending on a condition evaluated on the device timeline.
  - Exposed in D3D12 as ID3D12GraphicsCommandList::SetPredication.
  - Host encoding overhead of binding the shaders still exists.
- Indirect Command Buffers (Host)
  - Similar to secondary command buffers but with different restrictions and inheritance rules.
  - Create multiple indirect command buffers (e.g. one per pipeline).
  - Indirect execution of multiple indirect command buffers.
  - May require patching/fast updating of objects referenced by the indirect command buffer.
- Indirect Command Buffers (Device)
  - Created in a compute shader.
  - Can be re-created every frame avoiding multiple execution complexity or patching.
  - Requires extensive shading language support.
- Enhanced Indirect
  - Add support to execute multiple types of operations in a sequence.
  - Limited state changes and operations compared to what is available for primary or secondary command buffers.
  - Should be able to represent most work in a "pass" (e.g. drawing shadows or opaque geometry).
Many graphics APIs have more expressive indirect capabilities. This proposal pursues that approach to both address the problem statement and provide an emulation target.
Goals
These are the primary goals for the proposal:
- Efficient implementation for many-draws and many-dispatches per set of shaders.
- Device-side binding of shaders.
- Changing shaders for indirect dispatch during application lifetime.
- Emulation of D3D12 indirect execution.
- Emulation of D3D12 work graphs.
- Transition existing uses of NV_device_generated_commands and NV_device_generated_commands_compute.
- Single framework for all execution-based indirect commands. Other indirect operations (e.g. building acceleration structures) have very different setup and argument management.
Current implementations
Vulkan
Indirect execution in Vulkan typically supports only a single type of command:
- vkCmdDrawIndirect
- vkCmdDrawIndexedIndirect
- vkCmdDispatchIndirect
- vkCmdDrawIndirectCount (Vulkan 1.2)
- vkCmdDrawIndexedIndirectCount (Vulkan 1.2)
- vkCmdDrawMeshTasksIndirectNV (VK_NV_mesh_shader)
- vkCmdDrawMeshTasksIndirectCountNV (VK_NV_mesh_shader)
- vkCmdBuildAccelerationStructuresIndirectKHR (VK_KHR_acceleration_structure)
- vkCmdTraceRaysIndirectKHR (VK_KHR_ray_tracing_pipeline)
- vkCmdDrawMeshTasksIndirectEXT (VK_EXT_mesh_shader)
- vkCmdDrawMeshTasksIndirectCountEXT (VK_EXT_mesh_shader)
The VK_NV_device_generated_commands extension enables a more expressive model supporting multiple commands in a sequence that may change the following state:
- Shaders
- Primitive winding
- Index and vertex buffers
- Push constants
and perform the following operations:
- Indexed and non-indexed draws
- Mesh tasks
D3D12
D3D12 indirect execution is similar in expressivity to both VK_NV_device_generated_commands and VK_NV_device_generated_commands_compute but offers no mechanism for changing graphics shaders or pipelines. It is currently possible to emulate D3D12 behavior on top of VK_NV_device_generated_commands and other base Vulkan functionality so it is important to not lose any features required for emulation with this proposal.
D3D12 work graphs are more powerful in certain aspects than indirect execution but are not yet officially supported in Vulkan.
Metal
Metal is similar in expressivity to VK_NV_device_generated_commands and supports full pipeline changes as well as the equivalent of binding descriptor sets.
Indirect buffer layout is opaque and can be encoded on host through the API or on device using a compute shader. For example:
struct arguments { command_buffer cmd_buffer; };
kernel void producer(device arguments& args, ushort cmd_idx [[thread_position_in_grid]])
{
render_command cmd(args.cmd_buffer, cmd_idx);
cmd.set_render_pipeline_state(...);
cmd.set_vertex_buffer(...);
cmd.draw_primitives(...);
}
Command representation
Supporting multiple commands in an indirect buffer can be done in one of two ways. A homogeneous structure has a fixed layout, and the same pattern of operations is executed for every element. A heterogeneous structure places no restriction on command ordering, and the size of the arguments for each command may also vary.
This proposal uses a homogeneous structure which matches D3D12, Metal, and VK_NV_device_generated_commands. This restricted model simplifies construction and interpretation of the data while also introducing an optimization challenge.
Consider a sequence of Bind Shaders/Draw that binds the same shaders multiple times. If the command buffer is constructed on the host, draw calls with the same shaders can be grouped together, creating a heterogeneous structure. There are several options for handling this with a homogeneous structure:
- On-device optimization. The implementation could detect/remove duplicates during pre-processing or execution. This may be difficult or impractical for a device to implement.
- Multi-level indirect. One of the indirect operations could be another indirect execution. For example, a two-level solution could be used with low-frequency operations in the first indirect buffer and high-frequency operations in the second indirect buffer.
- IndirectCount commands. Vulkan has pre-existing indirect commands that execute multiple operations with a device-specified count. This is equivalent to a heavily constrained multi-level indirect solution.
This proposal does not expect significant on-device optimization and uses IndirectCount commands which are capable of representing many common application scenarios.
Proposal
This proposal targets Vulkan 1.3 building on functionality from NV_device_generated_commands to address the problem statement and also provide an emulation target for other APIs.
Indirect buffers contain work elements (sequences) of uniform structure. The memory layout of a sequence is described by an Indirect Commands Layout that specifies a fixed number of command buffer operations:
- Shaders
- Push constants
- Index and vertex buffers
- Draws and dispatches
- Multi-draws with device-specified count
- Trace rays
The extension provides a common framework for all existing and future indirect commands. An implementation does not need to support every command (see the Features section for more detail).
Sequences of compute commands that change shaders must refer to elements of an Indirect Execution Set, a table that references multiple shaders of similar state.
Implementations may also require a preprocess buffer to translate to a device-specific format. With Multi-draw commands being available, optimization of the preprocess buffer to remove duplicates is not expected.
The generation of device generated commands uses the following principal steps:
- Define via VkIndirectCommandsLayoutEXT the sequence of commands which can be generated.
- Optionally create and update a VkIndirectExecutionSetEXT to support changing shaders.
- Retrieve device addresses and handles for objects stored in indirect buffers.
- Fill a VkBuffer with the content that matches the indirect command layout.
- Create a preprocess VkBuffer that satisfies the allocation information from vkGetGeneratedCommandsMemoryRequirementsEXT.
- Optionally preprocess the input data using vkCmdPreprocessGeneratedCommandsEXT in a separate action.
- Generate and execute the actual commands via vkCmdExecuteGeneratedCommandsEXT passing all required data.
vkCmdPreprocessGeneratedCommandsEXT executes in a separate logical pipeline from either graphics or compute. When preprocessing commands in a separate step they must be explicitly synchronized against the command execution. When not preprocessing, the preprocessing is automatically synchronized against the command execution.
Key differences with VK_NV_device_generated_commands
- Common indirect commands under one unified framework (graphics, compute, and ray tracing)
- Incremental update of shaders available for use
- Adds IndirectCount commands
- Adds compute dispatch support
- Single-interleaved stream
- VK_EXT_shader_object support
Indirect Execution Sets
Indirect buffers that bind shaders reference shaders (pipelines or shader objects) managed by a collection represented by:
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkIndirectExecutionSetEXT)
Indirect execution sets group both pipelines with the same VkPipelineLayout and shader stages with matching per-stage descriptor layouts.
Indirect execution sets contain a maximum of N execution slots that can be updated when not referenced by indirect buffers currently in flight. Drivers should ensure that updating a set is a cheap operation, as it is expected to be modified as application content changes.
Modifications to an indirect execution set may change the sizing requirements of the preprocess buffer. Applications must call vkGetGeneratedCommandsMemoryRequirementsEXT and update the preprocess buffer if needed when modifications are complete.
Creation and Deletion
Indirect execution sets are created by:
VKAPI_ATTR VkResult VKAPI_CALL vkCreateIndirectExecutionSetEXT(
VkDevice device,
const VkIndirectExecutionSetCreateInfoEXT* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkIndirectExecutionSetEXT* pIndirectExecutionSet);
- device is the logical device that creates the indirect execution set.
- pCreateInfo is a pointer to a VkIndirectExecutionSetCreateInfoEXT structure containing parameters affecting creation of the indirect execution set.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
- pIndirectExecutionSet is a pointer to a VkIndirectExecutionSetEXT handle in which the resulting indirect execution set is returned.
The VkIndirectExecutionSetCreateInfoEXT structure is defined as:
typedef struct VkIndirectExecutionSetCreateInfoEXT {
VkStructureType sType;
const void* pNext;
VkIndirectExecutionSetInfoTypeEXT type;
VkIndirectExecutionSetInfoEXT info;
} VkIndirectExecutionSetCreateInfoEXT;
- type is a VkIndirectExecutionSetInfoTypeEXT describing the type of set being created and determining which field of the info union will be used.
- info is a VkIndirectExecutionSetInfoEXT union containing layout information for the indirect execution set.
The VkIndirectExecutionSetInfoTypeEXT enum is defined as:
typedef enum VkIndirectExecutionSetInfoTypeEXT
{
VK_INDIRECT_EXECUTION_SET_INFO_TYPE_PIPELINES_EXT = 0x00000001,
VK_INDIRECT_EXECUTION_SET_INFO_TYPE_SHADER_OBJECTS_EXT = 0x00000002,
} VkIndirectExecutionSetInfoTypeEXT;
- VK_INDIRECT_EXECUTION_SET_INFO_TYPE_PIPELINES_EXT indicates that the VkIndirectExecutionSetEXT contains VkPipeline objects.
- VK_INDIRECT_EXECUTION_SET_INFO_TYPE_SHADER_OBJECTS_EXT indicates that the VkIndirectExecutionSetEXT contains VkShaderEXT objects.
The VkIndirectExecutionSetInfoEXT union is defined as:
typedef union VkIndirectExecutionSetInfoEXT {
const VkIndirectExecutionSetPipelineInfoEXT *pPipelineInfo;
const VkIndirectExecutionSetShaderInfoEXT *pShaderInfo;
} VkIndirectExecutionSetInfoEXT;
- pPipelineInfo is a pointer to a VkIndirectExecutionSetPipelineInfoEXT struct containing pipeline layout information for the indirect execution set.
- pShaderInfo is a pointer to a VkIndirectExecutionSetShaderInfoEXT struct containing shader object layout information for the indirect execution set.
The VkIndirectExecutionSetPipelineInfoEXT structure is defined as:
typedef struct VkIndirectExecutionSetPipelineInfoEXT {
VkStructureType sType;
const void* pNext;
VkPipeline initialPipeline;
uint32_t maxPipelineCount;
} VkIndirectExecutionSetPipelineInfoEXT;
- initialPipeline is the pipeline to validate other pipelines in the set against. Its state will be used for validation even if it is removed from the set. This pipeline will be automatically added to the set at index 0. The bind point must be supported by VkPhysicalDeviceDeviceGeneratedCommandsPropertiesEXT::supportedIndirectCommandsShaderStagesPipelineBinding.
- maxPipelineCount is the maximum number of pipelines stored in the set.
The VkIndirectExecutionSetShaderInfoEXT structure is defined as:
typedef struct VkIndirectExecutionSetShaderInfoEXT {
VkStructureType sType;
const void* pNext;
uint32_t shaderCount;
const VkShaderEXT *pInitialShaders;
const VkIndirectExecutionSetShaderLayoutInfoEXT *pSetLayoutInfos;
uint32_t maxShaderCount;
uint32_t pushConstantRangeCount;
const VkPushConstantRange *pPushConstantRanges;
} VkIndirectExecutionSetShaderInfoEXT;
- shaderCount is the number of members in the pInitialShaders and pSetLayoutInfos arrays.
- pInitialShaders is a pointer to an array containing a VkShaderEXT object for each shader stage that will be used in the set. Other shaders in the set are validated against these shaders. Their state will be used for validation even if they are removed from the set. These shaders will be automatically added to the set beginning at index 0. The stages of the shaders must be supported by VkPhysicalDeviceDeviceGeneratedCommandsPropertiesEXT::supportedIndirectCommandsShaderStagesShaderBinding.
- pSetLayoutInfos is a pointer to an array containing the VkIndirectExecutionSetShaderLayoutInfoEXT infos used by each corresponding pInitialShaders shader stage in the set.
- maxShaderCount is the maximum number of corresponding shader objects stored in the set.
- pushConstantRangeCount is the number of members in the pPushConstantRanges array.
- pPushConstantRanges is a pointer to the array of VkPushConstantRange ranges used by all shaders in the set.
The VkIndirectExecutionSetShaderLayoutInfoEXT structure is defined as:
typedef struct VkIndirectExecutionSetShaderLayoutInfoEXT {
uint32_t setLayoutCount;
const VkDescriptorSetLayout *pSetLayouts;
} VkIndirectExecutionSetShaderLayoutInfoEXT;
- setLayoutCount is the number of VkDescriptorSetLayout objects in the pSetLayouts array.
- pSetLayouts is a pointer to an array containing VkDescriptorSetLayout objects used by a given shader stage.
Indirect execution sets are destroyed by:
VKAPI_ATTR void VKAPI_CALL vkDestroyIndirectExecutionSetEXT(
VkDevice device,
VkIndirectExecutionSetEXT indirectExecutionSet,
const VkAllocationCallbacks* pAllocator);
- device is the logical device that owns the indirect execution set.
- indirectExecutionSet is the indirect execution set to destroy.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
Updates
Once created, execution slots in indirect execution sets can be updated with one of the following functions depending on how it was created:
VKAPI_ATTR void VKAPI_CALL vkUpdateIndirectExecutionSetPipelineEXT(
VkDevice device,
VkIndirectExecutionSetEXT indirectExecutionSet,
uint32_t executionSetWriteCount,
const VkWriteIndirectExecutionSetPipelineEXT* pExecutionSetWrites);
- device is the logical device that owns the indirect execution set.
- indirectExecutionSet is the indirect execution set to update.
- executionSetWriteCount is the number of elements in pExecutionSetWrites.
- pExecutionSetWrites is a pointer to an array of VkWriteIndirectExecutionSetPipelineEXT structures describing the elements to update.
VKAPI_ATTR void VKAPI_CALL vkUpdateIndirectExecutionSetShaderEXT(
VkDevice device,
VkIndirectExecutionSetEXT indirectExecutionSet,
uint32_t executionSetWriteCount,
const VkWriteIndirectExecutionSetShaderEXT* pExecutionSetWrites);
- device is the logical device that owns the indirect execution set.
- indirectExecutionSet is the indirect execution set to update.
- executionSetWriteCount is the number of elements in pExecutionSetWrites.
- pExecutionSetWrites is a pointer to an array of VkWriteIndirectExecutionSetShaderEXT structures describing the elements to update.
It is legal to update an indirect execution set that is used in flight as long as the slot indices in the VkWriteIndirectExecutionSetPipelineEXT or VkWriteIndirectExecutionSetShaderEXT structures are not in use. Any change to an indirect execution set requires recalculating memory requirements by calling vkGetGeneratedCommandsMemoryRequirementsEXT for commands that use that modified state. Commands that are in flight or those not using the changed state are safe.
The VkWriteIndirectExecutionSetPipelineEXT struct is defined as:
typedef struct VkWriteIndirectExecutionSetPipelineEXT {
VkStructureType sType;
const void* pNext;
uint32_t index;
VkPipeline pipeline;
} VkWriteIndirectExecutionSetPipelineEXT;
- index is the execution slot to update.
- pipeline is the pipeline to store in the indirect execution set.
The VkWriteIndirectExecutionSetShaderEXT struct is defined as:
typedef struct VkWriteIndirectExecutionSetShaderEXT {
VkStructureType sType;
const void* pNext;
uint32_t index;
VkShaderEXT shader;
} VkWriteIndirectExecutionSetShaderEXT;
- index is the execution slot to update.
- shader is the shader object to store in the indirect execution set.
Indirect Commands Layout
The device-side command generation happens through an iterative processing of an atomic sequence comprised of command tokens, which are represented by:
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkIndirectCommandsLayoutEXT)
Creation and Deletion
Indirect command layouts are created by:
VKAPI_ATTR VkResult VKAPI_CALL vkCreateIndirectCommandsLayoutEXT(
VkDevice device,
const VkIndirectCommandsLayoutCreateInfoEXT* pCreateInfo,
const VkAllocationCallbacks* pAllocator,
VkIndirectCommandsLayoutEXT* pIndirectCommandsLayout);
- device is the logical device that creates the indirect command layout.
- pCreateInfo is a pointer to a VkIndirectCommandsLayoutCreateInfoEXT structure containing parameters affecting creation of the indirect command layout.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
- pIndirectCommandsLayout is a pointer to a VkIndirectCommandsLayoutEXT handle in which the resulting indirect command layout is returned.
The VkIndirectCommandsLayoutCreateInfoEXT structure is defined as:
typedef struct VkIndirectCommandsLayoutCreateInfoEXT {
VkStructureType sType;
const void* pNext;
VkIndirectCommandsLayoutUsageFlagsEXT flags;
VkShaderStageFlags shaderStages;
uint32_t indirectStride;
VkPipelineLayout pipelineLayout;
uint32_t tokenCount;
const VkIndirectCommandsLayoutTokenEXT* pTokens;
} VkIndirectCommandsLayoutCreateInfoEXT;
- flags is a bitmask of VkIndirectCommandsLayoutUsageFlagBitsEXT specifying usage rules for this layout.
- shaderStages is the VkShaderStageFlags that this layout supports.
- indirectStride is the stride of the indirect buffer.
- pipelineLayout is the VkPipelineLayout that this layout supports. If a VK_INDIRECT_COMMANDS_TOKEN_TYPE_PUSH_CONSTANT_EXT or VK_INDIRECT_COMMANDS_TOKEN_TYPE_SEQUENCE_INDEX_EXT token is used by the layout, it must not be VK_NULL_HANDLE.
- tokenCount is the length of the individual command sequence.
- pTokens is an array describing each command token in detail. See VkIndirectCommandsTokenTypeEXT and VkIndirectCommandsLayoutTokenEXT below for details.
A VkPipelineLayoutCreateInfo can be passed in pNext if the dynamicGeneratedPipelineLayout feature is enabled.
Bits which can be set in VkIndirectCommandsLayoutCreateInfoEXT::flags, specifying usage rules of an indirect command layout, are:
typedef enum VkIndirectCommandsLayoutUsageFlagBitsEXT
{
VK_INDIRECT_COMMANDS_LAYOUT_USAGE_EXPLICIT_PREPROCESS_BIT_EXT = 0x00000001,
VK_INDIRECT_COMMANDS_LAYOUT_USAGE_UNORDERED_SEQUENCES_BIT_EXT = 0x00000002,
} VkIndirectCommandsLayoutUsageFlagBitsEXT;
typedef VkFlags VkIndirectCommandsLayoutUsageFlagsEXT;
- VK_INDIRECT_COMMANDS_LAYOUT_USAGE_EXPLICIT_PREPROCESS_BIT_EXT specifies that the layout is always used with the manual preprocessing step through calling vkCmdPreprocessGeneratedCommandsEXT and executed by vkCmdExecuteGeneratedCommandsEXT with isPreprocessed set to VK_TRUE.
- VK_INDIRECT_COMMANDS_LAYOUT_USAGE_UNORDERED_SEQUENCES_BIT_EXT specifies that submission order is not affected by the ordering of sequences, and sequences may be processed in any order.
Indirect command layouts are destroyed by:
VKAPI_ATTR void VKAPI_CALL vkDestroyIndirectCommandsLayoutEXT(
VkDevice device,
VkIndirectCommandsLayoutEXT indirectCommandsLayout,
const VkAllocationCallbacks* pAllocator);
- device is the logical device that owns the layout.
- indirectCommandsLayout is the layout to destroy.
- pAllocator controls host memory allocation as described in the Memory Allocation chapter.
Token layout
Each sequence of commands in the indirect buffer has the same memory layout. The data can contain raw uint32_t values, existing indirect commands such as VkDrawIndirectCommand, or additional commands listed in the next section.
The VkIndirectCommandsLayoutTokenEXT structure specifies details of the commands that need to be known at layout creation time:
typedef struct VkIndirectCommandsLayoutTokenEXT {
VkStructureType sType;
const void* pNext;
VkIndirectCommandsTokenTypeEXT type;
VkIndirectCommandsTokenDataEXT data;
uint32_t offset;
} VkIndirectCommandsLayoutTokenEXT;
- type specifies the token command type.
- data specifies token specific details for command execution.
- offset is the relative byte offset for the token within one sequence of the indirect buffer. The data stored at that offset is the command data for the token, e.g. VkDispatchIndirectCommand.
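The relationship between token offsets and the stride of a sequence can be sketched on the host with plain C. This is an illustration only: ExampleSequence, DrawIndexedArgs, and the EXAMPLE_ constants are hypothetical names that are not part of the extension, and DrawIndexedArgs merely mirrors the members of VkDrawIndexedIndirectCommand so that the block compiles without the Vulkan headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in with the same members as VkDrawIndexedIndirectCommand. */
typedef struct DrawIndexedArgs {
    uint32_t indexCount;
    uint32_t instanceCount;
    uint32_t firstIndex;
    int32_t  vertexOffset;
    uint32_t firstInstance;
} DrawIndexedArgs;

/* Hypothetical application-defined sequence: one execution set index,
 * two push constant words, then an indexed draw. */
typedef struct ExampleSequence {
    uint32_t        executionSetIndex;
    uint32_t        pushData[2];
    DrawIndexedArgs draw;
} ExampleSequence;

/* Values an application could place in each token's offset member and in
 * the layout's indirectStride member for this sequence. */
enum {
    EXAMPLE_OFFSET_EXECUTION_SET = offsetof(ExampleSequence, executionSetIndex),
    EXAMPLE_OFFSET_PUSH_CONSTANT = offsetof(ExampleSequence, pushData),
    EXAMPLE_OFFSET_DRAW_INDEXED  = offsetof(ExampleSequence, draw),
    EXAMPLE_INDIRECT_STRIDE      = sizeof(ExampleSequence)
};

/* Every token in this sequence starts on a 4-byte boundary. */
static_assert(EXAMPLE_OFFSET_PUSH_CONSTANT % 4 == 0, "token must be 4-byte aligned");
static_assert(EXAMPLE_OFFSET_DRAW_INDEXED % 4 == 0, "token must be 4-byte aligned");

/* Byte position of a token's data for a given sequence in the indirect buffer. */
size_t example_token_position(size_t sequenceIndex, size_t tokenOffset)
{
    return sequenceIndex * (size_t)EXAMPLE_INDIRECT_STRIDE + tokenOffset;
}
```

Deriving the offsets from a single struct definition keeps the layout description and the data written into the buffer in sync if the sequence changes.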
Token data is a union of additional information specific to the command:
typedef union VkIndirectCommandsTokenDataEXT {
const VkIndirectCommandsPushConstantTokenEXT *pPushConstant;
const VkIndirectCommandsVertexBufferTokenEXT *pVertexBuffer;
const VkIndirectCommandsIndexBufferTokenEXT *pIndexBuffer;
const VkIndirectCommandsExecutionSetTokenEXT *pExecutionSet;
} VkIndirectCommandsTokenDataEXT;
These structures are described in the next section.
Indirect Commands
This extension defines the following commands for state changes and operations:
All commands can be stored 4-byte aligned, independent of 64-bit alignment of structures due to use of VkDeviceAddress. This provides binary compatibility with D3D12.
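Because commands only need 4-byte alignment, a 64-bit address inside a sequence may sit at an offset that is not a multiple of 8. A minimal host-side sketch of writing such a value follows; the function names are hypothetical, and uint64_t stands in for VkDeviceAddress.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Writes a 64-bit device address at a 4-byte aligned byte offset.
 * memcpy is used so the host never performs a misaligned uint64_t store. */
void example_write_address(uint8_t *sequence, size_t offset, uint64_t address)
{
    assert(offset % 4 == 0);
    memcpy(sequence + offset, &address, sizeof address);
}

/* Reads the address back the same way. */
uint64_t example_read_address(const uint8_t *sequence, size_t offset)
{
    uint64_t address;
    assert(offset % 4 == 0);
    memcpy(&address, sequence + offset, sizeof address);
    return address;
}
```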
The type of tokens in a sequence is specified by VkIndirectCommandsTokenTypeEXT which must be one of the values:
typedef enum VkIndirectCommandsTokenTypeEXT {
VK_INDIRECT_COMMANDS_TOKEN_TYPE_EXECUTION_SET_EXT,
VK_INDIRECT_COMMANDS_TOKEN_TYPE_PUSH_CONSTANT_EXT,
VK_INDIRECT_COMMANDS_TOKEN_TYPE_SEQUENCE_INDEX_EXT,
VK_INDIRECT_COMMANDS_TOKEN_TYPE_INDEX_BUFFER_EXT,
VK_INDIRECT_COMMANDS_TOKEN_TYPE_VERTEX_BUFFER_EXT,
VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_INDEXED_EXT,
VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_EXT,
VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_MESH_TASKS_EXT,
VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_MESH_TASKS_NV_EXT,
VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_INDEXED_COUNT_EXT,
VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_COUNT_EXT,
VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_MESH_TASKS_COUNT_EXT,
VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_MESH_TASKS_COUNT_NV_EXT,
VK_INDIRECT_COMMANDS_TOKEN_TYPE_DISPATCH_EXT,
VK_INDIRECT_COMMANDS_TOKEN_TYPE_TRACE_RAYS2_EXT,
} VkIndirectCommandsTokenTypeEXT;
Bind Execution Command
An array of 32-bit unsigned integer values is the input data for the VK_INDIRECT_COMMANDS_TOKEN_TYPE_EXECUTION_SET_EXT token.
Each value is an index, specified in canonical pipeline order, into the Indirect Execution Set.
One index value must be passed for each bit set in VkIndirectCommandsExecutionSetTokenEXT::shaderStages.
The VkIndirectCommandsExecutionSetTokenEXT structure specifies additional info used when creating the layout object:
typedef struct VkIndirectCommandsExecutionSetTokenEXT {
VkIndirectExecutionSetInfoTypeEXT type;
VkShaderStageFlags shaderStages;
} VkIndirectCommandsExecutionSetTokenEXT;
- type must be either VK_INDIRECT_EXECUTION_SET_INFO_TYPE_PIPELINES_EXT or VK_INDIRECT_EXECUTION_SET_INFO_TYPE_SHADER_OBJECTS_EXT.
- shaderStages specifies the shaders that will be changed by this token.
This must be the first command in a sequence when used.
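Since one index is passed for each bit set in shaderStages, the amount of data this token consumes per sequence follows from a population count. The helper below is a hypothetical sketch, not part of the extension; it takes the stage mask as a plain uint32_t.

```c
#include <assert.h>
#include <stdint.h>

/* Number of uint32_t indices an execution set token consumes per sequence:
 * one for each bit set in the shaderStages mask. */
uint32_t example_execution_set_index_count(uint32_t shaderStages)
{
    uint32_t count = 0;
    while (shaderStages != 0) {
        shaderStages &= shaderStages - 1; /* clear the lowest set bit */
        ++count;
    }
    return count;
}

/* Size in bytes of the token data within one sequence. */
uint32_t example_execution_set_token_size(uint32_t shaderStages)
{
    return example_execution_set_index_count(shaderStages) * (uint32_t)sizeof(uint32_t);
}
```

For example, a mask of VK_SHADER_STAGE_VERTEX_BIT | VK_SHADER_STAGE_FRAGMENT_BIT (0x1 | 0x10) yields two indices, or 8 bytes of token data.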
Pipelines and shaders bound in indirect buffers must be flagged at creation time:
#define VK_PIPELINE_CREATE_2_INDIRECT_BINDABLE_BIT_EXT ((VkPipelineCreateFlagBits2KHR)0x4000000000ULL)
#define VK_SHADER_CREATE_INDIRECT_BINDABLE_BIT_EXT ((VkShaderCreateFlagBitsEXT)0x00000080)
Push Constants Command
Raw 32-bit values are the input data for the VK_INDIRECT_COMMANDS_TOKEN_TYPE_PUSH_CONSTANT_EXT token.
Interpretation of the data is specified at layout creation time:
typedef struct VkIndirectCommandsPushConstantTokenEXT {
VkPushConstantRange updateRange;
} VkIndirectCommandsPushConstantTokenEXT;
- updateRange is the range of push constant data to update.
Sequence Index Command
There is a single uint32_t of placeholder data for the VK_INDIRECT_COMMANDS_TOKEN_TYPE_SEQUENCE_INDEX_EXT token which is not accessed by the shader. It writes a single 32-bit value containing the current sequence index to the specified push constant range.
Interpretation of the data is specified at layout creation time:
typedef struct VkIndirectCommandsPushConstantTokenEXT {
VkPushConstantRange updateRange;
} VkIndirectCommandsPushConstantTokenEXT;
- updateRange is the range of push constant data to update. updateRange.size must be 4.
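The effect of the sequence index token can be pictured with a host-side emulation. The sketch below is an assumption-laden illustration: the byte array stands in for push constant storage, and example_write_sequence_index is a hypothetical helper, not how any implementation is required to work.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Emulates the sequence index token for one sequence: the 32-bit index of
 * the sequence is written into the push constant range given by
 * rangeOffset and rangeSize (mirroring updateRange.offset and
 * updateRange.size). The placeholder word in the indirect buffer is ignored. */
void example_write_sequence_index(uint8_t *pushConstants,
                                  uint32_t rangeOffset,
                                  uint32_t rangeSize,
                                  uint32_t sequenceIndex)
{
    assert(rangeSize == 4);       /* updateRange.size must be 4 */
    assert(rangeOffset % 4 == 0); /* push constant offsets are multiples of 4 */
    memcpy(pushConstants + rangeOffset, &sequenceIndex, sizeof sequenceIndex);
}
```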
Bind Index Buffer Command
The VkBindIndexBufferIndirectCommandEXT structure specifies the input data for the VK_INDIRECT_COMMANDS_TOKEN_TYPE_INDEX_BUFFER_EXT token.
typedef struct VkBindIndexBufferIndirectCommandEXT {
VkDeviceAddress bufferAddress;
uint32_t size;
VkIndexType indexType;
} VkBindIndexBufferIndirectCommandEXT;
- bufferAddress specifies a physical address of the VkBuffer used as an index buffer.
- size is the byte size range which is available for this operation from the provided address.
- indexType is a VkIndexType value specifying how indices are treated. Instead of the Vulkan enum values, custom uint32_t values can be mapped to a VkIndexType as described below.
The index buffer is bound as specified at layout creation time:
typedef struct VkIndirectCommandsIndexBufferTokenEXT {
VkIndirectCommandsInputModeFlagsEXT mode;
} VkIndirectCommandsIndexBufferTokenEXT;
- mode is a single VkIndirectCommandsInputModeFlagBitsEXT value specifying the mode to be used with this token.
The VkIndirectCommandsInputModeFlagBitsEXT enum is defined as:
typedef enum VkIndirectCommandsInputModeFlagBitsEXT
{
VK_INDIRECT_COMMANDS_INPUT_MODE_VULKAN_INDEX_BUFFER_EXT = 0x00000001,
VK_INDIRECT_COMMANDS_INPUT_MODE_DXGI_INDEX_BUFFER_EXT = 0x00000002,
} VkIndirectCommandsInputModeFlagBitsEXT;
typedef VkFlags VkIndirectCommandsInputModeFlagsEXT;
- VK_INDIRECT_COMMANDS_INPUT_MODE_VULKAN_INDEX_BUFFER_EXT indicates that the indirect buffer contains VkBindIndexBufferIndirectCommandEXT.
- VK_INDIRECT_COMMANDS_INPUT_MODE_DXGI_INDEX_BUFFER_EXT indicates that the indirect buffer contains D3D12_INDEX_BUFFER_VIEW.
This allows for easy layering of Vulkan atop other APIs. When VK_INDIRECT_COMMANDS_INPUT_MODE_DXGI_INDEX_BUFFER_EXT is specified, the indirect buffer can contain a D3D12_INDEX_BUFFER_VIEW instead of VkBindIndexBufferIndirectCommandEXT as D3D’s DXGI format value is mapped to the VkIndexType. It works as both structs are otherwise binary compatible.
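The format mapping can be illustrated with a small lookup. This is a sketch of how the translation might be pictured, not a description of any driver: the helper name and EXAMPLE_ constants are hypothetical, the numeric values are those of DXGI_FORMAT_R16_UINT, DXGI_FORMAT_R32_UINT, VK_INDEX_TYPE_UINT16, and VK_INDEX_TYPE_UINT32, and limiting the mapping to these two formats is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Numeric values copied from the D3D and Vulkan headers so that the block
 * compiles without them. */
enum {
    EXAMPLE_DXGI_FORMAT_R32_UINT = 42,
    EXAMPLE_DXGI_FORMAT_R16_UINT = 57
};
enum {
    EXAMPLE_VK_INDEX_TYPE_UINT16 = 0,
    EXAMPLE_VK_INDEX_TYPE_UINT32 = 1
};

/* Maps the Format member of a D3D12_INDEX_BUFFER_VIEW to a VkIndexType value.
 * Returns 1 on success, or 0 if the format is not handled by this sketch. */
int example_index_type_from_dxgi(uint32_t dxgiFormat, uint32_t *indexType)
{
    switch (dxgiFormat) {
    case EXAMPLE_DXGI_FORMAT_R16_UINT:
        *indexType = EXAMPLE_VK_INDEX_TYPE_UINT16;
        return 1;
    case EXAMPLE_DXGI_FORMAT_R32_UINT:
        *indexType = EXAMPLE_VK_INDEX_TYPE_UINT32;
        return 1;
    default:
        return 0;
    }
}
```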
Bind Vertex Buffer Command
The VkBindVertexBufferIndirectCommandEXT structure specifies the input data for the VK_INDIRECT_COMMANDS_TOKEN_TYPE_VERTEX_BUFFER_EXT token.
typedef struct VkBindVertexBufferIndirectCommandEXT {
VkDeviceAddress bufferAddress;
uint32_t size;
uint32_t stride;
} VkBindVertexBufferIndirectCommandEXT;
- bufferAddress specifies a physical address of the VkBuffer used as a vertex input binding.
- size is the byte size range which is available for this operation from the provided address.
- stride is the byte size stride for this vertex input binding as in VkVertexInputBindingDescription::stride.
The vertex buffer is bound as specified at layout creation time:
typedef struct VkIndirectCommandsVertexBufferTokenEXT {
uint32_t vertexBindingUnit;
} VkIndirectCommandsVertexBufferTokenEXT;
- vertexBindingUnit is the vertex input binding number to be bound.
Both VkBindVertexBufferIndirectCommandEXT and D3D12_VERTEX_BUFFER_VIEW structs are binary compatible.
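The binary compatibility claim can be checked at compile time. The two structs below are local stand-ins (hypothetical names) that copy the member order and types of VkBindVertexBufferIndirectCommandEXT and D3D12_VERTEX_BUFFER_VIEW, with uint64_t standing in for both VkDeviceAddress and D3D12_GPU_VIRTUAL_ADDRESS, so that the sketch compiles without either SDK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mirrors VkBindVertexBufferIndirectCommandEXT. */
typedef struct ExampleVkBindVertexBuffer {
    uint64_t bufferAddress;
    uint32_t size;
    uint32_t stride;
} ExampleVkBindVertexBuffer;

/* Mirrors D3D12_VERTEX_BUFFER_VIEW. */
typedef struct ExampleD3D12VertexBufferView {
    uint64_t BufferLocation;
    uint32_t SizeInBytes;
    uint32_t StrideInBytes;
} ExampleD3D12VertexBufferView;

/* Same size and same member offsets: the bytes can be consumed unchanged. */
static_assert(sizeof(ExampleVkBindVertexBuffer) == sizeof(ExampleD3D12VertexBufferView),
              "sizes must match");
static_assert(offsetof(ExampleVkBindVertexBuffer, size) ==
              offsetof(ExampleD3D12VertexBufferView, SizeInBytes),
              "size offsets must match");
static_assert(offsetof(ExampleVkBindVertexBuffer, stride) ==
              offsetof(ExampleD3D12VertexBufferView, StrideInBytes),
              "stride offsets must match");

/* Reinterprets a D3D12-style view as the Vulkan-style command by copying bytes. */
ExampleVkBindVertexBuffer example_from_d3d12_view(const ExampleD3D12VertexBufferView *view)
{
    ExampleVkBindVertexBuffer cmd;
    memcpy(&cmd, view, sizeof cmd);
    return cmd;
}
```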
Draw Commands
Draws can be executed with the following commands:
- The VkDrawIndexedIndirectCommand structure specifies the input data for the VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_INDEXED_EXT token.
- The VkDrawIndirectCommand structure specifies the input data for the VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_EXT token.
- If EXT_mesh_shader is enabled, the VkDrawMeshTasksIndirectCommandEXT structure specifies the input data for the VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_MESH_TASKS_EXT token.
- If NV_mesh_shader is enabled, the VkDrawMeshTasksIndirectCommandNV structure specifies the input data for the VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_MESH_TASKS_NV_EXT token.
Multi-draw Commands
Multiple draws can be executed using the following commands:
- Indexed draws with the VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_INDEXED_COUNT_EXT token.
- Non-indexed draws with the VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_COUNT_EXT token.
- If EXT_mesh_shader is enabled, mesh tasks with the VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_MESH_TASKS_COUNT_EXT token.
- If NV_mesh_shader is enabled, mesh tasks with the VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_MESH_TASKS_COUNT_NV_EXT token.
- The DrawIndex shader variable is zero-indexed for each multi-draw token.
All multi-draw commands use VkDrawIndirectCountIndirectCommandEXT data:
typedef struct VkDrawIndirectCountIndirectCommandEXT {
VkDeviceAddress bufferAddress;
uint32_t stride;
uint32_t commandCount;
} VkDrawIndirectCountIndirectCommandEXT;
- bufferAddress specifies a physical address of the VkBuffer used for draw commands.
- stride is the byte size stride for the command arguments.
- commandCount is the number of commands to execute.
The data in bufferAddress depends on the token:
- VkDrawIndexedIndirectCommand for VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_INDEXED_COUNT_EXT.
- VkDrawIndirectCommand for VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_COUNT_EXT.
- VkDrawMeshTasksIndirectCommandEXT for VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_MESH_TASKS_COUNT_EXT.
- VkDrawMeshTasksIndirectCommandNV for VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_MESH_TASKS_COUNT_NV_EXT.
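A multi-draw token is effectively a second level of indirection: the sequence holds an address, a stride, and a count, and the draw arguments live in another buffer. The sketch below fills such a record for indexed draws and computes where each draw's arguments sit. All names are hypothetical stand-ins that mirror VkDrawIndirectCountIndirectCommandEXT and VkDrawIndexedIndirectCommand, and the tightly packed stride is an assumption; an application may choose a larger stride.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors VkDrawIndexedIndirectCommand (five 32-bit members, 20 bytes). */
typedef struct ExampleDrawIndexedArgs {
    uint32_t indexCount;
    uint32_t instanceCount;
    uint32_t firstIndex;
    int32_t  vertexOffset;
    uint32_t firstInstance;
} ExampleDrawIndexedArgs;

/* Mirrors VkDrawIndirectCountIndirectCommandEXT. */
typedef struct ExampleDrawCountCommand {
    uint64_t bufferAddress;
    uint32_t stride;
    uint32_t commandCount;
} ExampleDrawCountCommand;

/* Builds the per-sequence record for a run of tightly packed indexed draws. */
ExampleDrawCountCommand example_make_indexed_draw_count(uint64_t drawArgsAddress,
                                                        uint32_t drawCount)
{
    ExampleDrawCountCommand cmd;
    cmd.bufferAddress = drawArgsAddress;
    cmd.stride        = (uint32_t)sizeof(ExampleDrawIndexedArgs);
    cmd.commandCount  = drawCount;
    return cmd;
}

/* Address of the arguments for the draw with the given zero-based index. */
uint64_t example_draw_args_address(const ExampleDrawCountCommand *cmd, uint32_t drawIndex)
{
    assert(drawIndex < cmd->commandCount);
    return cmd->bufferAddress + (uint64_t)drawIndex * cmd->stride;
}
```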
Dispatch Command
The VkDispatchIndirectCommand structure specifies the input data for the VK_INDIRECT_COMMANDS_TOKEN_TYPE_DISPATCH_EXT token.
Trace Rays Command
If VK_KHR_ray_tracing_maintenance1 is enabled, the VkTraceRaysIndirectCommand2KHR structure specifies the input data for the VK_INDIRECT_COMMANDS_TOKEN_TYPE_TRACE_RAYS2_EXT token.
Preprocess Buffer
The generation of commands on the device may require a preprocess buffer. Implementations may use this for the storage of device-specific commands or scratch memory.
To retrieve the memory size and alignment requirements of a particular execution state call:
VKAPI_ATTR void VKAPI_CALL vkGetGeneratedCommandsMemoryRequirementsEXT(
VkDevice device,
const VkGeneratedCommandsMemoryRequirementsInfoEXT* pInfo,
VkMemoryRequirements2* pMemoryRequirements);
- device is the logical device that will create the buffer.
- pInfo is a pointer to a VkGeneratedCommandsMemoryRequirementsInfoEXT structure containing parameters required for the memory requirements query.
- pMemoryRequirements is a pointer to a VkMemoryRequirements2 structure in which the memory requirements of the buffer object are returned.
If pMemoryRequirements->memoryRequirements.size is zero then preprocessing is not required.
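The zero-size rule and the returned alignment could be consumed roughly as follows. This is a minimal host-side sketch, not driver or application code from this proposal: PreprocessRequirements is a stand-in for the size and alignment fields of VkMemoryRequirements2::memoryRequirements, and both helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the size/alignment pair returned in
 * VkMemoryRequirements2::memoryRequirements. */
typedef struct PreprocessRequirements {
    uint64_t size;
    uint64_t alignment;
} PreprocessRequirements;

/* A size of zero means no preprocess buffer is needed. */
int needs_preprocess_buffer(const PreprocessRequirements *req)
{
    return req->size != 0;
}

/* Rounds `cursor` up to the required alignment.
 * Assumes the alignment is a non-zero power of two. */
uint64_t place_preprocess_buffer(uint64_t cursor,
                                 const PreprocessRequirements *req)
{
    assert(req->alignment != 0 &&
           (req->alignment & (req->alignment - 1)) == 0);
    return (cursor + req->alignment - 1) & ~(req->alignment - 1);
}
```

An application that suballocates preprocess memory from a larger buffer would call place_preprocess_buffer only when needs_preprocess_buffer returns non-zero.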
The VkGeneratedCommandsMemoryRequirementsInfoEXT structure is defined as:
typedef struct VkGeneratedCommandsMemoryRequirementsInfoEXT {
VkStructureType sType;
const void* pNext;
VkIndirectExecutionSetEXT indirectExecutionSet;
VkIndirectCommandsLayoutEXT indirectCommandsLayout;
uint32_t maxSequenceCount;
uint32_t maxDrawCount;
} VkGeneratedCommandsMemoryRequirementsInfoEXT;
- shaderStages is the mask of shader stages that this buffer memory is intended to be used with during the execution.
- indirectExecutionSet is the indirect execution set to be used for binding shaders. If the token sequence will contain a VK_INDIRECT_COMMANDS_TOKEN_TYPE_EXECUTION_SET_EXT token, it must not be VK_NULL_HANDLE.
- indirectCommandsLayout is the VkIndirectCommandsLayoutEXT that this buffer memory is intended to be used with.
- maxSequenceCount is the maximum number of sequences that this buffer memory can be used with.
- maxDrawCount is the maximum number of indirect draws that can be executed by any COUNT-type multi-draw indirect tokens (equivalent to maxDrawCount in vkCmdDrawIndirectCount).
Preprocess buffer memory can be recycled with different execution/preprocessing operations, but must be synchronized using barriers with VK_PIPELINE_STAGE_COMMAND_PREPROCESS_BIT_EXT and the VK_ACCESS_COMMAND_PREPROCESS_WRITE_BIT_EXT and VK_ACCESS_COMMAND_PREPROCESS_READ_BIT_EXT access flags.
The contents and the layout of this buffer are opaque to applications and must not be modified or copied to another buffer for reuse.
If indirectExecutionSet is VK_NULL_HANDLE, pipeline or shader info must be passed through the pNext pointer using either a VkGeneratedCommandsPipelineInfoEXT or VkGeneratedCommandsShaderInfoEXT struct.
The VkGeneratedCommandsPipelineInfoEXT structure is defined as:
typedef struct VkGeneratedCommandsPipelineInfoEXT {
VkStructureType sType;
const void* pNext;
VkPipeline pipeline;
} VkGeneratedCommandsPipelineInfoEXT;
- pipeline is a pipeline comprised of shaders that are compatible with the ones which will be used with the resulting indirect buffer.
The VkGeneratedCommandsShaderInfoEXT structure is defined as:
typedef struct VkGeneratedCommandsShaderInfoEXT {
VkStructureType sType;
const void* pNext;
uint32_t shaderCount;
const VkShaderEXT* pShaders;
} VkGeneratedCommandsShaderInfoEXT;
- shaderCount is the number of members in the pShaders array.
- pShaders is a pointer to an array of shaders that are compatible with the ones which will be used with the resulting indirect buffer.
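The pNext requirement for a null indirectExecutionSet can be modelled with a small chain walk. The sketch below uses simplified stand-in structures and placeholder sType values rather than the real Vulkan definitions, and only covers the pipeline variant; a VkGeneratedCommandsShaderInfoEXT would be found the same way. The names find_in_chain and has_shader_source are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the Vulkan types; the sType values are placeholders,
 * not the real enumerants. */
typedef enum StructType {
    TYPE_MEMORY_REQUIREMENTS_INFO = 1,
    TYPE_PIPELINE_INFO            = 2
} StructType;

typedef struct BaseIn {
    StructType  sType;
    const void *pNext;
} BaseIn;

typedef struct PipelineInfo {
    StructType  sType;
    const void *pNext;
    uint64_t    pipeline; /* stand-in for VkPipeline */
} PipelineInfo;

typedef struct MemoryRequirementsInfo {
    StructType  sType;
    const void *pNext;
    uint64_t    indirectExecutionSet; /* 0 plays the role of VK_NULL_HANDLE */
    uint64_t    indirectCommandsLayout;
    uint32_t    maxSequenceCount;
    uint32_t    maxDrawCount;
} MemoryRequirementsInfo;

/* Walks the pNext chain looking for a structure of the given type. */
const void *find_in_chain(const void *chain, StructType wanted)
{
    for (const BaseIn *s = chain; s != NULL; s = s->pNext) {
        if (s->sType == wanted)
            return s;
    }
    return NULL;
}

/* Returns 1 when the query names its shaders either through an
 * execution set or through chained pipeline info. */
int has_shader_source(const MemoryRequirementsInfo *info)
{
    assert(info != NULL);
    return info->indirectExecutionSet != 0 ||
           find_in_chain(info->pNext, TYPE_PIPELINE_INFO) != NULL;
}
```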
Command Buffer
Synchronization
Synchronization of preprocessing via vkCmdPreprocessGeneratedCommandsEXT and generation/execution via vkCmdExecuteGeneratedCommandsEXT is supported with a new stage and access flags:
#define VK_PIPELINE_STAGE_COMMAND_PREPROCESS_BIT_EXT ((VkPipelineStageFlagBits)0x00020000)
#define VK_ACCESS_COMMAND_PREPROCESS_READ_BIT_EXT ((VkAccessFlagBits)0x00020000)
#define VK_ACCESS_COMMAND_PREPROCESS_WRITE_BIT_EXT ((VkAccessFlagBits)0x00040000)
- VK_PIPELINE_STAGE_COMMAND_PREPROCESS_BIT_EXT specifies the stage of the pipeline where device-side preprocessing for generated commands via vkCmdPreprocessGeneratedCommandsEXT is handled.
- VK_ACCESS_COMMAND_PREPROCESS_READ_BIT_EXT specifies reads from buffer inputs to vkCmdPreprocessGeneratedCommandsEXT. Such access occurs in the VK_PIPELINE_STAGE_COMMAND_PREPROCESS_BIT_EXT pipeline stage.
- VK_ACCESS_COMMAND_PREPROCESS_WRITE_BIT_EXT specifies writes to preprocess outputs from vkCmdPreprocessGeneratedCommandsEXT. Such access occurs in the VK_PIPELINE_STAGE_COMMAND_PREPROCESS_BIT_EXT pipeline stage.
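One plausible way to combine these flags when the same preprocess buffer is reused by two consecutive preprocessing operations is sketched below. The numeric values are copied from the definitions above; BarrierMasks is a stand-in for the stage and access fields of a memory barrier, and the choice of masks is an assumption made for illustration rather than a requirement stated by this proposal.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the definitions above. */
#define STAGE_COMMAND_PREPROCESS        0x00020000u
#define ACCESS_COMMAND_PREPROCESS_READ  0x00020000u
#define ACCESS_COMMAND_PREPROCESS_WRITE 0x00040000u

/* Stand-in for the stage/access fields of a memory barrier. */
typedef struct BarrierMasks {
    uint32_t srcStage, srcAccess;
    uint32_t dstStage, dstAccess;
} BarrierMasks;

/* Hypothetical helper: masks for reusing one preprocess buffer in two
 * back-to-back preprocessing operations. */
BarrierMasks preprocess_reuse_barrier(void)
{
    /* The read and write access bits are distinct flags. */
    assert((ACCESS_COMMAND_PREPROCESS_READ &
            ACCESS_COMMAND_PREPROCESS_WRITE) == 0);

    BarrierMasks b;
    b.srcStage  = STAGE_COMMAND_PREPROCESS;
    b.srcAccess = ACCESS_COMMAND_PREPROCESS_WRITE;
    b.dstStage  = STAGE_COMMAND_PREPROCESS;
    b.dstAccess = ACCESS_COMMAND_PREPROCESS_READ |
                  ACCESS_COMMAND_PREPROCESS_WRITE;
    return b;
}
```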
Generated Commands
Device-generated commands are specified by:
typedef struct VkGeneratedCommandsInfoEXT {
VkStructureType sType;
const void* pNext;
VkShaderStageFlags shaderStages;
VkIndirectExecutionSetEXT indirectExecutionSet;
VkIndirectCommandsLayoutEXT indirectCommandsLayout;
VkDeviceAddress indirectAddress;
VkDeviceSize indirectAddressSize;
VkDeviceAddress preprocessAddress;
VkDeviceSize preprocessSize;
uint32_t maxSequenceCount;
VkDeviceAddress sequenceCountAddress;
uint32_t maxDrawCount;
} VkGeneratedCommandsInfoEXT;
- shaderStages is the mask of shader stages used by the commands.
- indirectExecutionSet is the indirect execution set to be used for binding shaders. If the token sequence contains a VK_INDIRECT_COMMANDS_TOKEN_TYPE_EXECUTION_SET_EXT token, it must not be VK_NULL_HANDLE.
- indirectCommandsLayout is the VkIndirectCommandsLayoutEXT that specifies the command sequence data.
- indirectAddress is an address that holds the indirect buffer data.
- indirectAddressSize is the size of the address space that holds the indirect buffer data.
- preprocessAddress specifies a physical address of the VkBuffer used for preprocessing the input data for execution. It must not be 0 if vkGetGeneratedCommandsMemoryRequirementsEXT returns non-zero size.
- preprocessSize is the maximum byte size within the preprocessAddress that is available for preprocessing.
- maxSequenceCount is used to determine the number of sequences to execute. If sequenceCountAddress is not NULL, then maxSequenceCount is the maximum number of sequences that can be executed. The actual number is min(maxSequenceCount, *sequenceCountAddress). Otherwise if sequenceCountAddress is NULL, then maxSequenceCount is the exact number of sequences to execute.
- sequenceCountAddress specifies an optional physical address of a single uint32_t value containing the requested number of sequences to execute.
- maxDrawCount is the maximum number of indirect draws that can be executed by any COUNT-type multi-draw indirect tokens (equivalent to maxDrawCount in vkCmdDrawIndirectCount).
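The sequence count rule can be written out as a small host-side model. In the sketch below a pointer stands in for the value stored at sequenceCountAddress, with a null pointer modelling the case where no count address is supplied; the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Host-side model of the rule above. `countValue` stands in for the
 * uint32_t stored at sequenceCountAddress; NULL models the case where
 * no count address is provided. */
uint32_t effective_sequence_count(uint32_t maxSequenceCount,
                                  const uint32_t *countValue)
{
    if (countValue == NULL)
        return maxSequenceCount; /* exact number of sequences */

    /* min(maxSequenceCount, *sequenceCountAddress) */
    return *countValue < maxSequenceCount ? *countValue : maxSequenceCount;
}
```

With maxSequenceCount set to 10, a device-written count of 4 executes 4 sequences, while a count of 20 is clamped to 10.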
When preprocessing, if indirectExecutionSet is VK_NULL_HANDLE then pipeline or shader info must be passed through the pNext pointer using either a VkGeneratedCommandsPipelineInfoEXT or VkGeneratedCommandsShaderInfoEXT struct.
The actual generation of commands as well as their execution on the device is handled as a single action with:
VKAPI_ATTR void VKAPI_CALL vkCmdExecuteGeneratedCommandsEXT(
VkCommandBuffer commandBuffer,
VkBool32 isPreprocessed,
const VkGeneratedCommandsInfoEXT* pGeneratedCommandsInfo);
- commandBuffer is the command buffer into which the command is recorded.
- isPreprocessed represents whether the input data has been previously preprocessed on the device. If it is VK_TRUE, vkCmdPreprocessGeneratedCommandsEXT must have been previously called. If it is VK_FALSE, any necessary processing will be performed as part of this command.
- pGeneratedCommandsInfo is a pointer to a VkGeneratedCommandsInfoEXT structure containing parameters affecting the generation of commands.
All state affected by executed tokens is undefined after this command. The view mask of an active rendering pass must be zero.
Commands can be preprocessed prior to execution using the following command:
VKAPI_ATTR void VKAPI_CALL vkCmdPreprocessGeneratedCommandsEXT(
VkCommandBuffer commandBuffer,
const VkGeneratedCommandsInfoEXT* pGeneratedCommandsInfo,
VkCommandBuffer stateCommandBuffer);
- commandBuffer is the command buffer which does the preprocessing.
- pGeneratedCommandsInfo is a pointer to a VkGeneratedCommandsInfoEXT structure containing parameters affecting the preprocessing step.
- stateCommandBuffer is a command buffer from which to pull state affecting the preprocessing step.
Explicitly preprocessing the indirect buffer provides more control over the scheduling of work. If not performed, the implementation may still have additional work to do that is deferred to execution time.
The bound state in stateCommandBuffer must be identical to the state bound at the time vkCmdExecuteGeneratedCommandsEXT is recorded.
Features
The following features are exposed by this extension:
typedef struct VkPhysicalDeviceDeviceGeneratedCommandsFeaturesEXT
{
VkStructureType sType;
const void* pNext;
VkBool32 deviceGeneratedCommands;
VkBool32 dynamicGeneratedPipelineLayout;
} VkPhysicalDeviceDeviceGeneratedCommandsFeaturesEXT;
- deviceGeneratedCommands is the core feature enabling the extension.
- dynamicGeneratedPipelineLayout enables passing a VkPipelineLayoutCreateInfo in the pNext of VkIndirectCommandsLayoutCreateInfoEXT with a VK_NULL_HANDLE pipelineLayout.
Properties
The following properties are exposed by this extension:
typedef struct VkPhysicalDeviceDeviceGeneratedCommandsPropertiesEXT
{
VkStructureType sType;
const void* pNext;
uint32_t maxIndirectPipelineCount;
uint32_t maxIndirectShaderObjectCount;
uint32_t maxIndirectSequenceCount;
uint32_t maxIndirectCommandsTokenCount;
uint32_t maxIndirectCommandsTokenOffset;
uint32_t maxIndirectCommandsIndirectStride;
VkIndirectCommandsInputModeFlagsEXT supportedIndirectCommandsInputModes;
VkShaderStageFlags supportedIndirectCommandsShaderStages;
VkShaderStageFlags supportedIndirectCommandsShaderStagesPipelineBinding;
VkShaderStageFlags supportedIndirectCommandsShaderStagesShaderBinding;
VkBool32 deviceGeneratedCommandsTransformFeedback;
VkBool32 deviceGeneratedCommandsMultiDrawIndirectCount;
} VkPhysicalDeviceDeviceGeneratedCommandsPropertiesEXT;
The following limits affect indirect execution set creation:
- maxIndirectPipelineCount indicates the maximum number of pipelines that can be stored in an indirect execution set.
- maxIndirectShaderObjectCount indicates the maximum number of shader objects that can be stored in an indirect execution set.
- supportedIndirectCommandsShaderStagesPipelineBinding is a bitmask of the shader stages which can be used within indirect execution sets comprised of pipelines.
- supportedIndirectCommandsShaderStagesShaderBinding is a bitmask of the shader stages which can be used within indirect execution sets comprised of shader objects.
The following limits affect indirect command layout creation:
- maxIndirectCommandsTokenCount indicates the maximum number of tokens in a sequence.
- maxIndirectCommandsTokenOffset indicates the maximum byte offset of a token within a sequence.
- supportedIndirectCommandsInputModes indicates the supported index buffer modes.
The following limits affect indirect command execution:
- maxIndirectSequenceCount indicates the maximum number of sequences that can be executed.
- maxIndirectCommandsIndirectStride indicates the maximum stride that can be used for the indirect buffer.
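An application might check a planned submission against the layout and execution limits before creating any objects. The sketch below is one way to do that under stated assumptions: DgcLimits redeclares a subset of the properties structure locally so that it builds without the Vulkan headers, and DgcRequest and request_fits_limits are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Subset of VkPhysicalDeviceDeviceGeneratedCommandsPropertiesEXT,
 * redeclared locally so the sketch builds without vulkan.h. */
typedef struct DgcLimits {
    uint32_t maxIndirectSequenceCount;
    uint32_t maxIndirectCommandsTokenCount;
    uint32_t maxIndirectCommandsTokenOffset;
    uint32_t maxIndirectCommandsIndirectStride;
} DgcLimits;

/* Hypothetical description of what the application wants to submit. */
typedef struct DgcRequest {
    uint32_t sequenceCount;
    uint32_t tokenCount;
    uint32_t largestTokenOffset;
    uint32_t stride;
} DgcRequest;

/* Returns 1 when every value is within the reported limits. */
int request_fits_limits(const DgcLimits *limits, const DgcRequest *req)
{
    assert(limits != NULL && req != NULL);
    return req->sequenceCount      <= limits->maxIndirectSequenceCount &&
           req->tokenCount         <= limits->maxIndirectCommandsTokenCount &&
           req->largestTokenOffset <= limits->maxIndirectCommandsTokenOffset &&
           req->stride             <= limits->maxIndirectCommandsIndirectStride;
}
```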
If VK_EXT_transform_feedback is also enabled, deviceGeneratedCommandsTransformFeedback enables the use of Transform Feedback with indirect execution.
supportedIndirectCommandsShaderStages is a bitmask of the shader stages which can be active while executing indirect commands as well as the use of certain tokens.
VK_INDIRECT_COMMANDS_TOKEN_TYPE_PUSH_CONSTANT_EXT and VK_INDIRECT_COMMANDS_TOKEN_TYPE_SEQUENCE_INDEX_EXT are always supported for the specified stages.
VK_SHADER_STAGE_VERTEX_BIT | VK_SHADER_STAGE_FRAGMENT_BIT enables use of these tokens:
- VK_INDIRECT_COMMANDS_TOKEN_TYPE_INDEX_BUFFER_EXT
- VK_INDIRECT_COMMANDS_TOKEN_TYPE_VERTEX_BUFFER_EXT
- VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_INDEXED_EXT
- VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_EXT
- VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_INDEXED_COUNT_EXT if deviceGeneratedCommandsMultiDrawIndirectCount is supported
- VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_COUNT_EXT if deviceGeneratedCommandsMultiDrawIndirectCount is supported
If the VK_EXT_mesh_shader extension is also enabled, VK_SHADER_STAGE_FRAGMENT_BIT | VK_SHADER_STAGE_MESH_BIT_EXT enables use of these tokens:
- VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_MESH_TASKS_EXT
- VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_MESH_TASKS_COUNT_EXT if deviceGeneratedCommandsMultiDrawIndirectCount is supported
If the VK_NV_mesh_shader extension is also enabled, VK_SHADER_STAGE_FRAGMENT_BIT | VK_SHADER_STAGE_MESH_BIT_NV enables use of these tokens:
- VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_MESH_TASKS_NV_EXT
- VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_MESH_TASKS_COUNT_NV_EXT if deviceGeneratedCommandsMultiDrawIndirectCount is supported
VK_SHADER_STAGE_COMPUTE_BIT enables use of these tokens:
VK_INDIRECT_COMMANDS_TOKEN_TYPE_DISPATCH_EXT
If VK_KHR_ray_tracing_maintenance1 is also enabled, the presence of ray tracing stages enables use of these tokens:
VK_INDIRECT_COMMANDS_TOKEN_TYPE_TRACE_RAYS2_EXT
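The stage-to-token rules above can be condensed into a lookup. The sketch below covers a representative subset and reads "A | B enables" as requiring both bits to be present in supportedIndirectCommandsShaderStages, which is an interpretation rather than wording from this proposal. The stage bit values match core Vulkan and VK_EXT_mesh_shader; the Token enumerants are placeholders, not the real token type values.

```c
#include <assert.h>
#include <stdint.h>

/* Shader stage bits with the same values as core Vulkan and
 * VK_EXT_mesh_shader. */
#define STAGE_VERTEX   0x00000001u
#define STAGE_FRAGMENT 0x00000010u
#define STAGE_COMPUTE  0x00000020u
#define STAGE_MESH_EXT 0x00000080u

/* Placeholder token identifiers; not the real enumerant values. */
typedef enum Token {
    TOKEN_DRAW,
    TOKEN_DRAW_COUNT,
    TOKEN_DRAW_MESH_TASKS,
    TOKEN_DISPATCH
} Token;

/* Returns 1 when every bit in `bits` is set in `mask`. */
int has_all(uint32_t mask, uint32_t bits)
{
    return (mask & bits) == bits;
}

/* Applies the rules listed above to a supported-stages mask.
 * `multiDrawCount` mirrors deviceGeneratedCommandsMultiDrawIndirectCount. */
int token_supported(Token token, uint32_t stages, int multiDrawCount)
{
    switch (token) {
    case TOKEN_DRAW:
        return has_all(stages, STAGE_VERTEX | STAGE_FRAGMENT);
    case TOKEN_DRAW_COUNT:
        return multiDrawCount &&
               has_all(stages, STAGE_VERTEX | STAGE_FRAGMENT);
    case TOKEN_DRAW_MESH_TASKS:
        return has_all(stages, STAGE_FRAGMENT | STAGE_MESH_EXT);
    case TOKEN_DISPATCH:
        return has_all(stages, STAGE_COMPUTE);
    }
    return 0;
}
```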
D3D12 Emulation
Argument Structures
Most structures have direct equivalents:
Binding of views or constants requires translation due to mismatches between the APIs.
Indirect Argument Type
Maps to VkIndirectCommandsTokenTypeEXT:
A root descriptor in D3D12 is a 64-bit virtual address to a raw buffer. To implement this, VK_INDIRECT_COMMANDS_TOKEN_TYPE_PUSH_CONSTANT_EXT tokens can be used to update buffer device addresses stored in push constants rather than interacting with the descriptor binding model. Similar techniques can be used to update non-root descriptors as well.
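The effect of such a push-constant token can be modelled on the host as a 64-bit store into the push-constant range. The sketch below is only an illustration of that idea: the 128-byte block size, the PushConstants type, and both helper names are assumptions, and a real implementation performs the write on the device while processing the token.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PUSH_CONSTANT_BYTES 128u /* illustrative size, not a queried limit */

/* Host-side model of a push-constant block. */
typedef struct PushConstants {
    unsigned char bytes[PUSH_CONSTANT_BYTES];
} PushConstants;

/* Stores a 64-bit buffer address at `offset`, the way a push-constant
 * token would when emulating a D3D12 root descriptor. */
void write_root_address(PushConstants *pc, uint32_t offset, uint64_t address)
{
    assert(offset % 4 == 0);
    assert(offset + sizeof address <= PUSH_CONSTANT_BYTES);
    memcpy(pc->bytes + offset, &address, sizeof address);
}

/* Reads the address back, as a shader would before dereferencing it. */
uint64_t read_root_address(const PushConstants *pc, uint32_t offset)
{
    uint64_t address;
    memcpy(&address, pc->bytes + offset, sizeof address);
    return address;
}
```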
VK_INDIRECT_COMMANDS_TOKEN_TYPE_SEQUENCE_INDEX_EXT can be used to mimic D3D12 DGC TIER_1_1 support.
Command Signature
- ByteStride is specified at execution time with VkGeneratedCommandsInfoEXT::indirectAddressRegion.stride.
- Set VkIndirectCommandsIndexBufferTokenEXT::mode to VK_INDIRECT_COMMANDS_INPUT_MODE_DXGI_INDEX_BUFFER_EXT to remap DXGI_FORMAT values.
Alignment
Alignment requirements:
- ByteStride is 4 byte aligned
- CountBufferOffset is 4 byte aligned
- ArgumentBufferOffset is 4 byte aligned
- tokenOffset is 4 byte aligned
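A translation layer could verify these four requirements with a check along the following lines. This is a minimal sketch; both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 when `value` is a multiple of 4 bytes. */
int is_4_byte_aligned(uint64_t value)
{
    return (value & 3u) == 0;
}

/* Checks the four quantities listed above in one call. */
int d3d12_arguments_aligned(uint32_t byteStride, uint64_t countBufferOffset,
                            uint64_t argumentBufferOffset, uint32_t tokenOffset)
{
    return is_4_byte_aligned(byteStride) &&
           is_4_byte_aligned(countBufferOffset) &&
           is_4_byte_aligned(argumentBufferOffset) &&
           is_4_byte_aligned(tokenOffset);
}
```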
Examples
TODO
Issues
UNRESOLVED: How will future commands be added?
New pointer members will be added to VkIndirectCommandsTokenDataEXT.
RESOLVED: Should additional state be included?
No additional state changes are permitted in order to enable fast and broad adoption.
RESOLVED: What shader stages or pipeline states should be allowed to change?
All implementation-supported shader stages can be changed indirectly. No pipeline state may be changed. Future extensions may expose additional functionality.
UNRESOLVED: Should indirect execution sets be merged with either Shader Binding Tables or Indirect Object Sets?
- Significant overlap in functionality with Shader Binding Tables
- Indirect Object Sets would allow for indirect dynamic state groups.
RESOLVED: Should additional alignment properties be added?
Recent extensions have been using fixed rather than queryable alignment rules. It makes sense to use fixed alignments here too.
RESOLVED: Should index type values be remappable?
D3D12_INDEX_BUFFER_VIEW and VkBindIndexBufferIndirectCommandEXT have the same memory layout but DXGI_FORMAT and VkIndexType do not have equivalent values. Providing the ability to remap index type values in the layout simplifies API emulation.
There is an explicit mapping from data values to VkIndexType.
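The value mismatch behind this resolution can be seen in a small host-side model of the remapping for 16-bit and 32-bit indices. The constants are redefined locally with the values used by DXGI and core Vulkan so the sketch builds without either set of headers; the function name is hypothetical, and with the DXGI input mode the implementation performs the equivalent translation itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local redefinitions with the values used by DXGI and core Vulkan. */
#define DXGI_FORMAT_R32_UINT 42u
#define DXGI_FORMAT_R16_UINT 57u
#define VK_INDEX_TYPE_UINT16 0u
#define VK_INDEX_TYPE_UINT32 1u

/* Translates the Format field of a D3D12_INDEX_BUFFER_VIEW.
 * Returns 1 and writes *indexType on success, 0 for other formats. */
int remap_dxgi_index_format(uint32_t dxgiFormat, uint32_t *indexType)
{
    assert(indexType != NULL);
    switch (dxgiFormat) {
    case DXGI_FORMAT_R16_UINT:
        *indexType = VK_INDEX_TYPE_UINT16;
        return 1;
    case DXGI_FORMAT_R32_UINT:
        *indexType = VK_INDEX_TYPE_UINT32;
        return 1;
    default:
        return 0; /* not an index format handled by this sketch */
    }
}
```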
RESOLVED: Should indirect buffers be reusable?
Yes, indirect buffers can be reused.
RESOLVED: How should commands with fewer than 32 bits of data be handled?
No such commands are provided.
RESOLVED: How should applications provide data to the preprocess command in order for drivers to optimize indirect execution?
A stateCommandBuffer is added to vkCmdPreprocessGeneratedCommandsEXT with the requirement that all state must match between this command buffer and the one used to record vkCmdExecuteGeneratedCommandsEXT.
This guarantees that all pipeline state and, specifically for draw commands, other state (e.g., vertex buffers, index buffers) is available at preprocess time.
Further Functionality
- Support for Multi-dispatch (needs something like gl_drawID for compute shaders).
- Multi-level indirect execution through a command that is equivalent to vkCmdExecuteGeneratedCommandsEXT.
- Indirect command buffers.
TODO
- Example section