VK_KHR_copy_memory_indirect.proposal
This document details the VK_KHR_copy_memory_indirect extension which adds support for performing copies between memory and image regions using indirect parameters that are read directly from the device memory during execution.
Problem Statement
While it is possible to copy data between memory regions by using existing copy commands, sometimes the copy parameters may not be available at the time of command buffer creation. This is solved by a round-trip between host and device to fetch the copy parameters.
Solution Space
The alternative is to have an indirect version of the copy command that reads the copy parameters directly on the device and these copy parameters need not be known at command buffer creation time.
Proposal
This extension introduces two indirect commands: one for copying between memory regions and another for copying between image and memory regions, with the copy parameters being read from device memory during execution.
API Features
The following provides a basic overview of how this extension can be used:
Feature
The core functionality of this extension is enabled by indirectMemoryCopy and indirectMemoryToImageCopy, which
facilitate copying between memory regions and from memory to image regions, respectively:
typedef struct VkPhysicalDeviceCopyMemoryIndirectFeaturesKHR {
VkStructureType sType;
void* pNext;
VkBool32 indirectMemoryCopy;
VkBool32 indirectMemoryToImageCopy;
} VkPhysicalDeviceCopyMemoryIndirectFeaturesKHR;
Properties
The following property is exposed by this extension:
typedef struct VkPhysicalDeviceCopyMemoryIndirectPropertiesKHR {
VkStructureType sType;
void* pNext;
VkQueueFlags supportedQueues;
} VkPhysicalDeviceCopyMemoryIndirectPropertiesKHR;
supportedQueues is a bitmask of VkQueueFlagBits indicating the family of queues on which indirect copy commands are supported.
Commands
This extension provides two commands for copying between memory regions and from memory to image regions, detailed further in the examples section below:
VKAPI_ATTR void VKAPI_CALL vkCmdCopyMemoryIndirectKHR(
VkCommandBuffer commandBuffer,
const VkCopyMemoryIndirectInfoKHR* pCopyMemoryIndirectInfo);
VKAPI_ATTR void VKAPI_CALL vkCmdCopyMemoryToImageIndirectKHR(
VkCommandBuffer commandBuffer,
const VkCopyMemoryToImageIndirectInfoKHR* pCopyMemoryToImageIndirectInfo);
Examples
vkCmdCopyMemoryIndirectKHR can be used as:
void vkCmdCopyMemoryIndirectKHR(
VkCommandBuffer commandBuffer,
const VkCopyMemoryIndirectInfoKHR* pCopyMemoryIndirectInfo);
VkCopyMemoryIndirectInfoKHR is a structure describing the copy count, copy flags and the address range containing copy parameters:
typedef struct VkCopyMemoryIndirectInfoKHR {
VkStructureType sType;
const void* pNext;
VkAddressCopyFlagsKHR srcCopyFlags;
VkAddressCopyFlagsKHR dstCopyFlags;
uint32_t copyCount;
VkStridedDeviceAddressRangeKHR copyAddressRange;
} VkCopyMemoryIndirectInfoKHR;
srcCopyFlags and dstCopyFlags define copy flags that specify if memory regions are on device memory, are sparse or are of protected type. They are specified with below flags:
typedef enum VkAddressCopyFlagBitsKHR {
VK_ADDRESS_COPY_DEVICE_LOCAL_BIT_KHR = 0x00000001,
VK_ADDRESS_COPY_SPARSE_BIT_KHR = 0x00000002,
VK_ADDRESS_COPY_PROTECTED_BIT_KHR = 0x00000004,
} VkAddressCopyFlagBitsKHR;
copyAddressRange contains the copy parameters as an array of VkCopyMemoryIndirectCommandKHR structures that specify the source and destination copy addresses and size:
typedef struct VkCopyMemoryIndirectCommandKHR {
VkDeviceAddress srcAddress;
VkDeviceAddress dstAddress;
VkDeviceSize size;
} VkCopyMemoryIndirectCommandKHR;
Similarly, use vkCmdCopyMemoryToImageIndirectKHR to perform memory to image copies:
void vkCmdCopyMemoryToImageIndirectKHR(
VkCommandBuffer commandBuffer,
const VkCopyMemoryToImageIndirectInfoKHR* pCopyMemoryToImageIndirectInfo);
VkCopyMemoryToImageIndirectInfoKHR is a structure describing the source copy flag, copy count, the address range containing copy parameters and destination image properties:
typedef struct VkCopyMemoryToImageIndirectInfoKHR {
VkStructureType sType;
const void* pNext;
VkAddressCopyFlagsKHR srcCopyFlags;
uint32_t copyCount;
VkStridedDeviceAddressRangeKHR copyAddressRange;
VkImage dstImage;
VkImageLayout dstImageLayout;
const VkImageSubresourceLayers* pImageSubresources;
} VkCopyMemoryToImageIndirectInfoKHR;
copyAddressRange contains the memory to image copy parameters as an array of VkCopyMemoryToImageIndirectCommandKHR structures that specify the source copy address, destination copy image region and copy offsets/extent:
VkStridedDeviceAddressRangeKHR containing copy parameters and VkCopyMemoryToImageIndirectCommandKHR defining the copy regions are defined as:
typedef struct VkStridedDeviceAddressRangeKHR {
VkDeviceAddress address;
VkDeviceSize size;
VkDeviceSize stride;
} VkStridedDeviceAddressRangeKHR;
typedef struct VkCopyMemoryToImageIndirectCommandKHR {
VkDeviceAddress srcAddress;
uint32_t bufferRowLength;
uint32_t bufferImageHeight;
VkImageSubresourceLayers imageSubresource;
VkOffset3D imageOffset;
VkExtent3D imageExtent;
} VkCopyMemoryToImageIndirectCommandKHR;
Note that the values specified in device memory at imageSubresource must match the values specified in pImageSubresources parameter of VkCopyMemoryToImageIndirectInfoKHR during command recording.
Issues
Should we add copyCount to be also sourced from the GPU and use the minimum of the API specified and GPU value?
No. Though this falls in line with some of the other similar indirect API commands, this can add significant complexity for memory to image copies. So, the consensus is to not add it.