GPU Performance API Interface

The GPU Performance API (GPA) interface gives access to GPU performance counters, streaming performance monitors (SPM), and thread traces (SQTT), which help analyze the performance and execution characteristics of applications.

GPU Performance API Objects

VkGpaSessionAMD - Opaque handle to a GPU Performance API object
vkCreateGpaSessionAMD - Create a new GPA session object
VkGpaSessionCreateInfoAMD - Structure specifying parameters of a newly created GPA session
vkDestroyGpaSessionAMD - Destroy a GPA session object

Beginning, ending, copying, and resetting a session

Performance counters are sampled between calls to vkCmdBeginGpaSessionAMD and vkCmdEndGpaSessionAMD. As long as they are executed in order, vkCmdBeginGpaSessionAMD and vkCmdEndGpaSessionAMD can span multiple command buffers.

If a session is reused after calling vkCmdEndGpaSessionAMD, the session must first be reset using vkResetGpaSessionAMD.

vkCmdBeginGpaSessionAMD - Begin a GPA session
vkCmdEndGpaSessionAMD - End a GPA session

Executing a secondary command buffer that records results into a session more than once causes each execution to overwrite the results of the previous one. To prevent results from being lost to subsequent executions, the results can be copied into another session.

vkCmdCopyGpaSessionResultsAMD - Copying GPA session results

The results are copied into gpaSession. The source of the copy is the GPA session handle provided in VkGpaSessionCreateInfoAMD when gpaSession was created.

vkResetGpaSessionAMD - Reset a GPA session

Resetting a session object has less overhead than destroying and then creating a new one.
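
The begin, end, and reset rules in this section can be pictured as a small state machine. The sketch below is a toy model only: ToySession, toy_begin, toy_end, and toy_reset are hypothetical names that do not exist in the interface, and the three-state model is an assumption drawn from the prose above rather than documented driver behavior.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of the session lifecycle. All names here are hypothetical
 * illustrations; none of them are part of the GPA interface. */
typedef enum {
    TOY_SESSION_RESET,   /* freshly created or reset: may be begun */
    TOY_SESSION_ACTIVE,  /* begun, not yet ended */
    TOY_SESSION_ENDED    /* ended: must be reset before reuse */
} ToySessionState;

typedef struct {
    ToySessionState state;
} ToySession;

/* Models vkCmdBeginGpaSessionAMD: allowed only on a reset session.
 * Returns 1 on success, 0 if the call would violate the rules. */
int toy_begin(ToySession *s)
{
    assert(s != NULL);
    if (s->state != TOY_SESSION_RESET) {
        return 0;
    }
    s->state = TOY_SESSION_ACTIVE;
    return 1;
}

/* Models vkCmdEndGpaSessionAMD: allowed only on an active session. */
int toy_end(ToySession *s)
{
    assert(s != NULL);
    if (s->state != TOY_SESSION_ACTIVE) {
        return 0;
    }
    s->state = TOY_SESSION_ENDED;
    return 1;
}

/* Models vkResetGpaSessionAMD: makes the session usable again. */
void toy_reset(ToySession *s)
{
    assert(s != NULL);
    s->state = TOY_SESSION_RESET;
}
```

In this model, a second toy_begin on an ended session is rejected until toy_reset is called, which mirrors the requirement to reset a session before reusing it.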

Beginning and ending sampling

Once a session has begun, samples can be captured during command buffer execution.

vkCmdBeginGpaSampleAMD - Beginning a sample
VkGpaSampleBeginInfoAMD - Structure specifying parameters of a GPA sample
VkGpaSampleTypeAMD - Enum providing the sample type
VkGpaSqShaderStageFlagBitsAMD - Bitmask specifying GPU shader stage to sample
VkGpaSqShaderStageFlagsAMD - Bitmask of VkGpaSqShaderStageFlagBitsAMD
VkGpaPerfCounterAMD - Structure specifying a performance counter to sample
VkGpaPerfBlockAMD - Enum providing performance counter types
vkCmdEndGpaSampleAMD - Ending a sample

Controlling GPU clocks

So that performance counters and thread traces can produce meaningful results, the interface provides device clock control and clock queries.

vkSetGpaDeviceClockModeAMD - Setting a device clock
VkGpaDeviceClockModeInfoAMD - Structure containing returned clock ratios or clock mode to set
VkGpaDeviceClockModeAMD - Enum providing the clock mode or query
vkGetGpaDeviceClockInfoAMD - Getting device clocks and ratios
VkGpaDeviceGetClockInfoAMD - Structure containing returned device clocks and ratios

Session status and results querying

vkGetGpaSessionStatusAMD - Getting the status of a GPA session

A return value of VK_SUCCESS indicates that the results are available to be read using vkGetGpaSessionResultsAMD. If results are not available, VK_NOT_READY is returned.
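
The status check above lends itself to a bounded polling loop. The following is a minimal sketch of that pattern against a mock, not against the real entry point: ToyStatusSession, toy_get_session_status, and wait_for_results are hypothetical, and the mock simply reports "not ready" a fixed number of times before reporting success.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for VK_SUCCESS and VK_NOT_READY. */
typedef enum {
    TOY_STATUS_READY = 0,
    TOY_STATUS_NOT_READY = 1
} ToyStatus;

/* Mock session: pendingPolls is how many more status queries will
 * report "not ready". A real session becomes ready when the GPU has
 * finished executing the recorded work. */
typedef struct {
    int pendingPolls;
} ToyStatusSession;

/* Mock of the status query described in the text (hypothetical). */
ToyStatus toy_get_session_status(ToyStatusSession *s)
{
    assert(s != NULL);
    if (s->pendingPolls > 0) {
        s->pendingPolls--;
        return TOY_STATUS_NOT_READY;
    }
    return TOY_STATUS_READY;
}

/* Poll at most maxPolls times. Returns 1 once results are available,
 * or 0 if they are still not ready after maxPolls queries. */
int wait_for_results(ToyStatusSession *s, int maxPolls)
{
    for (int i = 0; i < maxPolls; ++i) {
        if (toy_get_session_status(s) == TOY_STATUS_READY) {
            return 1;
        }
        /* A real application would wait on a fence or yield here
         * rather than spin. */
    }
    return 0;
}
```

Bounding the number of polls keeps the caller from spinning forever if the work that records the session is never submitted.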

vkGetGpaSessionResultsAMD - Getting the results of a GPA session

If pData is NULL, then the number of bytes of data in the results is returned in pSizeInBytes. Otherwise, pSizeInBytes must point to a variable set by the application to the size, in bytes, of the pData array, and on return the variable is overwritten with the number of bytes written to pData. If the value pointed to by pSizeInBytes is less than the size required to write the results, VK_INCOMPLETE is returned instead of VK_SUCCESS to indicate that the results were not written.
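
The size-query contract above is the familiar two-call idiom: ask for the size, allocate, then fetch. Below is a minimal sketch of that idiom against a mock that follows the rules as stated. toy_get_session_results and fetch_results are hypothetical, the real function's other parameters are omitted, and leaving the size variable unchanged on the incomplete path is an assumption, since the text does not specify it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for VK_SUCCESS and VK_INCOMPLETE. */
typedef enum {
    TOY_RESULTS_SUCCESS = 0,
    TOY_RESULTS_INCOMPLETE = 5
} ToyResultsCode;

/* Fake result payload held by the mock. */
static const unsigned char kToyResults[12] = {
    1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12
};

/* Mock of the results query contract (hypothetical, simplified). */
ToyResultsCode toy_get_session_results(size_t *pSizeInBytes, void *pData)
{
    assert(pSizeInBytes != NULL);
    const size_t required = sizeof kToyResults;

    if (pData == NULL) {
        /* Size query: report how many bytes the results occupy. */
        *pSizeInBytes = required;
        return TOY_RESULTS_SUCCESS;
    }
    if (*pSizeInBytes < required) {
        /* Buffer too small: nothing is written. Assumption: the size
         * variable is left unchanged on this path. */
        return TOY_RESULTS_INCOMPLETE;
    }
    memcpy(pData, kToyResults, required);
    *pSizeInBytes = required;  /* bytes actually written */
    return TOY_RESULTS_SUCCESS;
}

/* Caller side: query the size, allocate, then fetch the data.
 * Returns a malloc'd buffer the caller must free, or NULL on failure. */
void *fetch_results(size_t *outSize)
{
    assert(outSize != NULL);
    size_t size = 0;
    if (toy_get_session_results(&size, NULL) != TOY_RESULTS_SUCCESS) {
        return NULL;
    }
    void *buffer = malloc(size);
    if (buffer == NULL) {
        return NULL;
    }
    if (toy_get_session_results(&size, buffer) != TOY_RESULTS_SUCCESS) {
        free(buffer);
        return NULL;
    }
    *outSize = size;
    return buffer;
}
```

Checking the return code of the second call matters even after a successful size query, because an undersized buffer is reported only through the return value.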