VkVideoEncodeAV1CapabilitiesKHR
When calling vkGetPhysicalDeviceVideoCapabilitiesKHR to query the
capabilities for an AV1 encode profile, the
VkVideoCapabilitiesKHR::pNext
chain must include a
VkVideoEncodeAV1CapabilitiesKHR
structure that will be filled with the
profile-specific capabilities.
The VkVideoEncodeAV1CapabilitiesKHR
structure is defined as:
typedef struct VkVideoEncodeAV1CapabilitiesKHR {
VkStructureType sType;
void* pNext;
VkVideoEncodeAV1CapabilityFlagsKHR flags;
StdVideoAV1Level maxLevel;
VkExtent2D codedPictureAlignment;
VkExtent2D maxTiles;
VkExtent2D minTileSize;
VkExtent2D maxTileSize;
VkVideoEncodeAV1SuperblockSizeFlagsKHR superblockSizes;
uint32_t maxSingleReferenceCount;
uint32_t singleReferenceNameMask;
uint32_t maxUnidirectionalCompoundReferenceCount;
uint32_t maxUnidirectionalCompoundGroup1ReferenceCount;
uint32_t unidirectionalCompoundReferenceNameMask;
uint32_t maxBidirectionalCompoundReferenceCount;
uint32_t maxBidirectionalCompoundGroup1ReferenceCount;
uint32_t maxBidirectionalCompoundGroup2ReferenceCount;
uint32_t bidirectionalCompoundReferenceNameMask;
uint32_t maxTemporalLayerCount;
uint32_t maxSpatialLayerCount;
uint32_t maxOperatingPoints;
uint32_t minQIndex;
uint32_t maxQIndex;
VkBool32 prefersGopRemainingFrames;
VkBool32 requiresGopRemainingFrames;
VkVideoEncodeAV1StdFlagsKHR stdSyntaxFlags;
} VkVideoEncodeAV1CapabilitiesKHR;
sType
is a VkStructureType value identifying this structure.pNext
isNULL
or a pointer to a structure extending this structure.flags
is a bitmask of VkVideoEncodeAV1CapabilityFlagBitsKHR indicating supported AV1 encoding capabilities.maxLevel
is aStdVideoAV1Level
value indicating the maximum AV1 level supported by the profile, as defined in section A.3 of the AV1 Specification.codedPictureAlignment
indicates the alignment at which the implementation will code pictures. This capability does not impose any valid usage constraints on the application. However, depending on thecodedExtent
of the encode input picture resource, this capability may result in a change of the resolution of the encoded picture, as described in more detail below.maxTiles
indicates the maximum number of AV1 tile columns and rows the implementation supports.minTileSize
indicates the minimum extent of individual AV1 tiles the implementation supports.maxTileSize
indicates the maximum extent of individual AV1 tiles the implementation supports.superblockSizes
is a bitmask of VkVideoEncodeAV1SuperblockSizeFlagBitsKHR values indicating the supported AV1 superblock sizes.maxSingleReferenceCount
indicates the maximum number of reference pictures the implementation supports when using single reference prediction mode.singleReferenceNameMask
is a bitmask of supported AV1 reference names when using single reference prediction mode.maxUnidirectionalCompoundReferenceCount
indicates the maximum number of reference pictures the implementation supports when using unidirectional compound prediction mode.maxUnidirectionalCompoundGroup1ReferenceCount
indicates the maximum number of reference pictures the implementation supports when using unidirectional compound prediction mode from reference frame group 1, as defined in section 6.10.24 of the AV1 Specification.unidirectionalCompoundReferenceNameMask
is a bitmask of supported AV1 reference names when using unidirectional compound prediction mode.maxBidirectionalCompoundReferenceCount
indicates the maximum number of reference pictures the implementation supports when using bidirectional compound prediction mode.maxBidirectionalCompoundGroup1ReferenceCount
indicates the maximum number of reference pictures the implementation supports when using bidirectional compound prediction mode from reference frame group 1, as defined in section 6.10.24 of the AV1 Specification.maxBidirectionalCompoundGroup2ReferenceCount
indicates the maximum number of reference pictures the implementation supports when using bidirectional compound prediction mode from reference frame group 2, as defined in section 6.10.24 of the AV1 Specification.bidirectionalCompoundReferenceNameMask
is a bitmask of supported AV1 reference names when using bidirectional compound prediction mode.maxTemporalLayerCount
indicates the maximum number of AV1 temporal layers supported by the implementation.maxSpatialLayerCount
indicates the maximum number of AV1 spatial layers supported by the implementation.maxOperatingPoints
indicates the maximum number of AV1 operating points supported by the implementation.minQIndex
indicates the minimum quantizer index value supported.maxQIndex
indicates the maximum quantizer index value supported.prefersGopRemainingFrames
indicates that the implementation’s rate control algorithm prefers the application to specify the number of frames in each AV1 rate control group
remaining in the current group of pictures when beginning a video coding scope.requiresGopRemainingFrames
indicates that the implementation’s rate control algorithm requires the application to specify the number of frames in each AV1 rate control group
remaining in the current group of pictures when beginning a video coding scope.stdSyntaxFlags
is a bitmask of VkVideoEncodeAV1StdFlagBitsKHR indicating capabilities related to AV1 syntax elements.
singleReferenceNameMask
,
unidirectionalCompoundReferenceNameMask
, and
bidirectionalCompoundReferenceNameMask
are encoded such that when bit
index i is set, it indicates support for the
AV1 reference name
STD_VIDEO_AV1_REFERENCE_NAME_LAST_FRAME
+ i.
These masks indicate which elements of the referenceNameSlotIndices
member of VkVideoEncodeAV1PictureInfoKHR are supported to be used by
the implementation.
It is important to note that both the bits of these masks and the elements
of referenceNameSlotIndices
are indexed such that the first value
specifies the support bit and DPB slot index, respectively, for the AV1
reference name STD_VIDEO_AV1_REFERENCE_NAME_LAST_FRAME
(i.e. there is
no bit or element for STD_VIDEO_AV1_REFERENCE_NAME_INTRA_FRAME
).
codedPictureAlignment
provides information about implementation
limitations to encode arbitrary resolutions.
In particular, some implementations may not be able to generate bitstreams
aligned to the requirements of the AV1 Specification (8x8).
In such cases, the implementation may override the width and height of the bitstream, in order to produce a
bitstream compliant to the AV1 Specification.
If such an override occurs, the encoded resolution of the coded picture is
enlargened, with the texel values used for the texel coordinates outside of
the bounds of the codedExtent
of the encode input picture resource
being first governed by the rules regarding the
encode input picture granularity.
Any texel values outside of the region described by the encode input picture
granularity are implementation-defined.
Implementations should use well-defined values to minimize impact on the
produced encoded content.
This capability does not impose additional application requirements. However, these overrides change the effective resolution of the bitstream and add padding pixels. Applications sensitive to such overrides can use this capability and the corresponding override behavior to compute the cropping needed to reproduce the original input of the encoding and transmit it in a side channel (i.e. by using cropping fields available in a container). Additionally, applications can explicitly consider this alignment in their coded extent, to avoid implementation-defined texel values being included in the encoded content.
Valid Usage (Implicit)
VUID-VkVideoEncodeAV1CapabilitiesKHR-sType-sType
sType
must be VK_STRUCTURE_TYPE_VIDEO_ENCODE_AV1_CAPABILITIES_KHR