30. Device-Generated Commands

This chapter discusses the generation of command buffer content on the device, for which these principle steps are to be taken:

vkCmdPreprocessGeneratedCommandsNV executes in a separate logical pipeline from either graphics or compute. When preprocessing commands in a separate step they must be explicitly synchronized against the command execution. When not preprocessing, the preprocessing is automatically synchronized against the command execution.

30.1. Indirect Commands Layout

The device-side command generation happens through an iterative processing of an atomic sequence comprised of command tokens, which are represented by:

// Provided by VK_NV_device_generated_commands
VK_DEFINE_NON_DISPATCHABLE_HANDLE(VkIndirectCommandsLayoutNV)

30.1.1. Creation and Deletion

Indirect command layouts are created by:

// Provided by VK_NV_device_generated_commands
VkResult vkCreateIndirectCommandsLayoutNV(
    VkDevice                                    device,
    const VkIndirectCommandsLayoutCreateInfoNV* pCreateInfo,
    const VkAllocationCallbacks*                pAllocator,
    VkIndirectCommandsLayoutNV*                 pIndirectCommandsLayout);
  • device is the logical device that creates the indirect command layout.

  • pCreateInfo is a pointer to an instance of the VkIndirectCommandsLayoutCreateInfoNV structure containing parameters affecting creation of the indirect command layout.

  • pAllocator controls host memory allocation as described in the Memory Allocation chapter.

  • pIndirectCommandsLayout points to a VkIndirectCommandsLayoutNV handle in which the resulting indirect command layout is returned.

Valid Usage (Implicit)
Return Codes
Success
  • VK_SUCCESS

Failure
  • VK_ERROR_OUT_OF_HOST_MEMORY

  • VK_ERROR_OUT_OF_DEVICE_MEMORY

The VkIndirectCommandsLayoutCreateInfoNV structure is defined as:

// Provided by VK_NV_device_generated_commands
typedef struct VkIndirectCommandsLayoutCreateInfoNV {
    VkStructureType                           sType;
    const void*                               pNext;
    VkIndirectCommandsLayoutUsageFlagsNV      flags;
    VkPipelineBindPoint                       pipelineBindPoint;
    uint32_t                                  tokenCount;
    const VkIndirectCommandsLayoutTokenNV*    pTokens;
    uint32_t                                  streamCount;
    const uint32_t*                           pStreamStrides;
} VkIndirectCommandsLayoutCreateInfoNV;
  • sType is the type of this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • pipelineBindPoint is the VkPipelineBindPoint that this layout targets.

  • flags is a bitmask of VkIndirectCommandsLayoutUsageFlagBitsNV specifying usage hints of this layout.

  • tokenCount is the length of the individual command sequence.

  • pTokens is an array describing each command token in detail. See VkIndirectCommandsTokenTypeNV and VkIndirectCommandsLayoutTokenNV below for details.

  • streamCount is the number of streams used to provide the token inputs.

  • pStreamStrides is an array defining the byte stride for each input stream.

The following code illustrates some of the flags:

void cmdProcessAllSequences(cmd, pipeline, indirectCommandsLayout, pIndirectCommandsTokens, sequencesCount, indexbuffer, indexbufferOffset)
{
  for (s = 0; s < sequencesCount; s++)
  {
    sUsed = s;

    if (indirectCommandsLayout.flags & VK_INDIRECT_COMMANDS_LAYOUT_USAGE_INDEXED_SEQUENCES_BIT_NV) {
      sUsed = indexbuffer.load_uint32( sUsed * sizeof(uint32_t) + indexbufferOffset);
    }

    if (indirectCommandsLayout.flags & VK_INDIRECT_COMMANDS_LAYOUT_USAGE_UNORDERED_SEQUENCES_BIT_NV) {
      sUsed = incoherent_implementation_dependent_permutation[ sUsed ];
    }

    cmdProcessSequence( cmd, pipeline, indirectCommandsLayout, pIndirectCommandsTokens, sUsed );
  }
}
Valid Usage
  • The pipelineBindPoint must be VK_PIPELINE_BIND_POINT_GRAPHICS

  • tokenCount must be greater than 0 and less than or equal to VkPhysicalDeviceDeviceGeneratedCommandsPropertiesNV::maxIndirectCommandsTokenCount

  • If pTokens contains an entry of VK_INDIRECT_COMMANDS_TOKEN_TYPE_SHADER_GROUP_NV it must be the first element of the array and there must be only a single element of such token type

  • If pTokens contains an entry of VK_INDIRECT_COMMANDS_TOKEN_TYPE_STATE_FLAGS_NV there must be only a single element of such token type

  • All state tokens in pTokens must occur prior work provoking tokens (VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_NV, VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_INDEXED_NV, VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_TASKS_NV)

  • The content of pTokens must include one single work provoking token that is compatible with the pipelineBindPoint

  • streamCount must be greater than 0 and less or equal to VkPhysicalDeviceDeviceGeneratedCommandsPropertiesNV::maxIndirectCommandsStreamCount

  • each element of pStreamStrides must be greater than `0`and less than or equal to VkPhysicalDeviceDeviceGeneratedCommandsPropertiesNV::maxIndirectCommandsStreamStride. Furthermore the alignment of each token input must be ensured

Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_INDIRECT_COMMANDS_LAYOUT_CREATE_INFO_NV

  • pNext must be NULL

  • flags must be a valid combination of VkIndirectCommandsLayoutUsageFlagBitsNV values

  • flags must not be 0

  • pipelineBindPoint must be a valid VkPipelineBindPoint value

  • pTokens must be a valid pointer to an array of tokenCount valid VkIndirectCommandsLayoutTokenNV structures

  • pStreamStrides must be a valid pointer to an array of streamCount uint32_t values

  • tokenCount must be greater than 0

  • streamCount must be greater than 0

Bits which can be set in VkIndirectCommandsLayoutCreateInfoNV::flags, specifying usage hints of an indirect command layout, are:

// Provided by VK_NV_device_generated_commands
typedef enum VkIndirectCommandsLayoutUsageFlagBitsNV {
    VK_INDIRECT_COMMANDS_LAYOUT_USAGE_EXPLICIT_PREPROCESS_BIT_NV = 0x00000001,
    VK_INDIRECT_COMMANDS_LAYOUT_USAGE_INDEXED_SEQUENCES_BIT_NV = 0x00000002,
    VK_INDIRECT_COMMANDS_LAYOUT_USAGE_UNORDERED_SEQUENCES_BIT_NV = 0x00000004,
} VkIndirectCommandsLayoutUsageFlagBitsNV;
  • VK_INDIRECT_COMMANDS_LAYOUT_USAGE_EXPLICIT_PREPROCESS_BIT_NV specifies that the layout is always used with the manual preprocessing step through calling vkCmdPreprocessGeneratedCommandsNV and executed by vkCmdExecuteGeneratedCommandsNV with isPreprocessed set to VK_TRUE.

  • VK_INDIRECT_COMMANDS_LAYOUT_USAGE_INDEXED_SEQUENCES_BIT_NV specifies that the input data for the sequences is not implicitly indexed from 0..sequencesUsed but a user provided VkBuffer encoding the index is provided.

  • VK_INDIRECT_COMMANDS_LAYOUT_USAGE_UNORDERED_SEQUENCES_BIT_NV specifies that the processing of sequences can happen at an implementation-dependent order, which is not: guaranteed to be coherent using the same input data.

// Provided by VK_NV_device_generated_commands
typedef VkFlags VkIndirectCommandsLayoutUsageFlagsNV;

VkIndirectCommandsLayoutUsageFlagsNV is a bitmask type for setting a mask of zero or more VkIndirectCommandsLayoutUsageFlagBitsNV.

Indirect command layouts are destroyed by:

// Provided by VK_NV_device_generated_commands
void vkDestroyIndirectCommandsLayoutNV(
    VkDevice                                    device,
    VkIndirectCommandsLayoutNV                  indirectCommandsLayout,
    const VkAllocationCallbacks*                pAllocator);
  • device is the logical device that destroys the layout.

  • indirectCommandsLayout is the layout to destroy.

  • pAllocator controls host memory allocation as described in the Memory Allocation chapter.

Valid Usage
  • All submitted commands that refer to indirectCommandsLayout must have completed execution

  • If VkAllocationCallbacks were provided when indirectCommandsLayout was created, a compatible set of callbacks must be provided here

  • If no VkAllocationCallbacks were provided when indirectCommandsLayout was created, pAllocator must be NULL

  • The VkPhysicalDeviceDeviceGeneratedCommandsFeaturesNV::deviceGeneratedCommands feature must be enabled

Valid Usage (Implicit)
  • device must be a valid VkDevice handle

  • If indirectCommandsLayout is not VK_NULL_HANDLE, indirectCommandsLayout must be a valid VkIndirectCommandsLayoutNV handle

  • If pAllocator is not NULL, pAllocator must be a valid pointer to a valid VkAllocationCallbacks structure

  • If indirectCommandsLayout is a valid handle, it must have been created, allocated, or retrieved from device

Host Synchronization
  • Host access to indirectCommandsLayout must be externally synchronized

30.1.2. Token Input Streams

The VkIndirectCommandsStreamNV structure specifies the input data for one or more tokens at processing time.

// Provided by VK_NV_device_generated_commands
typedef struct VkIndirectCommandsStreamNV {
    VkBuffer        buffer;
    VkDeviceSize    offset;
} VkIndirectCommandsStreamNV;
  • buffer specifies the VkBuffer storing the functional arguments for each sequence. These arguments can be written by the device.

  • offset specified an offset into buffer where the arguments start.

Valid Usage
  • The buffer’s usage flag must have the VK_BUFFER_USAGE_INDIRECT_BUFFER_BIT bit set

  • The offset must be aligned to VkPhysicalDeviceDeviceGeneratedCommandsPropertiesNV::minIndirectCommandsBufferOffsetAlignment

  • If buffer is non-sparse then it must be bound completely and contiguously to a single VkDeviceMemory object

Valid Usage (Implicit)
  • buffer must be a valid VkBuffer handle

The input streams can contain raw uint32_t values, existing indirect commands such as:

or additional commands as listed below. How the data is used is described in the next section.

The VkBindShaderGroupIndirectCommandNV structure specifies the input data for the VK_INDIRECT_COMMANDS_TOKEN_TYPE_SHADER_GROUP_NV token.

// Provided by VK_NV_device_generated_commands
typedef struct VkBindShaderGroupIndirectCommandNV {
    uint32_t    groupIndex;
} VkBindShaderGroupIndirectCommandNV;
  • index specifies which shader group of the current bound graphics pipeline is used.

Valid Usage
  • The current bound graphics pipeline, as well as the pipelines it may reference, must have been created with VK_PIPELINE_CREATE_INDIRECT_BINDABLE_BIT_NV

  • The index must be within range of the accessible shader groups of the current bound graphics pipeline. See vkCmdBindPipelineShaderGroupNV for further details

The VkBindIndexBufferIndirectCommandNV structure specifies the input data for the VK_INDIRECT_COMMANDS_TOKEN_TYPE_INDEX_BUFFER_NV token.

// Provided by VK_NV_device_generated_commands
typedef struct VkBindIndexBufferIndirectCommandNV {
    VkDeviceAddress    bufferAddress;
    uint32_t           size;
    VkIndexType        indexType;
} VkBindIndexBufferIndirectCommandNV;
  • bufferAddress specifies a physical address of the VkBuffer used as index buffer.

  • size is the byte size range which is available for this operation from the provided address.

  • indexType is a VkIndexType value specifying how indices are treated. Instead of the Vulkan enum values, a custom uint32_t value can be mapped to an VkIndexType by specifying the VkIndirectCommandsLayoutTokenNV::pIndexTypes and VkIndirectCommandsLayoutTokenNV::pIndexTypeValues arrays.

Valid Usage
  • The buffer’s usage flag from which the address was acquired must have the VK_BUFFER_USAGE_INDEX_BUFFER_BIT bit set

  • The bufferAddress must be aligned to the indexType used

  • Each element of the buffer from which the address was acquired and that is non-sparse must be bound completely and contiguously to a single VkDeviceMemory object

Valid Usage (Implicit)

The VkBindVertexBufferIndirectCommandNV structure specifies the input data for the VK_INDIRECT_COMMANDS_TOKEN_TYPE_VERTEX_BUFFER_NV token.

// Provided by VK_NV_device_generated_commands
typedef struct VkBindVertexBufferIndirectCommandNV {
    VkDeviceAddress    bufferAddress;
    uint32_t           size;
    uint32_t           stride;
} VkBindVertexBufferIndirectCommandNV;
  • bufferAddress specifies a physical address of the VkBuffer used as vertex input binding.

  • size is the byte size range which is available for this operation from the provided address.

  • stride is the byte size stride for this vertex input binding as in VkVertexInputBindingDescription::stride. It is only used if VkIndirectCommandsLayoutTokenNV::vertexDynamicStride was set, otherwise the stride is inherited from the current bound graphics pipeline.

Valid Usage
  • The buffer’s usage flag from which the address was acquired must have the VK_BUFFER_USAGE_VERTEX_BUFFER_BIT bit set

  • Each element of the buffer from which the address was acquired and that is non-sparse must be bound completely and contiguously to a single VkDeviceMemory object

The VkSetStateFlagsIndirectCommandNV structure specifies the input data for the VK_INDIRECT_COMMANDS_TOKEN_TYPE_STATE_FLAGS_NV token. Which state is changed depends on the VkIndirectStateFlagBitsNV specified at VkIndirectCommandsLayoutNV creation time.

// Provided by VK_NV_device_generated_commands
typedef struct VkSetStateFlagsIndirectCommandNV {
    uint32_t    data;
} VkSetStateFlagsIndirectCommandNV;
  • data encodes packed state that this command alters.

    • Bit 0: If set represents VK_FRONT_FACE_CLOCKWISE, otherwise VK_FRONT_FACE_COUNTER_CLOCKWISE

A subset of the graphics pipeline state can be altered using indirect state flags:

// Provided by VK_NV_device_generated_commands
typedef enum VkIndirectStateFlagBitsNV {
    VK_INDIRECT_STATE_FLAG_FRONTFACE_BIT_NV = 0x00000001,
} VkIndirectStateFlagBitsNV;
  • VK_INDIRECT_STATE_FLAG_FRONTFACE_BIT_NV allows to toggle the VkFrontFace rasterization state for subsequent draw operations.

// Provided by VK_NV_device_generated_commands
typedef VkFlags VkIndirectStateFlagsNV;

VkIndirectStateFlagsNV is a bitmask type for setting a mask of zero or more VkIndirectStateFlagBitsNV.

30.1.3. Tokenized Command Processing

The processing is in principle illustrated below:

void cmdProcessSequence(cmd, pipeline, indirectCommandsLayout, pIndirectCommandsStreams, s)
{
  for (t = 0; t < indirectCommandsLayout.tokenCount; t++)
  {
    uint32_t stream  = indirectCommandsLayout.pTokens[t].stream;
    uint32_t offset  = indirectCommandsLayout.pTokens[t].offset;
    uint32_t stride  = indirectCommandsLayout.pStreamStrides[stream];
    stream            = pIndirectCommandsStreams[stream];
    const void* input = stream.buffer.pointer( stream.offset + stride * s + offset )

    // further details later
    indirectCommandsLayout.pTokens[t].command (cmd, pipeline, input, s);
  }
}

void cmdProcessAllSequences(cmd, pipeline, indirectCommandsLayout, pIndirectCommandsStreams, sequencesCount)
{
  for (s = 0; s < sequencesCount; s++)
  {
    cmdProcessSequence(cmd, pipeline, indirectCommandsLayout, pIndirectCommandsStreams, s);
  }
}

The processing of each sequence is considered stateless, therefore all state changes must occur prior work provoking commands within the sequence. A single sequence is strictly targeting the VkPipelineBindPoint it was created with.

The primary input data for each token is provided through VkBuffer content at preprocessing using vkCmdPreprocessGeneratedCommandsNV or execution time using vkCmdExecuteGeneratedCommandsNV, however some functional arguments, for example binding sets, are specified at layout creation time. The input size is different for each token.

Possible values of those elements of the VkIndirectCommandsLayoutCreateInfoNV::pTokens array which specify command tokens (other elements of the array specify command parameters) are:

// Provided by VK_NV_device_generated_commands
typedef enum VkIndirectCommandsTokenTypeNV {
    VK_INDIRECT_COMMANDS_TOKEN_TYPE_SHADER_GROUP_NV = 0,
    VK_INDIRECT_COMMANDS_TOKEN_TYPE_STATE_FLAGS_NV = 1,
    VK_INDIRECT_COMMANDS_TOKEN_TYPE_INDEX_BUFFER_NV = 2,
    VK_INDIRECT_COMMANDS_TOKEN_TYPE_VERTEX_BUFFER_NV = 3,
    VK_INDIRECT_COMMANDS_TOKEN_TYPE_PUSH_CONSTANT_NV = 4,
    VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_INDEXED_NV = 5,
    VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_NV = 6,
    VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_TASKS_NV = 7,
} VkIndirectCommandsTokenTypeNV;
Table 41. Supported indirect command tokens
Token type Equivalent command

VK_INDIRECT_COMMANDS_TOKEN_TYPE_SHADER_GROUP_NV

vkCmdBindPipelineShaderGroupNV

VK_INDIRECT_COMMANDS_TOKEN_TYPE_STATE_FLAGS_NV

-

VK_INDIRECT_COMMANDS_TOKEN_TYPE_INDEX_BUFFER_NV

vkCmdBindIndexBuffer

VK_INDIRECT_COMMANDS_TOKEN_TYPE_VERTEX_BUFFER_NV

vkCmdBindVertexBuffers

VK_INDIRECT_COMMANDS_TOKEN_TYPE_PUSH_CONSTANT_NV

vkCmdPushConstants

VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_INDEXED_NV

vkCmdDrawIndexedIndirect

VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_NV

vkCmdDrawIndirect

VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_TASKS_NV

vkCmdDrawMeshTasksIndirectNV

The VkIndirectCommandsLayoutTokenNV structure specifies details to the function arguments that need to be known at layout creation time:

// Provided by VK_NV_device_generated_commands
typedef struct VkIndirectCommandsLayoutTokenNV {
    VkStructureType                  sType;
    const void*                      pNext;
    VkIndirectCommandsTokenTypeNV    tokenType;
    uint32_t                         stream;
    uint32_t                         offset;
    uint32_t                         vertexBindingUnit;
    VkBool32                         vertexDynamicStride;
    VkPipelineLayout                 pushconstantPipelineLayout;
    VkShaderStageFlags               pushconstantShaderStageFlags;
    uint32_t                         pushconstantOffset;
    uint32_t                         pushconstantSize;
    VkIndirectStateFlagsNV           indirectStateFlags;
    uint32_t                         indexTypeCount;
    const VkIndexType*               pIndexTypes;
    const uint32_t*                  pIndexTypeValues;
} VkIndirectCommandsLayoutTokenNV;
  • sType is the type of this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • tokenType specifies the token command type.

  • stream is the index of the input stream that contains the token argument data.

  • offset is a relative starting offset within the input stream memory for the token argument data.

  • vertexBindingUnit is used for the vertex buffer binding command.

  • vertexDynamicStride sets if the vertex buffer stride is provided by the binding command rather than the current bound graphics pipeline state.

  • pushconstantPipelineLayout is the VkPipelineLayout used for the push constant command.

  • pushconstantShaderStageFlags are the shader stage flags used for the push constant command.

  • pushconstantOffset is the offset used for the push constant command.

  • pushconstantSize is the size used for the push constant command.

  • indirectStateFlags are the active states for the state flag command.

  • indexTypeCount is the optional size of the pIndexTypes and pIndexTypeValues array pairings. If not zero, it allows to register a custom uint32_t value to be treated as specific VkIndexType.

  • pIndexTypes is the used VkIndexType for the corresponding uint32_t value entry in pIndexTypeValues.

Valid Usage
  • stream must be smaller than VkIndirectCommandsLayoutCreateInfoNV::streamCount

  • offset must be less than or equal to VkPhysicalDeviceDeviceGeneratedCommandsPropertiesNV::maxIndirectCommandsTokenOffset

  • If tokenType is VK_INDIRECT_COMMANDS_TOKEN_TYPE_VERTEX_BUFFER_NV, vertexBindingUnit must stay within device supported limits for the appropriate commands

  • If tokenType is VK_INDIRECT_COMMANDS_TOKEN_TYPE_PUSH_CONSTANT_NV, pushconstantPipelineLayout must be valid

  • If tokenType is VK_INDIRECT_COMMANDS_TOKEN_TYPE_PUSH_CONSTANT_NV, pushconstantOffset must be a multiple of 4

  • If tokenType is VK_INDIRECT_COMMANDS_TOKEN_TYPE_PUSH_CONSTANT_NV, pushconstantSize must be a multiple of 4

  • If tokenType is VK_INDIRECT_COMMANDS_TOKEN_TYPE_PUSH_CONSTANT_NV, pushconstantOffset must be less than VkPhysicalDeviceLimits::maxPushConstantsSize

  • If tokenType is VK_INDIRECT_COMMANDS_TOKEN_TYPE_PUSH_CONSTANT_NV, pushconstantSize must be less than or equal to VkPhysicalDeviceLimits::maxPushConstantsSize minus pushconstantOffset

  • If tokenType is VK_INDIRECT_COMMANDS_TOKEN_TYPE_PUSH_CONSTANT_NV, for each byte in the range specified by pushconstantOffset and pushconstantSize and for each shader stage in pushconstantShaderStageFlags, there must be a push constant range in pushconstantPipelineLayout that includes that byte and that stage

  • If tokenType is VK_INDIRECT_COMMANDS_TOKEN_TYPE_PUSH_CONSTANT_NV, for each byte in the range specified by pushconstantOffset and pushconstantSize and for each push constant range that overlaps that byte, pushconstantShaderStageFlags must include all stages in that push constant range’s VkPushConstantRange::pushconstantShaderStageFlags

  • If tokenType is VK_INDIRECT_COMMANDS_TOKEN_TYPE_STATE_FLAGS_NV, indirectStateFlags must not be ´0´

Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_INDIRECT_COMMANDS_LAYOUT_TOKEN_NV

  • pNext must be NULL

  • tokenType must be a valid VkIndirectCommandsTokenTypeNV value

  • If pushconstantPipelineLayout is not VK_NULL_HANDLE, pushconstantPipelineLayout must be a valid VkPipelineLayout handle

  • pushconstantShaderStageFlags must be a valid combination of VkShaderStageFlagBits values

  • indirectStateFlags must be a valid combination of VkIndirectStateFlagBitsNV values

  • If indexTypeCount is not 0, pIndexTypes must be a valid pointer to an array of indexTypeCount valid VkIndexType values

  • If indexTypeCount is not 0, pIndexTypeValues must be a valid pointer to an array of indexTypeCount uint32_t values

The following code provides detailed information on how an individual sequence is processed. For valid usage, all restrictions from the regular commands apply.

void cmdProcessSequence(cmd, pipeline, indirectCommandsLayout, pIndirectCommandsStreams, s)
{
  for (uint32_t t = 0; t < indirectCommandsLayout.tokenCount; t++){
    token = indirectCommandsLayout.pTokens[t];

    uint32_t stride   = indirectCommandsLayout.pStreamStrides[token.stream];
    stream            = pIndirectCommandsStreams[token.stream];
    uint32_t offset   = stream.offset + stride * s + token.offset;
    const void* input = stream.buffer.pointer( offset )

    switch(input.type){
    VK_INDIRECT_COMMANDS_TOKEN_TYPE_SHADER_GROUP_NV:
      VkBindShaderGroupIndirectCommandNV* bind = input;

      vkCmdBindPipelineShaderGroupNV(cmd, indirectCommandsLayout.pipelineBindPoint,
        pipeline, bind->groupIndex);
    break;

    VK_INDIRECT_COMMANDS_TOKEN_TYPE_STATE_FLAGS_NV:
      VkSetStateFlagsIndirectCommandNV* state = input;

      if (token.indirectStateFlags & VK_INDIRECT_STATE_FLAG_FRONTFACE_BIT_NV){
        if (state.data & (1 << 0)){
          set VK_FRONT_FACE_CLOCKWISE;
        } else {
          set VK_FRONT_FACE_COUNTER_CLOCKWISE;
        }
      }
    break;

    VK_INDIRECT_COMMANDS_TOKEN_TYPE_PUSH_CONSTANT_NV:
      uint32_t* data = input;

      vkCmdPushConstants(cmd,
        token.pushconstantPipelineLayout
        token.pushconstantStageFlags,
        token.pushconstantOffset,
        token.pushconstantSize, data);
    break;

    VK_INDIRECT_COMMANDS_TOKEN_TYPE_INDEX_BUFFER_NV:
      VkBindIndexBufferIndirectCommandNV* data = input;

      // the indexType may optionally be remapped
      // from a custom uint32_t value, via
      // VkIndirectCommandsLayoutTokenNV::pIndexTypeValues

      vkCmdBindIndexBuffer(cmd,
        deriveBuffer(data->bufferAddress),
        deriveOffset(data->bufferAddress),
        data->indexType);
    break;

    VK_INDIRECT_COMMANDS_TOKEN_TYPE_VERTEX_BUFFER_NV:
      VkBindVertexBufferIndirectCommandNV* data = input;

      // if token.vertexDynamicStride is VK_TRUE
      // then the stride for this binding is set
      // using data->stride as well

      vkCmdBindVertexBuffers(cmd,
        token.vertexBindingUnit, 1,
        &deriveBuffer(data->bufferAddress),
        &deriveOffset(data->bufferAddress));
    break;

    VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_INDEXED_NV:
      vkCmdDrawIndexedIndirect(cmd,
        stream.buffer, offset, 1, 0);
    break;

    VK_INDIRECT_COMMANDS_TOKEN_TYPE_DRAW_NV:
      vkCmdDrawIndirect(cmd,
        stream.buffer,
        offset, 1, 0);
    break;

    // only available if VK_NV_mesh_shader is supported
    VK_INDIRECT_COMMANDS_TOKEN_TYPE_DISPATCH_NV:
      vkCmdDrawMeshTasksIndirectNV(cmd,
        stream.buffer, offset, 1, 0);
    break;
    }
  }
}

30.2. Indirect Commands Generation And Execution

The generation of commands on the device requires a preprocess buffer. To retrieve the memory size and alignment requirements of a particular execution state call:

// Provided by VK_NV_device_generated_commands
void vkGetGeneratedCommandsMemoryRequirementsNV(
    VkDevice                                    device,
    const VkGeneratedCommandsMemoryRequirementsInfoNV* pInfo,
    VkMemoryRequirements2*                      pMemoryRequirements);
  • device is the logical device that owns the buffer.

  • pInfo is a pointer to an instance of the VkGeneratedCommandsMemoryRequirementsInfoNV structure containing parameters required for the memory requirements query.

  • pMemoryRequirements points to an instance of the VkMemoryRequirements2 structure in which the memory requirements of the buffer object are returned.

Valid Usage (Implicit)
// Provided by VK_NV_device_generated_commands
typedef struct VkGeneratedCommandsMemoryRequirementsInfoNV {
    VkStructureType               sType;
    const void*                   pNext;
    VkPipelineBindPoint           pipelineBindPoint;
    VkPipeline                    pipeline;
    VkIndirectCommandsLayoutNV    indirectCommandsLayout;
    uint32_t                      maxSequencesCount;
} VkGeneratedCommandsMemoryRequirementsInfoNV;
  • sType is the type of this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • pipelineBindPoint is the VkPipelineBindPoint of the pipeline that this buffer memory is intended to be used with during the execution.

  • pipeline is the VkPipeline that this buffer memory is intended to be used with during the execution.

  • indirectCommandsLayout is the VkIndirectCommandsLayoutNV that this buffer memory is intended to be used with.

  • maxSequencesCount is the maximum number of sequences that this buffer memory in combination with the other state provided can be used with.

Valid Usage
Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_GENERATED_COMMANDS_MEMORY_REQUIREMENTS_INFO_NV

  • pNext must be NULL

  • pipelineBindPoint must be a valid VkPipelineBindPoint value

  • pipeline must be a valid VkPipeline handle

  • indirectCommandsLayout must be a valid VkIndirectCommandsLayoutNV handle

  • Both of indirectCommandsLayout, and pipeline must have been created, allocated, or retrieved from the same VkDevice

The actual generation of commands as well as their execution on the device is handled as single action with:

// Provided by VK_NV_device_generated_commands
void vkCmdExecuteGeneratedCommandsNV(
    VkCommandBuffer                             commandBuffer,
    VkBool32                                    isPreprocessed,
    const VkGeneratedCommandsInfoNV*            pGeneratedCommandsInfo);
  • commandBuffer is the command buffer into which the command is recorded.

  • isPreprocessed represents whether the input data has already been preprocessed on the device. If it is VK_FALSE this command will implicitly trigger the preprocessing step, otherwise not.

  • pGeneratedCommandsInfo is a pointer to an instance of the VkGeneratedCommandsInfoNV structure containing parameters affecting the generation of commands.

Valid Usage
  • If a VkImageView is sampled with VK_FILTER_LINEAR as a result of this command, then the image view’s format features must contain VK_FORMAT_FEATURE_SAMPLED_IMAGE_FILTER_LINEAR_BIT

  • If a VkImageView is accessed using atomic operations as a result of this command, then the image view’s format features must contain VK_FORMAT_FEATURE_STORAGE_IMAGE_ATOMIC_BIT

  • If a VkImageView is sampled with VK_FILTER_CUBIC_EXT as a result of this command, then the image view’s format features must contain VK_FORMAT_FEATURE_SAMPLED_IMAGE_FILTER_CUBIC_BIT_EXT

  • Any VkImageView being sampled with VK_FILTER_CUBIC_EXT as a result of this command must have a VkImageViewType and format that supports cubic filtering, as specified by VkFilterCubicImageViewImageFormatPropertiesEXT::filterCubic returned by vkGetPhysicalDeviceImageFormatProperties2

  • Any VkImageView being sampled with VK_FILTER_CUBIC_EXT with a reduction mode of either VK_SAMPLER_REDUCTION_MODE_MIN or VK_SAMPLER_REDUCTION_MODE_MAX as a result of this command must have a VkImageViewType and format that supports cubic filtering together with minmax filtering, as specified by VkFilterCubicImageViewImageFormatPropertiesEXT::filterCubicMinmax returned by vkGetPhysicalDeviceImageFormatProperties2

  • Any VkImage created with a VkImageCreateInfo::flags containing VK_IMAGE_CREATE_CORNER_SAMPLED_BIT_NV sampled as a result of this command must only be sampled using a VkSamplerAddressMode of VK_SAMPLER_ADDRESS_MODE_CLAMP_TO_EDGE

  • For each set n that is statically used by the VkPipeline bound to the pipeline bind point used by this command, a descriptor set must have been bound to n at the same pipeline bind point, with a VkPipelineLayout that is compatible for set n, with the VkPipelineLayout used to create the current VkPipeline, as described in Pipeline Layout Compatibility

  • For each push constant that is statically used by the VkPipeline bound to the pipeline bind point used by this command, a push constant value must have been set for the same pipeline bind point, with a VkPipelineLayout that is compatible for push constants, with the VkPipelineLayout used to create the current VkPipeline, as described in Pipeline Layout Compatibility

  • Descriptors in each bound descriptor set, specified via vkCmdBindDescriptorSets, must be valid if they are statically used by the VkPipeline bound to the pipeline bind point used by this command

  • A valid pipeline must be bound to the pipeline bind point used by this command

  • If the VkPipeline object bound to the pipeline bind point used by this command requires any dynamic state, that state must have been set for commandBuffer, and done so after any previously bound pipeline with the corresponding state not specified as dynamic

  • There must not have been any calls to dynamic state setting commands for any state not specified as dynamic in the VkPipeline object bound to the pipeline bind point used by this command, since that pipeline was bound

  • If the VkPipeline object bound to the pipeline bind point used by this command accesses a VkSampler object that uses unnormalized coordinates, that sampler must not be used to sample from any VkImage with a VkImageView of the type VK_IMAGE_VIEW_TYPE_3D, VK_IMAGE_VIEW_TYPE_CUBE, VK_IMAGE_VIEW_TYPE_1D_ARRAY, VK_IMAGE_VIEW_TYPE_2D_ARRAY or VK_IMAGE_VIEW_TYPE_CUBE_ARRAY, in any shader stage

  • If the VkPipeline object bound to the pipeline bind point used by this command accesses a VkSampler object that uses unnormalized coordinates, that sampler must not be used with any of the SPIR-V OpImageSample* or OpImageSparseSample* instructions with ImplicitLod, Dref or Proj in their name, in any shader stage

  • If the VkPipeline object bound to the pipeline bind point used by this command accesses a VkSampler object that uses unnormalized coordinates, that sampler must not be used with any of the SPIR-V OpImageSample* or OpImageSparseSample* instructions that includes a LOD bias or any offset values, in any shader stage

  • If the robust buffer access feature is not enabled, and if the VkPipeline object bound to the pipeline bind point used by this command accesses a uniform buffer, it must not access values outside of the range of the buffer as specified in the descriptor set bound to the same pipeline bind point

  • If the robust buffer access feature is not enabled, and if the VkPipeline object bound to the pipeline bind point used by this command accesses a storage buffer, it must not access values outside of the range of the buffer as specified in the descriptor set bound to the same pipeline bind point

  • If commandBuffer is an unprotected command buffer, any resource accessed by the VkPipeline object bound to the pipeline bind point used by this command must not be a protected resource

  • If a VkImageView is accessed using OpImageWrite as a result of this command, then the Type of the Texel operand of that instruction must have at least as many components as the image view’s format.

  • The current render pass must be compatible with the renderPass member of the VkGraphicsPipelineCreateInfo structure specified when creating the VkPipeline bound to VK_PIPELINE_BIND_POINT_GRAPHICS

  • The subpass index of the current render pass must be equal to the subpass member of the VkGraphicsPipelineCreateInfo structure specified when creating the VkPipeline bound to VK_PIPELINE_BIND_POINT_GRAPHICS

  • Every input attachment used by the current subpass must be bound to the pipeline via a descriptor set

  • Image subresources used as attachments in the current render pass must not be accessed in any way other than as an attachment by this command

  • If the draw is recorded in a render pass instance with multiview enabled, the maximum instance index must be less than or equal to VkPhysicalDeviceMultiviewProperties::maxMultiviewInstanceIndex

  • If the bound graphics pipeline was created with VkPipelineSampleLocationsStateCreateInfoEXT::sampleLocationsEnable set to VK_TRUE and the current subpass has a depth/stencil attachment, then that attachment must have been created with the VK_IMAGE_CREATE_SAMPLE_LOCATIONS_COMPATIBLE_DEPTH_BIT_EXT bit set

  • If the bound graphics pipeline state was created with the VK_DYNAMIC_STATE_VIEWPORT_WITH_COUNT_EXT dynamic state enabled, but not the VK_DYNAMIC_STATE_SCISSOR_WITH_COUNT_EXT dynamic state enabled, then then vkCmdSetViewportWithCountEXT must have been called in the current command buffer prior to this draw command, and the viewportCount parameter of vkCmdSetViewportWithCountEXT must match the VkPipelineViewportStateCreateInfo::scissorCount of the pipeline

  • If the bound graphics pipeline state was created with the VK_DYNAMIC_STATE_SCISSOR_WITH_COUNT_EXT dynamic state enabled, but not the VK_DYNAMIC_STATE_VIEWPORT_WITH_COUNT_EXT dynamic state enabled, then then vkCmdSetScissorWithCountEXT must have been called in the current command buffer prior to this draw command, and the scissorCount parameter of vkCmdSetScissorWithCountEXT must match the VkPipelineViewportStateCreateInfo::viewportCount of the pipeline

  • If the bound graphics pipeline state was created with both the VK_DYNAMIC_STATE_SCISSOR_WITH_COUNT_EXT and VK_DYNAMIC_STATE_VIEWPORT_WITH_COUNT_EXT dynamic states enabled then both vkCmdSetViewportWithCountEXT and vkCmdSetScissorWithCountEXT must have been called in the current command buffer prior to this draw command, and the viewportCount parameter of vkCmdSetViewportWithCountEXT must match the scissorCount parameter of vkCmdSetScissorWithCountEXT

  • If the bound graphics pipeline state was created with the VK_DYNAMIC_STATE_PRIMITIVE_TOPOLOGY_EXT dynamic state enabled then vkCmdSetPrimitiveTopologyEXT must have been called in the current command buffer prior to this draw command, and the primitiveTopology parameter of vkCmdSetPrimitiveTopologyEXT must be of the same topology class as the pipeline VkPipelineInputAssemblyStateCreateInfo::topology state

  • All vertex input bindings accessed via vertex input variables declared in the vertex shader entry point’s interface must have either valid or VK_NULL_HANDLE buffers bound

  • If the nullDescriptor feature is not enabled, all vertex input bindings accessed via vertex input variables declared in the vertex shader entry point’s interface must not be VK_NULL_HANDLE

  • For a given vertex buffer binding, any attribute data fetched must be entirely contained within the corresponding vertex buffer binding, as described in Vertex Input Description

  • commandBuffer must not be a protected command buffer

  • If isPreprocessed is VK_TRUE then vkCmdPreprocessGeneratedCommandsNV must have already been executed on the device, using the same pGeneratedCommandsInfo content as well as the content of the input buffers it references (all except VkGeneratedCommandsInfoNV::preprocessBuffer). Furthermore pGeneratedCommandsInfo`s indirectCommandsLayout must have been created with the VK_INDIRECT_COMMANDS_LAYOUT_USAGE_EXPLICIT_PREPROCESS_BIT_NV bit set

  • VkGeneratedCommandsInfoNV::pipeline must match the current bound pipeline at VkGeneratedCommandsInfoNV::pipelineBindPoint

  • Transform feedback must not be active

  • The VkPhysicalDeviceDeviceGeneratedCommandsFeaturesNV::deviceGeneratedCommands feature must be enabled

Valid Usage (Implicit)
  • commandBuffer must be a valid VkCommandBuffer handle

  • pGeneratedCommandsInfo must be a valid pointer to a valid VkGeneratedCommandsInfoNV structure

  • commandBuffer must be in the recording state

  • The VkCommandPool that commandBuffer was allocated from must support graphics, or compute operations

  • This command must only be called inside of a render pass instance

Host Synchronization
  • Host access to commandBuffer must be externally synchronized

  • Host access to the VkCommandPool that commandBuffer was allocated from must be externally synchronized

Command Properties
Command Buffer Levels Render Pass Scope Supported Queue Types Pipeline Type

Primary
Secondary

Inside

Graphics
Compute

// Provided by VK_NV_device_generated_commands
typedef struct VkGeneratedCommandsInfoNV {
    VkStructureType                      sType;
    const void*                          pNext;
    VkPipelineBindPoint                  pipelineBindPoint;
    VkPipeline                           pipeline;
    VkIndirectCommandsLayoutNV           indirectCommandsLayout;
    uint32_t                             streamCount;
    const VkIndirectCommandsStreamNV*    pStreams;
    uint32_t                             sequencesCount;
    VkBuffer                             preprocessBuffer;
    VkDeviceSize                         preprocessOffset;
    VkDeviceSize                         preprocessSize;
    VkBuffer                             sequencesCountBuffer;
    VkDeviceSize                         sequencesCountOffset;
    VkBuffer                             sequencesIndexBuffer;
    VkDeviceSize                         sequencesIndexOffset;
} VkGeneratedCommandsInfoNV;
  • sType is the type of this structure.

  • pNext is NULL or a pointer to a structure extending this structure.

  • pipelineBindPoint is the VkPipelineBindPoint used for the pipeline.

  • pipeline is the VkPipeline used in the generation and execution process.

  • indirectCommandsLayout is the VkIndirectCommandsLayoutNV that provides the command sequence to generate.

  • streamCount defines the number of input streams

  • pStreams provides an array of VkIndirectCommandsStreamNV that provide the input data for the tokens used in indirectCommandsLayout.

  • sequencesCount is the maximum number of sequences to reserve. If sequencesCountBuffer is VK_NULL_HANDLE, this is also the actual number of sequences generated.

  • preprocessBuffer is the VkBuffer that is used for preprocessing the input data for execution. If this structure is used with vkCmdExecuteGeneratedCommandsNV with its isPreprocessed set to VK_TRUE, then the preprocessing step is skipped and data is only read from this buffer.

  • preprocessOffset is the byte offset into preprocessBuffer where the preprocessed data is stored.

  • preprocessSize is the maximum byte size within the preprocessBuffer after the preprocessOffset that is available for preprocessing.

  • sequencesCountBuffer is a VkBuffer in which the actual number of sequences is provided as single uint32_t value.

  • sequencesCountOffset is the byte offset into sequencesCountBuffer where the count value is stored.

  • sequencesIndexBuffer is a VkBuffer that encodes the used sequence indices as uint32_t array.

  • sequencesIndexOffset is the byte offset into sequencesIndexBuffer where the index values start.

Valid Usage
  • The provided pipeline must match the pipeline bound at execution time

  • If the indirectCommandsLayout uses a token of VK_INDIRECT_COMMANDS_TOKEN_TYPE_SHADER_GROUP_NV, then the pipeline must have been created with multiple shader groups

  • If the indirectCommandsLayout uses a token of VK_INDIRECT_COMMANDS_TOKEN_TYPE_SHADER_GROUP_NV, then the pipeline must have been created with VK_PIPELINE_CREATE_INDIRECT_BINDABLE_BIT_NV set in VkGraphicsPipelineCreateInfo::flags

  • If the indirectCommandsLayout uses a token of VK_INDIRECT_COMMANDS_TOKEN_TYPE_PUSH_CONSTANT_NV, then the pipeline`s VkPipelineLayout must match the VkIndirectCommandsLayoutTokenNV::pushconstantPipelineLayout

  • streamCount must match the indirectCommandsLayout’s streamCount

  • sequencesCount must be less or equal to VkPhysicalDeviceDeviceGeneratedCommandsPropertiesNV::maxIndirectSequenceCount and VkGeneratedCommandsMemoryRequirementsInfoNV::maxSequencesCount that was used to determine the preprocessSize

  • preprocessBuffer must have the VK_BUFFER_USAGE_INDIRECT_BUFFER_BIT bit set in its usage flag

  • preprocessOffset must be aligned to VkPhysicalDeviceDeviceGeneratedCommandsPropertiesNV::minIndirectCommandsBufferOffsetAlignment

  • If preprocessBuffer is non-sparse then it must be bound completely and contiguously to a single VkDeviceMemory object

  • preprocessSize must be at least equal to the memory requirement`s size returned by vkGetGeneratedCommandsMemoryRequirementsNV using the matching inputs (indirectCommandsLayout, …​) as within this structure

  • sequencesCountBuffer can be set if the actual used count of sequences is sourced from the provided buffer. In that case the sequencesCount serves as upper bound

  • If sequencesCountBuffer is not VK_NULL_HANDLE, its usage flag must have the VK_BUFFER_USAGE_INDIRECT_BUFFER_BIT bit set

  • If sequencesCountBuffer is not VK_NULL_HANDLE, sequencesCountOffset must be aligned to VkPhysicalDeviceDeviceGeneratedCommandsPropertiesNV::minSequencesCountBufferOffsetAlignment

  • If sequencesCountBuffer is not VK_NULL_HANDLE and is non-sparse then it must be bound completely and contiguously to a single VkDeviceMemory object

  • If indirectCommandsLayout’s VK_INDIRECT_COMMANDS_LAYOUT_USAGE_INDEXED_SEQUENCES_BIT_NV is set, sequencesIndexBuffer must be set otherwise it must be VK_NULL_HANDLE

  • If sequencesIndexBuffer is not VK_NULL_HANDLE, its usage flag must have the VK_BUFFER_USAGE_INDIRECT_BUFFER_BIT bit set

  • If sequencesIndexBuffer is not VK_NULL_HANDLE, sequencesIndexOffset must be aligned to VkPhysicalDeviceDeviceGeneratedCommandsPropertiesNV::minSequencesIndexBufferOffsetAlignment

  • If sequencesIndexBuffer is not VK_NULL_HANDLE and is non-sparse then it must be bound completely and contiguously to a single VkDeviceMemory object

Valid Usage (Implicit)
  • sType must be VK_STRUCTURE_TYPE_GENERATED_COMMANDS_INFO_NV

  • pNext must be NULL

  • pipelineBindPoint must be a valid VkPipelineBindPoint value

  • pipeline must be a valid VkPipeline handle

  • indirectCommandsLayout must be a valid VkIndirectCommandsLayoutNV handle

  • pStreams must be a valid pointer to an array of streamCount valid VkIndirectCommandsStreamNV structures

  • preprocessBuffer must be a valid VkBuffer handle

  • If sequencesCountBuffer is not VK_NULL_HANDLE, sequencesCountBuffer must be a valid VkBuffer handle

  • If sequencesIndexBuffer is not VK_NULL_HANDLE, sequencesIndexBuffer must be a valid VkBuffer handle

  • streamCount must be greater than 0

  • Each of indirectCommandsLayout, pipeline, preprocessBuffer, sequencesCountBuffer, and sequencesIndexBuffer that are valid handles of non-ignored parameters must have been created, allocated, or retrieved from the same VkDevice

Referencing the functions defined in Indirect Commands Layout, vkCmdExecuteGeneratedCommandsNV behaves as:

uint32_t sequencesCount = sequencesCountBuffer ?
      min(maxSequencesCount, sequencesCountBuffer.load_uint32(sequencesCountOffset) :
      maxSequencesCount;


cmdProcessAllSequences(commandBuffer, pipeline,
                       indirectCommandsLayout, pIndirectCommandsStreams,
                       sequencesCount,
                       sequencesIndexBuffer, sequencesIndexOffset);

// The stateful commands within indirectCommandsLayout will not
// affect the state of subsequent commands in the target
// command buffer (cmd)
Note

It is important to note that the values of all state related to the pipelineBindPoint used are undefined after this command.

Commands can be preprocessed prior execution using the following command:

// Provided by VK_NV_device_generated_commands
void vkCmdPreprocessGeneratedCommandsNV(
    VkCommandBuffer                             commandBuffer,
    const VkGeneratedCommandsInfoNV*            pGeneratedCommandsInfo);
  • commandBuffer is the command buffer which does the preprocessing.

  • pGeneratedCommandsInfo is a pointer to an instance of the VkGeneratedCommandsInfoNV structure containing parameters affecting the preprocessing step.

Valid Usage
Valid Usage (Implicit)
  • commandBuffer must be a valid VkCommandBuffer handle

  • pGeneratedCommandsInfo must be a valid pointer to a valid VkGeneratedCommandsInfoNV structure

  • commandBuffer must be in the recording state

  • The VkCommandPool that commandBuffer was allocated from must support graphics, or compute operations

  • This command must only be called outside of a render pass instance

Host Synchronization
  • Host access to commandBuffer must be externally synchronized

  • Host access to the VkCommandPool that commandBuffer was allocated from must be externally synchronized

Command Properties
Command Buffer Levels Render Pass Scope Supported Queue Types Pipeline Type

Primary
Secondary

Outside

Graphics
Compute