triton.experimental.gluon.language.amd.AMDWMMALayout

class triton.experimental.gluon.language.amd.AMDWMMALayout(self, version: int, transposed: bool, warp_bases: List[List[int]], reg_bases: List[List[int]] | None = None, instr_shape: List[int] | None = None, cga_layout: List[List[int]] = <factory>, rank: int | None = None)

Represents a layout for AMD WMMA (matrix core) operations.

Parameters:
  • version (int) – WMMA version, which identifies the target GPU architecture. See the supported versions listed after the parameters.

  • transposed (bool) – Whether the result tensor is transposed.

  • warp_bases (List[List[int]]) – Warp bases for CTA layout.

  • reg_bases (Optional[List[List[int]]]) – Repetition (register) bases for CTA layout.

  • instr_shape (Optional[List[int]]) – Instruction shape (M, N, K). Defaults to (16, 16, 16).

  • cga_layout (Optional[List[List[int]]]) – Bases describing CTA tiling.

  • rank (Optional[int]) – Rank of the warp and register bases. Defaults to 2 if not given.
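The warp_bases and reg_bases parameters are lists of basis vectors, each with rank entries. The sketch below illustrates one way to read them, assuming they follow the linear-layout convention in which bit k of an index selects basis k and the selected bases are combined with XOR. The helper apply_bases and the example bases are hypothetical and are not part of the Gluon API; the sketch does not import Triton.

```python
from typing import List


def apply_bases(index: int, bases: List[List[int]], rank: int = 2) -> List[int]:
    """Map an index to a coordinate by XOR-ing the bases picked by its set bits.

    Hypothetical helper for illustration only; it assumes the linear-layout
    convention and is not a function provided by Triton.
    """
    if index < 0 or index >= (1 << len(bases)):
        raise ValueError(f"index {index} is out of range for {len(bases)} bases")
    coord = [0] * rank
    for bit, basis in enumerate(bases):
        if len(basis) != rank:
            raise ValueError(f"basis {basis} does not have rank {rank}")
        if (index >> bit) & 1:
            # Bit `bit` of the index is set, so basis `bit` contributes.
            coord = [c ^ b for c, b in zip(coord, basis)]
    return coord


# Example bases (an assumption): two bases describe 2**2 = 4 warps.
warp_bases = [[0, 1], [1, 0]]
warp_coords = [apply_bases(w, warp_bases) for w in range(4)]
print(warp_coords)
```

Under this reading, two warp bases describe four warps, and swapping the two bases changes which dimension consecutive warp indices advance along first.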

Currently supported versions:

  • 1: RDNA3; e.g., gfx1100, gfx1101

  • 2: RDNA4; e.g., gfx1200, gfx1201

  • 3: gfx1250
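When the target architecture is known by name, the version argument can be looked up from the list above. The following is a minimal sketch; the table WMMA_VERSION_BY_ARCH and the function wmma_version_for_arch are hypothetical names introduced here, not part of Triton. It covers only the architectures named on this page, although the "e.g." in the list suggests that other RDNA3 and RDNA4 targets exist.

```python
# Hypothetical lookup table built only from the architectures listed above.
WMMA_VERSION_BY_ARCH = {
    "gfx1100": 1,  # RDNA3
    "gfx1101": 1,  # RDNA3
    "gfx1200": 2,  # RDNA4
    "gfx1201": 2,  # RDNA4
    "gfx1250": 3,
}


def wmma_version_for_arch(arch: str) -> int:
    """Return the WMMA layout version for an architecture name.

    Any feature suffix after a colon (for example "gfx1100:xnack-") is
    ignored; stripping it is a convenience assumed by this sketch.
    """
    name = arch.split(":")[0].strip().lower()
    try:
        return WMMA_VERSION_BY_ARCH[name]
    except KeyError:
        raise ValueError(f"no WMMA version known for architecture {arch!r}") from None


print(wmma_version_for_arch("gfx1201"))
```

Raising on unknown names, rather than guessing from a prefix, keeps the sketch from assigning a version to an architecture this page does not mention.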

__init__(self, version: int, transposed: bool, warp_bases: List[List[int]], reg_bases: List[List[int]] | None = None, instr_shape: List[int] | None = None, cga_layout: List[List[int]] = <factory>, rank: int | None = None) -> None

Methods

__init__(self, version, transposed, ...)

format_hardware_view(self, shape)

format_tensor_view(self, shape)

mangle(self)

verify(self)

Attributes

instr_shape

rank

reg_bases

type

version

transposed

warp_bases

cga_layout