triton.experimental.gluon.language.amd.AMDWMMALayout

class triton.experimental.gluon.language.amd.AMDWMMALayout(version: int, transposed: bool, warp_bases: List[List[int]], reg_bases: List[List[int]] | None = None, instr_shape: List[int] | None = None, cga_layout: List[List[int]] = <factory>, rank: int | None = None)

Represents a layout for AMD WMMA (matrix core) operations.

Parameters:

version (int) – WMMA version, which identifies the target GPU architecture (see the supported versions below).
transposed (bool) – Whether the result tensor is transposed.
warp_bases (List[List[int]]) – Warp bases for the CTA layout.
reg_bases (Optional[List[List[int]]]) – Repetition (register) bases for the CTA layout.
instr_shape (Optional[List[int]]) – Instruction shape (M, N, K). Defaults to (16, 16, 16).
cga_layout (List[List[int]]) – Bases describing CTA tiling.
rank (Optional[int]) – Rank of the warp and register bases. Defaults to 2 if omitted.
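
To make the role of the bases more concrete, the following is a minimal pure-Python sketch of how a list of bases can be read as a mapping from an index to a coordinate. It assumes the linear-layout convention, in which bit i of the index selects basis i and the selected bases are combined with XOR. The helper `apply_bases` and the example `warp_bases` value are hypothetical illustrations and are not part of the Gluon API. The sketch does not model `instr_shape`, `reg_bases`, or `cga_layout`.

```python
from typing import List


def apply_bases(index: int, bases: List[List[int]], rank: int = 2) -> List[int]:
    """Combine the bases selected by the set bits of `index` using XOR.

    Hypothetical helper for illustration only; it is not a Gluon function.
    """
    coord = [0] * rank
    for bit, basis in enumerate(bases):
        if (index >> bit) & 1:
            # Bit `bit` of the index is set, so basis `bit` contributes.
            coord = [c ^ b for c, b in zip(coord, basis)]
    return coord


# Hypothetical example: two bases describe 2**2 = 4 warps in a 2x2 arrangement.
warp_bases = [[0, 1], [1, 0]]
warp_coords = [apply_bases(w, warp_bases) for w in range(4)]

for warp_id, coord in enumerate(warp_coords):
    print(f"warp {warp_id} -> {coord}")
```

Under this reading, n bases describe 2**n warps (or register repetitions), and each basis has `rank` entries, one per tensor dimension.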

Currently supported versions:

1: RDNA3; e.g., gfx1100, gfx1101
2: RDNA4; e.g., gfx1200, gfx1201
3: gfx1250
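
The version list above can be sketched as a small lookup table when choosing the `version` argument from a target architecture name. The table name `WMMA_VERSION_BY_ARCH` and the helper `wmma_version_for` are hypothetical and are not part of Triton. The table contains only the architectures named above, which are given as examples, so it may not be exhaustive.

```python
# Architectures taken from the version list above; other targets may exist.
WMMA_VERSION_BY_ARCH = {
    "gfx1100": 1,  # RDNA3
    "gfx1101": 1,  # RDNA3
    "gfx1200": 2,  # RDNA4
    "gfx1201": 2,  # RDNA4
    "gfx1250": 3,
}


def wmma_version_for(arch: str) -> int:
    """Return the WMMA layout version for a known architecture name.

    Hypothetical helper for illustration; raises ValueError for any
    architecture that is not in the table.
    """
    try:
        return WMMA_VERSION_BY_ARCH[arch]
    except KeyError:
        raise ValueError(f"no known WMMA version for architecture {arch!r}") from None


print(wmma_version_for("gfx1201"))
```

The returned integer is what would be passed as `version` when constructing the layout.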

__init__(self, version: int, transposed: bool, warp_bases: List[List[int]], reg_bases: List[List[int]] | None = None, instr_shape: List[int] | None = None, cga_layout: List[List[int]] = <factory>, rank: int | None = None) → None

Methods

`__init__`(self, version, transposed, ...)
`format_hardware_view`(self, shape)
`format_tensor_view`(self, shape)
`mangle`(self)
`verify`(self)

Attributes

`instr_shape`
`rank`
`reg_bases`
`type`
`version`
`transposed`
`warp_bases`
`cga_layout`