TritonAMDGPUOps

amdgpu.buffer_load (triton::amdgpu::BufferLoadOp)

Load from a scalar base pointer and a tensor of offsets

Syntax:

operation ::= `amdgpu.buffer_load` $ptr `[` $offsets `]` (`,` $mask^)? (`,` $other^)?
              attr-dict `:` type($result)

AMD Buffer load operation. Buffer load is similar to a normal load, but it accesses global memory via a scalar base pointer and a tensor of offsets instead of a tensor of pointers. The other fields are similar to a normal load, i.e., the mask is a boolean vector that determines whether a given element should be read from memory, and other is the element that should be returned on lane i when mask[i] == 0.
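The addressing and masking rules described above can be sketched as a small reference model in Python. This is only an illustration of the semantics, not the dialect's lowering: the flat `memory` list stands in for global memory, `buffer_load` is a hypothetical helper name, and offsets are assumed to be counted in elements rather than bytes. The text above does not say what a masked-off lane returns when `other` is omitted, so the sketch returns `None` there as a placeholder.

```python
from typing import List, Optional, Sequence


def buffer_load(
    memory: Sequence,
    base: int,
    offsets: Sequence[int],
    mask: Optional[Sequence[bool]] = None,
    other: Optional[Sequence] = None,
) -> List:
    """Reference model: one scalar base plus a per-lane offset.

    Assumptions (not from the op definition): `memory` is a flat Python
    sequence, offsets are in elements, and a masked-off lane without
    `other` yields None.
    """
    result = []
    for lane, offset in enumerate(offsets):
        active = True if mask is None else bool(mask[lane])
        if active:
            # Address is formed from the scalar base and this lane's offset.
            result.append(memory[base + offset])
        elif other is not None:
            # Masked-off lane: return the fallback element for this lane.
            result.append(other[lane])
        else:
            result.append(None)
    return result


memory = [10, 20, 30, 40, 50]
loaded = buffer_load(memory, 1, [0, 2, 3],
                     mask=[True, False, True], other=[-1, -1, -1])
print(loaded)  # [20, -1, 50]
```

Lane 1 is masked off, so memory is not read for it and the corresponding element of `other` is returned instead.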

Traits: AttrSizedOperandSegments, SameLoadStoreOperandsAndResultEncoding

Interfaces: MemoryEffectOpInterface (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{MemoryEffects::Read on ::mlir::triton::GlobalMemory}

Operands:

| Operand | Description |
| ------- | ----------- |
| ptr | ptr |
| offsets | tensor of 32-bit signless integer values |
| mask | ranked tensor of 1-bit signless integer values |
| other | ranked tensor of floating-point or integer or ptr values |

Results:

| Result | Description |
| ------ | ----------- |
| result | ranked tensor of floating-point or integer or ptr values |

amdgpu.buffer_store (triton::amdgpu::BufferStoreOp)

Store to a scalar base pointer and a tensor of offsets

Syntax:

operation ::= `amdgpu.buffer_store` $value `,` $ptr `[` $offsets `]` (`,` $mask^)?
              attr-dict `:` type($value)

AMD Buffer store operation. Buffer store is similar to a normal store, but it accesses global memory via a scalar base pointer and a tensor of offsets instead of a tensor of pointers. The other fields are similar to a normal store, i.e., the mask is a boolean vector that determines whether a given element should be written to memory, and value is the tensor of elements to write: lane i writes value[i] when mask[i] == 1.
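The store semantics described above can be modeled the same way. As a hedged sketch: `buffer_store` is a hypothetical helper name, the mutable `memory` list stands in for global memory, and offsets are assumed to be counted in elements rather than bytes.

```python
from typing import MutableSequence, Optional, Sequence


def buffer_store(
    memory: MutableSequence,
    value: Sequence,
    base: int,
    offsets: Sequence[int],
    mask: Optional[Sequence[bool]] = None,
) -> None:
    """Reference model: write value[lane] to base + offsets[lane].

    Assumptions (not from the op definition): `memory` is a flat mutable
    sequence and offsets are in elements.
    """
    for lane, offset in enumerate(offsets):
        # A lane writes only when there is no mask or its mask bit is set.
        if mask is None or mask[lane]:
            memory[base + offset] = value[lane]


memory = [0] * 6
buffer_store(memory, [7, 8, 9], 2, [0, 1, 3], mask=[True, False, True])
print(memory)  # [0, 0, 7, 0, 0, 9]
```

Lane 1 is masked off, so the slot at index 3 keeps its previous contents.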

Traits: SameLoadStoreOperandsEncoding

Interfaces: MemoryEffectOpInterface (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{MemoryEffects::Write on ::mlir::triton::GlobalMemory}

Operands:

| Operand | Description |
| ------- | ----------- |
| value | ranked tensor of floating-point or integer or ptr values |
| ptr | ptr |
| offsets | tensor of 32-bit signless integer values |
| mask | ranked tensor of 1-bit signless integer values |

amdgpu.instruction_sched_hint (triton::amdgpu::InstructionSchedHint)

A placeholder op for instruction scheduling hints within a basic block

Syntax:

operation ::= `amdgpu.instruction_sched_hint` attr-dict

A placeholder op for instruction scheduling hints that apply to the instructions within the basic block where the op is located. The op is primarily intended for adjusting instruction scheduling inside the main loop that results from a tt.dot operation. Dot ops are easier to identify at a high level, which in turn makes it easier to mark the intended scheduling regions. The hint ops are eventually lowered into LLVM AMDGPU instruction scheduling primitives, which control how different kinds of instructions (valu/mfma, global/shared memory, etc.) should interleave for better instruction-level parallelism.