TritonAMDGPUOps
amdgpu.buffer_load (triton::amdgpu::BufferLoadOp)
Load from a scalar base pointer and a tensor offset
Syntax:
operation ::= `amdgpu.buffer_load` $ptr `[` $offsets `]` (`,` $mask^)? (`,` $other^)?
attr-dict `:` type($result)
AMD buffer load operation. A buffer load is similar to a normal load, but it accesses global memory via a scalar base pointer and a tensor of offsets instead of a tensor of pointers. The other fields are similar to those of a normal load: `mask` is a boolean vector that determines whether a given element should be read from memory, and `other` is the element that should be returned on lane i when `mask[i] == 0`.
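The masking rule described above can be sketched as a small reference model in plain Python. This is not Triton or MLIR API: the function `buffer_load`, the flat list `memory`, and the integer `base` are hypothetical names chosen for illustration. The sketch also assumes that offsets are counted in elements rather than bytes, and it returns `None` for a masked-off lane when no `other` is supplied, since the text above does not specify that case.

```python
from typing import List, Optional, Sequence


def buffer_load(
    memory: Sequence[float],
    base: int,
    offsets: Sequence[int],
    mask: Optional[Sequence[bool]] = None,
    other: Optional[Sequence[float]] = None,
) -> List[Optional[float]]:
    """Hypothetical reference model of a masked buffer load.

    One scalar base index is shared by all lanes; each lane adds its own
    offset (assumed here to be in elements, not bytes).
    """
    result: List[Optional[float]] = []
    for lane, offset in enumerate(offsets):
        # A missing mask means every lane is active.
        active = True if mask is None else bool(mask[lane])
        if active:
            result.append(memory[base + offset])
        elif other is not None:
            # mask[lane] == 0: return the fallback element for this lane.
            result.append(other[lane])
        else:
            # Assumption: the value is unspecified without `other`.
            result.append(None)
    return result


memory = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
loaded = buffer_load(
    memory,
    base=2,
    offsets=[0, 1, 3],
    mask=[True, False, True],
    other=[-1.0, -1.0, -1.0],
)
print(loaded)  # [12.0, -1.0, 15.0]
```

Lane 1 is masked off, so it receives its `other` element instead of touching memory; lanes 0 and 2 read at `base + offset`.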
Traits: AttrSizedOperandSegments, SameLoadStoreOperandsAndResultEncoding
Interfaces: MemoryEffectOpInterface (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{MemoryEffects::Read on ::mlir::triton::GlobalMemory}
Operands:

| Operand | Description |
|---|---|
| `ptr` | ptr |
| `offsets` | tensor of 32-bit signless integer values |
| `mask` | ranked tensor of 1-bit signless integer values |
| `other` | ranked tensor of floating-point or integer or ptr values |
Results:

| Result | Description |
|---|---|
| `result` | ranked tensor of floating-point or integer or ptr values |
amdgpu.buffer_store (triton::amdgpu::BufferStoreOp)
Store into scalar base pointer and a tensor offset
Syntax:
operation ::= `amdgpu.buffer_store` $value `,` $ptr `[` $offsets `]` (`,` $mask^)?
attr-dict `:` type($value)
AMD buffer store operation. A buffer store is similar to a normal store, but it accesses global memory via a scalar base pointer and a tensor of offsets instead of a tensor of pointers. The other fields are similar to those of a normal store: `mask` is a boolean vector that determines whether a given element should be written to memory, and `value` is the tensor of elements to write. The element on lane i is written when `mask[i] == 1`.
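The store side can be sketched the same way, as a plain-Python reference model. The function `buffer_store`, the mutable list `memory`, and the integer `base` are hypothetical names, not part of Triton or MLIR, and offsets are again assumed to be counted in elements rather than bytes.

```python
from typing import List, Optional, Sequence


def buffer_store(
    memory: List[float],
    value: Sequence[float],
    base: int,
    offsets: Sequence[int],
    mask: Optional[Sequence[bool]] = None,
) -> None:
    """Hypothetical reference model of a masked buffer store.

    Writes value[lane] to memory[base + offsets[lane]] for every active
    lane; lanes with mask[lane] == 0 leave memory untouched.
    """
    for lane, offset in enumerate(offsets):
        # A missing mask means every lane is active.
        if mask is None or mask[lane]:
            memory[base + offset] = value[lane]


memory = [0.0] * 6
buffer_store(
    memory,
    value=[1.0, 2.0, 3.0],
    base=2,
    offsets=[0, 1, 3],
    mask=[True, False, True],
)
print(memory)  # [0.0, 0.0, 1.0, 0.0, 0.0, 3.0]
```

Lane 1 is masked off, so `memory[3]` keeps its previous contents while lanes 0 and 2 write at `base + offset`.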
Traits: SameLoadStoreOperandsEncoding
Interfaces: MemoryEffectOpInterface (MemoryEffectOpInterface)
Effects: MemoryEffects::Effect{MemoryEffects::Write on ::mlir::triton::GlobalMemory}
Operands:

| Operand | Description |
|---|---|
| `value` | ranked tensor of floating-point or integer or ptr values |
| `ptr` | ptr |
| `offsets` | tensor of 32-bit signless integer values |
| `mask` | ranked tensor of 1-bit signless integer values |
amdgpu.instruction_sched_hint (triton::amdgpu::InstructionSchedHint)
A placeholder op for instruction scheduling hints within a basic block
Syntax:
operation ::= `amdgpu.instruction_sched_hint` attr-dict
A placeholder op for instruction scheduling hints that apply to the instructions in the basic block where the op is located. It is primarily intended for adjusting instruction scheduling inside the main loop that results from a `tt.dot` operation. Dot ops are easier to identify at a high level, which makes it easier to mark the intended scheduling regions there. The hint ops are eventually lowered into LLVM AMDGPU instruction scheduling primitives, which control how different kinds of instructions (valu/mfma, global/shared memory, etc.) should interleave for better instruction-level parallelism.