ProtonGPUOps

proton_gpu.circular_store (triton::proton::gpu::CircularStoreOp)

Store the value into a circular buffer

Syntax:

operation ::= `proton_gpu.circular_store` (`start` $isStart^):(`end`)? $segment `,` $counter attr-dict `:`
              qualified(type($segment)) `,` type($counter)

Store a metric counter into a circular buffer backed by the internal memory segment. The write position in the buffer is updated automatically. Older metric counters are dropped if the segment buffer is full.

Interfaces: MemoryEffectOpInterface (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{MemoryEffects::Read on ::mlir::SideEffects::DefaultResource, MemoryEffects::Write on ::mlir::SideEffects::DefaultResource}

Attributes:

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| `isStart` | ::mlir::UnitAttr | unit attribute |
| `scopeId` | ::mlir::IntegerAttr | 32-bit signless integer attribute |

Operands:

| Operand | Description |
| --- | --- |
| `segment` | A segment in the internal buffer |
| `counter` | 32-bit signless integer or 64-bit signless integer |
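
The drop-oldest behavior described for `proton_gpu.circular_store` can be illustrated with a small host-side model. This is only a sketch of the general circular-buffer technique, not the lowering Proton uses: the class `CircularBuffer`, its `capacity` parameter, and the `contents` helper are hypothetical names introduced for illustration.

```python
class CircularBuffer:
    """Toy model of a fixed-capacity buffer that overwrites its oldest entry when full."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.slots = [None] * capacity
        self.count = 0  # total number of stores issued so far

    def store(self, value):
        # The write position wraps around, so once the buffer is full
        # each new store reuses the slot holding the oldest entry.
        self.slots[self.count % len(self.slots)] = value
        self.count += 1

    def contents(self):
        """Return the surviving entries, oldest first."""
        capacity = len(self.slots)
        if self.count <= capacity:
            return self.slots[:self.count]
        head = self.count % capacity
        return self.slots[head:] + self.slots[:head]


buf = CircularBuffer(capacity=4)
for counter in [10, 20, 30, 40, 50, 60]:
    buf.store(counter)
print(buf.contents())  # [30, 40, 50, 60]
```

With a capacity of four and six stores, the two oldest counters (10 and 20) are overwritten, which mirrors the "older metric counters are dropped" rule above.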

proton_gpu.finalize (triton::proton::gpu::FinalizeOp)

Finalize the intra kernel profiler

Syntax:

operation ::= `proton_gpu.finalize` $segment `,` $scratchPtr attr-dict `:` qualified(type($segment)) `,` qualified(type($scratchPtr))

Write back the metadata and profile to global memory. `segment` is the segment of the internal profiling buffer that contains the profiling data. `scratchPtr` is the address of the profiling scratch buffer.

Interfaces: MemoryEffectOpInterface (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{MemoryEffects::Read on ::mlir::triton::GlobalMemory}, MemoryEffects::Effect{MemoryEffects::Read on ::mlir::triton::gpu::SharedMemory}, MemoryEffects::Effect{MemoryEffects::Write on ::mlir::triton::GlobalMemory}

Operands:

| Operand | Description |
| --- | --- |
| `segment` | A segment in the internal buffer |
| `scratchPtr` | ptr |

proton_gpu.global_scratch_alloc (triton::proton::gpu::GlobalScratchAllocOp)

Allocate a global memory profile buffer

Syntax:

operation ::= `proton_gpu.global_scratch_alloc` attr-dict `:` qualified(type($result))

This operation allocates a profile buffer in global memory. Each Triton program being profiled will hold a pointer to the buffer until profiling is finalized.

Interfaces: MemoryEffectOpInterface (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{MemoryEffects::Allocate on ::mlir::triton::GlobalMemory}

Attributes:

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| `nbytes` | ::mlir::IntegerAttr | 32-bit signless integer attribute |
| `alignment` | ::mlir::IntegerAttr | 32-bit signless integer attribute |

Results:

| Result | Description |
| --- | --- |
| `result` | ptr |

proton_gpu.init_ctx (triton::proton::gpu::InitCtxOp)

Initialize the intra kernel profiler warp-level contexts

Syntax:

operation ::= `proton_gpu.init_ctx` $scratchPtr attr-dict `:` qualified(type($scratchPtr))

Initialize the intra-kernel profiler's warp-level contexts for all warps in `scratchPtr` (the base address of the profiling scratch buffer). This operation cannot be called inside `ttg.warp_specialize`.

Interfaces: MemoryEffectOpInterface (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{MemoryEffects::Write on ::mlir::triton::GlobalMemory}

Operands:

| Operand | Description |
| --- | --- |
| `scratchPtr` | ptr |

proton_gpu.read_counter (triton::proton::gpu::ReadCounterOp)

Read a GPU metric counter into a scalar register

Syntax:

operation ::= `proton_gpu.read_counter` attr-dict `:` type($counter)

Read a GPU metric counter into a scalar register.

Interfaces: MemoryEffectOpInterface (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{MemoryEffects::Read on ::mlir::SideEffects::DefaultResource, MemoryEffects::Write on ::mlir::SideEffects::DefaultResource}

Attributes:

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| `metric` | ::mlir::triton::proton::MetricTypeAttr | The type of metric to be profiled. Supported metrics: CYCLE (cycle count metric). |

Results:

| Result | Description |
| --- | --- |
| `counter` | 32-bit signless integer or 64-bit signless integer |
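
Because `counter` may be a 32-bit integer, a consumer that subtracts a start reading from an end reading may need to account for wraparound. The sketch below shows the usual modular subtraction for that case. It assumes the counter wraps modulo 2^bits, which this page does not state; the function `elapsed` and its `bits` parameter are hypothetical names used only for illustration.

```python
def elapsed(start, end, bits=32):
    """Difference between two readings of a free-running counter.

    Assumes the counter wraps modulo 2**bits, so the subtraction is
    reduced modulo 2**bits to stay correct across a single wraparound.
    """
    mask = (1 << bits) - 1
    return (end - start) & mask


print(elapsed(100, 350))          # 250
print(elapsed(0xFFFFFFF0, 0x10))  # 32, the counter wrapped between readings
```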

proton_gpu.restore_ctx (triton::proton::gpu::RestoreCtxOp)

Restore the current warp-level context

Syntax:

operation ::= `proton_gpu.restore_ctx` $segment `,` $scratchPtr attr-dict `:` qualified(type($segment)) `,` qualified(type($scratchPtr))

Restore the current warp context into `segment` from `scratchPtr` (the base address of the profiling scratch buffer).

Interfaces: MemoryEffectOpInterface (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{MemoryEffects::Read on ::mlir::triton::GlobalMemory}, MemoryEffects::Effect{MemoryEffects::Write on ::mlir::triton::GlobalMemory}

Operands:

| Operand | Description |
| --- | --- |
| `segment` | A segment in the internal buffer |
| `scratchPtr` | ptr |

proton_gpu.save_ctx (triton::proton::gpu::SaveCtxOp)

Save the current warp-level context

Syntax:

operation ::= `proton_gpu.save_ctx` $segment `,` $scratchPtr attr-dict `:` qualified(type($segment)) `,` qualified(type($scratchPtr))

Save the current warp context from `segment` to `scratchPtr` (the base address of the profiling scratch buffer).

Interfaces: MemoryEffectOpInterface (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{MemoryEffects::Write on ::mlir::triton::GlobalMemory}

Operands:

| Operand | Description |
| --- | --- |
| `segment` | A segment in the internal buffer |
| `scratchPtr` | ptr |
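
The relationship between `proton_gpu.init_ctx`, `proton_gpu.save_ctx`, and `proton_gpu.restore_ctx` can be pictured with a toy host-side model. Everything concrete here is an assumption made for illustration: this page does not say what a warp-level context contains, so the sketch treats it as a single integer write index, and the class `ScratchBuffer` and its methods are hypothetical names, not part of Proton.

```python
class ScratchBuffer:
    """Toy stand-in for the profiling scratch buffer: one context slot per warp."""

    def __init__(self, num_warps):
        # init_ctx analogue: every warp's context starts from a known value.
        # (Assumption: a context is a single integer write index.)
        self.contexts = [0] * num_warps

    def save(self, warp_id, write_index):
        # save_ctx analogue: copy the warp's current state out to scratch.
        self.contexts[warp_id] = write_index

    def restore(self, warp_id):
        # restore_ctx analogue: read the warp's state back from scratch.
        return self.contexts[warp_id]


scratch = ScratchBuffer(num_warps=4)
scratch.save(warp_id=2, write_index=7)
print(scratch.restore(2))  # 7
print(scratch.restore(0))  # 0, never saved, so still the initial value
```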

proton_gpu.segment_alloc (triton::proton::gpu::SegmentAllocOp)

Get the base offset of the segment of the internal buffer

Syntax:

operation ::= `proton_gpu.segment_alloc` $buffer attr-dict `:` qualified(type($buffer)) `->` type($segment)

The internal buffer is partitioned into segments for each profiling “unit”. This operation gets the location of the memory segment in the internal buffer.

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

| Operand | Description |
| --- | --- |
| `buffer` | memory descriptor type (::mlir::triton::gpu::MemDescType) in Triton IR type system |

Results:

| Result | Description |
| --- | --- |
| `segment` | A segment in the internal buffer |
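
One way to picture the partitioning that `proton_gpu.segment_alloc` describes is as an even split of the buffer, with each profiling unit owning one contiguous slice. The sketch below computes a base offset under that assumption. The equal-size, contiguous layout and the names `segment_base_offset`, `buffer_size`, `num_units`, and `unit_id` are hypothetical and not taken from the Proton implementation.

```python
def segment_base_offset(buffer_size, num_units, unit_id):
    """Base offset of one unit's segment, assuming equal contiguous segments."""
    if num_units <= 0:
        raise ValueError("num_units must be positive")
    if buffer_size % num_units != 0:
        raise ValueError("buffer_size must divide evenly among units")
    if not 0 <= unit_id < num_units:
        raise ValueError("unit_id out of range")
    segment_size = buffer_size // num_units
    return unit_id * segment_size


# A 1024-byte buffer shared by 4 units gives 256-byte segments.
print([segment_base_offset(1024, 4, u) for u in range(4)])  # [0, 256, 512, 768]
```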