TritonInstrumentOps

tti.dot_i8 (triton::instrument::DotI8Op)

Non-saturating NVIDIA MMAv2 i8 dot

Syntax:

operation ::= `tti.dot_i8` $a `,` $b `,` $c `,` `aSigned` `=` $aSigned `,` `bSigned` `=` $bSigned
              attr-dict `:` type($a) `*` type($b) `->` type($d)

Performs a wrapping i8 matrix multiplication into an i32 accumulator using NVIDIA MMAv2. The A and B operands have independent signedness.
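The wrapping behavior and the independent signedness flags can be illustrated with a small scalar reference model. This is only a sketch of the arithmetic described above, not the MMAv2 lowering; the helper names `to_i8`, `wrap_i32`, and `dot_i8` are hypothetical and chosen for this example.

```python
def to_i8(bits, signed):
    """Interpret the low 8 bits of `bits` as a signed or unsigned 8-bit value."""
    bits &= 0xFF
    return bits - 0x100 if signed and bits >= 0x80 else bits


def wrap_i32(x):
    """Wrap a Python int to a signed 32-bit two's-complement value."""
    x &= 0xFFFFFFFF
    return x - 0x1_0000_0000 if x >= 0x8000_0000 else x


def dot_i8(a, b, c, a_signed, b_signed):
    """Reference model (hypothetical helper): d = a @ b + c, wrapping in i32.

    `a` is M x K, `b` is K x N, `c` is M x N, all as nested lists of ints.
    The 8-bit operands are signless; the flags select their interpretation.
    """
    m, k, n = len(a), len(b), len(b[0])
    d = [[0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            acc = c[i][j]
            for p in range(k):
                prod = to_i8(a[i][p], a_signed) * to_i8(b[p][j], b_signed)
                # Non-saturating: overflow wraps instead of clamping.
                acc = wrap_i32(acc + prod)
            d[i][j] = acc
    return d
```

With the same bit pattern `0xFF` in both operands, the model yields different results depending on `aSigned` and `bSigned`, which is the point of keeping the operand types signless and the signedness in attributes.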

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, DotOpInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes:

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| aSigned | ::mlir::BoolAttr | bool attribute |
| bSigned | ::mlir::BoolAttr | bool attribute |

Operands:

| Operand | Description |
| --- | --- |
| a | ranked tensor of 8-bit signless integer values |
| b | ranked tensor of 8-bit signless integer values |
| c | ranked tensor of 32-bit signless integer values |

Results:

| Result | Description |
| --- | --- |
| d | ranked tensor of 32-bit signless integer values |

tti.experimental_assert_uniform (triton::instrument::ExperimentalAssertUniformOp)

Assert the uniform condition

Syntax:

operation ::= `tti.experimental_assert_uniform` $condition `,` $message attr-dict-with-keyword

Asserts that the condition is true. The condition is assumed to be uniform across all threads in the warp group, so only one thread needs to evaluate the assertion and print the message.
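The single-thread evaluation can be pictured with a toy model in which the warp group is a list of per-thread condition values. This is a sketch only: the function name `assert_uniform`, the choice of thread 0 as the evaluating thread, and the returned list of messages are assumptions made for illustration, not the op's lowering.

```python
def assert_uniform(per_thread_condition, message):
    """Toy model (hypothetical helper) of a warp-group-uniform assertion.

    Returns the list of messages that would be printed.
    """
    # The op assumes uniformity. The model checks it only to make the
    # precondition explicit; the real op is not documented to verify it.
    if len(set(per_thread_condition)) != 1:
        raise ValueError("condition is not uniform across the warp group")

    printed = []
    # ASSUMPTION: thread 0 is the one thread that evaluates the assertion.
    if not per_thread_condition[0]:
        printed.append(message)
    return printed
```

Because every thread holds the same value, checking one thread is equivalent to checking all of them, and a failing assertion produces one message instead of one per thread.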

Interfaces: MemoryEffectOpInterface (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{MemoryEffects::Write on ::mlir::triton::GlobalMemory}

Attributes:

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| message | ::mlir::StringAttr | string attribute |

Operands:

| Operand | Description |
| --- | --- |
| condition | 1-bit signless integer |

tti.experimental_buffer_descriptors (triton::instrument::ExperimentalBufferDescriptorsOp)

Define an array of buffer descriptors

Syntax:

operation ::= `tti.experimental_buffer_descriptors` $offsets `,` $lengths `,` $memType attr-dict `:` type($result)

Create a tensor of buffer descriptors packing 32-bit pointer offsets and 32-bit lengths into 64-bit elements.
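A minimal sketch of such packing follows. The documentation above does not state which half of the 64-bit element holds the offset and which holds the length, so the layout used here (length in the upper 32 bits, offset in the lower 32 bits) is an assumption, as are the helper names. The `memType` attribute is left out of the model.

```python
MASK32 = 0xFFFFFFFF


def pack_descriptor(offset, length):
    """Pack one descriptor into a 64-bit integer.

    ASSUMPTION: length occupies bits 63..32 and offset bits 31..0.
    The real bit layout is not specified in this reference.
    """
    return ((length & MASK32) << 32) | (offset & MASK32)


def unpack_descriptor(desc):
    """Inverse of pack_descriptor under the same assumed layout."""
    return desc & MASK32, (desc >> 32) & MASK32


def buffer_descriptors(offsets, lengths):
    """Model of the op (hypothetical helper): one 64-bit element per buffer."""
    if len(offsets) != len(lengths):
        raise ValueError("offsets and lengths must have the same size")
    return [pack_descriptor(o, n) for o, n in zip(offsets, lengths)]
```

For example, `buffer_descriptors([0, 4096], [4096, 2048])` produces two 64-bit elements from which `unpack_descriptor` recovers the original offset and length pairs.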

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Attributes:

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| offsets | ::mlir::DenseI32ArrayAttr | i32 dense array attribute |
| lengths | ::mlir::DenseI32ArrayAttr | i32 dense array attribute |
| memType | ::mlir::triton::instrument::MemTypeAttr | allowed 32-bit signless integer cases: 0, 1 |

Results:

| Result | Description |
| --- | --- |
| result | ranked tensor of floating-point or integer or ptr values |

tti.experimental_cluster_cta_id (triton::instrument::ExperimentalClusterCTAIdOp)

Get the CTA id within the current cluster

Syntax:

operation ::= `tti.experimental_cluster_cta_id` attr-dict `:` type($result)

Return the cluster-local CTA id used to index ConSan’s multi-CTA scratch slabs. For single-CTA kernels this is always zero.

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Results:

| Result | Description |
| --- | --- |
| result | 32-bit signless integer |

tti.experimental_fpsan_embed (triton::instrument::ExperimentalFPSanEmbedOp)

Embed float value in the fpsan integer ring

Syntax:

operation ::= `tti.experimental_fpsan_embed` $val attr-dict `:` functional-type(operands, $result)

Traits: AlwaysSpeculatableImplTrait, Elementwise, SameOperandsAndResultEncoding, SameOperandsAndResultShape

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

| Operand | Description |
| --- | --- |
| val | floating-point or ranked tensor of floating-point values |

Results:

| Result | Description |
| --- | --- |
| result | integer or ranked tensor of integer values |

tti.experimental_fpsan_unembed (triton::instrument::ExperimentalFPSanUnembedOp)

Unembed fpsan integer payload as a float value

Syntax:

operation ::= `tti.experimental_fpsan_unembed` $val attr-dict `:` functional-type(operands, $result)

Traits: AlwaysSpeculatableImplTrait, Elementwise, SameOperandsAndResultEncoding, SameOperandsAndResultShape

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

| Operand | Description |
| --- | --- |
| val | integer or ranked tensor of integer values |

Results:

| Result | Description |
| --- | --- |
| result | floating-point or ranked tensor of floating-point values |

tti.experimental_gsan_atomic_cas (triton::instrument::ExperimentalGSanAtomicCASOp)

Lower a GSan-instrumented atomic cas

Syntax:

operation ::= `tti.experimental_gsan_atomic_cas` $sem `,` $scope `,` $ptr `,` $cmp `,` $val attr-dict `:`
              functional-type(operands, $result)

Traits: SameOperandsAndResultEncoding, SameOperandsAndResultShape

Attributes:

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| sem | ::mlir::triton::MemSemanticAttr | allowed 32-bit signless integer cases: 1, 2, 3, 4 |
| scope | ::mlir::triton::MemSyncScopeAttr | allowed 32-bit signless integer cases: 1, 2, 3 |

Operands:

| Operand | Description |
| --- | --- |
| ptr | ptr or ranked tensor of ptr values |
| cmp | floating-point or ranked tensor of floating-point values or integer or ranked tensor of integer values or ptr or ranked tensor of ptr values |
| val | floating-point or ranked tensor of floating-point values or integer or ranked tensor of integer values or ptr or ranked tensor of ptr values |

Results:

| Result | Description |
| --- | --- |
| result | floating-point or ranked tensor of floating-point values or integer or ranked tensor of integer values or ptr or ranked tensor of ptr values |

tti.experimental_gsan_atomic_rmw (triton::instrument::ExperimentalGSanAtomicRMWOp)

Lower a GSan-instrumented atomic rmw

Syntax:

operation ::= `tti.experimental_gsan_atomic_rmw` $atomic_rmw_op `,` $sem `,` $scope `,` $ptr `,` $val (`,` $mask^)? attr-dict `:`
              functional-type(operands, $result)

Traits: SameOperandsAndResultEncoding, SameOperandsAndResultShape

Attributes:

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| atomic_rmw_op | ::mlir::triton::RMWOpAttr | allowed 32-bit signless integer cases: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 |
| sem | ::mlir::triton::MemSemanticAttr | allowed 32-bit signless integer cases: 1, 2, 3, 4 |
| scope | ::mlir::triton::MemSyncScopeAttr | allowed 32-bit signless integer cases: 1, 2, 3 |

Operands:

| Operand | Description |
| --- | --- |
| ptr | ptr or ranked tensor of ptr values |
| val | floating-point or ranked tensor of floating-point values or integer or ranked tensor of integer values or ptr or ranked tensor of ptr values |
| mask | 1-bit signless integer or ranked tensor of 1-bit signless integer values |

Results:

| Result | Description |
| --- | --- |
| result | floating-point or ranked tensor of floating-point values or integer or ranked tensor of integer values or ptr or ranked tensor of ptr values |

tti.experimental_gsan_atomic_tensor_access (triton::instrument::ExperimentalGSanAtomicTensorAccessOp)

Instrument a tensor atomic access for GSan

Syntax:

operation ::= `tti.experimental_gsan_atomic_tensor_access` $sem `,` $scope `,` $ptr (`,` $mask^)? attr-dict `:` type($ptr)

Emits runtime instrumentation for a tensor pointer access whose individual elements are atomic read-modify-write operations.

Attributes:

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| sem | ::mlir::triton::MemSemanticAttr | allowed 32-bit signless integer cases: 1, 2, 3, 4 |
| scope | ::mlir::triton::MemSyncScopeAttr | allowed 32-bit signless integer cases: 1, 2, 3 |

Operands:

| Operand | Description |
| --- | --- |
| ptr | ptr or ranked tensor of ptr values |
| mask | 1-bit signless integer or ranked tensor of 1-bit signless integer values |

tti.experimental_gsan_init (triton::instrument::ExperimentalGSanInitOp)

Initialize GSan thread

tti.experimental_gsan_tensor_access (triton::instrument::ExperimentalGSanTensorAccessOp)

Instrument a tensor load/store access for GSan

Syntax:

operation ::= `tti.experimental_gsan_tensor_access` $ptr `,` $isStore (`,` $mask^)? attr-dict `:` type($ptr)

Emits runtime instrumentation for a tensor pointer access. The pointer and optional mask are consumed by the GSan runtime.

Attributes:

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| isStore | ::mlir::BoolAttr | bool attribute |

Operands:

| Operand | Description |
| --- | --- |
| ptr | ptr or ranked tensor of ptr values |
| mask | 1-bit signless integer or ranked tensor of 1-bit signless integer values |

tti.experimental_gsan_tensordesc_info (triton::instrument::ExperimentalGSanTensorDescInfoOp)

Decode GSan descriptor metadata from a native tensor descriptor

Syntax:

operation ::= `tti.experimental_gsan_tensordesc_info` $desc attr-dict `:` qualified(type($desc)) `->` type($result)

Decodes a native tensor descriptor into the underlying base pointer, shape and stride values.

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

| Operand | Description |
| --- | --- |
| desc | tensor descriptor type |

Results:

| Result | Description |
| --- | --- |
| result | variadic of any type |

tti.experimental_local_gather (triton::instrument::ExperimentalLocalGatherOp)

Gather elements from shared memory with logical base offsets

Syntax:

operation ::= `tti.experimental_local_gather` $src `[` $indices `]` `offsets` `=` `[` $offsets `]`
              attr-dict `:` qualified(type($src)) `,` type($indices) `->` type($result)

Gather elements from a shared memory descriptor using an index tensor along one axis, after shifting the logical source coordinates by rank-sized scalar offsets. This is intentionally private to instrumentation passes.
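The indexing can be sketched for the rank-2 case with plain nested lists. The sketch assumes that the offsets are added to the full source coordinate, including the coordinate taken from the index tensor; the reference above does not spell this out, and the helper name `local_gather_2d` is hypothetical.

```python
def local_gather_2d(src, indices, offsets, axis):
    """Rank-2 model (hypothetical helper) of a gather with base offsets.

    src:     2-D nested list standing in for the shared memory contents
    indices: 2-D nested list of integer indices (shape of the result)
    offsets: one scalar offset per source dimension, here (off0, off1)
    axis:    dimension along which `indices` replaces the coordinate
    """
    if axis not in (0, 1):
        raise ValueError("axis must be 0 or 1 for a rank-2 source")
    off0, off1 = offsets
    rows, cols = len(indices), len(indices[0])
    result = [[None] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # Along `axis` the coordinate comes from the index tensor;
            # along the other dimension it is the result position.
            if axis == 0:
                r, c = indices[i][j], j
            else:
                r, c = i, indices[i][j]
            # ASSUMPTION: offsets shift every logical source coordinate.
            result[i][j] = src[r + off0][c + off1]
    return result
```

With a 4 x 4 source where `src[r][c] = 10 * r + c`, indices `[[1, 0], [0, 1]]`, offsets `(1, 2)`, and `axis = 0`, the model reads `src[2][2]`, `src[1][3]`, `src[1][2]`, and `src[2][3]`.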

Attributes:

| Attribute | MLIR Type | Description |
| --- | --- | --- |
| axis | ::mlir::IntegerAttr | 32-bit signless integer attribute |

Operands:

| Operand | Description |
| --- | --- |
| src | memory descriptor type (::mlir::triton::gpu::MemDescType) in Triton IR type system |
| indices | ranked tensor of integer values |
| offsets | variadic of 32-bit signless integer |

Results:

| Result | Description |
| --- | --- |
| result | ranked tensor of floating-point or integer or ptr values |

tti.experimental_lock_acquire (triton::instrument::ExperimentalLockAcquireOp)

Acquire a lock.

Syntax:

operation ::= `tti.experimental_lock_acquire` $lock (`,` $pred^)? attr-dict `:` type($lock)

Enter a critical section by acquiring a lock with a single thread.

Interfaces: MemoryEffectOpInterface (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{MemoryEffects::Write on ::mlir::triton::GlobalMemory}

Operands:

| Operand | Description |
| --- | --- |
| lock | ptr or ranked tensor of ptr values |
| pred | 1-bit signless integer |

tti.experimental_lock_release (triton::instrument::ExperimentalLockReleaseOp)

Release a lock.

Syntax:

operation ::= `tti.experimental_lock_release` $lock (`,` $pred^)? attr-dict `:` type($lock)

Leave a critical section by releasing a lock with a single thread.
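How an acquire and release pair brackets a critical section can be modeled on the host with a spin lock over a single lock word. Everything in this sketch is an assumption made for illustration: the 0 = free, 1 = held encoding, the compare-and-swap loop, and the names `LockWord`, `lock_acquire`, and `lock_release`. The actual lowering of these ops is not described in this reference.

```python
import threading
import time


class LockWord:
    """A lock word in memory. ASSUMPTION: 0 means free, 1 means held."""

    def __init__(self):
        self.value = 0
        self._atomic = threading.Lock()  # stands in for hardware atomicity

    def compare_and_swap(self, expected, new):
        """Atomically replace `expected` with `new`; return the old value."""
        with self._atomic:
            old = self.value
            if old == expected:
                self.value = new
            return old


def lock_acquire(lock, pred=True):
    """Model of acquire: a no-op when the optional predicate is false."""
    if not pred:
        return
    while lock.compare_and_swap(0, 1) != 0:
        time.sleep(0)  # yield to other threads while spinning


def lock_release(lock, pred=True):
    """Model of release: a no-op when the optional predicate is false."""
    if not pred:
        return
    lock.compare_and_swap(1, 0)
```

In this model the optional `pred` operand simply gates the operation, so a caller that acquires under a predicate should release under the same predicate.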

Interfaces: MemoryEffectOpInterface (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{MemoryEffects::Write on ::mlir::triton::GlobalMemory}

Operands:

| Operand | Description |
| --- | --- |
| lock | ptr or ranked tensor of ptr values |
| pred | 1-bit signless integer |

tti.experimental_memdesc_to_i32 (triton::instrument::ExperimentalMemDescToI32Op)

Convert a memdesc into its base pointer as i32

Syntax:

operation ::= `tti.experimental_memdesc_to_i32` $memdesc attr-dict `:` type($memdesc)

Extract the base pointer from the given memdesc and return it as a 32-bit integer. This can be used to compare the memdesc against tensors of barrier pointers maintained by the concurrency sanitizer.
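The comparison use case can be sketched by treating a memdesc as nothing more than its base address. The helper names are hypothetical, and representing the tensor of barrier pointers as a Python list of integers is an assumption made for this example.

```python
def memdesc_to_i32(base_address):
    """Model (hypothetical helper): the base pointer as a 32-bit integer.

    ASSUMPTION: the address is represented by its low 32 bits.
    """
    return base_address & 0xFFFFFFFF


def match_barriers(memdesc_base, barrier_ptrs):
    """Compare one memdesc against every tracked barrier pointer.

    Returns one boolean per entry, true where the entry refers to the
    same base address as the memdesc.
    """
    key = memdesc_to_i32(memdesc_base)
    return [ptr == key for ptr in barrier_ptrs]
```

For instance, `match_barriers(0x400, [0x0, 0x400, 0x800])` selects only the middle entry, which is the kind of mask a sanitizer could use to find the state associated with a given barrier.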

Traits: AlwaysSpeculatableImplTrait

Interfaces: ConditionallySpeculatable, InferTypeOpInterface, NoMemoryEffect (MemoryEffectOpInterface)

Effects: MemoryEffects::Effect{}

Operands:

| Operand | Description |
| --- | --- |
| memdesc | memory descriptor type (::mlir::triton::gpu::MemDescType) in Triton IR type system |

Results:

| Result | Description |
| --- | --- |
| result | 32-bit signless integer |