triton.language

Programming Model

program_id

Returns the id of the current program instance along the given axis.

num_programs

Returns the number of program instances launched along the given axis.
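
The way program instances typically divide up work can be sketched in plain Python. This is an emulation rather than Triton code: the `launch` helper, the `BLOCK_SIZE` constant, and the sequential loop are assumptions made for illustration, standing in for a grid of instances that would run in parallel.

```python
# Plain-Python emulation of a 1D launch grid; this is not Triton code.
BLOCK_SIZE = 4  # hypothetical block size chosen for illustration


def launch(n_elements, block_size=BLOCK_SIZE):
    """Return the element indices handled by each emulated program instance."""
    # Ceiling division: enough program instances to cover every element.
    num_programs = (n_elements + block_size - 1) // block_size
    work = []
    for program_id in range(num_programs):  # real instances run in parallel
        start = program_id * block_size
        offsets = [start + i for i in range(block_size)]
        # Drop out-of-range offsets, as a mask would in a real kernel.
        work.append([o for o in offsets if o < n_elements])
    return work


assignment = launch(10)
```

With 10 elements and a block size of 4, three instances are needed, and the last one handles only the two remaining elements.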

Creation Ops

arange

Returns contiguous values within the half-open interval [start, end).

zeros

Returns a block filled with the scalar value 0 for the given shape and dtype.

Shape Manipulation Ops

broadcast_to

Tries to broadcast the given block to a new shape.

reshape

Tries to reshape the given block to a new shape.

ravel

Returns a contiguous flattened view of x.
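
As a rough illustration of flattening and reshaping, the following plain-Python sketch works on nested lists. It assumes row-major element order, which is a choice made for this example and not a statement about how Triton lays out blocks; both helper functions are hypothetical.

```python
def ravel(block):
    """Flatten a 2D nested list into a 1D list (row-major order assumed)."""
    return [value for row in block for value in row]


def reshape(flat, rows, cols):
    """Rebuild a rows x cols nested list from a flat list (row-major order assumed)."""
    if rows * cols != len(flat):
        raise ValueError("new shape must hold the same number of elements")
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


flat = ravel([[1, 2, 3], [4, 5, 6]])
reshaped = reshape(flat, 3, 2)
```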

Linear Algebra Ops

dot

Returns the matrix product of two blocks.
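
For reference, the matrix product of two 2D blocks can be written out in plain Python. This sketch uses nested lists and a hypothetical `block_dot` helper; it shows only the arithmetic, not how a kernel would compute it.

```python
def block_dot(a, b):
    """Matrix product of a (M x K) and b (K x N), given as nested lists."""
    m, k, n = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions must match")
    # Entry (i, j) is the sum over p of a[i][p] * b[p][j].
    return [
        [sum(a[i][p] * b[p][j] for p in range(k)) for j in range(n)]
        for i in range(m)
    ]


result = block_dot([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```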

Memory Ops

load

Returns a block of data whose values are loaded, element-wise, from the memory locations defined by pointer.

store

Stores a block of values in memory, element-wise, at the memory locations specified by pointer.
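
Element-wise loads and stores are commonly guarded so that out-of-range locations are skipped. The plain-Python sketch below emulates that idea with a list standing in for memory and integer offsets standing in for pointers. The helper names are hypothetical; the `mask` and `other` arguments are modeled on the parameters of the same names accepted by `load`.

```python
def masked_load(memory, offsets, mask, other=0):
    """Read memory[o] where the mask is true; use `other` elsewhere."""
    return [memory[o] if m else other for o, m in zip(offsets, mask)]


def masked_store(memory, offsets, values, mask):
    """Write values[i] to memory[offsets[i]] only where the mask is true."""
    for o, v, m in zip(offsets, values, mask):
        if m:
            memory[o] = v


memory = [10, 20, 30]
offsets = [0, 1, 2, 3]  # the last offset is out of range
mask = [o < len(memory) for o in offsets]

block = masked_load(memory, offsets, mask, other=-1)
masked_store(memory, offsets, [v * 2 for v in block], mask)
```

The masked-off lane never touches memory: it reads the fallback value and its store is skipped.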

atomic_cas

Performs an atomic compare-and-swap at the memory location specified by pointer.

atomic_xchg

Performs an atomic exchange at the memory location specified by pointer.

Indexing Ops

where

Returns a block of elements from either x or y, depending on condition.
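
A minimal plain-Python sketch of element-wise selection, using a hypothetical list-based `where` helper to express a ReLU:

```python
def where(condition, x, y):
    """Pick x[i] where condition[i] is true, otherwise y[i]."""
    return [a if c else b for c, a, b in zip(condition, x, y)]


x = [-2.0, 0.5, 3.0]
relu = where([v > 0 for v in x], x, [0.0] * len(x))
```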

Math Ops

exp

Computes the element-wise exponential of x.

log

Computes the element-wise natural logarithm of x.

cos

Computes the element-wise cosine of x.

sin

Computes the element-wise sine of x.

sqrt

Computes the element-wise square root of x.

sigmoid

Computes the element-wise sigmoid of x.

softmax

Computes the softmax of x. Unlike the other math ops, each output element depends on every input element through the normalizing sum.
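
The sigmoid and softmax formulas can be sketched in plain Python as follows. Subtracting the maximum before exponentiating is a common numerical-stability technique assumed here for illustration; it is not a claim about how Triton implements `softmax`.

```python
import math


def sigmoid(x):
    """Element-wise logistic function 1 / (1 + exp(-x))."""
    return [1.0 / (1.0 + math.exp(-v)) for v in x]


def softmax(x):
    """Softmax over a 1D list, shifted by the maximum to avoid overflow."""
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([1.0, 2.0, 3.0])
```

The outputs of `softmax` are positive and sum to 1, and the shift by the maximum leaves the result unchanged mathematically while keeping large inputs from overflowing.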

Reduction Ops

max

Returns the maximum of all elements in the input block along the provided axis.

min

Returns the minimum of all elements in the input block along the provided axis.

sum

Returns the sum of all elements in the input block along the provided axis.
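
What reducing "along an axis" means for a 2D block can be illustrated in plain Python. The `reduce_axis` helper is hypothetical: axis 0 collapses the rows and yields one result per column, while axis 1 collapses the columns and yields one result per row.

```python
def reduce_axis(block, axis, op):
    """Apply a reduction such as max, min, or sum along one axis of a 2D nested list."""
    if axis == 0:
        # zip(*block) iterates over columns.
        return [op(column) for column in zip(*block)]
    if axis == 1:
        return [op(row) for row in block]
    raise ValueError("axis must be 0 or 1 for a 2D block")


block = [[1, 5, 3], [4, 2, 6]]
column_max = reduce_axis(block, 0, max)
row_sum = reduce_axis(block, 1, sum)
row_min = reduce_axis(block, 1, min)
```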

Atomic Ops

atomic_cas

Performs an atomic compare-and-swap at the memory location specified by pointer.

atomic_add

Performs an atomic add at the memory location specified by pointer.

atomic_max

Performs an atomic max at the memory location specified by pointer.

atomic_min

Performs an atomic min at the memory location specified by pointer.
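
The read-modify-write behavior of these operations can be emulated on the CPU with a lock. The `AtomicCell` class below is a hypothetical stand-in for a single memory location, and it assumes the common convention that each operation returns the value held before the update.

```python
import threading


class AtomicCell:
    """A single value whose updates are serialized by a lock."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def compare_and_swap(self, expected, new):
        """Write `new` only if the cell holds `expected`; return the old value."""
        with self._lock:
            old = self._value
            if old == expected:
                self._value = new
            return old

    def add(self, delta):
        """Add `delta` to the cell; return the old value."""
        with self._lock:
            old = self._value
            self._value = old + delta
            return old

    def load(self):
        with self._lock:
            return self._value


counter = AtomicCell(0)


def worker():
    for _ in range(1000):
        counter.add(1)


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
total = counter.load()
```

Because every increment happens under the lock, no update is lost and the four workers together always reach 4000.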

Comparison Ops

minimum

Computes the element-wise minimum of x and y.

maximum

Computes the element-wise maximum of x and y.

Random Number Generation

randint4x

Given a seed scalar and an offset block, returns four blocks of random int32.

randint

Given a seed scalar and an offset block, returns a single block of random int32.

rand

Given a seed scalar and an offset block, returns a block of random float32 drawn from \(U(0, 1)\).

randn

Given a seed scalar and an offset block, returns a block of random float32 drawn from \(\mathcal{N}(0, 1)\).
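
The seed-plus-offset interface means each random value is a pure function of its seed and offset, so the same inputs always reproduce the same outputs. The toy sketch below illustrates that property in plain Python. It uses SHA-256 as a stand-in mixing function, which is an assumption made for illustration and is not the generator Triton uses; the `toy_` function names are hypothetical.

```python
import hashlib
import struct


def toy_randint4x(seed, offsets):
    """Return four lists of signed 32-bit ints, one value per offset in each."""
    blocks = ([], [], [], [])
    for offset in offsets:
        # Mix seed and offset; SHA-256 is a stand-in, not Triton's generator.
        digest = hashlib.sha256(struct.pack("<qq", seed, offset)).digest()
        words = struct.unpack("<4i", digest[:16])
        for block, word in zip(blocks, words):
            block.append(word)
    return blocks


def toy_rand(seed, offsets):
    """Map the first block of toy_randint4x onto floats in [0, 1)."""
    first_block = toy_randint4x(seed, offsets)[0]
    return [(word + 2**31) / 2**32 for word in first_block]


samples = toy_rand(42, range(8))
```

Since no state is carried between calls, the value at a given offset does not depend on which other offsets are requested alongside it.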

Compiler Hint Ops

multiple_of

Lets the compiler know that the values in input are all multiples of value.