triton.language¶
Programming Model¶
Returns the id of the current program instance along the given
Returns the number of program instances launched along the given
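As a rough illustration of the two queries above, the sketch below emulates a one-dimensional launch grid in plain Python rather than in Triton. The names `launch` and `BLOCK_SIZE` are hypothetical, and the sequential loop only stands in for program instances that a GPU would run in parallel; in Triton itself the two queries correspond to `tl.program_id` and `tl.num_programs`.

```python
# Plain-Python emulation of a 1D launch grid; this is not Triton code.
BLOCK_SIZE = 4  # hypothetical block size, chosen only for illustration


def launch(n_elements, block_size=BLOCK_SIZE):
    """Return one (pid, num_programs, offsets) tuple per program instance."""
    num_programs = (n_elements + block_size - 1) // block_size  # ceiling division
    work = []
    for pid in range(num_programs):  # on a GPU these instances run concurrently
        start = pid * block_size
        # Keep only offsets that fall inside the data (the role of a bounds mask).
        offsets = [start + i for i in range(block_size) if start + i < n_elements]
        work.append((pid, num_programs, offsets))
    return work


grid = launch(10)
for pid, total, offsets in grid:
    print(f"program {pid} of {total} handles offsets {offsets}")
```

Each instance derives its own slice of the data from its id, which is why the last instance ends up with a shorter slice when the element count is not a multiple of the block size.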
Creation Ops¶
Returns contiguous values within the left-closed and right-open interval [
Concatenate the given blocks
Returns a tensor filled with the scalar value for the given
Returns a tensor filled with the scalar value 0 for the given
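The creation semantics described above can be sketched with nested Python lists. This is a stand-in for illustration only: the helper names mirror the operations loosely, their signatures are assumptions, and nothing here calls Triton.

```python
# List-based stand-ins for the creation ops; signatures are illustrative assumptions.
def arange(start, end):
    """Contiguous integers in the half-open interval [start, end)."""
    return list(range(start, end))


def full(shape, value):
    """Nested lists of the given shape, every entry equal to `value`."""
    if len(shape) == 0:
        return value
    return [full(shape[1:], value) for _ in range(shape[0])]


def zeros(shape):
    """Same as full(), with the fill value fixed to 0."""
    return full(shape, 0)


print(arange(0, 4))
print(full((2, 3), 7))
print(zeros((2,)))
```

Note that the right endpoint is excluded, so `arange(0, 4)` yields four values, not five.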
Shape Manipulation Ops¶
Tries to broadcast the two given blocks to a common compatible shape.
Tries to broadcast the given tensor to a new
Expand the shape of a tensor, by inserting new length-1 dimensions.
Returns a contiguous flattened view of
Returns a tensor with the same number of elements as input but with the provided shape.
Returns a transposed tensor.
Returns a tensor with the same elements as input but a different shape.
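To make "a common compatible shape" concrete, here is a small shape-only sketch. It assumes NumPy-style broadcasting rules (shapes aligned from the right, with length-1 dimensions stretched), which may differ in detail from what the compiler accepts; `broadcast_shape` and `expand_dims_shape` are hypothetical helper names.

```python
# Shape arithmetic only; assumes NumPy-style broadcasting rules.
def broadcast_shape(a, b):
    """Return the common shape of two shapes, or raise if they are incompatible."""
    rank = max(len(a), len(b))
    a = (1,) * (rank - len(a)) + tuple(a)  # pad the shorter shape on the left
    b = (1,) * (rank - len(b)) + tuple(b)
    out = []
    for x, y in zip(a, b):
        if x == y or y == 1:
            out.append(x)
        elif x == 1:
            out.append(y)
        else:
            raise ValueError(f"cannot broadcast {a} with {b}")
    return tuple(out)


def expand_dims_shape(shape, axis):
    """Shape after inserting one length-1 dimension at position `axis`."""
    new_shape = list(shape)
    new_shape.insert(axis, 1)
    return tuple(new_shape)


print(broadcast_shape((4, 1), (1, 8)))
print(expand_dims_shape((4,), 1))
```

A common pattern is to expand a row index block to shape `(M, 1)` and a column index block to `(1, N)`, so that combining them broadcasts to `(M, N)`.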
Linear Algebra Ops¶
Returns the matrix product of two blocks.
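For reference, the matrix product of two small blocks can be written out in plain Python. This is a sketch of the arithmetic only; it says nothing about the data types, shape constraints, or accumulation precision of the actual operation, and `matmul` is a hypothetical helper name.

```python
# Reference arithmetic for a block matrix product, using nested lists.
def matmul(a, b):
    """Multiply an (M x K) block by a (K x N) block, returning an (M x N) block."""
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions do not match")
    cols = len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


product = matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(product)
```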
Memory Ops¶
Returns a tensor of data whose values are loaded from memory at the location defined by pointer.
Stores a tensor of data into the memory locations defined by pointer.
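The following sketch emulates block loads and stores against a Python list standing in for device memory. It assumes the masked form of these operations, where a boolean mask guards each lane and a fallback value fills the masked-out lanes on load; `tl.load` exposes this through its `mask` and `other` arguments. The helper functions and the list-based "memory" are illustrative only.

```python
# A Python list plays the role of device memory; offsets play the role of pointers.
def load(memory, offsets, mask=None, other=0):
    """Read memory[o] for each offset; masked-out lanes return `other` instead."""
    if mask is None:
        mask = [True] * len(offsets)
    return [memory[o] if m else other for o, m in zip(offsets, mask)]


def store(memory, offsets, values, mask=None):
    """Write values[i] to memory[offsets[i]] for every lane whose mask is true."""
    if mask is None:
        mask = [True] * len(offsets)
    for o, v, m in zip(offsets, values, mask):
        if m:
            memory[o] = v


src = [10, 20, 30, 40, 50]
dst = [0] * len(src)
offsets = [4, 5, 6, 7]                   # this block runs past the end of the data
mask = [o < len(src) for o in offsets]   # only offset 4 is in bounds

block = load(src, offsets, mask=mask, other=0)
store(dst, offsets, block, mask=mask)
print(block, dst)
```

Without the mask, the out-of-range offsets would read or write past the end of the buffer, which is the situation the guard is meant to prevent.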
Indexing Ops¶
Returns a tensor of elements from either
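Element-wise selection of this kind can be sketched as follows, assuming the usual convention that a true condition picks from the first operand and a false one from the second. The flat-list helper `where` is a stand-in, not the Triton function.

```python
# Element-wise select over flat lists; a stand-in for the tensor operation.
def where(condition, x, y):
    """Take x[i] where condition[i] is true, otherwise y[i]."""
    return [a if c else b for c, a, b in zip(condition, x, y)]


values = [-2.0, 0.5, -0.1, 3.0]
# ReLU expressed as a select: keep positive entries, replace the rest with 0.
relu = where([v > 0 for v in values], values, [0.0] * len(values))
print(relu)
```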
Math Ops¶
Computes the element-wise absolute value of
Computes the element-wise exponential of
Computes the element-wise natural logarithm of
Returns a floating-point tensor that results from dividing x by y.
Computes the element-wise cosine of
Computes the element-wise sine of
Computes the element-wise square root of
Computes the element-wise sigmoid of
Computes the element-wise softmax of
Returns the most significant 32 bits of the product of x and y.
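Three of the entries above are easy to pin down with scalar reference code. The sketch assumes unsigned 32-bit operands for the high-half multiply and uses the max-subtraction form of softmax for numerical stability; both choices are assumptions made for illustration rather than statements about the library's implementation.

```python
import math

MASK32 = 0xFFFFFFFF


def umulhi(x, y):
    """High 32 bits of the 64-bit product of two unsigned 32-bit integers."""
    return ((x & MASK32) * (y & MASK32)) >> 32


def sigmoid(x):
    """Logistic function 1 / (1 + e^-x) for a single float."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(values):
    """Softmax over a flat list, shifted by the maximum to avoid overflow."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


print(hex(umulhi(0xFFFFFFFF, 0xFFFFFFFF)))
print(sigmoid(0.0))
print(softmax([1.0, 2.0, 3.0]))
```

Subtracting the maximum before exponentiating leaves the result unchanged mathematically but keeps every exponent at or below zero, so large inputs do not overflow.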
Reduction Ops¶
Returns the maximum index of all elements in the
Returns the minimum index of all elements in the
Returns the maximum of all elements in the
Returns the minimum of all elements in the
Applies the combine_fn to all elements in
Returns the sum of all elements in the
Returns the xor sum of all elements in the
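The reductions above all collapse a block to a single value with some combining function. A flat-list sketch using the standard library is shown below; the helper names are hypothetical, and breaking ties in favour of the lowest index is an assumption rather than documented behaviour.

```python
from functools import reduce
import operator


def reduce_all(values, combine_fn):
    """Fold a flat list into one value using a binary combine function."""
    return reduce(combine_fn, values)


def xor_sum(values):
    """Bitwise XOR of every element."""
    return reduce(operator.xor, values)


def argmax(values):
    """Index of the largest element (lowest index wins on ties, by assumption)."""
    return max(range(len(values)), key=values.__getitem__)


def argmin(values):
    """Index of the smallest element (lowest index wins on ties, by assumption)."""
    return min(range(len(values)), key=values.__getitem__)


data = [3, 9, 2, 9]
print(reduce_all(data, operator.add), reduce_all(data, max))
print(xor_sum([5, 3, 4]), argmax(data), argmin(data))
```

Because a parallel reduction may combine elements in any grouping, the combine function should be associative for the result to be well defined.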
Scan Ops¶
Applies the combine_fn to each element with a carry in
Returns the cumsum of all elements in the
Returns the cumprod of all elements in the
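A scan differs from a reduction in that it keeps every intermediate result. The sketch below uses `itertools.accumulate` to show an inclusive scan on a flat list; whether the real operations are inclusive in every configuration is not stated above, so treat that as an assumption, along with the helper names.

```python
from itertools import accumulate
import operator


def scan(values, combine_fn):
    """Inclusive scan: out[i] combines values[0] through values[i]."""
    return list(accumulate(values, combine_fn))


def cumsum(values):
    """Running sum, a scan with addition as the combine function."""
    return scan(values, operator.add)


def cumprod(values):
    """Running product, a scan with multiplication as the combine function."""
    return scan(values, operator.mul)


print(cumsum([1, 2, 3, 4]))
print(cumprod([1, 2, 3, 4]))
print(scan([3, 1, 4, 1, 5], max))  # running maximum
```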
Atomic Ops¶
Performs an atomic add at the memory location specified by
Performs an atomic compare-and-swap at the memory location specified by
Performs an atomic max at the memory location specified by
Performs an atomic min at the memory location specified by
Performs an atomic exchange at the memory location specified by
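The read-modify-write behaviour of these operations can be emulated on the CPU with a lock. In the sketch below, `AtomicCell` is a hypothetical class standing in for a single memory location; each method returns the value held before the update, which is the usual convention for GPU atomics and is assumed here.

```python
import threading


class AtomicCell:
    """One memory location whose updates are serialized by a lock."""

    def __init__(self, value=0):
        self.value = value
        self._lock = threading.Lock()

    def add(self, delta):
        with self._lock:
            old = self.value
            self.value = old + delta
            return old

    def cas(self, expected, new):
        """Compare-and-swap: write `new` only if the cell holds `expected`."""
        with self._lock:
            old = self.value
            if old == expected:
                self.value = new
            return old

    def xchg(self, new):
        with self._lock:
            old, self.value = self.value, new
            return old

    def max(self, other):
        with self._lock:
            old = self.value
            self.value = other if other > old else old
            return old


counter = AtomicCell(0)


def worker():
    for _ in range(1000):
        counter.add(1)


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.value)
```

Because every increment happens under the lock, no update is lost even though eight threads touch the same location concurrently.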
Comparison Ops¶
Computes the element-wise minimum of
Computes the element-wise maximum of
Random Number Generation¶
Given a
Given a
Given a
Given a
Compiler Hint Ops¶
Insert a barrier to synchronize all threads in a block.
Let the compiler know that the value first values in
Let the compiler know that the value first values in
Let the compiler know that the values in
Debug Ops¶
Print the values at compile time.
Assert the condition at compile time.
Print the values at runtime from the device.
Assert the condition at runtime from the device.
Iterators¶
Iterator that counts upward forever.