triton.experimental.gluon.language.nvidia.hopper.tma.async_atomic_add
- triton.experimental.gluon.language.nvidia.hopper.tma.async_atomic_add(tensor_desc, coord, src, _semantic=None)
Atomically add data from shared memory into global memory using TMA.
- Parameters:
tensor_desc (tensor_descriptor) – Tiled tensor descriptor for the destination tensor in global memory.
coord (Sequence[int | ttgl.constexpr | ttgl.tensor]) – Coordinates in the destination tensor.
src (ttgl.shared_memory_descriptor) – Shared memory descriptor holding the source data to add.
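The effect of the operation can be illustrated with a small CPU-side reference model. This is only a sketch of the arithmetic in plain Python, not Gluon code: the helper name `tile_add_reference` is hypothetical, nested lists stand in for the global tensor and the shared-memory tile, and it assumes a 2D tile that lies fully inside the destination. It does not model asynchrony, atomicity across thread blocks, or any hardware bounds handling.

```python
def tile_add_reference(dest, coord, src):
    """Add the 2D tile `src` into `dest`, placing the tile's first element at `coord`.

    Hypothetical CPU-side model for illustration only; it is not part of Gluon.
    Assumes the whole tile lies inside `dest`.
    """
    row0, col0 = coord
    for i, src_row in enumerate(src):
        for j, value in enumerate(src_row):
            # Accumulate into the destination instead of overwriting it.
            dest[row0 + i][col0 + j] += value
    return dest


# Stand-in for the destination tensor in global memory.
dest = [[0] * 4 for _ in range(4)]
# Stand-in for the source tile held in shared memory.
tile = [[1, 2], [3, 4]]

tile_add_reference(dest, (1, 2), tile)
tile_add_reference(dest, (1, 2), tile)  # a second add to the same tile accumulates
```

In this model, adding the same tile twice at the same coordinates doubles each contribution rather than replacing the earlier values, which is what distinguishes an add from a plain store.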