triton.experimental.gluon.language.amd.cdna4.buffer_load
- triton.experimental.gluon.language.amd.cdna4.buffer_load(ptr, offsets, mask=None, other=None, cache=None, _semantic=None)
Performs an AMD buffer load from global memory using a scalar base pointer and a tensor of offsets, rather than a tensor of pointers. The loaded data is placed directly into registers.
- Parameters:
ptr (pointer to scalar) – Global memory scalar base pointer to load from.
offsets (tensor) – Offsets tensor for the load operation.
mask (tensor, optional) – Mask tensor for predicated loads. Defaults to None.
other (tensor or scalar, optional) – Tensor or scalar providing the values returned for elements where mask is false. Defaults to None.
cache (str, optional) – Cache modifier specifier. Defaults to None.
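The way ptr, offsets, mask, and other combine can be illustrated with a small host-side model. The following is a minimal NumPy sketch, not the Gluon API: `buffer_load_reference` is a hypothetical helper, the `memory` array stands in for the global memory that ptr points to, and filling masked-out lanes with 0 when other is omitted is an assumption of this sketch rather than documented behavior.

```python
import numpy as np


def buffer_load_reference(memory, offsets, mask=None, other=None):
    """Hypothetical NumPy model of a masked, offset-based load.

    `memory` plays the role of the region behind the scalar base pointer;
    each result element is read from memory[offsets[i]] when mask[i] is true.
    """
    offsets = np.asarray(offsets)
    if mask is None:
        # No mask: every lane performs its load.
        mask = np.ones(offsets.shape, dtype=bool)
    else:
        mask = np.broadcast_to(np.asarray(mask, dtype=bool), offsets.shape)

    # Assumption of this sketch: masked-out lanes read as 0 if `other` is omitted.
    fill = 0 if other is None else other
    fill = np.broadcast_to(np.asarray(fill, dtype=memory.dtype), offsets.shape)

    # Redirect masked-out lanes to offset 0 so they never index out of range.
    safe_offsets = np.where(mask, offsets, 0)
    return np.where(mask, memory[safe_offsets], fill)


memory = np.arange(10, 20, dtype=np.float32)   # 10 elements: 10.0 .. 19.0
offsets = np.array([0, 3, 7, 12])              # last offset is out of range
mask = offsets < memory.size                   # predicate off the bad lane
result = buffer_load_reference(memory, offsets, mask=mask, other=-1.0)
# result.tolist() == [10.0, 13.0, 17.0, -1.0]
```

The sketch takes one base array plus an integer offsets array, which mirrors the scalar-base-plus-offsets addressing described above, as opposed to passing one pointer per element.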