triton.experimental.gluon.language.amd.cdna4.mfma_scaled
- triton.experimental.gluon.language.amd.cdna4.mfma_scaled(a, a_scale, a_format, b, b_scale, b_format, acc, _semantic=None)
AMD Scaled MFMA operation.
`c = (a * a_scale) @ (b * b_scale) + acc`
a and b use microscaling formats described in the “OCP Microscaling Formats (MX) Specification”: https://www.opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf. Currently supported only on CDNA4 hardware.
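The formula above can be illustrated with a plain-Python reference of the arithmetic. This is only a sketch of the math, not the Gluon API and not how the hardware instruction is implemented. Several details are assumptions that this page does not state: a is M x K and b is K x N, scales are E8M0 exponent bytes shared by blocks of 32 elements along K (as in the MX specification), a_scale has shape M x (K // 32), b_scale has shape N x (K // 32), and a scale of None is treated as a unit scale. The helper names are hypothetical.

```python
# Reference sketch of the arithmetic only; NOT the Gluon API.
# Assumed layout (not confirmed by this page):
#   a: M x K, b: K x N, acc: M x N, all given as lists of Python floats
#   a_scale: M x (K // 32), b_scale: N x (K // 32), E8M0 bytes (ints)

BLOCK = 32  # MX block size along K, per the OCP MX specification


def e8m0_to_float(byte):
    """Decode an E8M0 scale byte as 2 ** (byte - 127). NaN (0xFF) is ignored here."""
    return 2.0 ** (byte - 127)


def _scale_at(scale, row, k_index):
    # Assumption: a missing scale (None) behaves like a scale of 1.0.
    if scale is None:
        return 1.0
    return e8m0_to_float(scale[row][k_index // BLOCK])


def scaled_matmul_reference(a, a_scale, b, b_scale, acc):
    """Compute (a * a_scale) @ (b * b_scale) + acc element by element."""
    m, k, n = len(a), len(b), len(b[0])
    out = [row[:] for row in acc]  # copy so the caller's acc is untouched
    for i in range(m):
        for j in range(n):
            total = 0.0
            for kk in range(k):
                sa = _scale_at(a_scale, i, kk)
                sb = _scale_at(b_scale, j, kk)
                total += (a[i][kk] * sa) * (b[kk][j] * sb)
            out[i][j] += total
    return out
```

Each scale applies to its own operand before the matrix product, which is why the formula is parenthesized per operand rather than read left to right.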
- Parameters:
a (tensor) – The operand A to be multiplied.
a_scale (Optional[tensor]) – Scale factor for operand A.
a_format (str) – Format of the operand A. Available formats: e2m1, e4m3, e5m2.
b (tensor) – The operand B to be multiplied.
b_scale (Optional[tensor]) – Scale factor for operand B.
b_format (str) – Format of the operand B. Available formats: e2m1, e4m3, e5m2.
acc (tensor) – Accumulator tensor.
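To make the format names concrete, here is a minimal sketch that decodes a single e2m1 (FP4) element following the bit layout in the MX specification: 1 sign bit, 2 exponent bits with bias 1, and 1 mantissa bit. The function name `decode_e2m1` is hypothetical and is not part of Triton or Gluon. How elements are packed into bytes for mfma_scaled is not covered here.

```python
def decode_e2m1(nibble):
    """Decode one 4-bit e2m1 value (sign | 2 exponent bits | 1 mantissa bit)."""
    sign = -1.0 if nibble & 0x8 else 1.0
    exponent = (nibble >> 1) & 0x3
    mantissa = nibble & 0x1
    if exponent == 0:
        # Subnormal range: 0 or 0.5
        return sign * mantissa * 0.5
    # Normal range: (1 + mantissa / 2) * 2 ** (exponent - bias), bias = 1
    return sign * (1.0 + 0.5 * mantissa) * 2.0 ** (exponent - 1)


# The eight non-negative e2m1 values, in encoding order.
positive_values = [decode_e2m1(code) for code in range(8)]
```

By comparison, e4m3 and e5m2 are 8-bit formats (4 exponent and 3 mantissa bits, and 5 exponent and 2 mantissa bits, respectively), so e2m1 trades range and precision for half the storage per element.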