triton.testing.do_bench(fn, warmup=25, rep=100, grad_to_none=None, quantiles=None, fast_flush=True, return_mode='mean')

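Benchmark the runtime of the provided function. With the default arguments shown above (quantiles=None, return_mode='mean'), return the mean runtime of fn in milliseconds. If quantiles is given, return the runtime at each requested quantile instead, e.g. the median with the 20th and 80th percentiles for quantiles=[0.5, 0.2, 0.8].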

  • fn (Callable) – Function to benchmark

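  • warmup (int) – Warmup time (in ms): approximate time budget spent running fn before timing starts

  • rep (int) – Repetition time (in ms): approximate time budget for the timed runs of fn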

  • grad_to_none (iterable of torch.Tensor, optional) – Tensors whose gradient is reset to None before each timed run of fn

  • quantiles (list[float], optional) – Quantiles of the measured runtimes to return, e.g. [0.5, 0.2, 0.8]. When set, these are returned instead of the return_mode statistic.

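  • fast_flush (bool) – Use a faster kernel to flush the L2 cache between measurements

  • return_mode (str) – Statistic of the measured runtimes to return when quantiles is None (default 'mean')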
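The parameters above can be illustrated with a minimal usage sketch. It assumes a CUDA-capable GPU with PyTorch and Triton installed. The helper names effective_gbps and bench_vector_add are hypothetical and not part of Triton. The three-value unpacking relies on passing three quantiles.

```python
def effective_gbps(n_elements, element_size, ms):
    """Effective bandwidth in GB/s for an op that reads two arrays and writes one."""
    bytes_moved = 3 * n_elements * element_size
    return bytes_moved * 1e-9 / (ms * 1e-3)


def bench_vector_add(n_elements=1 << 20):
    # Imported here so that effective_gbps stays usable on machines without a GPU stack.
    import torch
    import triton

    x = torch.rand(n_elements, device="cuda")
    y = torch.rand(n_elements, device="cuda")
    # With quantiles set, do_bench returns one runtime (in ms) per requested quantile.
    median_ms, p20_ms, p80_ms = triton.testing.do_bench(
        lambda: x + y, quantiles=[0.5, 0.2, 0.8]
    )
    return effective_gbps(n_elements, x.element_size(), median_ms), p20_ms, p80_ms
```

Calling bench_vector_add() on a GPU machine returns the effective bandwidth at the median runtime together with the 20th and 80th percentile runtimes. Omitting quantiles would instead yield a single float chosen by return_mode.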