triton.testing.do_bench_cudagraph

triton.testing.do_bench_cudagraph(fn, rep=20, grad_to_none=None, return_mode='mean')

Benchmark the runtime of the provided function using CUDA graphs.

Parameters:
  • fn (Callable) – Function to benchmark

  • rep (int) – Target duration of the repeated measurement runs (in ms)

  • grad_to_none (torch.Tensor, optional) – Reset the gradient of the provided tensor to None

  • return_mode (str) – Statistic to return over the measured runs. Defaults to 'mean'
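
A minimal usage sketch of the signature above, assuming a CUDA-capable GPU and that PyTorch and Triton are installed. The helper name bench_add and the tensor size are illustrative and not part of Triton. The call is wrapped in a side stream because some Triton versions refuse to capture a CUDA graph on the default stream; on versions that create their own stream the wrapper should be harmless.

```python
import torch
import triton.testing


def bench_add(n: int = 1 << 20) -> float:
    """Return the mean runtime (in ms) of an elementwise add on the GPU.

    bench_add is a hypothetical helper written for this example.
    """
    x = torch.randn(n, device="cuda")
    y = torch.randn(n, device="cuda")

    # Run on a non-default stream so that graph capture is permitted
    # (assumption: needed on some Triton versions, harmless on others).
    with torch.cuda.stream(torch.cuda.Stream()):
        return triton.testing.do_bench_cudagraph(
            lambda: x + y,       # fn: the function to benchmark
            rep=20,              # repetition time in ms
            return_mode="mean",  # default shown in the signature
        )


if torch.cuda.is_available():
    print(f"x + y: {bench_add():.4f} ms")
```

Because the function is captured in a CUDA graph, fn should avoid operations that synchronize with the host, such as calling .item() or printing tensor values.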