Limit clock speed and memory speed on an Nvidia GPU

How to limit clock speed and memory speed for more consistent results when benchmarking and profiling CUDA kernels.

3 min read / 21 Oct 24

In recent work developing a CUDA kernel for matrix multiplication on Nvidia GPUs, I wanted to fix the clock speed and memory speed of the GPU for more consistent benchmark results. This is a quick note on how to do that, although it is not necessarily the right approach: allowing the speeds to vary can make for more realistic benchmarks too. See the commentary in the evaluation section of the same post for a discussion of different benchmarking protocols.

When running jobs on a modern GPU, especially on a laptop or mobile device, the clock speed will change very frequently, even multiple times per second, to maximize performance within thermal bounds. Memory speed will also change, but less frequently. Under load, the GPU increases its clock speed and memory speed for optimal performance, but this increases the temperature, and eventually a thermal throttle intervenes to reduce speeds and avoid overheating the device.

When benchmarking or profiling code, this dynamic clock speed can make it difficult to obtain consistent results across multiple runs. Execution times will depend not only on the kernel being tested, but also the temperature of the hardware when it is launched. For that reason, we may wish to operate in a different mode: instead of a temperature limit while allowing clock speed to vary, have a clock speed limit while allowing temperature to vary.

Similarly, Nvidia’s Nsight Compute sets clocks to a base rate for more consistent results when profiling.

We can use nvidia-smi from the command line to limit the clock speed and memory speed:

nvidia-smi --lock-gpu-clocks=1155
nvidia-smi --lock-memory-clocks=6000

That sets the clock speed to 1155 MHz and the memory speed to 6000 MHz (these can be changed to other values). Once finished, the speeds can be reset again:

nvidia-smi --reset-gpu-clocks
nvidia-smi --reset-memory-clocks

It may be necessary to add sudo to these commands.
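
To choose other values, it can help to see which clock speeds the GPU reports as supported. The sketch below is one way to list them. The function name supported_gpu_clocks and the NVIDIA_SMI variable are my own for illustration, and I am assuming the driver reports supported clocks for your GPU, which may not hold on every device.

```shell
# Command to run; override if sudo is needed, e.g. NVIDIA_SMI="sudo nvidia-smi".
NVIDIA_SMI="${NVIDIA_SMI:-nvidia-smi}"

# Hypothetical helper: print the supported graphics clocks in MHz,
# one per line, sorted in ascending order with duplicates removed.
supported_gpu_clocks() {
    # $NVIDIA_SMI is left unquoted so "sudo nvidia-smi" splits into two words.
    $NVIDIA_SMI --query-supported-clocks=gr --format=csv,noheader,nounits |
        sort -n -u
}
```

Calling supported_gpu_clocks prints the candidate values for --lock-gpu-clocks. Replacing gr with mem in the query should do the same for memory clocks, and nvidia-smi -q -d SUPPORTED_CLOCKS gives a more verbose view of both.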

You may get a warning about persistence mode being disabled when using the above commands. It does not seem necessary to enable it, but doing so will silence the warning, if it is a nuisance:

nvidia-smi --persistence-mode=1

and disabled again with:

nvidia-smi --persistence-mode=0

Again, sudo may be required.
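
Since it is easy to forget the reset step, the lock and reset commands can be wrapped around a benchmark run. This is a minimal sketch rather than a robust tool: with_locked_clocks and the NVIDIA_SMI variable are names I made up, and it does not handle interruption (for example Ctrl-C) during the run.

```shell
# Command to run; override if sudo is needed, e.g. NVIDIA_SMI="sudo nvidia-smi".
NVIDIA_SMI="${NVIDIA_SMI:-nvidia-smi}"

# Hypothetical helper: with_locked_clocks GPU_MHZ MEM_MHZ COMMAND [ARGS...]
# Limits the clocks, runs the command, then resets the clocks whether or
# not the command succeeded. Returns the command's exit status.
with_locked_clocks() {
    gpu_mhz="$1"
    mem_mhz="$2"
    shift 2

    # $NVIDIA_SMI is left unquoted so "sudo nvidia-smi" splits into two words.
    $NVIDIA_SMI --lock-gpu-clocks="$gpu_mhz" || return 1
    if ! $NVIDIA_SMI --lock-memory-clocks="$mem_mhz"; then
        # Undo the first limit if the second could not be applied.
        $NVIDIA_SMI --reset-gpu-clocks
        return 1
    fi

    # Run the benchmark and remember its exit status.
    "$@"
    status=$?

    $NVIDIA_SMI --reset-gpu-clocks
    $NVIDIA_SMI --reset-memory-clocks
    return "$status"
}
```

For example, with_locked_clocks 1155 6000 ./benchmark would run a (hypothetical) ./benchmark binary at the speeds used earlier in this note and reset the clocks afterwards.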

It is possible to limit the clock speed and memory speed like this, but not to lock them, despite the terminology in the command-line options of nvidia-smi. The thermal throttle still governs the hardware, and will intervene to prevent damage if the GPU becomes too hot at these speeds. But by monitoring clock speed and memory speed with an app such as Resources or Mission Center, or a command-line tool such as nvtop, one can find a setting that is never reduced over the course of an experiment.
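
nvidia-smi itself can also log the clocks, which makes it possible to check after the fact whether a setting held. The sketch below samples the graphics and memory clocks once per second and reports the lowest value seen in each. The names sample_clocks and min_clocks are my own, and I am assuming a single GPU, since with several GPUs each sample produces one line per device.

```shell
# Command to run; override if sudo is needed, e.g. NVIDIA_SMI="sudo nvidia-smi".
NVIDIA_SMI="${NVIDIA_SMI:-nvidia-smi}"

# Hypothetical helper: print one "graphics_mhz, memory_mhz" line per
# second until interrupted.
sample_clocks() {
    $NVIDIA_SMI --query-gpu=clocks.gr,clocks.mem \
        --format=csv,noheader,nounits --loop=1
}

# Hypothetical helper: read those lines on stdin and print the lowest
# graphics clock and the lowest memory clock seen, separated by a space.
min_clocks() {
    awk -F',' '
        {
            gr = $1 + 0    # force numeric comparison
            mem = $2 + 0
            if (NR == 1 || gr < min_gr) min_gr = gr
            if (NR == 1 || mem < min_mem) min_mem = mem
        }
        END { if (NR > 0) print min_gr, min_mem }
    '
}
```

One way to use this is to run sample_clocks > clocks.csv in the background, run the experiment, stop the sampler, and then run min_clocks < clocks.csv. If either minimum is below the requested limit, the speeds were reduced at some point during the run and a lower setting may be needed.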