
GPU Programming in the Cloud

A round-up of cloud service providers, and a how-to guide for each

19 min read / 22 Nov 22 (updated 28 Jan 24)

I recently moved to the cloud for GPU computing, and in the process tested many of the GPU cloud service providers. I spun up an instance on each, connected to it with SSH, installed CUDA, compiled code and ran test programs. This post documents my experience. It includes a roundup of GPU cloud service providers, advice on the best choice for you (easiest? cheapest? closest?) and how-to guides for getting started with each.

My own use case is C, C++ and CUDA development on Linux, with Nvidia GPUs; machine learning, scientific computing, the occasional PyTorch, but mostly bespoke code. The commentary is colored by that use case.

Glossary
Choosing a GPU cloud service provider
Configure your laptop or desktop for SSH
Launch an instance on your GPU cloud service provider
Connect to your instance with SSH
Connect to your instance with Visual Studio Code
Connect to your instance with Nsight Systems

Glossary

Instance: A virtual machine running in the cloud. Multiple virtual machines may share a single physical machine.
Instance type: The configuration of an instance, as in its number of cores, memory, and disk space. When launching an instance with a cloud service provider, one chooses from a menu of instance types on offer. Some of these will have GPUs.
Region: The location of the data center in which an instance is running. Typically described at a country or continent level, e.g. U.S. West, U.S. East, Europe, Asia-Pacific, South America.
Machine Image: The disk image that loads onto the instance when launched, providing an operating system and other software. Usually based on a major Linux distribution; additional software can be installed via its package manager.

Choosing a GPU cloud service provider

You will need to choose a cloud service provider. Factors to consider include cost, ease of getting started, and the location of data centers.

I recommend Paperspace, now owned by DigitalOcean. Getting started is a breeze, and they offer a good mix of instance types, including cheaper options for development and testing, and cutting-edge hardware for production runs. An added bonus is the ability to pause and resume a running instance, and only be billed for storage in between.

Tips for choosing a GPU cloud service provider

Prefer the bespokes (Paperspace, Lambda, Genesis Cloud) to the majors (Amazon Web Services, Microsoft Azure, Google Cloud): All three majors have a default quota of zero for GPU instances, so that a support ticket is necessary for GPU access. They do have extensive offerings for enterprise, but this can mean overwhelming complexity for individuals. The bespokes offer a better product for the individual developer: nonzero quotas, straightforward instance types, frictionless onboarding.

For me, quota increase requests were resolved in three minutes for Google Cloud, two days for Microsoft Azure, and four days for Amazon Web Services. I already had instances running within a few minutes on all of Paperspace, Lambda and Genesis Cloud.

Prefer a region near you: Latency degrades the development experience over SSH, adding lag to every keystroke. It grows with distance, and is unnoticeable for an instance near you.

You may want to consider one of the majors over one of the bespokes if they offer better support to your region. All the majors have a global footprint, while the bespokes have more limited coverage, albeit still wide.
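
Once an instance is running, a rough way to compare regions is to measure the round-trip time to it. The sketch below assumes the summary line that ping prints on Linux and macOS (min/avg/max/...) and pulls out the average. The helper name avg_rtt is hypothetical, and some providers may block ping traffic by default, in which case no summary is printed.

```shell
# Extract the average round-trip time (ms) from ping's summary line.
# Assumes the Linux/macOS format: "... min/avg/max/mdev = 10.1/12.3/15.6/1.2 ms".
# avg_rtt is a hypothetical helper name, not part of any provider's tooling.
avg_rtt() {
  awk -F'/' '/min\/avg\/max/ { print $5 }'
}
```

With 🏠host standing for the address of your instance, `ping -c 5 host | avg_rtt` would print a single number in milliseconds. Lower is better for interactive work over SSH.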

Shop around for the best prices: Exact prices are not reported here as they may change frequently, but expect about $0.50 USD per hour for a single GPU of older generation hardware (ideal for development at low cost) up to about $1.50 USD per hour for the latest generation (ideal for production runs). Prices do vary between providers and regions, but the market is competitive, and shopping around will keep it that way.

Consider different providers for different use cases: Staying with the same provider for multiple use cases may be convenient, but needs differ. For development, older generation hardware may be sufficient. For production, latest generation hardware may be essential. Keeping compute close to data may be cheaper. Having access to CPU instances, not just GPU instances, may be convenient for supporting jobs like data preparation. After all, they all look the same through a terminal.
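
To put the rough hourly rates quoted above in perspective, a back-of-the-envelope estimate can help. The sketch below simply multiplies an hourly rate by hours of use. The 40 hours is an assumed working week, not a figure from any provider, and estimate_cost is a hypothetical helper.

```shell
# Estimate cost as hourly rate (USD) times hours of use.
# estimate_cost is a hypothetical helper, not part of any provider's tooling.
estimate_cost() {
  awk -v rate="$1" -v hours="$2" 'BEGIN { printf "%.2f\n", rate * hours }'
}

# Assumed usage: 40 hours in a week, at the rough rates quoted in this post.
echo "Older generation:  \$$(estimate_cost 0.50 40) per week"
echo "Latest generation: \$$(estimate_cost 1.50 40) per week"
```

Storage and data transfer are usually billed separately, so treat the result as a lower bound.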

Roundup of GPU cloud service providers

Provider	Difficulty	GPU instances?	CPU instances?	Region
Paperspace	Easy	Yes	Yes	North America, Europe
Lambda	Easy	Yes	No	North America, Europe, Asia-Pacific, Middle East
Genesis Cloud	Medium	Yes	Yes	Europe
Linode	Hard	On request	Yes	North America, Europe, Asia-Pacific
Amazon Web Services	Hard	On request	Yes	Global
Google Cloud Services	Hard	On request	Yes	Global
Microsoft Azure	Painful	On request	Yes	Global

Providers not in this roundup include Alibaba Cloud and IBM Cloud.

Details of GPU cloud service providers

Paperspace offers a complete range of instance types, from affordable older generation hardware to the latest generation. It strives for simplicity, and achieves it, with a straightforward sign-up process that gets new users up and running in minutes. A particular advantage is the ability to pause a running instance and resume it later while paying only for storage in between, offering convenience and cost effectiveness over competitors that only offer complete shutdown and reconfiguration. There are several machine images available based on Ubuntu. Paperspace was acquired by DigitalOcean in August 2023 but, as of January 2024, still runs as a separate service apart from billing.

Lambda focuses on state-of-the-art, having mostly latest generation hardware, but there are some cheaper instances with older hardware in limited quantities. Lambda specializes in GPU instances and does not provide CPU instances. All instances run Ubuntu and have LambdaStack pre-installed—Lambda’s software stack for deep learning that includes e.g. CUDA, TensorFlow and PyTorch. Getting started is frictionless and takes only minutes.

Genesis Cloud is unique in two ways: providing hardware from Nvidia’s GeForce consumer line, and operating from Iceland and Norway. GeForce cards are great for many applications, and certainly for development, but look elsewhere for Nvidia’s data center GPUs if double precision performance is important. All instances run Ubuntu. A good selection of instance types is available by default, with access to the others on request. Essential software, such as CUDA, must be self-installed.

Linode offers a limited GPU service. All instance types use the Nvidia Quadro RTX 6000 which, while affordable for production, is a pricier floor for development than other providers. Access requires a support ticket, a description of the use case, and at least $100 USD of spend (which can be pre-purchased as credits). Once granted, instances are easy to set up with a simple web interface, but only base operating system images are provided, so that CUDA and other essential software must be self-installed.

Amazon Web Services (AWS) offers an extended range of instance types spanning multiple generations of hardware. There are a wide variety of machine images with GPU support. Access to GPU instances requires a support ticket, as initial quotas are zero. Frustratingly, you will likely find out the first time you try and fail to launch an instance, and you must repeat the process.

Google Cloud Services offers an extended range of instance types covering multiple generations of hardware. There are a wide variety of machine images with GPU support. Access to GPU instances is on request, as initial quotas are zero. Frustratingly, you will likely find out the first time you try and fail to launch an instance—but, unlike the other majors, quota increase requests seem to be processed quickly, and there is a convenient retry button to avoid repeating the whole process again. The web interface is much more helpful than the other majors, presenting instance types in a human readable form rather than obscure codes, and making better default suggestions for machine images.

Microsoft Azure offers an extended range of instance types covering multiple generations of hardware. There are a wide variety of machine images with GPU support. Access to GPU instances requires a support ticket, as initial quotas are zero. Frustratingly, you will likely find out the first time you try and fail to launch an instance—and good luck from there, as you navigate a quagmire of product SKUs, regional availability, and interface nuisances, while getting rejected for quota increases.

Configure your laptop or desktop for SSH

Access to an instance requires an SSH key pair, consisting of a public key and a private key. The public key is shared with your cloud service provider to manage access to your instances, while the private key is only for you. The public key is the lock that only the private key can open.

You may already have an SSH key. Check for the files ~/.ssh/id_rsa.pub and ~/.ssh/id_rsa. If they exist, you can use them. If not, create them by running ssh-keygen in your terminal. The first file contains your public key, the second your private key.

You will need the contents of ~/.ssh/id_rsa.pub below. We will refer to it as your 🔑key.
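
The check described above can be sketched in the terminal as follows. It looks for id_rsa.pub, as named in this post. Newer versions of ssh-keygen may create a different key type by default, so the sketch also looks for id_ed25519.pub; that second file name is an assumption not covered above, and find_public_key is a hypothetical helper.

```shell
# Print the path of the first public key found in a directory.
# Checks id_rsa.pub (the name used in this post) and id_ed25519.pub
# (assumption: the name used when ssh-keygen generates an Ed25519 key).
find_public_key() {
  for name in id_rsa id_ed25519; do
    if [ -f "$1/$name.pub" ]; then
      echo "$1/$name.pub"
      return 0
    fi
  done
  return 1
}

if key_file=$(find_public_key "$HOME/.ssh"); then
  echo "Found $key_file; its contents are your key:"
  cat "$key_file"
else
  echo "No public key found; run ssh-keygen to create one."
fi
```

Only the .pub file is ever copied to a provider. The private key stays on your machine.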

Launch an instance on your GPU cloud service provider

The next step is to launch an instance via the web interface of your chosen provider. As part of this you will need to provide your 🔑key. Some providers require additional steps such as upgrading from a free trial to a paid plan, and submitting a support ticket to enable GPU access. Detailed instructions for each provider are below.

By the end of this section you should have an instance running on your chosen cloud service provider, with a username and hostname (or IP address) that can be used to access it. We will refer to these as your 👤user and 🏠host.

Paperspace

Work from the console.

Import your 🔑key

Click your profile picture in the top right, then Your account, then the SSH Keys tab. Click the Add new SSH Key button, enter a name and copy in your 🔑key.
Launch an instance

Select the CORE Virtual Servers product from the top left menu, then click the Create a machine button. Choose a region close to you, but otherwise keep the defaults. Click the Create button.
Find your 👤user and 🏠host

Wait for the instance to start then click the Connect button. You will see a command of the form ssh user@host. The 👤user is paperspace and the 🏠host some IP address.
Prepare the instance (after you log in for the first time, see below)
Unfortunately, while CUDA is installed, it is not accessible out of the box. Enter the following commands each time you log in to make it accessible:

export PATH=/usr/local/cuda/bin:$PATH
export CPATH=/usr/local/cuda/include:$CPATH
export LIBRARY_PATH=/usr/local/cuda/lib64:$LIBRARY_PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH

These can be added to the end of ~/.profile to avoid entering them on each login. Run:

nano ~/.profile

Copy them into the bottom of the file, then hit Ctrl+O followed by Enter to save, and Ctrl+X to exit.
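
As an alternative to editing the file by hand, the same lines can be appended from the terminal. This is a minimal sketch, assuming the /usr/local/cuda location given above; append_cuda_env is a hypothetical helper that skips the append if the lines are already there.

```shell
# Append the CUDA environment variables to a profile file, once.
# append_cuda_env is a hypothetical helper; /usr/local/cuda is the install
# location given in this post.
append_cuda_env() {
  profile="$1"
  # Skip if the lines are already present, so repeated runs do not duplicate them.
  if [ -f "$profile" ] && grep -q '/usr/local/cuda/bin' "$profile"; then
    return 0
  fi
  # The quoted 'EOF' keeps $PATH and friends unexpanded in the file.
  cat >> "$profile" <<'EOF'
export PATH=/usr/local/cuda/bin:$PATH
export CPATH=/usr/local/cuda/include:$CPATH
export LIBRARY_PATH=/usr/local/cuda/lib64:$LIBRARY_PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
EOF
}
```

Running `append_cuda_env ~/.profile` on the instance and logging in again should have the same effect as the nano steps.
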
Terminate the instance (once you’re finished)

On the Machines tab, find the instance, click the three dots in the top right of its card and then Deactivate.

Lambda

Work from the dashboard.

Launch an instance

Select Instances on the left then click the Launch Instance button. You may be prompted to add a payment method if you have not done so already. Follow the steps to specify an instance type, region, and provide your 🔑key. Once complete you will be taken back to the Instances page. The new instance will have a status of Booting, which will progress to Running after a few minutes.
Find your 👤user and 🏠host

Under the SSH Login column you will see a command of the form ssh user@host.
Terminate the instance (once you’re finished)

On the Instances page, select the checkbox next to the instance and a Terminate button will appear.

Genesis Cloud

Work from the dashboard.

Launch an instance

Click the Create new instance button in the top right. Select an instance type; the default may require an increase to your quota, but other types should be available. Keep the default Ubuntu image but be sure to check the Install NVIDIA® GPU driver box. Under Authentication, choose SSH Key, and copy in your 🔑key. Click Create instance. The Instances page will appear with the new instance listed. Its status will progress from Enqueued to Creating… to Active.
Find your 👤user and 🏠host

On the same page will be a command of the form ssh user@host.
Prepare the instance (after you log in for the first time, see below)
The driver will install in the first few minutes of the instance running (as long as the Install NVIDIA GPU driver box was checked). CUDA can be installed with:
sudo apt update
sudo apt install nvidia-cuda-toolkit
Terminate the instance (once you’re finished)

From the Instances page, click the three dots icon to the right of the instance and select Destroy from the pop-up menu.

Amazon Web Services

Work from the AWS Management Console.

Choose a region

Select a region in the top right.
Import your 🔑key

Select the Services menu in the top left, then EC2. Select Network & Security > Key Pairs on the left. Select Actions > Import key pair at the top left. Provide a name and copy your 🔑key into the large text area. Click the Import key pair button.
Increase your quota

You may have zero quota for the GPU instances that interest you. Just try launching what you want below, see if it fails, and if so follow the instructions that will be provided in that case to request an increase in quota.
Launch an instance

Still on the EC2 page, select Instances on the left then the Launch Instances button. Click Browse more AMIs (AMI = Amazon Machine Image), search for gpu, and under the Quickstart AMIs tab choose Ubuntu’s Deep Learning AMI GPU PyTorch. For the instance type select box choose g4dn.xlarge (the economical option; more information on instance types is available from AWS). Under Key Pair (login) select the name of the key pair that you imported above. Click the Launch instance button. Once the instance starts click View All Instances. If it fails to start due to lack of quota, follow the instructions given to submit a support ticket, and try again once approved.
Find your 👤user and 🏠host

Select the instance and note the Public IPv4 address or Public IPv4 DNS, either works as your 🏠host. Your 👤user is ubuntu. To confirm these details, or if you chose a different machine image than that suggested above, click the instance name, then the Connect button, then the SSH client tab for detailed instructions. This will give an SSH command of the form ssh -i key.pem user@host; you will only need the 👤user and 🏠host parts of that.
Terminate the instance (once you’re finished)

On the Instances page, check the box next to instance name, then select Instance state > Terminate instance in the top right.

Google Cloud Services

Work from the Google Cloud Console.

Upgrade to a paid account

A banner should appear at the top of the page with an Activate button to use. If not, refresh the page. If it still does not appear, perhaps you already have a paid account.
Add your SSH key

From the main menu, select Compute Engine > Settings > Metadata > SSH Keys. Paste your 🔑key into the box and hit Save.
Increase your quota

You can update quotas by entering quotas in the search box at the top and selecting All Quotas. However, this firehose of regions and instance types assumes that you already know what you are doing. Easier is to try to start an instance, let it fail, and use the prompt to increase the relevant quota.
Launch an instance

Steps:
1. From the main menu, select Compute Engine > VM Instances. Click the Create Instance button. First time users may be interrupted by a splash page for the Compute Engine API; click Enable.
2. Under Machine Configuration, select GPU. It will select the cheapest GPU instance type by default.
3. Under Boot Disk will be a suggestion to switch from the default machine image to one with CUDA pre-installed. Click the Switch Image button to do that. In the dialog that appears, keep the defaults, just click Select.
4. Returning to the VM Instances page, the new instance should appear. You may receive an error message about quotas. If so, hover the mouse over the red status icon next to the instance to reveal a popup with a Raise Limit button. Click that and follow the instructions. Wait for a response and, if approved, click the Retry icon in the rightmost column. With any luck, the instance will start the second time.
Find your 👤user and 🏠host

On the VM Instances page, the External IP column provides 🏠host. For 👤user you will need the username associated with your SSH key pair, which is likely just the username on the laptop or desktop where you created it. If that does not work, you can always generate a new SSH key pair using the instructions above.
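
The username is usually visible in the comment at the end of the public key, which ssh-keygen sets to user@hostname by default. A small sketch to read it follows, assuming the key file has such a comment and that this is the name the instance uses; key_username is a hypothetical helper.

```shell
# Print the user part of the comment at the end of a public key file.
# Assumes the usual three-field format: key type, key data, comment (user@hostname).
# key_username is a hypothetical helper name.
key_username() {
  awk 'NF >= 3 { split($3, parts, "@"); print parts[1]; exit }' "$1"
}
```

For example, `key_username ~/.ssh/id_rsa.pub` prints the name to try as 👤user. If the key has no comment, nothing is printed.
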
Prepare the instance (after you log in for the first time, see below)

On first login, a prompt will ask about installing the Nvidia driver. Answer yes. Unfortunately, if you intend to use Visual Studio Code, this prompt also breaks its setup routine. The solution is to log in via SSH with your terminal first, answer yes, wait for the installation to complete, then try connecting Visual Studio Code.
Terminate the instance (once you’re finished)

On the VM Instances page, check the box next to the instance, click the three dots in the menu above, and choose Delete.

Microsoft Azure

My Azure experience was frustrating. I found the user interface tedious to work with, and was caught in a cycle of rejected quota requests for a single GPU instance with no reason provided. Finally, a helpful support engineer found a GPU for me a continent away. To get that far took two days. Some prior study may help: perhaps start with the documentation on GPU instance types and cross-reference with regional availability. You may still find it necessary to contact support as I did, however, and be flexible with regions: the availability of an instance type in a given region does not guarantee that a quota request is approved (demand management, presumably).

Work from the Azure Portal.

Upgrade to a pay-as-you-go subscription

This is necessary to action a quota increase. Click the upgrade prompt in the top menu. If it does not appear, you may already have such an account.
Import your 🔑key

Type ssh keys in the search box at the top of the dashboard. Click the SSH keys item. Click the Create SSH key button. Under Resource group select an existing group or create a new one. Under Key pair name enter a name to give the key. Under SSH public key source select Upload existing public key and copy your 🔑key into the text area. Click the Review + Create button. A validation screen appears, check the details and click the Create button.
Request a quota increase

Type quotas in the search box at the top of the dashboard. Select Quotas then Compute. At the top, click on Region and select your preferred region to narrow the search. Search for a GPU instance type. Click Request quota increase > Enter a new limit in the top left. Enter a new limit equal to the number of vCPUs for the instance type and continue (emphasis: don’t just enter 1, enter the number of vCPUs). A response takes several minutes, and you need to keep the web page open while you wait for it (🤷). After waiting, you may find that the request is rejected, and that you must file a support ticket instead. There is a link to do so.
Launch an instance

Steps:
1. Type virtual machines in the search box at the top of the dashboard. Select the Virtual Machines item. Click the Create button and in the popup menu, Azure Virtual Machine.
2. Select a region.
3. Under Image, keep the default.
4. Under Size, click See all sizes and see if you can find a GPU instance type. The SKUs are obscure, but try searching for T4, A100 or A10. You may find an instance for which you do not have sufficient quota—there should be a link next to it to request, and you can try again once (if!) approved.
5. Under SSH public key source select Use existing key stored in Azure. A new select box labeled Stored Keys appears; select the key you imported above.
6. Click the Review + Create button. A validation screen appears, check the details and click the Create button. A status page will appear as the instance is deployed. Once deployed, click the Go to resource button.
7. Install the Nvidia GPU Driver Extension for Linux to the instance. On the left menu, click Extensions + Applications. Click the Add button and search for nvidia to find the extension. Click on it, hit Next, Review + Create, Create. Wait for it to deploy—it takes several minutes.
Find your 👤user and 🏠host

From the Virtual Machines page, select the instance. Your 🏠host is under Public IP Address. Your 👤user is azureuser.
Prepare the instance (after you log in for the first time, see below)
You will need to install CUDA:
sudo apt update
sudo apt install nvidia-cuda-toolkit
Terminate the instance (once you’re finished)

From the Virtual Machines page, check the box next to the instance and click the Delete button at the top.

Connect to your instance with SSH

Open a terminal and enter:

ssh user@host

where 👤user is the username and 🏠host the hostname for your instance. You will be prompted with a message “The authenticity of host… can’t be established… are you sure you want to continue connecting?”; this is normal, respond yes.
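
To avoid retyping the username and address, an alias can be added to the SSH client configuration. The sketch below appends one to a configuration file. Host, HostName, User and IdentityFile are standard OpenSSH client options; add_host_alias is a hypothetical helper, and the identity file assumes the ~/.ssh/id_rsa key described earlier.

```shell
# Append a host alias to an SSH client configuration file.
# add_host_alias is a hypothetical helper; the arguments are the alias,
# the host (name or IP address), the user, and the configuration file.
add_host_alias() {
  alias_name="$1"; host="$2"; user="$3"; config="$4"
  cat >> "$config" <<EOF
Host $alias_name
    HostName $host
    User $user
    IdentityFile ~/.ssh/id_rsa
EOF
}
```

For example, `add_host_alias gpu 203.0.113.10 ubuntu ~/.ssh/config` (a placeholder address and user) would let you connect with `ssh gpu`. Replace the placeholders with your own 🏠host and 👤user.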

You can now work with your instance via the terminal. If additional preparation of the instance is required according to the instructions above, you can do that preparation now.
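
Once any preparation is done, a quick sanity check is to confirm that the CUDA compiler (nvcc) and the driver utility (nvidia-smi) are on the PATH. This is a minimal sketch to run on the instance; have_tool is a hypothetical helper, and a missing tool usually means a preparation step for your provider is still outstanding.

```shell
# Report whether the CUDA compiler and driver utility are on the PATH.
# have_tool is a hypothetical helper built on the POSIX "command -v" builtin.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

for tool in nvcc nvidia-smi; do
  if have_tool "$tool"; then
    echo "$tool: found at $(command -v "$tool")"
  else
    echo "$tool: not found; check the preparation steps for your provider"
  fi
done
```

If both are found, running `nvidia-smi` on its own should list the GPU attached to the instance.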

Connect to your instance with Visual Studio Code

See Develop in the Cloud with Visual Studio Code.

Connect to your instance with Nsight Systems

See Profile in the Cloud with Nsight Systems.