
GPU Programming in the Cloud

How to develop on remote cloud instances, and a roundup of cloud service providers.

Lawrence Murray 22 November 2022

After enough mishmash of Nvidia drivers and Linux kernels, I’m looking for alternatives. I’ve been developing on a laptop with a discrete GPU—how about a laptop with integrated graphics and a GPU in the cloud?

My memories of remote coding are not fond. All-day nano in a terminal. Lethargic remote desktops. But then a discovery: my preferred tools, Visual Studio Code and Nsight Systems, can both make remote connections via SSH. They can run on my laptop, with snappy local convenience, while code executes elsewhere on my GPU in the cloud.

A trial of a few cloud service providers became a roundup of many. My particular use case is CUDA programming on Nvidia GPUs; machine learning, sometimes PyTorch, but mostly my own thing. The commentary is flavored by that use case, but much of it applies more broadly.

Here are my findings, a how-to for you.

Glossary

Instance
A virtual machine running in the cloud. Multiple virtual machines may share a single physical machine.
Instance type
The configuration of an instance, as in its number of cores, memory, and disk space. When launching an instance with a cloud service provider, one chooses from a menu of instance types on offer. Some of these will have GPUs.
Region
The location of the data center in which an instance is running. Typically described at a country or continent level, e.g. U.S. West, U.S. East, Europe, Asia-Pacific, South America.
Machine Image
The disk image that loads onto the instance when launched, providing an operating system and other software. Usually based on a major Linux distribution; additional software can be installed via its package manager.

Choose a cloud service provider

I recommend Paperspace or Lambda. Getting started is a breeze, and they offer a good mix of instance types, including cheaper options for development and testing, and cutting-edge hardware for production runs.

You will need to choose a cloud service provider. Factors to consider include cost, ease of getting started, and the location of data centers.

Tips for choosing a cloud service provider

Prefer the bespokes (Paperspace, Lambda, Genesis Cloud) to the majors (Amazon Web Services, Microsoft Azure, Google Cloud)
All three majors have a default quota of zero for GPU instances, so that a support ticket is necessary for GPU access. They do have extensive offerings for enterprise, but this can mean overwhelming complexity. The bespokes offer a better product for the individual developer: nonzero quotas, straightforward instance types, frictionless onboarding.

For me, quota increase requests were resolved in three minutes for Google Cloud, two days for Microsoft Azure, and four days for Amazon Web Services. I already had instances running within a few minutes on all of Paperspace, Lambda and Genesis Cloud.

Prefer a region near you
Latency degrades the development experience over SSH, adding lag to every keystroke. It grows with distance, and is barely noticeable with an instance near you.

You may want to consider one of the majors over one of the bespokes if they offer better support to your region. All the majors have a global footprint, while the bespokes have more limited coverage, albeit still wide.

Shop around for the best prices
Exact prices are not reported here as they may change frequently, but expect about $0.50 USD per hour for a single GPU of older generation hardware (ideal for development at low cost) up to about $1.50 USD per hour for the latest generation (ideal for production runs). Prices do vary between providers and regions, but the market is competitive, and shopping around will keep it that way.
Consider different providers for different use cases
Staying with the same provider may be convenient, but needs differ. For development, older generation hardware may be sufficient. For production, latest generation hardware may be essential. Keeping compute close to data may be cheaper. Having access to CPU instances—not just GPU instances—may be convenient for supporting jobs like data preparation. Different providers may serve each need best, and they all look the same through a terminal.
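Using the indicative hourly prices above, a rough monthly budget is simple arithmetic. The hours below are assumptions for illustration, not recommendations:

```shell
# 4 hours/day of development for 30 days at $0.50/hour,
# plus 20 hours/month of production runs at $1.50/hour.
awk 'BEGIN { dev = 4*30*0.50; prod = 20*1.50; printf "USD %.2f\n", dev + prod }'
```

Under those assumptions, about $90 USD for the month—and remember that terminated instances stop billing, so shut them down when idle.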

Roundup of cloud service providers

| Provider | Difficulty | GPU instances? | CPU instances? | Regions |
|---|---|---|---|---|
| Paperspace | Easy | Yes | Yes | North America, Europe |
| Lambda | Easy | Yes | No | North America, Europe, Asia-Pacific, Middle East |
| Genesis Cloud | Medium | Yes | Yes | Europe |
| Linode | Hard | On request | Yes | North America, Europe, Asia-Pacific |
| Amazon Web Services | Hard | On request | Yes | Global |
| Google Cloud | Hard | On request | Yes | Global |
| Microsoft Azure | Painful | On request | Yes | Global |

Providers not in this roundup include DigitalOcean (which, anyway, does not offer GPU instances), Alibaba Cloud and IBM Cloud.

Details of cloud service providers

Paperspace offers a complete range of instance types, from affordable older generation hardware to the latest generation. It strives for simplicity—and achieves it—with a straightforward sign-up process that gets new users up and running in minutes. Several machine images based on Ubuntu are available.

Lambda focuses on state-of-the-art, having mostly latest generation hardware, but there are some cheaper instances with older hardware in limited quantities. Lambda specializes in GPU instances and does not provide CPU instances. All instances run Ubuntu and have LambdaStack pre-installed—Lambda’s software stack for deep learning that includes e.g. CUDA, TensorFlow and PyTorch. Getting started is frictionless and takes only minutes.

Genesis Cloud is unique in two ways: providing hardware from Nvidia’s GeForce consumer line, and operating from Iceland and Norway. GeForce are great for many applications, and certainly for development, but look elsewhere for Nvidia’s data center GPUs if double precision performance is important. All instances run Ubuntu. A good selection of instance types are available by default, with access to the others on request. Essential software, such as CUDA, must be self-installed.

Linode offers a limited GPU service. All instance types use the Nvidia Quadro RTX 6000 which, while affordable for production, is a pricier floor for development than other providers. Access requires a support ticket, a description of the use case, and at least $100 USD of spend (which can be pre-purchased as credits). Once granted, instances are easy to set up with a simple web interface, but only base operating system images are provided, so that CUDA and other essential software must be self-installed.

Amazon Web Services (AWS) offers an extended range of instance types spanning multiple generations of hardware. There are a wide variety of machine images with GPU support. Access to GPU instances requires a support ticket, as initial quotas are zero. Frustratingly, you will likely find this out the first time you try and fail to launch an instance, after which you must repeat the launch process.

Google Cloud offers an extended range of instance types covering multiple generations of hardware. There are a wide variety of machine images with GPU support. Access to GPU instances is on request, as initial quotas are zero. Frustratingly, you will likely find this out the first time you try and fail to launch an instance—but, unlike the other majors, quota increase requests seem to be processed quickly, and there is a convenient retry button to avoid repeating the whole process. The web interface is much more helpful than those of the other majors, presenting instance types in human-readable form rather than obscure codes, and making better default suggestions for machine images.

Microsoft Azure offers an extended range of instance types covering multiple generations of hardware. There are a wide variety of machine images with GPU support. Access to GPU instances requires a support ticket, as initial quotas are zero. Frustratingly, you will likely find this out the first time you try and fail to launch an instance—and good luck from there, as you navigate a quagmire of product SKUs, regional availability, and interface nuisances, while getting rejected for quota increases.

Configure your laptop or desktop for SSH

Access to an instance requires an SSH key pair, consisting of a public key and a private key. The public key is shared with your cloud service provider to manage access to your instances, while the private key is only for you. The public key is the lock that only the private key can open.

You may already have an SSH key. Check for the files ~/.ssh/id_rsa.pub and ~/.ssh/id_rsa. If they exist, you can use them. If not, create them by running ssh-keygen in your terminal. The first file contains your public key, the second your private key.

You will need the contents of ~/.ssh/id_rsa.pub below. We will refer to it as your 🔑key.
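The check and generation can also be scripted. A minimal sketch using the default RSA key type (the empty passphrase via -N "" is for brevity; omit it to be prompted for one):

```shell
# Create an SSH key pair if one does not already exist.
if [ ! -f ~/.ssh/id_rsa ]; then
  ssh-keygen -t rsa -b 4096 -N "" -f ~/.ssh/id_rsa
fi
# Print the public key, ready to copy to your cloud service provider.
cat ~/.ssh/id_rsa.pub
```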

Launch an instance on your cloud service provider

The next step is to launch an instance via the web interface of your chosen provider. As part of this you will need to provide your 🔑key. Some providers require additional steps such as upgrading from a free trial to a paid plan, and submitting a support ticket to enable GPU access. Detailed instructions for each provider are below.

By the end of this section you should have an instance running on your chosen cloud service provider, with a username and hostname (or IP address) that can be used to access it. We will refer to these as your 👤user and 🏠host.

Paperspace: work from the console.

  1. Import your 🔑key
    Click your profile picture in the top right, then Your account, then the SSH Keys tab. Click the Add new SSH Key button, enter a name and copy in your 🔑key.
  2. Launch an instance
    Select the CORE Virtual Servers product from the top left menu, then click the Create a machine button. Choose a region close to you, but otherwise keep the defaults. Click the Create button.
  3. Find your 👤user and 🏠host
    Wait for the instance to start then click the Connect button. You will see a command of the form ssh paperspace@🏠host. The 👤user is paperspace and the 🏠host an IP address.
  4. Prepare the instance (after you log in for the first time, see below)
    Unfortunately, while CUDA is installed, it is not accessible out of the box. Enter the following commands each time you log in to make it accessible:
    export PATH=/usr/local/cuda/bin:$PATH
    export CPATH=/usr/local/cuda/include:$CPATH
    export LIBRARY_PATH=/usr/local/cuda/lib64:$LIBRARY_PATH
    export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
    

    These can be added to the end of ~/.profile to avoid entering them on each login. Run:

    nano ~/.profile
    

    Copy them into the bottom of the file, then hit Ctrl+O followed by Enter to save, and Ctrl+X to exit.

  5. Terminate the instance (once you’re finished)
    On the Machines tab, find the instance, click the three dots in the top right of its card and then Deactivate.
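The environment setup in step 4 can also be done non-interactively, without nano. A sketch that appends the same lines to ~/.profile in one step (run once on the instance):

```shell
# Append the CUDA environment variables to ~/.profile so that
# they apply on every login.
cat >> ~/.profile <<'EOF'
export PATH=/usr/local/cuda/bin:$PATH
export CPATH=/usr/local/cuda/include:$CPATH
export LIBRARY_PATH=/usr/local/cuda/lib64:$LIBRARY_PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
EOF
. ~/.profile  # apply to the current session too
```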

Lambda: work from the dashboard.

  1. Launch an instance
    Select Instances on the left then click the Launch Instance button. You may be prompted to add a payment method if you have not done so already. Follow the steps to specify an instance type, region, and provide your 🔑key. Once complete you will be taken back to the Instances page. The new instance will have a status of Booting, which will progress to Running after a few minutes.
  2. Find your 👤user and 🏠host
    Under the SSH Login column you will see a command of the form ssh 👤user@🏠host.
  3. Terminate the instance (once you’re finished)
    On the Instances page, select the checkbox next to the instance and a Terminate button will appear.

Genesis Cloud: work from the dashboard.

  1. Launch an instance
    Click the Create new instance button in the top right. Select an instance type; the default may require an increase to your quota, but other types should be available. Keep the default Ubuntu image but be sure to check the Install NVIDIA® GPU driver box. Under Authentication, choose SSH Key, and copy in your 🔑key. Click Create instance. The Instances page will appear with the new instance listed. Its status will progress from Enqueued to Creating… to Active.
  2. Find your 👤user and 🏠host
    On the same page will be a command of the form ssh 👤user@🏠host.
  3. Prepare the instance (after you log in for the first time, see below)
    The driver will install in the first few minutes of the instance running (as long as the Install NVIDIA GPU driver box was checked). CUDA can be installed with:
    sudo apt update
    sudo apt install nvidia-cuda-toolkit
    
  4. Terminate the instance (once you’re finished)
    From the Instances page, click the three dots icon to the right of the instance and select Destroy from the pop-up menu.
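Once the driver and toolkit are installed, a quick sanity check confirms that the compiler and GPU work together. A minimal sketch (the file name is arbitrary) to compile and run on the instance:

```shell
# Write, compile and run a minimal CUDA program.
cat > hello.cu <<'EOF'
#include <cstdio>

__global__ void kernel() {
  printf("Hello from the GPU\n");
}

int main() {
  kernel<<<1,1>>>();
  cudaDeviceSynchronize();
  return 0;
}
EOF
nvcc hello.cu -o hello
./hello
```

If nvcc is missing or the kernel prints nothing, revisit the driver and CUDA installation steps above.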

Amazon Web Services: work from the AWS Management Console.

  1. Choose a region
    Select a region in the top right.
  2. Import your 🔑key
    Select the Services menu in the top left, then EC2. Select Network & Security > Key Pairs on the left. Select Actions > Import key pair at the top left. Provide a name and copy your 🔑key into the large text area. Click the Import key pair button.
  3. Increase your quota
    You may have zero quota for the GPU instances that interest you. The simplest approach is to try launching the instance you want (below); if it fails, follow the instructions provided to request an increase in quota.
  4. Launch an instance
    Still on the EC2 page, select Instances on the left then the Launch Instances button. Click Browse more AMIs (AMI = Amazon Machine Image), search for gpu, and under the Quickstart AMIs tab choose Ubuntu’s Deep Learning AMI GPU PyTorch. For the instance type, choose g4dn.xlarge (the economical option; AWS documents its full range of instance types). Under Key Pair (login) select the name of the key pair that you created in step 2. Click the Launch instance button. Once the instance starts, click View All Instances. If it fails to start due to lack of quota, follow the instructions given to submit a support ticket, and try again once approved.
  5. Find your 👤user and 🏠host
    Select the instance and note the Public IPv4 address or Public IPv4 DNS; either works as your 🏠host. Your 👤user is ubuntu. To confirm these details, or if you chose a different machine image than that suggested above, click the instance name, then the Connect button, then the SSH client tab for detailed instructions. This will give an SSH command of the form ssh -i key.pem ubuntu@🏠host; you will only need the 👤user and 🏠host parts of it.
  6. Terminate the instance (once you’re finished)
    On the Instances page, check the box next to instance name, then select Instance state > Terminate instance in the top right.

Google Cloud: work from the Google Cloud Console.

  1. Upgrade to a paid account
    A banner should appear at the top of the page with an Activate button; click it. If it does not appear, try refreshing; if it still does not, you may already have a paid account.
  2. Add your SSH key
    From the main menu, select Compute Engine > Settings > Metadata > SSH Keys. Paste your 🔑key into the box and hit Save.
  3. Increase your quota
    You can update quotas by entering quotas in the search box at the top and selecting All Quotas. However, this firehose of regions and instance types assumes that you already know what you are doing. Easier is to try to start an instance, let it fail, and use the prompt to increase the relevant quota.
  4. Launch an instance
    Steps:
    1. From the main menu, select Compute Engine > VM Instances. Click the Create Instance button. First time users may be interrupted by a splash page for the Compute Engine API; click Enable.
    2. Under Machine Configuration, select GPU. It will select the cheapest GPU instance type by default.
    3. Under Boot Disk will be a suggestion to switch from the default machine image to one with CUDA pre-installed. Click the Switch Image button to do that. In the dialog that appears, keep the defaults, just click Select.
    4. Returning to the VM Instances page, the new instance should appear. You may receive an error message about quotas. If so, hover the mouse over the red status icon next to the instance to reveal a popup with a Raise Limit button. Click that and follow the instructions. Wait for a response and, if approved, click the Retry icon in the rightmost column. With any luck, the instance will start the second time.
  5. Find your 👤user and 🏠host
    On the VM Instances page, the External IP column provides your 🏠host. For 👤user you will need the username associated with your SSH key pair, which is likely just your username on the laptop or desktop where you created it. If that does not work, you can always generate a new SSH key pair using the instructions above.
  6. Prepare the instance (after you log in for the first time, see below)
    On first login, a prompt will ask about installing the Nvidia driver. Answer yes. Unfortunately, if you intend to use Visual Studio Code, this prompt also breaks its setup routine. The solution is to log in via SSH with your terminal first, answer yes, wait for the installation to complete, then try connecting Visual Studio Code.
  7. Terminate the instance (once you’re finished)
    On the VM Instances page, check the box next to the instance, click the three dots in the menu above, and choose Delete.

My Azure experience was frustrating. I found the user interface tedious to work with, and was caught in a cycle of rejected quota requests for a single GPU instance with no reason provided. Finally, a helpful support engineer found a GPU for me a continent away. To get that far took two days. Some prior study may help: perhaps start with the documentation on GPU instance types and cross reference with regional availability. You may still find it necessary to contact support as I did, however, and be flexible with regions: the availability of an instance type in a given region does not guarantee that a quota request is approved (demand management, presumably).

Microsoft Azure: work from the Azure Portal.

  1. Upgrade to a pay-as-you-go subscription
    This is necessary to action a quota increase. Click the upgrade prompt in the top menu. If it does not appear, you may already have such an account.
  2. Import your 🔑key
    Type ssh keys in the search box at the top of the dashboard. Click the SSH keys item. Click the Create SSH key button. Under Resource group select an existing group or create a new one. Under Key pair name enter a name to give the key. Under SSH public key source select Upload existing public key and copy your 🔑key into the text area. Click the Review + Create button. A validation screen appears, check the details and click the Create button.
  3. Request a quota increase
    Type quotas in the search box at the top of the dashboard. Select Quotas then Compute. At the top, click on Region and select your preferred region to narrow the search. Search for a GPU instance type. Click Request quota increase > Enter a new limit in the top left. Enter a new limit equal to the number of vCPUs for the instance type and continue (emphasis: don’t just enter 1, enter the number of vCPUs). A response takes several minutes, and you need to keep the web page open while you wait for it (🤷). After waiting, you may find that the request is rejected, and that you must file a support ticket instead. There is a link to do so.
  4. Launch an instance
    Steps:
    1. Type virtual machines in the search box at the top of the dashboard. Select the Virtual Machines item. Click the Create button and in the popup menu, Azure Virtual Machine.
    2. Select a region.
    3. Under Image, keep the default.
    4. Under Size, click See all sizes and see if you can find a GPU instance type. The SKUs are obscure, but try searching for T4, A100 or A10. You may find an instance type for which you do not have sufficient quota—there should be a link next to it to request an increase, and you can try again once (if!) approved.
    5. Under SSH public key source select Use existing key stored in Azure. A new select box labeled Stored Keys appears, select the key you created in step 1 above.
    6. Click the Review + Create button. A validation screen appears, check the details and click the Create button. A status page will appear as the instance is deployed. Once deployed, click the Go to resource button.
    7. Install the Nvidia GPU Driver Extension for Linux to the instance. On the left menu, click Extensions + Applications. Click the Add button and search for nvidia to find the extension. Click on it, hit Next, Review + Create, Create. Wait for it to deploy—it takes several minutes.
  5. Find your 👤user and 🏠host
    From the Virtual Machines page, select the instance. Your 🏠host is under Public IP Address. Your 👤user is azureuser.
  6. Prepare the instance (after you log in for the first time, see below)
    You will need to install CUDA:
    sudo apt update
    sudo apt install nvidia-cuda-toolkit
    
  7. Terminate the instance (once you’re finished)
    From the Virtual Machines page, check the box next to the instance and click the Delete button at the top.

Connect to your instance

Open a terminal and enter:

    ssh 👤user@🏠host

where 👤user is the username and 🏠host the hostname for your instance. You will be prompted with a message “The authenticity of host… can’t be established… are you sure you want to continue connecting?”; this is normal, respond yes.

You can now work with your instance via the terminal. If additional preparation of the instance is required according to the instructions above, you can do that preparation now.
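To avoid retyping the address, you can add a host alias to ~/.ssh/config. The alias name, address, user and key file below are placeholders; substitute your own 👤user and 🏠host:

```shell
# Add a host alias for the instance (placeholder values shown).
cat >> ~/.ssh/config <<'EOF'
Host cloud-gpu
    HostName 203.0.113.10
    User ubuntu
    IdentityFile ~/.ssh/id_rsa
EOF
```

Thereafter ssh cloud-gpu connects directly, and tools that read ~/.ssh/config, such as Visual Studio Code’s Remote - SSH extension, will offer cloud-gpu as a host.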

Connect Visual Studio Code to your instance

  1. Install the SSH extension
    Click the Extensions icon on the left. Type ssh into the search box. Find Remote - SSH from Microsoft (likely the first) and click the Install button next to it.
  2. Connect to your instance
    After installing the SSH extension, a new icon appears in the bottom left; click on it. A menu will appear at the top of the window. Select Connect to Host… then enter your 👤user and 🏠host into the textbox in the form 👤user@🏠host and hit Enter. A new window will open for the remote connection. If this is the first login for the instance, you may be prompted with “Host has fingerprint… Are you sure you want to continue?”; this is normal, click Continue.
  3. Setup your workspace
    On the left, click Open Folder and select a working directory to enable the file browser. From the menu, select Terminal > New Terminal to open a terminal.

The screen recording below demonstrates the steps. For more information see the Visual Studio Code documentation.

Screen recording demonstrating how to connect to a remote instance via SSH in Visual Studio Code.

Connect Nsight Systems to your instance

If necessary, start a fresh project with File > New Project. Click Select target for profiling and choose Configure targets. Click the Create a new connection button. Enter your 🏠host and 👤user then click OK, Connect, Close. A message will show installation status and eventually Target is ready.

The screen recording below demonstrates the steps.

Screen recording demonstrating how to connect to a remote instance via SSH in Nsight Systems.
