Runpod Instant Clusters provide fully managed compute clusters with high-performance networking for distributed workloads. Deploy multi-node jobs or large-scale AI workloads without managing infrastructure, networking, or cluster configuration.

Why use Instant Clusters?

  • Scale beyond single machines. Train models too large for one GPU, or accelerate training by distributing across multiple nodes.
  • High-speed networking included. Clusters include 1600-3200 Gbps networking between nodes, enabling efficient gradient synchronization and data movement.
  • Zero configuration. Clusters come pre-configured with static IPs, environment variables, and framework support. Start training immediately.
  • On-demand availability. Deploy clusters in minutes and pay only for what you use. Scale up for intensive jobs, then release resources.

When to use Instant Clusters

Instant Clusters offer distributed computing power beyond the capabilities of single-machine setups. Consider using Instant Clusters for:
  • Multi-GPU language model training. Accelerate training of models like Llama or GPT across multiple GPUs.
  • Large-scale computer vision projects. Process massive imagery datasets for autonomous vehicles or medical analysis.
  • Scientific simulations. Run climate, molecular dynamics, or physics simulations that require massive parallel processing.
  • Real-time AI inference. Deploy production AI models that need multiple GPUs to serve responses with low latency.
  • Batch processing pipelines. Create systems for large-scale data processing, including video rendering and genomics.

Get started

Choose the deployment guide that matches your preferred framework and use case. You can also follow this video tutorial to learn how to deploy Kimi K2 using Instant Clusters.

How it works

When you deploy an Instant Cluster, Runpod provisions multiple GPU nodes within the same data center and connects them with high-speed networking. One node is designated as the primary node, and all nodes receive pre-configured environment variables for distributed communication.
The high-speed network interfaces (ens1-ens8) handle inter-node communication for distributed training frameworks. The eth0 interface on the primary node handles external traffic like downloading models or datasets. For more details on environment variables and network configuration, see the configuration reference.
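As an illustration, a training script can read these pre-configured variables at startup to join the cluster-wide process group. The sketch below assumes standard torchrun-style variable names (MASTER_ADDR, MASTER_PORT, RANK, WORLD_SIZE); the exact variables Runpod sets on each node are listed in the configuration reference.

```python
# Minimal sketch of joining a cluster-wide process group from a node's
# pre-configured environment. Variable names follow the standard
# torchrun/PyTorch convention and are assumptions; see the configuration
# reference for the exact variables set on Runpod Instant Clusters.
import os
import torch
import torch.distributed as dist

def init_distributed():
    # Pin NCCL traffic to a high-speed interface if nothing is configured yet
    # (assumption: ens1-ens8 carry inter-node traffic on the cluster).
    os.environ.setdefault("NCCL_SOCKET_IFNAME", "ens1")

    # Standard torchrun-style variables (illustrative names).
    rank = int(os.environ["RANK"])
    world_size = int(os.environ["WORLD_SIZE"])

    dist.init_process_group(
        backend="nccl",        # NCCL for GPU-to-GPU communication
        init_method="env://",  # reads MASTER_ADDR / MASTER_PORT from the env
        rank=rank,
        world_size=world_size,
    )
    torch.cuda.set_device(rank % torch.cuda.device_count())
    return rank, world_size

if __name__ == "__main__":
    rank, world_size = init_distributed()
    print(f"process {rank}/{world_size} joined the cluster process group")
```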

Supported hardware

GPU  | Network speed | Nodes
B200 | 3200 Gbps     | 2-8 nodes (16-64 GPUs)
H200 | 3200 Gbps     | 2-8 nodes (16-64 GPUs)
H100 | 3200 Gbps     | 2-8 nodes (16-64 GPUs)
A100 | 1600 Gbps     | 2-8 nodes (16-64 GPUs)
For clusters larger than 8 nodes (up to 512 GPUs), contact our sales team.
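The table implies 8 GPUs per node (2-8 nodes maps to 16-64 GPUs), so you can work backwards from a target GPU count to a node count. The helper below is an illustrative sketch, not a Runpod API.

```python
import math

GPUS_PER_NODE = 8  # implied by the table above: 2-8 nodes -> 16-64 GPUs

def nodes_for(target_gpus: int) -> int:
    """Smallest self-service cluster size that covers target_gpus."""
    nodes = max(2, math.ceil(target_gpus / GPUS_PER_NODE))
    if nodes > 8:
        raise ValueError("More than 8 nodes: contact sales for up to 512 GPUs.")
    return nodes

print(nodes_for(24))  # -> 3 nodes (24 GPUs)
```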

Pricing

Instant Cluster pricing is based on the GPU type and the number of nodes in your cluster. For current pricing, see the Instant Clusters pricing page.
All accounts have a default spending limit. To deploy a larger cluster, submit a support ticket at help@runpod.io.
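Because pricing scales with GPU type and node count, a cluster's hourly cost is simply nodes x GPUs per node x per-GPU hourly rate. The rate below is a placeholder for illustration, not an actual Runpod price; use the current values from the pricing page.

```python
def cluster_hourly_cost(nodes: int, gpu_hourly_rate: float, gpus_per_node: int = 8) -> float:
    """Estimate the hourly cost of a cluster from the per-GPU hourly rate."""
    return nodes * gpus_per_node * gpu_hourly_rate

# Hypothetical example: a 4-node cluster (32 GPUs) at a placeholder $2.00/GPU-hr.
print(cluster_hourly_cost(nodes=4, gpu_hourly_rate=2.00))  # -> 64.0
```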

Next steps

Runpod offers custom Instant Cluster pricing plans for large-scale and enterprise workloads. If you're interested in learning more, contact our sales team.