GPU-96xCPU-960GB-4xB200 is a GPU server offered by UpCloud with 96 vCPUs, 960 GiB of memory, four NVIDIA B200 GPUs and no included storage (0 GB). Pricing starts at 20.6064 USD per hour.
A massively parallel computing platform featuring four next-generation GPUs and high-capacity system memory for intensive machine learning workloads.
GPU Accelerated, Memory Optimized
UpCloud GPU-96xCPU-960GB-4xB200 is a high-capacity GPU-accelerated virtual server built on the KVM hypervisor. It has an x86_64 architecture with 96 shared vCPUs and 960 GiB of system memory, or 10 GiB of RAM per vCPU. The hardware profile is defined by four NVIDIA B200 (Blackwell) GPUs, providing a total of 768 GB of VRAM (192 GB per GPU). The server does not include local storage, so external storage must be attached, and it comes with one complimentary public IPv4 address. Because the vCPUs are shared rather than dedicated, CPU performance can vary, but the large GPU and memory capacity suits heavy parallel computing tasks. It is designed for workloads such as large language model training, deep learning, data science, and high-performance computing.
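The per-vCPU and per-GPU figures follow directly from the totals in the description. The short sketch below reproduces that arithmetic and adds a rough, weights-only check of whether a model fits in the combined VRAM. The helper `weights_footprint_gb` and the 70-billion-parameter, 2-bytes-per-parameter example are illustrative assumptions, not part of the listing, and the estimate ignores activations, optimizer state and other runtime overhead.

```python
# Totals taken from the server description.
VCPUS = 96
RAM_GIB = 960          # system memory
GPU_COUNT = 4
TOTAL_VRAM_GB = 768    # combined VRAM across all GPUs

ram_per_vcpu_gib = RAM_GIB / VCPUS            # memory available per vCPU
vram_per_gpu_gb = TOTAL_VRAM_GB / GPU_COUNT   # VRAM per B200 GPU


def weights_footprint_gb(params_billion: float, bytes_per_param: float) -> float:
    """Hypothetical helper: weights-only size in decimal GB.

    One billion parameters at one byte each is 1 GB, so the product of the
    two arguments is already expressed in GB.
    """
    return params_billion * bytes_per_param


# Assumed example: a 70B-parameter model stored in 16-bit precision.
example_gb = weights_footprint_gb(70, 2)
fits = example_gb <= TOTAL_VRAM_GB

print(f"RAM per vCPU: {ram_per_vcpu_gib} GiB")
print(f"VRAM per GPU: {vram_per_gpu_gb} GB")
print(f"70B @ 16-bit weights: {example_gb} GB, fits in total VRAM: {fits}")
```

A weights-only estimate like this is a lower bound on memory use; training typically needs several times more.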
Economics
Average Price per Region
Prices per Zone
Lowest Prices
Workload Profiles
Precomputed compound score for Cache Intensive workloads. A weighted geometric mean of benchmark scores relative to their medians: score = ∏ (x_i / m_i)^(w_i / Σw). A score of 1.0 represents a synthetic baseline server with the median performance of each component benchmark; 0.5 means roughly half the performance, and 2.0 means twice the performance of that reference profile. Component weights: 50% Redis RPS (pipeline=1, SET), 20% Redis RPS (pipeline=16, SET), 10% PassMark Memory Mark (composite), 10% Memory bandwidth (read, 16 MB ~ L3), 10% PassMark single-thread CPU. Rationale for component selection: In-memory key-value store workload, combining direct Redis performance metrics with memory speed and latency benchmarks and single-core CPU performance.
Precomputed compound score for CI/CD Build workloads. A weighted geometric mean of benchmark scores relative to their medians: score = ∏ (x_i / m_i)^(w_i / Σw). A score of 1.0 represents a synthetic baseline server with the median performance of each component benchmark; 0.5 means roughly half the performance, and 2.0 means twice the performance of that reference profile. Component weights: 50% Geekbench Clang compilation (multi-core), 10% Geekbench Clang compilation (single-core), 20% stress-ng div16 best-N cores, 5% PassMark integer math, 5% PassMark compression, 5% Brotli compression (multi-core, level 0), 5% PassMark string sorting. Rationale for component selection: Build performance is mainly driven by multi-core compilation throughput; the profile also includes single-core compilation speed, general CPU performance, multi-core compression, and text/scripting processing.
Precomputed compound score for Compute Heavy Applications workloads. A weighted geometric mean of benchmark scores relative to their medians: score = ∏ (x_i / m_i)^(w_i / Σw). A score of 1.0 represents a synthetic baseline server with the median performance of each component benchmark; 0.5 means roughly half the performance, and 2.0 means twice the performance of that reference profile. Component weights: 15% stress-ng div16 best-N cores, 10% stress-ng div16 single core, 20% PassMark CPU Mark (composite), 10% Memory bandwidth (read, 64 MB), 15% PassMark floating point, 15% PassMark AVX/SSE/FMA (SIMD), 10% PassMark integer math, 5% PassMark physics simulation. Rationale for component selection: Number-crunching workload combining raw CPU stress tests, general CPU performance benchmarks, memory bandwidth, and pure math computation speed such as floating-point, integer, and SIMD (AVX/SSE/FMA) operations.
Precomputed compound score for Data Analysis workloads. A weighted geometric mean of benchmark scores relative to their medians: score = ∏ (x_i / m_i)^(w_i / Σw). A score of 1.0 represents a synthetic baseline server with the median performance of each component benchmark; 0.5 means roughly half the performance, and 2.0 means twice the performance of that reference profile. Component weights: 70% PassMark CPU Mark (composite), 10% Gzip compression (single-core, level 5), 10% Memory bandwidth (read, 64 MB), 10% PassMark Memory Mark (composite). Rationale for component selection: Data analysis and ETL workloads are memory-bandwidth-bound and CPU-throughput-driven. The profile combines general CPU performance and memory bandwidth/latency as the primary drivers, supplemented by single-core compression speed as a proxy for serialisation-heavy ETL tasks.
Precomputed compound score for LLM Inference workloads. A weighted geometric mean of benchmark scores relative to their medians: score = ∏ (x_i / m_i)^(w_i / Σw). A score of 1.0 represents a synthetic baseline server with the median performance of each component benchmark; 0.5 means roughly half the performance, and 2.0 means twice the performance of that reference profile. Component weights: 15% LLM text generation (SmolLM-135M, 128 tok), 15% LLM prompt processing (SmolLM-135M, 512 tok), 15% LLM text generation (Llama 7B, 128 tok), 15% LLM prompt processing (Llama 7B, 512 tok), 15% LLM text generation (Llama-3.3 70B, 128 tok), 15% LLM prompt processing (Llama-3.3 70B, 512 tok), 5% Memory bandwidth (read, 256 MB), 2% PassMark AVX/SSE/FMA (SIMD), 2% PassMark floating point. Rationale for component selection: VRAM- and memory-bandwidth-bound LLM inference workload, using direct LLM speed benchmarks at three model sizes, supplemented with raw memory bandwidth and SIMD performance benchmarks.
Precomputed compound score for Web Server workloads. A weighted geometric mean of benchmark scores relative to their medians: score = ∏ (x_i / m_i)^(w_i / Σw). A score of 1.0 represents a synthetic baseline server with the median performance of each component benchmark; 0.5 means roughly half the performance, and 2.0 means twice the performance of that reference profile. Component weights: 30% Static web RPS (1 kB, 8 conn/vCPU), 20% Static web RPS (64 kB, 8 conn/vCPU), 20% Static web throughput (256 kB, 8 conn/vCPU), 20% OpenSSL AES-256-CBC (16 kB blocks), 5% Gzip compression (multi-core, level 5), 5% PassMark string sorting. Rationale for component selection: The primary workload drivers are single-process static HTTP serving speed and throughput, text processing, TLS termination, and asset compression.
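The compound score formula shared by the profiles above, score = ∏ (x_i / m_i)^(w_i / Σw), can be sketched in a few lines. This is a minimal illustration, not the site's actual implementation: the function name `compound_score`, the dictionary keys, and the median values are all hypothetical, and only the weights are taken from the Cache Intensive profile. The product is computed in log space, which is a common way to evaluate a geometric mean.

```python
import math


def compound_score(scores: dict, medians: dict, weights: dict) -> float:
    """Weighted geometric mean of score/median ratios.

    Implements score = prod((x_i / m_i) ** (w_i / sum(w))) by summing
    weighted logarithms and exponentiating the result.
    """
    total_weight = sum(weights.values())
    log_score = 0.0
    for name, weight in weights.items():
        ratio = scores[name] / medians[name]
        log_score += (weight / total_weight) * math.log(ratio)
    return math.exp(log_score)


# Weights from the Cache Intensive profile; key names are illustrative.
weights = {
    "redis_set_pipeline_1": 50,
    "redis_set_pipeline_16": 20,
    "passmark_memory_mark": 10,
    "membw_read_16mb": 10,
    "passmark_single_thread": 10,
}

# Made-up medians, used only to exercise the formula.
medians = {
    "redis_set_pipeline_1": 100_000.0,
    "redis_set_pipeline_16": 800_000.0,
    "passmark_memory_mark": 2_500.0,
    "membw_read_16mb": 40_000.0,
    "passmark_single_thread": 2_000.0,
}

baseline = compound_score(medians, medians, weights)
doubled = compound_score({k: 2 * v for k, v in medians.items()}, medians, weights)

print(f"server at the medians: {baseline:.3f}")
print(f"server at twice the medians: {doubled:.3f}")
```

Because each exponent is divided by Σw, the weights do not need to sum to exactly 100; the LLM Inference weights listed above, for example, add up to 99, and the normalization absorbs the difference. Scaling every weight by the same factor leaves the score unchanged.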
The GPU-96xCPU-960GB-4xB200 server is equipped with 96 logical CPU cores on an unknown number of physical CPU cores, 960 GiB of memory, 0 GB of storage, and 4 NVIDIA Blackwell B200 GPUs. Additional block storage can be attached as needed.
Pricing for GPU-96xCPU-960GB-4xB200 servers starts at 20.6064 USD per hour, but the actual price depends on the selected region, zone and server allocation method (e.g. on-demand versus spot pricing). We currently track prices in 15 regions and zones every 5 minutes, and the maximum tracked price also stands at 20.6064 USD per hour.
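To put the hourly rate in context, the sketch below converts it into daily and monthly figures and a per-GPU hourly cost. The 730-hour month (8,760 hours per year divided by 12) is an assumed convention for illustration, not a billing rule stated by the provider, and the calculation assumes continuous on-demand use at the listed starting price with no storage or network charges.

```python
HOURLY_USD = 20.6064      # listed starting price per hour
GPU_COUNT = 4             # B200 GPUs in this server
HOURS_PER_DAY = 24
HOURS_PER_MONTH = 730     # assumption: 8760 hours per year / 12 months

daily_usd = HOURLY_USD * HOURS_PER_DAY
monthly_usd = HOURLY_USD * HOURS_PER_MONTH
per_gpu_hour_usd = HOURLY_USD / GPU_COUNT

print(f"per day:      {daily_usd:,.2f} USD")
print(f"per month:    {monthly_usd:,.2f} USD")
print(f"per GPU-hour: {per_gpu_hour_usd:.4f} USD")
```

Under these assumptions a server left running for a full month costs roughly 15,000 USD, or a little over 5 USD per GPU-hour.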
The GPU-96xCPU-960GB-4xB200 server is offered by UpCloud, founded in 2012, headquartered in Uusimaa, Finland. For more information, visit the UpCloud homepage.
The GPU-96xCPU-960GB-4xB200 server is available in 15 availability zones of the following 15 regions: Sydney #1 (AU), Frankfurt #1 (DE), Copenhagen #1 (DK), Madrid #1 (ES), Helsinki #1 (FI), Helsinki #2 (FI), Amsterdam #1 (NL), Stavanger #1 (NO), Warsaw #1 (PL), Stockholm #1 (SE), Singapore #1 (SG), London #1 (GB), Chicago #1 (US), New York #1 (US), San Jose #1 (US).