a4x-maxgpu-4g-metal is an Accelerator Optimized server offered by Google Cloud Platform with 4 NVIDIA GB300 GPUs, 144 vCPUs, 960 GiB of memory and 0 GB of storage.
An accelerator-optimized bare-metal platform featuring four dedicated GPUs and high-density memory for parallel computing workloads.
GPU Accelerated, Compute Optimized
Google Cloud Platform a4x-maxgpu-4g-metal is an accelerator-optimized bare-metal server designed for high-density parallel computing. It features 144 dedicated vCPUs on an Arm64 (aarch64) architecture, 960 GiB of system memory, and four integrated NVIDIA GB300 GPUs. The system provides 6.67 GiB of RAM per vCPU and does not include local storage or complimentary public IPv4 addresses. This hardware profile is optimized for workloads that require heavy GPU acceleration and substantial memory capacity, such as machine learning training, deep learning inference, and high-performance computing simulations. The lack of local storage requires external storage solutions, representing a key architectural tradeoff for deployment planning.
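The per-core memory figure quoted above follows directly from the headline specification. A minimal sketch of that arithmetic, assuming the 960 figure is in GiB as in the summary line; the constant names are illustrative only, and the per-GPU values describe host resources, not GPU memory:

```python
# Headline specification of a4x-maxgpu-4g-metal, taken from the description above.
VCPUS = 144
MEMORY_GIB = 960  # system (host) memory, assumed to be GiB
GPUS = 4

# Derived ratios that are useful when sizing workloads.
memory_per_vcpu = MEMORY_GIB / VCPUS      # host memory available per vCPU
vcpus_per_gpu = VCPUS / GPUS              # vCPUs available to feed each GPU
host_memory_per_gpu = MEMORY_GIB / GPUS   # host memory per GPU (not GPU memory)

print(f"{memory_per_vcpu:.2f} GiB per vCPU")
print(f"{vcpus_per_gpu:.0f} vCPUs per GPU")
print(f"{host_memory_per_gpu:.0f} GiB of host memory per GPU")
```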
Workload Profiles
Precomputed compound score for Cache Intensive workloads. The score is a weighted geometric mean of benchmark scores relative to their medians: score = ∏ (x_i / m_i)^(w_i / Σw), where x_i is the server's result on component benchmark i, m_i is the median result for that benchmark, and w_i is its weight. A score of 1.0 represents a synthetic baseline server with the median performance on each component benchmark; 0.5 means roughly half the performance, and 2.0 roughly twice the performance, of that reference profile. Component weights: 50% Redis RPS (pipeline=1, SET), 20% Redis RPS (pipeline=16, SET), 10% PassMark Memory Mark (composite), 10% Memory bandwidth (read, 16 MB ~ L3), 10% PassMark single-thread CPU. Rationale for component selection: An in-memory key-value store workload, mixing direct Redis performance metrics with memory speed and latency benchmarks and single-core CPU performance.
Precomputed compound score for CI/CD Build workloads, computed with the same weighted geometric mean formula as the Cache Intensive profile above (1.0 is the median baseline). Component weights: 50% Geekbench Clang compilation (multi-core), 10% Geekbench Clang compilation (single-core), 20% stress-ng div16 best-N cores, 5% PassMark integer math, 5% PassMark compression, 5% Brotli compression (multi-core, level 0), 5% PassMark string sorting. Rationale for component selection: Build performance is mainly driven by multi-core compilation throughput; the profile also includes single-core compilation speed, general CPU performance, multi-core compression, and text/scripting processing.
Precomputed compound score for Compute Heavy Applications workloads, computed with the same weighted geometric mean formula as the Cache Intensive profile above (1.0 is the median baseline). Component weights: 15% stress-ng div16 best-N cores, 10% stress-ng div16 single core, 20% PassMark CPU Mark (composite), 10% Memory bandwidth (read, 64 MB), 15% PassMark floating point, 15% PassMark AVX/SSE/FMA (SIMD), 10% PassMark integer math, 5% PassMark physics simulation. Rationale for component selection: A number-crunching workload that combines raw CPU stress tests, general CPU performance benchmarks, memory bandwidth, and pure math computation speed (floating point, integer, and SIMD (AVX/SSE/FMA) operations).
Precomputed compound score for Data Analysis workloads, computed with the same weighted geometric mean formula as the Cache Intensive profile above (1.0 is the median baseline). Component weights: 70% PassMark CPU Mark (composite), 10% Gzip compression (single-core, level 5), 10% Memory bandwidth (read, 64 MB), 10% PassMark Memory Mark (composite). Rationale for component selection: Data analysis and ETL workloads are memory-bandwidth-bound and CPU-throughput-driven. The profile combines general CPU performance and memory bandwidth/latency as the primary drivers, supplemented by single-core compression speed as a proxy for serialisation-heavy ETL tasks.
Precomputed compound score for LLM Inference workloads, computed with the same weighted geometric mean formula as the Cache Intensive profile above (1.0 is the median baseline). Component weights: 15% LLM text generation (SmolLM-135M, 128 tok), 15% LLM prompt processing (SmolLM-135M, 512 tok), 15% LLM text generation (Llama 7B, 128 tok), 15% LLM prompt processing (Llama 7B, 512 tok), 15% LLM text generation (Llama-3.3 70B, 128 tok), 15% LLM prompt processing (Llama-3.3 70B, 512 tok), 5% Memory bandwidth (read, 256 MB), 2% PassMark AVX/SSE/FMA (SIMD), 2% PassMark floating point. Rationale for component selection: A VRAM- and memory-bandwidth-bound LLM inference workload, using direct LLM speed benchmarks at three model sizes, supplemented by raw memory bandwidth and SIMD performance benchmarks.
Precomputed compound score for Web Server workloads, computed with the same weighted geometric mean formula as the Cache Intensive profile above (1.0 is the median baseline). Component weights: 30% Static web RPS (1 kB, 8 conn/vCPU), 20% Static web RPS (64 kB, 8 conn/vCPU), 20% Static web throughput (256 kB, 8 conn/vCPU), 20% OpenSSL AES-256-CBC (16 kB blocks), 5% Gzip compression (multi-core, level 5), 5% PassMark string sorting. Rationale for component selection: The primary workload drivers are single-process static HTTP serving speed and throughput, text processing, TLS termination, and asset compression.
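The compound score formula used by the workload profiles above, score = ∏ (x_i / m_i)^(w_i / Σw), can be illustrated with a short sketch. This is not the site's actual implementation; the function name `compound_score` and the sample numbers are hypothetical, and the only thing taken from the text is the formula itself. Because each weight is divided by Σw, the weights do not need to add up to exactly 100.

```python
import math


def compound_score(components):
    """Weighted geometric mean of median-relative benchmark scores.

    components: iterable of (score, median, weight) tuples, where score is
    the server's result x_i, median is m_i, and weight is w_i.
    """
    components = list(components)
    total_weight = sum(w for _, _, w in components)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    # Work in log space: exp(sum(w_i / total * ln(x_i / m_i)))
    # equals the product of (x_i / m_i) ** (w_i / total).
    log_sum = sum(
        (w / total_weight) * math.log(x / m) for x, m, w in components
    )
    return math.exp(log_sum)


# Hypothetical benchmark results (score, median, weight); not real measurements.
example_profile = [
    (180_000.0, 120_000.0, 50),  # e.g. a requests-per-second style metric
    (900_000.0, 750_000.0, 20),
    (3_200.0, 2_900.0, 10),
    (45.0, 40.0, 10),
    (3_500.0, 3_100.0, 10),
]
print(f"compound score: {compound_score(example_profile):.3f}")
```

A server that matches the median on every component scores 1.0, and one that doubles every median scores 2.0, which matches the interpretation given in the profile descriptions.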
The a4x-maxgpu-4g-metal server is equipped with 144 logical CPU cores on an unknown number of physical CPU cores, 960 GiB of memory, 0 GB of storage, and 4 NVIDIA GB300 GPUs. Additional block storage can be attached as needed.
The a4x-maxgpu-4g-metal server is offered by Google Cloud Platform, founded in 2008, headquartered in California, United States. For more information, visit the Google Cloud Platform homepage.
In addition to the a4x-maxgpu-4g-metal server, the a4x server family includes one other size: a4x-highgpu-4g (gcp).