# GPU-Accelerated Google Cloud Platform

## Accelerate Innovation

NVIDIA pioneered accelerated computing to push the boundaries of innovation for developers, designers, and creators around the globe and transform the world’s largest industries. NVIDIA accelerated computing combined with the flexibility, global reach, and scale of Google Cloud speeds up time to solution and drives down infrastructure TCO for computationally intensive workloads like generative AI, data analytics, [high-performance computing (HPC)](https://www.nvidia.com/en-us/high-performance-computing.md), graphics, and gaming wherever they need to run.

## Power New Capabilities With Google Cloud and NVIDIA


## Explore Customer Stories

### PUMA

PUMA is transforming fan engagement by enabling fans to design the official Manchester City 26/27 third kit using generative AI, powered by Google Cloud and NVIDIA. The collaboration turns creative prompts into unique, AI-generated jersey designs.

[Learn More](https://youtu.be/Vqc7fY-yyVo?si=prRae7F_pYVVShGt)

### Toyota

Toyota is transforming manufacturing with an AI platform accelerated by NVIDIA and Google Cloud, empowering factory workers to automate tasks like quality inspection, predictive maintenance, and process optimization—saving over 10,000 work-hours annually across all of its plants.

[Learn More](https://cloud.google.com/blog/topics/hybrid-cloud/toyota-ai-platform-manufacturing-efficiency)

### Shopify

Shopify empowers businesses to supercharge their online stores with real-time, AI-powered search and recommendations—using NVIDIA and Google Cloud—to deliver instant updates to product listings and images, boost merchant sales, and create a seamless shopping experience.

[Learn More](https://cloud.google.com/blog/products/data-analytics/how-shopify-improved-consumer-search-intent-with-real-time-ml)

### Baseten

Baseten utilized NVIDIA Blackwell on Google Cloud, along with the NVIDIA Dynamo inference framework and NVIDIA TensorRT™-LLM, to help its customers scale quickly and meet the growing demand for AI. Baseten is now able to serve four of the most popular open source models—DeepSeek-V3, DeepSeek-R1, gpt-oss, and Llama 4 Maverick—directly on its model APIs, delivering over 225% better cost performance for high-throughput inference and 25% better cost performance for latency-sensitive inference.

[Learn More](https://www.nvidia.com/en-us/case-studies/baseten-cloud-scaling-ai-inference.md)

### SandboxAQ

SandboxAQ, a member of the NVIDIA Inception program, is leveraging the NVIDIA DGX™ Cloud platform on Google Cloud to build its frontier large quantitative model (LQM) platform for scientific discovery. This collaboration enabled SandboxAQ to achieve up to 4x faster discovery across drug, chemical, and materials pipelines, as well as 80x acceleration in quantum chemistry calculations using NVIDIA CUDA-X™ libraries.

[Learn More](https://www.sandboxaq.com/press/sandboxaq-unlocks-new-capabilities-for-scientific-discovery-and-enterprise-innovation-with-nvidia-dgx-cloud)

### Palo Alto Networks

Palo Alto Networks is transforming cybersecurity by using NVIDIA Dynamo-Triton and Google Cloud for real-time AI-powered threat detection and data protection—reducing latency and costs while helping enterprises defend against advanced cyberattacks more efficiently.

[Learn More](https://www.youtube.com/watch?v=9XFwP0FHGI0)


## NVIDIA Accelerated Infrastructure on Google Cloud

Accelerate next-generation AI with the latest NVIDIA GPUs on Google Cloud, seamlessly integrated with Google Cloud AI Hypercomputer architecture—enabling demanding workloads at scale like LLM training, real-time inference, and advanced agentic AI applications for autonomous decision-making and physical AI for robotics, autonomous vehicles, and digital twins.

[See the Full List of NVIDIA Accelerated VMs Here](https://cloud.google.com/compute/docs/gpus)

### Google A4X and A4X Max VMs With NVIDIA GB200 and GB300 NVL72

Google Cloud’s A4X and A4X Max VMs deliver over one exaFLOP of compute per rack and support seamless scaling to tens of thousands of NVIDIA Blackwell and Blackwell Ultra GPUs, enabled by Google’s Jupiter network fabric and advanced networking with NVIDIA® ConnectX®-7 NICs. Google’s third-generation liquid cooling infrastructure delivers sustained, efficient performance, even for the largest AI workloads.

[Learn More](https://cloud.google.com/blog/products/compute/now-shipping-a4x-max-vertex-ai-training-and-more)

### Google A4 VM With NVIDIA HGX B200

Google Cloud’s A4 VMs, accelerated by NVIDIA HGX™ B200, are now generally available. The A4 VM features eight NVIDIA Blackwell GPUs interconnected by fifth-generation NVIDIA NVLink™. Compared to previous generation A3 VMs, the A4 VM offers a significant performance boost, enabling faster model training, real-time inference, and accelerated data analytics.

[Learn More](https://cloud.google.com/blog/products/compute/introducing-a4-vms-powered-by-nvidia-b200-gpu-aka-blackwell)

### Google G4 VMs With NVIDIA RTX PRO 6000 Blackwell Server Edition

Google G4 VMs with NVIDIA RTX PRO™ 6000 Blackwell deliver breakthrough performance for both agentic and physical AI applications, accelerating everything from cost-efficient inference and generative AI to robotics simulation, hyper-realistic 3D rendering, and next-generation game rendering. Unlock next-generation AI and graphics capabilities.

[Learn More](https://cloud.google.com/blog/products/compute/introducing-g4-vm-with-nvidia-rtx-pro-6000?linkId=100000369285916)

## Unlock the Full Potential of NVIDIA Accelerated Computing on Google Cloud


### NVIDIA on Google Cloud Marketplace

NVIDIA offers a comprehensive, performance-optimized software stack directly on Google Cloud Marketplace to unlock the full potential of cutting-edge NVIDIA accelerated infrastructure and reduce the complexity of building accelerated solutions on Google Cloud. This lowers TCO through improved performance, simplified deployment, and streamlined development.

### NVIDIA AI Enterprise

NVIDIA AI Enterprise is a cloud-native platform that streamlines the development and deployment of production-grade AI solutions, including generative AI, computer vision, speech AI, and more. Easy-to-use microservices provide optimized model performance with enterprise-grade security, support, and stability to ensure a smooth transition from prototype to production for enterprises that run their businesses on AI.

[Learn More](https://console.cloud.google.com/marketplace/product/nvidia/nvidia-ai-enterprise-vmi?pli=1&_ga=2.192713898.1058218918.1718053619-1028737564.1709155519&_gac=1.41706006.1718054019.EAIaIQobChMIrvqY69jYhQMVmQatBh0F6gvTEAAYASAAEgK3bfD_BwE&rapt=AEjHL4OwClYPkEK7cYJGad-3D1XKNZNl3sXnRuxmpAC5cRW8BaP5K7BoiWsGiyRl1s1qHqxtttcfQfa1ktMmXQi6Yu8Rg_BMsOplKxwHBeNXu76lrf6YbSg&project=nvidia-ngc-public)

### NVIDIA NIM

NVIDIA NIM™, part of NVIDIA AI Enterprise, is a set of easy-to-use inference microservices for accelerating the deployment of AI applications that require natural language understanding and generation. By offering developers access to industry-standard APIs, NIM enables the creation of powerful copilots, chatbots, and AI assistants, while making it easy for IT and DevOps teams to self-host AI models in their own managed environments. NVIDIA NIM can be deployed on GCE, GKE, or Google Cloud Run.

[Learn More](https://www.nvidia.com/en-us/ai/#referrer=ai-subdomain)
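Once a NIM microservice is self-hosted, it exposes an OpenAI-compatible Chat Completions API. As an illustrative sketch (the endpoint URL and model name below are assumptions, not taken from this page), a client request can be assembled like this:

```python
import json

# Assumed local NIM deployment; a deployed NIM microservice serves an
# OpenAI-compatible Chat Completions endpoint on its serving port.
NIM_URL = "http://localhost:8000/v1/chat/completions"

def build_chat_request(model: str, prompt: str, max_tokens: int = 128) -> dict:
    """Build an OpenAI-style chat payload for a NIM endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_chat_request(
    "meta/llama-3.1-8b-instruct",  # example model name; substitute your deployed NIM
    "Summarize NVIDIA NIM in one sentence.",
)
print(json.dumps(payload, indent=2))
# POST `payload` as JSON to NIM_URL (e.g. with urllib.request or requests)
# and read response["choices"][0]["message"]["content"].
```

Because the API follows the OpenAI schema, existing OpenAI-compatible client code can usually be pointed at a NIM endpoint by changing only the base URL.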

### NVIDIA Omniverse

NVIDIA Omniverse™ is a platform of APIs, SDKs, and services that enable developers to integrate OpenUSD and NVIDIA RTX™ rendering technologies into physical AI applications. Use VMs on Google Cloud to accelerate your application development.

[Learn More](https://cloud.google.com/blog/products/compute/introducing-g4-vm-with-nvidia-rtx-pro-6000)

### Integrations at Every Layer of the Google Cloud Stack

NVIDIA and Google Cloud collaborate closely on integrations that bring the power of the full-stack NVIDIA AI platform to a broad range of native Google Cloud services, giving developers the flexibility to choose the level of abstraction they need. With these integrations, Google Cloud customers can combine the power of both enterprise-grade NVIDIA AI software and the computational power of NVIDIA GPUs to maximize application performance within the Google Cloud services they’re already familiar with.

### Google Kubernetes Engine

Combine the power of the [NVIDIA AI](https://cloud.google.com/kubernetes-engine/docs/concepts/gpus) platform with the flexibility and scalability of GKE to efficiently manage and scale generative AI training and inference and other compute-intensive workloads. GKE's on-demand provisioning, automated scaling, [NVIDIA Multi-Instance GPU (MIG) support](https://cloud.google.com/kubernetes-engine/docs/how-to/gpus-multi), and [GPU time-sharing](https://cloud.google.com/kubernetes-engine/docs/how-to/timesharing-gpus) capabilities ensure optimal resource utilization. This minimizes operational costs while delivering the necessary computational power for demanding AI workloads.

[Learn More](https://cloud.google.com/blog/products/compute/gke-and-nvidia-nemo-framework-to-train-generative-ai-models)
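As a minimal sketch of how a GKE workload requests NVIDIA GPUs (the image and accelerator names below are placeholders), a Pod selects a GPU node pool via the `cloud.google.com/gke-accelerator` node label and declares an `nvidia.com/gpu` resource limit:

```python
def gpu_pod_manifest(name: str, image: str, accelerator: str, gpus: int = 1) -> dict:
    """Minimal GKE Pod manifest requesting NVIDIA GPUs.

    `accelerator` matches the node label GKE sets on GPU node pools,
    e.g. "nvidia-l4" or "nvidia-tesla-t4".
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "nodeSelector": {"cloud.google.com/gke-accelerator": accelerator},
            "containers": [{
                "name": name,
                "image": image,
                "resources": {"limits": {"nvidia.com/gpu": gpus}},
            }],
        },
    }

# Example: a single-GPU Pod on an L4 node pool (image is a placeholder).
manifest = gpu_pod_manifest("gpu-demo", "nvcr.io/nvidia/cuda:12.4.1-base-ubuntu22.04", "nvidia-l4")
```

Features like MIG partitioning and GPU time-sharing are configured on the node pool; from the Pod's perspective the request pattern stays the same.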

### Vertex AI

Combine the power of NVIDIA accelerated computing with Google Cloud’s Vertex AI, a fully managed, unified MLOps platform for building, deploying, and scaling AI models in production. Leverage the latest NVIDIA GPUs and NVIDIA AI software, like Triton™ Inference Server, within Vertex AI Training, Prediction, Pipelines, and Notebooks to accelerate generative AI development and deployment without the complexities of infrastructure management.

[Learn More](https://cloud.google.com/vertex-ai/docs/predictions/using-nvidia-triton)

### Google Dataproc

Leverage the [NVIDIA RAPIDS™ Accelerator for Spark](https://www.nvidia.com/en-us/deep-learning-ai/solutions/data-science/apache-spark-3.md) to accelerate Apache Spark and Dask workloads on Dataproc, Google Cloud’s fully managed data processing service—without code changes. This enables faster data processing, extract-transform-load (ETL) operations, and machine learning pipelines while substantially lowering infrastructure costs. With the RAPIDS Accelerator for Spark, users can also speed up batch workloads within Dataproc Serverless without provisioning clusters.

[Learn More](https://docs.nvidia.com/spark-rapids/user-guide/latest/getting-started/google-cloud-dataproc.html)
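Because the RAPIDS Accelerator plugs in at the Spark level, enabling it is a configuration change rather than a code change. A rough sketch of the relevant Spark properties (the values here are illustrative, not tuning advice):

```python
# Spark properties that enable the RAPIDS Accelerator; the SQLPlugin
# class is the documented plugin entry point. On Dataproc this is
# normally applied via cluster creation flags or init actions.
rapids_conf = {
    "spark.plugins": "com.nvidia.spark.SQLPlugin",
    "spark.rapids.sql.enabled": "true",
    "spark.executor.resource.gpu.amount": "1",
    "spark.task.resource.gpu.amount": "0.25",  # four concurrent tasks share one GPU
}

def as_dataproc_properties(conf: dict) -> str:
    """Render the config as a --properties string for `gcloud dataproc`."""
    return ",".join(f"spark:{k}={v}" for k, v in sorted(conf.items()))

print(as_dataproc_properties(rapids_conf))
```

Existing DataFrame and Spark SQL jobs then run with supported operators offloaded to the GPU, falling back to the CPU where an operator is not supported.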

### Google Dataflow

Accelerate machine learning inference with NVIDIA AI on [Google Cloud Dataflow](https://cloud.google.com/dataflow/docs/gpu), a managed service for executing a wide variety of data processing patterns, including both streaming and batch analytics. Users can optimize the inference performance of AI models using NVIDIA TensorRT’s integration with Apache Beam SDK and speed up complex inference scenarios within a data processing pipeline using [NVIDIA GPUs](https://docs.cloud.google.com/dataflow/docs/gpu/gpu-support) supported in Dataflow.

[Learn More](https://developer.nvidia.com/blog/simplifying-and-accelerating-machine-learning-predictions-in-apache-beam-with-nvidia-tensorrt/)

### Cloud Run

Accelerate the path to deploying generative AI with NVIDIA NIM on Google Cloud Run, a fully managed, serverless compute platform for deploying containers on Google Cloud’s infrastructure. With support for NVIDIA GPUs in Cloud Run, users can leverage NIM to optimize performance and accelerate deployment of gen AI models into production in a serverless environment that abstracts away infrastructure management.

[Learn More](https://developer.nvidia.com/blog/google-cloud-run-adds-support-for-nvidia-l4-gpus-nvidia-nim-and-serverless-ai-inference-deployments-at-scale/?linkId=100000282194483)

### Dynamic Workload Scheduler

Get easy access to [NVIDIA GPU capacity](https://cloud.google.com/kubernetes-engine/docs/how-to/provisioningrequest) on Google Cloud for short-duration workloads like AI training, fine-tuning, and experimentation using Dynamic Workload Scheduler. With [flexible scheduling](https://cloud.google.com/blog/products/compute/introducing-dynamic-workload-scheduler#:~:text=Two%20modes%3A%20Flex%20Start%20and%20Calendar) and atomic provisioning, users can get access to the compute resources they need within services like GKE, Vertex AI, and Batch while enhancing resource utilization and optimizing costs associated with running AI workloads.

[Learn More](https://www.youtube.com/watch?v=gOByRkumrbk)

### Google Distributed Cloud

With the NVIDIA Blackwell platform coming to Google Distributed Cloud, enterprises can now securely deploy advanced agentic AI—including Google Gemini models—directly in their own data centers on premises. This integration empowers organizations to harness breakthrough AI performance and scalability for sensitive, regulated workloads while ensuring data privacy, sovereignty, and compliance. By combining the strengths of Google Distributed Cloud and NVIDIA Blackwell, businesses can accelerate innovation with next-generation AI, while maintaining full control over their data and operations.

[Learn More](https://blogs.nvidia.com/blog/google-cloud-next-agentic-ai-reasoning/)

### Google AI Hypercomputer With NVIDIA Dynamo Recipe

Google Cloud developed a recipe for disaggregated inferencing with NVIDIA Dynamo, a high-performance, low-latency platform for frontier AI models. This recipe makes it easy to deploy NVIDIA Dynamo on Google Cloud’s AI Hypercomputer, including Google Kubernetes Engine (GKE), vLLM inference engine, and A3 Ultra GPU-accelerated instances powered by NVIDIA H200 GPUs.

[Learn More](https://cloud.google.com/blog/products/compute/ai-inference-recipe-using-nvidia-dynamo-with-ai-hypercomputer)

### Google Vertex AI Model Garden With NVIDIA Nemotron

The NVIDIA Nemotron family of open models, including [Nemotron 3 Nano](https://nvidianews.nvidia.com/news/nvidia-debuts-nemotron-3-family-of-open-models), will be available on Google Vertex AI Model Garden as NVIDIA NIM microservices. This integration will provide developers and enterprises with access to NVIDIA's leading open-weight models. With a Vertex AI managed deployment, organizations can rapidly develop and deploy custom AI agents powered by Nemotron models while maintaining control over performance, cost, and compliance.

[Learn More](https://cloud.google.com/blog/products/compute/now-shipping-a4x-max-vertex-ai-training-and-more)

### Google Cloud Storage for NVIDIA Run:ai Model Streamer

NVIDIA Run:ai Model Streamer comes with native Google Cloud Storage support, supercharging vLLM inference workloads on Google Kubernetes Engine (GKE). This collaboration accelerates loading LLMs from “cold” storage to memory by nearly 5x for faster AI inference—including an example using a 141 GB Meta Llama 3.3-70B model.

[Learn More](https://cloud.google.com/blog/products/containers-kubernetes/nvidia-runai-model-streamer-supports-cloud-storage)

### Google Cloud Cluster Director

Cluster Director is a Google Cloud service designed to optimize the deployment and management of large-scale AI and HPC clusters, accelerated by NVIDIA. This includes full support for large-scale AI systems, including Google Cloud’s A4X and A4X Max VMs powered by NVIDIA GB200 and GB300 NVL72 systems. Google Cloud announced the Preview of Cluster Director support for Slurm on GKE, utilizing Slinky from SchedMD, a company recently acquired by NVIDIA.

[Learn More](https://cloud.google.com/blog/products/compute/cluster-director-is-now-generally-available)

### Google Cloud and NVIDIA Developer Community

Google Cloud and NVIDIA have partnered to create this community for developers, data scientists, AI/ML engineers, and technical practitioners focused on leveraging NVIDIA and Google Cloud technologies for their development.

[Join the Community](https://developers.google.com/community/nvidia)

![Google Cloud and NVIDIA Developer Community](https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/gpu-cloud-computing/google-cloud-platform/google-cloud-nvidia-developer-video-thumbnail.jpg)


## Additional Resources

### Gemma

NVIDIA is collaborating with Google to launch Gemma, a newly optimized family of open models built from the same research and technology used to create the Gemini models. An optimized release with TensorRT-LLM enables users to develop with LLMs using only a desktop with an NVIDIA RTX™ GPU.

[Try Now](https://build.nvidia.com/models?filters=publisher%3Agoogle)

### RAPIDS cuDF on Google Colab

RAPIDS cuDF is now integrated into Google Colab. Developers can instantly accelerate pandas code up to 50X on Google Colab GPU instances and continue using pandas as data grows—without sacrificing performance.

[Read Blog](https://developer.nvidia.com/blog/rapids-cudf-instantly-accelerates-pandas-up-to-50x-on-google-colab)
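The "without sacrificing performance" claim rests on cudf.pandas being a drop-in layer: the pandas code below runs unchanged whether or not the accelerator is loaded. In a Colab GPU runtime you would first run `%load_ext cudf.pandas` (the data here is a toy example):

```python
import pandas as pd

# With the cudf.pandas extension loaded, this exact code executes on the
# GPU via cuDF and falls back to CPU pandas for unsupported operations.
df = pd.DataFrame({
    "store": ["a", "a", "b", "b", "b"],
    "sales": [10, 20, 5, 15, 25],
})
totals = df.groupby("store")["sales"].sum()
print(totals.to_dict())  # {'a': 30, 'b': 45}
```

No import changes or API rewrites are needed, which is what lets the same notebook scale from small samples to larger datasets on a Colab GPU instance.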

### Accelerate Your Startup

The NVIDIA Inception program helps startups accelerate innovation with developer resources and training, access to cloud credits, exclusive pricing on NVIDIA software and hardware, and opportunities for exposure to the VC community.

[Learn More and Apply](https://www.nvidia.com/en-us/startups/?ncid=ref-inc-589506-vt33)

## Latest News

### Co-Engineered AI Infrastructure for the Agentic AI Era

Google Cloud and NVIDIA provide the foundation and full-stack experience that technology leaders need to scale their agentic AI workloads.

Together, they are innovating every layer of the AI stack—driven by continued momentum for G4 VMs, support for the NVIDIA Vera Rubin NVL72 platform, NVIDIA Dynamo integration with Inference Gateway, enhanced NVIDIA support across Vertex AI Training and Model Garden, and a new AI startup accelerator program for the public sector.

[Read the Announcement](https://cloud.google.com/blog/products/compute/google-cloud-ai-infrastructure-at-nvidia-gtc-2026)

### NVIDIA Kicks Off the Next Generation of AI With Rubin—Six New Chips, One Incredible AI Supercomputer

Building on its decade-long partnership with NVIDIA, Google plans to bring the capabilities of the NVIDIA Rubin platform to its customers, offering them the scale and performance needed to advance the boundaries of AI. Google Cloud will be among the first cloud providers to deploy NVIDIA Vera Rubin-based instances in 2026.

[Read the Announcement](https://nvidianews.nvidia.com/news/rubin-platform-ai-supercomputer)

### Google Cloud Scales MoE Inference on A4X (GB200 NVL72) With NVIDIA Dynamo

Google Cloud is scaling frontier mixture-of-experts models faster with A4X machines, powered by NVIDIA GB200 NVL72 systems, and NVIDIA Dynamo. This validated reference architecture achieved over 6,000 total tokens/sec/GPU with 10ms inter-token latency by delivering distributed runtime KV cache management and kernel scheduling with Wide Expert Parallelism (WideEP) across the full 72-GPU compute domain. Explore the technical details and get started with deployment recipes.

[Read the Blog](https://cloud.google.com/blog/products/compute/scaling-moe-inference-with-nvidia-dynamo-on-google-cloud-a4x)

### Accelerate Model Downloads on GKE With NVIDIA Run:ai Model Streamer

Google Cloud and NVIDIA are supercharging AI workloads with native Google Cloud Storage support for the NVIDIA Run:ai Model Streamer. This integration reduces model loading "cold start" times on GKE by streaming tensors directly to GPU memory, enabling faster auto-scaling and improved GPU efficiency for high-performance AI inference.

[Read the Blog](https://cloud.google.com/blog/products/containers-kubernetes/nvidia-runai-model-streamer-supports-cloud-storage)

### NVIDIA and Google Cloud Accelerate Enterprise AI and Industrial Digitalization

NVIDIA and Google Cloud are accelerating industrial digitalization with the general availability of G4 VMs, powered by NVIDIA Blackwell GPUs. By bringing NVIDIA Omniverse and NVIDIA Isaac Sim™ to Google Cloud Marketplace, this partnership empowers enterprises to scale physical AI and digital twins for manufacturing, automotive, and logistics workloads.

[Read the Blog](https://blogs.nvidia.com/blog/nvidia-google-cloud-enterprise-ai-industrial-digitalization/)

### Google Cloud Now Shipping A4X Max, Vertex AI Training, and More

Google Cloud is expanding its AI Hypercomputer with the shipment of A4X VMs, powered by the NVIDIA Blackwell platform. Integrated with Vertex AI, these instances provide massive scale and performance for training and inference, enabling enterprises to accelerate the development of complex generative AI models and sophisticated agentic AI applications.

[Read the Blog](https://cloud.google.com/blog/products/compute/now-shipping-a4x-max-vertex-ai-training-and-more)

## Access the Power of Google Cloud and NVIDIA

[Contact Sales](#contact-sales)


## Contact Sales
