Confidential AI with GPU Acceleration: Bounce Buffers Offer a Solution Today

by Mike Ferron-Jones (Intel) and Dan Middleton (NVIDIA)

As AI workloads increasingly process sensitive and regulated data, enterprises face a growing challenge: how to combine the performance of GPU acceleration with strong confidentiality guarantees. Confidential AI aims to meet this need by protecting data while it is in use, not just at rest or in transit. While Intel® Xeon® CPUs and NVIDIA GPUs both now support Trusted Execution Environments (TEEs), securely connecting these isolated domains has been a critical architectural hurdle. This is where the “bounce buffer” architecture comes into play.

Why GPU-Accelerated Confidential AI Matters

Many modern AI use cases, including healthcare analytics, financial modeling, and personalized recommendation systems, depend on highly sensitive inputs and proprietary models, a trend that agentic AI will accelerate. AI workloads often require GPUs to meet performance requirements for training and inference, but traditional GPU passthrough across PCIe exposes data to system software and firmware outside the trusted boundary. This creates an inherent trust and privacy problem: organizations need assurance that data, model weights, and intermediate results remain confidential and unaltered throughout execution, even in shared or cloud environments.

The Trust Gap Between CPU and GPU TEEs

Both Intel and NVIDIA provide TEEs: Intel® Trust Domain Extensions (Intel® TDX) for CPUs and NVIDIA Confidential Computing modes for GPUs. However, data must still traverse the PCIe interconnect between these two domains. Without additional protection, DMA operations or other transfers could expose plaintext data on an unencrypted channel. The challenge is not a lack of TEEs but connecting them securely without breaking confidentiality or incurring unacceptable performance degradation.

What Is a Bounce Buffer?

A bounce buffer is an intermediary memory region used to securely stage data transfers between CPU and GPU TEEs. In the NVIDIA Confidential Computing deployment architecture, GPU DMA operations are redirected through a host-managed, encrypted bounce buffer. Data is decrypted only inside the CPU TEE, processed, and then re-encrypted before being staged in bounce buffer memory for GPU consumption. This approach ensures that neither the hypervisor nor the device path ever sees plaintext data.

Figure 1. Visualization of CPU and GPU TEE with encrypted bounce buffer.
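
The staging flow described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the NVIDIA or Intel implementation: the names `seal`, `unseal`, `BounceBuffer`, `cpu_tee_send`, and `gpu_tee_receive` are hypothetical, the session key is simply generated in place rather than negotiated between the TEEs, and the cipher is a toy SHA-256 keystream with an HMAC tag that stands in for the authenticated encryption a real stack would use. It shows only the core idea: plaintext exists inside the two TEEs, while the shared buffer holds ciphertext.

```python
import hashlib
import hmac
import os

NONCE_LEN, TAG_LEN = 12, 32


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode). Illustrative only, not production crypto."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def _subkeys(session_key: bytes):
    # Derive separate encryption and integrity keys from the shared session key.
    return (hashlib.sha256(b"enc" + session_key).digest(),
            hashlib.sha256(b"mac" + session_key).digest())


def seal(session_key: bytes, plaintext: bytes) -> bytes:
    """Encrypt and authenticate data before it leaves a TEE."""
    enc_key, mac_key = _subkeys(session_key)
    nonce = os.urandom(NONCE_LEN)
    stream = _keystream(enc_key, nonce, len(plaintext))
    ciphertext = bytes(a ^ b for a, b in zip(plaintext, stream))
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag


def unseal(session_key: bytes, blob: bytes) -> bytes:
    """Verify the tag, then decrypt. Runs only inside a TEE."""
    enc_key, mac_key = _subkeys(session_key)
    nonce, ciphertext, tag = blob[:NONCE_LEN], blob[NONCE_LEN:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("bounce buffer contents were modified in transit")
    stream = _keystream(enc_key, nonce, len(ciphertext))
    return bytes(a ^ b for a, b in zip(ciphertext, stream))


class BounceBuffer:
    """Shared, untrusted staging memory that the host can read and write."""
    def __init__(self) -> None:
        self.data = b""


def cpu_tee_send(session_key: bytes, bounce: BounceBuffer, plaintext: bytes) -> None:
    bounce.data = seal(session_key, plaintext)   # only ciphertext is staged


def gpu_tee_receive(session_key: bytes, bounce: BounceBuffer) -> bytes:
    return unseal(session_key, bounce.data)      # plaintext reappears inside the GPU TEE


# Assumption: both TEEs already share a session key.
session_key = os.urandom(32)
bounce = BounceBuffer()
secret = b"model weights and patient records"

cpu_tee_send(session_key, bounce, secret)
host_view = bounce.data                          # what the hypervisor could observe
received = gpu_tee_receive(session_key, bounce)
```

In this sketch `host_view` contains a nonce, ciphertext, and tag but not the original bytes, and any modification of the staged data causes `unseal` to raise rather than return corrupted plaintext. The extra encrypt, copy, and decrypt steps are the source of the overhead that a bounce buffer design accepts in exchange for confidentiality.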

Reference Architecture and Implementation

Intel and NVIDIA collaborated closely on solution engineering and validation of the bounce buffer architecture, working with Canonical to enable a production-ready software stack. The reference implementation combines Intel TDX-enabled Xeon platforms; NVIDIA H100, H200, and Blackwell B200 and B300 GPUs operating in Confidential Computing modes; and an Ubuntu Linux virtualization stack capable of enforcing memory isolation and encrypted DMA paths over PCIe. The reference architecture and deployment guide are publicly available today (see Resources and Further Reading).

Solution Ingredients

The reference architecture hardware uses 5th Gen Intel Xeon Scalable CPUs (code-named “Emerald Rapids”) with NVIDIA Hopper, NVIDIA Blackwell, and the RTX PRO Server GPU family of offerings. The host OS and virtualization are provided by Ubuntu 25.10, and the guest OS is Ubuntu 24.04 LTS. This stack enables the establishment of TEEs on both the CPU and multiple GPUs, and provides the OS support needed to manage bounce buffer mappings.

While the bounce buffer introduces additional copy and encryption steps, observed performance remains suitable for real-world AI inference scenarios, especially when weighed against the security, privacy, and compliance benefits it provides.

Remote attestation is a critical part of Confidential Computing, providing cryptographic assurance and verification that the CPU and GPU TEEs launched correctly and are running as expected. In addition to bounce buffers, Intel and NVIDIA worked together to synchronize CPU and GPU attestation through Intel Trust Authority, enabling customers to receive attestations via a single service rather than using separate services.

The Road Ahead: TEE-IO and Intel TDX Connect

To address these architectural gaps, there has been a broader industry push to secure data in use through open, interoperable confidential computing primitives rather than siloed, vendor-specific solutions. In that spirit, the solution aligns with the community models emerging in the Confidential Computing Consortium, where hardware vendors, cloud providers, and software developers collaborate on common TEE building blocks and deployment patterns.

While bounce buffers provide a practical solution today, the industry is moving toward standards-based TEE-IO, where the CPU and attached devices can effectively establish a single logical TEE, with faster direct memory access and end-to-end encrypted communications. Intel TDX Connect is Intel’s framework for securely binding CPU and device TEEs with hardware-level PCIe link encryption, reducing overhead and improving efficiency. NVIDIA Accelerated Confidential Computing and Intel Xeon 6 processors (code-named “Granite Rapids”) are already architecturally prepared for Intel TDX Connect adoption as the ecosystem software matures.

Production Ready Today

The bounce buffer architecture is not theoretical. Confidential AI solutions using this technology are already in production at major cloud service providers including Alibaba, ByteDance, Google, and Oracle, with additional providers expected to follow. Customers can also work with their preferred Linux distribution vendors to deploy select inference workloads on-premises. These deployments demonstrate that Confidential Computing and GPU acceleration can coexist at scale. We invite anyone interested to take them for a test drive today.

Resources and Further Reading

NVIDIA Deployment Guide for Secure AI

Intel Confidential Computing Homepage

NVIDIA Confidential Computing Homepage

Intel TDX Connect Architectural Specification

Intel NVIDIA Seamless Attestation Whitepaper

No product or component can be absolutely secure.

Legal Notices and Disclaimers.