Rafay and Aviz partner to streamline GPU cloud orchestration and AI networking for data centers

Rafay Systems and Aviz Networks have announced a partnership to provide enterprises and GPU cloud providers with an integrated orchestration and monitoring platform for GPU-based multi-tenant AI environments. Rafay reports that this combination merges its Kubernetes and GPU lifecycle management with Aviz’s AI-optimized fabric orchestration, network visibility, and tenant-aware automation.

According to Rafay and Aviz, the joint solution delivers several technical features for data center and cloud environments:

  • End-to-End Self-Service: Secure, on-demand access to GPU and CPU resources with tenant-aware network automation for developers and data scientists.
  • AI-Ready Fabric Orchestration: Aviz ONES orchestrates Spectrum-X switches, GPU network interface cards (NICs), and server networking to support high east-west (E/W) bandwidth and workload isolation.
  • Multi-Tenancy at Scale: Rafay manages GPU resource assignments within Kubernetes clusters, while Aviz maps GPUs and NICs to logical network segments for tenant-specific isolation and visibility.
  • Full-Stack Observability: Real-time insights into compute and network layers are designed to reduce troubleshooting time and improve GPU utilization.
  • Rapid Time-to-Market: Integrated APIs support deployment of GPU cloud environments in weeks, replacing manual provisioning.
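
To make the multi-tenancy point above concrete, here is a minimal sketch of how GPU assignment and tenant scoping typically look at the Kubernetes layer. This is a plain, standard Kubernetes manifest for illustration only; it is not Rafay's or Aviz's configuration format, and the `team-a` namespace and image are hypothetical placeholders:

```yaml
# Illustrative only: generic Kubernetes, not Rafay- or Aviz-specific.
# The nvidia.com/gpu extended resource is exposed by the NVIDIA device plugin;
# the "team-a" namespace stands in for a tenant boundary that platforms like
# Rafay manage and that fabric tooling can map to network segments.
apiVersion: v1
kind: Pod
metadata:
  name: training-job
  namespace: team-a            # tenant isolation boundary
spec:
  containers:
    - name: trainer
      image: nvcr.io/nvidia/pytorch:24.01-py3   # placeholder workload image
      resources:
        limits:
          nvidia.com/gpu: 2    # scheduler assigns two whole GPUs to this pod
```

In this model, the orchestration layer controls which namespaces (tenants) may request GPUs and how many, while the fabric layer decides how the NICs behind those GPUs are segmented on the network.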

Haseeb Budhani, CEO and Co-Founder of Rafay Systems, stated, “Cloud providers and enterprises need a simple way to consume GPU infrastructure without reinventing orchestration stacks,” adding, “Our partnership with Aviz gives customers not just self-service compute, but the tools and visibility they need to run AI workloads at scale.”

Vishal Shukla, CEO and Co-Founder of Aviz Networks, said, “Aviz was founded to make AI networking simple, open, and multi-vendor – while giving networking teams the best experience in this hyper-evolving world of AI fabrics and exponential bandwidth needs.” Shukla continued, “Together with Rafay, we deliver a powerful combination: Rafay’s compute lifecycle automation with Aviz’s fabric-level multi-tenant orchestration and observability. This allows GPU cloud providers to achieve AWS-like efficiency with a simple, intuitive stack.”

Rafay and Aviz highlight that their integrated stack addresses common pain points in GPU infrastructure environments, including complex orchestration, limited multi-tenancy, and a lack of fabric-level insight, which often result in underutilized GPUs and high operational costs.

Source: Rafay Systems
