Rafay and Aviz partner to streamline GPU cloud orchestration and AI networking for data centers

Rafay Systems and Aviz Networks have announced a partnership to provide enterprises and GPU cloud providers with an integrated orchestration and monitoring platform for GPU-based multi-tenant AI environments. According to Rafay, the platform combines its Kubernetes and GPU lifecycle management with Aviz’s AI-optimized fabric orchestration, network visibility, and tenant-aware automation.

According to Rafay and Aviz, the joint solution delivers several technical features for data center and cloud environments:

  • End-to-End Self-Service: Secure, on-demand access to GPU and CPU resources with tenant-aware network automation for developers and data scientists.
  • AI-Ready Fabric Orchestration: Aviz ONES orchestrates Spectrum-X switches, GPU network interface cards (NICs), and server networking to support high east-west (E/W) bandwidth and workload isolation.
  • Multi-Tenancy at Scale: Rafay manages GPU resource assignments within Kubernetes clusters, while Aviz maps GPUs and NICs to logical network segments for tenant-specific isolation and visibility.
  • Full-Stack Observability: Real-time insights into compute and network layers are designed to reduce troubleshooting time and improve GPU utilization.
  • Rapid Time-to-Market: Integrated APIs support deployment of GPU cloud environments in weeks and take the place of manual provisioning.

Haseeb Budhani, CEO and Co-Founder of Rafay Systems, stated, “Cloud providers and enterprises need a simple way to consume GPU infrastructure without reinventing orchestration stacks,” adding, “Our partnership with Aviz gives customers not just self-service compute, but the tools and visibility they need to run AI workloads at scale.”

Vishal Shukla, CEO and Co-Founder of Aviz Networks, said, “Aviz was founded to make AI networking simple, open, and multi-vendor – while giving networking teams the best experience in this hyper-evolving world of AI fabrics and exponential bandwidth needs.” Shukla continued, “Together with Rafay, we deliver a powerful combination: Rafay’s compute lifecycle automation with Aviz’s fabric-level multi-tenant orchestration and observability. This allows GPU cloud providers to achieve AWS-like efficiency with a simple, intuitive stack.”

Rafay and Aviz say their integrated stack addresses common problems in GPU infrastructure environments, including difficulties with orchestration and multi-tenancy and a lack of fabric-level insight, which often result in underutilized GPUs and high operational costs.

Source: Rafay Systems
