Rafay Systems and Aviz Networks have announced a partnership to provide enterprises and GPU cloud providers with an integrated orchestration and monitoring platform for multi-tenant, GPU-based AI environments. According to Rafay, the offering combines its Kubernetes and GPU lifecycle management with Aviz’s AI-optimized fabric orchestration, network visibility, and tenant-aware automation.
According to Rafay and Aviz, the joint solution delivers several technical features for data center and cloud environments:
- End-to-End Self-Service: Secure, on-demand access to GPU and CPU resources with tenant-aware network automation for developers and data scientists.
- AI-Ready Fabric Orchestration: Aviz ONES orchestrates NVIDIA Spectrum-X switches, GPU network interface cards (NICs), and server networking to support high east-west (E/W) bandwidth and workload isolation.
- Multi-Tenancy at Scale: Rafay manages GPU resource assignments within Kubernetes clusters, while Aviz maps GPUs and NICs to logical network segments for tenant-specific isolation and visibility.
- Full-Stack Observability: Real-time insights into compute and network layers are designed to reduce troubleshooting time and improve GPU utilization.
- Rapid Time-to-Market: Integrated APIs are intended to let operators deploy GPU cloud environments in weeks, without relying on manual provisioning.
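The multi-tenancy item above, in which GPUs and NICs are mapped to per-tenant network segments, can be pictured with a small toy model. This is only an illustrative sketch and does not represent the Rafay or Aviz APIs: the names `Device`, `TenantFabricMap`, and the segment numbers are hypothetical, and it assumes each tenant gets exactly one segment and each device belongs to at most one tenant.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class Device:
    """A GPU or NIC identified by host and local name (hypothetical naming)."""
    host: str
    name: str


@dataclass
class TenantFabricMap:
    """Toy model: every device belongs to at most one tenant's segment."""
    # Segment id per tenant, e.g. a VLAN- or VNI-like number (illustrative only).
    segments: Dict[str, int] = field(default_factory=dict)
    # Owning tenant per device.
    owners: Dict[Device, str] = field(default_factory=dict)

    def add_tenant(self, tenant: str, segment_id: int) -> None:
        """Register a tenant with its own, unshared segment id."""
        if tenant in self.segments:
            raise ValueError(f"tenant {tenant!r} already exists")
        if segment_id in self.segments.values():
            raise ValueError(f"segment {segment_id} already in use")
        self.segments[tenant] = segment_id

    def assign(self, tenant: str, device: Device) -> None:
        """Give a device to a tenant; refuse if another tenant owns it."""
        if tenant not in self.segments:
            raise KeyError(f"unknown tenant {tenant!r}")
        current = self.owners.get(device)
        if current is not None and current != tenant:
            raise ValueError(f"{device} already assigned to {current!r}")
        self.owners[device] = tenant

    def segment_of(self, device: Device) -> Optional[int]:
        """Return the segment id a device sits in, or None if unassigned."""
        tenant = self.owners.get(device)
        return None if tenant is None else self.segments[tenant]

    def can_communicate(self, a: Device, b: Device) -> bool:
        """Two devices may talk only if both sit in the same segment."""
        seg_a, seg_b = self.segment_of(a), self.segment_of(b)
        return seg_a is not None and seg_a == seg_b
```

In this sketch, a GPU and a NIC assigned to the same tenant share a segment and can communicate, while devices belonging to different tenants cannot, which is the isolation property the announcement describes at a high level.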
Haseeb Budhani, CEO and Co-Founder of Rafay Systems, stated, “Cloud providers and enterprises need a simple way to consume GPU infrastructure without reinventing orchestration stacks,” adding, “Our partnership with Aviz gives customers not just self-service compute, but the tools and visibility they need to run AI workloads at scale.”
Vishal Shukla, CEO and Co-Founder of Aviz Networks, said, “Aviz was founded to make AI networking simple, open, and multi-vendor – while giving networking teams the best experience in this hyper-evolving world of AI fabrics and exponential bandwidth needs.” Shukla continued, “Together with Rafay, we deliver a powerful combination: Rafay’s compute lifecycle automation with Aviz’s fabric-level multi-tenant orchestration and observability. This allows GPU cloud providers to achieve AWS-like efficiency with a simple, intuitive stack.”
Rafay and Aviz say the integrated stack targets common problems in GPU infrastructure environments, including challenges with orchestration and multi-tenancy and a lack of fabric-level insight, which often leave GPUs underutilized and raise operational costs.
Source: Rafay Systems







