SoftBank has developed Infrinia AI Cloud OS software stack for AI data center GPU clouds

SoftBank has announced that its Infrinia Team has developed Infrinia AI Cloud OS, a software stack designed for AI data centers. SoftBank says the stack is intended to help AI data center operators deliver GPU cloud services at scale by managing GPUs, Kubernetes, and AI workloads, and by supporting multi-tenant Kubernetes as a Service and Inference as a Service.

SoftBank says deploying Infrinia AI Cloud OS lets operators build Kubernetes as a Service in a multi-tenant environment and offer Inference as a Service that provides Large Language Model inference via APIs as part of their own GPU cloud services. SoftBank also says the software is expected to reduce total cost of ownership and operational burden compared with bespoke solutions or in-house development, and to support the full AI lifecycle from model training to inference.

For Kubernetes as a Service, SoftBank says the stack automates “the entire stack (from BIOS and RAID settings to the OS, GPU Drivers, networking, Kubernetes Controllers and Storage)” on GPU platforms including NVIDIA GB200 NVL72. SoftBank also says it supports software-defined, on-the-fly physical connectivity (NVIDIA NVLink) and memory (Inter-Node Memory Exchange) reconfiguration as customers create, update, and delete clusters, plus automatic node allocation based on GPU proximity and NVIDIA NVLink domain to reduce latency and maximize GPU-to-GPU bandwidth for distributed jobs.

For Inference as a Service, SoftBank says users can deploy inference services by selecting Large Language Models without working with Kubernetes or underlying infrastructure. The company says the service provides OpenAI-compatible APIs for “drop-in integration with existing AI applications,” and “seamless scaling across multiple nodes in core and edge platforms such as NVIDIA GB200 NVL72 and other platforms.”

SoftBank lists secure multi-tenancy and operability features including tenant isolation “through encrypted cluster communications and separation,” automation for operational maintenance including system monitoring and failover, and an API environment for connecting to an AI data center portal, customer management systems, and billing systems. SoftBank says it plans to deploy Infrinia AI Cloud OS initially within its own GPU cloud services, and that the Infrinia Team aims to expand deployment to overseas data centers and cloud environments.

Source: SoftBank

Get Data Center Engineering News In Your Inbox:

Popular Posts:

Screenshot
Five AI data centers to reach 1 GW power capacity in 2026, new analysis shows
1600x1600_1
DCX announces 8.15 MW facility-scale CDU for 45 C warm-water AI data center cooling
hybrid-power-stabilizer
Prevalon launches Hybrid Power Stabilizer for AI data center power stabilization
pr429-10kw
Navitas ships a 10 kW 800 V-to-50 V DC-DC platform for high-voltage DC AI data center power
pr434-option-d-1
Navitas launches fifth-generation 1,200 V SiC TAP MOSFET platform for AI data center power

Share Your Data Center Engineering News

Do you have a new product announcement, webinar, whitepaper, or article topic? 

Get Data Center Engineering News In Your Inbox: