Penguin Solutions launches ICE ClusterWare 13.0 to optimize AI data center server performance and secure multi-tenant clusters

Penguin Solutions has announced the upcoming release of ICE ClusterWare 13.0, the latest version of its management platform for high-performance computing (HPC) and artificial intelligence (AI) clusters. According to Penguin Solutions, the new version introduces two key features: patent-pending anomaly detection with automated remediation, and network-isolated multi-tenancy for secure resource separation.

The anomaly detection and auto-remediation feature in ICE ClusterWare 13.0 continuously monitors cluster operations to identify hidden performance degradation. If an underperforming node is found, the software isolates it and initiates automated remediation in real time, ensuring only validated high-performing nodes handle workloads. Penguin Solutions claims this reduces manual intervention, minimizes downtime, and accelerates model training by cutting down on restart events.
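Penguin Solutions has not published how its patent-pending detection works, so the following is only a minimal Python sketch of the general idea described above: compare each node against a cluster baseline and keep flagged nodes out of the scheduling pool. The function names, the median baseline, the 15% tolerance, and the sample throughput figures are all hypothetical and are not taken from ICE ClusterWare.

```python
from statistics import median


def find_underperformers(throughput, tolerance=0.15):
    """Return nodes whose throughput is more than `tolerance` below the cluster median.

    Hypothetical heuristic for illustration only; not the vendor's method.
    """
    baseline = median(throughput.values())
    cutoff = baseline * (1 - tolerance)
    return sorted(node for node, value in throughput.items() if value < cutoff)


def schedule_pool(throughput, tolerance=0.15):
    """Split nodes into (healthy, isolated) lists based on the heuristic above."""
    flagged = set(find_underperformers(throughput, tolerance))
    healthy = sorted(node for node in throughput if node not in flagged)
    return healthy, sorted(flagged)


# Made-up per-node throughput samples (arbitrary units).
samples = {"node01": 98.0, "node02": 101.0, "node03": 72.0, "node04": 100.0}
healthy, isolated = schedule_pool(samples)
print("schedulable:", healthy)
print("isolated for remediation:", isolated)
```

In this sketch, a node that falls well below its peers is removed from the pool before it can slow a job that runs across many nodes, which is the effect the announcement attributes to the feature.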

The network-isolated multi-tenancy feature lets organizations segment a single cluster into secure, dedicated subclusters. Each tenant (such as a department, project, or external GPU-as-a-Service customer) can operate in an isolated environment, choose its own workload manager, and manage its own users, with assurance that data and operations are securely segregated. This capability targets operators of large GPU clusters that serve diverse internal or external user groups, aiming to maximize infrastructure utilization while maintaining security and autonomy for each group.
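As a rough illustration of the partitioning idea, the Python sketch below models tenants as disjoint sets of nodes, each with its own workload manager and user list, and rejects any layout in which a node belongs to two tenants. The `Tenant` class, `validate_partition` function, tenant names, and workload manager values are hypothetical; the announcement does not describe ICE ClusterWare's configuration format or say which workload managers are supported, and real network isolation is enforced at the network layer rather than by a check like this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tenant:
    """Hypothetical description of one subcluster."""
    name: str
    workload_manager: str          # illustrative value, chosen per tenant
    nodes: frozenset
    users: frozenset = frozenset()


def validate_partition(tenants):
    """Return a node -> tenant name map, or raise ValueError on any shared node."""
    owner = {}
    for tenant in tenants:
        for node in tenant.nodes:
            if node in owner:
                raise ValueError(
                    f"{node} is assigned to both {owner[node]} and {tenant.name}"
                )
            owner[node] = tenant.name
    return owner


tenants = [
    Tenant("genomics", "slurm", frozenset({"gpu01", "gpu02"}), frozenset({"alice"})),
    Tenant("imaging", "kubernetes", frozenset({"gpu03"}), frozenset({"bob"})),
]
owner = validate_partition(tenants)
print(owner["gpu03"])
```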

Penguin Solutions cites applications for ICE ClusterWare 13.0 in hyperscale and cloud service provider data centers, enterprises delivering AI computing to multiple business groups, research institutes, and government agencies requiring stringent resource isolation and security.

ICE ClusterWare 13.0 is scheduled for general availability on December 2, 2025.

Speaking to biomedical and life sciences research workloads, Shailesh Shenoy, Assistant Dean for Information Technology at Albert Einstein College of Medicine, stated, “The pace and quality of biomedical research are directly tied to the technology that supports it,” adding, “AI and HPC are crucial to providing the computational power that biometrics, life science, and medical research require, but we also had to ensure that it is optimized for our specific use cases. Having a trusted partner in Penguin Solutions has enabled us to not only build out this infrastructure, but also helped ensure we can manage and optimize it to keep it running smoothly and at capacity, freeing our faculty and student researchers to continue their groundbreaking work without interruption.”

Source: Penguin Solutions
