TDK — leaderboard ()

KAYTUS launches all-QLC flash storage for 10,000-GPU AI clusters

KAYTUS has launched an all-QLC flash storage solution aimed at feeding data to 10,000-GPU clusters, targeting read-heavy AI training environments where storage throughput and latency can cap GPU utilization. Introduced at AI EXPO KOREA 2026, the company frames the system as a way to remove data-delivery bottlenecks in ultra-large-scale training deployments.

The all-QLC flash storage solution is built on KAYTUS’s KR2280 and KR1180 server platforms and integrates with “industry-leading AI-native parallel file systems.” KAYTUS describes the design goal as avoiding data silos associated with tiered storage by tying compute nodes and the parallel file system more tightly together for read-intensive workloads.

In testing described for an exabyte-scale deployment, KAYTUS reports 10 TB/s aggregate bandwidth and 100 million IOPS. The company also claims a five-year TCO reduction of 70% versus traditional TLC-based solutions. Those are big numbers, but for data center operators the practical question is how repeatable they are once you layer in real-world network design, metadata behavior, and multi-tenant contention.

On the architecture side, KAYTUS says the system provides a unified namespace with native multi-protocol access across file, object, and block storage. It uses high-capacity QLC flash pools and NVMe-oF shared interconnects, with the goal of allowing data to flow to GPU nodes “on demand” without cross-system migration.

At the node level, KAYTUS says the design uses a PCIe 5.0 direct-connect architecture intended to double single-node I/O bandwidth compared to the previous generation, along with NUMA-balanced optimization to reduce internal throughput bottlenecks. On the data path, KAYTUS says it supports NFS over RDMA and native GPU Direct Storage, with a disaggregated design that decouples protocol processing from storage states to reduce east-west traffic.

KAYTUS also outlined a QLC product lineup with single-drive capacities up to 122.88 TB. KR1180 (1U10) is listed at 1 PB capacity and 140 GB/s bandwidth in 1U, with “optimized air cooling” and an 18% latency improvement under GPU workloads. KR2280 (2U24) supports 24 QLC drives and seven PCIe 5.0 slots, is compatible with Intel and AMD platforms, and has liquid-cooling options. KR4266 (4U60) is listed at up to 7 PB per unit, 260 GB/s sequential read bandwidth, and 20 million IOPS.

Source: KAYTUS

Get Data Center Engineering News In Your Inbox:

By subscribing, you agree to our Privacy Policy and Terms of Use. You can unsubscribe at any time.

Popular Posts:

Sam-Abdel-Rahman,-Infineon-1
Silicon, SiC, or GaN? In the AI rack, the winner depends on where you look
Screenshot
EPC eval board converts 800 VDC to 12.5 VDC at 6 kW for AI racks
Not-All-Liquids-Are-Created-Equal-White-Paper-FINAL-1
Download the practical guide to liquid cooling fluid selection
picotest-thumbnail
A closer look at power integrity at AI scale
2602-Southwire-EnergizingTheDigitalEra-eGuide-HIGHRES-5
How to reduce electrical infrastructure risk in data center projects: download the guide
TDK — tower ()

Share Your Data Center Engineering News

Do you have a new product announcement, webinar, whitepaper, or article topic? 

Get Data Center Engineering News In Your Inbox:

By subscribing, you agree to our Privacy Policy and Terms of Use. You can unsubscribe at any time.