Lumai Iris Nova optical inference server targets 90% lower AI power use

Lumai has introduced Lumai Iris, an optical computing inference server family designed to run billion-parameter large language models (LLMs) in real time. The first system in the line, Lumai Iris Nova, is available now for evaluation by hyperscalers, neo-clouds, enterprises, and research institutions.

Lumai describes Iris as a shift from silicon-based processing to photonics for core inference math, with the goal of improving inference efficiency and reducing energy use. The company claims up to 90% lower energy consumption than conventional architectures for inference workloads, although the savings achieved in practice will depend on the model, batch size, latency targets, and how the server is integrated into an operator’s broader stack.

The Lumai Iris family includes three server variants: Nova, Aura, and Tetra. Nova is the initial system available for evaluation, with future Iris systems intended to extend performance and efficiency for wider deployment across hyperscale and enterprise environments.

Hybrid optical-plus-digital architecture

Iris Nova uses a hybrid processor architecture: digital processing handles system control and software, while an optical tensor engine performs the core mathematical operations. Lumai says this hybrid approach is intended to support integration into data centers. The company also notes that Iris Nova uses standard PCIe cards, which aligns the system with familiar server expansion and service workflows.

On model support, Lumai states that Iris Nova runs real-time inference on the 8B and 70B Llama models. It also highlights the prefill stage in disaggregated inference architectures as a strong fit for its optical approach, positioning the system for scenarios where token throughput and latency are limited mainly by compute.
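To illustrate why prefill tends to be the compute-limited stage, here is a rough back-of-the-envelope sketch. It is not Lumai's analysis. It assumes a dense model in which each parameter contributes about 2 floating-point operations per token and is read once per forward pass, and it ignores attention, activations, and KV-cache traffic. The function name, the 2-byte parameter size, and the 2,048-token prompt are illustrative assumptions.

```python
def flops_per_weight_byte(tokens_per_pass: int, bytes_per_param: float = 2.0) -> float:
    """Approximate arithmetic intensity of one forward pass over dense weights.

    Assumes ~2 FLOPs (multiply + add) per parameter per token, with each
    parameter read once per pass. The parameter count cancels out.
    """
    if tokens_per_pass < 1 or bytes_per_param <= 0:
        raise ValueError("tokens_per_pass must be >= 1 and bytes_per_param > 0")
    return 2.0 * tokens_per_pass / bytes_per_param


# Hypothetical workload: a 2,048-token prompt with 16-bit weights.
prefill = flops_per_weight_byte(2048)  # whole prompt processed in one pass
decode = flops_per_weight_byte(1)      # one new token per pass, batch size 1

print(f"prefill: {prefill:.0f} FLOPs per weight byte")
print(f"decode:  {decode:.0f} FLOPs per weight byte")
```

Under these assumptions, prefill does far more math per byte of weights fetched than single-token decode does, which is why faster matrix math helps prefill most directly.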

Why data center engineers should care

Inference growth is colliding with facility power ceilings, and any credible step-change in performance per kilowatt matters more than almost any single component spec right now. But optical compute claims need hard validation in real deployments, because “energy savings” can evaporate once you account for host CPUs, memory, networking, and the utilization profile required to hit latency SLAs.
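The arithmetic behind that caution can be sketched in a few lines. The example below assumes the accelerator is the only component whose energy changes and that the same work is completed in both cases. The function name and the accelerator shares are hypothetical placeholders, not measurements of any Lumai system; the 90% figure is the vendor's stated upper bound.

```python
def system_energy_saving(accel_share: float, accel_saving: float) -> float:
    """Fraction of total server energy saved when only the accelerator improves.

    accel_share:  fraction of baseline server energy used by accelerators (0..1)
    accel_saving: fractional energy reduction on the accelerator alone (0..1)
    """
    if not (0.0 <= accel_share <= 1.0 and 0.0 <= accel_saving <= 1.0):
        raise ValueError("both inputs must be fractions between 0 and 1")
    # Host CPUs, memory, and networking are assumed unchanged, so only the
    # accelerator's slice of the total shrinks.
    return accel_share * accel_saving


# Hypothetical accelerator shares of total server energy.
for share in (0.9, 0.7, 0.5):
    saving = system_energy_saving(share, 0.90)
    print(f"accelerator share {share:.0%} -> system saving {saving:.0%}")
```

The smaller the accelerator's share of the total, the further the whole-server saving falls below the headline number, which is why the figure to validate is energy per unit of work measured at the server or rack, not at the chip.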

“By shifting the computation paradigm from electrons to photons, Lumai can deliver an order-of-magnitude increase in performance with significant energy savings,” said Dr. Xianxin Guo, CEO and Co-Founder of Lumai.

The Advanced Research and Invention Agency (ARIA) also commented on the effort. “The demands on existing AI processors necessitate an urgent search for alternative scaling pathways,” said Suraj Bramhavar, Program Director at ARIA. “Lumai is leading the charge in demonstrating that optical processors could provide one such pathway.”

Lumai is taking evaluation requests for the Lumai Iris Nova system at lumai.ai/eval.

Source: Lumai