Storage-aware Platform

Any storage, one interface

Backend.AI's Storage Proxy abstracts diverse storage backends into a single interface, exposing them as virtual folders (VFolders), and optimizes I/O between containers and storage. With NVIDIA GPUDirect Storage, data moves directly from storage to GPU memory, eliminating the I/O bottleneck at its source.
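To make the abstraction concrete, here is a minimal illustrative sketch of the pattern, not Backend.AI's actual code: the StorageBackend interface and NFSBackend below are hypothetical. Containers talk only to a virtual folder; the physical backend behind it is swappable.

```python
from typing import Protocol


class StorageBackend(Protocol):
    """Hypothetical minimal interface a storage proxy could target."""

    def read(self, path: str) -> bytes: ...
    def write(self, path: str, data: bytes) -> None: ...


class NFSBackend:
    """Stub backend: reads and writes through an NFS mount point."""

    def __init__(self, mount_root: str) -> None:
        self.mount_root = mount_root

    def read(self, path: str) -> bytes:
        with open(f"{self.mount_root}/{path}", "rb") as f:
            return f.read()

    def write(self, path: str, data: bytes) -> None:
        with open(f"{self.mount_root}/{path}", "wb") as f:
            f.write(data)


class VFolder:
    """One virtual folder name; any backend may sit behind it."""

    def __init__(self, name: str, backend: StorageBackend) -> None:
        self.name = name
        self.backend = backend

    def read(self, path: str) -> bytes:
        return self.backend.read(f"{self.name}/{path}")


# A container sees only "datasets"; whether NFS, S3, or WekaFS sits
# underneath is an operator-side detail.
datasets = VFolder("datasets", NFSBackend("/mnt/nfs"))
```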

The problem

Is GPU performance bottlenecked by storage?

GPU compute performance multiplies with every generation, but storage I/O bandwidth can't keep pace.

01

GPU idle cost

GPUs accrue hourly cost even while they sit idle waiting for data behind an I/O bottleneck. As storage delays pile up, the compute you actually get per dollar invested deteriorates.

02

CPU bottleneck

Inefficient data flows routed through the CPU can become a bottleneck that degrades high-performance AI workload throughput.

03

Storage fragmentation

Organizations operate diverse commercial and open-source storage: VAST, PureStorage, WEKA, NetApp, and more. If an AI platform only supports specific storage, data silos emerge.

Storage, by design

A dedicated lane for your data

Backend.AI is designed around a 3-Plane Architecture that cleanly separates management (Control Plane), compute (Compute Plane), and I/O (Storage Plane). The Storage Proxy transparently abstracts the type and location of physical storage and optimizes I/O between containers and storage. It also isolates tenants' data from one another through random UUID-based namespaces.

[Diagram: 3-Plane Architecture. Control Plane: API Gateway / Auth / Scheduler. Compute Plane: GPU Containers / Sokovan Scheduler / Sessions. Storage Plane: Storage Proxy, connected to backends via NVIDIA GPUDirect Storage / RDMA / InfiniBand.]
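As a minimal sketch of the namespace-isolation idea (illustrative only; the actual on-disk layout used by Backend.AI may differ, and the /vfroot path is a placeholder), each virtual folder can be materialized under a freshly generated random UUID, so physical paths are unguessable and leak nothing about other tenants:

```python
import uuid
from pathlib import Path

# Hypothetical physical root managed by the storage proxy.
STORAGE_ROOT = Path("/vfroot")


def create_vfolder_dir(storage_root: Path = STORAGE_ROOT) -> Path:
    """Allocate an unguessable, tenant-isolated directory for a new VFolder."""
    fid = uuid.uuid4().hex  # random: encodes no tenant, order, or name info
    # Fan out over leading hex digits to keep any single directory small.
    physical = storage_root / fid[:2] / fid[2:4] / fid[4:]
    physical.mkdir(parents=True, exist_ok=False)
    return physical
```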

NVIDIA GPUDirect Storage

Skip the CPU, direct from storage to GPU.

The world's first containerized support for NVIDIA Magnum IO GPUDirect Storage.

Legacy I/O path

[Diagram: Legacy I/O path. (1) Storage sends data over PCIe to the CPU, (2) the CPU stages it in a system-memory bounce buffer, (3) the data is copied again over PCIe to the GPU.]
Increased CPU load, unnecessary memory copies, high latency.

NVIDIA GPUDirect Storage Path

[Diagram: NVIDIA GPUDirect Storage path. (1) Storage initiates a direct DMA transfer over PCIe / NIC, (2) data lands in GPU memory, bypassing the CPU and system memory entirely.]
CPU load eliminated, fewer memory copies, minimal latency.
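The benchmark below drives this path with KvikIO, RAPIDS' Python binding to cuFile (the GPUDirect Storage API). A minimal round trip looks roughly like this, assuming a GDS-capable file system mounted at a placeholder path and the kvikio and cupy packages installed:

```python
import cupy
import kvikio

a = cupy.arange(1_000_000, dtype=cupy.float32)

# Write GPU memory straight to the file; with GDS enabled, the DMA runs
# storage <-> GPU with no bounce buffer in system memory.
f = kvikio.CuFile("/mnt/gds/test-file", "w")
f.write(a)
f.close()

# Read the file directly back into GPU memory.
b = cupy.empty_like(a)
f = kvikio.CuFile("/mnt/gds/test-file", "r")
f.read(b)
f.close()

assert bool((a == b).all())
```

KvikIO also exposes non-blocking pread/pwrite variants; that is what the "kvikio-pwrite" case in the benchmark below exercises.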

Benchmark

NVIDIA GPUDirect Storage + Dell PowerScale

Pairing a storage system that supports NVIDIA GPUDirect Storage with management software that can drive it accelerates storage I/O. Dell PowerScale is one such GPUDirect Storage-capable solution, and it delivers its best performance when used together with Backend.AI.

View Dell PowerScale Case Study

I/O Performance (Without vs With GDS)

Workload                          Without GDS   With GDS   Speedup
Read + Write (Gaussian Filter)    108.2 s       40.1 s     2.7x
Write Only (kvikio pwrite)        39.1 s        6.9 s      5.7x
Write + LZ4 (nvCOMP)              11.6 s        5.5 s      2.1x

Closer means faster

NUMA-Aware Scheduling

In multi-socket servers, each CPU socket has its own local memory region (a NUMA node). When a GPU accesses memory on a remote node, latency rises significantly compared to local access.

Backend.AI's Sokovan scheduler is NUMA topology-aware, placing workloads to use memory and NICs on the same NUMA node as the GPU. RDMA and InfiniBand paths are also optimized for NUMA topology, eliminating unnecessary remote memory access across the entire data path from storage to GPU.

[Diagram: Multi-socket server. NUMA Node 0 (CPU Socket 0, local memory, GPU 0, GPU 1, RDMA NIC) serves the local path with low latency; traffic crossing to NUMA Node 1 (CPU Socket 1, remote memory, GPU 2, GPU 3, NIC) takes the cross-NUMA path with high latency.]
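On Linux, the topology information such a scheduler needs is available from sysfs. This illustrative sketch (not Sokovan's actual code; the PCI addresses are hypothetical) looks up the NUMA node of a PCI device so a GPU and its NIC can be checked for co-location:

```python
from pathlib import Path


def numa_node_of(bdf: str) -> int:
    """NUMA node of a PCI device given its bus address, e.g. '0000:3b:00.0'.

    Reads the kernel's sysfs entry; -1 means the platform reported no
    affinity (common on single-socket machines).
    """
    return int(Path(f"/sys/bus/pci/devices/{bdf}/numa_node").read_text())


def colocated(gpu_bdf: str, nic_bdf: str) -> bool:
    """True if a GPU and a NIC share a NUMA node (the low-latency case)."""
    return numa_node_of(gpu_bdf) == numa_node_of(nic_bdf)


if __name__ == "__main__":
    # Hypothetical addresses; list /sys/bus/pci/devices or run
    # `nvidia-smi topo -m` to inspect a real host's topology.
    print(colocated("0000:3b:00.0", "0000:5e:00.0"))
```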

Partnership

Faster paths, built together

Backend.AI partners with storage and accelerator vendors to optimize the data path for AI, co-engineering solutions that maximize throughput and minimize latency.

VAST Data

Since 2025, as a Technology Partner in the VAST COSMOS program, we collaborate on KV cache offloading, VAST AI OS integration, and other technologies to support customer inference workloads.

Dell Technologies

Since 2024, as a Dell Telecom AI self-certified partner, we integrate Dell products, including Dell PowerScale, with Backend.AI to deliver joint solutions to customers.

PureStorage

A Technology Alliance Partner since 2021, we integrate FlashArray and FlashBlade to deliver high-performance storage for customers' AI training and inference.


Other Supported Storage

NetApp ONTAP: NFS, SMB, S3
WekaFS (high-performance distributed): POSIX, S3, NFS
Lustre (parallel file system): POSIX, RDMA
IBM Storage Scale (IBM parallel file system): POSIX, NFS, RDMA
Ceph (software-defined storage): S3, RBD, CephFS
And more: any POSIX-compatible file system

Experience AI infrastructure with integrated storage

Request a demo and see how Backend.AI resolves I/O bottlenecks in your existing storage environment.

View Documentation
