Storage-aware Platform
Any storage, one interface
Backend.AI's Storage Proxy optimizes I/O between containers and storage, abstracting diverse storage backends behind a single interface and exposing them as virtual folders (VFolders). With NVIDIA GPUDirect Storage, data transfers directly from storage to GPU memory, removing the host-memory detour that creates I/O bottlenecks.
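To make the abstraction concrete, here is a minimal, purely illustrative Python sketch (not Backend.AI's actual Storage Proxy API) of how heterogeneous backends can sit behind one VFolder-style interface:

```python
from abc import ABC, abstractmethod
from pathlib import PurePosixPath


class VFolderBackend(ABC):
    """What a storage backend must provide to be exposed as a virtual folder."""

    @abstractmethod
    def mount(self, vfolder_id: str) -> PurePosixPath:
        """Return the path at which the virtual folder becomes available."""


class NFSBackend(VFolderBackend):
    """One concrete backend; CephFS, WEKA, or vendor APIs would be others."""

    def __init__(self, export_root: str) -> None:
        self.export_root = PurePosixPath(export_root)

    def mount(self, vfolder_id: str) -> PurePosixPath:
        # Each virtual folder maps to a subdirectory of the NFS export.
        return self.export_root / vfolder_id


# The container only ever sees the mounted path; which vendor or protocol
# sits behind it is invisible to the workload.
backend: VFolderBackend = NFSBackend("/mnt/exports")
print(backend.mount("team-a-dataset"))
```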
The problem
Is GPU performance bottlenecked by storage?
GPU compute performance keeps climbing with every hardware generation, but storage I/O bandwidth can't keep pace.
GPU idle cost
GPUs keep accruing hourly cost even while they sit idle waiting for data behind an I/O bottleneck. As storage delays add up, the actual compute delivered relative to investment can deteriorate.
CPU bottleneck
When data flows are routed inefficiently through the CPU, the CPU itself becomes a bottleneck that degrades the throughput of high-performance AI workloads.
Storage fragmentation
Organizations operate diverse commercial and open-source storage: VAST, PureStorage, WEKA, NetApp, and more. If an AI platform only supports specific storage, data silos emerge.
Storage, by design
A dedicated lane for your data
Backend.AI is designed around a 3-Plane Architecture that cleanly separates the Control Plane (management), Compute Plane (computation), and Storage Plane (data I/O). The Storage Proxy transparently abstracts the type and location of physical storage and optimizes I/O between containers and storage. It also isolates tenant data through random UUID-based namespaces.
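As a rough sketch of the isolation idea (the volume root and directory layout below are assumptions for illustration, not Backend.AI internals), each virtual folder can live under a path derived from a random UUID rather than from tenant or folder names:

```python
import uuid
from pathlib import PurePosixPath

STORAGE_ROOT = PurePosixPath("/vfroot")  # hypothetical volume root


def allocate_vfolder_path() -> PurePosixPath:
    """Place a new virtual folder under a random, unguessable UUID.

    Because the on-disk location is derived from a random UUID rather than
    a tenant or folder name, one tenant cannot infer or traverse into
    another tenant's data even when they share the same physical volume.
    """
    folder_id = uuid.uuid4().hex
    # Two prefix levels keep any single directory from growing unbounded.
    return STORAGE_ROOT / folder_id[:2] / folder_id[2:4] / folder_id


print(allocate_vfolder_path())  # e.g. /vfroot/3f/9a/3f9a6c1e...
```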
NVIDIA GPUDirect Storage
Skip the CPU, direct from storage to GPU.
The world's first containerized support for NVIDIA Magnum IO GPUDirect Storage.
[Diagram: Legacy I/O path vs. NVIDIA GPUDirect Storage path]
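For example, a GPUDirect Storage read can be driven from Python with NVIDIA's kvikio bindings to cuFile. This is a minimal sketch of one possible path; the file path is hypothetical, and Backend.AI's own GDS integration happens at the platform level rather than in user code:

```python
import cupy
import kvikio

# Destination buffer allocated directly in GPU memory.
gpu_buffer = cupy.empty(128 * 1024 * 1024, dtype=cupy.uint8)  # 128 MiB

# With GDS available, the transfer is DMA'd from NVMe/NIC into GPU memory,
# skipping the host-memory bounce buffer; otherwise kvikio falls back to a
# conventional POSIX read plus a host-to-device copy.
f = kvikio.CuFile("/mnt/dataset/shard-000.bin", "r")  # hypothetical path
nbytes = f.read(gpu_buffer)
f.close()

print(f"read {nbytes} bytes directly into GPU memory")
```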
Benchmark
NVIDIA GPUDirect Storage + Dell PowerScale
Pairing a storage system that supports NVIDIA GPUDirect Storage with management software that can exploit it accelerates storage I/O performance. Dell PowerScale supports NVIDIA GPUDirect Storage and delivers its best performance when used together with Backend.AI.
View Dell PowerScale Case Study →
[Chart: I/O performance, without vs. with GDS]
Closer means faster
NUMA-Aware Scheduling
In multi-socket servers, each CPU socket has its own local memory region (NUMA). When a GPU accesses memory attached to a remote socket, latency increases significantly compared to accessing local memory.
Backend.AI's Sokovan scheduler is NUMA topology-aware, placing workloads to use memory and NICs on the same NUMA node as the GPU. RDMA and InfiniBand paths are also optimized for NUMA topology, eliminating unnecessary remote memory access across the entire data path from storage to GPU.
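As a simplified, single-node sketch of the same idea (the PCI address is a placeholder and the pinning logic is an illustrative assumption, not Sokovan's implementation), GPU-local CPUs can be discovered from sysfs and the worker process pinned to them:

```python
import os
from pathlib import Path

GPU_PCI_ADDR = "0000:3b:00.0"  # hypothetical PCI address of the target GPU


def gpu_numa_node(pci_addr: str) -> int:
    """Read the NUMA node that the GPU's PCIe slot is attached to."""
    node = int(Path(f"/sys/bus/pci/devices/{pci_addr}/numa_node").read_text())
    return max(node, 0)  # -1 means "unknown"; fall back to node 0


def cpus_of_node(node: int) -> set[int]:
    """Expand the kernel's cpulist (e.g. '0-15,32-47') into CPU ids."""
    cpus: set[int] = set()
    text = Path(f"/sys/devices/system/node/node{node}/cpulist").read_text().strip()
    for part in text.split(","):
        lo, _, hi = part.partition("-")
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus


node = gpu_numa_node(GPU_PCI_ADDR)
os.sched_setaffinity(0, cpus_of_node(node))  # keep this process on GPU-local cores
print(f"pinned to NUMA node {node}")
```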
Partnership
Faster paths, built together
Backend.AI partners with storage and accelerator vendors to optimize the data path for AI. Together, we co-engineer solutions that maximize throughput and minimize latency.
VAST Data
Since 2025, as a Technology Partner in the VAST COSMOS program, we collaborate on KV cache offloading, VAST AI OS integration, and other technologies to support customer inference workloads.
Dell Technologies
Since 2024, as a Dell Telecom AI self-certified partner, we integrate Dell products, including Dell PowerScale, with Backend.AI to deliver solutions to customers.
PureStorage
Since 2021, as a Technology Alliance Partner, we integrate FlashArray and FlashBlade to deliver high-performance storage for customers' AI training and inference workloads.
Other Supported Storage
Experience AI infrastructure with integrated storage
Request a demo to see how Backend.AI resolves I/O bottlenecks in your existing storage environment.