Case Study

How Upstage, Lablup, and their consortium are powering a national frontier AI model with Backend.AI

Keeping large-scale training resilient: An end-to-end approach

We spent our time on development, not infrastructure; Backend.AI handled the rest.

Make training efficient with Backend.AI anywhere, anytime, at any scale.


The Upstage consortium used Backend.AI as its infrastructure backbone to tackle the operational demands of a large-scale GPU cluster. Backend.AI orchestrated hundreds of GPUs across distributed nodes while preserving visibility and control over the entire training environment. It enabled the consortium to maintain continuous model-development cycles, significantly reducing the effort required for infrastructure maintenance.

Related Services

Backend.AI

Backend.AI is a vendor-agnostic platform for hosting accelerated workloads, built on our own orchestration engine and job scheduler, and runs on top of either cloud or on-premises (air-gapped) clusters.

Backend.AI FastTrack 3

An MLOps pipeline platform for LLM fine-tuning and serving that simplifies the entire lifecycle of large language model customization. Prepare data, train models, validate performance, and deploy as a REST API, all managed within a single pipeline.



Contact Us

Headquarters & HPC Lab

KR Office: 8F, 577, Seolleung-ro, Gangnam-gu, Seoul, Republic of Korea
US Office: 3003 N First St, Suite 221, San Jose, CA 95134

© Lablup Inc. All rights reserved.
