Releases
Uncharted AI: The Age of AI
By Lablup

This article is a summary of Jeongkyu Shin's keynote speech delivered on September 24, 2024, at lab | up > /conf/4.
On September 24, 2024, Lablup held its fourth conference, lab | up > /conf/4. The event featured a variety of external speakers as well as Lablup employees, and the keynote address was delivered by Lablup's CEO, Jeongkyu Shin.
This article covers the advancements of the AI era introduced in the keynote, the future trajectory of Lablup, updates to our current products, and several new product releases.
Uncharted Waters
The title of this keynote, "Uncharted AI - The Age of AI," draws inspiration from the classic game "Uncharted Waters," fondly remembered by many. However, Uncharted Waters is not merely a game; it represents a significant chapter in the real history of our global community.
During the Age of Discovery, beginning in the 15th century, numerous explorers ventured across the oceans in pursuit of spices such as pepper, which is commonplace today. I was not alive to witness that era firsthand, but I experienced it through the game. We may not consider a spice so valuable today, yet countless adventurers risked their lives in its pursuit.
Uncharted AI
Just as so many people once risked their lives crossing the ocean in search of spices, we are now in a new era of artificial intelligence (AI), pouring ourselves into advancing it together with a diverse set of partners. What makes this effort necessary is accessibility: if I could harvest pepper in my backyard, I would not have to cross the ocean. At the dawn of a new era, this difference in access creates a skills gap for some and a challenge for others. For Lablup, the skills gap introduced by emerging technologies is what catalyzed the dawn of this new era.
At Lablup, our motto has been clear since our founding in 2015: we have made it our core mission to Make AI Accessible. Our goal is to lower the barriers to AI by making the technology itself comprehensible and user-friendly, not merely available as an API.
As the field of AI advances, the challenge of scaling emerges. As AI technology expands, the data it processes grows, computation intensifies, and workloads move from single nodes to multi-node clusters, from tens of GPUs to hundreds of thousands. Simultaneously, AI is becoming more compact, operating on devices in the palm of your hand, such as Samsung's Galaxy AI and Apple Intelligence, and even on IoT sensors like thermometers.
We are thus witnessing efforts to run AI with greater power and more resources alongside a surge of endeavors to run it with less power and fewer resources. The traditional spectrum of AI is expanding both upward (larger) and downward (smaller), and the technology needed to push the scale in each direction is entirely distinct.
Back in 2015, we could build models with just a GeForce GTX 970. However, workloads have grown so rapidly that for the past four or five years their growth has outpaced the semiconductor performance improvements described by Moore's Law. Consequently, the focus has shifted from enhancing a single chip's performance to combining many chips and using them in parallel.
Make AI "Scalable"
Over the past four years, the distributed computing paradigm in AI has undergone significant evolution. We have moved beyond parallel processing to witness a variety of computations occurring concurrently. Diverse tasks like data processing, model training, and service provisioning are now integrated. Simultaneous demands for heterogeneous computational resources have emerged, encompassing databases, training, data processing, fleet management, RAS, and others that align more closely with the service stack.
Accelerators such as GPUs have become essential for modern computing. We no longer use CPUs and GPUs separately; instead, we must integrate them more closely. The driving force behind this integration is the universal need for GPUs, which leads to bottlenecks that are both physical—such as power, network, and data—and non-physical, including hardware instability, platform management, and software issues. At Lablup, our goal is to eliminate these obstacles to scaling.
This year at Lablup, we've set a new objective: Make AI Scalable. Our aim is to expand AI workloads across the full range, from accelerators to individual nodes to hyperscale environments. This goal builds upon our initial mission of “Making AI Accessible,” as we eliminate obstacles to scaling, incorporate elements that facilitate scaling, and persist in dismantling barriers to accessing AI technology.
Through the years, the company's dedication to making AI both accessible and scalable has resulted in numerous innovations. As a result, the number of enterprise GPUs running on Backend.AI has grown to nearly 13,000, with some sites managing more than 1,500 GPUs. Additionally, the number of teams (customers) using our products has increased to over 100. In sectors as varied as cloud services, AI accelerator testbeds, and autonomous driving, Backend.AI has established itself as a crucial piece of AI infrastructure.
This massive scale has significantly increased the technical challenges. We have had to develop technologies that span the entire spectrum, from single servers to clusters of thousands of nodes: to "take away everything that blocks scaling, and add everything that enables it." We would like to use this opportunity to share our recent innovations, the developments in progress, and the future we are striving to create.
Open Source
Lablup is a company deeply involved in the open source ecosystem. We develop and release a range of projects such as Backend.AI, Callosum, aiodocker, aiomonitor (aiotools), Raftify, and many more. Open source is in our DNA. Our experience with the open source we create, publish, and contribute to across various on-premises environments is a significant competitive edge. Backend.AI's support for on-premises environments, its compatibility with cloud environments, and more are all capabilities we have gained from that open source experience.
Backend.AI CLI Installer: Easy installation experience with TUI
The Backend.AI CLI Installer is an open-source initiative designed to enhance the accessibility of Backend.AI. It features a text-based user interface (TUI) for simplified installation, automates the package-based installation process, and includes meta settings for streamlined automatic setup.
bndev: Easily build your own AI infrastructure
For enthusiasts who enjoy tinkering and hacking beyond mere package-based installations, we have introduced a development tool named bndev. This tool simplifies the process of constructing and maintaining intricate Backend.AI development environments. The concept behind bndev is to empower everyone to own and maintain their personal AI infrastructure.
Backend.AI Core
Backend.AI conducts major version releases biannually, in March and September. Version 24.03 was released in March 2024, and the release of version 24.09 is imminent. It brings significant updates to Backend.AI Core that will shape future releases. Let us walk through these changes.
Key Updates
- Support for NVIDIA NGC (NVIDIA GPU Cloud) NIM (NeMo Inference Microservice): Key NGC features, like license-based container image loading, are compatible with Backend.AI.
- Expanded support for new accelerators including Intel Gaudi2, Rebellions ATOM+, and Furiosa RNGD: Backend.AI allows you to flexibly choose the best AI accelerator to match the characteristics of your workload.
- General availability of Backend.AI model store, browser, and serving: A comprehensive solution that integrates the essential features of MLOps, simplifying the process for customers to find AI models and deploy them seamlessly into their workflows.
- Enhanced Task Scheduling: The new Priority Scheduler enables the independent prioritization of tasks, ensuring that tasks of high importance are addressed swiftly and dependably.
- Agent Selector concept: The Agent Selector determines which nodes the scheduler actually runs the selected tasks on. This part is now easily customizable as a standalone plugin, so you can distribute jobs based on criteria such as each node's power usage or temperature; see the sketch after this list. We expect this to be a great help in optimizing infrastructure operations by balancing load across nodes, increasing power efficiency, and more.
- Our own Docker network plugin: Expanded support for GPUDirect Storage for large-scale data processing, minimizing bottlenecks in moving data within a single node.
- Cilium-based networking stack for inter-container communication: Enhances large-scale distributed training, delivering a 30% increase in network performance compared to the previous setup.
- OpenID Connect (OIDC)-based federated authentication scheme: Access various infrastructure services, such as Backend.AI and others, using a single account to significantly streamline account management.
- Expanded support for enterprise environments: Works with a variety of private container registries, including GitLab, GitHub Enterprise, AWS ECR, and more, and makes it easy to configure hybrid setups that span both on-premises legacy resources and the cloud.
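To make the Agent Selector idea concrete, here is a minimal sketch of the kind of policy a custom plugin could implement. All names in it (AgentInfo, CoolestNodeSelector, select_agent) are illustrative assumptions for this post, not Backend.AI's actual plugin interface:

```python
# Hypothetical agent-selection policy: pick the coolest node that still
# fits the requested resource slots. Names are illustrative only.
from dataclasses import dataclass
from typing import Sequence


@dataclass
class AgentInfo:
    id: str
    available_slots: dict[str, float]  # e.g. {"cuda.device": 4.0}
    power_watts: float                 # operator-supplied telemetry
    temperature_c: float


class CoolestNodeSelector:
    """Prefer the agent with the lowest temperature among those that fit."""

    def select_agent(
        self,
        agents: Sequence[AgentInfo],
        requested_slots: dict[str, float],
    ) -> str:
        def fits(agent: AgentInfo) -> bool:
            return all(
                agent.available_slots.get(slot, 0.0) >= amount
                for slot, amount in requested_slots.items()
            )

        candidates = [a for a in agents if fits(a)]
        if not candidates:
            raise RuntimeError("no agent satisfies the requested slots")
        return min(candidates, key=lambda a: a.temperature_c).id
```

Swapping the `min` key from `temperature_c` to `power_watts` would give a power-aware variant; the point is that the scheduling decision becomes a small, testable unit of code.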
Leveraging these updates, Backend.AI is broadening its scope as a cutting-edge AI infrastructure, serving both high-performance computing (HPC) and enterprise needs. Further enhancements will accompany the launch of Backend.AI 24.09.
Next-gen Sokovan
We continue to develop the next-generation Sokovan, scheduled for release early next year. Here is a brief overview of what to expect.
- Dual-engine architecture supporting Kubernetes: In addition to the current proprietary cluster management system, it will function as a native Kubernetes service. This includes managing accelerators through the Kubernetes Operator Proxy. We will seamlessly integrate NVIDIA and AMD device plugins, Intel GPU plugins, among others, to uphold industry standards.
- Database load balancing with Raftify in high-availability (HA) configurations: Minimizes bottlenecks in metadata services and ensures reliable operation in clusters of tens of thousands of nodes.
- Enhanced automatic scaling for serving large language models: API metrics such as request patterns and latency, together with resource usage, are analyzed for optimal scaling; see the sketch after this list.
- Strengthened project units: Datasets, models, pipelines, and more can be managed collectively. The objective is to provide fine-grained role-based access control (RBAC) to accommodate diverse collaboration scenarios.
- Enhanced management capabilities for enterprise customers: You'll have integrated logging and monitoring, as well as audit log tracking for regulatory compliance.
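As a rough illustration of the metric-driven scaling described above, here is a minimal sketch of a latency-based scaling rule. The thresholds and the `desired_replicas` helper are assumptions chosen for clarity and do not reflect the actual Sokovan implementation:

```python
# Sketch of a latency-driven autoscaling rule for model serving.
# Thresholds and names are illustrative assumptions.
from collections import deque
from statistics import mean


class LatencyWindow:
    """Sliding window of recent per-request latencies, in seconds."""

    def __init__(self, size: int = 100) -> None:
        self.samples: deque[float] = deque(maxlen=size)

    def record(self, latency_s: float) -> None:
        self.samples.append(latency_s)

    def average(self) -> float:
        return mean(self.samples) if self.samples else 0.0


def desired_replicas(
    current: int,
    window: LatencyWindow,
    target_latency_s: float = 0.5,
    max_replicas: int = 16,
) -> int:
    """Scale out when latency runs hot, scale in when comfortably low."""
    observed = window.average()
    if observed > target_latency_s * 1.2:
        return min(current + 1, max_replicas)
    if observed < target_latency_s * 0.5 and current > 1:
        return current - 1
    return current
```

A production rule would also weigh request patterns and GPU utilization, but the shape stays the same: observed metrics in, replica count out.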
All of these changes are being made with one goal in mind: to accelerate our customers' AI projects. With new AI accelerators and connections to other Kubernetes-based solutions, our team looks forward to further maturing Backend.AI's Core and MLOps features. Stay tuned for the next chapter of Sokovan's journey as it takes on a broader role.
Backend.AI WebUI
In the near future, the Backend.AI WebUI will be getting a new look. From a user's perspective, the user interface is probably the most important factor that determines the first impression of Backend.AI. We have always recognized the importance of the WebUI and have been innovating on it. We launched ML Desktop last year and GenAI Desktop earlier this year to test different user experiences, and we recently brought a user-friendly UI to our products with Neo Session Launcher.
Introducing WebUI Neo, the third major evolution of the WebUI. Designed in close collaboration with Vice Versa Design Studio to deliver a rich user experience, the new design language keeps the user in mind from start to finish. To coincide with the relaunch of Backend.AI, we have redesigned the entire UI/UX for a sleeker, more futuristic look and feel.
WebUI Neo was designed around the concepts of “reducing cognitive load” and “maintaining consistency in visual metaphors.” To reduce cognitive load, we minimize the amount of complex information users have to type or search for. For example, when setting up large-scale experiments, we limit the information shown at each step by exposing it sequentially rather than presenting dozens of options at once.
To maintain consistency in visual metaphors, we have organized UI/UX elements, from screen composition to icons to colors, into similar design patterns for similar concepts, such as experiments, models, and datasets. This lets users reuse what they have already learned instead of relearning how to use similar features. WebUI Neo will be applied across both Backend.AI Core and Enterprise.
In recognition of this innovation, WebUI Neo received the Excellence Award, given to only four consortia, at the Industrial Design Development Support Project for Small and Medium-sized Enterprises run by the Seoul Design Foundation and the Seoul Metropolitan Government.
WebUI Neo will not ship with the Backend.AI 24.09 update right away; it is still being developed and tested with the goal of a general release later this year. We are also finalizing the migration from Web Components, the codebase used since the first version of the WebUI, to React. WebUI Neo is more than a repackaging of past features: it will continue to gain new functionality tightly aligned with machine learning workflows and will be the foundation for the high level of automation and ease of use that Backend.AI strives for. This is the future we envision with WebUI Neo: a world where everyone can easily understand and benefit from AI infrastructure beyond its complexity.
Lablup Enterprise
The core of Lablup Enterprise, centered on Backend.AI Enterprise, can be described as ___ made easy. Lablup Enterprise aims to make deep-level AI technology innovation easy with end-to-end technology from device driver level to AIOps. We have three ___ made easy concepts: “Scaling made easy”, “Acceleration made easy”, and “Inference made easy”.
Scaling made easy: FastTrack 2, Finetun.ing, Cluster Designer
FastTrack 2
FastTrack 2, released with 24.09, is an automation solution for AI projects at scale. It provides pipeline management based on project groups, making it easy to define and execute complex workflows. It offers a wide range of reusable templates to minimize repetitive tasks. In addition, FastTrack 2 enables you to better leverage your resources by connecting with external partners. You can add model compression nodes and model serving services from partners to your pipeline.
Finetun.ing
Finetun.ing is a cloud-based fine-tuning service that works in concert with FastTrack. It stands out from traditional fine-tuning services by eliminating the need for users to prepare their own data. Typically, fine-tuning involves uploading data to adjust a model, but Finetun.ing simplifies this process by letting users interactively enter prompts. The service then generates synthetic data from these interactions to fine-tune the model. Fine-tuned models are automatically evaluated and made available for download, complete with a model card. Finetun.ing runs on NVIDIA NemoTron and supports Llama 3.1 and Gemma 2. Ongoing tests aim to enable fine-tuning for an array of new models, with plans to expand the selection in the future.
Finetun.ing is currently gearing up for its final unveiling, and we are opening a waitlist for the first time at this event. You can sign up at https://finetun.ing.
Cluster Designer
Backend.AI Cluster Designer is a GUI-based cluster design tool. It automatically calculates the expected effective performance of a cluster of your desired size, along with the required hardware configuration and estimated cost. It is ideal for anyone who wants to validate the optimal architecture before actually building it.
Backend.AI Helmsman
Backend.AI Helmsman is an interactive cluster management interface. It makes complex cluster operations possible just by chatting in a terminal. Under the hood, it utilizes a Gemma-based fine-tuning model to accurately understand user intent. It combines packages such as TorchTune, LangGraph, and LangChain to build interactive fine-tuning pipelines for on-premises environments. UI packages and models via the Helmsman CLI and WebUI will be released after the Backend.AI 24.09 release, by the end of the year.
Acceleration made easy
The second is “Acceleration made easy”. We support a wider variety of accelerators for AI workloads than any other AI infrastructure platform in existence.
CPU architecture coverage includes x86 as well as heterogeneous architectures such as Arm and RISC-V. We work closely with the latest accelerators, including NVIDIA's Grace Hopper, AMD's MI series, Intel Gaudi, Graphcore Bow, GroqCard, Rebellions ATOM+, and Furiosa RNGD, to ensure you get the same user experience and peak performance on Backend.AI.
Inference made easy
Finally, “Inference made easy”.
We've simplified the sharing and distribution of pre-trained models with a unified model store. Inspired by package managers such as Chocolatey on Windows and Homebrew on macOS, Lablup ION model recipes let you install models and services contributed by the community via GitHub with a single command.
PALI, PALI PALI (PALI2), PALANG
There is also something new to introduce on the model service operations side: PALI, PALI2, and PALANG.
Performant AI Launcher for Inference (PALI) is a high-performance inference runtime that combines the Backend.AI model player with a curated model catalog and predefined models. It features flexible scalability and high performance. Anyone can easily install and run NVIDIA NIM, Hugging Face models, and Lablup ION recipes right out of the box to serve models.
PALI2 is a dedicated hardware appliance for PALI. You can easily scale by connecting multiple appliances. Its architecture is optimized for AI workloads, delivering high performance and low latency. Depending on your installation, we can provide and update models for different architectures and chip environments.
We are also preparing a PALI2 appliance that incorporates the NVIDIA reference platform GH200, and KYOCERA Mirai Envision Co., Ltd. in Japan will launch Instant.AI as the first reference platform for PALI2, which will be available for purchase on October 1.
Reference platforms for the Korean market will be available to reserve in October and for sale in Q4. PALI2 appliances targeting the U.S. and European markets will be available as early as Q4 of this year.
PALANG is a language model inference platform that includes PALI, FastTrack, Talkativot, and Helmsman. It provides ready-to-use inference and fine-tuning settings, greatly simplifying the deployment and operation of large language models. Talkativot makes it easy to create custom chatbot interfaces and provides software components for model comparison and interface building during development. Use PALI or PALI2 if you only need inference, or PALANG if you need both language model fine-tuning and inference.
G
Finally, One More Thing... We would like to give you a sneak peek at a new project we are currently working on: G, a language model based on Gemma 2. It features easy customization with Finetun.ing and will be used for a variety of purposes, including as a backend model for Helmsman and as an enterprise agent. Details will be revealed soon.
From Uncharted AI to Industrial Revolution
During the Age of Discovery, countless adventurers sailed the globe in search of pepper. Their adventures led to the discovery of many parts of the world that remained uncharted, and the world became more connected through the routes they opened. Shipbuilding and navigation were improved, new trade routes were opened, and innovations were made in medicine, military technology, and more. But that's not all: the Age of Discovery spawned another important event: the Industrial Revolution.
We are currently living in what is known as the Age of Great AI. It is akin to the dawn of the Age of Discovery, when the doors to new possibilities were just opening. One person is returning with pepper, while another is building a larger vessel to prove that the Earth is round. We are about to witness the equivalent of the Industrial Revolution that the Age of Discovery brought about.
Engine of AI Infrastructure
The Industrial Revolution began with James Watt's steam engine. The invention of the steam engine ushered in an era of mass production and mechanization. Now we're in the midst of another revolution. In the face of the tidal wave that is the Age of Great AI, Lablup is building a new engine.
Lablup is the engine of AI infrastructure. Our technology fuels innovation across industries. While the steam engine harnessed the power of coal, our engine is fueled by data. Just as a car engine converts the energy of gasoline into motion, Lablup provides an efficient and powerful engine that converts the fuel of data into AI, and the value it brings.
Just as the internal combustion engine gave birth to the automotive industry, AI engines will reshape the data-driven IT industry. Lablup is preparing for the time when everyone and every organization will be able to derive insights and value from their own data, rather than just storing and managing it. Lablup's AI engine is unrivaled in scale and speed. It has the scale to run dozens to tens of thousands of GPUs simultaneously, processing petabytes of data in real time, for the IoT and beyond. Just as the performance of an engine determines the speed of a car, our infrastructure will determine your success in the AI ecosystem.
So far, you have seen the engines we have built. With these engines, we want to drive the AI revolution beyond the Age of Great AI. We will keep designing and improving the engine so that each and every one of you can be in the driver's seat. We invite you to step on the gas pedal of the AI era with Lablup.
27 September 2024
24.03: Release Update
By Lablup

24.03, the first release of Backend.AI in 2024, has been released. This update brings significant improvements to the UI and user experience, making Backend.AI even easier to install and operate. Here are the updates since 23.09:
Backend.AI Core & WebUI
- We've made installing Backend.AI even easier with the addition of a TUI-based installer, which automates the download process and makes it easy for users to install and get started with Backend.AI.
- New: Added trash functionality to vfolders. Files in deleted vfolders are now moved to the trash instead of being removed immediately, and are then purged through a complete deletion process.
- New: Added an argument value to indicate the state of a vfolder.
- New: Added Backend.AI Model Store, where you can now store, search, and easily utilize various machine learning and deep learning models.
- Added metadata for indexing to vfolders to utilize indexes instead of full directory scans for queries.
- Improved system resource utilization by introducing a limit policy for session creation based on the number of pending sessions and requested resource slots. This new resource policy option helps filter and cap resource presets and the custom resource sliders in the Session Launcher; a policy sketch appears after this list.
- We've added dark themes to the WebUI, so users can now choose from a variety of options to suit their personal preferences.
- Improved screen rendering for misaligned line breaks, stray whitespace, and overflowing announcements in the WebUI, along with stability improvements such as session name validation.
- The Session Launcher for Model Serving also constrains UI input so that users can request no more than the allocated resources.
- Added an `allowAppDownloadPanel` option to the config.toml file to hide the WebUI app download panel, supporting different UI user options.
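As a rough sketch of how such a pending-session limit policy could behave, consider the following admission check. The field names and caps are hypothetical, chosen only to illustrate the idea, and are not the exact resource-policy schema:

```python
# Hypothetical pending-session admission check; names and fields are
# illustrative only, not Backend.AI's actual resource-policy schema.
from dataclasses import dataclass


@dataclass
class PendingLimitPolicy:
    max_pending_sessions: int            # cap on queued sessions per user
    max_pending_slots: dict[str, float]  # e.g. {"cuda.device": 8.0}


def may_enqueue(
    policy: PendingLimitPolicy,
    pending_count: int,
    pending_slots: dict[str, float],
    requested_slots: dict[str, float],
) -> bool:
    """Reject a new session request once either cap would be exceeded."""
    if pending_count + 1 > policy.max_pending_sessions:
        return False
    for slot, cap in policy.max_pending_slots.items():
        used = pending_slots.get(slot, 0.0)
        asked = requested_slots.get(slot, 0.0)
        if used + asked > cap:
            return False
    return True
```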
Backend.AI is constantly evolving to provide a more powerful and user-friendly experience while supporting a variety of environments in the ever-changing AI ecosystem. We look forward to seeing what's next! Make your AI accessible with Backend.AI!
29 March 2024
23.09: September 2023 Update
By Lablup
In the second half of 2023, we shipped 23.09, a major release of Backend.AI. It significantly enhances the development, fine-tuning, and operational automation of generative AI: AI models are automatically scaled and load-balanced based on workload, support for various GPUs/NPUs has been expanded, and stability has been improved whether you manage a single node or 100 to 2,000+ nodes. The team is working hard to squeeze every last bit out of it. Here are the main improvements since the last [23.03 July update](/posts/2023/07/31/Backend.AI-23.03-update).
Backend.AI Core & UI
- The Backend.AI Model Service feature has been officially released. You can now use Backend.AI to more efficiently prepare environments for inference services as well as for training large models such as LLMs. For more information, see the blog post Backend.AI Model Service sneak peek.
- Added the ability to sign in to Backend.AI using OpenID single sign-on (SSO).
- If your kernel image supports it, you can enable the sudo command without a password in your compute session.
- Support for Redis Sentinel without HAProxy. To test this, we added the `--configure-ha` setting to the `install-dev.sh` file.
- Added the ability to use the RPC channel between Backend.AI Manager and Agent for authenticated and encrypted communication.
- Improved the CLI logging feature of Backend.AI Manager.
- Fixed an issue where Manager could not make an RPC connection when Backend.AI Agent was placed under a NAT environment.
- The Raft algorithm library, riteraft-py, will be renamed and developed as raftify.
- Support for the following new storage backends:
- VAST Data
- KT Cloud NAS (Enterprise only)
Backend.AI FastTrack
- Improved UI for supporting various heterogeneous accelerators.
- Deleting a VFolder now uses an independent unique ID value instead of the storage name.
- Upgraded Django version to 4.2.5 and Node.js version to 20.
- Added pipeline template feature to create pipelines in a preset form.
- If a folder dedicated to a pipeline is deleted, it will be marked as disabled on the FastTrack UI.
- Improved the process of deleting pipelines.
- Added a per-task (session) accessible BACKENDAI_PIPELINE_TASK_ID environment variable.
- Actual execution time per task (session) is now displayed.
Contribution Academy
In this period especially, the following code contributions were made by junior developer mentees through the 2023 Open Source Contribution Academy organized by NIPA.
- Created a button to copy an SSH/SFTP connection example to the clipboard.
- Refactored several Lit elements of the existing WebUI to React.
- Wrote various test code.
- Found and fixed environment variable and message errors.
Backend.AI is constantly evolving to provide a more powerful and user-friendly experience while supporting various environments in the AI ecosystem. Stay tuned for more updates!
Make your AI accessible with Backend.AI!

This post is automatically translated from Korean
26 September 2023
23.03: July 2023 Update
By Lablup
A wrap-up of the ongoing updates to Backend.AI 23.03 and 22.09. The development team is working hard to squeeze every last bit out.
Here are the most important changes in this update:
- Enhanced storage manageability: Added per-user and per-project storage capacity management (quotas) with VFolder v3 architecture.
- Expanded NVIDIA compatibility: Support for CUDA v12 and NVIDIA H100 series.
- Extended hardware compatibility: Support for WARBOY accelerators from FuriosaAI.
Backend.AI Core & UI
- Supports CUDA v12 and NVIDIA H100 series.
- Supports the WARBOY accelerator, the first NPU from FuriosaAI.
- Added per-user and per-project storage capacity management (quotas) by applying the VFolder v3 architecture.
- However, it is limited to storage that supports Directory Quota.
- Fixed an error that caused multi-node cluster session creation to fail.
- Fixed an error where a compute session in the `PULLING` state was incorrectly labeled as `PREPARING`.
- Fixed an error in which the `CLONING` state was incorrectly displayed when cloning a data folder whose name exists on multiple storage devices.
- Improved the web terminal of a compute session to use zsh as the default shell if the zsh package is installed in the kernel image.
- Added the ability to know the health status of the (managed) storage proxy and event bus.
Backend.AI FastTrack
- Added the ability to set `multi-node` cluster mode per task.
- Fixed an error where environment variables set in `.env` were not applied to the frontend.
- Fixed an error where the UI was incorrectly recognized as out-of-date when accessed from a mobile browser.
- Added a field to show the cause message when a task-specific error occurs.
- Fixed other editor-related issues.
Backend.AI is constantly evolving to provide a more powerful and user-friendly experience while supporting various environments in the ever-changing AI ecosystem. Stay tuned for more updates!
Make your AI accessible with Backend.AI!

31 July 2023
23.03: May 2023 Update
By Lablup

A recap of the ongoing updates to Backend.AI 23.03 and 22.09. The development team is working hard to squeeze every last bit out.
Here are the most important changes in this update:
- Expanded hardware compatibility: Added support for idle checking on Rebellions' ATOM accelerators and for Dell EMC storage backends.
- High-speed upload enhancements: Introduced SFTP functionality to support high-speed uploads to storage.
- Development environment enhancements: Sessions can now be accessed in remote SSH mode from local Visual Studio Code.
- Increased manageability: Improved the user interface for administrators to make it easier to set up AI accelerators and manage resource groups.
Backend.AI Core & UI
- Added support for idle state checking of ATOM accelerators.
- Introduced SFTP functionality to support high-speed uploads directly to storage.
- Added ability to force periodic password updates based on administrator settings.
- Added an upload-only session (SYSTEM) tab.
- Added Inference type to the allowed session types.
- Added the ability to connect to a session in remote SSH mode from local Visual Studio Code.
- Added support for uploading folders from Folder Explorer.
- Improved the display of the amount of shared memory allocated when creating a session.
- Added support for Dell EMC storage backend.
- Improved the accuracy of container memory usage measurement.
- Improved the ability to run multiple agents concurrently on a single compute node.
- Added project/resource group name filter for administrators.
- Added user interface for administrators to set various AI accelerators, including GPUs, in resource presets/policies.
- Added a user interface for administrators to display the allocation and current usage of various accelerators, including GPUs.
- Added a user interface for administrators to set the visibility of resource groups.
- Provided a user interface for administrators to view the idle-checks value per session.
- Added recursion option when uploading vfolders in the CLI, and improved relative path handling.
- Added a recursive option in the CLI to terminate a session together with all sessions that depend on it.
- Added a new mock-accelerator plugin for developers, replacing the old cuda-mock plugin.
- Added status and statistics checking API for internal monitoring of the storage proxy for developers.
Backend.AI FastTrack
- Improved searching for vfolders by name when adding pipeline modules.
- Added an indication to easily recognize success/failure after pipeline execution.
Backend.AI Forklift
- Bug fixes and stability improvements.
- Support for deleting build job history.
- Supports pagination of the build task list.
Backend.AI is constantly evolving to support a variety of environments in the ever-changing AI ecosystem, while providing a more robust and user-friendly experience. Stay tuned to see what's next!
Make your AI accessible with Backend.AI!
This post is automatically translated from Korean
31 May 2023
23.03: March 2023 Update
By Lablup
We're excited to announce version 23.03.0, the first major release of Backend.AI for 2023. Some features will continue to be rolled out in subsequent updates.
Specifically in this update:
- Support for the 'inference' service with a new computation session type.
- Support for 'model' management with a new storage folder type.
- Support for managing storage capacity on a per-user and per-project basis.
- Significant improvements to FastTrack's pipeline versioning and UI.
Backend.AI Core & UI (23.03)
- Added model management and inference session management capabilities.
- More advanced inference endpoint management and network routing layers will be added in subsequent updates.
- The codebase has been updated to be based on Python 3.11.
- Introduced React components to the frontend and leveraged Relay to introduce a faster and more responsive UI.
- Full support for cgroup v2 as an installation environment, starting with Ubuntu 22.04.
- Updated the vfolder structure to v3 for storage capacity management on a per-user and per-project basis.
- Kernels and sessions are now stored in separate database tables, and the state-transition tracking process has been improved to impose less database load overall.
- Improved the way the agent displays the progress of the image download process when running a session.
- Improved the display of GPU usage per container in CUDA 11.7 and later environments.
- Scheduling priority can be specified by user and project within each resource group.
- Supports two-factor authentication (2FA) login based on time-based one-time passwords (TOTP) to protect user accounts.
- Support for users to register their own SSH keypair for session access.
- Supports user interfaces for Graphcore IPUs and Rebellions ATOM devices.
Backend.AI Forklift (23.03)
- Added Dockerfile templates and advanced editing capabilities.
- Support for creating container images for inference.
- Extended image management capabilities to work with the Harbor registry.
Backend.AI FastTrack (23.03)
- Storage folder contents can be viewed directly from the FastTrack UI.
- Improved session state synchronization with Core to be event-based.
- You can set the maximum number of iterations for a pipeline schedule.
- If a task fails to execute, the pipeline job is automatically cancelled instead of waiting.
- Added pipeline versioning. You can track a pipeline's change history and recall its contents at a specific point in time to continue working from there.
- You can modify pipelines in YAML format directly through the code editor.
Development and Research Framework Support
- Supports TensorFlow 2.12, PyTorch 1.13
- Support for NGC (NVIDIA GPU Cloud) TensorFlow 22.12 (tf2), NGC PyTorch 22.12, NGC Triton 22.08
- Added python-ff:23.01 image, which provides the same libraries and packages as Google Colab
In addition to what we've listed above, we've included many bug fixes and internal improvements.
Stay tuned for more to come!

This post is automatically translated from Korean
31 March 2023