Backend.AI + Intel Arc Pro B70: Memory for Agentic AI
A benchmark look at the Arc Pro B70's memory advantage in agentic workloads
Backend.AI extends its Intel lineup from Gaudi accelerators to Arc graphics, starting with the Arc Pro B70. See how the Arc Pro B70's 32GB of memory sustains throughput in agentic workloads where context length and concurrency keep growing, and how Backend.AI's single control plane and model serving stack scale the same environment without software changes.
Sustained throughput where agentic AI actually runs
The Arc Pro B70 pairs 32GB of GDDR6 with 608GB/s of bandwidth at a $1,099 MSRP. In vLLM benchmarks against the NVIDIA RTX PRO 4000 Blackwell, the B70 held about 2.1x more KV cache on average and kept scaling under load, reaching 2.24x higher throughput with Qwen3 8B at 8K context and 16 concurrent requests, and up to 4.4x at 32K context. From a desktop B70 to shared inference clusters and datacenter-scale Gaudi deployments, the same platform runs on Backend.AI's single control plane and model serving stack.
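To see why memory capacity matters at these context lengths, here is a rough back-of-the-envelope sketch of KV cache size. It is not taken from the benchmark itself. The model dimensions (36 layers, 8 KV heads, head dimension 128) are assumed from the publicly released Qwen3 8B configuration, the cache is assumed to be stored in 16-bit precision, and "8K" is read as 8,192 tokens. The function and constant names are hypothetical.

```python
GIB = 1024 ** 3


def kv_cache_bytes_per_token(num_layers, num_kv_heads, head_dim, bytes_per_element=2):
    """Bytes of KV cache per token: one K and one V vector per layer per KV head."""
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_element


def kv_cache_bytes(context_tokens, concurrent_requests, per_token_bytes):
    """Upper bound on KV cache if every request fills its whole context window."""
    return context_tokens * concurrent_requests * per_token_bytes


# Assumption: dimensions taken from the published Qwen3 8B config, not from the benchmark.
ASSUMED_QWEN3_8B = {"num_layers": 36, "num_kv_heads": 8, "head_dim": 128}

per_token = kv_cache_bytes_per_token(**ASSUMED_QWEN3_8B)
total_8k_x16 = kv_cache_bytes(8 * 1024, 16, per_token)

print(f"{per_token / 1024:.0f} KiB of KV cache per token")          # 144 KiB
print(f"{total_8k_x16 / GIB:.1f} GiB for 16 requests at 8K tokens")  # 18.0 GiB
```

Under these assumptions the worst case for 16 full 8K-token requests is about 18 GiB of KV cache, before counting model weights. Real requests rarely fill the whole window, and vLLM allocates cache blocks from whatever memory remains after the weights are loaded, so treat this as an upper bound and not as the footprint measured in the benchmark.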
Related Services
Backend.AI is a vendor-agnostic platform for hosting accelerated workloads, built on our in-house orchestrator and job scheduler and running on either cloud or on-premises (air-gapped) clusters.
Explore service →
Run bigger models, work with richer data, and keep workstation performance steady with the Intel® Arc™ Pro B-Series. Combining large dedicated memory, robust graphics power, and the latest XMX AI engines, the Intel® Arc™ Pro B-Series accelerates rendering, video processing, and AI workloads.
Learn more →