Engineering
May 7, 2021
ARM Next-Generation Instruction Set

Mario (Manseok) Cho
Solution Architect / Consultant
ARM v9 Architecture: Next-Generation Instruction Set
ARM unveiled the Arm®v9 architecture at an online conference (ARM Vision Day) on March 30, 2021, its first major architectural update in 10 years. This article reviews the functional evolution of the Arm®v8 series, then examines how ARM cores are expanding into data centers and which AI/ML features Arm®v9 adds.
ARM's Next-Generation Instruction Set Arm®v9
The Information Technology (IT) industry has evolved alongside high-performance CPUs (Central Processing Units). The current market is essentially divided between Intel/AMD x86 (x64) cores and ARM cores. x86/x64 has evolved toward server environments through high performance and multi-core support. ARM announced Arm®v8 10 years ago and has since focused on products that combine low power consumption with high performance in mobile and embedded environments such as smartphones. Most mobile processors, including Apple's A-series, Qualcomm Snapdragon, Samsung Exynos, Huawei Kirin, and MediaTek Helio, are based on the ARM architecture. Approximately 98% of smartphones worldwide currently use it, so the changes in the Arm®v9 instruction set architecture are expected to have a significant impact on the future of the IT industry.
The Beginning of ARM Architecture: 'Let's Build Low-Power, High-Performance Computers!'
In 1985, the British computer company Acorn Computers set out to apply the latest CPU core technology of the 1980s (RISC and 32-bit cores) while keeping power consumption low. It announced the prototype Arm®v1, built on VLSI Technology's 3-micron (3 µm) process. Acorn improved the prototype, announced Arm®v2 in 1987, and successfully commercialized PCs based on this architecture. In the late 1980s, Apple selected ARM as the core for the Newton MessagePad and invested in ARM. ARM was subsequently used mainly in small terminals, home appliances, and factory automation equipment that required low power consumption, and the architecture evolved through ARM v6. Apple's iPhone used the ARM v6 architecture in its AP (Application Processor), and the ARM v7 architecture was widely adopted in APs for Google's Android, expanding ARM's market share in mobile devices.
Arm®v8 Announcement
In 2011, the CPU ISA (Instruction Set Architecture) moved from the 32-bit Arm®v7 to the 64-bit Arm®v8. ARM's 64-bit architecture is called AArch64. It adds a 64-bit instruction set while keeping backward compatibility with 32-bit Arm®v7 code, and it has been developed in three directions: system extensions required in the enterprise IDC server market, such as virtualization and RAS (Reliability, Availability and Serviceability); computational extensions for deep learning; and security extensions. Representative products using Arm®v8 include Apple's iPhone 5S through 12 and Samsung's Galaxy S6 through S21 (the Galaxy S5 still shipped with 32-bit Arm®v7 processors).
Arm®v9
ARM held an online briefing called ARM Vision Day on March 30, 2021 and announced "Arm®v9", its first new architecture in 10 years.
Rumors about Arm®v9's features had circulated before the official announcement, mostly through technical disclosures for other products. They ranged from the anecdote that the "A64FX" in Japan's Fugaku supercomputer uses Arm®v8.2-A but could have used Arm®v9 had the development schedule been more flexible, to a Neoverse session at ARM TechCon 2018 mentioning that the Poseidon Platform generation would be based on Arm®v9.

At that time, Arm®v9 was likely scheduled to be announced in 2019, with IP licensing beginning in 2020-2021. For unknown reasons the schedule slipped, and Arm®v8.4-A through Arm®v8.6-A were released in the meantime. As this is the first new architecture announcement in 10 years, we hope it will keep advancing both technologically and in market impact.

Arm®v9 Architecture
The Arm®v9 generation targets significant performance improvements. Arm®v9 supports systems ranging from small microcontrollers to large-scale servers while maintaining ARM core backward compatibility (up to Arm®v8). It has enhanced computational capabilities for Machine Learning and DSP (Digital Signal Processing) required by the IT market, and introduced new hardware-based security features.
Building on SVE (Scalable Vector Extension), which was added in Arm®v8.2-A to support the vector operations frequently used in machine learning, Arm®v9 introduces the enhanced SVE2 for HPC (High Performance Computing). It also supports the FP16 and BF16 data types for 5G networks and VR/AR metaverse applications.
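To make the BF16 data type concrete, the following minimal Python sketch converts between FP32 and BF16 bit patterns. BF16 keeps the sign bit and the 8 exponent bits of an IEEE 754 single-precision value and shortens the mantissa to 7 bits, so a BF16 value is essentially the upper 16 bits of an FP32 value. The function names are hypothetical and chosen for illustration, and round-to-nearest-even is one common rounding choice assumed here; this is not how any particular ARM core implements the conversion.

```python
import struct


def float32_to_bf16_bits(x: float) -> int:
    """Return the 16-bit BF16 pattern for x (illustrative helper, not an ARM API)."""
    # Reinterpret the value as its 32-bit IEEE 754 single-precision pattern.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # NaN: force a quiet NaN so that dropping mantissa bits cannot yield infinity.
    if (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF):
        return ((bits >> 16) | 0x0040) & 0xFFFF
    # Round to nearest, ties to even, then keep the upper 16 bits.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def bf16_bits_to_float32(b: int) -> float:
    """Expand a 16-bit BF16 pattern back to a Python float via FP32."""
    return struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))[0]
```

For example, 1.0 maps to the pattern 0x3F80, and 3.14159 round-trips to 3.140625. The dynamic range matches FP32 while precision drops to roughly two to three decimal digits, which is the trade-off that makes BF16 attractive for machine learning workloads.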

It also supports data formats smaller than 8 bits (INT 1/2/4) widely used in machine learning and enables flexible data access between heterogeneous computing cores (Gather-Scatter DMA).
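The benefit of sub-8-bit formats is easiest to see as storage density. The sketch below packs signed INT4 values two per byte and unpacks them again. The helper names and the nibble order (low nibble first) are assumptions made for illustration; they do not describe a layout defined by Arm®v9.

```python
def pack_int4(values):
    """Pack signed 4-bit integers (-8..7) two per byte, low nibble first."""
    if any(not -8 <= v <= 7 for v in values):
        raise ValueError("INT4 values must be in the range -8..7")
    # Pad with a zero so that every byte receives two nibbles.
    padded = list(values) + [0] * (len(values) % 2)
    out = bytearray()
    for lo, hi in zip(padded[0::2], padded[1::2]):
        # Masking with 0xF keeps the two's-complement low 4 bits.
        out.append((lo & 0xF) | ((hi & 0xF) << 4))
    return bytes(out)


def unpack_int4(packed, count):
    """Recover `count` signed 4-bit integers from bytes made by pack_int4."""
    values = []
    for byte in packed:
        for nibble in (byte & 0xF, byte >> 4):
            # Nibbles 8..15 encode the negative values -8..-1.
            values.append(nibble - 16 if nibble >= 8 else nibble)
    return values[:count]
```

Packing the five values [1, -2, 7, -8, 3] produces three bytes instead of the five that INT8 storage would need, which halves memory traffic for quantized model weights.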

ARM also announced CCA (Arm Confidential Compute Architecture) in Arm®v9, which enables computation in hardware-based secure environments. To help understand CCA functionality, we've organized the features added by each Armv8-A generation.

Armv8.3-A Pointer Authentication protects pointers by signing them. A short cryptographic signature, the PAC (Pointer Authentication Code), is computed from the pointer value, a secret key, and a context value such as the stack pointer, and is stored in the otherwise unused upper bits of the 64-bit pointer. The signature is verified before the pointer is used, so a return address or function pointer that a malicious program has overwritten fails authentication. This reduces the risk of incorrect program execution or hacking through ROP (Return-Oriented Programming) and JOP (Jump-Oriented Programming) attacks, which is the class of vulnerability the feature was designed to prevent.

The later generations add features in pairs. Armv8.5-A and Armv9.0-A support RNG, BTI, Memory Tagging, and Cache Clean to Point of Deep Persistence. Armv8.6-A and Armv9.1-A support MatMul instructions, bfloat16, virtualization enhancements, pointer authentication enhancements, and a high-precision timer. Armv8.7-A and Armv9.2-A support PCIe hot plug, atomic 64-byte loads and stores, and WFI and WFE with timeout, while Branch-Record recording is supported only in Armv9.2-A.
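The signing idea behind Pointer Authentication can be modeled in a few lines of Python. This is only a software sketch under stated assumptions: real hardware keeps its keys in system registers and uses a dedicated block cipher (QARMA is the algorithm Arm recommends), whereas the sketch substitutes HMAC-SHA256 from the standard library. The 48-bit address width, the 16-bit PAC field, and all function names are hypothetical choices for illustration.

```python
import hashlib
import hmac

VA_BITS = 48                     # assumed virtual-address width
PAC_BITS = 64 - VA_BITS          # upper pointer bits used for the PAC in this model
ADDR_MASK = (1 << VA_BITS) - 1


def _pac(key: bytes, address: int, modifier: int) -> int:
    """Derive a PAC_BITS-wide code from the address, a context modifier, and a key."""
    msg = address.to_bytes(8, "little") + modifier.to_bytes(8, "little")
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:8], "little") >> (64 - PAC_BITS)


def sign_pointer(key: bytes, pointer: int, modifier: int) -> int:
    """Place the code in the unused upper bits of the 64-bit pointer."""
    address = pointer & ADDR_MASK
    return (_pac(key, address, modifier) << VA_BITS) | address


def authenticate_pointer(key: bytes, signed: int, modifier: int) -> int:
    """Return the plain address if the code matches, otherwise raise ValueError."""
    address = signed & ADDR_MASK
    if (signed >> VA_BITS) != _pac(key, address, modifier):
        # Hardware instead yields an invalid pointer that faults when it is used.
        raise ValueError("pointer authentication failed")
    return address
```

A function prologue would call something like sign_pointer on the return address with the stack pointer as the modifier, and the epilogue would call authenticate_pointer before returning. An attacker who overwrites the saved return address without knowing the key cannot produce a matching code, except by guessing one of the 2^16 possible values in this model.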

CCA, a new feature of the Arm®v9 A-profile, introduces a concept called Realms. Unlike TrustZone (the Arm®v8 security feature), which divides execution into a secure world and a non-secure world, CCA lets general applications dynamically create and use Realms. These Realms are completely isolated from the operating system and the virtualization hypervisor, and they are managed by only a small amount of trusted management software. Memory page tables can be shared between general memory and Realm memory.

For example, an application downloaded from an app store can create a secure area that stays confidential from the OS and the hypervisor. Even if the operating system is compromised, the data used in that Realm remains protected. This prevents leakage of important personal information or commercially valuable algorithms. Because Realms need neither special drivers nor dedicated security devices, they make comprehensive protection of personal information more reliable.
Conclusion
The Arm®v9 generation arrives as ARM-based processors draw attention through the remarkable performance of Apple's M1 at the end of 2020, NVIDIA's ARM-based data center CPU "Grace" announced at GTC 2021, and Neoverse, ARM's general-purpose IP for the server market. This signals a full-scale expansion into data centers and artificial intelligence that builds on the strengths of low-power processors. With Intel's process technology and performance gains stagnating since the 6th-generation Core, the performance improvements of AMD's Zen and the entry of AP manufacturers into the enterprise market with ARM core IP are expected to renew market competition. That competition should drive development across computing, from mobile to high-performance computing, alongside the artificial intelligence, metaverse, and 5G network services the market demands.