Arm A-Profile Architecture Developments 2022

source link: https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/arm-a-profile-architecture-2022

September 29, 2022

Working with its architecture licensees and ecosystem partners, Arm continues to evolve its architecture, developing new functionality to meet the needs of both new and existing markets.

This blog discusses some of the key additions to the A-Profile architecture in 2022.

Full Instruction Set and System Register information will be available from early October on our developer webpages. The complete Arm Architecture Reference Manual (Arm ARM), covering the 2022 extensions and earlier functionality, is due for release in early 2023. Updates to the Learn the Architecture pages will appear during 2022 and 2023.

Details of previous updates to the A-Profile architecture are available here: 2014, 2015, 2016, 2017, 2018, 2019, 2020 and 2021.

2022 Virtual Memory System Architecture (VMSA) enhancements

The 2022 extensions include several updates to the VMSA.

Permission indirection and overlays

The 2022 extensions introduce a new way to control memory permissions. Instead of directly encoding the permission in the Translation Table Entry (TTE), fields in the TTEs are used to index into an array of permissions specified in a register. This indirection provides greater flexibility, greater encoding density and enables the representation of new permissions.

Each TTE can select two values, a base permission, and an overlay. The base permission represents the maximum set of permissions that the block or page has. The overlay can be used to further restrict the permission.

This is illustrated in the following diagram:

VMSA Translation Table

The base permission is permitted to be cached in TLBs, while the overlay is not. This means that the effective permission of a block or page can be efficiently changed dynamically.

For operating systems, the architecture provides separate EL1 and EL0 overlay registers. This allows an operating system to set a maximum permission for a page allocated to an application, then let the application manage permissions further within those constraints. For example, the operating system might allocate a JIT a page that is permitted to be writeable or executable. The JIT could then use the overlays to control whether the page is currently writeable or executable. This reduces the number of system calls and TLB invalidations.
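
The JIT case can be sketched as a small behavioural model. This is only an illustration of the idea that the effective permission is the base permission restricted by the overlay. The array contents, index values and names (`base_perms`, `overlay_perms`, `effective_permission`) are assumptions made for the example, not architectural encodings or register layouts.

```python
# Conceptual model only: table sizes, index values and names are assumptions
# for illustration, not architectural encodings.
R, W, X = "r", "w", "x"

# Array of base permissions, standing in for a permission register.
base_perms = [
    frozenset(),
    frozenset({R}),
    frozenset({R, W}),
    frozenset({R, X}),
    frozenset({R, W, X}),
]

# Array of overlay permissions, standing in for an overlay register that the
# application is allowed to rewrite.
overlay_perms = [frozenset({R, W, X}), frozenset({R, W})]


def effective_permission(tte):
    """The base permission is the maximum; the overlay can only remove rights."""
    return base_perms[tte["base_index"]] & overlay_perms[tte["overlay_index"]]


# The OS maps a JIT code page whose base permission allows read, write and
# execute, and points it at overlay entry 1.
jit_page = {"base_index": 4, "overlay_index": 1}

while_generating = effective_permission(jit_page)  # writeable, not executable

# The JIT flips the overlay entry itself; the table entry is left untouched.
overlay_perms[1] = frozenset({R, X})
while_running = effective_permission(jit_page)  # executable, not writeable
```

In this model the page switches between writeable and executable without any change to `jit_page`, which mirrors why no table update or system call is needed.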

Permission indirection also has benefits where the same tables are shared by multiple entities. For example, a set of tables might be used by both an Arm processor and an Arm System Memory Management Unit (SMMU).  The permissions that we want to apply to software accesses might be different to those we want to apply to an accelerator behind the SMMU. With permission indirection, the processor and SMMU can use the same tables but interpret the permissions differently.
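
A minimal sketch of that sharing, assuming a hypothetical table that stores only a permission index per page and two hypothetical permission arrays, one for the processor and one for the SMMU. None of the names or values below come from the architecture.

```python
# Conceptual model only: addresses, indices and permission arrays are
# hypothetical values chosen for illustration.

# One set of tables, shared by both observers: page address -> permission index.
shared_table = {0x1000: 0, 0x2000: 1}

# Each observer supplies its own interpretation of the indices.
cpu_perms = [frozenset("rw"), frozenset("rx")]
smmu_perms = [frozenset("r"), frozenset()]


def lookup(page, perm_array):
    """Resolve the permission index stored in the shared table entry."""
    return perm_array[shared_table[page]]


cpu_view = lookup(0x1000, cpu_perms)    # software may read and write
smmu_view = lookup(0x1000, smmu_perms)  # the accelerator may only read
```

The same entry in `shared_table` yields different permissions depending on which array is used to interpret it.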

Translation hardening

The translation tables used by the MMU are a key part of the isolation model and a high-value target for attackers. The 2022 extensions introduce a series of features to harden the MMU table walk process by reducing the available attack surface. These features include:

  • A new stage 1 attribute – Protected.
  • A new stage 2 permission – Mostly read-only.
  • A new instruction, RCW (Read-Check-Write), for updating translation table entries.

The Protected attribute controls which fields within a TTE are permitted to change. When the new instruction is used to modify a TTE it will atomically check the Protected attribute, and if set, only update permitted fields.
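
The effect of the Protected attribute on an update can be sketched with a bit mask. The bit position of the attribute and the mask of fields that remain updatable are hypothetical values picked for the example, and a plain Python function does not model the atomicity that the real instruction provides.

```python
# Conceptual model only: bit positions and the mask are hypothetical, and the
# atomicity of the real instruction is not modelled here.
PROTECTED_BIT = 0b0001             # assumed position of the Protected attribute
UPDATABLE_WHEN_PROTECTED = 0b0110  # assumed fields that may still change


def rcw_update(old_tte, requested_tte):
    """Return the entry that results from an RCW-style update."""
    if old_tte & PROTECTED_BIT:
        mask = UPDATABLE_WHEN_PROTECTED
        # Keep every bit outside the mask; take only permitted bits from the request.
        return (old_tte & ~mask) | (requested_tte & mask)
    # Unprotected entries can be replaced wholesale.
    return requested_tte


protected_result = rcw_update(0b1001, 0b0110)
unprotected_result = rcw_update(0b1000, 0b0110)
```

For the protected entry, the request cannot clear bit 3 or the Protected bit itself, so only the two permitted bits change.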

protected attributes for translation table entry

The new stage 2 “Mostly read-only” (MRO) permission enables software to restrict what can write to a page. A page marked as MRO permits hardware updates of the Access Flag and Dirty state, as well as updates due to an RCW instruction. However, other forms of store, such as STR (store) instructions, fail with a permission fault.
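
As a rough illustration, the rule can be written as a filter on who is writing. The writer labels, the `"MRO"` and `"RW"` strings and the `PermissionFault` exception are all hypothetical names for this sketch and do not correspond to architectural encodings.

```python
# Conceptual model only: writer labels and permission strings are hypothetical.
MRO_ALLOWED_WRITERS = {"hw_access_flag_update", "hw_dirty_update", "rcw"}


class PermissionFault(Exception):
    """Stands in for a stage 2 permission fault."""


def write_page(stage2_permission, writer):
    """Allow the write, or raise PermissionFault if the page is MRO and the writer is not permitted."""
    if stage2_permission == "MRO" and writer not in MRO_ALLOWED_WRITERS:
        raise PermissionFault(f"{writer} may not write to an MRO page")
    return True


rcw_ok = write_page("MRO", "rcw")  # RCW updates are permitted

try:
    write_page("MRO", "str")       # an ordinary store is rejected
    str_faulted = False
except PermissionFault:
    str_faulted = True
```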

Permission failure in Intermediate Physical Address Space

Together, the Protected attribute at stage 1 and the MRO permission at stage 2 give robust protection against many types of attack. The MRO permission prevents stores, other than those from RCW instructions, from changing mappings. The Protected attribute and the RCW instruction limit which fields in TTEs can be updated.

The feature also introduces a stage 2 attribute, AssuredOnly, that can be used to ensure that only Protected tables can point to a certain page. This is to help protect against aliasing attacks.

128-bit translation tables

As part of the 2022 extensions, Arm is adding a new translation table format to Armv9-A. The translation format follows the same principle as the existing format but increases the size of each descriptor to 128 bits. The new format enables larger output addresses and scope for new attribute fields.

Scalable Matrix Extension 2 (SME2)

In 2021 Arm announced the Scalable Matrix Extension (SME) to Armv9-A. SME added new capabilities to efficiently process matrices, including matrix tile storage and outer-product operations. In 2022, Arm builds on the capabilities of SME by introducing SME2.

SME provides outer-product instructions to accelerate matrix operations. SME2 significantly extends these capabilities with instructions for multi-vector operations, multi-vector predicates, range prefetches and 2-bit/4-bit weight compression.

Building on the capabilities of the Scalable Matrix Extension (SME) with SME2

The new instructions enable SME2 to accelerate more workloads than the original SME, including GEMV, non-linear solvers, small and sparse matrices, and feature extraction or tracking.
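
To show why outer-product operations suit matrix work, the sketch below computes a matrix product as a sum of outer products accumulated into a tile. It is plain Python over nested lists and only shows the arithmetic pattern. The function names are hypothetical and nothing here reflects actual SME or SME2 instruction behaviour, tile sizes or data types.

```python
# Arithmetic illustration only: this is not a model of SME or SME2 instructions.

def outer_product_accumulate(tile, column, row):
    """tile[i][j] += column[i] * row[j] for every element of the tile."""
    for i, a in enumerate(column):
        for j, b in enumerate(row):
            tile[i][j] += a * b


def matmul_by_outer_products(A, B):
    """Compute A @ B by accumulating one outer product per inner-dimension step."""
    m, k, n = len(A), len(B), len(B[0])
    tile = [[0] * n for _ in range(m)]
    for p in range(k):
        column = [A[i][p] for i in range(m)]  # p-th column of A
        outer_product_accumulate(tile, column, B[p])  # p-th row of B
    return tile


product = matmul_by_outer_products([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

Each step of the loop updates the whole accumulator tile from one column and one row.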

Guarded Control Stack (GCS)

With the 2022 extensions, Arm also adds support for a Guarded Control Stack (GCS) in Armv9-A. GCS provides mitigations against some forms of return-oriented programming (ROP) attacks. GCS also provides an efficient mechanism for profiling tools to get a copy of the current call stack, without needing to unwind the main stack.

A GCS is a protected region of virtual address space allocated by software. When the processor executes a Branch with Link instruction, such as BL, the return address is pushed onto the GCS as well as being written into the Link Register (LR). On a procedure return, the latest stored return address is popped from the GCS. The processor either compares the popped value with the LR, or uses the popped value directly. This process is illustrated here:

Arm support for Guarded Control Stack (GCS)
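
The call and return flow can also be sketched as a toy model, using the variant where the popped value is compared with the LR. The `Core` class, its method names and the `GcsMismatch` exception are hypothetical, and the model ignores the memory protection that guards a real GCS.

```python
# Conceptual model only: class, method and exception names are hypothetical.

class GcsMismatch(Exception):
    """Raised when the return address in LR differs from the GCS copy."""


class Core:
    def __init__(self):
        self.lr = None  # Link Register
        self.gcs = []   # Guarded Control Stack, most recent entry last

    def branch_with_link(self, return_address):
        """Model of BL: write LR and push the same address onto the GCS."""
        self.lr = return_address
        self.gcs.append(return_address)

    def procedure_return(self):
        """Model of a return: pop the GCS and compare the value with LR."""
        expected = self.gcs.pop()
        if expected != self.lr:
            raise GcsMismatch(f"LR={self.lr:#x}, GCS={expected:#x}")
        return expected


core = Core()
core.branch_with_link(0x4000)
normal_return = core.procedure_return()  # LR matches the GCS entry

core.branch_with_link(0x5000)
core.lr = 0xBAD0                         # simulate a corrupted return address
try:
    core.procedure_return()
    detected = False
except GcsMismatch:
    detected = True
```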

There are times when software needs to make manual adjustments to the control stack, for example to handle some long jumps. To enable this, the architecture provides specialist instructions for maintaining the GCS: GCSPUSHx and GCSPOPx.

To prevent accidental or malicious changes to the GCS, a new Stage 1 permission is introduced. This permission allows reads by software, but restricts writes to either GCSPUSH instructions or as a side-effect of executing a BL. 

Confidential Computing

In 2021 Arm announced the Realm Management Extension (RME), part of the Arm Confidential Compute Architecture. The 2022 extensions enhance RME in two areas:

  • Memory Encryption Contexts – this extension introduces support for multiple memory encryption contexts for the Realm physical address space. This can be used to implement memory encryption with a unique key for each Realm, which provides defence-in-depth to the security already afforded by Realms.
  • Device Assignment – this extension enhances the RME System Architecture and SMMUv3 to enable the secure assignment of devices to Realms. Each Realm can independently choose whether to allow an off-processor resource such as an accelerator to access a region of its address space.

Other functionality

Other enhancements introduced as part of the 2022 extensions include:

  • Support for the Hybrid Vector Length Agnostic (HVLA) programming model in SVE2. (Armv9-A)
  • Updates to the Memory Tagging Extension (MTE), including stage 2 traps on tag accesses and store-only checking mode.
  • Performance Monitor (PMU) snapshot support and fixed-function instruction counter.
  • Additional RAS capabilities, including a new exception type for reporting errors on structures other than memories.

Summary

This blog provides a brief introduction to the latest features included in the Arm architecture as Armv8.9-A and Armv9.4-A. More detailed information can be found on our Developer website.

The next step will be working with our ecosystem partners to ensure that open-source software is enabled, to make use of this functionality as soon as the hardware becomes available.

