
Next-gen Armv9 CPUs unleash compute performance - Announcements - Arm Community...

 1 year ago
source link: https://community.arm.com/arm-community-blogs/b/announcements/posts/compute-performance-unleashed

New generation of Armv9 CPUs unleashes unprecedented compute power


Last year, Arm launched the very first generation of Armv9 CPUs. This was a landmark announcement, not only for Arm, but also for our extended ecosystem. The advancements set in motion by the Armv9 architecture will have a lasting impact on the tech industry and the next decade of compute.

Now, we are excited to announce our second generation of Armv9 based CPUs. These include the Arm Cortex-X3 and Arm Cortex-A715, as well as important updates to Arm Cortex-A510 and the DSU-110 (DynamIQ Shared Unit). The new Armv9 CPUs and updates form the foundation of Arm’s new Total Compute Solutions (TCS22).

The new Armv9 CPUs show our commitment to Compute Performance Unleashed. The new Cortex-X3 and Cortex-A715 and the upgrades to Cortex-A510 and DSU-110 are all designed to push the limits of peak performance and deliver exceptional sustained performance and efficiency. As part of a versatile CPU cluster, we aim to inspire partners and captivate end users by delivering outstanding user experiences on next-generation consumer devices.

Cortex-X3: bringing the X-factor to performance

The new Cortex-X3 is Arm’s third-generation Cortex-X CPU. It is the product of the Cortex-X custom program, which allows participating partners to shape the final product design. Designed for ultimate performance, Cortex-X3 represents the third consecutive year of double-digit IPC growth. This strong IPC growth translates to performance leadership on Android flagship smartphones and Windows on Arm laptop devices. Cortex-X3 targets a range of benchmarks and applications, and delivers 25 percent improved performance when compared with the latest Android flagship smartphone.

Within the laptop space, Cortex-X3 delivers 34 percent improved single-threaded performance when compared with the latest mainstream laptops. Consistent performance and microarchitecture improvements have laid solid foundations for a strong portfolio of Cortex-X CPUs. 

Enhanced scalability with new CPU clusters

DSU support for up to 12 cores and 16 MB of L3 cache on Cortex-X3 enables scalability across laptop and desktop devices, mobile, DTVs, and beyond. Compared to the previous generation, the newly updated DSU-110 supports 50 percent more cores, alongside the latest ISA features. These changes improve flexibility for our partners and deliver the resources to realize the full potential of our CPUs for improved user experiences. Our partners can now target premium laptop devices with new configurations, such as 8 Cortex-X3 CPU cores and 4 Cortex-A715 CPU cores, unlocking a new generation of consumer devices.
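The cluster arithmetic above can be checked with a minimal sketch. The 12-core and 16 MB limits come from the text; the previous limit of 8 cores is only inferred from the "50 percent more cores" figure, and the `ClusterConfig` class and its fields are hypothetical names used for illustration. Real DSU-110 configurations carry further constraints that this sketch does not model.

```python
from dataclasses import dataclass

MAX_CORES = 12           # updated DSU-110 core limit stated in the article
PREVIOUS_MAX_CORES = 8   # assumption: inferred from "50 percent more cores" (12 / 1.5)
MAX_L3_MB = 16           # L3 cache limit stated in the article


@dataclass
class ClusterConfig:
    """Hypothetical description of a CPU cluster: core name -> core count."""
    cores: dict
    l3_mb: int

    def total_cores(self) -> int:
        return sum(self.cores.values())

    def is_valid(self) -> bool:
        # Only the two limits quoted in the article are checked here.
        return 0 < self.total_cores() <= MAX_CORES and 0 <= self.l3_mb <= MAX_L3_MB


# The premium laptop configuration mentioned in the article.
laptop = ClusterConfig({"Cortex-X3": 8, "Cortex-A715": 4}, l3_mb=16)

# Relative increase in the maximum core count over the inferred previous limit.
uplift_percent = (MAX_CORES - PREVIOUS_MAX_CORES) / PREVIOUS_MAX_CORES * 100

print(laptop.total_cores(), laptop.is_valid(), uplift_percent)
```

The 8 + 4 laptop example uses the full 12-core budget, which is why it only becomes possible with the updated DSU-110.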

CPU clusters

Cortex-A715 for perfectly balanced efficient performance

For the Cortex-A715, we are doubling down on the key value proposition of the Cortex-A700 CPUs, which is all about ultimate efficient performance. We have made a series of targeted improvements to the Cortex-A715 design, including branch prediction accuracy and data prefetching. Consistent IPC gains mean that the Cortex-A715 now reaches the significant milestone of matching the performance of the 2-year-old Arm Cortex-X1 CPU.

Cortex-A715 offers the perfect balance of performance and efficiency for our partners. This includes a 20 percent power efficiency improvement at the same performance, and a 5 percent performance uplift at the same power, compared to the Arm Cortex-A710 CPU (ISO process).

Cortex-A715

This push for efficient performance makes Cortex-A715 the workhorse of the big.LITTLE CPU cluster. The CPU can be paired with Cortex-X3 and Cortex-A510 CPU cores as part of TCS22.

Efficiency upgrades for Cortex-A510

Alongside our new CPUs, we have also made updates to last year’s “LITTLE” Armv9 CPU, the Cortex-A510, which is predominantly designed for high efficiency. We have maintained the performance of the 2021 version and delivered superior efficiency with a 5 percent power reduction. This takes our push for ultimate efficiency in our “LITTLE” CPU cores to new heights, with lower power translating to better battery life for the end user.

Updates to Cortex-A510

The Power of big.LITTLE

Arm’s big.LITTLE technology, which was first launched in 2011, is now the most commonly used heterogeneous processing architecture for consumer devices worldwide, including smartphones, laptops, and DTVs. Our DynamIQ technology then combines the big and LITTLE CPUs into a single, fully integrated cluster. The flexibility of the big.LITTLE CPU clusters is perfect for multi-threaded workloads. The technology can adjust to the dynamic usage patterns across consumer devices, such as high processing intensity for gaming and web browsing, and longer periods of low-intensity tasks such as texting, email, and audio.

Game on with Arm’s CPUs

Gaming is a key use case that is improved and enhanced by our latest CPU technologies, particularly when silicon partners utilize the big.LITTLE technology alongside our new flagship and premium GPUs in TCS22. For gaming, we have focused on data-driven optimizations across CPU and software. The microarchitecture optimizations for our new CPUs provide far larger footprints across the front end, core, and back end. For example, the front end of the Cortex-X3 includes a 50 percent larger L1/L2 branch target buffer (BTB) capacity and a 10x larger L0 BTB capacity. Meanwhile, we have fine-tuned the software to ensure that there is a seamless transition across the different CPUs to meet the demands of different games.

Security evolution with Armv9

Through the second-generation Armv9 CPUs, we are introducing a brand-new Asymmetric Memory Tagging Extension (MTE), and Enhanced Privileged Access Never (EPAN) for improved access control.

MTE detects and prevents memory safety vulnerabilities across the entire system, providing time-to-market benefits for application developers. MTE-enabled devices can quickly and effectively identify buffer overflows and heap corruption in the code.

Asymmetric MTE offers improved flexibility between the speed, precision, and targeting of these security vulnerabilities. This benefits software development with more stable applications, while also enabling a broader rollout of MTE across the ecosystem.
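To make the tagging idea concrete, here is a toy Python simulation, not real MTE code. It assumes the commonly documented MTE parameters (a 4-bit tag per 16-byte granule) and models asymmetric mode as synchronous checks on reads and deferred (asynchronous) reporting on writes. The `TaggedHeap` class, its methods, and the fault list are hypothetical names for illustration; real hardware reports faults through system registers and signals rather than Python exceptions.

```python
GRANULE = 16   # bytes covered by one tag (assumed MTE granule size)
NUM_TAGS = 16  # 4-bit tags; tag 0 is kept for untagged memory in this toy model


class TagMismatch(Exception):
    """Raised on a synchronous tag check failure."""


class TaggedHeap:
    def __init__(self, size: int):
        self.data = bytearray(size)
        self.tags = [0] * (size // GRANULE)
        self.next_tag = 1
        self.deferred_faults = []  # addresses of write mismatches, reported later

    def alloc(self, offset: int, length: int):
        """Tag the granules covering the region and return a (tag, base) 'pointer'."""
        tag = self.next_tag
        self.next_tag = self.next_tag % (NUM_TAGS - 1) + 1  # cycle through 1..15
        for g in range(offset // GRANULE, (offset + length - 1) // GRANULE + 1):
            self.tags[g] = tag
        return (tag, offset)

    def _matches(self, ptr, index: int) -> bool:
        tag, base = ptr
        return self.tags[(base + index) // GRANULE] == tag

    def load(self, ptr, index: int) -> int:
        # Reads are checked synchronously: a mismatch faults immediately.
        if not self._matches(ptr, index):
            raise TagMismatch(f"read at address {ptr[1] + index}")
        return self.data[ptr[1] + index]

    def store(self, ptr, index: int, value: int) -> None:
        # Writes are checked asynchronously: the mismatch is recorded, not raised.
        if not self._matches(ptr, index):
            self.deferred_faults.append(ptr[1] + index)
        self.data[ptr[1] + index] = value


heap = TaggedHeap(64)
a = heap.alloc(0, 16)    # first buffer, granule 0
b = heap.alloc(16, 16)   # adjacent buffer, granule 1, different tag

heap.store(a, 0, 7)      # in bounds: tags match
heap.store(a, 16, 9)     # overflow into b: recorded as a deferred fault

try:
    heap.load(a, 16)     # overflowing read: faults immediately
    read_faulted = False
except TagMismatch:
    read_faulted = True
```

Because tags apply per granule, an overflow that stays inside the same 16-byte granule would go unnoticed in this model, which illustrates the precision trade-off that the text mentions.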

Enhanced developer experience

At Arm, we support developers by ensuring that our hardware design efforts can be fully realized as software performance gains.

Arm provides a fully featured, MTE-enabled Fixed Virtual Platform (FVP) for TCS22, alongside a complete Android software stack supporting the functionality. This gives developers a platform to validate that their applications are secure and stable before physical devices are available. Arm upstreams MTE support to the open-source community, ensuring applications can be compiled with the necessary functionality. Well-implemented applications require no source code modifications from developers.

The Streamline performance analyzer, a component of Arm Development Studio, enables developers to easily visualize and compare performance as they migrate platforms to Armv9. Support for devices powered by these processors will also be available as part of Arm Mobile Studio in line with the availability of such devices. This enables application developers to optimize the performance and efficiency of their games.

Compute performance unleashed

The second-generation Armv9 CPUs are our most comprehensive CPU offering to date, delivering tangible benefits across the core areas of performance, compute, and software. These serve as the catalyst for the future of compute on Arm.

Summary of second-generation Armv9 CPUs

Arm big.LITTLE is a testament to our clear vision for compute and exemplifies the flexibility that Arm solutions offer our partners and the wider ecosystem. Whether it is compute-intensive gaming or low-intensity messaging, big.LITTLE CPU cluster configurations deliver the best possible experiences for the user.

We empower developers to create the very best, most secure applications by building on the solid foundation of Armv9 security features. By integrating these features into the hardware, our CPUs offer exceptional security without compromising on peak performance or power.

With our new CPUs, we are continuing to push the boundaries of specialized processing. Our latest CPU clusters enable scalability in multiple power, performance, and area vectors across a broad range of next-generation devices. The versatile and powerful Armv9 CPU clusters deliver compute for the next decade and support the next generation of innovation on Arm.

Arm's new Total Compute Solutions

