Arm 2022 GPU Architecture to Deliver 5X ML Workload Performance

By Tiera Oliver

Assistant Managing Editor

Embedded Computing Design

October 22, 2021

Arm has announced that an as-yet-unnamed GPU architecture, due for release next year, will deliver a 4.7x increase in FP32 machine learning workload performance.

Speaking at the company’s virtual Arm DevSummit this week, Ian Bratt, a Fellow and Senior Director of Technology for ML at Arm, said that, unsurprisingly, the new GPU architecture will cater to AI and ML workloads.

The projected 4.7x increase for the 2022 architecture is measured against the Mali-G76, which was released in 2018 and is based on the previous-generation Bifrost architecture. Compared to the Mali-G710 – the company’s third GPU built on the current-generation Valhall architecture – the 2022 GPU is projected to offer double the FP32 machine learning performance.
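Taken together, the two comparisons imply a rough figure for how the Mali-G710 stacks up against the Mali-G76. The short Python sketch below works through that arithmetic. It assumes both ratios were measured on the same FP32 ML workload under comparable configurations, which the announcement does not confirm, so the result is a back-of-the-envelope estimate rather than a figure from Arm. The function and variable names are illustrative only.

```python
def implied_speedup(new_vs_old: float, new_vs_mid: float) -> float:
    """Return the speedup of the middle GPU over the old GPU
    implied by two ratios that share the same numerator (the new GPU)."""
    return new_vs_old / new_vs_mid


# Figures quoted in the article (FP32 ML performance):
NEW_VS_G76 = 4.7   # 2022 GPU relative to Mali-G76
NEW_VS_G710 = 2.0  # 2022 GPU relative to Mali-G710 ("double")

# Assumption: both ratios are directly comparable (not stated by Arm).
g710_vs_g76 = implied_speedup(NEW_VS_G76, NEW_VS_G710)
print(f"Implied Mali-G710 vs. Mali-G76: {g710_vs_g76:.2f}x")
```

Under that assumption, the Mali-G710 would sit at roughly 2.35x the FP32 ML performance of the Mali-G76.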

Bratt said that precise performance-per-watt metrics are still unknown and may not be available for some time, given that Arm cores typically don’t ship in production silicon until a year after release. For context, the G710 came out of the gate with a 35% improvement in ML workloads and 20% higher graphics performance in an ISO-process-node GPU configuration compared to its G78 predecessor.

The Arm Mali-G710 is slated to ship in laptop, tablet, and smartphone sockets this year.

According to Bratt, Arm expects to continue developing GPU architectures on a three-year release cadence to address the expanding demand for ML performance. The company hopes to achieve this through better per-core performance, support for increasing numbers of cores, and software development infrastructure for advanced neural networks that mirror human cognition.

"We actually don't have the luxury of millions of years of evolution, so we need to develop tools to kind of short circuit that evolution and enable quick exploration of neural network architectures,” he said. "It's more than just adding instructions and improving hardware IP, we also have to provide the software, the tools, the libraries to enable that ML performance."

For more information about the Arm DevSummit 2021 Day 2 Keynote, visit www.arm.com/blogs/blueprint/devsummit-2021-day2 or see the company’s Graphics and Multimedia offerings.

Tiera Oliver, Assistant Managing Editor for Embedded Computing Design, is responsible for web content edits, product news, and constructing stories. She develops content and constructs ECD podcasts, such as Embedded Insiders. Before working at ECD, Tiera graduated from Northern Arizona University, where she received her B.S. in journalism and political science and worked as a news reporter for the university’s student-led newspaper, The Lumberjack.
