Arm Releases Cortex-A77 CPU, Machine Learning Processor, and Mali-G77 GPU

By Brandon Lewis

Editor-in-Chief

Embedded Computing Design

June 04, 2019

COMPUTEX. Arm has released a suite of IPs that include the Arm Cortex-A77 CPU, Arm Mali-G77 GPU, and Arm Machine Learning (ML) processor.

The Cortex-A77 offers a 20 percent improvement in instructions per clock (IPC) over its predecessor and 35x the machine learning performance of the Cortex-A75. It also delivers a 20 percent improvement in integer performance, 35 percent better floating-point performance, and 15 percent higher memory bandwidth.

The IP is slated for 7 nm process technology and brings several microarchitecture enhancements:

Branch Prediction: Double the branch prediction bandwidth, 4x the L1 branch target buffer (BTB) capacity, and 33 percent more L2 BTB
Memory: High bandwidth, low latency fetch operations and dynamic code optimization through Macro-op (Mop) cache; dynamic data prefetching based on memory subsystem configuration; and twice the dedicated load-store issue bandwidth
Execution: A 50 percent increase in integer execution bandwidth, enabling up to six instructions per cycle; a 25 percent increase in out-of-order window size, to 160 instructions; and a second AES encryption pipe
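
The percentages in this list imply baseline figures for the previous core that the article does not state. The following is a minimal back-of-envelope sketch in Python, assuming each quoted increase is measured relative to the predecessor's value. The function name is illustrative, and the derived baseline is an inference from the numbers above, not a figure published by Arm.

```python
from fractions import Fraction


def baseline_from_increase(new_value, percent_increase):
    """Return the prior value implied by a new value and a percent increase."""
    return Fraction(new_value) / (1 + Fraction(percent_increase, 100))


# Out-of-order window: 160 instructions after a 25 percent increase.
prev_window = baseline_from_increase(160, 25)

# Integer execution bandwidth: a 50 percent increase.
# Assumption: treated as a pure ratio, since no absolute figure is given.
int_bandwidth_ratio = 1 + Fraction(50, 100)

print(f"Implied previous out-of-order window: {prev_window} instructions")
print(f"Integer execution bandwidth ratio: {float(int_bandwidth_ratio):.2f}x")
```

Exact fractions are used so that the implied baseline comes out as a whole number rather than a rounded float.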

The Arm Machine Learning (ML) processor is a neural processing unit (NPU) that provides up to 5 tera operations per second (TOPS) per watt. Based on the Winograd architecture, which consists of fixed-function engines for executing convolutional layers and programmable layer engines for non-convolutional layers, the Arm ML processor delivers 225 percent more performance on common filters than competing NPUs.

Key features of the Arm ML processor include:

Network Types: Convolutional neural networks (CNNs) and recurrent neural networks (RNNs) are supported for classification, object detection, speech recognition, natural language processing, and other edge artificial intelligence (AI) applications.
Heterogeneous Compute: Optimized for use alongside Cortex-A CPUs and Mali GPUs
Multicore Scalability: Up to eight NPUs and 32 TOPS in a cluster or 64 NPUs in a mesh configuration
Software and Framework Support: The Arm ML processor integrates with TensorFlow, TensorFlow Lite, Caffe, Caffe2, and other frameworks via the ONNX ecosystem. It is also compatible with the Arm NN software development kit (SDK).
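
The scalability figures above can be turned into rough per-NPU numbers. This is a small Python sketch that uses only values quoted in the article (8 NPUs and 32 TOPS per cluster, 64 NPUs per mesh, up to 5 TOPS per watt). The mesh throughput and the power figure are not stated in the article: they assume perfectly linear scaling and peak efficiency respectively, so treat them as optimistic bounds rather than specifications. The helper names are hypothetical.

```python
CLUSTER_NPUS = 8      # from the article
CLUSTER_TOPS = 32     # from the article
MESH_NPUS = 64        # from the article
TOPS_PER_WATT = 5     # "up to" figure from the article

# Throughput of a single NPU implied by the cluster figures.
tops_per_npu = CLUSTER_TOPS / CLUSTER_NPUS


def ideal_tops(npu_count, per_npu=tops_per_npu):
    """Upper bound on throughput, assuming perfectly linear scaling."""
    return npu_count * per_npu


def min_watts(tops, efficiency=TOPS_PER_WATT):
    """Lower bound on power, assuming the peak TOPS-per-watt figure holds."""
    return tops / efficiency


print(f"Per-NPU throughput: {tops_per_npu} TOPS")
print(f"Ideal 64-NPU mesh throughput: {ideal_tops(MESH_NPUS)} TOPS")
print(f"Minimum power per NPU at peak efficiency: {min_watts(tops_per_npu)} W")
```

Real deployments would likely fall short of these bounds because of memory bandwidth and interconnect limits.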

The new Mali-G77 GPUs are based on the Valhall architecture, which offers an enhanced microarchitecture engine, load/store caches, and texture pipes. Compared with the previous generation, these upgrades result in a 40 percent performance improvement, 30 percent better performance density, a 30 percent increase in energy efficiency, and 60 percent greater machine learning inferencing performance. Arm expects this to result in a 40 percent upgrade in peak graphics performance.

Highlights of the Valhall microarchitecture include:

Wider Execution Engines: Two 16-wide execution engines, delivering 32 fused multiply-adds (FMA) per core (two clusters of 16 FMAs per execution engine per core)
Quad Texture Mapper: With four texels per cycle, providing double the throughput of the Mali-G76
Dynamic Instruction Scheduling: The scheduler decides which instructions should be executed from which warps. This is handled completely in hardware, and the instructions are then issued to independent parallel arithmetic logic units (ALUs)
Arm Frame Buffer Compression 1.3: AFBC 1.3 supports 2-plane YUV, improved front-buffer rendering, and separate depth/stencil encoding for better compatibility with APIs like Vulkan
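
The per-core figures above (32 FMAs and four texels per cycle) translate into peak throughput once a core count and clock frequency are chosen. The Python sketch below shows that arithmetic. The article gives neither a core count nor a clock speed, so the 8-core, 1.0 GHz configuration is purely hypothetical, as are the function names. It also assumes the texel rate is per core and counts each FMA as two floating-point operations.

```python
FMAS_PER_CORE_PER_CYCLE = 32    # from the article
TEXELS_PER_CORE_PER_CYCLE = 4   # from the article (assumed to be per core)
FLOPS_PER_FMA = 2               # one multiply plus one add


def peak_gflops(cores, clock_ghz):
    """Theoretical peak GFLOPS for a given core count and clock in GHz."""
    return cores * FMAS_PER_CORE_PER_CYCLE * FLOPS_PER_FMA * clock_ghz


def peak_gtexels(cores, clock_ghz):
    """Theoretical peak texture rate in gigatexels per second."""
    return cores * TEXELS_PER_CORE_PER_CYCLE * clock_ghz


# Hypothetical configuration, not taken from the article.
cores, clock_ghz = 8, 1.0
print(f"Peak compute: {peak_gflops(cores, clock_ghz)} GFLOPS")
print(f"Peak texturing: {peak_gtexels(cores, clock_ghz)} GTexels/s")
```

These are theoretical ceilings. Sustained throughput depends on the workload, memory bandwidth, and thermal limits.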

For more information on the new processor cores, visit developer.arm.com.

Brandon is responsible for guiding content strategy, editorial direction, and community engagement across the Embedded Computing Design ecosystem. A 10-year veteran of the electronics media industry, he enjoys covering topics ranging from development kits to cybersecurity and tech business models. Brandon received a BA in English Literature from Arizona State University, where he graduated cum laude.
