Holistic design for the modern MCU platform
May 24, 2017
To meet the demands of today's industrial and consumer applications, microcontrollers must be as fast and efficient as possible while meeting the necessary cost constraints. Taking a holistic design approach that considers ways to enhance all aspects of the MCU platform, from the CPU core to memory to the fabrication process, can provide the required performance and power levels within cost-sensitive budgets.
Microcontrollers (MCUs) are ubiquitous in our lives. In the morning, they perk up our coffee machines; on our way to work, they help run our vehicles; at home, they enable a variety of devices we rely on every day. In all these areas and more, application-specific microcontrollers have achieved a powerful position.
However, as user expectations become more sophisticated, manufacturers of industrial and consumer applications face growing cost and performance pressures. One response is to make greater use of the platform concept. Building multiple applications on a common platform allows designers to reuse previously developed elements, which helps reduce costs.
Several key enhancements give an MCU platform the flexibility to be adapted for general use, as well as the desired power and performance levels needed within the required system cost parameters:
· Improved code efficiency
· Low-power fabrication process
· Advanced interrupt handling
· Multibus configuration
· Extreme connectivity
· High-speed flash memory
The best MCU platforms take a holistic design approach, aligning development of the CPU core, flash technology, fabrication process, system architecture, and peripherals such that they all work together to form an optimally efficient system.
Improved code efficiency
Improving code efficiency allows designers to reduce the amount of on-chip memory, which in turn helps reduce system costs. Traditionally, CISC architectures have been used to enable short programs and RISC architectures have been used to create fast programs. By blending CISC and RISC techniques, it is possible for a modern MCU platform to gain the best of both worlds. Modern MCU platforms such as the RX Family from Renesas Electronics have 89 complex instructions with 10 addressing modes and a five-stage execution pipeline, resulting in an architecture that achieves 1.65 DMIPS per MHz and has 28 percent smaller code size than competing RISC architectures.
Digital Signal Processing (DSP) is becoming increasingly common in MCU applications such as motor control, speech and audio, and intelligent sensing. By integrating dedicated DSP instructions and hardware features such as a multiply-accumulate (MAC) unit and a barrel shifter, a modern MCU platform can perform these DSP tasks efficiently.
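To make the role of the MAC unit concrete, the following is a minimal sketch of a fixed-point FIR filter tap loop in portable C. The function name `fir_q15` and the Q15 number format are illustrative assumptions, not taken from any vendor library. Each loop iteration is one multiply-accumulate, which is the operation a hardware MAC is designed to speed up, and the final scaling step is the kind of work a barrel shifter handles in a single shift.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: dot product of two Q15 vectors (values scaled by 2^15).
 * This is the inner loop of an FIR filter. */
int16_t fir_q15(const int16_t *coeff, const int16_t *sample, size_t taps)
{
    int64_t acc = 0;                            /* wide accumulator avoids overflow */

    for (size_t i = 0; i < taps; i++) {
        acc += (int32_t)coeff[i] * sample[i];   /* Q15 * Q15 = Q30, accumulated */
    }

    acc /= 32768;                               /* scale Q30 back to Q15 */

    /* Saturate instead of wrapping if the result does not fit in 16 bits. */
    if (acc > INT16_MAX) acc = INT16_MAX;
    if (acc < INT16_MIN) acc = INT16_MIN;
    return (int16_t)acc;
}
```

For example, with coefficients {0.5, 0.5} and samples {0.25, 0.75} expressed in Q15, the function returns 0.5 in Q15 (16384). How many cycles each iteration takes depends on the target's instruction set and compiler.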
Another trend is the growing use of floating-point math. Floating-point math can be implemented using four methods (see Figure 1). Lookup tables are fast but result in very large code sizes. Fixed-point math and software libraries are also possible but inefficient. The use of a hardware Floating-Point Unit (FPU) results in the best blend of fast execution time and small code size.
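The difference between the fixed-point and hardware floating-point approaches can be sketched in a few lines of C. The `q16_16` type and the helper names below are hypothetical and chosen only for illustration. The fixed-point version needs a widening multiply and an explicit rescale on every operation, and the programmer must track range and precision by hand. The floating-point version is a single multiply, which a hardware FPU can execute directly rather than through a software library call.

```c
#include <stdint.h>

typedef int32_t q16_16;   /* hypothetical Q16.16 type: value scaled by 65536 */

/* Convert an integer to Q16.16 (the caller must keep x within +/-32767). */
static inline q16_16 q16_from_int(int32_t x)
{
    return x * 65536;
}

/* Fixed-point multiply: widen to 64 bits, multiply, then rescale. */
q16_16 q16_mul(q16_16 a, q16_16 b)
{
    return (q16_16)(((int64_t)a * b) / 65536);
}

/* The same operation in floating point is a single multiply. */
float scale_float(float gain, float x)
{
    return gain * x;
}
```

Both versions compute 3 × 0.5 = 1.5 when given the equivalent inputs: `q16_mul(q16_from_int(3), 32768)` yields 98304, which is 1.5 in Q16.16.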
An FPU can be implemented in different ways. Typically, an FPU is added to the CPU core as a separate unit, which requires dedicated FPU registers. This results in a suboptimal architecture, because the extra registers consume more power and load and store instructions are needed to move data between them and the general registers. By designing the CPU core and FPU together, modern MCU platforms can offer an inline FPU implementation that boosts performance while reducing power consumption and code size (see Figure 2).
Low-power fabrication process
In the search for lower power consumption, designers are turning to process miniaturization, which lowers current consumption by reducing capacitive loads but comes with some significant trade-offs. For example, miniaturization causes physics-related problems such as higher leakage current. Moving to a 90 nm fabrication process helps achieve lower power cost-effectively while minimizing these design trade-offs.
Compared to larger geometry processes, the 90 nm process has smaller capacitive loads from gates and wiring, which enables modern MCUs to achieve higher performance with lower active power consumption. To minimize power loss, it is possible to use low-power design techniques such as clock gating and power gating to turn off clocks and power to areas of the MCU when they are not needed.
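From the software side, clock gating is commonly exposed as a control bit per peripheral. The sketch below models that idea with a plain variable standing in for a hardware register. The register name, bit assignments, and the convention that a set bit means "clock stopped" are all assumptions made for illustration and do not describe any specific device.

```c
#include <stdint.h>

/* Hypothetical module-stop register: one bit per peripheral, 1 = clock stopped.
 * On real hardware this would be a memory-mapped register, not a variable. */
static volatile uint32_t module_stop_reg = 0xFFFFFFFFu;  /* all clocks off */

/* Hypothetical bit positions for three peripherals. */
enum { MOD_TIMER = 0, MOD_UART = 1, MOD_ADC = 2 };

/* Start clocking a peripheral just before it is needed. */
void module_clock_enable(unsigned module)
{
    module_stop_reg &= ~(1u << module);
}

/* Gate the clock again as soon as the peripheral is idle. */
void module_clock_disable(unsigned module)
{
    module_stop_reg |= (1u << module);
}

/* Returns 1 if the peripheral is currently clocked, 0 otherwise. */
int module_clock_is_enabled(unsigned module)
{
    return (module_stop_reg & (1u << module)) == 0;
}
```

A driver following this pattern would call `module_clock_enable(MOD_ADC)` before starting a conversion and `module_clock_disable(MOD_ADC)` once the result has been read, so the peripheral draws switching power only while it is in use.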
Another technique increases the threshold voltage (VT) of a circuit element to reduce leakage and, with it, power consumption. The disadvantage of this approach is that it makes high-speed operation more difficult. To address this issue, MCUs like the Renesas RX600 series use a two-step approach for the basic circuit design: 1) use low-speed (high-VT, low-leakage) cells to keep power loss low during normal operation; 2) assign the cells on high-speed critical paths to a fast, low-VT design. This critical path analysis and optimization approach can achieve overall power consumption as low as 500 μA per MHz, or 50 mA at 100 MHz operation.
For applications that require even lower power consumption, it is possible to trade off performance for power and use an even lower-power process technology. For example, the RX200 series, which is limited to 50 MHz operation, is built using a low-power 130 nm process technology that enables current consumption as low as 200 μA per MHz.
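The per-MHz figures quoted for the two series translate into an active-current budget by simple multiplication. The helper below is a rough sketch with a hypothetical name. It assumes current scales linearly with clock frequency and ignores leakage and peripheral loads.

```c
#include <stdint.h>

/* Hypothetical helper: estimated active current in microamps, assuming
 * current scales linearly with clock frequency. */
uint32_t active_current_ua(uint32_t ua_per_mhz, uint32_t clock_mhz)
{
    return ua_per_mhz * clock_mhz;
}
```

With the numbers from the text, `active_current_ua(500, 100)` gives 50,000 μA (50 mA) for the 100 MHz case, and `active_current_ua(200, 50)` gives 10,000 μA (10 mA) for the 50 MHz case.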
Advanced interrupt handling
Interrupt handling is a vital aspect of real-time control applications, and the way it is implemented directly affects overall MCU system performance. Modern MCU platforms allow an interrupt to be configured as a fast interrupt, in which the Program Counter (PC) and Program Status Word (PSW) are saved to registers instead of SRAM. This greatly reduces latency when entering and exiting interrupt service routines compared to traditional push/pop interrupt service techniques. The ability to dedicate up to four general-purpose registers to interrupt use enables a modern MCU platform to keep the total overhead of entering and exiting an interrupt service routine to eight cycles, which results in enhanced real-time control and higher overall system performance (see Figure 3).
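To put the eight-cycle figure in perspective, the following sketch converts an entry/exit overhead in cycles into time and into CPU load. The function names are hypothetical, and the 100 kHz interrupt rate used in the example is an assumed value, not one from the article.

```c
#include <stdint.h>

/* Time spent on interrupt entry and exit, in nanoseconds.
 * cpu_mhz must be nonzero. */
uint32_t irq_overhead_ns(uint32_t overhead_cycles, uint32_t cpu_mhz)
{
    return (overhead_cycles * 1000u) / cpu_mhz;
}

/* CPU load caused by entry/exit overhead alone, in parts per million.
 * Overhead cycles per second divided by available cycles per second
 * (cpu_mhz * 1e6), times 1e6, simplifies to the expression below. */
uint32_t irq_overhead_ppm(uint32_t overhead_cycles, uint32_t irq_per_sec,
                          uint32_t cpu_mhz)
{
    return (uint32_t)(((uint64_t)overhead_cycles * irq_per_sec) / cpu_mhz);
}
```

At 100 MHz, eight cycles correspond to 80 ns per interrupt. With an assumed 100,000 interrupts per second, the overhead comes to 8,000 ppm, or 0.8 percent of the available CPU time, excluding the work done inside the service routine itself.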
Multibus configuration
In many embedded systems, bottlenecks caused by intensive internal memory access are a common problem that reduces overall processing capacity. Moving from a single-bus to a multibus configuration eliminates this problem. Modern MCU platforms have advanced system interfacing capabilities with multiple system buses and multiple bus masters, enabling four independent plus two interleaving data transfers, as depicted in Figure 4.
Extreme connectivity
The demand for networking capabilities has exploded, a trend that will continue throughout 2012. To meet these demands within competitive pricing and form factor constraints, semiconductor suppliers are integrating connectivity on-chip using a combination of Ethernet, USB, and CAN capabilities. The security system example shown in Figure 5 illustrates the use of Ethernet, USB, and CAN all within one application.
In this dual-USB example, one channel is configured as a Host taking field updates from a USB thumb drive, and the second channel is configured as a Device connecting to a computer for diagnostic testing.
The USB channels on modern MCU platforms should also offer multiple endpoints to enable composite implementations, which are becoming increasingly popular as USB continues to spread into embedded designs as a preferred communication standard (see Figure 6).
No-wait high-speed flash memory access
CPU speeds for standard MCUs have increased from the 50 MHz range to the 100 MHz range during the past few years, while access speeds for embedded flash memory have often remained at 30 MHz or below. This widening gap results in slow on-chip flash memory access, reducing the CPU’s effective processing speed.
To compensate for this speed difference, engineers usually resort to delaying measures, such as inserting a number of wait states so that the CPU keeps pace with the flash memory’s slower access speed. Ultimately, this type of workaround places a disappointingly low ceiling on the MCU’s overall performance and wastes valuable design potential.
Modern MCU platforms use advanced flash memory that offers fast 10 ns access times, resulting in 100 MHz performance. This means that instructions can be pulled from 100 MHz flash and fed into the 100 MHz CPU with no wait states needed, delivering extremely high performance and excellent system efficiency (see Figure 7).
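The relationship between CPU clock, flash access time, and wait states can be sketched with a small calculation. The function name is hypothetical, and the 34 ns access time used below is an assumed round figure for flash running at roughly 30 MHz. The model is simplified: it assumes one flash access per fetch and no cache or prefetch buffer.

```c
#include <stdint.h>

/* Hypothetical helper: wait states needed so that one flash access fits in
 * a whole number of CPU cycles. Computes ceil(access_ns * cpu_mhz / 1000),
 * the number of CPU cycles per access, then subtracts the one cycle a
 * fetch always takes. */
uint32_t flash_wait_states(uint32_t cpu_mhz, uint32_t flash_access_ns)
{
    uint32_t cycles = (flash_access_ns * cpu_mhz + 999u) / 1000u;
    return cycles > 0u ? cycles - 1u : 0u;
}
```

With a 100 MHz CPU, 10 ns flash needs zero wait states, matching the no-wait case described above. Under the assumed 34 ns access time, the same CPU would need three wait states, so each fetch from flash would take four cycles instead of one.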
An efficient design methodology
Today’s industrial and consumer applications require microcontrollers that offer increased performance and lower power while staying within cost-sensitive budgets. To achieve this, modern MCUs must be as efficient as possible. By using a holistic system design methodology, semiconductor suppliers can provide modern MCU platforms that address today’s challenges.
MCUs that combine a fast and efficient CPU with no-wait flash memory ensure that the full design potential is realized while enabling DSP and floating-point capabilities. These modern MCUs also address the need for increased connectivity by integrating advanced system interfacing capabilities and communication peripherals enabling Ethernet, USB, CAN, and direct drive LCD, all within one MCU platform.