Accelerating Virtualized Environments with Virtualized Hardware Functions
October 31, 2019
The movement toward software transportability created a need for designers to move quickly into systems with different performance and price points, and it meant they had to design for each of those points.
Over time, the industry has moved to virtualized functions and software-defined networking (SDN) to achieve flexible deployment across a wide range of hardware platforms in datacenters and networks. Software must now be transferable across multiple software and hardware environments to be cost-effective and to provide the flexibility needed to meet changing performance demands.
Open vSwitch (OVS) is an example: switching was once a hardware function, and virtualizing it brought flexibility of deployment. In the process, some functions that are better suited to hardware, such as packet parsing and classification, have become bottlenecks. Finding ways to use hardware to accelerate these virtual functions while maintaining flexibility of deployment could greatly increase throughput and reduce latency.
SmartNICs with FPGAs or dedicated silicon are emerging as a hybrid hardware/software solution. To work together with software functions, key hardware functions must be virtualized.
Several challenges arise when accelerating virtual environments with FPGAs. They can be attributed mainly to storage capacity, aggregate random-access memory performance, and memory access latency:
The first challenge is supporting systems that require large amounts of memory accessed in a truly random pattern. Even with present-day FPGAs that offer larger on-die resources, including those with high-bandwidth memory (HBM), the issue is how much of these resources will be needed to meet high-speed random-access requirements. A third common approach is to augment on-chip resources with off-chip QDR SRAM, which consumes many I/O resources for relatively low density. The question is how to replace multiple QDR devices with a single device that provides equal or greater memory.
The second challenge can be summarized as investment utilization, or portability. Many end customers who need functions such as packet classification or deep packet inspection (DPI) invest significant development effort in software. If those customers need a mid-life performance boost or want to broaden their product offering, substantial software rework is required, possibly along with a redesign of the associated hardware.
The Virtual Accelerator Engine Approach
A Virtual Accelerator Engine (VAE) approach lets the user design to an API and an RTL module interface that isolate the software and the rest of the system design from the underlying implementation. The implementation can then be upgraded for a performance boost while preserving the effort already spent on software development and debug. In addition, the virtual accelerator approach works well in hierarchical solutions like OVS, where fast paths “fall back” to slower paths, all using the same programming model to manage tables.
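The fall-back pattern can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the MoSys API: the names `ClassifierTable` and `TieredClassifier` and their methods are hypothetical, and in-memory dictionaries stand in for hardware tables. The sketch shows a small fast table backed by a larger slow table, with both tiers managed through the same add and lookup calls.

```python
from collections import OrderedDict


class ClassifierTable:
    """Hypothetical table interface; every tier exposes the same calls."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._entries = OrderedDict()

    def add(self, key, action):
        # When the table is full, evict the oldest entry (simple FIFO policy).
        if key not in self._entries and len(self._entries) >= self.capacity:
            self._entries.popitem(last=False)
        self._entries[key] = action

    def lookup(self, key):
        # Returns None on a miss.
        return self._entries.get(key)


class TieredClassifier:
    """Try the fast path first; on a miss, fall back to the slow path."""

    def __init__(self, fast, slow):
        self.fast = fast
        self.slow = slow
        self.fast_hits = 0
        self.slow_hits = 0

    def add(self, key, action):
        # Rules are installed in the slow table, which holds the full rule set.
        self.slow.add(key, action)

    def lookup(self, key):
        action = self.fast.lookup(key)
        if action is not None:
            self.fast_hits += 1
            return action
        action = self.slow.lookup(key)
        if action is not None:
            self.slow_hits += 1
            # Promote the entry so the next lookup is served by the fast path.
            self.fast.add(key, action)
        return action
```

In this sketch the first lookup of a flow is served by the slow table and later lookups by the fast table. Because both tiers share one interface, either could be replaced by a different implementation without changing the calling code.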
Currently, there are multiple approaches to accelerating FPGA-based designs. Two methods in particular address the most common bottlenecks that customers encounter when designing FPGA-based systems.
The strategy has been to define a common API that is supported by a range of hardware environments with scalable performance. Because the API offers different performance specifications to choose from, the designer does not need to know what the underlying hardware is. In essence, it is a virtual hardware solution.
Here, hardware does not drive software decisions. Software performance is defined first, and then a hardware environment is selected. Thus: software-defined, hardware-accelerated.
Because software system design is performed at the API level, before a hardware environment is selected, these solutions are called Virtual Accelerator Engines. The system can execute the software-defined function on a series of hardware platform options, each offering a different performance point.
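The "define performance first, select hardware second" flow can be illustrated with a short Python sketch. Everything in it is an assumption made for illustration: the `Platform` record, the catalogue entries, and especially the throughput and cost figures, which are invented and are not vendor specifications. The sketch simply picks the lowest-cost platform that meets a stated lookup-rate requirement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Platform:
    name: str
    lookups_per_sec: float  # invented figure, for illustration only
    relative_cost: float    # invented figure, for illustration only


# Hypothetical catalogue of hardware environments behind one common API.
CATALOGUE = [
    Platform("cpu-software", 1e6, 1.0),
    Platform("fpga-standalone", 50e6, 4.0),
    Platform("fpga-plus-accelerator-ic", 1e9, 10.0),
]


def select_platform(required_lookups_per_sec, catalogue=CATALOGUE):
    """Return the cheapest platform that meets the requirement, or None."""
    candidates = [
        p for p in catalogue if p.lookups_per_sec >= required_lookups_per_sec
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda p: p.relative_cost)
```

With these made-up figures, a requirement of 2 million lookups per second selects the standalone FPGA entry, while a requirement beyond every entry returns `None`, which signals that no listed environment is sufficient.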
A Virtual Accelerator Engine chiefly benefits applications that need to protect software investments by using a common API for transportability or performance scaling across many hardware environments. Designing to a common API allows a system designer to port an application seamlessly across a range of performance platforms.
Possible VAE Platforms
The following illustration depicts a VAE’s scalable hardware environments, each of which would execute a common API and module interface. A common application can experience up to a 400x performance increase by moving from lower-end, more cost-effective hardware environments to performance-driven, hardware-defined systems. The fundamental performance of each is determined by the capability of the underlying memory with regard to total aggregate random reads and writes.
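One simple way to reason about that memory dependence is a back-of-the-envelope bound, sketched here in Python. The model is an assumption and does not come from the platform data: it supposes each lookup costs a fixed number of random memory accesses, so the lookup rate cannot exceed the aggregate random-access rate divided by that number. The function names are hypothetical.

```python
def max_lookups_per_sec(random_accesses_per_sec, accesses_per_lookup):
    """Upper bound on lookup rate when random memory access is the limit.

    Assumes every lookup costs the same number of random accesses.
    """
    if random_accesses_per_sec < 0 or accesses_per_lookup <= 0:
        raise ValueError("rates must be non-negative and accesses positive")
    return random_accesses_per_sec / accesses_per_lookup


def speedup(baseline_rate, candidate_rate):
    """Ratio of two lookup rates, e.g. when moving between platforms."""
    if baseline_rate <= 0:
        raise ValueError("baseline rate must be positive")
    return candidate_rate / baseline_rate
```

Under this model, a memory that sustains more random accesses per second raises the bound in direct proportion, which is why the choice of underlying memory dominates the performance of each platform.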
This flexibility gives the system designer the option to implement the VAE platform that achieves the required system performance.
Many markets are moving to a software-transportable world. While this is common in cloud computing, it is not common for lower-level system functions.
Virtual Accelerator Engines: Software + Firmware + Hardware
The Virtual Accelerator Engine is defined to provide scaled acceleration at the function level of a system. The common API is hardware-agnostic. It can run on a CPU, on an FPGA that is not attached to a specialized IC, or on an FPGA that is attached to accelerator ICs such as the MoSys Accelerator IC family, which includes Bandwidth Engines and Programmable HyperSpeed Engines with in-memory compute capability.
Virtual Accelerator Engines are designed to support a functional platform such as “Packet Classification.” It is “virtual” because it is an abstracted function that can be implemented as standalone software, FPGA RTL, or embedded firmware.
Using MoSys’ common software interface (API) and RTL module interface across multiple hardware environments, system designers can reuse internally developed software code while tuning performance to the level required. In addition, all FPGA-based VAEs use a common RTL that allows hardware transportability. A VAE with a common API can run on a CPU, or behind a common RTL module interface on an FPGA that either is not attached to a MoSys IC or is attached to the MoSys accelerator engine IC.
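To make the reuse argument concrete, here is a minimal Python sketch of application code written once against a common interface and run on interchangeable backends. It is not the MoSys API: `PacketClassifier`, `SoftwareClassifier`, `OffloadClassifier`, and `run_application` are hypothetical names, and the "offload" backend is a software stub that only records that calls were routed to it, where a real backend would program device tables.

```python
from abc import ABC, abstractmethod


class PacketClassifier(ABC):
    """Hypothetical common API; application code depends only on this."""

    @abstractmethod
    def add_rule(self, key, action):
        ...

    @abstractmethod
    def classify(self, key):
        ...


class SoftwareClassifier(PacketClassifier):
    """CPU-only backend: rules live in a dictionary."""

    def __init__(self):
        self._rules = {}

    def add_rule(self, key, action):
        self._rules[key] = action

    def classify(self, key):
        return self._rules.get(key)


class OffloadClassifier(SoftwareClassifier):
    """Stub standing in for an FPGA or FPGA-plus-accelerator backend."""

    def __init__(self, device_name):
        super().__init__()
        self.device_name = device_name
        self.offloaded_lookups = 0

    def classify(self, key):
        # A real backend would query the device; the stub just counts calls.
        self.offloaded_lookups += 1
        return super().classify(key)


def run_application(engine, packets):
    """Application logic written once against the common API."""
    engine.add_rule(("10.0.0.1", 443), "allow")
    return [engine.classify(p) for p in packets]
```

Calling `run_application` with a `SoftwareClassifier` or an `OffloadClassifier` returns the same results for the same packets. Only the object passed in changes, which is the property that lets the software investment carry over between hardware environments.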
The Age of Software-Defined Systems
With the movement to software transportability, not only in the cloud but also in standalone systems, there has been a need for designers to move quickly into new system designs with different performance/price points. This created a need for system designers to drop the application into hardware environments with varying performance levels, and thus design to each of those platforms.
VAE allows engineers to think at the system/application level without worrying, “What does my hardware do?” With “Function Platforms,” these engineers can develop systems that are software-defined first and select hardware from a range of performance environments later.