Accelerating Time-to-Market using an Integrated High-Assurance Software Stack

By Paul Pazandak

Director of Research

Real-Time Innovations (RTI)

By Fabrizio Bertocci

Principal Systems Engineer

Real-Time Innovations (RTI)

September 16, 2022

General-purpose computing, operating systems (OSs), inherent language features (such as manual memory allocation in C), and software quality issues have left systems throughout industry without inherent security and resiliency. The result has been many security breaches, some with dire consequences for national security. Assured systems must be designed with appropriate techniques and tools, applying sound security and engineering principles.

Generally speaking, building an assured system entails a thorough understanding of the problem domain, deep analysis of domain-specific workflows and requirements, careful architectural considerations and design trade-offs, vetted development, proper configuration, and managed deployment of the final product. This level of care will also be needed throughout the product lifecycle.

With respect to system architecture, leveraging hardware and software techniques and tools for enhanced security boils down to applying sound security principles (for example, Open Design, Least Privilege, Separation of Privilege, and Complete Mediation) to suitable targets such as memory access. Other research and development efforts may apply these principles differently to suit their particular environments and design goals.

Building trustworthy, high-assurance systems is complex and costly, and it requires significant expertise. The end goal is a complete software-hardware solution whose components (both individually and collectively) meet your customers' required levels of assurance for safety and security. Those levels vary depending on the standards that apply.

For example, RTCA DO-178C for airborne software airworthiness and ISO 26262 for road vehicles, including autonomous vehicles, each define multiple levels of certification corresponding to the criticality of the role that a component plays. Within DO-178C, for instance, there are five levels:

  • Level A (Catastrophic): Prevents continued safe flight or landing, many fatal injuries
  • Level B (Hazardous/Severe): Potentially fatal injuries to a small number of occupants
  • Level C (Major): Impairs crew efficiency, discomfort, or possible injuries to occupants
  • Level D (Minor): Reduced aircraft safety margins, but well within crew capabilities
  • Level E (No Effect): Does not affect the safety of the aircraft at all
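
As a minimal sketch of how the five levels above might be represented in a planning tool, the following Python maps each failure-condition category to its level and picks the most stringent level for a component. It assumes, as a simplification, that a component takes the level of the most severe failure condition it can contribute to. The names `SoftwareLevel`, `FAILURE_CONDITION_TO_LEVEL`, and `required_level` are hypothetical and do not come from any standard or tool.

```python
from enum import IntEnum


class SoftwareLevel(IntEnum):
    """The five DO-178C levels; a lower value means a more stringent level."""
    A = 1  # Catastrophic
    B = 2  # Hazardous/Severe
    C = 3  # Major
    D = 4  # Minor
    E = 5  # No Effect


# Hypothetical lookup table from failure-condition category to level.
FAILURE_CONDITION_TO_LEVEL = {
    "catastrophic": SoftwareLevel.A,
    "hazardous": SoftwareLevel.B,
    "major": SoftwareLevel.C,
    "minor": SoftwareLevel.D,
    "no effect": SoftwareLevel.E,
}


def required_level(failure_conditions):
    """Return the most stringent level among the given failure conditions."""
    levels = [FAILURE_CONDITION_TO_LEVEL[c.lower()] for c in failure_conditions]
    if not levels:
        raise ValueError("at least one failure condition is required")
    # Assumption: the most severe failure condition drives the level.
    return min(levels)
```

A component that could contribute to both a minor and a hazardous failure condition would, under this assumption, be developed to Level B.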

It is simply too costly in terms of funding and time to build a high-assurance system from scratch. Instead, the goal should be to develop as little code as possible. The more proven or certifiable code that one can acquire or license, the less one will need to design, develop, maintain, and certify. This expedites development and significantly lowers costs. A high-assurance software stack provides this (Figure 1).

[Figure 1 | A high-assurance software stack allows engineering organizations designing safety-critical systems to develop as little code as possible.]

The role of this stack is to provide a proven foundation. It is composed of a real-time operating system (RTOS) that has been verified or certified (a safety RTOS) and a distributed communications middleware. 

Foundations of a High-Assurance Software Stack

For the last six years, under DARPA research funding, RTI has been working on a verified stack for embedded systems to accelerate safety and security accreditation. In this stack we use RTI's certifiable Connext Software Framework. RTI Connext supports the Object Management Group Data Distribution Service standard (OMG DDS). Connext is running in nearly 2,000 critical systems today, spanning avionics/defense, autonomous systems, medical robotics, energy, and industrial systems. Building on the OMG DDS open standard makes it possible to rapidly assemble loosely coupled (distributed) software components into a working system.

For the safety RTOS, we chose the open-source seL4 separation kernel (sel4.systems). It is a microkernel that has been mathematically proven correct, and it provides both time and space separation between running processes. It guarantees that there will be no unintended data leakage between processes and that one process cannot impact the operation of another. This provides greater system resilience and security, which are also attributes of a multiple independent levels of security (MILS) solution.

Derivatives of seL4 are being used by several large technology companies today.

The Need for a Secure Microkernel

To understand the need for a secure microkernel like seL4, it is helpful to start with a closer look at kernel design principles in general.

As shown in Figure 2, there are two main kernel design approaches: the monolithic kernel and the microkernel. In the former, all code required for providing typical OS services is implemented directly in the kernel itself. The kernel executes in the privileged mode of the hardware, meaning that all of this code is granted unrestricted access to, and control of, all system resources.

[Figure 2 | If designed correctly, a microkernel operating system (OS) contains far less code than a monolithic architecture, which reduces the attack surface, simplifies compliance, and more.]

This type of implementation can benefit overall system performance, but it becomes dangerous if any kernel component malfunctions, because an attacker could exploit that state. A prominent example is the Linux kernel: with more than 20 million lines of code, it can be expected to contain a certain number of bugs that provide potential attack channels.

In contrast, the microkernel design addresses this drawback by drastically reducing the trusted computing base (TCB), meaning the subset of code in the overall system that must be trusted to operate correctly. A microkernel follows the design principle of having the kernel contain only the most fundamental mechanisms (for example, inter-process communication and scheduling). All remaining OS functionality is moved to unprivileged user mode, where it runs encapsulated within isolated sandboxes.

This approach protects the kernel and each isolated process from outside interference, allowing only communication that has been explicitly permitted. For a well-designed microkernel like seL4, this means that the code base can be reduced to the order of ten thousand lines of code, which drastically shrinks the attack surface.
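
To illustrate the idea of allowing only explicitly permitted communication, here is a toy Python model, offered as a sketch rather than a description of how seL4 is implemented. It assumes a simple capability table in which a component may send or receive on an endpoint only if it has been granted that right. `ToyKernel`, its methods, and the component names are all hypothetical and are not part of the seL4 API.

```python
from collections import defaultdict, deque


class ToyKernel:
    """Toy model of capability-mediated message passing (not the seL4 API)."""

    def __init__(self):
        self._caps = set()                 # (component, endpoint, right) triples
        self._queues = defaultdict(deque)  # pending messages per endpoint

    def grant(self, component, endpoint, right):
        """Explicitly permit a component to 'send' or 'recv' on an endpoint."""
        if right not in ("send", "recv"):
            raise ValueError(f"unknown right: {right}")
        self._caps.add((component, endpoint, right))

    def _check(self, component, endpoint, right):
        # Every access is checked; anything not granted is denied by default.
        if (component, endpoint, right) not in self._caps:
            raise PermissionError(f"{component} lacks '{right}' on {endpoint}")

    def send(self, component, endpoint, message):
        self._check(component, endpoint, "send")
        self._queues[endpoint].append(message)

    def recv(self, component, endpoint):
        self._check(component, endpoint, "recv")
        queue = self._queues[endpoint]
        return queue.popleft() if queue else None


# Hypothetical components: only the granted paths work.
kernel = ToyKernel()
kernel.grant("sensor_driver", "telemetry", "send")
kernel.grant("flight_logic", "telemetry", "recv")
kernel.send("sensor_driver", "telemetry", {"altitude_m": 1200})
reading = kernel.recv("flight_logic", "telemetry")
```

In this sketch, a component that was never granted a right on `telemetry` gets a `PermissionError`, mirroring the deny-by-default stance described above.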

seL4 and DDS: A Reliable Combination

The purpose of seL4 is to provide a reliable, safe, and secure foundation for applications that require it. This includes, for example, military systems, medical devices, robotics, autonomous vehicles, and energy systems. Without exception, these high-assurance applications require a reliable and robust distributed communications capability, which is not provided by seL4.

OMG DDS for Real-Time Systems is a real-time, secure, loosely coupled publish/subscribe connectivity framework for distributed systems. It is well suited as the communications layer for high-assurance systems, including those built on a safety RTOS such as seL4. While other open-source and commercial off-the-shelf communications frameworks exist, they lack high-assurance certification and at best provide rudimentary all-or-nothing security.

For DDS, seL4 creates an enriched, lower-cost, smaller-footprint, high-assurance foundation. For seL4, DDS provides an open, standards-based communications protocol.

DDS drastically simplifies seL4 inter-component and application development, reduces associated costs, and promotes component interoperability in the seL4 development community. It standardizes data distribution in a more consistent, secure, and efficient manner, and its publish-subscribe model enables easier, faster, and more secure distributed system development. Application developers are relieved of the burden of creating their own piecemeal, perhaps proprietary, one-off solutions for message-based communications and for deciphering message sequences. They can focus on domain-specific components and rely on DDS to provide standardized, secure interaction with other (local and remote) entities in the system.
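
The publish-subscribe model described above can be sketched with a toy, in-process data bus in Python. This is only an illustration of the decoupling idea: it is not the OMG DDS API or the RTI Connext API, and it omits discovery, quality of service, security, and networking. `ToyDataBus`, the topic name, and the field names are hypothetical.

```python
from collections import defaultdict


class ToyDataBus:
    """Toy topic-based publish-subscribe bus (not the DDS API)."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic name -> callbacks

    def subscribe(self, topic, callback):
        """Register interest in a topic; the subscriber never names a publisher."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, data):
        """Deliver a copy of the data to every subscriber of the topic.

        Returns the number of subscribers that received the sample.
        """
        callbacks = list(self._subscribers.get(topic, []))
        for callback in callbacks:
            callback(dict(data))  # copy so subscribers cannot alter each other's view
        return len(callbacks)


# Hypothetical usage: a logger and a display both consume the same topic.
bus = ToyDataBus()
logged, displayed = [], []
bus.subscribe("VehiclePosition", logged.append)
bus.subscribe("VehiclePosition", displayed.append)
delivered = bus.publish("VehiclePosition", {"lat": 37.4, "lon": -122.0})
```

The publisher only names a topic and never references its consumers, so components can be added or replaced without changing the code that produces the data.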

Reducing Barriers to Entry for High-Assurance Software

DDS will significantly reduce the barriers to entry for companies and developers that decide to use seL4 and CAmkES, the component framework for seL4, because it provides an abstraction layer that hides most of the complexity of developing applications on top of seL4. It will also significantly reduce development time and the need for in-house seL4 subject-matter expertise.

To get started, we have provided links to a number of general resources about seL4 and DDS below.


Paul Pazandak is Director of Research at RTI.

Fabrizio Bertocci is Principal Systems Engineer at RTI.

Real-Time Innovations (RTI) www.rti.com

Twitter: @rti_software

LinkedIn: www.linkedin.com/company/rti

YouTube: www.youtube.com/user/RealTimeInnovations

Resources:

RTI Connext and the Object Management Group Data Distribution Service (DDS):

seL4 and Access to seL4 Source Code:

Software Certification Standards

Flight Safety

Autonomous Vehicles:

Medical Devices (FDA):