Design
Safe and Secure Virtualization for the Next Generation of Avionics
The next generation of Integrated Modular Avionics (IMA-NG) is being developed under the EU-funded SCARLETT project. It requires both multiple application execution environments, of varying degrees of criticality, and the ability to reconfigure resources on the platform, while still meeting safety standards. Chris Hills examines how Safe and Secure Virtualization (SSV) can meet these demanding requirements.
Avionic systems were first developed as a series of discrete Line Replaceable Units (LRUs), each with its own housing, power supply, hardware and software, and forming part of a federated architecture. This approach became untenable: size, weight, power consumption and communication complexity all ballooned. Each LRU required its own development, certification and update process and its own repertoire of spare parts.
##IMAGE_2_C##
Figure 1: The IMA1G architecture
The next step was Integrated Modular Avionics (IMA). Here generic Core Processing and Input Output Modules (CPIOMs) are integrated into an Avionics Bay sharing the power supply and the communications connection. Within the CPIOM platform, specific services are decoupled from application level code. Applications with different criticality levels run concurrently in a partitioned execution environment.
Since multiple function suppliers needed to integrate their applications, the aerospace community has developed standards which address Application Programming Interfaces (APIs), module configuration, the data loading protocol and file formats, as well as integration and certification aspects of an IMA system. The IMA approach decreased the number of processing units, compared to the federated/LRU approach, despite an increase in the number of functions.
Next generation
The aviation industry is continuing to develop: new aircraft programmes are increasing, responding to the demand for more air services; the development cycle of a new aircraft is decreasing, from a typical 10 years to closer to 5, bringing increased pressure for reuse of components; electronics is an increasing part of the overall initial cost of an aircraft.
Total cost of aircraft services is under pressure, requiring the reduction of development and maintenance costs, reduction of certification costs (by means of incremental certification), reduction of energy costs and an increased availability.
While IMA (now often referred to as IMA First Generation, or IMA1G) is an important and valuable step towards part reduction across different aircraft families, and is deployed in the A340 and A400M, it is still limited to a small set of aircraft functions.
Today, under the EU-funded SCARLETT (SCAlable & ReconfigurabLe elEctronics plaTforms and Tools) project, a large number of European companies and universities are developing the next generation of Integrated Modular Avionics (IMA-NG or IMA2G).
##IMAGE_3_C##
Figure 2: The IMA-NG architecture
Within IMA-NG, some things are already established. A core concept is Distributed Modular Electronics (DME) with application processing and I/O functions in separate modules. The processing modules (CPMs) will need to provide more processing bandwidth and, just as elsewhere, this will be through multi-core. However, the aerospace industry has still to reach common ground on how to reach the same level of determinism with multi-core CPUs as is achievable today with single-core processors.
IMA1G uses standards like ARINC 653 for the application programming interface and ARINC 615 for data loading. IMA-NG will extend these, both to improve further the level of independence of applications from the underlying platform and to improve the configuration process. This will simplify incremental certification (certifying new applications as they are added rather than having to recertify the whole system) as well as helping to meet the new DO-178C requirements.
Reconfiguration
A significant part of the current activity is looking at reconfiguration. The idea is that redundancy and resource allocation will be managed dynamically at the platform level rather than statically at system level, to increase overall aircraft availability with a reduced number of spare parts. Under consideration are:
Reconfiguration both on ground and in flight.
Pre-qualified versus algorithm based reconfiguration.
Supervised versus automatic reconfiguration.
Sharing of spare parts between multiple functions.
Graceful degradation of functionality.
Additional work is going forward to reduce development and maintenance costs for the entire development, integration, certification and maintenance process, by creating a single, consistent tool chain. This could include:
Common software development and debug tools.
Platform simulation tools.
Application pre-qualification.
Platform configuration and reconfiguration.
Analysis of the dynamic platform behaviour.
Automated tests.
Currently reconfiguration uses time schedule switching. Consider two CPMs, both connected through a Remote Data Concentrator (RDC) to actuators and sensors. Each CPM is configured with active partitions and also has non-active partitions (partitions with all resources but no time slot allocated in the current time schedule). Execution of active partitions is governed by a time schedule, defined by an overall logical configuration table. This has, for both CPMs, a description that includes all the possible and validated time schedules, and all partitions, active and non-active, for each time schedule.
Under the current time schedule, an active partition on CPM A is already replicated, but non-active, on CPM B. When the partition needs to run on CPM B, a new time schedule is set from the configuration table that has the partition active on CPM B and non-active on CPM A. Before this can happen, it may be necessary to download a memory image from a remote server to CPM B.
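The switching scenario above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table layout, the partition names ("nav", "fuel", "cabin"), the schedule names and the function switch_schedule are all hypothetical, and a real platform would do this inside the certified RTOS, not in application code.

```python
from dataclasses import dataclass, field


@dataclass
class Module:
    """A CPM: resident partitions hold resources, active ones hold a time slot."""
    name: str
    resident: set
    active: set = field(default_factory=set)


# Hypothetical logical configuration table: each validated time schedule
# lists the active partitions per CPM. A partition that is resident but
# not listed is non-active (all resources, no time slot).
CONFIG_TABLE = {
    "nominal": {"CPM_A": {"nav", "fuel"}, "CPM_B": {"cabin"}},
    "failover": {"CPM_A": {"fuel"}, "CPM_B": {"cabin", "nav"}},
}


def switch_schedule(modules, schedule_name):
    """Apply a validated schedule; refuse anything outside the table."""
    schedule = CONFIG_TABLE.get(schedule_name)
    if schedule is None:
        raise ValueError(f"schedule {schedule_name!r} is not in the table")
    # Check every CPM first so a rejected switch leaves all modules untouched.
    for cpm, wanted in schedule.items():
        missing = wanted - modules[cpm].resident
        if missing:
            # In the scenario above, this is the point where a memory
            # image would first have to be downloaded to the CPM.
            raise RuntimeError(f"{cpm} has no image for {sorted(missing)}")
    for cpm, wanted in schedule.items():
        modules[cpm].active = set(wanted)


modules = {
    "CPM_A": Module("CPM_A", resident={"nav", "fuel"}),
    "CPM_B": Module("CPM_B", resident={"cabin", "nav"}),
}
switch_schedule(modules, "nominal")   # nav runs on CPM A
switch_schedule(modules, "failover")  # nav now runs on CPM B instead
```

The design point the sketch tries to capture is that only schedules already present in the table can be selected, so the switch never creates a configuration that has not been validated beforehand.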
There are issues with this scenario. It is based on a non-overlapping model of memory mapping for the partitions. This simplifies the tool chain and configuration management by having a uniquely identified partition for any given address in a CPM. However, since the virtual address and physical address of an application are the same, memory mapping of a partition is dependent on the module configuration, making the binary code for a partition dependent on the CPM configuration. This means it is impossible to share or reuse a partition’s binary image. And if, to improve performance, it is decided that part of a driver needs to be merged into the trusted code, each configuration of each CPM needs to be recertified.
Adding a new partition function, which uses trusted specific low level services to address hardware not used by other partitions, has major consequences. It immediately invalidates the configuration certification, and each partition has a specific binary image for each configuration of each CPM. So the size of each load is increased by the size of each binary image for the function. For every configuration of every CPM, even if it is not active, there is a specific memory requirement, increasing allocated memory, which can dramatically increase RAM requirements. In summary, without a new approach, reconfiguration generates new constraints and requires additional resources.
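To give a feel for the load growth described above, here is a small arithmetic sketch. The figures are invented for illustration, and it makes the simplifying assumption that every configuration-specific image of the function has the same size; the function name is hypothetical.

```python
def load_increase_kib(image_size_kib, configs_per_cpm):
    """Load growth when each configuration of each CPM needs its own image.

    configs_per_cpm maps a CPM name to its number of configurations.
    Assumes (for illustration only) that all images have the same size.
    """
    image_count = sum(configs_per_cpm.values())
    return image_count * image_size_kib


# Hypothetical figures: one 512 KiB function, two CPMs with 3 and 4
# configurations, so 7 distinct binary images have to be carried.
increase = load_increase_kib(512, {"CPM_A": 3, "CPM_B": 4})
print(increase)  # 7 * 512 = 3584
```

The growth is linear in the number of CPM configurations, which is why adding reconfiguration options under this model quickly inflates both load size and memory requirements.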
The IMA-NG goals of providing high availability, with a minimal number of spare parts, mean that all modules must accept a full range of functions. Applications, which are today part of the Open World Information System, use open standards like POSIX, Java and even Linux. Adding these alongside critical applications on a CPM increases overall efficiency but requires flexible and adaptive platform APIs. However, it is impossible to create a single monolithic middleware which provides all the required APIs for multiple execution environments and yet can be certified to the highest DO-178B criticality level.
SSV
Reconfiguration in avionics requires multiple execution environments, hard partitioning and deterministic real-time behaviour. Safe and Secure Virtualization (SSV) appears to be a way to achieve this.
Virtualization is not new: mainframe computers in the early 1970s were creating an abstract platform of the physical resources to improve sharing and utilization (see Figure 3: Monolithic versus Virtual Platform RTOS). Two approaches to virtualization are applicable to critical systems: full virtualization, with hardware support, and paravirtualization.
##IMAGE_4_C##
Figure 3: Monolithic versus Virtual Platform RTOS
With full virtualization the complete hardware is simulated, allowing a complete operating system, like Windows XP, to run without modifications. However, full virtualization is highly dependent on the underlying processor features and is difficult to integrate into the ARINC 653 partitioning concept. As partitioning is then totally dependent on extremely complex processor features, it cannot be addressed by established certification processes like DO-178B or DO-254.
Paravirtualization uses a Virtual Platform (micro kernel or hypervisor based), which is rich enough to implement a complete operating system above it. It does not use support from the hardware, so it can be implemented on multiple CPU families. A guest operating system has to be ported to the Virtual Platform, which requires source code, but once ported it can run on all implemented hardware.
A Virtual Platform (VP) manages all aspects of partitioning, and a carefully designed VP concept can fully support both ARINC 653 partitions and partitions running guest operating systems. For certification, each application execution environment is seen as user-level code, and need only be certified to the level required by the hosted applications: even Linux and its applications running in a partition may be regarded as just another complex application. The VP, of course, has to be certified to the highest level of all applications hosted by the module. The MILS separation kernel architecture maps well onto an SSV RTOS architecture, with both reaching the highest security level. Since SSV reduces the security kernel to the smallest possible size, it becomes eligible for formal verification.
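The certification rule for the VP can be written down as a one-line selection. The sketch below assumes the DO-178B software levels A (most stringent) to E; the partition names and the function required_vp_level are hypothetical and purely illustrative.

```python
# DO-178B software levels, ordered from most to least stringent.
LEVELS = ("A", "B", "C", "D", "E")


def required_vp_level(hosted_levels):
    """Return the level the Virtual Platform must meet on one module.

    hosted_levels maps a partition name to the level of its application.
    """
    if not hosted_levels:
        raise ValueError("module hosts no applications")
    for name, level in hosted_levels.items():
        if level not in LEVELS:
            raise ValueError(f"unknown level {level!r} for {name}")
    # The most stringent hosted level has the lowest index in LEVELS.
    return min(hosted_levels.values(), key=LEVELS.index)


# Hypothetical module: a critical function next to a Linux guest.
hosted = {"flight_ctrl": "A", "maintenance": "C", "linux_guest": "E"}
vp_level = required_vp_level(hosted)  # "A"
```

Note that the guest partitions keep their own, lower levels; only the VP underneath them is driven up to the most stringent level on the module.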
As discussed earlier, while multi-core technology is the path to higher performance, it is still not at the same level of determinism as single-core processors. With an SSV RTOS architecture, where application code is not linked to the hardware implementation, multi-core CPMs become possible.
SSV and reconfiguration
With SSV, the virtual address of an application is the same for all instances, whatever its location in the physical space: different configurations of the same application partition within CPMs running the same RTOS require only a single binary image and a single certification.
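A minimal sketch of why one binary image is enough: the image is linked against a fixed virtual base, and each CPM configuration supplies only a different physical base. All addresses, sizes and names here are hypothetical, and a real VP would program the MMU, not translate addresses in software.

```python
class PartitionMapping:
    """Maps one partition's fixed virtual range onto a physical range."""

    def __init__(self, virt_base, phys_base, size):
        self.virt_base = virt_base
        self.phys_base = phys_base
        self.size = size

    def translate(self, vaddr):
        """Return the physical address for a virtual address in range."""
        offset = vaddr - self.virt_base
        if not 0 <= offset < self.size:
            raise ValueError(f"address {vaddr:#x} is outside the partition")
        return self.phys_base + offset


# The same image, linked at the same (hypothetical) virtual base,
# placed at different physical locations on two CPMs.
VIRT_BASE = 0x4000_0000
SIZE = 0x10_0000  # 1 MiB partition

on_cpm_a = PartitionMapping(VIRT_BASE, phys_base=0x0020_0000, size=SIZE)
on_cpm_b = PartitionMapping(VIRT_BASE, phys_base=0x0080_0000, size=SIZE)

entry = VIRT_BASE + 0x10  # an address baked into the binary image
print(hex(on_cpm_a.translate(entry)))  # 0x200010
print(hex(on_cpm_b.translate(entry)))  # 0x800010
```

Only the mapping object differs between the two CPMs; the address seen by the application code is identical, which is what makes the image reusable across configurations.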
Drivers are usually implemented at the user level, and have no impact on the separation kernel or on other trusted code components, although real-time aspects will need careful consideration.
If we look back at the issues of reconfiguration discussed earlier, since a new partition has only one specific binary image for all configurations, the total size of loads is increased by the size of just one binary image for the function plus some specific configuration table descriptions, including the link to the shared binary image.
##IMAGE_5_C##
Figure 4: MILS separation kernel architecture
The new partition is configured only where it is active and each channel or file system is statically allocated for all configurations. The size of the required memory for the CPMs is the new MAX value of the size to be allocated in memory for the whole set of configurations. And the MAX value for one configuration is only the SUM of requirements for active partitions plus requirements for all channels and file systems in the configuration.
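The MAX-of-SUM rule above can be checked with a short calculation. The configurations and sizes below (in KiB) are invented for illustration, and the function names are hypothetical.

```python
def config_memory(active_partitions, channels, file_systems):
    """SUM for one configuration: active partitions, channels, file systems."""
    return (sum(active_partitions.values())
            + sum(channels.values())
            + sum(file_systems.values()))


def required_memory(configurations):
    """MAX over the whole set of configurations of a CPM."""
    return max(config_memory(**cfg) for cfg in configurations.values())


# Hypothetical configurations of one CPM, all sizes in KiB.
configurations = {
    "nominal": {
        "active_partitions": {"nav": 2048, "fuel": 1024},
        "channels": {"ch0": 64},
        "file_systems": {"fs0": 256},
    },
    "failover": {
        "active_partitions": {"fuel": 1024},
        "channels": {"ch0": 64},
        "file_systems": {"fs0": 256},
    },
    "degraded": {
        "active_partitions": {"nav": 2048, "fuel": 1024, "cabin": 512},
        "channels": {"ch0": 64, "ch1": 32},
        "file_systems": {"fs0": 256},
    },
}

print(required_memory(configurations))  # 3936, set by "degraded"
```

Because non-active partitions contribute nothing to the sum, the CPM is sized for its largest single configuration and not for the total of all of them.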
Safe and Secure Virtualization can already meet IMA1G requirements. But to implement reconfiguration efficiently while remaining compatible with the incremental certification that IMA-NG requires, it looks as though SSV will be essential.
RTOS choice
Today PikeOS is the only COTS product that implements the SSV concept. Users have already deployed major security- and safety-critical systems, using PikeOS on various personalities, architectures and hardware platforms at different levels of certification.
By Chris Hills, CTO of Phaedrus Systems.