Micros
Moving Microcontrollers Towards Multicore
Microcontrollers have historically been the workhorse of embedded systems, and it’s not uncommon for a single system to feature multiple devices. But is this a good enough reason to adopt a multicore approach? And if it is, what architecture do you choose? ES Design Magazine Editor Philip Ling investigates one possible solution.
Even though MCU manufacturers are reacting to the trend towards 32bit devices by dedicating an increasing amount of their own resources to software development, many OEMs must now contemplate how a move to the 32bit domain will impact their own development teams.
Choosing the right architecture is clearly critical and many OEMs will gravitate towards the ARM camp, not least because of the growing number of IDMs now producing standard products based on this widely deployed architecture, and the equally buoyant ecosystem that exists around it.
The drive towards 32bit architectures in microcontroller applications may have helped turn ARM’s licensable IP into one of the most widely deployed cores in standard products, and it may appear that standard parts are the new ASIC. However, for OEMs looking to deploy embedded products in volume it may still make financial sense to license a core as IP and deploy it in an ASIC. The custom route clearly has significant NRE costs, so standardising on a common architecture would help minimise them. But while ARM is perhaps the obvious choice, there are other options to consider.
A Growing Field
From an IP licensing point of view, there are perhaps more providers than expected to consider in the field of 32bit microcontroller cores. Some make use of the massively popular 8051 8bit instruction set, ported to the extended address map enabled by 32bits; an option that could have benefits in terms of familiarity for software developers. There are also vendors that have taken a ‘ground-up’ approach, developing proprietary architectures in an effort to differentiate themselves. This has the clear advantage of not being confined by legacy support, but it does arguably require a greater R&D effort.
One such company is Cortus, based in Montpellier, France. It has recently extended its range of 32bit cores targeting microcontroller applications with the APS5; a multicore-capable version, which will sit at the high-performance end of the product line alongside the company’s floating-point option (FPS6) and above the existing APS1, APS3 and APS3R cores.
While most of the details are only released under NDA, Cortus’ Product Manager, David Kerr-Munslow, gave some insights into the new core and the company’s strategy. The range of cores is clearly intended to provide a solution from entry-level to high-performance, and as the company doesn’t offer standard parts it is competing directly with other IP vendors. Predominantly its customers are deploying the cores in ASICs, although FPGA prototyping is also possible.
So why choose now to launch a high-performance, multicore version of the core? Kerr-Munslow explained: “Our cores have a very modest silicon footprint; this means that a multicore processor sub-system may well still be the smallest IP on the ASIC.” When faced with demand for more processing power, Kerr-Munslow believes developers have two choices: “Increasing the clock rate increases the power requirements and may require a more expensive technology node or face a poorer yield. Adding more processors gives more flexibility, and we believe that multicore systems are, in a large number of applications, the best and most cost-, power- and silicon-efficient way to increase processing performance — even though our cores can achieve at least 400MHz in 90nm and retain throughput thanks to the sophisticated out-of-order completion pipeline.”
According to Kerr-Munslow, the APS architecture has been based on providing what an embedded system needs and what a compiler wants: “It is based on a ‘ground-up’ analysis and design of a compiler/processor coupling for embedded systems.” He explained that the Cortus design teams have many years of experience in both processor design and embedded systems, and had become frustrated with the poor suitability of the ‘old’ 8bit processors generally found in embedded systems for modern applications and programming techniques. “A typical example is the Bluetooth stack, which requires a program size that exceeds the 64kbyte address space of an 8bit architecture, thereby requiring bank switching.”
This also illustrates why better support for high-level languages has become essential: “While C can be made to work for 8bits, it is not a happy state of affairs. Pointers are a typical example: with bank-switched memory included in the mix, how many registers and register-memory transfers does it require to manipulate a simple pointer?”
Making The Move
Moving to a 32bit family that isn’t code-compatible with your existing solution will inevitably require some legacy software to be ported. For applications using microcontrollers, this software will most likely be closely coupled to the hardware platform and could even be in a low-level language. However, Kerr-Munslow doesn’t see this as a real issue: “I think this is a false problem. A large amount of legacy code is already in C, although it may be in a processor-specific dialect with proprietary extensions.”
He sees the issue being support for hardware-specific features: “In reality the majority of the porting effort in embedded systems is related to the hardware and peripheral interfaces. This is why we offer bridges to the bus standards of the legacy peripherals that the customer might have. The customer can upgrade their processing core and yet retain their familiar legacy peripherals; possibly adding USB or Ethernet on an APS bus.”
According to Kerr-Munslow, the move to a high-performance core also removes the need to optimise code, in many cases: “The compiler and processor were developed together; this means that the code generated by the compiler is very efficient. If a customer has a specific algorithm that would benefit from optimisation, we have a very easy to use co-processor interface. This is very closely coupled and integrated with the pipeline and out-of-order completion, so co-processor instructions can take advantage of the pipelining.”
While the APS3R also supports dual-core configuration (and the FPS6 is dual- and multi-core capable), the APS5 offers more expandability: “We also support — and encourage — heterogeneous systems,” added Kerr-Munslow. In theory, the shared memory and coherent data cache approach taken by Cortus can support an arbitrary number of cores, but Kerr-Munslow acknowledges the practical limits: “In reality things become increasingly inefficient when the number of CPUs rises above four, and eight might be considered a realistic and sensible limit.”
While it may still be relatively uncommon to use multicore MCUs in embedded systems — at least from a standard product point of view — the practice is emerging; many IDMs have published roadmaps that include heterogeneous multicore MCU variants in their main (mostly ARM Cortex based) families. It would seem inevitable that the embedded space will move more towards this model, based on the benefits seen in other sectors, so perhaps the only question is when.