1. Introduction

An extrapolation from today's high-performance deep-submicron electronic devices implies the need to pack ever-increasing amounts of power into very small volumes. Consequently, if left unchanged over the next decade, system temperatures would rise from the equivalent of a warm dinner plate to that of the surface of the sun. The design of systems and complex integrated circuits (ICs) in sophisticated deep-submicron technologies (65 nm and below) requires highly innovative solutions to control dynamic and static power consumption.

Low-power design is the enabling technology and a critical prerequisite for the technical and commercial success of many future applications, especially those in mobile telecommunications. Previous design solutions involved tradeoffs between delay, throughput and area; new complex system design solutions must now balance delay, throughput and power consumption. The MEDEA+ 2A708 LoMoSA project (Low-power expertise for Mobile & multi-media System Applications) set out to take on this challenge.

As of the end of 2008, the LoMoSA consortium consisted of world-class experts from industry (NXP, STM, Thales, Thomson, ST-NXP Wireless), a number of university research labs and institutes (CEA-LIST, CEA-LETI, TIMA, AlaRI and the University of Cantabria) and one SME (DS2).

2. Goals and ambitions of LoMoSA

LoMoSA aimed to create the required low-power expertise for mobile and multimedia applications by initiating the development of a European low-power System-on-Chip (SoC) platform. This platform was to consist of an interacting combination of (architectural) models, design flows and methodologies, hardware design components, embedded software and test-benches. The project focused not only on bus-controlled SoCs; on-chip communication-based solutions for multi-processor SoC infrastructures were also investigated, using the concept of hardware-dependent software (HdS).

The quantitative target of the LoMoSA project was to reduce overall system power consumption, i.e. active and standby power, by up to 70% by the end of 2008. The reference was the power consumption of (mobile) systems with comparable functionality at the start of the project (Jan. 2005).

To meet the above objectives, the necessary work was divided into 5 work packages (WPs) with the following main themes:

  • WP 1: Survey and definition of platform architectures
  • WP 2: Hardware platform components
  • WP 3: Hardware-dependent Software (HdS) & Real-time Operating Systems (RT-OS)
  • WP 4: Design methodology and methods
  • WP 5: Definition and development of joint mobile & multimedia demonstrators

3. Overall project results – LoMoSA Success Stories

Technological Innovations - Achievements

The major technological innovations in LoMoSA concerned:

  • Definition of integrated low-power optimised platform architectures
  • Development of a power-aware design methodology and methods
  • Development of reusable, power-efficient digital and analogue HW components that are the building blocks of the platform
  • Development of HdS technology for the application driven design of architectures built on top of NoC (Network-on-Chip) as an enabler of 65 nm technology platforms for European SoC applications

Validation of the newly developed technologies was done through a set of jointly developed practical application demonstrators at the end of the project.

Status of LoMoSA goals at the end of the project

Following some major reductions in partner contributions, caused by funding difficulties in Germany and Belgium, not all initially planned activities in LoMoSA could be executed. The German partners (Infineon and 5 German universities) had to leave the consortium at the very start of the project, and the Belgian partners (Philips, IMEC and Target) left by the end of 2005.

At the end of LoMoSA it can be stated that:

  • Qualitative goals of the project were reached. The quantitative goal of >70% power reduction had to be adjusted. For some LoMoSA results, under specific use cases, the power reduction comes close to or even exceeds 70%. In general, the quantitative goal should read: power reduction of up to 70%. For a number of demonstrators this quantitative goal has been met.
  • Project milestones & work package deliverables have been achieved, except for a part of one milestone (M15), which will be finished by the French partners in the first half of 2009.
  • There has been good cooperation between the partners in all work packages.
  • A number of publications & contributions to workshops, exhibitions and conferences have been reported. LoMoSA members are involved in the organization of a number of conferences that cover the low-power domain (ISLPED, PATMOS, DATE, MEDEA+ DAC, etc.).

In mid-2008 the LoMoSA project was affected by major reorganizations within NXP, Thomson and ST. The newly formed ST-NXP Wireless joint venture (split off from NXP and STM) was introduced as a new partner in LoMoSA in October 2008. A large part of the LoMoSA activities of NXP (NL, F) and part of the ST activities are now housed under this new top-3 vendor of silicon content for mobile phones.

The LoMoSA team will continue its successful cooperation after 2008 in a new low-power initiative under the Catrene flag, the COMCAS* project. The project has been labelled and is now in the "negotiation phase".

* COMCAS = COmmunication-centric heterogeneous Multi-Core ArchitectureS.

Industrial exploitation

All partners are exploiting (or will do so in the short term) the innovative results, i.e. ICs, methodologies and tools. For Thomson, however, the decision of Thomson management to stop its silicon components activities (TSC = Thomson Silicon Components) from July 2008 onwards means that the results developed by TSC in the frame of LoMoSA will not (directly) be exploited by Thomson.

Market relevance

The market relevance of LoMoSA is high, and low power consumption will remain one of the key market drivers for SoCs in the years to come.

Results WP1

General

The major outcome of WP1 is the architectural template document. This document includes contributions from many LoMoSA partners and describes most of the activities to be covered or extended in the LoMoSA project. The architectural platform and the associated concepts it presents will be suitable for a variety of standards and future applications.

This document, in which most LoMoSA partners were directly or indirectly involved, sets the constraints for the development of the platform components generated in WP2, WP3 and WP4.

As a reminder, this document covered the following subjects: 

  • NoC (architecture, topology, security, etc.)
  • Power management on NoC
  • Guidelines & recommendations for power-aware HdS architectures & design
  • Architectural template for low-power mobile multimedia & HdS recommendation
  • Processors & multiprocessor systems, architecture & design
  • Power management in processors & multiprocessor systems

STMicroelectronics:

Thanks to LoMoSA, the AST division of STMicroelectronics was able to promote and demonstrate the feasibility of power-aware multiprocessors and associated NoC solutions within the company, with the goal of embedding such solutions in next-generation chips. Moreover, this made it possible to extend the work to novel architectures and low-power management techniques.

In this project, STMicroelectronics worked on a shared-memory node and showed that it is well suited to computationally intensive applications with a balance of control and data processing. Scalability allows the level of performance to be increased, although it is limited to a small number of processors. The programming model is simple and ‘familiar’, based on thread-level concurrency and lock-based synchronization.

At the architectural level, low-power management is present by construction (several processors at low frequency consume less than one at high frequency), with additional features at the multiprocessor level to disconnect a node and shut it down. Some preliminary studies were done on software management of low power, to control load balancing or to switch processors on and off.
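
The "by construction" argument can be illustrated with the usual first-order CMOS dynamic power model, P = C·Vdd²·f. The sketch below is only an illustration, not a LoMoSA result: the capacitance, the voltage and frequency points, and the premise that a lower clock frequency permits a lower supply voltage are all hypothetical, and the workload is assumed to parallelise perfectly across the cores.

```python
def dynamic_power(c_eff, vdd, freq_hz):
    """First-order CMOS dynamic power: P = C_eff * Vdd^2 * f (watts)."""
    return c_eff * vdd ** 2 * freq_hz


# Hypothetical operating points, chosen only for illustration.
C_EFF = 1e-9               # effective switched capacitance per core (farads)
F_HIGH = 400e6             # single core at full speed (Hz)
V_HIGH = 1.2               # supply assumed necessary at F_HIGH (volts)
N_CORES = 4
F_LOW = F_HIGH / N_CORES   # each core runs at a quarter of the frequency
V_LOW = 0.8                # supply assumed sufficient at F_LOW (volts)

single_core = dynamic_power(C_EFF, V_HIGH, F_HIGH)
multi_core = N_CORES * dynamic_power(C_EFF, V_LOW, F_LOW)

print(f"one core  @ {F_HIGH / 1e6:.0f} MHz: {single_core * 1e3:.1f} mW")
print(f"{N_CORES} cores @ {F_LOW / 1e6:.0f} MHz: {multi_core * 1e3:.1f} mW")
print(f"saving: {100 * (1 - multi_core / single_core):.0f}%")
```

In this simple model the saving comes entirely from the lower supply voltage: at equal Vdd, four cores at a quarter of the frequency would dissipate the same dynamic power as one core at full frequency.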

The results of these studies are currently being used to develop the STMicroelectronics multiprocessor roadmap, with the goal of enlarging the scope and overcoming this limitation. The main multiprocessor development directions, based on LoMoSA results, are those identified during the studies, such as increased scalability, extended hardware low-power management techniques (frequency/voltage scaling, use of dedicated IPs such as a local clock generator, Vdd hopping, etc.) and software management (scheduling of tasks based on power and performance needs), as well as those driven by technology evolution (fault tolerance, design regularity, etc.).

THOMSON:

Based on the studies and the architecture definition done in WP1, Thomson developed two Systems-on-Chip (SoCs) for the video/audio coding/decoding domain.

One of these SoCs is dedicated to Set-Top-Box products, the other to professional broadcast products. The SoC for professional broadcast products has made it possible to develop professional coders, decoders and transcoders, some of which are already on the market.

Results WP2:

ST-NXP NL (NXP Semi before 1/8/09):

A DVB-H receiver chip and board have been developed with full functionality and performance, ready for production. The power consumption target has been achieved. However, NXP's business prioritisation has led to the decision to abandon this activity completely. The SAA8500 has been developed to improve the rendering of video on a mobile display panel and to significantly reduce the power consumption of this panel. This product is commercially available and is a unique offering in the mobile space.

NXP Research NL:

The design of digital libraries optimised for low supply voltages is technically recognized within the company as a major benefit for some power-critical applications; however, there are practical and economic reasons that hamper its adoption.

During this project we were able to demonstrate the advantage of a ULP digital library in terms of operating frequency and power consumption, and to raise the interest of the BL in future development. Further research is needed to reach the confidence level required by the BL and to develop the interface circuitry.

The change of focus within the company has reduced the urgency of the business lines' commitments in this area. However, this experience is considered relevant, and the possibility of further investigations on different technology nodes to match the current portfolio is under discussion.

ST F:

During the overall project period (2005-2009) the WP2.4 activity produced several important results for the company. The main successes are:

  • Development of the advanced on-chip communication architecture Spidergon STNoC (RTL building blocks and SoC IP integration methodology), which is the enabling factor for new advanced SoC architectures in recent and future silicon technologies
  • RTL building blocks used for FPGA prototypes and one test-chip to test, analyze and demonstrate the NoC technology.
  • Knowledge sharing and dissemination within our product divisions to allow the introduction of the NoC technology into ST (through FPGA demonstrations and RTL synthesis trials).
  • Dissemination through one book on Spidergon STNoC technology, one ST press release and several publications in international specialist magazines.

THOMSON F:

The low power generic video processor architecture developed by Thomson in the frame of WP2 has been reused for:

  • A professional multi-standard video encoder/decoder IC that includes six instances of this cell.
  • A multi-standard audio and video decoder Set-Top-Box IC

The professional multi-standard video encoder/decoder silicon was delivered in Q3/2007. A first release of the multi-standard audio and video decoder Set-Top-Box silicon was delivered in Q4/2006 (Milestone M2.c), and a second release in Q1/2008.

The professional multi-standard video encoder/decoder IC is currently used by the professional subsidiary of Thomson for their encoder, decoder and transcoder platform products. Thanks to their flexibility these products will be present in the Thomson portfolio for at least 3 to 4 years.

CEA LETI:

CEA LETI has an upstream role in LoMoSA. Our work fits the LoMoSA proposal with regard to power reduction objectives, goals, technology (65 nm) and low-power techniques, and we are now reaching very good results, gathered in a single demonstrator (ALPIN). These contributions (Vdd hopping and DC-DC conversion) are fully part of the LoMoSA project.

A quantitative figure of power reduction can be given for ALPIN. Regarding dynamic power consumption, the two techniques implemented, Vdd hopping and the DC-DC converter (LoMoSA contribution), are used (with 95% efficiency for hopping and less for DC/DC) to provide full DVS functionality. This means that the power gain strongly depends on the application. The minimum voltage (in active mode) is around 0.7 V (1.2 V nominal supply voltage), memory blocks included. Depending on the activity, each block can therefore remain functional at this voltage, and the gain is then significant: since dynamic power is proportional to Vdd², a dynamic power reduction of 66% can be obtained by going from 1.2 V to 0.7 V, without any latency cost.
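
The 66% figure quoted above can be checked with a short calculation. This is a minimal sketch that takes the stated proportionality of dynamic power to the square of the supply voltage at face value and holds every other factor constant; the function name is illustrative only.

```python
def dynamic_power_reduction(v_nominal, v_low):
    """Fractional dynamic power saving when power scales with Vdd^2."""
    return 1.0 - (v_low / v_nominal) ** 2


reduction = dynamic_power_reduction(v_nominal=1.2, v_low=0.7)
print(f"{reduction:.1%}")  # prints "66.0%"
```

The ratio (0.7 / 1.2)² is about 0.34, so roughly one third of the nominal dynamic power remains at 0.7 V.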

Moreover, from an industrial point of view, the hopping technique makes it possible to sell products with a low-power feature while needing only 2 characterisation and sign-off points. The cost is drastically reduced compared to a DVFS approach, for which each discrete V/F point has to be characterised.

Significant work has been done on on-chip integration of the DC/DC converter, including the possibility of above-IC passive components. Full DVFS on complex digital circuits is today achieved using costly external DC/DC converters; integration could reduce costs and improve power efficiency.
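
How two sign-off points can stand in for a range of operating points can be sketched as follows. This is a minimal illustration, assuming that hopping time-multiplexes a block between a low and a high V/F point so that the time-averaged clock frequency meets the required one. The function names, the operating points and the neglect of transition overhead are hypothetical and are not taken from the ALPIN design.

```python
def hopping_duty_cycle(f_target, f_low, f_high):
    """Fraction of time to spend at the high point so that the
    time-averaged frequency equals f_target."""
    if not f_low <= f_target <= f_high:
        raise ValueError("target frequency outside the two operating points")
    return (f_target - f_low) / (f_high - f_low)


def average_power(duty_high, p_low, p_high):
    """Time-averaged power when alternating between the two points."""
    return duty_high * p_high + (1.0 - duty_high) * p_low


# Hypothetical characterised points: frequency in MHz, power in mW.
F_LOW, P_LOW = 100.0, 20.0
F_HIGH, P_HIGH = 300.0, 180.0

duty = hopping_duty_cycle(f_target=150.0, f_low=F_LOW, f_high=F_HIGH)
power = average_power(duty, P_LOW, P_HIGH)
print(f"time at high point: {duty:.0%}, average power: {power:.0f} mW")
```

Under these assumptions any average frequency between the two points is reached by adjusting the duty cycle alone, so no intermediate V/F point needs its own characterisation.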

NXP [F]:

The table above summarises high-level system use cases performed with mobile phone platforms of the 2007 and 2009 generations. The 2007 generation already implemented some of the low-power features developed in LoMoSA. The 2009 generation, however, implements all the low-power features described in WP2.

The graphs below compare the power figures achieved with the above-mentioned 2009 platform (red ring) with market data from 2006-2008 (various dots). Major use cases from both telecom and multimedia are shown. This gives a good picture of the overall improvements achieved during the 4-year lifetime of LoMoSA.

To examine the power consumption improvements at the actual circuit design level, we compared a significant number of measured test cases (207 measurements, from block level up to sub-system level) between silicon spins from end 2007 and Q3 2008. The histogram below gives the power reduction in % for the 207 test cases. The average improvement was 25%, and in some test cases improvements of up to 70% were observed. This is a very significant reduction, given that it covers only the power improvement features implemented during S1 2008 according to WP2.

 

Results WP3:

DS2:

SCoPE is fully available under the GPL license from the web page www.teisa.unican.es/scope, including all the project packages, documentation, the User's Manual and additional information about the tool.

In the first half of the year, the first version of the SCoPE tool was finished and released, and the following actions were taken to disseminate it. A project license was agreed with DS2 for distribution of the tool; the chosen license is GPL version 3, available at http://www.gnu.org/licenses/gpl.html. A release package has been built containing all the source code of the project, which is possible thanks to the free GPL license adopted. A user manual has been written describing the main SCoPE concepts and how to install the tool and run the examples.

The link is very active with more than 350 accesses per month. Several companies, like SidSA and VistaSilicon in Spain, NXP in The Netherlands and Thales and EADS in France have shown interest in the tool.

A demonstrator is now available and will be shown at the THALES internal Techno day at the beginning of 2009. It demonstrates the capability to engineer low-power real-time systems by implementing power-management techniques for real-time applications on multicore systems (MPCORE platform).

The demonstrator shows:

  • Early power estimations possible without final application software
  • The ability to use multi-core platform / processors potential for low-power optimisations
  • The ease of use and efficiency of these techniques.
  • Proof of concept on radio-application real-time benchmark

Results WP4:

NXP NL:

05S1: Efficiency improvements to Adapt optimiser realised
05S2: Study of multi-rate simulation algorithms
06S1: Prototype implementation of multirate simulation with static, two-part, partitioning.
06S2: Prototype implementation of multirate simulation with dynamic partitioning.
07S1: Test implementation in Pstar simulator of multirate simulation with dynamic partitioning.
07S2: Production implementation in Pstar simulator, release.
08S1: Study of pole-zero analysis algorithms, including the new concept of ‘dominant’ poles.
08S2: Implementation of the Dominant Pole Algorithm in Pstar, including official release.

We developed a toolset that allows designers to characterize and instrument their IP efficiently, taking the typical development environment into account.
These methods and flows are now in use at NXP, and further developments are already required. The requirements come mainly from NXP, but some come from our partners, such as the University of Santander.
These developments have also enabled a number of future developments, at NXP or in collaboration with partners, on the following subjects:

  • Power consumption based on Use-Case models
  • Temperature effects and floor-planning based on SLEEP tools
  • Detailed verification methodology of SLEEP power models
  • Detailed analog modelling and characterization techniques

All these items have been put in the plans for 2009 and 2010 and are currently under discussion at NXP and with our partners.

Results WP5:

CEA-LETI [F]

CEA LETI has an upstream role in LoMoSA. Our work fits the LoMoSA proposal with regard to power reduction objectives, goals, technology (65 nm) and low-power techniques, and we are now reaching very good results, gathered in a single demonstrator (ALPIN). These contributions (Vdd hopping and DC-DC conversion) are fully part of the LoMoSA project.
A quantitative figure of power reduction can be given for ALPIN. Regarding dynamic power consumption, the two techniques implemented, Vdd hopping and the DC-DC converter (LoMoSA contribution), are used (with 95% efficiency for hopping and less for DC/DC) to provide full DVS functionality. This means that the power gain strongly depends on the application. The minimum voltage (in active mode) is around 0.7 V (1.2 V nominal supply voltage), memory blocks included. Depending on the activity, each block can therefore remain functional at this voltage, and the gain is then significant: since dynamic power is proportional to Vdd², a dynamic power reduction of 66% can be obtained by going from 1.2 V to 0.7 V, without any latency cost. Moreover, from an industrial point of view, the hopping technique makes it possible to sell products with a low-power feature while needing only 2 characterisation and sign-off points. The cost is drastically reduced compared to a DVFS approach, for which each discrete V/F point has to be characterised.

Significant work has been done on on-chip integration of the DC/DC converter, including the possibility of above-IC passive components. Full DVFS on complex digital circuits is today achieved using costly external DC/DC converters; integration could reduce costs and improve power efficiency.

NXP [F]

ST-NXP Wireless succeeded in introducing its new generation of 3G multimedia baseband processors to major customers during 2008. Among other parameters, an essential one was the world-class power consumption achieved on this new platform in both telecom and multimedia use cases. Thanks to the many low-power features implemented on this platform (ranging from early architectural choices through IC design elements and techniques up to system-level optimization), many functions that formerly required external, power-consuming components could be integrated on the same silicon. Besides the obvious cost advantages, power consumption reductions of up to 70% were achieved for certain use cases. Most of the implemented low-power features are described in WP1, WP2 and WP5 of the LoMoSA project. Some of the use cases are available for demonstration on the mobile phone demonstrator used in WP5.

NXP [NL]

  1. Demonstrated, together with WP5 partners, full-performance DVB-H in combination with low-power video rendering on a mobile screen.
  2. Demonstrated the camcorder opportunity of recording and storing video content at lower image resolution and frame rates while displaying this content at the resolution and rate capability of the display.
  3. SAA8500 commercially available (see announcement for Japanese market below), demonstrating:
     • Robust spatial and frame rate up-conversion, which saves the following while preserving motion portrayal and perceived picture quality:
         • Power: camcorder
         • Bandwidth: broadcast (ISDB), video telephony
         • Storage: camcorder
     • Adaptive backlight and ambient light correction, which save significant power for display panels: 50% for LCD, 30% for OLED, and another 50% when indoors

Thomson [F]

The demonstrator shows the low power generic video processor architecture developed by Thomson in the frame of WP2 in use within a multi-standard audio and video decoder Set-Top-Box SoC.
Based on the same low-power generic video processor architecture, a professional multi-standard video encoder/decoder SoC has been developed. The SoC for professional broadcast products has made it possible to develop professional coders, decoders and transcoders, some of which are already on the market.

TIMA [F]

ST, Thales and TIMA achieved a significant result by implementing a SystemC power-aware multiprocessor simulation platform that boots an SMP Linux kernel. There are plans to use this platform internally for new power-aware developments.

STM [F]

Cooperation with our partners on the common demonstrator was fruitful. The demonstrator served as a proof of concept for the innovative Spidergon network-on-chip technology. This system interconnect solution was also delivered to CarVision, another MEDEA+ project, as the main on-chip communication in its FPGA demonstrator.