Introduction to Advanced Reliability as one key differentiator of electronic components and systems made in Europe
The history of reliability as we know it goes back to the 1950s, when electronics started to play a major role for the first time. Now, seven decades later, with electronic systems a million times more complex, the industry is facing a continuous increase in early and wear-out failures, with all the accompanying consequences. Figure 1 depicts the struggle for different high-tech industries, ranging from harsh environment suitability to long lifetime and warranty coverage. Nowadays, products with high failure rates may come under public scrutiny through negative customer feedback shared on websites, eventually giving a company a bad reputation.
iRel40 aims to find solutions to cope with the ever-increasing complexity of reliability topics and to find processes, methods and digitized universal procedures that cover the various domains of electronic components and systems (ECS). High-tech industries are marked by concerns such as direct financial loss, delayed product release, liability and reduced consumer confidence. Each industry has its own key focus areas regarding reliability.
Reliability and availability will become enablers for product designs. Big data will enable a detailed understanding of failure mechanisms, usage scenarios, technology and optimal designs. For example, products will be equipped with sensors that capture information about how, when, and under what environmental and operating conditions they are being used. The data can also be used for pure reliability analysis. Examples are signal-detection algorithms that detect unsafe operating conditions or precursors to system failure, and that can be used to protect a system by shutting it down or by reducing its load to safe levels.
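The protection idea described above can be illustrated with a minimal Python sketch. It is only an assumption of how such a signal-detection step might look: the class name `TemperatureGuard`, the temperature thresholds and the window size are hypothetical and are not taken from iRel40. The sketch smooths a sensor reading with a short moving average and maps it to one of three actions.

```python
from collections import deque
from statistics import fmean


class TemperatureGuard:
    """Hypothetical protection logic based on a smoothed temperature signal."""

    def __init__(self, warn_c=105.0, shutdown_c=125.0, window=3):
        # Thresholds and window size are illustrative assumptions only.
        self.warn_c = warn_c
        self.shutdown_c = shutdown_c
        self.samples = deque(maxlen=window)

    def update(self, temp_c):
        """Add one sensor reading and return the recommended action."""
        self.samples.append(temp_c)
        smoothed = fmean(self.samples)  # moving average suppresses single spikes
        if smoothed >= self.shutdown_c:
            return "shutdown"
        if smoothed >= self.warn_c:
            return "reduce_load"
        return "normal"


guard = TemperatureGuard()
actions = [guard.update(t) for t in (90.0, 110.0, 130.0, 140.0)]
print(actions)
```

A real system would typically monitor several signals and use detection methods far more robust than a fixed threshold; the sketch only shows the shut-down and load-reduction decisions mentioned in the text.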
Above all, it is customer satisfaction that is at stake, and what it demands differs from one application to another. Customer satisfaction is seen as a key differentiator and has increasingly become a key element of business strategy. There is no doubt that an unexpected product failure translates directly into customer dissatisfaction. While customer satisfaction depends on the context of the application, it contributes heavily to the overall perception of a particular supplier.
To safeguard customer satisfaction, there is a need to predict the remaining life of the system (or the remaining life of its most important life-limiting components). This topic is known as prognostics and health monitoring (PHM). PHM refers to the process of predicting the future reliability or determining the remaining useful lifetime of a product by assessing the extent of deviation or degradation of the product from its expected normal operating conditions. Prognostics and monitoring are not about troubleshooting reliability issues; rather, they constitute a new control point enabled by many industries’ steady transition to a services business.
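As a rough illustration of what estimating the remaining useful lifetime from observed degradation can mean, the following Python sketch fits a straight line to a monitored degradation signal and extrapolates it to a failure threshold. The linear trend, the function name `remaining_useful_life` and the example numbers are assumptions made for this sketch; real PHM approaches typically rely on physics-of-failure or data-driven models that are considerably more elaborate.

```python
def remaining_useful_life(hours, degradation, failure_threshold):
    """Estimate remaining hours until a degradation signal reaches a threshold.

    Assumes (for illustration only) that degradation grows linearly with time.
    Returns None if no upward degradation trend is observed.
    """
    n = len(hours)
    mean_t = sum(hours) / n
    mean_d = sum(degradation) / n
    # Ordinary least-squares fit of degradation = slope * t + intercept
    sxx = sum((t - mean_t) ** 2 for t in hours)
    sxy = sum((t - mean_t) * (d - mean_d) for t, d in zip(hours, degradation))
    slope = sxy / sxx
    intercept = mean_d - slope * mean_t
    if slope <= 0:
        return None
    t_fail = (failure_threshold - intercept) / slope
    return max(0.0, t_fail - hours[-1])


# Hypothetical data: parameter drift in percent, measured every 1000 operating hours
rul = remaining_useful_life(
    hours=[0, 1000, 2000, 3000],
    degradation=[0.0, 1.0, 2.0, 3.0],
    failure_threshold=10.0,
)
print(f"Estimated remaining useful life: about {rul:.0f} h")
```

With the made-up data above, the drift grows by 1 % per 1000 h, so the 10 % threshold is reached after roughly 10,000 h, which leaves about 7000 h from the last measurement.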
It is the combination of data and deep physical (and technological) insight that will give a unique “right to win” in the semiconductor industry. The future possibilities for using big connected data in reliability applications are unbounded. Lifetime models that are based on this data have the potential to explain much more variability in field data than has been possible before. Figure 3 schematically displays the change from a classical reliability test-to-pass approach to prognostics and health monitoring.
What will iRel40 improve?
The reliability of electronic hardware: that is what iRel40 is all about.
iRel40 does that with a focus on the chip-package-board/system value chain: through modeling and simulation, a deepened understanding using Physics of Failure, new materials, designs for reliability, real-time feedback in production lines, improved tests, predictive algorithms, and the use of all available data to learn faster with AI and ML.
- Quality is defined as the degree of compliance with specifications with regard to the functionality and parameters of electronic components and systems at a given time (e.g. at the end of the manufacturing process).
- Reliability describes the quality of electronic components and systems over a defined time period (e.g. functionalities and parameters have to meet specifications for 10 years or under harsh environmental conditions).
- Quality and reliability in electronic components do not come for free! They cannot simply be tested in; they must be inherently built into electronic components and systems along all processes and along the full value chain.
Quality and reliability are a “must have” for ECS made in Europe! If the quality level does not fulfill customer demands, the consequence is lost business. This is particularly true for the areas already mentioned, where systems have already become exceptionally complex. For some of the new applications presently discussed, reliability can even become a roadblock, e.g. for autonomous driving or “always on” electronics during battery charging! For autonomous driving applications, the assumed average use of the electronic systems increases to > 20 h per day, instead of the 1.5 h per day assumed for current cars. The increasing reliability challenges, especially for the automotive industry, were also highlighted at the 2018 AEC reliability workshop in Munich last November.
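The effect of the usage figures above can be made concrete with a short calculation. The Python sketch below uses the 1.5 h and 20 h per day from the text; the 15-year service life is purely an assumption chosen for illustration, and since the text states "> 20 h", the results for autonomous driving are lower bounds.

```python
HOURS_PER_DAY_TODAY = 1.5        # from the text: assumed use in current cars
HOURS_PER_DAY_AUTONOMOUS = 20.0  # from the text: lower bound for autonomous driving
SERVICE_LIFE_YEARS = 15          # assumption for illustration, not from the text


def operating_hours(hours_per_day, years, days_per_year=365):
    """Total operating hours accumulated over the service life."""
    return hours_per_day * days_per_year * years


usage_factor = HOURS_PER_DAY_AUTONOMOUS / HOURS_PER_DAY_TODAY
hours_today = operating_hours(HOURS_PER_DAY_TODAY, SERVICE_LIFE_YEARS)
hours_autonomous = operating_hours(HOURS_PER_DAY_AUTONOMOUS, SERVICE_LIFE_YEARS)

print(f"Usage factor: {usage_factor:.1f}x")
print(f"Operating hours today: {hours_today:,.1f} h")
print(f"Operating hours autonomous: {hours_autonomous:,.1f} h")
```

Under these assumptions, the electronics accumulate more than 13 times as many operating hours over the same calendar lifetime, so a failure mechanism driven by operating time would be reached correspondingly earlier.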
iRel40 will drive improvements in both areas:
- Reliability over time, according to ever-increasing requirements and specifications
- Reliability at time 0 (Quality), through advanced AI- and ML-supported methods that improve processes in development and production. (Within iRel40 these methods are called “Quality4.0” and “Testing 4.0”.)
Meeting the reliability targets is crucial in all safety-relevant applications. Even though the error level in many areas is already in the sub-ppm range, stricter requirements are already evident. The needed improvements demand completely new approaches to achieve the targets. iRel40 will tackle this huge challenge and will drive continuous reliability improvements as an ambitious goal. Europe already has good competence in research and development on reliability. Research centers have specific reliability competences, and reliability topics are often included as a work package in publicly funded projects.
iRel40 improves the reliability of electronic components and systems along the chip-package-board/system value chain, throughout the whole lifecycle, in the domains of Transport and Smart Mobility, Energy, and Digital Industry. Our sensitive indicator for measuring the success of the project is the failure rate in ppm. Further improvements from an already high level are requested to increase the availability of systems in the various domains and applications. All of them require specific and increasing efforts, as well as the creation of a new reliability community with in-depth knowledge of physics, chemistry, material science, electronics and, of course, AI and ML, to develop new solutions through increased cooperation.
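For readers unfamiliar with the unit, a failure rate in ppm is simply the number of failed parts per million parts. The Python sketch below shows the conversion; the function name and the example numbers are hypothetical and do not represent iRel40 results.

```python
def failure_rate_ppm(failed_parts, total_parts):
    """Failure rate expressed in parts per million (ppm)."""
    if total_parts <= 0:
        raise ValueError("total_parts must be positive")
    return failed_parts / total_parts * 1_000_000


# Hypothetical example: 3 failures among 10 million delivered parts
rate = failure_rate_ppm(failed_parts=3, total_parts=10_000_000)
print(f"{rate:.2f} ppm")
```

The made-up example corresponds to a sub-ppm level, i.e. fewer than one failed part per million.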
Reliability experts often describe the lifetime of a population of products using a graphical representation called the bathtub curve (see figure below). The bathtub curve includes three periods of the whole lifetime of products and systems. This means that, for electronic components and systems, iRel40 will improve:
- “early life failure rate”: with a decreasing failure rate, indicating learnings after the development phase and during the ramp-up
- “useful life”: indicating the intrinsic failure rate in ppm (parts per million). Customer quality requirements are based on this ppm level.
- “wear-out period”: indicating the end of life of products, with a failure rate that increases again
By this, iRel40 will minimize:
- The impact of unexpected quality events in manufacturing
- The total cost of ownership
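The three periods of the bathtub curve described above can be reproduced with a simple model. The Python sketch below adds up three Weibull hazard rates, a common textbook choice that is assumed here and not taken from iRel40: a shape parameter below 1 gives a decreasing failure rate (early life), a shape of exactly 1 a constant rate (useful life), and a shape above 1 an increasing rate (wear-out). All parameter values are hypothetical.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (shape / scale) * (t / scale) ** (shape - 1), for t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_hazard(t):
    """Sum of three Weibull hazards; all parameters are illustrative assumptions."""
    early_life = weibull_hazard(t, shape=0.5, scale=1e6)   # decreasing failure rate
    useful_life = weibull_hazard(t, shape=1.0, scale=1e6)  # constant failure rate
    wear_out = weibull_hazard(t, shape=5.0, scale=1e5)     # increasing failure rate
    return early_life + useful_life + wear_out


# Failure rate per hour at an early, a middle and a late point in life
for hours in (10, 10_000, 200_000):
    print(f"t = {hours:>7} h: h(t) = {bathtub_hazard(hours):.3e} per hour")
```

With these parameters the combined failure rate first falls, stays close to the constant level in the middle of the lifetime, and rises again towards the end of life, which is the characteristic bathtub shape. The function is only defined for t > 0 because the early-life term diverges at t = 0.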
Why do we need to improve Reliability?
After years of small incremental steps, which led to a just acceptable reliability level fulfilling actual demands, the time has come for a disruptive initiative by means of digitalization. Zero failure is the ultimate goal, but almost impossible to reach. Therefore, iRel40 aims to be a coherent European initiative on chip, package, and board/system level to tackle reliability over the whole value chain, strengthening the strategic ECS domain. Shared mobility concepts, as one example, will increase the time in operation of ECS in cars by a factor of more than ten. The reliability of electronic components and systems has to keep up with the new requirements ahead of us. Equally important is prediction: deriving indications from sensing, combined with a physical understanding of the system.
This know-how will enable prognostics of upcoming system failures, so that Smart Maintenance, Predictive Maintenance or Smart Redundancy concepts can be implemented to minimize the risk of sudden breakdowns of electronic components and systems. AI-supported solutions are required to handle the enormous amount of data that will be generated in real time and that has to be processed at the edge and, in many cases, in the cloud to enable prediction of the future behavior of electronic systems. Enabling higher degrees of Control Intelligence through quality-driven approaches and methods supported by Machine Learning and AI, as envisaged in iRel40, will be of utmost importance for future market success and for the profitability of the capital-intensive semiconductor plants and assembly lines.
New types of sensors and new communication possibilities (5G), offered by all-digital approaches in manufacturing, are flooding the semiconductor industry and the relevant applications with ever-growing “data lakes”. Intelligent Reliability 4.0 makes use of this by extending Big Data analysis and AI to improve feedback loops further and to enable real-time control of all process steps, resulting in better predictability for all kinds of deviations.
The future control concepts will focus on process performance and system applications in real time, within the expected, known and analyzed parameter space, developed in iRel40 (digital twins).
However, we face extreme challenges because our semiconductor devices, as well as the electronic systems, reach limits for complex system integration and become extremely sensitive to the package and board environment. Europe therefore needs a coherent project initiative on chip, package, and board/system, including heterogeneous system integration, to tackle reliability over the whole value chain and to learn from each other, with a focus on Europe’s key market segments and Europe’s key technologies.
Reliability is often taken for granted in the industry, but it is not a given! Guaranteeing reliability requires effort. This is best understood through an example that is emotionally charged for us humans: health. Reliability is for electronic components and systems what health is for humans, which is why we can state:
Reliability isn't everything, but without it, everything else is nothing.
If we as consumers buy products that don't work, we will change suppliers or manufacturers sooner rather than later. The customers at the end of the ECS value chain behave the same way. Usually only a few effects are reported when unreliable products are discussed, yet there are many critical implications for semiconductor and electronics companies. iRel40 minimizes these effects: it lets the ice melt!
Not meeting future reliability requirements will have drastic consequences if we do not react in time. Figure 5 uses the iceberg model to show these consequences. Poor reliability brings not only the obvious costs that are visible above the “water surface” (e.g. warranty claims, safety relevance in safety-critical applications, recalls, maintenance and service), but also consequences that are hidden below the “water surface”. These include, among others, longer cycle times, increased testing efforts and audit costs, and can even lead to loss of customer loyalty and loss of innovation.