Reliability - more than just testing
The design of any high-tech product is a long process, and this is especially true in the optical transceiver and interconnectivity business. Often overlooked, however, is the crucial role played by the designer in ensuring a product’s reliability. Giacomo Losio, Head of Technology, ProLabs.
No matter how well a product works or how advanced its design, without reliability it is effectively worthless. Many firms in the industry are capable of making functioning optical transceivers, but far fewer can make reliable ones.
Reliability embodies not only a product’s ability to function but also its ability to keep working, without failure, for a long period of time. For instance, ‘carrier class’ products are often required to operate for as long as 20 years. Of course, it is not possible to test a product for that long, so accelerated ageing and statistical analysis are often used to shorten the process, which can still take up to seven months.
Whilst reliability is important across the entire technology industry, it is held in especially high regard in data centre and telecom products. The reason is the financial cost associated with their failure: in the digital era, our society depends heavily on these installations (banks, utilities, transportation systems and education).
To create a truly successful reliability testing process, an optical transceiver company must involve its entire team, from product concept to manufacturing and product support. Reliability is an integrated system, and only a company with an organic end-to-end approach can be successful in this area. Reliability draws on different disciplines - mechanics, electronics, materials science, etc.
Within these disciplines, a number of other parameters also come into play - temperature, humidity, electrical and mechanical stresses, corrosion. The tests largely fit into two categories - mechanical and environmental - testing both the physical resilience of a product and its capability to withstand extreme operating conditions. Reliability test plans are compiled starting from a set of standards issued by Telcordia or the IEC, or from military standards. The important items are the test procedure, which has to be clearly defined so that tests can be repeated, the sample size and the pass/fail criteria. Every device or sub-assembly used in the transceiver has to be qualified independently, and internal interconnects have to be verified with particular attention, as this is where mechanical stresses can arise.
Mechanical tests involve everything from tests performed on the standalone device (vibration, shock, drop) to its use on real platforms (electrical and optical mating/de-mating cycles, operational vibration, ESD, EMC), to make sure that it is not only intrinsically reliable but can also operate reliably in the everyday environment. Transportation tests are also particularly important, to make sure that ‘dead on arrival’ (DOA) parts are minimised.
Environmental tests involve changes to temperature and humidity conditions and include temperature cycling, thermal shock, damp heat (humidity and heat applied simultaneously) and prolonged operation at high temperature. Temperature is often used as an acceleration factor to minimise test duration and to understand the mortality curve of the device. Usually these products show an ‘infant mortality’ phase and a wear-out phase. The outcome of the reliability tests needs to be used by manufacturing to design production tests that can eliminate parts likely to fail at the beginning of their operating life.
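As a rough illustration of how temperature shortens a test, the sketch below uses the Arrhenius model, a common choice for temperature-driven failure mechanisms. The article does not name a model, so treat the choice as an assumption. The activation energy and the use and stress temperatures are hypothetical values picked for illustration; real values depend on the failure mechanism and must come from data.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration_factor(activation_energy_ev: float,
                                  use_temp_c: float,
                                  stress_temp_c: float) -> float:
    """Acceleration factor of a stress temperature relative to the use
    temperature, under the Arrhenius model (an assumed model)."""
    use_k = use_temp_c + 273.15        # convert Celsius to kelvin
    stress_k = stress_temp_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (1.0 / use_k - 1.0 / stress_k)
    return math.exp(exponent)


def required_acceleration_factor(field_life_months: float,
                                 test_months: float) -> float:
    """Acceleration needed to cover the field life within the test time."""
    return field_life_months / test_months


if __name__ == "__main__":
    # Figures from the text: 20 years of service, about seven months of test.
    needed = required_acceleration_factor(20 * 12, 7)
    # Hypothetical parameters: 0.7 eV, 40 C in use, 85 C under stress.
    achieved = arrhenius_acceleration_factor(0.7, use_temp_c=40.0, stress_temp_c=85.0)
    print(f"needed: {needed:.1f}x, achieved under assumptions: {achieved:.1f}x")
```

Covering 20 years in seven months needs an acceleration of about 34 times. Whether a given stress temperature delivers that depends entirely on the assumed activation energy, which is why statistical analysis of the test results is needed alongside the ageing itself.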
A product also has to be manufactured reliably, and reliability has to be maintained and ensured for the whole lifecycle. This is why it is important to include manufacturers during the product concept and design phase. A reliable product also has a better production yield, which helps keep costs under control.
To summarise, the responsibility for reliability lies with everyone within the company and at every stage of the product development cycle - from the conception of the product to each developer’s understanding of their role and responsibility in the reliability chain. It depends as much on the integrity of the components as on the materials used, and on the environment in which manufacturing takes place. The efficacy of the materials, the handling and the quality of workmanship can all severely affect reliability. Every step required in a product’s construction must be constantly measured and monitored. Such stringent reliability testing cannot rely on box ticking - it demands a scrupulous mind-set throughout the company.
Processes alone do not, and will not, guarantee reliability. Reliability must be embedded within a company’s foundations and among its most basic of building blocks - the individuals within that organisation. Without this, whilst products may well work, their reliability cannot be guaranteed.