EVALUATION AND DESIGN OF DEPENDABLE SYSTEMS WITH DESIGN DIVERSITY
Subhasish Mitra, Nirmal R. Saxena and Edward J. McCluskey
Departments of Electrical Engineering and Computer Science
Stanford University, Stanford, California

Abstract

Design diversity has long been used to protect redundant systems against common-mode failures and design mistakes. The conventional notion of diversity is qualitative and relies on “independent” generation of “different” implementations. As part of the DARPA sponsored ROAR project at Stanford CRC, we recently developed a quantitative measure of diversity and demonstrated its use in the evaluation and design of dependable systems. In this paper, we present a summary of our work on design diversity, and discuss important problems related to the evaluation and design of systems that use design diversity for dependability.

1. Introduction

Redundancy techniques for designing systems with high data-integrity and availability have been studied extensively [Kraft 81, Siewiorek 92, Pradhan 96]. A duplex system is a classical redundancy scheme used in many commercial dependable systems [Webb 97, Spainhower 99]. In a duplex system there are two modules that implement the same logic function. The outputs of the two modules are compared and an error is reported when a mismatch occurs. Data integrity means that the system either produces correct outputs or indicates an error when incorrect outputs are produced; in the literature on fault tolerance, data integrity is also referred to as the fault-secure property. In a duplex system, data integrity is guaranteed as long as only one of the two modules is faulty.

While most redundant systems are designed using the single fault assumption, in real life there can be sources of multiple and common-mode failures that produce multiple faults. In the presence of these failures, system data integrity is not guaranteed. Common-mode failures (CMFs) result from failures that affect more than one module of a redundant system, generally due to a common cause. Sources of common-mode failures can be a dip in the power supply, a single source of radiation creating multiple upsets, or even design errors. Common-mode failures are surveyed in [Mitra 00a].

Design diversity was described in the past as a technique to avoid or tolerate CMFs in redundant systems. In [Avizienis 84], design diversity was defined as the independent generation of two or more software or hardware elements (e.g., program modules, VLSI circuit masks, etc.) to satisfy a given requirement. The basic idea is that, with different implementations, common failure modes will produce different error effects. For example, chances of identical design errors may be minimized if two groups of designers are asked to independently design a hardware block or a software module. A dip in a power supply may have different effects on two different hardware implementations of the same logic function. Examples of design diversity include N-version programming [Lyu 91] for software systems or the use of different processors for redundant system designs in flight controllers (Boeing, Airbus).

There are two cost components associated with diversity: design cost and manufacturing cost. The design time is roughly doubled for a duplex system design with diversity. The manufacturing cost can be avoided for systems built using reconfigurable logic elements (e.g., FPGAs). For reconfigurable systems, diversity can be created by downloading different configurations instead of manufacturing different ASICs.

While there is clear evidence that diversity can bring benefits in a redundant system, these benefits are extremely difficult to quantify with the above qualitative definition of diversity. Thus, as pointed out in [Littlewood 96], there is a need to answer questions such as: “what is diversity? Are these designs more diverse than those? How diverse are these two designs?” In the next section, we present a quantitative definition of design diversity.

2. D: A Design Diversity Metric

Assume that we are given two implementations (logic networks) of a logic function, an input probability distribution and faults fi and fj that occur in the first and second implementations, respectively. The diversity di,j with respect to fault pair (fi, fj) is the conditional probability that, with the faults fi and fj present, the two implementations do not produce identical errors.
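The definition of di,j can be illustrated by exhaustive enumeration of the input combinations. The following is a minimal sketch, not the authors' tool: the function name `diversity` and the three example functions are hypothetical, and the two faulty behaviours are simply assumed (one fault forces the output to 0, the other drops the product term AC from Z = AB + AC). Inputs are taken to be equally likely unless a distribution is supplied.

```python
from itertools import product


def diversity(good, faulty1, faulty2, n_inputs, prob=None):
    """Return d_ij = 1 - P(both faulty implementations give the same wrong output).

    good, faulty1, faulty2: callables taking n_inputs bits and returning an output.
    prob: optional callable giving the probability of an input tuple;
          defaults to the uniform distribution (an assumption of this sketch).
    """
    inputs = list(product((0, 1), repeat=n_inputs))
    if prob is None:
        prob = lambda x: 1.0 / len(inputs)
    p_identical_error = 0.0
    for x in inputs:
        z, z1, z2 = good(*x), faulty1(*x), faulty2(*x)
        # An identical error: the first output is wrong and the second matches it,
        # so a comparator would see no mismatch.
        if z1 != z and z1 == z2:
            p_identical_error += prob(x)
    return 1.0 - p_identical_error


# Hypothetical example for Z = AB + AC.
def z_good(a, b, c):
    return (a & b) | (a & c)


def z_f1(a, b, c):
    return 0  # assumed fault effect: output forced to 0


def z_f2(a, b, c):
    return a & b  # assumed fault effect: product term AC is lost


print(diversity(z_good, z_f1, z_f2, 3))  # 0.875
```

Under these assumed fault effects only ABC = 101 yields identical erroneous outputs, so the sketch gives 1 - 1/8 = 0.875. The nested loop over all inputs is exponential in the number of inputs, so it is only practical for small examples.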
The main idea behind the above definition of di,j is illustrated using the example in Table 2.1. We have two implementations N1 and N2 of the same logic function, and faults fi and fj affect N1 and N2, respectively. When the input combination 0000 is applied, both implementations produce correct outputs in the presence of the faults. When the input combination 0110 is applied, N1 produces erroneous outputs while N2 produces correct outputs in the presence of the faults. If a comparator is used to compare the outputs of N1 and N2, a mismatch will be reported and hence, data integrity is guaranteed for this input combination. A similar situation happens for the input combination 1101. For the input combination 1010 both N1 and N2 produce erroneous outputs in the presence of the faults. However, the erroneous outputs are different and a mismatch will be reported when the outputs of N1 and N2 are compared. Hence, data integrity is guaranteed for the input combination 1010. For the input combination 1110, both N1 and N2 produce identical erroneous outputs and the errors will not be detected by the comparator comparing the outputs of N1 and N2. Hence, data integrity is compromised for the input combination 1110.

Table 2.1. Example to illustrate behaviors of faulty implementations.

For a given fault model, the design diversity metric, D, between two designs is the expected value of the diversity with respect to different fault pairs:

    D = Σ(fi, fj) P(fi, fj) di,j

P(fi, fj) is the probability of fault pair (fi, fj). The main motivation behind using D as a metric for design diversity is to combine the effects of the di,j values of different fault pairs into a single number.

For example, consider the two implementations of the logic function Z = AB + AC shown in Fig. 2.1. Consider the fault f1 = w stuck-at-0 in the implementation of Fig. 2.1a and the fault f2 = y stuck-at-0 in the implementation of Fig. 2.1b. With f1 present in N1, the input combinations ABC = 111, 101 and 110 all produce errors at Z1. (These are the input combinations that detect f1). The only input combination that causes an error at Z2 with f2 present is ABC = 101. (This is the input combination that detects f2). If a duplex system consisting of the two implementations in Fig. 2.1 is affected by the fault pair (f1, f2), then ABC = 101 is the only input combination for which both implementations will produce identical errors. This erroneous output will escape detection. If we assume that all input combinations are equally likely, then the d1,2 value is 7/8 = 0.875.

The above illustration of the design diversity metric applies to combinational logic circuits [Mitra 99] and sequential circuits [Mitra 01a]. There are two fundamental problems involved in the computation of the design diversity metric: (1) there can be too many fault pairs to be considered; (2) the problem of computing the di,j value for a fault pair is NP-complete. Fast techniques for estimating diversity using a reduced list of fault pairs and adaptive [...]

Figure 2.2 illustrates the use of the di,j values to evaluate diversity between different implementations of the same logic function. The following functions are implemented: W = A′C + BC, X = ABC, Y = BC, and Z = A′B + BC. The three implementations shown in Fig. 2.2a, 2.2b and 2.2c use the same logic gates but differ in the sharing of the logic gates among the output functions (also called the fanout structures). It was shown in [Mitra 00b] that diversity in the fanout structure is important for generating diverse implementations of the same logic function.

Figure 2.2. Diverse implementations with diversity in the fanout structure.

Consider the following three duplex system designs:

1. A duplex system is designed with two identical implementations corresponding to Fig. 2.2a. Consider the fault m/1 (at the output of the AND gate ABC) in both implementations. In the presence of the m/1 fault in both implementations, the two (identical) implementations produce identical erroneous outputs for 7 input combinations (ABC = 000, 001, 010, 011, 100, 101, 110). Hence, the di,j value for this fault pair (m/1 in both implementations) is 1/8 = 0.125 (assuming that all input combinations are equally likely).

2. Consider the case where a duplex system is designed such that the first implementation corresponds to Fig. 2.2a and the second implementation corresponds to Fig. 2.2b. Consider the fault m/1 (at the output of the AND gate ABC) in the first implementation (Fig. 2.2a) and the fault p/1 (at the output of the AND gate ABC) in the second implementation (Fig. 2.2b). The reader can easily check that in the presence of this fault pair the two implementations produce identical erroneous outputs for only 2 input combinations (ABC = 010, 011). Hence, in this case the di,j value of the fault pair (m/1 in the first implementation and p/1 in the second implementation) is 6/8 = 0.75.

3. Finally, consider a duplex system such that the first implementation corresponds to Fig. 2.2a and the second implementation corresponds to Fig. 2.2c. Consider the fault m/1 (at the output of the AND gate ABC) in the first implementation (Fig. 2.2a) and the fault r/1 (at the output of the AND gate ABC) in the second implementation (Fig. 2.2c). The reader can easily check that in the presence of this fault pair the two implementations produce identical erroneous outputs for only 1 input combination (ABC = 011). Hence, in this case the di,j value of the fault pair (m/1 in the first implementation and r/1 in the second implementation) is 7/8 = 0.875.

The diverse duplex system designed in the third scenario (the first implementation corresponding to Fig. 2.2a and the second implementation corresponding to Fig. 2.2c) is better than the other two scenarios for the faults considered. This fact can be verified by calculating the diversity metrics for these three scenarios. We calculated the diversity metrics for the above scenarios in two different ways:

(i) If we assume that all possible single-stuck-at fault pairs are equally probable, then the values of the D metric for the first, second and third scenarios are 0.96, 0.97 and 0.98, respectively.

(ii) For each fault fi in the first implementation (Fig. 2.2a), we found a fault fj in the second implementation constituting the duplex systems in the above three scenarios such that the di,j value of the fault pair (fi, fj) is the minimum over all fj’s; hence, (fi, fj) is called a worst-case fault pair. These worst-case fault pairs were found through exhaustive simulation of all input combinations and all fault pairs. Finally, we calculated D as the average value of di,j’s over the worst-case fault pairs. The D values obtained were 0.67, 0.72 and 0.75 for the first, second and third scenarios, respectively. Moreover, there are many worst-case fault pairs with di,j values equal to 1 in the second and third scenarios.

The above results demonstrate that the duplex system designed using the implementations in Fig. 2.2a and Fig. 2.2c is the most diverse one. When we calculate the diversity metric assuming that all fault pairs are equally probable, we do not see a big difference because a large fraction of all fault pairs have the di,j value equal to 1 for all the three scenarios. Hence, the values of the D-metric become very close to 1 for all the three scenarios when it is assumed that all fault pairs are equally probable.

Diversity in software programs can be used to detect hardware and software faults. The above diversity metric has been extended for diverse software programs used to detect hardware failures [Oh 00]. For detecting or tolerating software faults (design mistakes), the idea of the design diversity metric can be used as long as we have a fault model available. While there is no clear consensus about “good” software fault models, several fault-injection techniques available in the literature [Hudak 93, Christmansson 98] inject software faults in a system; our diversity metric can be used in the context of these software faults.

3. Applications of the Design Diversity Metric in System Analysis and Design

In [Mitra 99], the design diversity metric was used to analyze the reliability and availability of redundant systems. The analysis showed simple relationships between reliability, availability, design diversity, system failure rate, mission time and self-testability. A duplex system is self-testing with respect to a fault pair (f1, f2) (f1 affecting the first implementation and f2 affecting the second implementation) if and only if there exists an input combination for which the two implementations produce different outputs in the presence of the faults.

The following important observations were made from the analysis: (1) When the failure rate is high, even a small diversity can help enhance the system reliability over simple replication; (2) If the failure rate [...]; (3) The improvement in reliability or data integrity obtained by using diversity diminishes with long mission times; (4) System availability is significantly increased when a diverse system with many self-testable fault pairs is used.

System designers can obtain quantitative estimates for system reliability, data integrity and availability using the expressions derived in [Mitra 99] to understand various trade-offs and choose an appropriate diverse system which best fits their application requirements.

The diversity metric can also be used as a cost function when generating diverse implementations of the same function. While the process of writing software programs for a given application still relies on manpower, automated or semi-automated techniques are used for generating hardware designs. Synthesis programs used in CAD tools are cost function optimization programs where the cost functions are area, delay and power consumption of the designed circuit. A design diversity metric allows us to use diversity as another cost function component during synthesis of redundant systems for error detection or fault-tolerance. A logic synthesis technique for generating diverse implementations of a combinational logic function is described in [Mitra 00b]. Techniques for enhancing the self-testability of diverse duplex systems through test point insertion are described in [Mitra 00c]. There are many opportunities to develop new architectural, logic and layout synthesis techniques for redundant systems taking into account the diversity cost function in addition to the standard cost functions used in CAD tools.

4. System-level Error Detection using Diverse Duplication

Duplication (identical or diverse) is not the only way of detecting errors in dependable systems; parity prediction techniques have been used in many commercial systems for error detection purposes. Simulation results presented in [Mitra 00d] for general combinational circuits demonstrate: (1) Area overhead of circuits with parity prediction is comparable to that of duplication; (2) Diverse duplication provides significant improvement in data integrity compared to parity prediction in the presence of multiple and common-mode failures. The problem of theoretically proving this result for general error models is still open.

Figure 4.1. Systems with CED: (a) Example. (b) Diverse duplication for combinational logic; parity prediction for registers and bus lines.

In Fig. 4.1, we present a system-level view of concurrent error detection (CED). The system in Fig. 4.1a contains a combinational logic block implementing a logic function f; the logic block obtains its inputs from register X and the outputs are stored in register Z. Figure 4.1b presents a CED scheme which uses diverse duplication for combinational logic blocks and parity prediction for registers and bus lines. Thus, we can achieve significant improvement in data integrity for multiple and common-mode failures (through diverse duplication) without doubling the number of register flip-flops and bus lines. Note that the XOR tree may have significant delay overhead. This delay overhead can be reduced by increasing the number of parity bits (i.e., the number of extra flip-flops in the registers). Interesting problems analyzing this area-delay trade-off can be studied in this context. For performance constrained designs, the parity of register Z can be directly predicted from the contents of register X using another [...]

5. Conclusions

A quantitative metric for design diversity opens up opportunities for efficient design of dependable systems. While some problems and solutions are summarized in this paper, there are many other interesting open questions and problems related to design diversity that must be understood for efficient design of dependable systems. These include architectural synthesis of dependable systems with design diversity and estimation of design diversity of large systems using simulation.

Acknowledgments

The research on design diversity was done as part of the DARPA sponsored ROAR (Reliability Obtained by Adaptive Reconfiguration) project (http://crc.stanford.edu/projects/roar/roarSummary.html).

References

[Avizienis 84] Avizienis, A. and J. P. J. Kelly, “Fault Tolerance by Design Diversity: Concepts and Experiments,” IEEE Computer, pp. 67-80, Aug. 1984.
[Briere 93] Briere, D. and P. Traverse, “Airbus A320/A330/A340 Electrical Flight Controls: A family of fault-tolerant systems,” FTCS, pp. 616-623, 1993.
[Christmansson 98] Christmansson, J., M. Hiller, M. Rimen, “An Experimental Comparison of Fault and Error Injection,” Proc. Intl. Symp. Software Reliability Engineering, pp. 369-378, 1998.
[Hudak 93] Hudak, J., B. Suh, D. Siewiorek, Z. Segall, “Evaluation and Comparison of Fault-Tolerant Software Techniques,” IEEE Trans. Reliability, Vol. 42, pp. 190-[...]
[Kraft 81] Kraft, G. D., W. N. Toy, Microprogrammed Control and Reliable Design of Small Computers, 1981.
[Lala 94] Lala, J. H. and R. E. Harper, “Architectural Principles for Safety-Critical Real-Time Applications,” Proc. of the IEEE, vol. 82, no. 1, pp. 25-40, Jan. 1994.
[Littlewood 96] Littlewood, B., “The Impact of Diversity upon Common Mode Failures,” Reliability Engineering and System Safety, Vol. 51, No. 1, pp. 101-113, 1996.
[Lyu 91] Lyu, M. R. and A. Avizienis, “Assuring design diversity in N-version software: a design paradigm for N-version programming,” DCCA, pp. 197-218, 1991.
[Mitra 99] Mitra, S., N. R. Saxena and E. J. McCluskey, “A Design Diversity Metric and Reliability Analysis for Redundant Systems,” Proc. Intl. Test Conf., pp. 662-671, 1999. (CRC-TR-99-4, http://crc.stanford.edu).
[Mitra 00a] Mitra, S., N. R. Saxena and E. J. McCluskey, “Common-Mode Failures in Redundant VLSI Systems: A Survey,” IEEE Trans. Reliability, Sept. 2000.
[Mitra 00b] Mitra, S., E. J. McCluskey, “Combinational Logic Synthesis for Diversity in Duplex Systems,” Proc. International Test Conf., pp. 179-188, 2000.
[Mitra 00c] Mitra, S., N. R. Saxena and E. J. McCluskey, “Fault Escapes in Duplex Systems,” Proc. IEEE VLSI Test Symposium, pp. 453-458, 2000.
[Mitra 00d] Mitra, S., and E. J. McCluskey, “Which Concurrent Error Detection Scheme to Choose?,” Proc. International Test Conf., pp. 985-994, 2000.
[Mitra 01a] Mitra, S., E. J. McCluskey, “Design Diversity for Concurrent Error Detection in Sequential Logic Circuits,” Proc. VLSI Test Symp., pp. 178-183, 2001.
[Mitra 01b] Mitra, S., N. R. Saxena and E. J. McCluskey, “Techniques for Calculating Design Diversity for Combinational Logic Circuits,” Proc. Intl. Conf. Dependable Systems and Networks, 2001, To appear.
[Oh 00] Oh, N. S., S. Mitra and E. J. McCluskey, “ED4I: Error Detection by Diverse Data and Duplicated Instructions,” CRC-TR-00-8, (http://crc.stanford.edu), To appear in the IEEE Trans. Computers.
[Pradhan 96] Pradhan, D. K., Fault-Tolerant Computer System Design, Prentice Hall, 1996.
[Riter 95] Riter, R., “Modeling and Testing a Critical Fault-Tolerant Multi-Process System,” Proc. FTCS, pp. [...]
[Saxena 00] Saxena, N. R., et al., “Dependable Computing and On-Line Testing in Adaptive and Reconfigurable Systems,” IEEE Design and Test of Computers [...]
[Siewiorek 92] Siewiorek, D. P., R. S. Swarz, Reliable Computer Systems: Design and Evaluation, Digital Press, 1992.
[Spainhower 99] Spainhower, L. and T. A. Gregg, “S/390 Parallel Enterprise Server G5 fault tolerance,” IBM Journal of Research Development, Vol. 43, pp. 863-[...]
[Webb 97] Webb, C. F., and J. S. Liptay, “A High Frequency Custom S/390 Microprocessor,” IBM Journal Res. and Dev., Vol. 41, No. 4/5, pp. 463-474, 1997.
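As a supplementary sketch of the two ways of aggregating di,j values described in Section 2 (the probability-weighted expectation, and the average over worst-case fault pairs), the following Python fragment may help. The function names and the small matrix of di,j values are hypothetical and chosen only for illustration; they are not the values behind the results reported in the paper.

```python
def expected_diversity(d, p=None):
    """D as the expected value of d[i][j] over fault pairs (fi, fj).

    d: matrix where d[i][j] is the diversity for fault fi in the first
       implementation and fault fj in the second.
    p: matching matrix of fault-pair probabilities; if omitted, all fault
       pairs are assumed equally probable.
    """
    pairs = [(i, j) for i in range(len(d)) for j in range(len(d[i]))]
    if p is None:
        return sum(d[i][j] for i, j in pairs) / len(pairs)
    return sum(p[i][j] * d[i][j] for i, j in pairs)


def worst_case_diversity(d):
    """Average, over faults fi, of the minimum d[i][j] over all fj."""
    return sum(min(row) for row in d) / len(d)


# Made-up d_ij values for two faults in each implementation.
d_example = [[1.0, 0.875],
             [0.75, 1.0]]

print(expected_diversity(d_example))    # 0.90625
print(worst_case_diversity(d_example))  # 0.8125
```

With many fault pairs having di,j equal to 1, the uniform average stays close to 1, while the worst-case average is pulled down by the least diverse pair for each fault, which is why the two figures can differ noticeably.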