时间:2024-07-28
DIAO Ming,YU Lianhua,*,and CHEN Xiaobo
1.College of Information and Communication Engineering,Harbin Engineering University,Harbin 150001,China;
2.College of Automation,Harbin Engineering University,Harbin 150001,China
Abstract:In order to make systems that are based on unreliable components reliable,the design of fault tolerant architectures will be necessary.Inspired by von Neumann’s negative AND(NAND)multiplexing and William’s interwoven redundant logic,this paper presents a fault tolerant technique based on redundancy-modified NAND gates for future nanocomputers.Bifurcation theory is used to analyze fault tolerant ability of the system and the simulation results show that the new system has a much higher fault tolerant ability than the conventional multiplexing based on NAND gates.According to the evaluation,the proposed architecture can tolerate a device error rate of up to 10-1,with multiple redundant components.This fault tolerant technique is potentially useful for future nanoelectronics.
Keywords:fault tolerant,multiplexing,redundancy-modified negative AND(NAND)gate.
With the rise of nanoelectronic technology,we are inevitably faced with the question of how to build reliable systems out of unreliable components.To tackle this problem,several fault tolerant techniques based on redundancy have been investigated,such as N-tuple modular redundancy(e.g.,triple modular redundancy)[1-3],reconfiguration[3-6],quadded logic[7,8]and other redundancy techniques[9-14].However,with these techniques alone,high fault tolerance is hard to achieve for nanocomputers since faulty components are pervasive in space and time.Thus,von Neumann’smultiplexing,which essentially treats unreliable components as an integral part of the system,has received attention[15].In 1952,von Neumann initiated the study of multiplexing technique[16],since then,a wealth of papers reporting performance analysis of multiplexing have been published.The multiplexing technique theoretically demonstrated may work effectively with a practically acceptable redundancy overhead[17]and has been studied as an effective fault tolerant technique for protection against the increasing transient faults in nanoelectronic circuits[15,18-22].The vast majority of those studies have focused on negative AND(NAND)multiplexing[18-23],majority multiplexing[24-26]and negative OR(NOR)multiplexing[27],where NAND multiplexing obtained the most extensive attention.
However,the traditional NAND multiplexing architecture can only tolerate a device error rate of up to 10-2,in other word,its fault tolerant ability is not very satisfactory.In this paper,instead of the NAND gate which is used for two-layer error correction,we adopt the redundancy modified NAND gate[28]which is used for single-layer error correction to build restorative units.As a result,the traditional NAND multiplexing has been improved to have a much better fault tolerant ability.And different from other papers,here,we focus our study on the exclusive OR(XOR)multiplexing,which was presented for the first time in our last work[29].The improved multiplexing architecture consists of XOR gates and redundancy-modified NAND gates and comprises an executive unit and restorative units.Restorative units perform the error correction function,and they are composed of redundancy-modified NAND gates.The system performance of the architecture is evaluated by studying its fault tolerant ability,i.e.,gate error threshold and input signal error threshold,where the gate error threshold is the maximum gate error probability that the system can still work properly,and the input signal error threshold is the maximum input signal error probability that the system can tolerate.Multiplexing based on redundancy-modified NAND has a much better fault tolerant ability than traditional multiplexing based on NAND gates.XOR multiplexing system has a unique feature,we name it as critical point property,which can indicate the fault tolerant ability of the system.
The rest of this paper is arranged as follows.In Section 2,the error distributions in executive unit and redundancy-modified NAND multiplexing system are pre-sented.In Section 3,we discuss the bifurcation analysis which is followed by Section 4:fault tolerant ability analysis of redundancy-modified NAND multiplexing system.And Section 5 is the conclusion.
The multiplexing technique uses redundant components to obtain reliable synthesis from unreliable components.To realize this technology,two types of units are usually required:they are executive unit and restorative unit.
The executive unit of redundancy-modified NAND multiplexing could be XOR,exclusive negative OR(XNOR),NOR or NAND,and here,we choose the XOR unit as the executive unit to analyze redundancy-modified NAND multiplexing.As shown in Fig.1,the XOR multiplexing unit has the same structure with the NAND multiplexing unit.Consider an XOR gate with an error probability ε.Duplicate the XOR gate N times,and replace each input of the XOR gate as well as its output by a bundle of N lines.The XOR multiplexing unit takes two bundles of N wires as inputs and generates a bundle of N wires as outputs.X,Y and Z are the inputs and output of the multiplexing unit,respectively.
Fig.1 NAND and XOR multiplexing units
The randomizing unit U performs random permutation.Despite this operation,the inputs from the first bundle are randomly paired with those in the second bundle to form the input pairs to the duplicated XOR gates[19].
In systems based on this construction,each signal is carried on a bundle of N wires instead of a single wire and every logic computation is done by N duplicated gates simultaneously.Assume that the error probability ε is a constant and the type of fault the XOR gate makes is that it inverts its output;i.e.,acts as an XNOR gate(a von Neumann error).
The XOR multiplexing unit is shown in Fig.1(b).Assume that(x,y,z)are relative levels of excitation of the two input bundles and of the output bundle,respectively.If the error probabilities in the two input wires are independent,the probability of the output being stimulated is z=x(1-y)+y(1-x)(assume that the XOR gate is fault free).If each gate has a probability ε of making a von Neumann error,the probability of its output is found stimulated as
And the probability of its output being non-stimulated is 1-z.If the N XOR gates function independently,the entire XOR multiplexing unit constitutes a Bernoulli sequence.The distribution of the probability of the stimulated output is,therefore,the binomial distribution,and the probability of exactly k outputs being stimulated is then
And when N is rather large(N>1 000),the probability density of k can be shown as
Therefore,the error probability of the XOR multiplexing unit is approximately normally distributed when N is rather large(N>1 000)[19].
If we assume that the two input bundles of the executive unit have almost the same stimulated or non-stimulated levels(which is likely in circuits),then for XOR,the error probability of the output bundle will always be larger than the error probability in either one of the in put bundles.Hence,we need a unit to restore the original stimulated level of the executive unit and this unit is the restorative unit,also known as the restoring organ.Any logic gates,like NAND gate,NOR gate,AND-OR gates,effectively alternate critical errors and subcritical errors,thereby performing error correction and could be used as restoring organs.Note that an error is critical if its occurrence on one of the inputs causes an incorrect output,while a subcritical error is any error which is not critical error.There are different types of restoring organs.Two layers restoring organ that uses two inputs NAND gates can be found in von Neumann’s NAND multiplexing[15,16,19-21],but here,instead of two layers organs,we use a single layer restoring organ to realize it.This single layer restoring organ consists of four inputs AND-OR-NOT gate,which is also called redundancy-modified NAND gate[28].The redundancy-modified NAND gate corrects all single error.Its notation is shown in Fig.2.
Fig.2 Notation of AND-OR-NOT gate
If four redundant signals x1,x2,x3,x4are used as inputs to this gate,the circuit can be arranged so that the output is
If none or one of the xisignals is in error,then the output will be the correct value of z.Assume that the redundancy-modified NAND gate has the same error probability ε of making a von Neumannerror.Denote the probabilities of the four inputs and output being stimulated by X1,X2,X3,X4and Z,respectively.And further assume that the four inputs are independent.Then the probability of the output of redundancy-modified NAND gate being stimulated is
Multiplexing system is composed of an executive unit and multiple restoring organs.In order to compare it with others,we consider three multiplexing systems based on three different restoring organs,which are redundancy-modified NAND restoring organ,von Neumann’s NAND restoring organ and another two layers restoring organ which uses AND gates and OR gates.One version of the AND-OR restoring organ is shown in Fig.3.Note that it is also possible to have the OR gates in the first layer and the AND gates in the second,or to use them randomly instead of systematic interconnections.
Fig.3 AND-OR restoring organ
Assume that AND gates and OR gates have the same error probability ε of making a von Neumann error and the probability of inputs being stimulated is X.Then the probability of the outputs being stimulated is
Three kinds of XOR multiplexing are shown in Fig.4,where the XOR multiplexing based on NAND restoring organs was presented in[29]for the first time.For simplicity,below,we use X NAND represents XOR multiplexing based on NAND restoring organs,X RMNAND represents XOR multiplexing based on redundancy-modified NAND restoring organs and X AND-OR represents XOR multiplexing based on AND-OR restoring organs,respectively.In order to make the system stable,multiply restoring organs would be necessary and note that the odd stage number is necessary to keep the XOR function.
Fig.4 Three kinds of XOR multiplexing system
In order to gain understanding of the system dynamics of probabilistic logics and the associated concept of reliable computation,here,we focus on the operation of individual unreliable two inputs XOR gates and four inputs redundancy-modified NAND gates in a binary tree of cascaded XOR gates and redundancy-modified NAND gates,which is shown in Fig.5[15,18].We assume that the XOR gates and redundancy-modified NAND gates make a von Neumann error with the same probability ε,while the input and output wires function reliably.
Fig.5 Schematic of a full binary tree whose nodes are faulty two inputs XOR gates and redundancy-modified NAND gates with error probability ε
Denote the probabilities of the two inputs of XOR gate being stimulated by X,Y and further assume that the two inputs are independent.Then the probability of the output of the XOR gate being stimulated is
Assume that this circuit is a discrete time system and all inputs to the XOR gates are independent and each two inputs of those XOR gates have the same probabilities X and Y being stimulated.This structure guarantees that the inputs to all redundancy-modified NAND gates at an arbitrary stage n are also independent and have same probabilities Znbeing stimulated[15].Then for the second stage,the first stage of redundancy-modified NAND gates,the probabilities of the output being stimulated would be Z2:
For such a construction,(8)reduces to a simple iterative equation
Do the same operation to X AND-OR and X NAND and their iterative equations are
In order to better understand the relationship between system dynamics and gate error probability,we carry out bifurcation analysis of(9),(10)and(11),respectively[15,18].For any ε∈ [0,1/2],we choose an initial arbitrary condition X,Y and generate sequence Zi(i=1,2,...),by iterating(9),(10)and(11)until they converge to an attractor.Those attractors for the sequences are then plotted against each ε[15,18].This leads to the bifurcation diagram in Fig.6(Δε=0.005).
Fig.6 Bifurcation diagram of XOR multiplexing system
As can be seen from Fig.6,X RMNAND has a larger error threshold than X NAND and X AND-OR.And X AND-OR has exactly the same bifurcation diagram as the X NAND.This means that although their components and iterative equations are different,they have the same gate error tolerant ability,and their abilities are much smaller than the ability of X RMNAND.Obviously,compared to the NAND restoring organ and the AND-OR restoring organ,the redundancy-modified NAND restoring organ has a more superior fault tolerant performance.The analysis of the fault tolerant ability of X RMNAND will be discussed in the next section.
From Fig.6 we can see that the bifurcation points(error thresholds)divide the diagram into two regions,which are(i)0 < ε < ε∗(0 < ε<)and(ii)ε∗< ε< 1/2(< ε< 1/2).By solving the equation
one obtains Z0,where Z0is the fixed point solution for interval 0<ε<1/2.
In region(ii),Z0is a stable fixed point.By stable,it means that for any arbitrary initial inputs condition X and Y,the output Znwill converge to Z0when n is large,which means in this region,the system no longer functions.Expanding(12),we get a univariate quartic equation
which may be written as
where
For further details,please refer to[30].For X NAND,when< ε< 1/2,the system also has a stable fixed point solution,by solving the equation z0=1-ε+(2ε-1),then we get
Since X AND-OR has exactly the same bifurcation diagram as X NAND,the fixed point solution of X AND-OR is(21)too.
When 0? ε< ε∗,Z0loses stability,and the motion is periodic with a period of 2.We denote those two points by Z+and Z-.In this region,at the nth stage,when Z+is the input,then Z-is the output and vice versa[15,18].That means
Clearly,when Z+=Z-,we have ε= ε∗.By solving(22)and(23),it can be derived that ε∗=0.170 63.Now it is easy to see that 0? ε< ε∗is the parameter interval where the system functions and ε∗is the gate error threshold.When ε> ε∗,the outputs converge to stable fixed point Z0regardless of what the initial inputs are.The gate error thresholds of X NAND and X AND-OR can be derived in the same way.They both are 0.088 56.Obviously,multiplexing architectures based on redundancy-modified NAND restoring organs can tolerate a device error rate of up to 10-1,while multiplexing architectures based on NAND restoring organs and AND-OR restoring organs can only tolerate a device error rate of up to 10-2.Properly speaking,the gate error tolerant ability of X RMNAND is almost twice as much as that of X NAND and X ANDOR.There is no doubt that through the use of redundancy modified NAND restoring organs,the gate error tolerant ability of multiplexing has been greatly improved.
In the last subsection,we analyze one aspect of fault tolerant ability of the system,namely,the tolerant ability for gate error.Now let us analyze the tolerant ability for input signal error.In order to map each output probability to a logic state,we need a threshold z∗.According to Fig.6,a simple scheme is to set threshold at z∗=Z0(ε∗).Z0(ε∗)is a good approximation of Z0(|Z0(ε)-Z0(ε∗)|?0.004 3).However,in order to obtain more precise analysis,it is better to set z∗=Z0(and set=z0for X NAND and X AND-OR).
Below,we shall interpret[0,z∗)as non-stimulated state and(z∗,1]as stimulated state or[0,)as non-stimulated state and(,1]as stimulated state for X NAND and X AND-OR.
For X RMNAND, fix the inputs Y=1 and Y=0,then we can get 3-D diagrams as showed in Fig.7.
Fig.7 3-D diagrams of XOR multiplexing
A fact has been noticed that for each different fixed Y,when 0? ε< ε∗,there are different values of X(here we call it as critical point and denoted by x0)dividing the output into two states.Take Y =1 as an example,for 0? ε < ε∗,when X < x0,the output would be stimulated and when X>x0,the output would be non-stimulated.The calculation of the critical point can help us more intuitively understand the fault tolerant ability of the system.Now let us discuss the mathematic relation between them.When n is large enough and 0? ε < ε∗,the output is only dependent on the input condition:input X and input Y have the same logic state(both stimulated or both nonstimulated)or have different logic states(one of the inputs is stimulated and the other one is non-stimulated).Let us denote the probability that X and Y have different logic states by P1,and denote the probability that two inputs X and Y have the same logic state by P2.Thus the ratios of P1and P2will be a key parameter to determine the final output Znis larger than z∗or not.X and Y are the probabilities of in puts being stimulated,then1-X and1-Y are the probabilities of inputs being non-stimulated.P1and P2are shown as follows:
If we need the output to be stimulated,then P1/P2must be larger than a specific value which is greater than one.Since the output logic state is associated with the output threshold z∗,so the specific value would be a function of z∗and the mathematic relation between them is shown as follows:
Clearly,P1+P2=1,hence,(26)is equivalent to
IfP1> z∗,the final out put would be larger than the threshold(stimulated).When the inequality becomes
the final output would be smaller than the threshold(nonstimulated).Hence,it is easy to obtain the critical point x0by solving the equality
X NAND and X AND-OR have the same property,and their critical pointcan be obtained by solving
Clearly,the critical point is a function of Y and ε.Since when ε∗< ε< 1/2,the system no longer functions,we only consider the region of 0?ε?ε∗.Threshold z∗()changes monotonically with each different gate error probability ε,here,we choose the minimum ε=0and the maximum ε = ε∗as representatives.Plot those corresponding critical points against each Y(ΔY =0.01),this lead to Fig.8.
Just like the bifurcation diagram,X NAND and X AND-OR have exactly the same critical points distribution.Fig.8 shows that there is slight difference between critical points corresponding to ε= 0 and those corresponding to ε= ε∗.And the difference of X NAND and X AND-OR is more obvious than the difference of X RMNAND.This indicates that compared to X NAND and X AND-OR,the critical points of X RMNAND are less affected by the gate error probability ε.As we can see,for each different Y,the critical point has different values and there is a parameter interval that makes the system no longer function even though the system is fault free(ε=0).If the stimulated probability of one of the inputs is in this interval,then the outputs will always be non-stimulated even though the other input is 100%being stimulated or non-stimulated,namely,the output is stuck at 0.
Fig.8 Critical point against each Y
For more rigorous analysis of the results,we analyze the maximum interval that corresponds to ε=0.This parameter interval of X RMNAND is about 0.475 1<Y<0.524 9.Thus,the invalid input signal stimulation probability interval occupies only 4.98%.While the parameter interval of X NAND and X AND-OR is about 0.382 0<Y<0.6180,and the invalid input signal stimulation probability interval occupies 23.6%,which is much larger than 4.98%.This means that the X RMNAND has a much better input signal error tolerant ability than X NAND and X AND-OR.To be more specific,compared to X NAND and X AND-OR,the invalid input signal stimulation probability interval of X RMNAND is reduced by 78.898%.There is no doubt that this is a significant improvement in input signal error tolerant ability.
In order to more intuitively understand tolerant ability for input signal error of the system,several fixed Y and the corresponding critical points are extracted from Fig.8.These lead to Table1,obviously,X RMNAND can tolerate much more input signal error than X NAND and X ANDOR do.
Table 1 Critical points for several fixed Y(ε=0)
Assume that a stimulated output is expected,and then two inputs should have different logic states.Let us take Y=0.7 as an example,in X RMNAND,it can be seen that when input Y has a probability of 70%being stimulated,any stimulated probability smaller than 43.78%of the other input X could be accepted,namely,any nonstimulated probability not larger than 56.22%of input X will work for the system.This means that the system can tolerate input signal error probabilities of 30%and 43.78%for the inputs Y and X.X NAND and X AND-OR can only tolerate error probabilities of 30%and 20.49%for the inputs Y and X.This indicates that in the case of tolerating error probability of 30%for input Y,the tolerant ability of X RMNAND for the other input signal X is about 2.136 6 times that of X NAND and X AND-OR.
The emerging nanometer-scale devices are harder to fabricate and more sensitive to external influences. This makes nano-meter devices more prone to transient faults than complementary metal oxide semiconductor(CMOS) devices.The same as NAND multiplexing,although the XOR multiplexing technique is efficient against the increasing transient faults,it is less efficient against manufacturing defects or permanent faults.And because of the high redundancy,it may be more suitable for high hardware cost applications rather than low hardware cost applications.
In order to make systems based on nanoelectronic devices reliable,the design of fault tolerant architectures will be necessary.This paper can be seen as a part of the endeavor devoted to this work.In this paper,we study multiplexing architectures based on redundancy-modified NAND restoring organs and compare it to two other multiplexing architectures based on traditional NAND restoring organs and two layers AND-OR restoring organs,respectively.Experimental results show that this new fault tolerant architecture has a much higher fault tolerant ability of gate error probability and input signal error probability.This architecture is potentially effective in protection against the increasing transient faults for systems based on unreliable nanometer-scale devices.
我们致力于保护作者版权,注重分享,被刊用文章因无法核实真实出处,未能及时与作者取得联系,或有版权异议的,请联系管理员,我们会立即处理! 部分文章是来自各大过期杂志,内容仅供学习参考,不准确地方联系删除处理!