

Jun 27, 2016

What is MTBF?


A few interview questions on MTBF.

Q1. What is MTBF, and why is it important for analysis?

Q2. How is MTBF calculated?

Q3. How can MTBF be increased, and on what factors does it depend?

Q4. Is MTBF calculated for all flops in the design?

A few terms:
Set-Up Time: The time for which the input data signal at a flip-flop must be valid before the incoming clock edge arrives.
Hold Time: The time for which the input data signal must remain valid after the clock edge has transitioned.
Resolve Time: The amount of time available for the flip-flop's output to return to a valid level before it is used.
This is 1/{clock frequency} - path delay. The output must be valid by the next clock edge, minus any chip or routing delay.
Path Delay = Tcko + Troute + Tsu;
.... Tcko = clock-to-output time of the flip-flop,
.... Troute = any trace delay between the Q of the flip-flop and the next device reading that data,
.... Tsu = any set-up time required by the next device reading the data.
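
The resolve-time relation above can be sketched in a few lines of Python. This is only an illustration: the function names and the numeric values (a 100 MHz clock with 1 ns clock-to-output, 2 ns routing and 0.5 ns set-up) are assumptions made for the example, not figures from any data sheet.

```python
def path_delay(t_cko, t_route, t_su):
    """Path delay = Tcko + Troute + Tsu (all values in seconds)."""
    return t_cko + t_route + t_su


def resolve_time(clock_hz, t_cko, t_route, t_su):
    """Resolve time = 1/{clock frequency} - path delay (in seconds)."""
    return 1.0 / clock_hz - path_delay(t_cko, t_route, t_su)


# Hypothetical numbers, chosen only to illustrate the arithmetic.
t_res = resolve_time(100e6, 1e-9, 2e-9, 0.5e-9)
print(f"resolve time = {t_res * 1e9:.2f} ns")
```

With these assumed numbers, the 10 ns clock period leaves 6.5 ns for the output to settle.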

Skew {Clock or data}: The difference in arrival time of one signal compared to another, caused by timing delays or propagation delays; for example, the timing differences that develop between different devices performing the same function.

Ambiguity: The uncertainty in the amount of time it takes for a valid logic signal to change from one state to another.

Metastability Window: The specific interval of time during which the data and clock transitions should not both occur. If both do occur within it, the output may go metastable.

Let's understand MTBF.

MTBF is Mean Time Between Failures for flops. So when does a flop fail in a design?
A flop fails when it does not produce the required output. The next question is: why would it fail to produce the required output?

As we know, flip-flop failures are caused by timing violations, stuck-at-zero and stuck-at-one faults, manufacturing defects, and so on. Some of these are permanent; a timing violation is also effectively permanent for a flip-flop under a given set of conditions.

In a given design, once timing is closed, there are no setup/hold violations. But in a design that uses more than one clock, a signal that travels from one clock domain to another has to go through synchronizers. These synchronizers help prevent metastability from reaching the downstream flip-flops, but they do not work 100% of the time. To get an idea of how often the design fails because of its synchronizers, we calculate MTBF.

So MTBF is calculated mainly for the synchronizers in a design.

Let's understand what metastability is.

Metastable State: the signal is neither a valid 1 nor a valid 0, or it oscillates, for a nondeterministic length of time.
It can occur when insufficient energy is applied to cause a latch to switch to either a 1 or a 0.

Examples:
Dual processor with shared memory
FIFO with asynchronous input and output
Processor interrupts

Why Metastability is a “Special” Problem
  • It breaks most of the conceptual and computational tools that we use from day to day (e.g., binary or two-state circuits).
  • It defies careful and accurate measurement.
  • It can produce failures that leave no discernible evidence.
  • It can cause failures in systems whose software is “correct” and whose hardware passes all conventional tests.
  • It involves magnitudes of time and voltage far removed from our daily experience.



To understand how reliable the system is, we need to quantify how often metastability causes a failure, which can be done by calculating MTBF.
MTBF for a synchronization flip-flop can be estimated with the following formula.

MTBF = e^(C2 × T(met)) / (C1 × F(clk) × F(data))

Where
F(clk) is the system clock frequency.
F(data) is the data transfer frequency.
T(met) is the additional time allowed for the flip-flop to settle.
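
As a rough illustration of how these terms interact, here is a minimal Python sketch. It assumes the commonly quoted form MTBF = e^(C2 × T(met)) / (C1 × F(clk) × F(data)), with C1 in seconds and C2 in 1/seconds. The function name and every numeric value are hypothetical, chosen for the example rather than taken from a data sheet.

```python
import math


def mtbf_seconds(f_clk, f_data, t_met, c1, c2):
    """Estimate synchronizer MTBF in seconds.

    f_clk, f_data are in Hz; t_met and c1 are in seconds; c2 is in 1/seconds.
    """
    return math.exp(c2 * t_met) / (c1 * f_clk * f_data)


# Hypothetical values, for illustration only (not from any data sheet).
C1 = 1e-10      # seconds
C2 = 2e10       # 1/seconds
F_CLK = 100e6   # 100 MHz system clock
F_DATA = 1e6    # 1 MHz data transfer frequency

base = mtbf_seconds(F_CLK, F_DATA, 2e-9, C1, C2)
longer = mtbf_seconds(F_CLK, F_DATA, 3e-9, C1, C2)
faster_clk = mtbf_seconds(2 * F_CLK, F_DATA, 2e-9, C1, C2)

print(f"MTBF at T(met) = 2 ns:       {base:.3e} s")
print(f"MTBF at T(met) = 3 ns:       {longer:.3e} s")
print(f"MTBF at 2 ns, doubled clock: {faster_clk:.3e} s")
```

Under this form, MTBF grows exponentially with T(met) but falls only linearly as F(clk) or F(data) rises, which is why giving the flip-flop more settling time is usually the most effective way to increase MTBF.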

C1 and C2 are device-specific parameters, found by plotting the natural log of MTBF versus T(met) and performing linear regression analysis on the data.
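
The regression step can be sketched as follows, assuming the relation MTBF = e^(C2 × T(met)) / (C1 × F(clk) × F(data)). Taking the natural log gives ln(MTBF) = C2 × T(met) - ln(C1 × F(clk) × F(data)), so the slope of the fitted line is C2 and the intercept yields C1. The function name and the data points are hypothetical: the points are generated from known constants only to show that the fit recovers them, and real values would come from measurements.

```python
import math


def fit_c1_c2(t_met_values, mtbf_values, f_clk, f_data):
    """Least-squares fit of ln(MTBF) against T(met); returns (c1, c2)."""
    n = len(t_met_values)
    ys = [math.log(m) for m in mtbf_values]
    mean_x = sum(t_met_values) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in t_met_values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(t_met_values, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    c2 = slope                                    # slope of the line is C2
    c1 = math.exp(-intercept) / (f_clk * f_data)  # intercept = -ln(C1*Fclk*Fdata)
    return c1, c2


# Synthetic "measurements" built from assumed constants (illustration only).
TRUE_C1, TRUE_C2 = 1e-10, 2e10
F_CLK, F_DATA = 100e6, 1e6
t_mets = [1.0e-9, 1.5e-9, 2.0e-9, 2.5e-9, 3.0e-9]
mtbfs = [math.exp(TRUE_C2 * t) / (TRUE_C1 * F_CLK * F_DATA) for t in t_mets]

c1, c2 = fit_c1_c2(t_mets, mtbfs, F_CLK, F_DATA)
print(f"fitted C1 = {c1:.3e} s, fitted C2 = {c2:.3e} 1/s")
```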

These are device dependent. Resolve time (among other parameters) has to be looked up in the data sheet, if it is provided.
As a rule: the faster the flip-flop used, the better the MTBF for a given circuit.
The faster device families have lower set-up and hold times, which narrows the window in which metastability can occur.