The following content has been provided by the University of Erlangen-Nürnberg.
Okay, there we go again. So it's great to be back here in Erlangen. When I was here
last, I had just started my postdoc at the University of Virginia, and I was talking about
the work that I was doing as a PhD student. That was lifetime reliability and design for
manufacturability, system-level design type stuff. Today I'm going to be talking about
something completely different. I now have some research in safety-critical system design.
And fingerprinting, I have found, is a very poor choice of word, but now I'm stuck with it
because I've done some publishing using the word fingerprinting. I'm not talking about your
fingerprints; I'm talking about compression for the purposes of easy redundancy checking.
That's what I mean when I say fingerprinting. So we'll see what exactly we're talking about
in just a second, but I wanted to clear that up so that you didn't start my talk with some
wrong idea about what I'm going to spend the next 45 minutes talking about. Technical
difficulties abound. Okay,
I'll just use my finger instead. So safety-critical systems are becoming a part of our
everyday life. Going back a couple of decades, we've had computers integrated with our cars.
Now we're seeing drive-by-wire, where when you push your foot on the brake, that's not
the only input that might possibly control the brake. And very soon we'll see
steering-by-wire and other such things. But there are safety-critical systems in far more
places too. We had Phillip's talk earlier about medical devices. So there are more and more
computers in medical devices. And what's interesting about the automotive domain and the
medical system domain is that we want low-cost reliability. Obviously we want reliability
because these systems could affect whether we live or die. And embedded system security is demonstrating
that it is very easy actually to kill someone that is in a car because you can hack the
car and so on. So I'm not going to talk about security today. Obviously we want these systems
to be safe. Automotive and medical systems are different from aerospace systems because
in an aerospace system you're spending hundreds of millions of dollars, most of it not on
the computers that go into it. You can spend whatever you want on the computers in an airplane
because it's just a fraction of the cost of an airplane. But in cars, if we can reduce
the cost of the electronics in the car by a dollar or two, and then you ship 20 million
cars, then all of a sudden you've had a very significant impact on the cost of manufacturing,
maintaining, and all these other things. When we talk about reducing cost, though, you might
get scared, because you don't want the system to become less safe. So the question facing
automotive system designers and health system designers today is: how can we achieve the
same level of reliability, or even better reliability, while shaving costs or increasing
the number of features that we have? So just as a little bit of background, when we talk
about reliability in an automotive domain or a health system domain, really what we are
most concerned about is transient upsets.
So transient upsets are caused by either radiation from the packaging, for the most part, or
cosmic radiation: high-energy particles that are coming from the sun that don't happen to
be filtered by our atmosphere, for instance. And what happens is a particle strikes
some silicon atom in the silicon lattice, and then you have charge ionization as a result.
So all the energy from the particle is transferred into the silicon lattice, which shakes a
bunch of electrons free, essentially. These electrons are then collected by a diffusion area
in one of your transistors, and what this can do is cause a current spike in your
transistor. As a result, for some computation where you are supposed to get a one, you get
a zero instead, or vice versa. Now, transistor scaling has enabled so many of these fabulous
applications, like putting 100 microprocessors in your car, or putting semiconductors in a
band-aid, for instance, like we were talking about this morning. The problem with transistor
scaling is that when we make a transistor half the size, the amount of charge required to
disrupt it also goes down by half. So what we see when we move from 180-nanometer technology
to 16-nanometer technology is that the failure rate for a single bit goes up by, sorry, the
failure rate
Presenters
Prof. Dr. Brett Meyer
Accessible via
Open access
Duration
01:05:43
Recording date
2014-03-21
Uploaded on
2014-03-28 14:54:08
Language
de-DE
Prof. Brett Meyer (McGill University, Canada)
Recently, the combination of semiconductor manufacturing technology scaling and pressure to reduce semiconductor system costs and power consumption has resulted in the development of computer systems responsible for executing a mix of safety-critical and non-critical tasks. However, such systems are poorly utilized if lockstep execution forces all processor cores to execute the same task even when not executing safety-critical tasks. Execution fingerprinting has emerged as an alternative to n-modular redundancy for verifying redundant execution without requiring that all cores execute the same task or even execute redundant tasks concurrently. Fingerprinting takes a bit stream characterizing the execution of a task and compresses it into a single, fixed-width word or fingerprint.
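As a rough illustration of the last sentence above, the following Python sketch compresses the stream of stores produced by each redundant execution into one 32-bit word and compares the words. Everything named here is an assumption made for illustration: the abstract does not say which compression function is used, so CRC-32 (`zlib.crc32`) stands in for it, and `fingerprint`, `run_task`, and the `(address, value)` store format are hypothetical.

```python
import struct
import zlib


def fingerprint(stores):
    """Compress a stream of (address, value) stores into one 32-bit word.

    CRC-32 is an assumed stand-in for the fingerprinting function.
    """
    fp = 0
    for address, value in stores:
        # Pack each store as two little-endian 32-bit words and fold it
        # into the running CRC.
        fp = zlib.crc32(struct.pack("<II", address, value), fp)
    return fp


def run_task(flip_bit_at=None):
    """Hypothetical task: emit 16 stores, optionally with one bit flipped."""
    stores = []
    for i in range(16):
        value = i * i
        if i == flip_bit_at:
            value ^= 1 << 3  # model a transient upset as a single bit flip
        stores.append((0x1000 + 4 * i, value))
    return stores


golden = fingerprint(run_task())
redundant_ok = fingerprint(run_task())
redundant_faulty = fingerprint(run_task(flip_bit_at=5))

print(golden == redundant_ok)      # two fault-free runs produce the same word
print(golden == redundant_faulty)  # the injected bit flip changes the word
```

In this sketch only one 32-bit word per execution has to be exchanged and compared, rather than all 16 stores, and the two executions do not have to run at the same time.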
Fingerprinting has several key advantages. First, it reduces redundancy-checking bandwidth by compressing changes to external state into a single, fixed-width word. Second, it reduces error detection latency by capturing and exposing intermediate operations on faulty data. Third, it naturally supports the design of mixed criticality systems by making dual-, triple-, and n-modular redundancy available without requiring significant architectural changes. Fourth, while it can’t guarantee perfect error detection, error detection probabilities and latencies can be tuned to a particular application. Together, these advantages translate to improved performance for mixed-criticality systems.
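To make the bandwidth and detection-probability trade-off above a little more concrete, here is a small Python sketch. It rests on an idealized assumption that is not taken from the talk: a corrupted execution is assumed to produce a fingerprint that is uniformly random over all values of the chosen width, so it collides with the correct fingerprint with probability 2^-width. The function names and the example figures (1000 stores of 64 bits per comparison interval) are hypothetical.

```python
def aliasing_probability(width_bits):
    """Probability that a faulty run still matches the correct fingerprint,
    under the assumed uniformly-random-fingerprint model."""
    return 2.0 ** -width_bits


def compression_ratio(stores_per_interval, bits_per_store, width_bits):
    """Raw bits that would be compared per interval, divided by the bits
    actually exchanged when only one fingerprint is sent."""
    return stores_per_interval * bits_per_store / width_bits


for width in (8, 16, 32):
    print(width, aliasing_probability(width), compression_ratio(1000, 64, width))
```

Under this model, a wider fingerprint lowers the chance of a missed error at the cost of more checking bandwidth, and comparing fingerprints more often (fewer stores per interval) shortens detection latency while lowering the compression ratio.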
In this talk, I will describe fingerprinting in safety-critical systems and explore the various trade-offs inherent in its application at the architectural level and choices related to fingerprinting subsystem design, including: (a) determining what application data to compress, as a function of error detection probability and latency, and (b) identifying a corresponding fingerprinting circuit implementation.
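One family of circuits that could play the role mentioned in (b) is a linear-feedback shift register computing a CRC. This is an assumption for illustration only; the abstract does not name the circuit. The Python sketch below models such a register one bit at a time, the way the hardware would see the data. The polynomial 0x1021 (CRC-CCITT) is an arbitrary choice, picked because Python's `binascii.crc_hqx` uses the same polynomial and can serve as a cross-check; `lfsr_fingerprint` is a hypothetical name.

```python
import binascii

POLY = 0x1021  # x^16 + x^12 + x^5 + 1 (CRC-CCITT); an assumed choice


def lfsr_fingerprint(data, state=0):
    """Shift `data` through a 16-bit LFSR one bit at a time, MSB first."""
    for byte in data:
        for bit_index in range(7, -1, -1):
            in_bit = (byte >> bit_index) & 1
            # Feedback is the register's top bit XORed with the incoming bit.
            feedback = ((state >> 15) & 1) ^ in_bit
            state = (state << 1) & 0xFFFF
            if feedback:
                state ^= POLY
    return state


message = b"123456789"
print(hex(lfsr_fingerprint(message)))
# Cross-check the bit-serial model against the standard library CRC.
print(lfsr_fingerprint(message) == binascii.crc_hqx(message, 0))
```

A register like this needs only 16 flip-flops and a few XOR gates, which is one reason CRC-style compression is a plausible candidate for a low-cost checker. Whether it matches the error detection probability and latency targets of a given application is exactly the kind of trade-off the abstract refers to.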