Publication detail

Fault Tolerance Analysis and Self-Healing Strategy of Autonomous, Evolvable Hardware Systems

Original title

Fault Tolerance Analysis and Self-Healing Strategy of Autonomous, Evolvable Hardware Systems

English title

Fault Tolerance Analysis and Self-Healing Strategy of Autonomous, Evolvable Hardware Systems

Language

en

Original abstract

This paper presents an analysis of the fault tolerance achieved by an autonomous, fully embedded evolvable hardware system, which uses a combination of partial dynamic reconfiguration and an evolutionary algorithm (EA). It demonstrates that the system may self-recover from both transient and cumulative permanent faults. This self-adaptive system, based on a 2D array of 16 (4×4) Processing Elements (PEs), is tested with an image filtering application. Results show that it may properly recover from faults in up to 3 PEs, that is, more than 18% cumulative permanent faults. Two fault models are used for testing purposes, at PE and CLB levels. Two self-healing strategies are also introduced, depending on whether fault diagnosis is available or not. They are based on scrubbing, fitness evaluation, dynamic partial reconfiguration and in-system evolutionary adaptation. Since most of these adaptability features are already available on the system for its normal operation, resource cost for self-healing is very low (only some code additions in the internal microprocessor core).

English abstract

This paper presents an analysis of the fault tolerance achieved by an autonomous, fully embedded evolvable hardware system, which uses a combination of partial dynamic reconfiguration and an evolutionary algorithm (EA). It demonstrates that the system may self-recover from both transient and cumulative permanent faults. This self-adaptive system, based on a 2D array of 16 (4×4) Processing Elements (PEs), is tested with an image filtering application. Results show that it may properly recover from faults in up to 3 PEs, that is, more than 18% cumulative permanent faults. Two fault models are used for testing purposes, at PE and CLB levels. Two self-healing strategies are also introduced, depending on whether fault diagnosis is available or not. They are based on scrubbing, fitness evaluation, dynamic partial reconfiguration and in-system evolutionary adaptation. Since most of these adaptability features are already available on the system for its normal operation, resource cost for self-healing is very low (only some code additions in the internal microprocessor core).

BibTeX


@inproceedings{BUT76495,
  author="Ruben {Salvador} and Andres {Otero} and Javier {Mora} and Eduardo {De la Torre} and Lukáš {Sekanina} and Teresa {Riesgo}",
  title="Fault Tolerance Analysis and Self-Healing Strategy of Autonomous, Evolvable Hardware Systems",
  annote="This paper presents an analysis of the fault tolerance achieved by an autonomous,
fully embedded evolvable hardware system, which uses a combination of partial
dynamic reconfiguration and an evolutionary algorithm (EA). It demonstrates that
the system may self-recover from both transient and cumulative permanent faults.
This self-adaptive system, based on a 2D array of 16 (4×4) Processing Elements
(PEs), is tested with an image filtering application. Results show that it may
properly recover from faults in up to 3 PEs, that is, more than 18% cumulative
permanent faults. Two fault models are used for testing purposes, at PE and CLB
levels. Two self-healing strategies are also introduced, depending on whether
fault diagnosis is available or not. They are based on scrubbing, fitness
evaluation, dynamic partial reconfiguration and in-system evolutionary
adaptation. Since most of these adaptability features are already available on
the system for its normal operation, resource cost for self-healing is very low
(only some code additions in the internal microprocessor core).",
  address="IEEE Computer Society",
  booktitle="Proc. of the 2011 International Conference on ReConFigurable Computing and FPGAs",
  chapter="76495",
  edition="not specified",
  howpublished="print",
  institution="IEEE Computer Society",
  year="2011",
  month="december",
  pages="164--169",
  publisher="IEEE Computer Society",
  type="conference paper"
}