Publication detail

On Precise Fault Localization and Identification in NoC Architectures

Type

conference paper

Language

English

Original Abstract

This paper presents a novel online fault-tolerance method for network-on-chip (NoC) architectures, based on precise fault localization and identification. We introduce three concepts: distinguishing between intra-switch path faults; a retransmission credit, used to distinguish permanent from transient faults; and a long transient recovery timeout, used to distinguish short transient faults from long transient faults (or bursts of them). Errors are also monitored separately on links and switches. Together, these concepts yield higher NoC performance than existing error recovery schemes. Experimental results report the performance and resource utilization of the proposed NoC error recovery scheme.

Keywords

fault tolerance, error recovery, network on chip, permanent or transient fault, performance, resource utilization

Authors

ŠŤÁVA, M.

Released

28. 8. 2019

Publisher

EUROMICRO

ISBN

978-1-7281-2861-0

Book

22nd Euromicro Conference on Digital System Design, DSD 2019

Pages from

451

Pages to

457

Pages count

7

BibTeX

@inproceedings{BUT157223,
  author="Martin {Šťáva}",
  title="On Precise Fault Localization and Identification in NoC Architectures",
  booktitle="22nd Euromicro Conference on Digital System Design, DSD 2019",
  year="2019",
  pages="451--457",
  publisher="EUROMICRO",
  doi="10.1109/DSD.2019.00075",
  isbn="978-1-7281-2861-0"
}