NCSU Libraries

Title page for ETD etd-07162007-144331


Type of Document Dissertation
Author Reddy, Vimal Kodandarama
Author's Email Address vkreddy@ncsu.edu
URN etd-07162007-144331
Title EXPLOITING MICROARCHITECTURE INSIGHTS FOR EFFICIENT FAULT TOLERANCE
Degree PhD
Graduate Program Computer Engineering
Advisory Committee
  Advisor Name        Title
  Eric Rotenberg      Committee Chair
  Suleyman Sair       Committee Member
  Thomas M. Conte     Committee Member
  Warren Jasper       Committee Member
Keywords
  • partial redundant threading
  • fault tolerance
  • predictive checking
  • microarchitecture insights
  • slipstream
  • inherent time redundancy
  • fault-tolerant superscalar processor
Date of Defense 2007-07-15
Availability unrestricted
Abstract
Technology scaling makes transistors more susceptible to transient faults. As a result, it is becoming increasingly important to incorporate transient fault tolerance in future processors. Traditional transient fault tolerance approaches duplicate in time or space for robust fault tolerance, but are expensive in terms of performance, area, and power, counteracting the very benefits of technology scaling. To make fault tolerance viable for commodity processors, unconventional techniques are needed that provide significant fault protection in an efficient manner. In this spirit, this thesis presents two low-overhead approaches to fault tolerance based on microarchitecture insights.

First, prediction-based partial redundant threading (PRT) is presented as a low-overhead alternative to full redundant multithreading (RMT). In RMT, two copies of a program are executed on a simultaneous multithreading (SMT) substrate. Outcomes of duplicated instructions are compared to detect transient faults in the processor. RMT incurs high performance and power overheads due to full redundant execution (slowdowns as high as 40%). In prediction-based PRT, confident predictions are leveraged as effective proxies for redundant execution, based on the idea that a correct prediction of an instruction's outcome is the same as the outcome produced by fault-free execution of the instruction. Confidently predicted instructions and their producers are skipped in the redundant thread (as many as 57% of instructions are skipped). This predictive thread is shown to be as effective as a full thread for checking purposes, but much more efficient.
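
The checking idea in the paragraph above can be illustrated with a toy model. The following is only a minimal sketch under stated assumptions, not the dissertation's design: it assumes a simple last-value predictor with a saturating confidence counter, and it assumes that a disagreement between a confident prediction and the main thread's outcome is resolved by falling back to redundant execution. All names (`Instr`, `LastValuePredictor`, `check_trace`, `CONF_THRESHOLD`) are hypothetical, and the skipping of producer instructions is not modeled.

```python
import operator
from typing import Callable, Dict, List, NamedTuple, Optional, Tuple

CONF_THRESHOLD = 3  # hypothetical confidence needed before a prediction is trusted


class Instr(NamedTuple):
    pc: int
    op: Callable[[int, int], int]
    a: int
    b: int
    main_result: int  # outcome from the main thread, possibly corrupted


class LastValuePredictor:
    """Toy last-value predictor with a saturating confidence counter per PC."""

    def __init__(self) -> None:
        self.table: Dict[int, Tuple[int, int]] = {}  # pc -> (value, confidence)

    def predict(self, pc: int) -> Optional[int]:
        entry = self.table.get(pc)
        if entry is not None and entry[1] >= CONF_THRESHOLD:
            return entry[0]
        return None  # not confident: the caller must execute redundantly

    def train(self, pc: int, verified: int) -> None:
        entry = self.table.get(pc)
        if entry is not None and entry[0] == verified:
            self.table[pc] = (verified, min(entry[1] + 1, CONF_THRESHOLD))
        else:
            self.table[pc] = (verified, 0)


def check_trace(trace: List[Instr]) -> Tuple[List[int], int]:
    """Return (indices of detected faults, redundant executions skipped)."""
    predictor = LastValuePredictor()
    faults: List[int] = []
    skipped = 0
    for i, ins in enumerate(trace):
        predicted = predictor.predict(ins.pc)
        if predicted is not None and predicted == ins.main_result:
            # Confident prediction agrees with the main thread: treat it as
            # a proxy for redundant execution and skip the duplicate work.
            skipped += 1
            reference = predicted
        else:
            # Assumption: no confident prediction, or a disagreement, is
            # resolved by actually re-executing the instruction.
            reference = ins.op(ins.a, ins.b)
            if reference != ins.main_result:
                faults.append(i)
        predictor.train(ins.pc, reference)
    return faults, skipped


good = Instr(0x10, operator.add, 2, 3, 5)
bad = good._replace(main_result=7)  # injected fault in the main thread
faults, skipped = check_trace([good] * 7 + [bad])
print(faults, skipped)
```

In this toy trace, the first few instances of the instruction are executed redundantly while the predictor gains confidence; later instances are checked against the prediction alone, and the corrupted final instance is caught by the fallback re-execution.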

Second, a superscalar processor is designed with built-in checks that indirectly detect low-level transient faults by observing the microarchitecture-level anomalies they cause. A single check covers many logic blocks, similar in spirit to outcome checks in RMT, but without the overheads of redundant execution. This dissertation develops several novel microarchitecture-level fault checks for protecting critical superscalar processor structures. Most notably: 1) inherent time redundancy (ITR) exploits program repetition to detect faults in decode signals, thereby covering the fetch and decode units; 2) register name authentication (RNA) asserts consistencies among renaming structures to detect faults affecting register renaming; and 3) timestamp-based assertion checking (TAC) asserts sequential order among dependent instructions to detect faults affecting dynamic instruction scheduling. Based on these checks, a fault-checking regimen is engaged to comprehensively protect a superscalar processor pipeline. To evaluate the fault tolerance of the processor, a new fault injection strategy is developed. It involves analyzing the microarchitecture of a superscalar processor in depth and identifying high-level faults that can be modeled in a timing simulator, enabling a fast and reasonably accurate evaluation. Extensive fault injection experiments reveal that the new fault-checking regimen provides substantial fault coverage to the processor, making the case for a canonical fault-tolerant superscalar processor.
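
Two of the checks named above can be sketched in a few lines. This is a loose illustration built only from the one-sentence descriptions in the abstract, not the dissertation's mechanisms. It assumes that ITR can be modeled as comparing the decode signals of repeated instances of the same instruction address, and that TAC can be modeled as asserting that a consumer issues no earlier than its producer's issue cycle plus the producer's latency. The names `ItrChecker` and `tac_violations`, and the use of issue cycles as timestamps, are hypothetical.

```python
from typing import Dict, Hashable, List, Tuple


class ItrChecker:
    """Toy repetition check: decode signals for a PC should not change."""

    def __init__(self) -> None:
        self.seen: Dict[int, Hashable] = {}  # pc -> first decode signals seen

    def check(self, pc: int, decode_signals: Hashable) -> bool:
        # The first instance is recorded; later instances must match it.
        previous = self.seen.setdefault(pc, decode_signals)
        return previous == decode_signals


def tac_violations(
    issue_cycle: Dict[str, int],
    latency: Dict[str, int],
    deps: List[Tuple[str, str]],
) -> List[Tuple[str, str]]:
    """Return (producer, consumer) pairs that issued out of dependence order.

    Assumption: a consumer is legal only if it issues at or after the cycle
    in which its producer's result becomes available.
    """
    return [
        (producer, consumer)
        for producer, consumer in deps
        if issue_cycle[consumer] < issue_cycle[producer] + latency[producer]
    ]


itr = ItrChecker()
first = itr.check(0x40, ("add", 1, 2, 3))    # first instance, recorded
repeat = itr.check(0x40, ("add", 1, 2, 3))   # repetition agrees
corrupt = itr.check(0x40, ("add", 1, 2, 7))  # one decode field differs

violations = tac_violations(
    issue_cycle={"i1": 0, "i2": 1, "i3": 5},
    latency={"i1": 3, "i2": 1, "i3": 1},
    deps=[("i1", "i2"), ("i1", "i3")],
)
print(first, repeat, corrupt, violations)
```

A mismatch in the repetition check only indicates that one of the two instances was faulty; deciding which one, and how to recover, is outside this sketch.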

Files
  Filename   Size      Approximate Download Time (Hours:Minutes:Seconds)
                       28.8 Modem   56K Modem   ISDN (64 Kb)   ISDN (128 Kb)   Higher-speed Access
  etd.pdf    1.35 MB   00:06:15     00:03:12    00:02:48       00:01:24        00:00:07