Implementation of Computation Group


Continuous Online Self-Monitoring Introspection Circuitry for Timing Repair by Incremental Partial-reconfiguration (COSMIC TRIP)

Hans Giesen, Benjamin Gojman, Raphael Rubin, Ji Kim, and André DeHon
Proceedings of the IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM, May 1–3, 2016)

We show that continuously monitoring on-chip delays at the LUT-to-LUT link level during operation allows an FPGA to detect and self-adapt to aging and environmental effects on timing. Using a lightweight (<4% added area) mechanism for monitoring transition timing, a Difference Detector with First-Fail Latch, we can estimate the timing margin on circuits and identify the individual links that have degraded and whose delay is determining the worst-case circuit delay. Combining this monitoring with Choose-Your-Own-Adventure precomputed, fine-grained repair alternatives, we introduce a strategy for rapid, in-system incremental repair of links with degraded timing. We show that these techniques allow us to respond to a single aging event in less than 300 ms for the toronto20 benchmarks. The result is a step toward systems where adaptive reconfiguration on the time-scale of seconds is viable and beneficial.
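
The monitor-then-repair idea in the abstract can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the paper's circuitry or algorithm: the function names, the dictionary layout, the picosecond figures, and the "pick the fastest precomputed alternative" rule are all hypothetical. It assumes per-link delays are already available as numbers, estimates the timing margin from the slowest path, flags links on that path that have slowed beyond a threshold, and swaps each for a precomputed alternative.

```python
# Toy model only: names, data layout, and numbers are illustrative
# assumptions, not the mechanism or algorithm described in the paper.

def worst_path(paths, link_delay):
    """Return (path_name, total_delay) for the slowest path."""
    totals = {name: sum(link_delay[l] for l in links)
              for name, links in paths.items()}
    name = max(totals, key=totals.get)
    return name, totals[name]

def timing_margin(paths, link_delay, clock_period):
    """Margin = clock period minus the worst-case path delay."""
    return clock_period - worst_path(paths, link_delay)[1]

def degraded_critical_links(paths, link_delay, baseline, threshold):
    """Links on the slowest path whose delay grew by more than threshold."""
    name, _ = worst_path(paths, link_delay)
    return [l for l in paths[name] if link_delay[l] - baseline[l] > threshold]

def repair(link_delay, alternatives, links):
    """Replace each listed link with its fastest precomputed alternative,
    but only if that alternative is actually faster."""
    repaired = dict(link_delay)
    for l in links:
        options = alternatives.get(l, [])
        if options and min(options) < repaired[l]:
            repaired[l] = min(options)
    return repaired

# Hypothetical delays in picoseconds.
baseline = {"a": 400, "b": 300, "c": 500}
paths = {"p1": ["a", "b"], "p2": ["c"]}
aged = dict(baseline, b=450)          # link "b" has slowed by 150 ps
alternatives = {"b": [320, 360]}      # precomputed replacement delays for "b"

bad = degraded_critical_links(paths, aged, baseline, threshold=50)
fixed = repair(aged, alternatives, bad)
print(timing_margin(paths, aged, 800), bad, timing_margin(paths, fixed, 800))
# prints: -50 ['b'] 80
```

In this toy run the aged design misses an 800 ps clock by 50 ps, link "b" is identified as the degraded link on the critical path, and switching to its 320 ps alternative restores an 80 ps margin. Only the flagged link is changed, which mirrors the incremental character of the repair at a very coarse level.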

