Programmable Interconnect Design
As we continue to build larger computing systems and scale to
smaller devices, interconnect becomes an increasingly important
design concern.
- As system size increases, the number of components we
are interconnecting increases.
- As feature sizes decrease, the relative cost of interconnect
increases.
A specific goal of the interconnect research is to make it
straightforward for future designers of large-scale computing systems
(e.g. multiprocessors, FPGAs, and System-on-a-Chip designs) to
evaluate the available technology parameters and application needs and
systematically engineer a suitable level and style of interconnect.
Specific targets of this research include: characterizing interconnect
requirements and matching these with universal networking structures,
understanding fundamental switching requirements, understanding the
design impact of multilayer wiring, characterizing delay impacts on
switching requirements, understanding the fundamental algorithmic
difficulty of routing and the tradeoffs therein, and understanding when
routing delays dominate the benefits of spatial parallelism.
The key problem in programmable and custom interconnect is
figuring out how to build interconnection structures to take
advantage of any locality and regularity in the computational
tasks in order to minimize resource usage (energy, area, routing
time) and maximize performance (maximize throughput, minimize
latency). Flat networks that provide uniform connectivity (e.g.
crossbars, multistage permutation networks) are moderately well
understood, but cost O(n^2) in wiring area and incur O(n) switching
latency between computational nodes. Since application requirements
range from O(n) to O(n^2) wiring area, and
there are clearly cases where O(1) latency between computational
nodes is achievable, the cost of these flat networks is
unreasonably high, especially as n becomes large
(already in the 10,000 to 100,000 range for FPGAs and in the millions
for full-custom designs; soon to be in the thousands even for
RISC-processor-granularity nodes on chip). Consequently, the key challenge in
custom and programmable interconnect is to understand the locality
structure of the computational task (or set of tasks) which the
device should implement and design the interconnect structure to
support this with minimum resources.
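To make the gap between O(n) and O(n^2) concrete, the following is a minimal sketch that counts the crosspoints of a full crossbar and compares them with a purely linear requirement. The function names are hypothetical, and all constant factors are assumed to be 1, so only the growth rates (not absolute areas) are meaningful.

```python
def crossbar_crosspoints(n: int) -> int:
    """Crosspoints in a full n-by-n crossbar: one per (input, output) pair."""
    return n * n


def overprovision_factor(n: int) -> int:
    """Ratio of the n^2 crossbar cost to a linear O(n) requirement.

    Assumes unit constant factors on both sides (an illustration only).
    """
    return crossbar_crosspoints(n) // n


if __name__ == "__main__":
    # Node counts chosen to match the ranges mentioned in the text.
    for n in (1_000, 10_000, 100_000):
        print(
            f"n={n:>7,}  crossbar crosspoints={crossbar_crosspoints(n):.1e}  "
            f"vs. linear requirement: {overprovision_factor(n):,}x"
        )
```

Under these assumptions the overprovisioning factor is simply n, which is why a flat network becomes harder to justify for applications with strong locality as n grows.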
The vision here is twofold:
- Identify key characteristics of the design which
define its resource requirements and performance; i.e.
find properties we can measure or estimate from the graph topology
(e.g. Rent parameters or bifurcator ratios), and use these
to provide bounds on the requisite area and perhaps performance and
energy.
- Identify and quantify key tradeoffs available to designers so
practitioners understand the options they have to minimize the cost
of their critical resources or quality metrics at the expense of
other metrics in their systems (e.g. reduce latency at the
expense of additional area; reduce switches at the expense of more wiring).
With these effects characterized and systematized, practitioners
should be able to make some standard measurements of the characteristics
of their application set and get an initial idea of the interconnect
requirements for their design. This should lead them to the appropriate
topology and help them pick parameters within that design space to
meet their general needs. As they push further to tune an architecture
to meet their requirements, the characterization of key tradeoffs
should help them understand the knobs available for tuning the interconnect
and the kind of effects they can expect from these knobs. This
understanding should guide them quickly to a general design point.
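As one illustration of such a standard measurement, the sketch below fits Rent parameters to a design. It assumes the usual form of Rent's Rule, IO = c * N^p, where N is the number of nodes in a subgraph and IO is the number of signals crossing its boundary, and it assumes the (N, IO) samples come from recursively partitioning a netlist. The function names and the sample data are hypothetical, and the fit is an ordinary least-squares line in log-log space rather than any particular published procedure.

```python
import math


def fit_rent(samples):
    """Fit IO = c * N**p by least squares on log(IO) versus log(N).

    samples: iterable of (n_nodes, io_count) pairs with positive values,
             e.g. gathered from recursive partitioning (assumed input).
    Returns (c, p).
    """
    samples = list(samples)
    xs = [math.log(n) for n, _ in samples]
    ys = [math.log(io) for _, io in samples]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    p = sxy / sxx                       # slope in log-log space
    c = math.exp(mean_y - p * mean_x)   # intercept, mapped back from logs
    return c, p


def rent_io(c, p, n):
    """Rent estimate of the IO count for a subgraph of n nodes."""
    return c * n ** p


if __name__ == "__main__":
    # Synthetic samples generated from c = 4, p = 0.6 (not measured data).
    synthetic = [(n, 4 * n ** 0.6) for n in (16, 64, 256, 1024)]
    c, p = fit_rent(synthetic)
    print(f"c = {c:.3f}, p = {p:.3f}")
    print(f"estimated IO for a 4096-node subgraph: {rent_io(c, p, 4096):.0f}")
```

A fitted exponent p near 1 suggests little locality and wiring demands closer to the flat-network end of the range, while a small p suggests strong locality that a cheaper, more hierarchical interconnect could exploit.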
For further reading, see Prof. DeHon's short article
on the role of interconnect.
Examples from our recent work include:
For a more complete list, see André's
collection of interconnect papers.