Packet-Switched vs. Time-Multiplexed FPGA Overlay Networks (FCCM 2006)



Article by Nachiket Kapre, Nikil Mehta, Michael deLorimier, Raphael Rubin, Henry Barnor, Michael J. Wilson, Michael Wrighton, and André DeHon, published in Proceedings of the IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM 2006, April 24–26, 2006), pp. 205–216.

Dedicated, spatially configured FPGA interconnect is efficient for applications that require high-throughput connections between processing elements (PEs) but with a limited degree of PE interconnectivity (e.g. wiring up gates and datapaths). Applications which virtualize PEs may require a large number of distinct PE-to-PE connections (e.g. using one PE to simulate 100s of operators, each requiring input data from thousands of other operators), but with each connection having low throughput compared with the PE’s operating cycle time. In these highly interconnected conditions, dedicating spatial interconnect resources for all possible connections is costly and inefficient. Alternatively, we can time share physical network resources by virtualizing interconnect links, either by statically scheduling the sharing of resources prior to runtime or by dynamically negotiating resources at runtime. We explore the tradeoffs (e.g. area, route latency, route quality) between time-multiplexed and packet-switched networks overlaid on top of commodity FPGAs. We demonstrate modular and scalable networks which operate on a Xilinx XC2V6000-4 at 166 MHz. For our applications, time-multiplexed, offline scheduling offers up to a 63% performance increase over online, packet-switched scheduling for equivalent topologies. When applying designs to equivalent area, packet-switching is up to 2× faster for small area designs while time-multiplexing is up to 5× faster for larger area designs. When limited to the capacity of an XC2V6000, if all communication is known, time-multiplexed routing outperforms packet-switching; however, when the active set of links drops below 40% of the potential links, packet-switched routing can outperform time-multiplexing.
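
As a rough illustration of the two sharing styles described in the abstract, the toy model below counts the cycles needed to move one word per virtual link over a single shared physical link. It is a sketch under stated assumptions, not the networks evaluated in the paper: the function names, the single-link topology, and the `header_cycles` parameter are all hypothetical, and the model ignores routing, congestion, and area.

```python
from collections import deque


def time_multiplexed_cycles(potential_links, active_links):
    """Offline schedule (toy): one slot is reserved per potential link.

    The slot table is fixed before runtime, so the schedule length does not
    shrink when some links turn out to be idle.
    """
    active = set(active_links)
    schedule = list(potential_links)  # slot t carries schedule[t]
    delivered = [link for link in schedule if link in active]
    return len(schedule), delivered


def packet_switched_cycles(active_links, header_cycles=1):
    """Online arbitration (toy): only active links inject packets.

    Each packet is assumed to pay `header_cycles` of addressing overhead
    on top of one payload cycle (an assumption made for illustration).
    """
    queue = deque(active_links)
    cycles = 0
    delivered = []
    while queue:
        link = queue.popleft()
        cycles += header_cycles + 1  # header + payload on the shared link
        delivered.append(link)
    return cycles, delivered


if __name__ == "__main__":
    potential = [(0, dst) for dst in range(1, 11)]  # 10 potential links
    for n_active in (3, 10):
        active = potential[:n_active]
        tm, _ = time_multiplexed_cycles(potential, active)
        ps, _ = packet_switched_cycles(active)
        print(f"active={n_active}: time-multiplexed={tm}, packet-switched={ps}")
```

In this toy setting the static schedule wins when every link is active and loses when few are, which mirrors the qualitative tradeoff in the abstract. The crossover point here is purely a product of the assumed `header_cycles` value and should not be read as the 40% figure reported in the paper.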

© 2006 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

This paper was recognized in the FCCM20 collection of the most significant papers during the first 20 years of the conference. See retrospective endorsement.

Paper

IEEE Xplore link: abstract
DOI: 10.1109/FCCM.2006.55
Author's personal copy in PDF: ps_tm_networks_fccm2006.pdf [180KB]