We have already noted that conventional FPGAs handle functional diversity poorly when it exceeds the aggregate functional capacity of a single device (Section ). Handling greater diversity may require reloading the FPGA configuration, a slow process on conventional FPGAs, and the device goes largely unused during the reload. Alternatively, a more generic processing unit can be built on top of the FPGA and microsequenced like a processor. In the most extreme case of spatial limitation, we end up building a processor-like design on top of the FPGA. Table summarizes the capacity density provided by several processors that have been built on top of FPGAs.
From Table , we see that such processors, when optimized for the FPGA, have a peak capacity of about 2 ALU bit operations/λ²s, or about one fourth the capacity of a custom processor. The R16 and jr16 architectures are fairly straightforward RISC designs, and are likely to yield about the same fraction of this peak capacity as most other RISC processors.
At a 4× penalty relative to custom processors, one would certainly be better off using, or building, a custom processor for high-diversity operations. As the commonality in the computational task increases and the area available on the FPGA increases, the FPGA can build more application-specialized structures, realizing higher capacity density. This suggests a continuum from the most highly diverse functional operations, where FPGAs are 4× less dense than processors, to the most regular operations, where FPGAs provide 10-100× more performance density.
It is also interesting to note that the performance density penalty for handling these highly diverse operations on an FPGA is much less than the performance density penalty associated with implementing a multiplication on the FPGA.
With only a 4× performance density penalty, an FPGA processor is roughly equivalent to a 4× smaller processor. From table , we have seen aggregate processor capacity increase from 15M in 1984 to 5G in 1995, or about 70% per year. The 4× capacity density penalty thus makes a processor implemented on an FPGA in a modern process roughly equivalent to a 2.5-3 year old processor. As such, FPGA processors, which can ride FPGA technology to track technology advances, may be an attractive option for running legacy assembly code.
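The growth rate and the age equivalence quoted above can be checked with a short calculation. The following is a minimal sketch, assuming smooth compound annual growth between the two data points; the function names `annual_growth` and `lag_years` are hypothetical and chosen only for illustration. The units of the two capacity figures cancel in the ratio, so only the values 15M and 5G are used.

```python
import math


def annual_growth(start, end, years):
    """Compound annual growth rate implied by two data points."""
    return (end / start) ** (1.0 / years) - 1.0


def lag_years(penalty, growth):
    """Years of growth at rate `growth` needed to make up a `penalty` factor."""
    return math.log(penalty) / math.log(1.0 + growth)


# Aggregate processor capacity: 15M in 1984, 5G in 1995 (figures from the text).
g = annual_growth(15e6, 5e9, 1995 - 1984)

# Assumed model: a 4x density penalty is treated as a fixed lag on the growth curve.
lag = lag_years(4.0, g)

print(f"growth: {g:.0%} per year")
print(f"lag for a 4x penalty: {lag:.1f} years")
```

Under these assumptions the growth rate comes out near 70% per year and the lag near 2.6 years, consistent with the 2.5-3 year estimate.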