We have already noted that conventional FPGAs handle poorly any
functional diversity larger than the aggregate functional capacity
provided by a single device (Section ). Handling greater
diversity may require reloading the FPGA's programming, a slow process for
conventional FPGAs, during which the device goes largely
unused. Alternatively, a more generic processing unit can be built on top of
the FPGA and microsequenced like a processor. In the most extreme case of
spatial limitations, we might end up building a processor-like design on
top of the FPGA. Table summarizes the capacity density provided by
several processors that have been built on top of FPGAs.
From Table , we see that such processors, when
optimized for the FPGA, have a peak capacity of about 2 ALU bit
operations/
, or about one fourth the capacity of a
custom processor. The R16 and jr16 architectures are fairly
conventional RISC processor architectures, and are likely to yield about the
same fraction of this capacity as most other RISC processors.
At a 4× penalty relative to custom processors, one would certainly be
better off using, or building, a custom processor for high-diversity
operations. As the commonality in the computational task increases and the
area available to the FPGA increases, the FPGA can build more
application-specialized structures, realizing higher capacity density. This
suggests a continuum from the most highly diverse functional operations,
where FPGAs are 4× less dense than processors, to the most regular
operations, where FPGAs provide 10-100× more performance density.
It is also interesting to note that the performance density penalty for handling these highly diverse operations on an FPGA is much less than the performance density penalty associated with implementing a multiplication on the FPGA.
With only a 4× performance density penalty, an FPGA
processor is roughly equivalent to a 4× smaller processor. From
table , we have seen aggregate processor capacity
increase from 15M in 1984 to 5G in 1995, or about
70% per year. The 4× capacity density penalty thus makes a processor
implemented on an FPGA in a modern process roughly equivalent
to a 2.5-3 year old processor. As such, FPGA processors, which can ride
FPGA technology to track technology advances, may be an attractive
option for running legacy assembly code.
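
The arithmetic behind the 2.5-3 year estimate can be checked with a short
sketch. The Python below is illustrative only: the function names are
hypothetical, the capacity figures (15M in 1984, 5G in 1995) and the 4×
penalty are the values quoted above, and it assumes smooth compound annual
growth between the two endpoints.

```python
import math


def annual_growth_rate(start_capacity, end_capacity, years):
    """Compound annual growth rate implied by two capacity endpoints."""
    return (end_capacity / start_capacity) ** (1.0 / years) - 1.0


def years_behind(density_penalty, growth_rate):
    """Years of growth at `growth_rate` needed to make up `density_penalty`."""
    return math.log(density_penalty) / math.log(1.0 + growth_rate)


# Figures quoted in the text: 15M in 1984, 5G in 1995 (units as in the table).
rate = annual_growth_rate(15e6, 5e9, 1995 - 1984)

# Assumed penalty: the 4x capacity density gap of an FPGA-hosted processor.
lag = years_behind(4.0, rate)

print(f"implied growth: {rate:.0%} per year")
print(f"4x penalty is worth about {lag:.1f} years of growth")
```

Under these assumptions the implied growth rate comes out close to 70% per
year, and a 4× penalty corresponds to a lag that falls inside the 2.5-3 year
range stated above.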