FPGA compilation today is slow because it compiles and co-optimizes the entire design in one monolithic mapping flow. This achieves high-quality results, but it also means a long edit-compile-debug loop that slows development and limits the scope of design-space exploration. We introduce PRflow, which uses partial reconfiguration and an overlay packet-switched network to separate the HLS-to-bitstream compilation problem for the individual components of an FPGA design. This separation allows both incremental compilation, where a single component can be recompiled without recompiling the entire design, and parallel compilation, where all the components are compiled in parallel. Both reduce compilation time. Mapping the Rosetta Benchmarks to a Xilinx XCZU9EG, we show that compilation times drop from 42 minutes to 12 minutes (in one case from 160 minutes to 18 minutes) when running on top of commercial tools from Xilinx. Using Symbiflow (Project X-Ray/Yosys/VPR), we show preliminary evidence that we can further reduce most compile times to under 5 minutes, with some components mapping in less than 2 minutes.
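As a rough illustration of why separating components shortens the loop, the following Python sketch models wall-clock compile time under the two modes described above. It is a simplified timing model, not PRflow itself: the component names and per-component times are hypothetical, it assumes enough machines to compile every component at once, and it ignores any overlay or linking overhead.

```python
# Hypothetical per-component compile times in minutes (illustrative only,
# not measurements from the paper).
component_minutes = {"fetch": 6.0, "filter": 9.0, "reduce": 4.0}


def one_at_a_time(times):
    """Baseline: the separated components are compiled one after another."""
    return sum(times.values())


def parallel_compile_time(times):
    """All components compile at once, so the slowest one sets the wall-clock time."""
    return max(times.values())


def incremental_compile_time(times, changed):
    """Only the edited components are recompiled (in parallel); the rest are reused."""
    return max((times[name] for name in changed), default=0.0)


if __name__ == "__main__":
    print("one at a time:", one_at_a_time(component_minutes))
    print("parallel:", parallel_compile_time(component_minutes))
    print("incremental (reduce edited):",
          incremental_compile_time(component_minutes, ["reduce"]))
```

In this model the parallel time is bounded by the slowest component rather than by the sum, and an incremental rebuild costs only as much as the components that were edited.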
© 2019 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.