Parallelizing Sparse Matrix Solve for SPICE Circuit Simulation using FPGAs
Nachiket Kapre and André DeHon
Proceedings of the IEEE International Conference on Field-Programmable Technology, pp. 190--198 (FPT, December 09--11, 2009)
Fine-grained dataflow processing of sparse Matrix-Solve computation (Ax=b) in the SPICE circuit simulator can provide an order-of-magnitude performance improvement on modern FPGAs. Matrix Solve is the dominant component of the simulator, especially for large circuits, and is invoked repeatedly during the simulation, once for every iteration. We process the sparse-matrix computation generated from the SPICE-oriented KLU solver in dataflow fashion across multiple spatial floating-point operators coupled to high-bandwidth on-chip memories and interconnected by a low-latency network. Using this approach, we show speedups of 1.2-64x (geometric mean of 8.8x) for a range of circuits and benchmark matrices when comparing double-precision implementations on a 250MHz Xilinx Virtex-5 FPGA (65nm) and an Intel Core i7 965 processor (45nm).
© 2009 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.
N.B. See the journal version for the composite SPICE implementation.