Design and Performance Analysis of Hardware Accelerator for Deep Neural Network in Heterogeneous Platform




Sefat, Md Syadus




This thesis describes a new, flexible approach to implementing energy-efficient DNN accelerators on FPGAs. Our design leverages the Coherent Accelerator Processor Interface (CAPI), which provides a cache-coherent view of system memory to attached accelerators. Computational kernels are accelerated on a CAPI-supported Kintex FPGA board. Our implementation bypasses the need for device driver code and significantly reduces communication and I/O transfer overhead. To improve the performance of the entire application, we propose a collaborative model of execution in which control of the data flow within the accelerator is kept independent, freeing up CPU cores to work on other parts of the application. For further performance enhancement, we propose a technique that exploits data locality in the cache situated in the CAPI Power Service Layer (PSL). Finally, we develop a resource-conscious implementation that uses FPGA resources more efficiently and scales better. Compared with previous work, our architecture achieves both improved performance and better power efficiency.



Hardware, Accelerator, DNN, FPGA


Sefat, M. D. S. (2018). Design and performance analysis of hardware accelerator for deep neural network in heterogeneous platform (Unpublished thesis). Texas State University, San Marcos, Texas.

