Deep Learning on FPGAs
The recent successes of deep learning are largely attributed to advances in hardware acceleration, which accommodate the rapid growth of dataset sizes and model complexity. The current solution uses clusters of graphics processing units (GPUs) to achieve performance beyond that of general-purpose processors (GPPs), but field-programmable gate arrays (FPGAs) are gaining popularity as an alternative due to their low power consumption and flexible architecture. However, the infrastructure available for deep learning on FPGAs lags far behind that available for GPPs and GPUs, and the practical challenges of developing such infrastructure are often ignored in contemporary work. Through the development of a software framework that extends the popular Caffe framework, this thesis demonstrates the viability of FPGAs as an acceleration platform for deep learning and addresses many of the associated technical and practical challenges.