I am mainly interested in low-complexity, resource-constrained machine learning.
Below is a list of my projects.
Much of the research in the machine learning community focuses on enhancing accuracy and functionality with minimal consideration of energy costs. Only in 2015-2016 did papers on determining the precision requirements of neural networks start to appear at machine learning conferences. In this work, I obtained theoretical guarantees on the minimum precision requirements of neural networks. The goal of the project is to determine precision assignments for weights and activations in an analytically sound manner, reducing the need to run lengthy simulations. A paper on the topic was published at ICML 2017 (PMLR version + personal version with supplementary material in the same PDF). A follow-up work on this topic, with a fine-grained analysis and improved empirical results, was published at ICASSP 2018 (PDF). The code needed to generate these results and evaluate the costs is available on the Codes page of this website (which will redirect you here).
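To make the notion of a precision assignment concrete, here is a minimal uniform fixed-point quantizer (my own sketch, not the code from the papers): weights and activations are each rounded to a chosen number of bits, and the resulting error is bounded by half the quantization step, which is the kind of quantity an analytical precision bound reasons about.

```python
import numpy as np

def quantize(x, num_bits, x_max=1.0):
    """Uniformly quantize x to num_bits bits spanning [-x_max, x_max].
    The rounding error is at most half the step size."""
    step = 2.0 * x_max / (2 ** num_bits - 1)  # quantization step
    return np.clip(np.round(x / step) * step, -x_max, x_max)

# Hypothetical per-tensor precision assignment: 6 bits for weights,
# 4 bits for activations (illustrative values, not from the papers).
rng = np.random.default_rng(0)
w = rng.uniform(-1.0, 1.0, size=(4, 4))   # weights
a = rng.uniform(0.0, 1.0, size=4)         # activations

w_q = quantize(w, num_bits=6)
a_q = quantize(a, num_bits=4)
```

An analytical approach would pick `num_bits` per layer so that the accumulated quantization noise stays below a target accuracy loss, instead of sweeping bit-widths in simulation.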
In the summer of 2017, I had an internship at the IBM T.J. Watson Research Center. The internship focused on the topic of training deep neural networks with limited precision. Specifically, I worked on a method to apply gradient-based learning to binary-activated networks. This work was published at ICASSP 2018 (PDF).
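The paper describes its own method; as background on why gradient-based learning is tricky here, note that a hard binary activation has zero gradient almost everywhere. A widely used workaround is the straight-through estimator, sketched below (an illustrative example, not necessarily the technique in the paper).

```python
import numpy as np

def binary_act_forward(z):
    # Forward pass: hard sign activation in {-1, +1}.
    return np.where(z >= 0.0, 1.0, -1.0)

def binary_act_backward(z, grad_out):
    # Backward pass (straight-through estimator): pass the incoming
    # gradient through where |z| <= 1 and zero it elsewhere, since
    # the sign function itself gives no usable gradient.
    return grad_out * (np.abs(z) <= 1.0)

z = np.array([-2.0, -0.5, 0.3, 1.5])
a = binary_act_forward(z)                     # -> [-1., -1., 1., 1.]
g = binary_act_backward(z, np.ones_like(z))   # -> [0., 1., 1., 0.]
```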
This project is the precursor to the one above on the numerical precision of deep neural networks. This research seeks to bring rigor to the design of fixed-point learning systems, which is currently done by trial and error. Specifically, we characterized the precision-accuracy trade-off of support vector machines (SVMs). We derived several bounds that analytically predict the precision requirements of fixed-point SVMs for both classification and training via stochastic gradient descent (SGD). A paper on this topic appeared at ICASSP 2017 - check it out here. The extended version is posted on the arXiv.
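To make the setting concrete, here is a toy model of fixed-point SVM training (my own sketch, with hypothetical hyperparameters, not the analysis from the paper): plain SGD on the regularized hinge loss, with the weight vector re-quantized to a fixed bit-width after every update.

```python
import numpy as np

def quantize(x, num_bits, x_max=1.0):
    # Uniform fixed-point quantizer spanning [-x_max, x_max].
    step = 2.0 * x_max / (2 ** num_bits - 1)
    return np.clip(np.round(x / step) * step, -x_max, x_max)

def svm_sgd_fixed_point(X, y, num_bits=8, lr=0.05, lam=0.01, epochs=20, seed=0):
    """SGD on the hinge loss, re-quantizing w after each update
    (a simple software model of fixed-point training)."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            margin = y[i] * (X[i] @ w)
            grad = lam * w - (y[i] * X[i] if margin < 1 else 0.0)
            w = quantize(w - lr * grad, num_bits)
    return w

# Toy linearly separable data: two Gaussian blobs.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.6, 0.1, (50, 2)), rng.normal(-0.6, 0.1, (50, 2))])
y = np.hstack([np.ones(50), -np.ones(50)])
w = svm_sgd_fixed_point(X, y)
acc = np.mean(np.sign(X @ w) == y)
```

A precision bound of the kind derived in the paper would predict, ahead of time, how small `num_bits` can be before `acc` degrades, without running this training loop at every candidate bit-width.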
This project proposes a simple but highly efficient architectural idea to reduce the computational cost of convolutional neural networks (CNNs). The idea, pitched by my collaborator Yingyan Lin, is to decompose the computation at each convolutional layer into MSB and LSB parts. If the MSB part of some output is negative, the overall output is highly likely to be negative itself. In that case, the residual processing is by-passed (clock-gated). My contribution to this project was an analytical validation of the technique. A paper about this project was accepted at ISCAS 2017 and can be found here.
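The gating idea can be sketched in software for a single output neuron (an illustrative model of the scheme, not the actual hardware implementation): the activations are split bit-wise into MSB and LSB halves, the MSB partial sum is evaluated first, and the LSB part is skipped whenever that estimate is negative, since a ReLU would very likely zero the full result anyway.

```python
import numpy as np

def gated_relu_dot(w, x_q, total_bits=8, msb_bits=4):
    """MSB/LSB decomposition of one pre-activation w @ x_q followed
    by ReLU. If the MSB-only partial sum is negative, the LSB part
    is by-passed (the software analogue of clock-gating)."""
    lsb_bits = total_bits - msb_bits
    msb = x_q >> lsb_bits                 # top msb_bits of each activation
    lsb = x_q & ((1 << lsb_bits) - 1)     # bottom lsb_bits
    partial = (w @ msb) * (1 << lsb_bits)  # MSB-only estimate of w @ x_q
    if partial < 0:
        return 0.0, True                  # LSB processing skipped
    return max(partial + w @ lsb, 0.0), False

rng = np.random.default_rng(0)
x = rng.integers(0, 256, size=16)         # 8-bit unsigned activations
w = rng.normal(size=16)                   # example weights
y, skipped = gated_relu_dot(w, x)
```

Note the estimate can occasionally mispredict (MSB sum negative but full sum positive); quantifying how this affects accuracy is exactly what the analytical validation addresses.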
This is the work of my former collaborator, Dr. Sai Zhang. The idea is to bring computation to the bitlines and cross-bitlines of a sensory array using mixed-signal techniques. My contributions were setting up the algorithm and validation dataset, as well as post-layout verification. Check out our arXiv paper on the topic.