Improved training of binary networks for human pose estimation and image recognition

A Bulat, G Tzimiropoulos, J Kossaifi… - arXiv preprint arXiv …, 2019 - arxiv.org
arXiv preprint arXiv:1904.05868, 2019
Big neural networks trained on large datasets have advanced the state of the art on a wide variety of challenging problems, improving performance by a large margin. However, under low-memory and limited-compute constraints, accuracy on the same problems drops considerably. In this paper, we propose a series of techniques that significantly improve the accuracy of binarized neural networks (i.e., networks where both the features and the weights are binary). We evaluate the proposed improvements on two diverse tasks: fine-grained recognition (human pose estimation) and large-scale image recognition (ImageNet classification). Specifically, we introduce a series of novel methodological changes, including (a) more appropriate activation functions, (b) reverse-order initialization, (c) progressive quantization, and (d) network stacking, and show that these additions significantly improve existing state-of-the-art network binarization techniques. Additionally, for the first time, we investigate the extent to which network binarization and knowledge distillation can be combined. When tested on the challenging MPII dataset, our method shows a performance improvement of more than 4% in absolute terms. Finally, we further validate our findings by applying the proposed techniques to large-scale object recognition on the ImageNet dataset, on which we report a 4% reduction in error rate.
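
To make concrete what "both the features and the weights are binary" can buy in practice, the following is a minimal sketch of the general idea behind binarized networks, not the method proposed in the paper. It assumes binarization with the sign function (the abstract does not specify one) and uses hypothetical helper names (`binarize`, `pack_bits`, `binary_dot`). It shows that once values are restricted to +1/-1, a dot product can be computed with an XOR and a bit count instead of multiplications.

```python
def binarize(values):
    """Map each real value to +1 or -1 (assumed sign function; 0 maps to +1)."""
    return [1 if v >= 0 else -1 for v in values]


def pack_bits(signs):
    """Pack +1/-1 values into an int: bit i is set where signs[i] == +1."""
    word = 0
    for i, s in enumerate(signs):
        if s == 1:
            word |= 1 << i
    return word


def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n.

    Positions where the bits differ contribute -1 and the rest contribute +1,
    so the result is n - 2 * popcount(a XOR b).
    """
    differing = bin(a_bits ^ b_bits).count("1")
    return n - 2 * differing


# Illustrative values only; they are not taken from the paper.
weights = [0.7, -1.2, 0.05, -0.3]
features = [-0.4, -2.0, 1.5, 0.9]

w_signs = binarize(weights)
x_signs = binarize(features)

# Reference result using ordinary multiplications on the +1/-1 values.
reference = sum(w * x for w, x in zip(w_signs, x_signs))
# Same result using only XOR and a bit count on the packed representation.
fast = binary_dot(pack_bits(w_signs), pack_bits(x_signs), len(weights))
```

Both `reference` and `fast` hold the same value here. The packed form stores each weight or feature in a single bit, which is where the memory and compute savings of binary networks come from; the accuracy lost in this rounding is the gap the techniques listed in the abstract aim to narrow.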