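Experimental Results
Image classification:
The paper runs INT8 training experiments on image classification datasets including CIFAR-10 and ImageNet. As the results in the table below show, the method achieves higher accuracy than the previous best methods on most network architectures. It is also the first to run quantized training on networks such as MobileNet and Inception, where the accuracy loss stays within 1.5%.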
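Object detection:
The paper also makes a first attempt at INT8 training on object detection datasets such as PASCAL VOC and COCO, where the accuracy loss stays within 2%.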
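The few existing papers that explore gradient quantization [4] do not report the actual speedup their algorithms achieve in real training tasks. To make the method as practical as possible, the paper implements and optimizes convolution forward and backward kernels that support INT8 training on a GeForce GTX 1080 Ti GPU. Measurements show that INT8 convolution is clearly faster than floating-point computation in both passes: the forward pass is 1.63x faster on average and the backward pass is 1.94x faster on average, as shown in the figure below: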
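The paper also runs a complete end-to-end test during actual training. INT8 training reduces the time of one ResNet50 training iteration from 0.360 seconds to 0.293 seconds, which speeds up the overall training process by 22%.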
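The end-to-end figure can be checked with a few lines of arithmetic. The sketch below uses only the two timings quoted above; the variable names are illustrative. It assumes the reported 22% refers to the speedup ratio (old time divided by new time), since the relative reduction in wall-clock time per iteration works out to a smaller number.

```python
# Timings reported in the text for one ResNet50 training iteration (seconds).
fp32_time = 0.360
int8_time = 0.293

# Speedup ratio: how many times faster the INT8 iteration runs.
speedup = fp32_time / int8_time

# Relative reduction in wall-clock time per iteration.
time_saved = (fp32_time - int8_time) / fp32_time

print(f"speedup: {speedup:.2f}x")                      # speedup: 1.23x
print(f"time saved per iteration: {time_saved:.1%}")   # time saved per iteration: 18.6%
```

Under this reading, the iteration is roughly 1.23x faster, which matches the reported figure of about 22%, while the time spent per iteration drops by about 18.6%.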
References
[1] Ruihao Gong, Xianglong Liu, Shenghu Jiang, Tianxiang Li, Peng Hu, Jiazhen Lin, Fengwei Yu, and Junjie Yan. Differentiable soft quantization: Bridging full-precision and low-bit neural networks. In ICCV, October 2019.
[2] Rundong Li, Yan Wang, Feng Liang, Hongwei Qin, Junjie Yan, and Rui Fan. Fully quantized network for object detection. In The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019.
[3] Benoit Jacob, Skirmantas Kligys, Bo Chen, Menglong Zhu, Matthew Tang, Andrew Howard, Hartwig Adam, and Dmitry Kalenichenko. Quantization and training of neural networks for efficient integer-arithmetic-only inference. In 2018 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2018.
[4] Yukuan Yang, Shuang Wu, Lei Deng, Tianyi Yan, Yuan Xie, and Guoqi Li. Training high-performance and large-scale deep neural networks with full 8-bit integers, 2019.