
Avtomat. i Telemekh., 2022, Issue 10, Pages 35–46 (Mi at16049)

Distilling face recognition models trained using margin-based softmax function
D. V. Svitov, S. A. Alyamkin

References

1. Chen S., Liu Y., Gao X., Han Z., “Mobilefacenets: Efficient cnns for accurate real-time face verification on mobile devices”, Chinese Conference on Biometric Recognition, Springer, Cham, 2018, 428–438
2. Sandler M., Howard A., Zhu M., Zhmoginov A., Chen L. C., “Mobilenetv2: Inverted residuals and linear bottlenecks”, Proceedings of the IEEE conference on computer vision and pattern recognition, 2018, 4510–4520
3. Deng J., Guo J., Xue N., Zafeiriou S., “Arcface: Additive angular margin loss for deep face recognition”, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, 4690–4699
4. Liu W., Wen Y., Yu Z., Li M., Raj B., Song L., “Sphereface: Deep hypersphere embedding for face recognition”, Proceedings of the IEEE conference on computer vision and pattern recognition, 2017, 212–220
5. Wang H., Wang Y., Zhou Z., Ji X., Gong D., Zhou J., Li Z., Liu W., “Cosface: Large margin cosine loss for deep face recognition”, Proceedings of the IEEE conference on computer vision and pattern recognition, 2018, 5265–5274
6. Huang G. B., Mattar M., Berg T., Learned-Miller E., “Labeled faces in the wild: A database for studying face recognition in unconstrained environments”, Workshop on faces in 'Real-Life' Images: detection, alignment, and recognition, 2008
7. Moschoglou S., Papaioannou A., Sagonas C., Deng J., Kotsia I., Zafeiriou S., “Agedb: the first manually collected, in-the-wild age database”, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2017, 51–59
8. Kemelmacher-Shlizerman I., Seitz S. M., Miller D., Brossard E., “The megaface benchmark: 1 million faces for recognition at scale”, Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, 4873–4882
9. Hinton G., Vinyals O., Dean J., Distilling the knowledge in a neural network, 2015, arXiv: 1503.02531
10. Fukuda T., Suzuki M., Kurata G., Thomas S., Cui J., Ramabhadran B., “Efficient Knowledge Distillation from an Ensemble of Teachers”, Interspeech, 2017, 3697–3701
11. Sau B. B., Balasubramanian V. N., Deep model compression: Distilling knowledge from noisy teachers, 2016, arXiv: 1610.09650
12. Furlanello T., Lipton Z., Tschannen M., Itti L., Anandkumar A., “Born again neural networks”, International Conference on Machine Learning, PMLR, 2018, 1607–1616
13. Huang Z., Wang N., Like what you like: Knowledge distill via neuron selectivity transfer, 2017, arXiv: 1707.01219
14. Romero A., Ballas N., Kahou S. E., Chassang A., Gatta C., Bengio Y., Fitnets: Hints for thin deep nets, 2014, arXiv: 1412.6550
15. Chen H., Wang Y., Xu C., Xu C., Tao D., “Learning student networks via feature embedding”, IEEE Transactions on Neural Networks and Learning Systems, 32:1 (2020), 25–35
16. Park W., Kim D., Lu Y., Cho M., “Relational knowledge distillation”, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019, 3967–3976
17. Feng Y., Wang H., Hu H. R., Yu L., Wang W., Wang S., “Triplet distillation for deep face recognition”, 2020 IEEE International Conference on Image Processing (ICIP), IEEE, 2020, 808–812
18. Duong C. N., Luu K., Quach K. G., Le N., Shrinkteanet: Million-scale lightweight face recognition via shrinking teacher-student networks, 2019, arXiv: 1905.10620
19. Nekhaev D., Milyaev S., Laptev I., “Margin based knowledge distillation for mobile face recognition”, Twelfth International Conference on Machine Vision ICMV 2019, Proc. SPIE, 11433, International Society for Optics and Photonics, 2020, 1143300
20. He K., Zhang X., Ren S., Sun J., “Deep residual learning for image recognition”, Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, 770–778
21. Zhang K., Zhang Z., Li Z., Qiao Y., “Joint face detection and alignment using multitask cascaded convolutional networks”, IEEE Signal Processing Letters, 23:10 (2016), 1499–1503
22. Guo Y., Zhang L., Hu Y., He X., Gao J., “Ms-celeb-1m: A dataset and benchmark for large-scale face recognition”, European conference on computer vision, Springer, Cham, 2016, 87–102
23. Ng H. W., Winkler S., “A data-driven approach to cleaning large face datasets”, IEEE international conference on image processing, ICIP, IEEE, 2014, 343–347
24. Robbins H., Monro S., “A stochastic approximation method”, The annals of mathematical statistics, 1951, 400–407
25. Grabovoy A. V., Strijov V. V., “Bayesian Distillation of Deep Learning Models”, Autom. Remote Control, 82:11 (2021), 1846–1856
26. Grabovoy A. V., Strijov V. V., “Probabilistic Interpretation of the Distillation Problem”, Autom. Remote Control, 83:1 (2022), 123–137
27. MarginDistillation: distillation for margin-based softmax, https://github.com/david-svitov/margindistillation (accessed: 08.01.2022)

