Computer Optics, 2023, Volume 47, Issue 4, Pages 627–636 (Mi co1164)

A joint study of deep learning-based methods for identity document image binarization and its influence on attribute recognition
R. Sánchez-Rivero, P. V. Bezmaternykh, A. V. Gayer, A. Morales-González, F. J. Silva-Mata, K. B. Bulatov

References

1. Doermann D, Tombre K, Handbook of document image processing and recognition, Springer Publishing Company Inc, 2014
2. Arlazarov VV, Andreeva EI, Bulatov KB, Nikolaev DP, Petrova OO, Savelev BI, Slavin OA, “Document image analysis and recognition: a survey”, Computer Optics, 46:4 (2022), 567–589
3. Bulatov KB, Bezmaternykh PV, Nikolaev DP, Arlazarov VV, “Towards a unified framework for identity documents analysis and recognition”, Computer Optics, 46:3 (2022), 436–454
4. Arlazarov VL, Arlazarov VV, Bulatov KB, Chernov TS, Nikolaev DP, Polevoy DV, Sheshkus AV, Skoryukina NS, Slavin OA, Usilin SA, “Mobile ID document recognition–coarse-to-fine approach”, Pattern Recognit Image Anal, 32:1 (2022), 89–108
5. Arlazarov VV, Bulatov K, Chernov T, Arlazarov VL, “MIDV-500: a dataset for identity document analysis and recognition on mobile devices in video stream”, Computer Optics, 43:5 (2019), 818–824
6. Bulatov K, Emelianova E, Tropin D, et al., MIDV-2020: A comprehensive benchmark dataset for identity document analysis, 2021, arXiv: 2107.00396
7. Sánchez-Rivero R, Bezmaternykh P, Morales-González A, Silva-Mata FJ, Bulatov K, “Assessing the relationship between binarization and OCR in the context of deep learning-based ID document analysis”, Progress in artificial intelligence and pattern recognition, eds. Heredia YH, Núñez VM, Shulcloper JR, Springer International Publishing, Cham, 2021, 134–144
8. Lins RD, Almeida MMD, Bernardino RB, Jesus D, Oliveira JM, “Assessing binarization techniques for document images”, DocEng 2017: Proc 2017 ACM Symposium on Document Engineering, 2017, 183–192
9. Mustafa WA, Kader MMMA, “Binarization of document images: A comprehensive review”, J Phys: Conf Ser, 1019 (2018), 012023
10. Tensmeyer C, Martinez T, “Historical document image binarization: A review”, SN Comput Sci, 1:3 (2020), 173
11. Pratikakis I, Zagoris K, Barlas G, Gatos B, “ICFHR2016 handwritten document image binarization contest (H-DIBCO 2016)”, 2016 15th Int Conf on Frontiers in Handwriting Recognition (ICFHR), 2016, 619–623
12. Pratikakis I, Zagoris K, Karagiannis X, Tsochatzidis L, Mondal T, Marthot-Santaniello I, “Document image binarization (DIBCO 2019)”, 2019 Int Conf on Document Analysis and Recognition (ICDAR), 2019, 1547–1556
13. Smith EHB, “An analysis of binarization ground truthing”, Proc 8th IAPR Int Workshop on Document Analysis Systems (DAS ’10), 2010, 27–34
14. Ntirogiannis K, Gatos B, Pratikakis I, “Performance evaluation methodology for historical document image binarization”, IEEE Trans Image Process, 22:2 (2013), 595–609
15. Rani U, Kaur A, Josan G, “A new binarization method for degraded document images”, Int J Inf Technol, 15:1 (2019), 1035–1053
16. Milyaev S, Barinova O, Novikova T, Kohli P, Lempitsky V, “Image binarization for end-to-end text understanding in natural images”, 2013 12th Int Conf on Document Analysis and Recognition, 2013, 128–132
17. Chou C-H, Lin W-H, Chang F, “A binarization method with learning-built rules for document images produced by cameras”, Pattern Recogn, 43:4 (2010), 1518–1530
18. Wen J, Li S, Sun J, “A new binarization method for non-uniform illuminated document images”, Pattern Recogn, 46:6 (2013), 1670–1690
19. Tafti AP, Baghaie A, Assefi M, Arabnia HR, Yu Z, Peissig P, “OCR as a service: An experimental evaluation of Google Docs OCR, Tesseract, ABBYY FineReader, and Transym”, Advances in visual computing, eds. Bebis G, Boyle R, Parvin B, Koracin D, Porikli F, Skaff S, Entezari A, Min J, Iwai D, Sadagic A, Scheidegger C, Isenberg T, Springer International Publishing AG, Cham, Switzerland, 2016, 735–746
20. Li Z, Yang C, Shen Q, Wen S, “A document image dataset for quality assessment”, J Phys: Conf Ser, 1828:1 (2021), 012033
21. Ye P, Doermann D, “Document image quality assessment: A brief survey”, 2013 12th Int Conf on Document Analysis and Recognition, 2013, 723–727
22. Polevoy DV, Bulatov KB, Skoryukina NS, Chernov TS, Arlazarov VV, Sheshkus AV, “Key aspects of document recognition using small digital cameras”, RFBR J, 4:92 (2016), 97–108
23. Chernov T, Ilyuhin S, Arlazarov VV, “Application of dynamic saliency maps to the video stream recognition systems with image quality assessment”, Proc SPIE, 11041 (2019), 110410T
24. Shemiakina J, Limonova E, Skoryukina N, Arlazarov VV, Nikolaev DP, “A method of image quality assessment for text recognition on camera-captured and projectively distorted documents”, Mathematics, 9:17 (2021), 2155
25. Bezmaternykh PV, Ilin DA, Nikolaev DP, “U-Net-bin: hacking the document image binarization contest”, Computer Optics, 43:5 (2019), 825–832
26. Calvo-Zaragoza J, Gallego AJ, “A selectional auto-encoder approach for document image binarization”, Pattern Recogn, 86 (2019), 37–47
27. Masyagin M, Robust document image binarization tool, 2021, https://github.com/masyagin1998/robin
28. Otsu N, “A threshold selection method from gray-level histograms”, IEEE Trans Syst Man Cybern, 9:1 (1979), 62–66
29. Lins RD, Simske SJ, Bernardino RB, “DocEng'2020 time-quality competition on binarizing photographed documents”, Proc ACM Symposium on Document Engineering, 2020, 2
30. Yu D, Li X, Zhang C, Liu T, Han J, Liu J, Ding E, “Towards accurate scene text recognition with semantic reasoning networks”, Computer Vision and Pattern Recognition (CVPR), 2020, 12113–12122
31. Du Y, Li C, Guo R, Cui C, Liu W, Zhou J, Lu B, Yang Y, Liu Q, PP-OCRv2: Bag of tricks for ultra lightweight OCR system, 2021, arXiv: 2109.03144
32. Lee J, Park S, Baek J, Oh SJ, Kim S, Lee H, “On recognizing texts of arbitrary shapes with 2D self-attention”, Proc IEEE/CVF Conf on Computer Vision and Pattern Recognition Workshops, 2020, 546–547
33. Baek J, Kim G, Lee J, Park S, Han D, Yun S, Oh SJ, Lee H, “What is wrong with scene text recognition model comparisons? Dataset and model analysis”, 2019 IEEE/CVF Int Conf on Computer Vision (ICCV), 2019, 4714–4722
34. Cai H, Sun J, Xiong Y, Revisiting classification perspective on scene text recognition, 2021, arXiv: 2102.10884
35. Smith R, “An overview of the Tesseract OCR engine”, IEEE Int Conf on Document Analysis and Recognition (ICDAR’07), 2 (2007), 629–633
36. Michalak H, Okarma K, “Robust combined binarization method of non-uniformly illuminated document images for alphanumerical character recognition”, Sensors, 20:10 (2020), 2914
37. Yujian L, Bo L, “A normalized Levenshtein distance metric”, IEEE Trans Pattern Anal Mach Intell, 29:6 (2007), 1091–1095
38. Schulz D, Maureira J, Tapia J, Busch C, “Identity documents image quality assessment”, 2022 30th European Signal Processing Conf (EUSIPCO), 2022, 1017–1021

