
Computer Optics (Компьютерная оптика), 2021, Volume 45, Issue 1, Pages 66–76 (Mi co883)

This publication is cited in 11 articles

INTERNATIONAL CONFERENCE ON MACHINE VISION

A generalization of Otsu method for linear separation of two unbalanced classes in document image binarization

E. I. Ershov (a), S. A. Korchagin (a), V. V. Kokhan (a, b), P. V. Bezmaternykh (c, b)

a Institute for Information Transmission Problems, RAS, 127051, Moscow, Bolshoy Karetny per., 19, str. 1
b Smart Engines Service LLC, Moscow, Russia, 117312, pr. 60-lettya Oktyabrya, 9
c Federal Research Center "Computer Science and Control" of Russian Academy of Sciences, Moscow, Russia, 117312, pr. 60-lettya Oktyabrya, 9

Abstract: The classical Otsu method is a common tool in document image binarization. The two classes, text and background, are often imbalanced, which violates the assumption underlying the classical Otsu method. In this work, we consider imbalanced background and text pixel classes whose weights differ but whose variances are equal. We demonstrate experimentally that a criterion accounting for the imbalance of the class weights attains higher binarization accuracy. We describe a generalization of the criterion to a two-parametric model, for which we propose an algorithm that searches for the optimal linear separation via fast linear clustering. We also demonstrate that the two-parametric model with the proposed separation increases binarization accuracy for documents with a complex background or spots.
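To make the contrast in the abstract concrete, the sketch below compares classical Otsu thresholding with a weight-aware criterion on a grayscale histogram. It is illustrative only and is not taken from the paper: the function names (`threshold_scores`, `otsu_threshold`, `weighted_threshold`) are hypothetical, and the weight-aware score, ln(sigma_w^2) - 2(w0 ln w0 + w1 ln w1), is the standard maximum-likelihood criterion for two Gaussian classes with a shared variance and unequal priors. Whether it matches the exact criterion used by the authors is an assumption.

```python
import math


def threshold_scores(hist):
    """Yield (t, w0, w1, within_var) for each split: levels < t vs levels >= t."""
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    sq_all = sum(i * i * h for i, h in enumerate(hist))
    n0 = s0 = q0 = 0  # running count, sum and sum of squares of class 0
    for t in range(1, len(hist)):
        h = hist[t - 1]
        n0 += h
        s0 += (t - 1) * h
        q0 += (t - 1) ** 2 * h
        n1 = total - n0
        if n0 == 0 or n1 == 0:
            continue  # one class is empty, so the split is not meaningful
        # Sum of squared deviations from the class mean, for each class.
        ss0 = q0 - s0 * s0 / n0
        ss1 = (sq_all - q0) - (sum_all - s0) ** 2 / n1
        yield t, n0 / total, n1 / total, (ss0 + ss1) / total


def otsu_threshold(hist):
    """Classical Otsu: minimise the pooled within-class variance."""
    return min(threshold_scores(hist), key=lambda r: r[3])[0]


def weighted_threshold(hist, eps=1e-12):
    """Assumed equal-variance, unequal-weight criterion (see the lead-in)."""
    def score(r):
        _, w0, w1, var = r
        entropy_term = w0 * math.log(w0) + w1 * math.log(w1)
        # eps guards against log(0) when a split separates the classes exactly.
        return math.log(max(var, 0.0) + eps) - 2.0 * entropy_term
    return min(threshold_scores(hist), key=score)[0]


if __name__ == "__main__":
    # Toy histogram: a small dark "text" class and a large bright "background".
    hist = [0] * 256
    for level, count in ((45, 20), (50, 60), (55, 20), (190, 900), (200, 3000), (210, 900)):
        hist[level] = count
    print("Otsu threshold:    ", otsu_threshold(hist))
    print("Weighted threshold:", weighted_threshold(hist))
```

Both functions scan the same candidate thresholds and differ only in the score being minimised, so any difference between the returned thresholds on a given histogram comes from the class-weight term alone.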

Keywords: threshold binarization, Otsu method, optimal linear classification, historical document image binarization.

Received: 14.05.2020
Accepted: 26.11.2020

Publication language: English

DOI: 10.18287/2412-6179-CO-752


