Abstract:
This paper presents a new approach to modelling the structure of document images for classification tasks. Each document image is treated as a realization of a stochastic point process, and estimates of the properties of that process are used to describe the document structure. The main objective is to determine the type of a new document using a nonparametric classification method. To this end, a method for classifying functional properties of point processes, based on the concept of statistical depth, is proposed. Practical issues of experimentation are considered. Experiments on real data show the effectiveness of the proposed approach.
Keywords: documents with flexible structure, classification, spatial point process, reproducible point patterns, depth, $DD$-plot, $\alpha$-procedure.