
Computer Optics, 2022 Volume 46, Issue 4, Pages 590–595 (Mi co1049)

IMAGE PROCESSING, PATTERN RECOGNITION

Investigation of the applicability of natural language processing methods to problems of searching and matching of machinery drawing images

K. N. Figura

Bratsk State University

Abstract: This work shows that local feature descriptors, applied in their pure form, are ineffective for searching and matching drawings. The main cause is the large number of identical elements that drawings share (frames, title blocks, extension lines, font elements, etc.). It is proposed to address this problem with the tf-idf (term frequency-inverse document frequency) method, which is widely used in natural language processing. In place of the word vectors used in the original tf-idf technique, the study uses descriptors of image feature points computed with the ORB and BRISK algorithms. The study leads to the following conclusions: 1) The proposed approach is highly effective at finding a copy of a query image in the database: every query image that had an exact copy in the database was matched to it. 2) The identification rate for modified query images depends on the algorithm used to find keypoints and descriptors: the maximum share of identified modified analogs, out of all image analogs in the database, is 60% with ORB and 80% with BRISK. 3) The proposed approach has limited effectiveness in finding images that belong to the same class as the query image (for example, drawings of an excavator, a bulldozer, or a truck crane); here the proportion of false identifications reached as much as 60%.
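The weighting idea described in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: it assumes that each ORB or BRISK descriptor has already been quantized into an integer "visual word" ID (the abstract does not say how descriptors are mapped to terms), and the drawing names, word IDs, and the helper functions `tfidf_vectors` and `cosine` are hypothetical. Word 0 stands for a feature, such as a frame or title block corner, that occurs in every drawing.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Map each image name to a sparse {visual_word: tf-idf weight} dict.

    docs: dict of image name -> list of visual-word IDs (assumed to come
    from quantized ORB/BRISK descriptors; the quantization is not shown).
    """
    n_docs = len(docs)
    # Document frequency: in how many images each visual word occurs.
    df = Counter()
    for words in docs.values():
        df.update(set(words))
    vectors = {}
    for name, words in docs.items():
        counts = Counter(words)
        total = len(words)
        # tf = relative frequency in the image, idf = log(N / df).
        vectors[name] = {
            w: (c / total) * math.log(n_docs / df[w])
            for w, c in counts.items()
        }
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy database: word 0 is shared by all drawings.
drawings = {
    "excavator": [0, 0, 0, 1, 2],
    "bulldozer": [0, 0, 0, 3, 4],
    "truck_crane": [0, 0, 0, 5, 6],
    "excavator_copy": [0, 0, 0, 1, 2],
}

# Similarity on raw counts is dominated by the shared word 0.
raw = {name: dict(Counter(words)) for name, words in drawings.items()}
raw_sim = cosine(raw["excavator"], raw["bulldozer"])

# With tf-idf, word 0 gets idf = log(4/4) = 0 and stops contributing.
weighted = tfidf_vectors(drawings)
copy_sim = cosine(weighted["excavator"], weighted["excavator_copy"])
other_sim = cosine(weighted["excavator"], weighted["bulldozer"])
```

In this toy setup, the two different drawings look similar on raw counts only because of the shared element, while tf-idf weighting removes that contribution and leaves the exact copy as the sole strong match. Real descriptor sets are noisier, so the separation would be less clean than in this sketch.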

Keywords: natural language processing, tf-idf method, image retrieval, image analysis, pattern recognition, digital image processing

Received: 24.08.2021
Accepted: 31.10.2021

DOI: 10.18287/2412-6179-CO-1030


