Abstract:
This paper addresses the problem of reducing the computational load of road scene text recognition by making a stopping decision that cuts off further recognition. Its contribution is the construction of stopping rules for real-time text recognition systems that combine recognition results, together with an experimental evaluation on the open dataset RoadText-1k. We found that for fast-working systems the ROVER (Recognizer Output Voting Error Reduction) combination method performs best under the Levenshtein metric and majority voting performs best under the direct match metric; as per-frame processing time increases, however, ROVER becomes consistently better. Furthermore, while selecting the single most focused frame is the worst strategy for fast-working systems, its comparative rank improves as processing time increases. Moreover, choosing the single most focused frame and combining the three most focused frames are preferable for fast-working systems when the load on the device must be reduced.
Keywords: combination method, reducing computational load, real-time recognition, road scene analysis, text recognition, video stream recognition.