Abstract:
We consider the problem of recognizing complex time expressions in Russian news texts, with application to automatic information extraction. We describe an algorithm for finding noun phrases that contain time expressions. The algorithm has two parts: pre-segmentation of the text, followed by selection of noun phrase boundaries inside the segments via machine learning (a conditional random field (CRF) model). We report the results of experiments.
(In Russian).
Key words and phrases: information extraction, named entity recognition, noun phrase chunking, time expressions, CRF.