
Program Systems: Theory and Applications, 2016 Volume 7, Issue 1, Pages 153–170 (Mi ps209)

This article is cited in 2 papers

Artificial Intelligence, Intelligent Systems, Neural Networks

To the noun phrase recognition problem in application to automatic information extraction from Russian texts

N. A. Vlasova, A. V. Podobryaev

Ailamazyan Program Systems Institute of Russian Academy of Sciences

Abstract: We consider the problem of recognizing complex noun phrases in Russian news texts, with application to automatic information extraction. By complex noun phrases we mean long noun phrases that contain genitive and/or prepositional constructions and named entities. We describe a noun phrase recognition scheme that begins by selecting the sentence fragments that undoubtedly contain noun phrases, and we develop an algorithm for selecting these fragments. The fragments are classified by the frequency of their types, the number of words in the fragment, part-of-speech structure, and the presence of extracted named entities, certain complex prepositions, and set expressions. We introduce a feature system for automatic noun phrase recognition inside the selected fragments. In our experiments, 58032 fragments were selected from a collection of 1000 Russian news documents. We also consider some complex cases. (In Russian).
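
The abstract does not spell out the fragment selection algorithm, so the following is only a minimal illustrative sketch, not the authors' method. It assumes a sentence is already tokenized and tagged with a hypothetical coarse tag set (`NOUN`, `ADJ`, `PREP`, `NE` for an extracted named entity, `VERB`, and so on), takes maximal runs of nominal-type tokens as candidate fragments, and then describes each fragment by the properties the abstract lists: number of words, part-of-speech structure, presence of a named entity, and frequency of the fragment type. All function names and the tag set are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical coarse tags allowed inside a candidate fragment (an assumption,
# not the tag set used in the paper).
NOMINAL_TAGS = {"NOUN", "ADJ", "PREP", "NE"}
HEAD_TAGS = {"NOUN", "NE"}


def select_fragments(tagged_sentence):
    """Return maximal runs of nominal-type tokens that contain a noun or named entity."""
    fragments, current = [], []
    for word, tag in tagged_sentence:
        if tag in NOMINAL_TAGS:
            current.append((word, tag))
        else:
            if current:
                fragments.append(current)
            current = []
    if current:
        fragments.append(current)
    # Discard runs with no possible head, e.g. a lone preposition.
    return [f for f in fragments if any(tag in HEAD_TAGS for _, tag in f)]


def describe(fragment):
    """Summarize a fragment by length, POS structure, and named-entity presence."""
    tags = tuple(tag for _, tag in fragment)
    return {
        "length": len(fragment),
        "pos_structure": tags,
        "has_named_entity": "NE" in tags,
    }


def type_frequencies(fragments):
    """Count how often each POS structure (fragment type) occurs."""
    return Counter(tuple(tag for _, tag in f) for f in fragments)


# Toy example: "Президент компании Газпром посетил выставку в Москве"
sentence = [
    ("Президент", "NOUN"), ("компании", "NOUN"), ("Газпром", "NE"),
    ("посетил", "VERB"),
    ("выставку", "NOUN"), ("в", "PREP"), ("Москве", "NE"),
]
frags = select_fragments(sentence)
for frag in frags:
    print(" ".join(word for word, _ in frag), describe(frag))
print(type_frequencies(frags))
```

In this toy sentence the verb splits the text into two candidate fragments, one with a genitive chain ending in a named entity and one with a prepositional construction. A real system would need morphological analysis and disambiguation for Russian, which this sketch deliberately leaves out.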

Key words and phrases: information extraction, named entity recognition, noun phrase chunking.

UDC: 004.89:004.912

Received: 02.02.2016
Accepted: 15.03.2016


