
Uchenye Zapiski Kazanskogo Universiteta. Seriya Fiziko-Matematicheskie Nauki, 2013, Volume 155, Book 4, Pages 99–108 (Mi uzku1245)

Semi-automatic generation of linear event extraction patterns for free texts

D. Dzendzik (a, b), S. Serebryakov (b)

a Saint-Petersburg State University, Saint Petersburg, Russia
b Hewlett-Packard Laboratories, Saint Petersburg, Russia

Abstract: In this paper we describe a semi-automatic approach to generating event extraction patterns for free texts. The algorithm consists of four steps: we automatically extract candidate events from a corpus of free-text documents, cluster them using dependency-based parse tree paths, validate random samples from each cluster, and generate linear patterns from the positive event clusters. We compare the approach with a system that uses handcrafted patterns.
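The four steps named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the candidate tuples, the dependency path strings, the `is_event` callback (which stands in for the human validator), and the regular-expression form of a "linear pattern" are all assumptions made for illustration. Step 1 (automatic extraction with a parser) is replaced here by a hard-coded list.

```python
import random
import re
from collections import defaultdict

# Step 1 (assumed output): candidate events as
# (trigger, argument, dependency path, sentence). Hard-coded toy data;
# a real system would obtain these from a dependency parser.
CANDIDATES = [
    ("acquired", "Acme", "nsubj<-acquired->dobj", "Globex acquired Acme"),
    ("acquired", "Initech", "nsubj<-acquired->dobj", "Hooli acquired Initech"),
    ("said", "today", "nsubj<-said->tmod", "Globex said today"),
]

def cluster_by_path(candidates):
    """Step 2: group candidates that share the same dependency path."""
    clusters = defaultdict(list)
    for cand in candidates:
        clusters[cand[2]].append(cand)
    return dict(clusters)

def validate(clusters, is_event, sample_size=2, seed=0):
    """Step 3: keep a cluster if every randomly sampled member is accepted.

    `is_event` is a hypothetical stand-in for the manual judgment.
    """
    rng = random.Random(seed)
    positive = {}
    for path, members in clusters.items():
        sample = rng.sample(members, min(sample_size, len(members)))
        if all(is_event(m) for m in sample):
            positive[path] = members
    return positive

def linear_pattern(members):
    """Step 4: build one linear (regex) pattern from a positive cluster."""
    triggers = sorted({re.escape(m[0]) for m in members})
    return re.compile(
        r"(?P<agent>\w+) (?:%s) (?P<target>\w+)" % "|".join(triggers)
    )

clusters = cluster_by_path(CANDIDATES)
positive = validate(clusters, is_event=lambda m: m[0] == "acquired")
patterns = {path: linear_pattern(ms) for path, ms in positive.items()}
match = patterns["nsubj<-acquired->dobj"].search("Initrode acquired Vandelay")
```

In this sketch only the cluster whose sampled members pass validation yields a pattern, so the manual effort is spent on a few samples per cluster rather than on every candidate.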

Keywords: event extraction, linear patterns, regular expressions, TextMARKER, RUTA.

UDC: 004.912

Received: 31.07.2013

Language: English


