Abstract:
Many idiomatic expressions can be used both literally and
non-literally. Recognizing which usage occurs is an important problem in many
natural language processing applications, in particular machine translation. We
propose an automatic method for recognizing idiom usage, based on analysis of
the local contexts of such expressions. We apply recurrent neural networks to solve
this problem. Two types of network are investigated: simple and
bidirectional recurrent networks. We compare two representations of
context words: canonical forms (lemmas) and source word forms. We
describe the construction and parameters of the distributional model that stores the
vector representations of single words and target idiomatic expressions. Due to
the great diversity of approaches to the idiom usage recognition problem,
we also provide an extended survey of the main work in this area.
Key words and phrases: idiomatic expressions, neural networks, recurrent neural networks, vector representations of words and expressions, named entity recognition.