Abstract:
The growing amount of information on the Internet and the rapid development of social networks make text processing an increasingly pressing task. In this paper we propose an algorithm for comparing sentences and introduce several measures of closeness (similarity) between sentences. Estimating the relevance of documents should be based on the context of a search query and should not be limited to keywords, their similarity, or their frequency. The proposed measures therefore take into account lexical, syntactic, and semantic relations between words. One of the problems we are currently working on is the development of a parser, similar to Link Grammar Parser, for the Turkic languages most frequently used on the Internet, such as Kazakh, Uzbek (in both Cyrillic and Latin scripts), and Turkish. We plan to use the results of our research in various information retrieval systems.
Keywords: natural language processing, syntactic analysis, Link Grammar Parser, relevance, Turkic languages.