Abstract:
In our previous research, we found that the grammatical ambiguity of the most frequent words in European languages is distributed differently from that of less frequent words. In the current study, we investigate the reasons for this phenomenon in more detail, paying special attention to the thousand most frequent tokens. Our examination of modern disambiguation systems shows that the increase in language diversity that we found for the most frequent words leads to an increase in the number of errors made by those systems.
Keywords: grammatical ambiguity, quantitative analysis, distribution, statistics, the Russian language.