Abstract:
The Russian-French parallel corpus, a part of the Russian National Corpus (RNC), is being transformed into a multivariant corpus in which several translations correspond to each original text. Concurrently, a database of functionally equivalent lexicogrammatical verbal forms is being built from the multivariant corpus. The main purpose of the database is to compute statistical estimates of the equivalences between Russian and French verbal forms. The paper discusses an information technology for creating the Russian-French multivariant parallel corpus and the database simultaneously.
Keywords: parallel multivariant corpora; Russian National Corpus; information technologies; XML marking up Russian–French parallel texts; lexicogrammatical form; functional equivalence; statistical estimates of equivalences.