Abstract:
The paper presents a technology for preventing duplicate bibliographic descriptions in the scientific database of the Bibliographic Information-Analytical System (BIAS) of IPI RAS. The causes of duplication are analyzed. The developed software consists of modules for determining similarity, which use fuzzy search based on the Oliver algorithm, and modules for visualizing the results, built into the system at the level where the database content is formed. The visualization modules give moderators of BIAS IPI RAS full information about the conflicts, so that they can decide on further action using additional information. The concept of a similarity index, used by the modules for determining similarity, is introduced. The paper also describes the formal data model underlying the database, which is built on the principles of faceted navigation. Applying the developed software made it possible to detect and remove duplicate bibliographic descriptions in the scientific database.
Keywords: similarity index; software modules for determining similarity; fuzzy search method based on the Oliver algorithm; faceted navigation.
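
To make the notion of a similarity index more concrete, the following is a minimal sketch of fuzzy comparison in the style of the Oliver algorithm: find the longest common substring of two strings, then repeat the search on the parts to its left and to its right, summing the matched lengths. The function names and the normalization 2 * matched / (len(a) + len(b)) are assumptions made for illustration; the abstract does not state how BIAS IPI RAS defines its index or which thresholds its moderators use.

```python
def longest_common_substring(a, b):
    """Return (start_in_a, start_in_b, length) of the first longest common substring."""
    best_i = best_j = best_len = 0
    # prev[j] holds the length of the common run ending at a[i-2], b[j-1]
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len = cur[j]
                    best_i = i - best_len
                    best_j = j - best_len
        prev = cur
    return best_i, best_j, best_len


def oliver_similarity(a, b):
    """Count matching characters: longest common substring, then recurse left and right."""
    if not a or not b:
        return 0
    i, j, n = longest_common_substring(a, b)
    if n == 0:
        return 0
    left = oliver_similarity(a[:i], b[:j])
    right = oliver_similarity(a[i + n:], b[j + n:])
    return n + left + right


def similarity_index(a, b):
    """Hypothetical index in [0, 1]; the normalization is an assumption, not the paper's."""
    total = len(a) + len(b)
    if total == 0:
        return 1.0  # assumption: two empty descriptions are treated as identical
    return 2.0 * oliver_similarity(a, b) / total


if __name__ == "__main__":
    # "Wor" matches first (3 characters), then "d" in the right-hand remainders (1).
    print(oliver_similarity("World", "Word"))  # 4
```

Under this sketch, a pair of bibliographic descriptions whose index exceeds some chosen threshold would be flagged as a potential conflict for a moderator to review; the threshold itself would have to be taken from the paper.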