Abstract:
When heterogeneous information systems are integrated or unified, records held in different systems that describe the same real-world entity must be identified. Deterministic algorithms cannot solve this problem effectively. This paper describes a machine-learning approach that uses decision trees to derive entity identification rules.
Keywords: entity identification, entity resolution, matching, machine learning, decision tree, information systems integration.
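
As a rough illustration of the idea summarized above, the following minimal sketch learns a small decision tree over field-similarity scores of labeled record pairs and prints it as match rules. It is not the paper's method: the field names, the sample records, the use of `difflib` as the string comparator, the Gini-based split criterion, and all function names are assumptions made only for this example.

```python
from difflib import SequenceMatcher

FIELDS = ("name", "city")  # hypothetical attributes shared by both systems


def similarity(a, b):
    """String similarity in [0, 1]; difflib is a stand-in for any comparator."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def features(rec_a, rec_b, fields=FIELDS):
    """Turn a pair of records into one similarity score per field."""
    return [similarity(rec_a[f], rec_b[f]) for f in fields]


def gini(labels):
    """Gini impurity of a list of binary labels (1 = match, 0 = no match)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(X, y):
    """Return (impurity, feature index, threshold) of the best split, or None."""
    best = None
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [lab for row, lab in zip(X, y) if row[j] <= t]
            right = [lab for row, lab in zip(X, y) if row[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, j, t)
    return best


def build_tree(X, y, depth=0, max_depth=2):
    """Leaves are 0/1 labels; inner nodes are (feature, threshold, low, high)."""
    majority = int(2 * sum(y) >= len(y))
    if depth == max_depth or len(set(y)) == 1:
        return majority
    split = best_split(X, y)
    if split is None:  # no feature has two distinct values
        return majority
    _, j, t = split
    low = [(r, lab) for r, lab in zip(X, y) if r[j] <= t]
    high = [(r, lab) for r, lab in zip(X, y) if r[j] > t]
    return (j, t,
            build_tree([r for r, _ in low], [lab for _, lab in low], depth + 1, max_depth),
            build_tree([r for r, _ in high], [lab for _, lab in high], depth + 1, max_depth))


def predict(tree, row):
    """Walk the tree until a leaf label is reached."""
    while isinstance(tree, tuple):
        j, t, low, high = tree
        tree = low if row[j] <= t else high
    return tree


def to_rules(tree, fields=FIELDS, conditions=()):
    """Flatten the tree into one human-readable rule per leaf."""
    if not isinstance(tree, tuple):
        label = "MATCH" if tree else "NO MATCH"
        return [" AND ".join(conditions or ("always",)) + " => " + label]
    j, t, low, high = tree
    return (to_rules(low, fields, conditions + (f"sim({fields[j]}) <= {t:.2f}",))
            + to_rules(high, fields, conditions + (f"sim({fields[j]}) > {t:.2f}",)))


# Hypothetical labeled record pairs from two systems (1 = same entity).
TRAINING = [
    ({"name": "John Smith", "city": "Boston"}, {"name": "Jon Smith", "city": "Boston"}, 1),
    ({"name": "Maria Garcia", "city": "Madrid"}, {"name": "Maria Garcia", "city": "Madrid"}, 1),
    ({"name": "Anna Kowalska", "city": "Warsaw"}, {"name": "Ana Kowalska", "city": "Warszawa"}, 1),
    ({"name": "John Smith", "city": "Boston"}, {"name": "Maria Garcia", "city": "Madrid"}, 0),
    ({"name": "Anna Kowalska", "city": "Warsaw"}, {"name": "Peter Novak", "city": "Prague"}, 0),
    ({"name": "Peter Novak", "city": "Prague"}, {"name": "Peter Novak", "city": "Boston"}, 0),
]

X = [features(a, b) for a, b, _ in TRAINING]
y = [label for _, _, label in TRAINING]
tree = build_tree(X, y)

for rule in to_rules(tree):
    print(rule)
```

A new pair can then be classified with `predict(tree, features(record_a, record_b))`. In practice a library implementation of decision trees and domain-specific comparators would likely replace the hand-written parts; the sketch only shows how similarity features over labeled pairs can yield readable identification rules.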