Abstract:
We study the computational complexity of an extremal problem of choosing a subset of $p$ points from a given $2$-clustering of a finite set in a metric space. The chosen subset must describe the given clusters in the best way with respect to a certain geometric criterion. The problem formalizes an applied data mining problem: finding a subset of typical representatives of a dataset composed of two classes, based on the function of rival similarity. We prove that the problem is NP-hard by giving a polynomial-time reduction from the $p$-median problem, a well-known problem that is NP-hard in the strong sense. Bibliogr. 15.
Keywords: NP-hard problem, typical representative, rival similarity, $p$-median problem, data mining.