Abstract:
This paper describes a method, developed by the authors, for detecting topics in short text documents. The method, called Feature BTM, is based on a modification of the third step of the generative process of the well-known BTM model. Quality evaluation experiments conducted by the authors show that the modified Feature BTM model is more effective than the Standard BTM model. The paper also describes the thematic document clustering technology needed to create thematic virtual museums. A performance evaluation shows that, at the cost of a slight loss of speed (less than 30 seconds), Feature BTM clusters the virtual museum collection more effectively than the Standard BTM model.
Keywords: topic model, biterm, short text, BTM, clustering, thematic virtual museums.