Methods for data and knowledge mining
Gennady Lbov (passed away on June 30, 2010)
The project aims to develop and investigate methods and algorithms for solving clustering problems characterized by a combination of heterogeneity, incompleteness and noise in the data. In this setting, the objects to be classified are described by heterogeneous (quantitative, ordinal or qualitative) variables and may be characterized by partially differing feature systems. Some characteristics have missing values, some objects are "noisy", and some variables are non-informative. Such problems arise in the analysis of biological, sociological and medical information, web data, satellite images, etc.
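To make the data setting concrete, the following is a minimal sketch of a dissimilarity between two objects described by mixed variable types with missing values. It uses a Gower-style average over the variables observed in both objects, which is a standard textbook choice and not necessarily the measure used in the project. The function name `mixed_dissimilarity`, the type labels and the `ranges` argument are assumptions introduced for illustration.

```python
from typing import Optional, Sequence, Union

Value = Optional[Union[float, str]]  # None marks a missing value


def mixed_dissimilarity(
    a: Sequence[Value],
    b: Sequence[Value],
    kinds: Sequence[str],
    ranges: Sequence[float],
) -> Optional[float]:
    """Gower-style dissimilarity in [0, 1] (illustrative, hypothetical helper).

    kinds[i] is "quantitative", "ordinal" (coded as numeric ranks) or
    "qualitative"; ranges[i] is the value range of variable i and is
    ignored for qualitative variables. Variables missing in either
    object are skipped. Returns None if no variable is comparable.
    """
    total = 0.0
    used = 0
    for x, y, kind, rng in zip(a, b, kinds, ranges):
        if x is None or y is None:
            continue  # skip variables with a missing value
        if kind == "qualitative":
            d = 0.0 if x == y else 1.0  # simple mismatch
        else:
            # range-normalized absolute difference
            d = abs(float(x) - float(y)) / rng if rng > 0 else 0.0
        total += d
        used += 1
    return total / used if used else None


# Example objects: (height in metres, ordinal grade, colour)
kinds = ("quantitative", "ordinal", "qualitative")
ranges = (0.5, 3.0, 0.0)
obj_a = (1.70, 2, "red")
obj_b = (1.90, None, "blue")  # the ordinal grade is missing
result = mixed_dissimilarity(obj_a, obj_b, kinds, ranges)
```

Here only the first and third variables are comparable, so the result is the mean of 0.4 and 1.0, approximately 0.7. Noisy objects and non-informative variables are not handled by this sketch.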
In this project, we propose using a combination of logical, probabilistic and ensemble approaches to construct models for classification and forecasting. The novelty of the project consists in extending these approaches to the problem of cluster analysis, and in the use of original methods for constructing ensembles of logical-and-probabilistic models and algorithms of nonparametric cluster analysis.
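As an illustration of the ensemble idea applied to clustering, the sketch below combines several partitions of the same objects through a co-association matrix: each entry is the fraction of partitions that place two objects in the same cluster. A consensus partition is then read off as the connected components of the graph linking pairs whose co-association exceeds a threshold. This is a generic co-association scheme and not the project's own algorithm; the function names and the 0.5 threshold are assumptions.

```python
from typing import List, Sequence


def co_association(partitions: Sequence[Sequence[int]]) -> List[List[float]]:
    """Fraction of partitions in which objects i and j share a cluster."""
    n = len(partitions[0])
    counts = [[0.0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1.0
    k = len(partitions)
    return [[v / k for v in row] for row in counts]


def consensus_labels(coassoc: List[List[float]], threshold: float = 0.5) -> List[int]:
    """Connected components of the graph with edges where coassoc > threshold."""
    n = len(coassoc)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue  # already assigned to a component
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and coassoc[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Three base partitions of five objects (cluster labels are arbitrary ids)
partitions = [
    [0, 0, 1, 1, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 1, 1],
]
matrix = co_association(partitions)
consensus = consensus_labels(matrix)
```

In this example objects 0 and 1 are grouped together by all three partitions, while objects 2, 3 and 4 are chained through pairs that agree in two of three partitions, giving the consensus `[0, 0, 1, 1, 1]`. Because only co-membership is compared, the differing label ids across the base partitions do not matter.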
Some recent papers:
I. A. Pestunov, V. B. Berikov, E. A. Kulikova and S. A. Rylov. Ensemble of clustering algorithms for large datasets // Optoelectronics, Instrumentation and Data Processing. 2011. Vol. 47, N 3. P. 245-252.
V. B. Berikov. A latent variable pairwise classification model of a clustering ensemble // C. Sansone et al. (Eds.): Multiple Classifier Systems, MCS-2011. Lecture Notes in Computer Science, LNCS 6713. 2011. P. 279-288.