Pattern Classification for Incomplete Data Using PPCA and KNN

Authors Wael M. Khedr, Ahmed M. Elshewey
ISSN 2079-8407
On Pages 628-632
Volume No. 4
Issue No. 8
Issue Date August 01, 2013
Publishing Date August 01, 2013
Keywords Pattern classification, imputation, probabilistic principal component analysis (PPCA), K-nearest neighbour (K-NN), neural networks


Pattern classification has been applied successfully in many problem domains, such as biometric recognition, document classification, and medical diagnosis. Missing or unknown data is a common data-quality problem in pattern classification. Such data are generally either ignored or imputed with simple methods, which affects classification performance. We apply two methods, K-nearest neighbour (K-NN) and probabilistic principal component analysis (PPCA), to impute the missing values of patterns. In the K-nearest neighbour method, each missing value is imputed using values from the K most similar cases. In probabilistic principal component analysis, the missing values are imputed through the probabilistic model of PCA. The aim of this work is to analyze and improve the imputation of missing data in pattern classification tasks. We use discriminant analysis and artificial neural networks trained with the back-propagation algorithm to classify the imputed patterns. The approach is applied to the Iris dataset and the Shuttle Landing Control dataset. Classification performance on imputed data is better than the performance obtained when missing data are ignored.
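The K-nearest neighbour imputation step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation; the abstract does not give these details, so they are assumptions made for illustration: the function name knn_impute, the use of None to mark a missing value, a Euclidean distance computed only over the features both patterns have observed, and filling each gap with the mean of the K nearest donors.

```python
import math


def knn_impute(rows, k=3):
    """Fill None entries using the mean of that feature over the k nearest rows.

    Illustrative assumptions: None marks a missing value, and similarity is a
    Euclidean distance over the features observed in both rows, normalised by
    the number of shared features.
    """
    imputed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        missing = [j for j, v in enumerate(row) if v is None]
        if not missing:
            continue
        # Rank every other row by its distance to the incomplete row.
        ranked = []
        for m, other in enumerate(rows):
            if m == i:
                continue
            shared = [j for j in range(len(row))
                      if row[j] is not None and other[j] is not None]
            if not shared:
                continue  # no common features, so no distance can be computed
            dist = math.sqrt(
                sum((row[j] - other[j]) ** 2 for j in shared) / len(shared))
            ranked.append((dist, m))
        ranked.sort()
        for j in missing:
            # Take the k nearest rows that actually observe feature j.
            donors = [rows[m][j] for _, m in ranked
                      if rows[m][j] is not None][:k]
            if donors:
                imputed[i][j] = sum(donors) / len(donors)
    return imputed


# Toy patterns (made-up numbers, not taken from the Iris or Shuttle datasets).
patterns = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 3.2],
    [0.9, 1.9, None],   # third feature is missing
    [10.0, 10.0, 10.0],
]
print(knn_impute(patterns, k=2)[2])
```

With k=2, the gap in the third pattern is filled from the two patterns closest to it on the observed features, so the distant fourth pattern does not contribute. A PPCA-based imputation would instead estimate the missing value from a fitted latent-variable model and is not shown here.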

