Journal of Emerging Trends in Computing and Information Sciences

An Associative Classification Data Mining Approach for Detecting Phishing Websites

Authors: Suzan Wedyan, Fadi Wedyan
ISSN: 2079-8407
Pages: 888-899
Volume: 4
Issue: 12
Issue Date: January 01, 2014
Publishing Date: January 01, 2014
Keywords: Associative Classification, Data Mining, Phishing Websites, Machine Learning


Phishing websites are fake websites created by attackers to mimic the pages of legitimate websites. Victims of phishing attacks may expose sensitive financial information to attackers, who may use it for fraud and other criminal activities. Various approaches have been proposed to detect phishing websites, among which those that use data mining techniques have been shown to be more effective. The main goal of data mining is to analyze a large set of data to identify unsuspected relationships and extract useful, understandable patterns. Associative Classification (AC) is a promising data mining approach that integrates two known data mining tasks: association rule mining and classification. This paper proposes a new AC algorithm, called Phishing Associative Classification (PAC), for detecting phishing websites. PAC employs a novel methodology for constructing the classifier, which results in moderately sized classifiers. The algorithm improves on the effectiveness and efficiency of a known algorithm, MCAR, by introducing a new prediction procedure and adopting a different rule pruning procedure. The experiments compare PAC with four well-known data mining algorithms: a covering algorithm (Prism), a decision tree (C4.5), and two associative classification algorithms (CBA and MCAR). The experiments are performed on a dataset of 1010 websites. Each website is represented by 17 features categorized into 4 sets. The features are extracted from the website contents and URL. The results on each feature set show that PAC is at least as effective as the compared algorithms. When all features are considered, PAC outperforms the compared algorithms and correctly identifies 99.31% of the tested websites. Furthermore, PAC produces fewer rules than MCAR and is therefore more efficient.
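
To make the general idea of associative classification concrete, the following is a minimal sketch of how association rule mining and classification can be combined. It is not PAC or MCAR: the abstract does not specify their rule pruning or prediction procedures, so this sketch assumes a brute-force rule miner, a common ranking by confidence and then support, and first-matching-rule prediction. The feature names (`ip_in_url`, `long_url`), the toy data, and the thresholds are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def mine_rules(rows, labels, min_support=0.2, min_confidence=0.6, max_len=2):
    """Mine class association rules (itemset -> class) by brute force.

    Assumption: exhaustive enumeration up to max_len items; real AC
    algorithms use far more efficient rule discovery.
    """
    n = len(rows)
    itemset_counts = Counter()  # occurrences of each itemset
    rule_counts = Counter()     # occurrences of each (itemset, class) pair
    for row, label in zip(rows, labels):
        items = sorted(row.items())
        for k in range(1, max_len + 1):
            for itemset in combinations(items, k):
                itemset_counts[itemset] += 1
                rule_counts[(itemset, label)] += 1

    rules = []
    for (itemset, label), count in rule_counts.items():
        support = count / n                          # share of all rows
        confidence = count / itemset_counts[itemset]  # share of matching rows
        if support >= min_support and confidence >= min_confidence:
            rules.append((itemset, label, support, confidence))

    # Assumed ranking: higher confidence, then higher support,
    # then shorter antecedent.
    rules.sort(key=lambda r: (-r[3], -r[2], len(r[0])))
    return rules


def predict(rules, row, default):
    """Return the class of the first ranked rule whose itemset matches row."""
    items = set(row.items())
    for itemset, label, _support, _confidence in rules:
        if items.issuperset(itemset):
            return label
    return default  # no rule applies


# Hypothetical toy data: two binary URL features per website.
rows = [
    {"ip_in_url": "yes", "long_url": "yes"},
    {"ip_in_url": "yes", "long_url": "no"},
    {"ip_in_url": "no", "long_url": "yes"},
    {"ip_in_url": "no", "long_url": "no"},
    {"ip_in_url": "no", "long_url": "no"},
    {"ip_in_url": "no", "long_url": "no"},
]
labels = ["phishing", "phishing", "phishing",
          "legitimate", "legitimate", "legitimate"]

rules = mine_rules(rows, labels)
for itemset, label, support, confidence in rules:
    print(dict(itemset), "->", label, round(support, 2), round(confidence, 2))

print(predict(rules, {"ip_in_url": "yes", "long_url": "no"}, default="legitimate"))
```

In this sketch the size of the classifier is simply the number of rules that survive the support and confidence thresholds. Raising the thresholds or adding a pruning step would shrink the rule set, which is the kind of trade-off the abstract refers to when it compares the number of rules produced by PAC and MCAR.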
