Machine Learning Approach for Identifying Malicious Websites
Authors : Dr. K Kalyani and KR Ragamaliga
Abstract :
Phishing is one of the most common and damaging cyber-attacks today. Attackers build fake websites that closely imitate legitimate ones, such as banking portals, e-commerce platforms, and social media sites, to steal sensitive user information: usernames, passwords, credit and debit card details, and one-time passwords (OTPs). As online transactions and digital services grow rapidly, traditional blacklist-based detection falls short, because a blacklist can only flag sites that have already been reported and therefore misses newly created phishing websites. An intelligent, automated, real-time detection system is therefore needed. This project proposes a Phishing Website Detection System that uses machine learning to classify websites as legitimate or phishing based on extracted features. The system collects URL-based features (URL length, presence of special characters, use of HTTPS, number of subdomains), domain-based features (domain age, DNS record availability), and content-based features (presence of iframe tags, suspicious scripts, redirection behavior). These features are preprocessed and used to train supervised machine learning algorithms such as Random Forest and Support Vector Machine (SVM).
Keywords :
Phishing Attack, Machine Learning, Cyber Security, URL Feature Extraction, Website Classification, Supervised Learning, Random Forest, Support Vector Machine.
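
As an illustration of the URL-based features named in the abstract, the following is a minimal sketch using only the Python standard library. The function name `extract_url_features`, the feature keys, the set of characters treated as "special", and the rule that every host label left of the last two counts as a subdomain are all assumptions made for this example; they are not taken from the authors' implementation.

```python
from urllib.parse import urlparse

# Assumption: characters counted as "special"; the abstract does not list them.
SPECIAL_CHARS = "@-_%=&?"


def extract_url_features(url: str) -> dict:
    """Return a small dictionary of URL-based features (hypothetical schema)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    labels = host.split(".") if host else []

    # Naive assumption: the last two labels form the registered domain
    # (e.g. "example.com"), so any labels before them are subdomains.
    # Suffixes such as "co.uk" would need a public-suffix list instead.
    num_subdomains = max(len(labels) - 2, 0)

    return {
        "url_length": len(url),
        "num_special_chars": sum(url.count(c) for c in SPECIAL_CHARS),
        "uses_https": int(parsed.scheme == "https"),
        "num_subdomains": num_subdomains,
    }


if __name__ == "__main__":
    print(extract_url_features("https://login.secure.example.com/account?id=1"))
```

A dictionary like this could serve as one row of the feature matrix passed to a supervised classifier; the domain-based and content-based features would need WHOIS or DNS lookups and page fetching, which are left out of this sketch.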