
Enhanced Heart Disease Prediction Using Hybrid Feature Selection and Ensemble Learning Techniques

Author : S Saranya and R Gomathi Jayam

Abstract :

Feature selection techniques are applied to identify the attributes most relevant to heart disease prediction. Ensemble methods such as Random Forest and Gradient Boosting improve model performance by combining multiple predictors. The approach aims to improve accuracy by reducing false positives and false negatives, and the insights derived help in understanding risk factors and optimizing predictive models. This study proposes a machine learning model that combines several preprocessing steps, a hybrid hyperparameter optimization algorithm, and ensemble learning techniques to predict heart disease. Ensemble learning combines multiple base models into a single, more robust and accurate model, and pairing it with hyperparameter optimization is a sound way to improve the accuracy and robustness of predictive models. Heart disease data are collected and processed using several algorithms: logistic regression, naive Bayes, support vector machine, k-nearest neighbours, decision tree, random forest, XGBoost, and a neural network. Logistic regression is a classification algorithm that predicts the probability of each class from a set of independent (predictor) variables. Random forests, or random decision forests, are an ensemble learning method for classification, regression and other tasks that operates by constructing a multitude of decision trees at training time. Features may include age, gender, blood pressure, cholesterol levels and other relevant health indicators. The data are preprocessed by handling missing values, normalizing features and encoding categorical variables. When building a heart disease prediction model, it is crucial to split the dataset into training, validation and test sets, so that the model is trained on one subset, its hyperparameters are tuned on another, and its generalization is assessed on the final subset.
The dataset is divided into training, validation and test sets according to a typical split ratio, using a library function. After testing, the trained classifier is applied to the heart disease dataset to determine whether each case is normal or indicates heart disease, and the prediction results are displayed graphically. These findings demonstrate the potential of the model to accurately predict the presence or absence of heart disease. Such accurate predictions could significantly aid early prevention, detection and treatment, ultimately reducing the mortality and morbidity associated with heart disease.
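
The three-way split described in the abstract can be illustrated with a minimal sketch. The abstract does not state the ratio used, so the 70/15/15 default below is an assumption, as are the function name `train_val_test_split`, the fixed seed and the toy records; none of these come from the study itself.

```python
import random


def train_val_test_split(rows, ratios=(0.70, 0.15, 0.15), seed=42):
    """Shuffle rows and cut them into training, validation and test lists.

    The 70/15/15 default is an illustrative assumption, not the study's ratio.
    """
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")

    shuffled = list(rows)                  # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for reproducibility

    n = len(shuffled)
    n_train = round(n * ratios[0])
    n_val = round(n * ratios[1])

    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # remainder goes to the test set
    return train, val, test


# Hypothetical patient records, used only to exercise the function.
records = [{"id": i, "age": 40 + i % 30} for i in range(100)]
train, val, test = train_val_test_split(records)
print(len(train), len(val), len(test))
```

In practice a library routine such as scikit-learn's `train_test_split`, applied twice and with stratification on the disease label, would usually be preferred so that class proportions are preserved in every subset.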

Keywords :

Heart Disease Prediction, Machine Learning, Feature Selection, Ensemble Learning, Hyperparameter Optimization, Logistic Regression, Random Forest, XGBoost, Support Vector Machine, Data Preprocessing.