Title: Association between work-related features and coronary artery disease: A heterogeneous hybrid feature selection integrated with balancing approach
Authors: Elham Nasarian, Moloud Abdar, Mohammad Amin Fahami, Roohallah Alizadehsani, Sadiq Hussain, Mohammad Ehsan Basiri, Mariam Zomorodi-Moghadam, Xujuan Zhou, Pawel Plawiak, U. Rajendra Acharya, Ru-San Tan, Nizal Sarrafzadegan
Abstract
Coronary artery disease (CAD) is a leading cause of death worldwide and is associated with high healthcare expenditure. Researchers are motivated to apply machine learning (ML) for quick and accurate detection of CAD. The performance of automated systems depends on the quality of the features used. Clinical CAD datasets contain different features with varying degrees of association with CAD. To extract such features, we developed a novel hybrid feature selection algorithm called heterogeneous hybrid feature selection (2HFS). In this work, we used the Nasarian CAD dataset, in which workplace and environmental features are considered in addition to other clinical features. The synthetic minority over-sampling technique (SMOTE) and adaptive synthetic sampling (ADASYN) are used to handle the imbalance in the dataset. The 2HFS-selected features are then input into four classifiers: decision tree (DT), Gaussian Naive Bayes (GNB), random forest (RF), and XGBoost. Our results show that the proposed feature selection method yields a classification accuracy of 81.23% with SMOTE and the XGBoost classifier. We also tested our approach on other well-known CAD datasets, namely the Hungarian, Long-beach-va, and Z-Alizadeh Sani datasets, obtaining 83.94%, 81.58%, and 92.58%, respectively. Hence, our experimental results confirm that the proposed feature selection algorithm is effective compared with existing state-of-the-art techniques and yields outstanding results for the development of automated CAD systems.
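The abstract names SMOTE as one of the two balancing techniques but does not describe it. As a rough illustration only, the following standard-library Python sketch shows the core idea behind SMOTE-style over-sampling: each synthetic point is interpolated between a minority sample and one of its nearest minority neighbours. It is not the authors' implementation. The function name `smote_sketch`, the parameters `k` and `seed`, and the toy data are all hypothetical, and the settings used in the paper are not given in this record.

```python
import math
import random


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: two features per record, 6 majority vs 3 minority samples.
majority = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [0.0, 0.4], [0.4, 0.1]]
minority = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.3]]
new_points = smote_sketch(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_points
```

After the call, the minority class has as many records as the majority class, and every synthetic point lies on a line segment between two original minority samples. ADASYN follows the same interpolation idea but generates more synthetic points for minority samples that are harder to learn. Balancing of this kind is normally applied to the training split only, so that evaluation data remain unaltered.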
Keywords
Machine learning, Data mining, Heart disease, Coronary artery disease, Feature selection
@article{paperid:1083783,
author = {Elham Nasarian and Moloud Abdar and Mohammad Amin Fahami and Roohallah Alizadehsani and Sadiq Hussain and Mohammad Ehsan Basiri and Zomorodi-Moghadam, Mariam and Xujuan Zhou and Pawel Plawiak and U. Rajendra Acharya and Ru-San Tan and Nizal Sarrafzadegan},
title = {Association between work-related features and coronary artery disease: A heterogeneous hybrid feature selection integrated with balancing approach},
journal = {Pattern Recognition Letters},
year = {2020},
volume = {133},
month = {May},
issn = {0167-8655},
pages = {33--40},
numpages = {7},
keywords = {Machine learning; Data mining; Heart disease; Coronary artery disease; Feature selection},
}
%0 Journal Article
%T Association between work-related features and coronary artery disease: A heterogeneous hybrid feature selection integrated with balancing approach
%A Elham Nasarian
%A Moloud Abdar
%A Mohammad Amin Fahami
%A Roohallah Alizadehsani
%A Sadiq Hussain
%A Mohammad Ehsan Basiri
%A Zomorodi-Moghadam, Mariam
%A Xujuan Zhou
%A Pawel Plawiak
%A U. Rajendra Acharya
%A Ru-San Tan
%A Nizal Sarrafzadegan
%J Pattern Recognition Letters
%@ 0167-8655
%D 2020