PREDICTIVE HEART HEALTH ANALYSIS: MACHINE LEARNING WITH THE CARDIOVASCULAR DISEASE DATASET
Keywords:
Cardiovascular Disease Prediction, Deep Neural Network, Transformer Model, Machine Learning, Medical Data Analysis, Risk Assessment
Abstract
Cardiovascular diseases (CVDs) are the leading cause of mortality worldwide, so accurate prediction systems are needed for early detection and prevention. This study investigates predictive modeling of CVD risk using two benchmark datasets: Dataset 1 (Kaggle Cardiovascular Disease Risk Prediction; 70,000 records with demographic, clinical, and lifestyle features) and Dataset 2 (Early Medical Risk Dataset; 65,535 samples with clinical symptoms and risk factors). Two deep learning approaches were implemented and compared: a Deep Neural Network (DNN) baseline and a Transformer-based model tailored to tabular healthcare data. The DNN achieved consistent results, with accuracies of 85.3% on Dataset 2 and approximately 90% on Dataset 1; its precision and recall were balanced, but its ability to capture complex feature dependencies was limited. In contrast, the Transformer achieved superior performance, with precision and recall above 99% and an ROC-AUC of 0.999 on Dataset 2, and consistently higher metrics than the DNN on Dataset 1. These results indicate that attention-based architectures are more effective at modeling non-linear, interdependent risk factors and can deliver near-perfect classification on these datasets. The findings demonstrate that integrating advanced deep learning models with structured clinical datasets can significantly improve cardiovascular risk prediction, supporting clinical decision-making by reducing misclassification rates and enabling timely, personalized healthcare interventions.
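The metrics reported above (accuracy, precision, recall, and ROC-AUC) can be illustrated with a minimal sketch. This is not the evaluation code used in the study; the function names and the toy labels and scores below are hypothetical and serve only to show how each metric is defined for a binary CVD / no-CVD classifier.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels, where 1 = disease and 0 = no disease."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == 1 and t == 1:
            tp += 1
        elif p == 1 and t == 0:
            fp += 1
        elif p == 0 and t == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def precision_recall_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Precision = tp / (tp + fp), recall = tp / (tp + fn), accuracy = (tp + tn) / n."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(y_true)
    return precision, recall, accuracy


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC-AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half). Quadratic in the sample size, so it is
    suitable only for small illustrative inputs."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy data, not taken from either dataset in the study.
    y_true = [1, 0, 1, 1, 0, 0]
    scores = [0.9, 0.2, 0.4, 0.8, 0.6, 0.1]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption
    print(precision_recall_accuracy(y_true, y_pred))
    print(roc_auc(y_true, scores))
```

Precision and recall depend on the decision threshold, whereas ROC-AUC summarizes ranking quality across all thresholds, which is why the two kinds of metric are reported together.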